
Site Reliability Engineer, GFN

8.00 to 10.00 Years   Bangalore   03 Apr, 2022
Job Location: Bangalore
Education: Not Mentioned
Salary: Not Disclosed
Industry: IT - Hardware / Networking
Functional Area: General / Other Software
Employment Type: Full-time

Job Description

    Site Reliability Engineering is an engineering discipline for designing, building, and maintaining large-scale production systems with high efficiency and availability, using a combination of software and systems engineering practices. It is a highly specialized discipline that demands knowledge across systems, networking, coding, databases, capacity management, and continuous delivery and deployment.

    NVIDIA is looking for a Site Reliability Engineer (SRE) to join its GeForce Now (GFN) team. SRE at NVIDIA ensures that our internal and external-facing GPU cloud gaming services deliver the reliability and uptime promised to users, while enabling developers to make changes to the existing system through careful preparation and planning, keeping an eye on capacity, latency, and performance.

    SRE is also a mindset and a set of engineering approaches to running better production systems and optimizations. Much of our software development focuses on eliminating manual work through automation, performance tuning, and growing the efficiency of production systems. As SREs are responsible for the big picture of how our systems relate to each other, we use a breadth of tools and approaches to tackle a broad spectrum of problems. Practices such as limiting time spent on reactive operational work, blameless postmortems, and proactive identification of potential outages factor into the iterative improvement that is key to both product quality and interesting, dynamic day-to-day work.

    What you'll be doing:
    • Designing and implementing critical high-performance, large-scale services and libraries
    • Building data pipelines for collecting & processing data from multiple data sources: from the point of ingestion to useful insight
    • Design and build data ingestion services for handling trillions of events monthly.
    • Design and build AI inference services with A/B testing of ML models
    • Partner with our other engineering and business teams to integrate your amazing innovations and algorithms into our production systems
    • Supervising performance and advising on any vital infrastructure changes
    • Be part of the GeForce Now team that triages and debugs issues while on call
    • Automate everything for measuring, testing, updating, monitoring, and alerting the data platform
    • Build large scale distributed message queues with industry standard security mechanisms
    • Oversee Data Engineering development priorities
    What we need to see:
    • Bachelor's or Master's degree in Computer Science or a related technical field, with 8 years of software engineering experience
    • Proficient understanding of distributed computing principles
    • Excellent software development skills in one or more of: Java/Scala/Python/C
    • Excellent knowledge of building cloud-native applications
    • Experience in automating deployment, management, scaling, and networking with Docker, Kubernetes, etc.
    • Experience using monitoring tools like Prometheus or Zabbix
    • Excellent interpersonal skills including the ability to identify and communicate data driven insights
    Ways to stand out from the crowd:
    • Knowledge of ML model validation and online/offline/inferencing
    • Knowledge of MLOps, ML observability
    • Experience working with enterprise developers building ML/DL models, or data analytics applications

Keyskills :
java, academics, acp, algorithms, android, data analytics, product quality, computer science, model validation, monitoring tools, data engineering, performance tuning, production systems, capacity management, continuous delivery, systems engineering, spectrum management
