Site Reliability Engineer (DevOps) - Product Development

Experience: 4.00 to 8.00 Years
Date: 01 Nov, 2021
Job Location: Australia, United Kingdom, United States of America, Pune, Singapore
Education: Not Mentioned
Salary: Not Disclosed
Industry: Management Consulting / Strategy
Functional Area: General / Other Software
Employment Type: Full-time

Job Description

Summary of the Position

We are looking for candidates who currently work at a product development company, especially one catering to the Machine Learning and Big Data domain, who are based in Pune and can join immediately or within a month. Hands-on experience with network troubleshooting, Python / Bash scripting, and Linux-based systems is mandatory.

About the Organisation:

Headquartered in Singapore, the organisation has offices in Singapore, Sydney, London, and New York, and it services the marketing needs of organisations in every corner of the globe. It runs a petabyte-scale data platform, with a key focus on finding solutions that can support the Machine Learning product roadmap.

About the Role

  • In this role, you will be working on bleeding-edge hybrid cloud/on-premise infrastructure handling billions of events and terabytes of data a day.
  • You will be responsible for working closely with various engineering teams to design, build and maintain a globally distributed infrastructure footprint.
  • As part of the role, you will be responsible for researching new technologies, managing a large fleet of active services and their underlying servers, automating the deployment, monitoring and scaling of components, and optimizing the infrastructure for cost and performance.

Day-to-day responsibilities

  • Ensure the operational integrity of the global infrastructure
  • Design repeatable continuous integration and delivery systems
  • Test and measure new methods, applications and frameworks
  • Analyze and leverage various AWS-native functionality
  • Support and build out an on-premise data centre footprint
  • Provide support and diagnose issues to other teams related to our infrastructure
  • Participate in 24/7 on-call rotation (no night shift involved, only on-call support if required)

Candidate Profile:

Essential Qualifications

  • Expert-level administrator of Linux-based systems
  • Expert-level scripting with Python or Bash
  • Prior experience of managing the monitoring platforms Prometheus and Grafana, to the extent of writing custom metrics
  • Prior experience of designing alerts with Alertmanager and integrating them with PagerDuty
  • Prior experience of managing large infrastructure deployments using Ansible or equivalent configuration management tools
  • Prior experience in automating provisioning and managing hybrid-cloud infrastructure (AWS and on-prem) at scale with Terraform
  • Flexible working hours and ability to participate in 24/7 on-call support with other team members
  • Working knowledge of managing distributed data platforms (Kafka, Spark, Cassandra, etc.); Aerospike experience is a plus
  • Working knowledge of continuous delivery systems (Jenkins, GitLab, Bitbucket, Docker)
  • Network troubleshooting experience (TCP, DNS, IPv6 and tcpdump)
  • Experience managing hundreds to thousands of servers globally
  • Ability to troubleshoot problems in complex systems
  • Ability to adapt to a rapidly changing environment
  • Comfortable collaborating with and supporting a diverse team of engineers
  • Enjoy automating tasks, rather than repeating them

Candidate Profile:

  • Minimum 3 years' experience as a Site Reliability Engineer / DevOps in a product development company
  • Someone who can join immediately / within a month
  • Hands-on experience in AWS infrastructure (ES6, EC2, Lambda, etc.)
  • Hands-on experience in Python / Bash scripting
  • Experience using tools like Prometheus and Grafana
  • Hands-on experience with alert management systems
  • Excellent communication skills