skillindiajobs

Lead, Systems Reliable Engineer

8.00 to 12.00 Years   Mumbai City   12 Jun, 2022
Job Location: Mumbai City
Education: Not Mentioned
Salary: Not Disclosed
Industry: Banking / Financial Services
Functional Area: General / Other Software
Employment Type: Full-time

Job Description

    Recruitment assessments - some of our roles use assessments to help us understand how suitable you are for the role you've applied to. If you are invited to take an assessment, this is great news. It means your application has progressed to an important stage of our recruitment process.
    Role Responsibilities
    Strategy
    Resiliency
    • Lead or be part of the SRE team to enhance the application and infrastructure resiliency of the service through self-healing and automated failovers, targeting 99.9% uptime for customers.
    • Oversee the planned / unplanned disruption of production infrastructure to ensure accountability for building resilient, always-on systems.
    • Build resilience into the application so underlying system failures are handled gracefully and do not impact end users. Influence design/development teams to always be thinking of the rainy-day scenarios.
    Efficiency
    • Identify opportunities to eliminate all manual and repeatable activities (toil) via tooling and automation.
    • Reduce the number of repeat incidents by permanently fixing the underlying root cause of issues.
    Capacity Planning
    • Develop automated predictive analysis of future capacity needs and drive the proactive upgrade of service capacity well in advance.
    • Using Standard Chartered's SDI (Software Defined Infrastructure), develop auto-scaling to deliver robust resilience to fluctuations in critical service demand.
    • Continuously monitor service demand / capacity for any discrepancies or spikes.
    Business Availability / Reliability
    • Take responsibility for meeting SLA / XLA expectations around the operability and reliability of our critical user service journeys, where our customers expect a 24x7 digital service offering. Examples of always-on techniques to be used include caching, circuit breakers, dark and canary releases, store and service patterns, and alternate user experience flows.
    • Lead, own, manage, monitor and optimize the reliability and health of all environments.
    • Design, code and implement break fixes to improve service availability based on outcomes from thematic reviews.
    Latency & Performance
    • Drive conversations around development velocity using SLI / SLO data to ensure the balance between development velocity and service reliability is optimized in partnership with Product Teams.
    • Iteratively review SLI / SLO / Error Budget policy to ensure the quantitative indicators of customer experience are accurate.
    • Where an increased focus on reliability is required, influence senior stakeholders to ensure resourcing / effort is made available.
    Processes
    Transition to Production
    • Champion and evolve continuous delivery best practice standards to reduce release-related incidents and manual hand-offs, and achieve our aspiration of "zero ops".
    • Partner with development teams to ensure applications are designed with scale, resilience, and performance in mind.
    • Take responsibility for the continuous improvement of the service level through timely analysis of data and the corrective actions derived from it.
    • Provide performance reports, trends and proactive recommendations based on in-depth trend analysis.
    • Drive and report achievements through key performance indicators and success stories.
    • Create a framework for periodic reporting and for identifying areas of improvement in operational performance parameters.
    • Define and help to develop interactive dashboards / analytical reports.
    • Lead the department's branding and marketing strategy, and manage and standardize communication channels and publications.
    • Support the department head on general and team administration, strategy execution, transformation, and stakeholder and vendor management.
    • Streamline and simplify processes to support agility and speed to market while maintaining a high level of controls.
    Monitoring
    • Optimize monitoring to reduce false positive alerts.
    • Creatively deepen monitoring capabilities, leveraging the three tenets of observability: logs, metrics and traces.
    • Ensure all critical user service journeys are traceable end to end.
    • Ensure production solutions are fit for purpose. Where gaps are identified, put a plan in place to uplift the toolset.
    People and Talent
    • Establish and manage the SRE team when applicable.
    • Drive an efficient target operating model and enhance the existing capabilities of the team.
    • Lead by example and build the appropriate culture and values.
    • Ensure the provision of ongoing training and development of people, and ensure that holders of all critical functions are suitably skilled and qualified for their roles ensuring that they have effective supervision in place to mitigate any risks.
    • Set and monitor job descriptions and objectives for direct reports and provide feedback and rewards in line with their performance against those responsibilities and objectives.
    Risk Management
    • Identify key issues in the business areas being supported, and based on this information, put in place appropriate controls and measures to assess, monitor, control & mitigate risks.
    • Ensure a full understanding of the risk and control environment within Technology Services.
    • Ensure support procedures are in place and adhere to Group Security & Audit policies within Technology Services.
    • Active engagement with all audit issues arising in this support environment.
    Governance
    • Take responsibility for assessing the effectiveness of governance, oversight and controls and, if necessary, oversee changes in these areas.
    • Maintain awareness and understanding of the regulatory framework in which the Group operates, and of the regulatory requirements and expectations relevant to the role.
    Regulatory & Business Conduct
    • Display exemplary conduct and live by the Group's Values and Code of Conduct.
    • Take personal responsibility for embedding the highest standards of ethics, including regulatory and business conduct, across Standard Chartered Bank. This includes understanding and ensuring compliance with, in letter and spirit, all applicable laws, regulations, guidelines and the Group Code of Conduct.
    • Effectively and collaboratively identify, escalate, mitigate and resolve risk, conduct and compliance matters.
    Our Ideal Candidate
    • Relevant degree in Computer Science / Technology and evidence of continuous professional development in an IT role.
    • Certified Scrum Master.
    • More than 8 years of overall IT experience in application development, production support, DevOps or SRE, with added experience as a support engineer.
    Experience in one or more of the following technical skills will be an added advantage.
    Technical Skills
    • Batch management / scheduling tools such as Control-M, AutoSys.
    • Monitoring / analytical tools such as ITRS, Grafana, ELK, Prometheus, Sysdig.
    • Microservices management tools such as Kong, Istio, Jersey.
    • In-memory / Distributed databases like Redis, MongoDB, Cassandra, Hazelcast, Postgres.
    • DevOps tool stack such as Jenkins, Ansible, Bitbucket, Git, Maven, Docker, Rundeck.
    • Messaging / streaming tools like Kafka, ActiveMQ, iMFT, FileID.
    • Programming / scripting languages and frameworks such as Python, Perl, PowerShell, Bash, Java / J2EE, AngularJS, Spring and microservices.
    • Basics of Oracle/MS SQL database administration.
    • Basics of Unix / Linux, Windows administration, OpenShift, Kubernetes.
    • AWS / Azure cloud platform support.
    Soft skills
    • Excellent communicator with a strong command of English.
    • Expressive and creative, with the ability to think outside the box.
    • Enthusiastic problem solver with exceptional analytical skills.
    • Strong team player who is also able to complete tasks independently with minimal guidance.
    • "Never settle" mindset; always challenges the status quo.
    • High learning aptitude, able to learn new products and technology very quickly.
    Visit our careers website: www.sc.com/careers

Key skills: keeping things simple, key performance indicators, control m, root cause, sql database, human skills, service level, life insurance


© 2020 Skillindia All Rights Reserved