skillindiajobs

SRE / Production Support

6.00 to 10.00 Years   Hyderabad   01 Dec, 2021
Job Location: Hyderabad
Education: Not Mentioned
Salary: Not Disclosed
Industry: Banking / Financial Services
Functional Area: Production
Employment Type: Full-time

Job Description

Key Responsibilities

Application Support & Monitoring:

  • Monitor infrastructure, servers, middleware, databases, and batch jobs.
  • Respond promptly to service requests from business-partner-facing support teams, Operations, Risk/Control partners, etc.
  • Troubleshoot environment, data control and operational issues.
  • Create and maintain documentation to ensure knowledge accessibility.
  • Automate and streamline processes using scripts and scheduling tools.
  • Liaise with other application support teams and internal/external business and technical partners.
  • Provide ad hoc and on-demand reports.
  • Perform timely escalation of critical issues and proactively identify patterns of recurring issues to improve production.
  • Lead problem resolution, conduct root cause analysis, and establish processes that help prevent incidents.
  • Participate in the Incident and Problem Management processes as a resolver accountable for root cause analysis, resolution and reporting.
  • Ensure that all production changes are processed according to Change Management policies and procedures.
  • Ensure that appropriate levels of Quality Assurance have been met for all new and existing products.
  • Support Sustained Resiliency, Disaster Recovery, and High Availability events.
  • Help the Level 2 operations team set up monitoring and bridge gaps in the current monitoring setup.
  • Play a key part in setting up reporting and be a key component of the Monitor -> Report -> Improve principle.
Incident Management:
  • Coordinate incident management staffing to ensure appropriate coverage.
  • Call facilitation, coordination and communications during critical outage situations.
  • Call documentation, queue management, ticket analysis and interfacing with impacted lines of business for incident impact analysis via the Production Assurance process.
  • Maintain an end-to-end view of issues for objectivity.
  • Influence senior technology leads across organizations to ensure timely resolution of incidents.
Problem Management:
  • Participate in RCA (root cause analysis) activities on client-impacting incidents and ensure that they are executed and that action items are assigned and completed.
  • Provide expertise and support during critical incidents, interfacing with all impacted groups to better manage the message.
  • Chronic issue coordination and leadership.
  • Guidance to all involved staff and vendors in driving a coordinated approach to results.
Hygiene and Capacity Maintenance:
  • Responsible for data quality of PLM.
  • Work diligently to ensure all servers meet company standards for uptime, patch level, etc.
  • Work on capacity planning for applications, estimating and analysing growth rates of vital infrastructure components and adding capacity proactively as and when required.
Know Your Application:
  • Understand the application code, workflow and business usage of the application.
  • Understand the DB component of the application.
  • Understand how seasonality affects critical applications.
  • Document known errors and play an important role in knowledge transfer to the Level 1 team.
  • Reduce escalations to Level 3 based on incremental learning about applications.
Qualifications
  • Minimum 6 years of relevant Information Technology experience.
  • Should be able to provide 24/7 on-call support.
  • Proven experience in incident/problem management with a good understanding of any of the tools used for this purpose.
  • Understanding of SRE concepts and proven experience working on automation or application development using any programming language.
  • Solid technical skills including knowledge of client-server technology, networking basics, database technology, and an end-to-end understanding of 3-tier application architecture (front end, application server, database).
  • Good understanding of both UNIX and Windows operating systems.
  • Good understanding of web hosting technologies like Apache / Tomcat or other equivalent web/app servers.
  • Good understanding of Big Data & cloud concepts.
  • Good understanding of database technologies like Oracle and SQL.
  • Good understanding of monitoring tools is an added advantage.
  • Excellent communication skills, both verbal and written, with the ability to lead/manage large conference calls.
  • Comfortable providing clear problem descriptions and guidance to business users in a time-critical environment.
  • Ability to be proactive, with a strong bias for action, a naturally inquisitive mindset, and a bias for continuous improvement of practices and processes.
  • Excellent influence, negotiation and presentation skills.
  • Experience in working with cross-line-of-business teams, Outside Service Providers and Partner Organizations.
  • Outstanding interpersonal skills and ability to establish strong relationships with all levels of management.
  • Solid understanding of the major functionality bundled into a release, both from a technology and business point of view.
  • Strong knowledge of relevant applications and development life cycles.
  • Experience working with geographically distributed and culturally diverse work-groups.
  • Strong desire to learn new technology.
  • Ability to work independently as a self-starter, and within a team environment.
  • Comfortable in a fast, dynamic environment with an ability to handle multiple tasks simultaneously, and able to work on-call nights/weekends as needed.

Key skills:
bias for action, root cause analysis, big data, root cause, web hosting, data quality, data control, client server, operation team, impact analysis, technical skills, monitoring tools, queue management, quality assurance

© 2020 Skillindia All Rights Reserved