Senior Site Reliability Engineer @ GreyOrange

Job Description

We are seeking a talented and motivated Senior Site Reliability Engineer (SRE) to join our organization. The SRE team at GreyOrange is responsible for monitoring the stability and availability of mission-critical production systems, managing incidents to speed up resolution, and establishing business-as-usual (BAU) operations. The team also manages and maintains internal tools and infrastructure that other development teams consume.
The experienced SRE will play a crucial role in ensuring the reliability, scalability, and performance of our infrastructure and applications, including capacity planning. The ideal candidate will have a strong background in software engineering, system administration, containerization, and cloud technologies.
Requirements
  • 5 to 8 years of experience
  • Well versed in scripting/programming languages (Python, Bash, PowerShell, etc.) to automate manual work, particularly within cloud environments
  • Well versed in observability tools (Grafana, Splunk, Dynatrace) for monitoring, alerting, and logging, to identify and address potential issues, especially in cloud infrastructure
  • Working experience with automation tools (Jenkins, GitLab, Ansible/Chef for configuration management) and processes to streamline deployment, monitoring, and management of systems and applications in the cloud
  • Hands-on experience with containerization and orchestration technologies such as Docker, Kubernetes, or similar, particularly in cloud-native environments
  • Solid understanding of SLI, SLO, SLA, and error budget concepts and their implementation; able to provide on-call support and participate in incident management and response activities as needed
  • Expert at troubleshooting production issues and bugs
  • Good knowledge of Unix systems, networking, web technologies, and databases
  • Incident management experience, coupled with effective communication skills, for production workloads
  • Working knowledge of at least one cloud platform (AWS or GCP)
What you'll do:
  • Lead reliability engineering projects and drive them to closure
  • Ensure system stability and high availability by proactively monitoring performance and troubleshooting issues
  • Design, build, and maintain efficient, reliable, and scalable cloud-based infrastructure and services
  • Automate processes and find opportunities to improve the observability and availability of the platform to reduce toil
  • Implement and manage observability tools for comprehensive monitoring, alerting, and logging
  • Own end-to-end availability and performance of different services and tools
  • Practice sustainable incident response and blameless postmortems
  • Provide on-call support for incident management and participate actively in response activities

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time

Contact Details:

Company: GreyOrange
Location(s): Noida, Gurugram

Key skills: Unix, Networking, PowerShell, Configuration management, Incident management, Troubleshooting, Monitoring, Python, System administration, Capacity planning

Salary: Not disclosed