Senior Site Reliability Engineer III - Ansible/Terraform @ GreyOrange

Job Description

Responsibilities:

-Define and enforce SLOs, SLIs, and error budgets across microservices

-Architect an observability stack (metrics, logs, traces) and drive operational insights

-Automate toil and manual ops with robust tooling and runbooks

-Own incident response lifecycle: detection, triage, RCA, and postmortems

-Collaborate with product teams to build fault-tolerant systems

-Champion performance tuning, capacity planning, and scalability testing

-Optimise costs while maintaining the reliability of cloud infrastructure

Must-have skills:

-6+ years in SRE/Infrastructure/Backend-related roles using cloud-native technologies

-2+ years in an SRE-specific capacity

-Strong experience with monitoring/observability tools (Datadog, Prometheus, Grafana, ELK, etc.)

-Experience with infrastructure-as-code (Terraform/Ansible)

-Proficiency in Kubernetes, service mesh (Istio/Linkerd), and container orchestration

-Deep understanding of distributed systems, networking, and failure domains

-Expertise in automation with Python, Bash, or Go

-Proficient in incident management, SLAs/SLOs, and system tuning

-Hands-on experience with GCP (preferred)/AWS/Azure and cloud cost optimisation

-Participation in on-call rotations and running large-scale production systems

Nice-to-have skills:

-Familiarity with chaos engineering practices and tools (Gremlin, Litmus)

-Background in performance testing and load simulation (Gatling, Locust, k6, JMeter)

Why us?

-You will be working with a lean team of passionate and talented individuals. We know that working with like-minded people is important.

-We are on a mission to supercharge brick-and-mortar retail stores in the era of e-commerce. Our customers give us confidence in our journey, and you will have a huge impact with your work.

-You will be free to experiment and can choose to do things differently.

-Lastly, we care deeply about a culture of being a solver. Come, be one with us!

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time

Contact Details:

Company: GreyOrange
Location(s): Noida, Gurugram

Keyskills: Ansible, DevOps, Azure, Site Reliability, Terraform, Google Cloud Platform, Prometheus, AWS, Grafana, Monitoring Tools

₹ Not Disclosed

GreyOrange Pvt Ltd