Lead Software Engineer - Site Reliability @ Freshworks

Job Description

Key Responsibilities

  • Design and implement tools to improve availability, latency, scalability, and system health.

  • Define SLIs/SLOs, manage error budgets, and drive performance engineering efforts.

  • Build and maintain automated monitoring, alerting, and remediation pipelines.

  • Collaborate with engineering teams to improve reliability by design.

  • Lead incident response, root cause analysis, and blameless postmortems.

  • Champion observability across services: logs, metrics, and traces.

  • Contribute to infrastructure architecture, automation, and reliability roadmaps.

  • Advocate for SRE best practices across teams and functions.

Requirements

  • 4-12 years of experience in SRE, DevOps, or Production Engineering roles.

  • Coding Proficiency: Develop clear, efficient, and well-structured code.

  • Linux Expertise: In-depth knowledge of Linux for system administration and advanced troubleshooting.

  • Containerization & Orchestration: Practical experience with Docker and Kubernetes for application deployment and management.

  • CI/CD Management: Design, implement, and maintain Continuous Integration and Continuous Delivery pipelines.

  • Security & Compliance: Understand security best practices and compliance in infrastructure.

  • High Availability & Scalability: Design and implement highly available, scalable, and resilient distributed systems.

  • Infrastructure as Code (IaC) & Automation: Proficient in IaC tools and automating infrastructure provisioning and management.

  • Disaster Recovery (DR) & High Availability (HA): Deep knowledge and practical experience with various DR and HA strategies.

  • Observability: Implement and utilize monitoring, logging, and tracing tools for system health.

  • System Design (Distributed Systems): Design complex distributed systems with a focus on reliability and operations.

  • Problem-Solving & Troubleshooting: Excellent analytical and diagnostic skills for resolving complex system issues.

Qualifications

Technical Skills & Experience

  • Extensive hands-on experience (4-12 years) with relational databases (e.g., MySQL, PostgreSQL, SQL Server) and distributed NoSQL systems (e.g., Cassandra, MongoDB, DynamoDB).

  • Proven track record of designing and operating databases in large-scale cloud-native environments (AWS, GCP, Azure).

  • Strong programming skills in Python, Go, or Bash for building infrastructure tooling and automation frameworks.

  • Expertise with Infrastructure as Code (Terraform, Helm, Ansible) and Kubernetes for managing production database systems.

  • Deep knowledge of database replication, clustering, backup/restore, and failover techniques.

  • Advanced experience with observability tooling (Prometheus, Grafana, Datadog, New Relic) for monitoring distributed databases.

  • Strong communication skills and ability to influence across teams and levels.

  • Degree in Computer Science, Engineering, or related field.

  • Experience building and scaling services in production with high uptime targets (99.99%+).

  • Clear track record of reducing incident frequency and improving response metrics (MTTD/MTTR).

  • Strong communicator who thrives in high-pressure environments.

  • Passionate about automation, chaos engineering, and making things just work.

Job Classification

Industry: Software Product
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time

Contact Details:

Company: Freshworks
Location(s): Chennai

Key skills: Automation, Compliance, Linux, Coding, PostgreSQL, MySQL, Troubleshooting, SQL, Python, System administration

₹ Not Disclosed

freshworks

Freshworks makes it fast and easy for businesses to delight their customers and employees. We do this by taking a fresh approach to building and delivering software that is affordable, quick to implement, and designed for the end-user. More than 50,000 companies -- from startups to public companies ...