Job Description
We are seeking an experienced and hands-on Director of Site Reliability Engineering (SRE) to lead and scale our global SRE team. In this pivotal role, you will architect, build, and maintain the critical infrastructure that powers all Saviynt products. This includes driving reliability, scalability, and security across our cloud services while fostering strong partnerships with product, engineering, and security teams. Your leadership will directly impact the performance and resilience of our IGA, External, PAM, AAG, ISPM, and NHI product portfolios.
Key Responsibilities:
Lead and mentor a high-performing team of Site Reliability Engineers, fostering a culture of operational excellence, continuous learning, and innovation.
Design and implement scalable, highly available, and secure infrastructure solutions in public cloud environments (AWS preferred).
Collaborate with engineering, product, and security teams to develop and enforce SRE best practices, tools, and frameworks that support efficient development and deployment.
Drive automation across all infrastructure processes, including deployment, monitoring, incident response, and capacity planning.
Develop and refine incident management protocols, leading post-incident reviews to ensure continuous improvement.
Define and track key performance indicators (KPIs) for system reliability, availability, and performance.
Implement robust security practices, ensuring compliance with industry standards such as FedRAMP and other certifications.
Influence and contribute to architectural decisions to improve system design and resilience.
What We're Looking For:
Proven leadership experience in SRE, DevOps, or Production Engineering roles, with at least 5 years managing operations teams in fast-paced, high-growth environments.
Deep, hands-on technical expertise in distributed systems, cloud infrastructure (especially AWS), and performance optimization.
Strong background in automation, CI/CD pipelines, configuration management, and infrastructure as code (IaC) tools.
Experience with monitoring, alerting, and incident management tools, coupled with a proactive approach to system health and performance.
Solid understanding of disaster recovery strategies, backup solutions, and business continuity planning.
Proficiency in cloud security principles, including authentication, encryption, anomaly detection, and risk mitigation strategies.
A track record of building and scaling reliable, secure, and cost-effective hybrid cloud infrastructures.
Excellent communication and collaboration skills, with the ability to influence stakeholders across technical and non-technical teams.
Preferred Qualifications:
Experience with compliance certifications such as FedRAMP, SOC 2, and ISO 27001.
Strong problem-solving skills, with the ability to make data-driven decisions in high-pressure situations.
Passion for mentoring and developing talent within technical teams.
Job Classification
Industry: Software Product
Functional Area / Department: Strategic & Top Management
Role Category: Top Management
Role: CTO
Employment Type: Full time
Contact Details:
Company: Saviynt
Location(s): Bengaluru
Keyskills:
Automation
Information security
Configuration management
SOC
Disaster recovery
ISO 27001
Incident management
System design
Distributed systems
Monitoring