Manager, Site Reliability Engineering (Cortex XDR XSIAM) @ Palo Alto Networks

Palo Alto Networks
12 - 17 years
Bengaluru
1 month ago

Job Description

Your Career

We're seeking an experienced, hands-on Cloud SRE manager to lead high-severity incident and problem management across our GCP-centric platforms. This role combines deep technical troubleshooting with process ownership, ensuring rapid recovery, root cause elimination, and long-term reliability improvements. You will own L3 on-call responsibilities, drive post-incident learning, and champion automation and operational excellence.

More information about the Cortex product can be found

Your Impact

In your technical and leadership capacity, you will contribute to seamless production site reliability operations, partnering closely with regional and global SRE counterparts, with special attention to the following:
Incident Analysis & Problem Management: Implement and lead post-mortem processes within SLAs, identify root causes, and drive corrective actions to reduce repeat incidents. Establish and maintain a problem backlog, ensuring timely resolution and continuous process improvement.
Troubleshooting: Rapidly diagnose and resolve failures across Kubernetes, Terraform, and GCP using advanced troubleshooting frameworks.
Preventative Measures: Implement automation and enhanced monitoring to proactively detect issues and reduce incident frequency.
Stakeholder Communication: Work with GCP/AWS TAMs and other vendors to request new features and follow up on updates.
Mentorship: Coach and elevate SRE and DevOps teams, promoting best practices in reliability and incident/problem management.
Documentation: Establish and maintain a problem backlog, ensuring timely resolution and continuous process improvement.
Envision the future of SRE with AI/ML: Ability to envision how a modern SRE team should operate leveraging AI/ML.

Qualifications

Your Experience

12+ years of experience in SRE/DevOps/Infrastructure roles, with a strong foundation in GCP cloud-based environments.
5+ years of proven experience managing SRE/DevOps teams, preferably with a strong focus on Google Cloud Platform (GCP).
Deep hands-on knowledge of Terraform, Kubernetes (GKE), GitLab CI/CD, and modern observability practices (e.g., Prometheus, OpenTelemetry).
Strong knowledge of data platforms such as BigQuery, Cassandra, Kafka, PostgreSQL, and MySQL is mandatory.
Strong experience in managing incident response and postmortems, reducing MTTR, and driving proactive reliability improvements.
Proficiency with cloud platforms such as GCP & AWS.
Solid grasp of Infrastructure as Code, container orchestration, and scalable cloud architectures.
Track record of building tools for system reliability, automated remediation, and performance tuning.
Experience leveraging AI/ML-based operations tools for automation, anomaly detection, and predictive alerting is a plus.
Expertise in SLI/SLO/SLA design and implementation, and driving operational maturity through data.
Strong interpersonal and leadership skills, with a demonstrated ability to coach, mentor, and inspire teams.
Effective communicator, capable of translating complex technical concepts to non-technical stakeholders.
Committed to inclusion, collaboration, and creating a culture where every voice is heard and respected.

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Engineering Manager
Employment Type: Full time

Contact Details:

Company: Palo Alto Networks
Location(s): Bengaluru

Keyskills: site reliability engineering gke kubernetes google cloud platform sre ai site reliability sla alerting postgresql reliability engineering cassandra gcp devops kafka mysql gitlab terraform prometheus aws infrastructure as code

₹ Not Disclosed

Palo Alto Networks (India) Technologies Pvt. Ltd
