Job Description
We are seeking a Principal Site Reliability Developer (IC4) to join Oracle Cloud Infrastructure (OCI). This role blends software engineering expertise with site reliability engineering (SRE) principles, ensuring our large-scale distributed systems are reliable, observable, and efficient. As a senior technical leader, you will design and implement solutions that improve service availability, scalability, and performance, while mentoring others and driving best practices across teams.
Responsibilities
Design, develop, and deploy software to improve the reliability, scalability, and efficiency of Oracle Cloud Infrastructure (OCI).
Build automation frameworks to eliminate manual toil and prevent recurring issues.
Implement observability practices including metrics, logging, and tracing to ensure system health and proactive monitoring.
Lead deployments, capacity planning, and demand forecasting to support large-scale distributed systems.
Conduct performance analysis, system tuning, and incident response to maintain service excellence.
Influence architecture and standards for distributed systems, driving best practices across teams.
Provide technical leadership and mentorship to engineers, fostering a culture of reliability and innovation.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time
Contact Details:
Company: Oracle
Location(s): Hyderabad
Keyskills:
Build automation
Demand forecasting
Service excellence
Technical leadership
Infrastructure
Performance analysis
Oracle
Distributed systems
Monitoring
Capacity planning