We are seeking a Senior Site Reliability Engineer (SRE) with a strong background in software engineering and database engineering to join our growing database team. This role is ideal for engineers who are passionate about building scalable systems, automating operational processes, and ensuring system reliability, availability, and performance across complex distributed services.
As a Senior SRE, you will play a critical role in designing and implementing database infrastructure solutions that support our production environments, improve deployment pipelines, and enable seamless application delivery. Your unique blend of development expertise and database acumen will be key to driving initiatives across reliability, observability, and performance tuning.
What Your Responsibilities Will Be
Design, build, and maintain scalable and resilient database infrastructure for mission-critical systems.
Develop and maintain CI/CD pipelines and automated operational processes using tools such as GitLab and Terraform.
Implement observability best practices including logging, monitoring, tracing, and alerting (e.g., Prometheus, Grafana, Loki).
Collaborate with development teams to ensure system designs are scalable, maintainable, and secure.
Manage and optimize relational and non-relational databases and data platforms (e.g., PostgreSQL, MySQL, MongoDB, Snowflake, and Kafka) with a focus on high availability and performance tuning.
Lead root cause analysis and postmortems for major incidents; drive long-term reliability improvements and contribute to internal tooling, automation frameworks, and infrastructure-as-code.
Apply a good working knowledge of frontend (Angular 14 and Bootstrap 5) and backend (Python Flask) development, along with general skills in UI/UX design principles and version control.
Participate in database on-call rotations to respond to system incidents, ensure uptime SLAs are met, and promote DB SRE best practices across teams.
What You'll Need to Be Successful
8+ years of experience in software engineering, DevOps, or DB SRE roles.
Proven programming experience with Angular 14, Bootstrap 5, and Python Flask.
Strong experience in database engineering: schema design, query optimization, replication, backup/restore strategies, and other database administration tasks.
Expertise with containerization (Docker) and orchestration platforms (Kubernetes).
Deep understanding of distributed systems, networking, and cloud-native architectures (AWS and GCP).
Familiarity with security practices related to infrastructure and data handling.
Experience with infrastructure-as-code tools (e.g., Terraform).
Strong analytical and troubleshooting skills in complex production database environments, and the ability to work independently.