Senior Software Engineer - Platform Engineering & SRE @ Equinix

Job Description

Job Summary
We are looking for a highly skilled and motivated Platform Engineering SRE to join our team. As a Platform Engineering SRE, you will play a critical role in developing, maintaining and improving the reliability, scalability, and performance of our systems, ensuring seamless user experiences. This position blends software engineering and systems engineering expertise to create automated solutions for operational challenges.
Responsibilities
Reliability and Performance
  • Ensure the high availability, reliability, and performance of production systems and services
  • Implement and maintain disaster recovery plans and procedures
  • Monitor and manage system health using metrics, logs, and tracing to proactively identify and resolve issues
Automation and Infrastructure
  • Automate repetitive tasks, including deployment, scaling, monitoring, and remediation of systems
  • Build and maintain infrastructure as code (IaC) using tools like Terraform, CloudFormation, or similar
Incident Management
  • Participate in incident response and troubleshooting efforts to minimize downtime and resolve issues quickly
  • Conduct root cause analysis for system failures and implement preventive measures to avoid recurrence
  • Maintain incident response playbooks and ensure efficient on-call rotations
Observability and Monitoring
  • Design and implement monitoring solutions using tools like Prometheus, Grafana, Datadog, or similar
  • Define and track SLIs, SLOs, and SLAs to measure and improve system performance
Collaboration
  • Work closely with development, QA, and operations teams to ensure smooth delivery of applications
  • Act as a bridge between software engineering and operations, advocating for DevOps best practices
  • Document system configurations, processes, and procedures to ensure knowledge sharing and maintain system integrity
Capacity and Scalability
  • Conduct capacity planning and optimize system scalability to meet future demands
  • Implement strategies for horizontal and vertical scaling of applications
Security and Compliance
  • Ensure infrastructure security by implementing best practices and addressing vulnerabilities
  • Collaborate with the security team to meet compliance standards and audits
Data Engineering Automation
  • Design, develop, and maintain scalable and efficient data pipelines
  • Automate data workflows for ETL/ELT processes, integrating data from various sources into data warehouses and other storage solutions
  • Develop and maintain solutions for data transformation and data modelling, and automate the orchestration of data processing
Data Warehouse Management
  • Design, implement, and maintain modern data warehouse architectures, ensuring effective data storage, retrieval, and accessibility
  • Work with cloud-based data warehouses (e.g., BigQuery, Snowflake, Redshift) and optimize data models for analytics and reporting
  • Develop and manage dimensional models, star/snowflake schemas, and data marts for operational and analytical use cases
Real-time and Batch Data Processing
  • Build and manage real-time and batch data pipelines for high-volume data ingestion, processing, and analytics
  • Leverage technologies such as Apache Kafka, Apache Beam, Apache Spark, and Google Cloud Dataflow for streaming and batch processing
Qualifications
Experience
  • 8+ years of experience on a data platform in a Site Reliability Engineering, DevOps, or Systems Engineering role
Technical Skills
  • Strong programming skills in languages such as Python, Java, or similar
  • Experience developing data ingestion pipelines, data governance, data quality, and automation
  • Proficiency in cloud platforms such as Google Cloud (mandatory), AWS, or Azure
  • Experience in leveraging AI/ML models to enhance efficiency in data platforms and improve monitoring capabilities
  • Hands-on experience with CI/CD pipelines using tools like GitHub Actions or Jenkins
  • Expertise in containerization and orchestration technologies like Docker and Kubernetes
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK Stack)
Methodologies
  • Knowledge of Software Engineering, Data Modelling and SDLC
  • Deep understanding of SRE principles, including SLIs, SLOs, and error budgets
  • Knowledge of incident management frameworks and root cause analysis techniques
Soft Skills
  • Strong analytical and problem-solving skills
  • Excellent communication and collaboration abilities
Preferred Qualifications
  • Familiarity with configuration management tools (e.g., Ansible, Puppet, Chef)
  • Background in performance testing and load testing
  • Knowledge of networking concepts and protocols (e.g., TCP/IP, DNS)
Tools and Technologies
  • Google Cloud Platform
  • Python, Java, SQL
  • Apache Beam/Spark/Google Cloud Dataflow
  • Apache Airflow
  • Prometheus, Grafana, ELK Stack, Terraform, Ansible, Puppet, GitHub Actions, Kafka, Docker, Kubernetes

Job Classification

Industry: Internet
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Software Development - Other
Employment Type: Full time

Contact Details:

Company: Equinix
Location(s): Bengaluru

Key skills: Automation, Networking, Analytical, DNS, Data processing, Incident management, Troubleshooting, Operations, Analytics, SQL

₹ Not Disclosed
