You will develop software/tools and provide hands-on technical expertise to design, deploy, and optimize Cloud services
You will build automation using industry-standard tools such as Chef, Jenkins, Terraform, and Spinnaker to deploy services
Coordinate release cycles for services, deploy code, integrate with CI/CD tools, and monitor updates
Develop plans to improve the security and availability of the services
Identify single points of failure and other high-risk architecture issues; propose and implement more resilient solutions
Identify system bottlenecks and recommend solutions to the availability issues they cause
Participate in the on-call rotation, drive any issues found to resolution, and contribute to postmortems
Proactively work on improving efficiency by setting clear requirements and optimizing system resource usage
Evangelize SRE principles and guide development teams to build reliable services
You will build automation and tools that will increase the productivity of teams
Qualifications:
You have proven ability as an SRE in Cloud engineering
You have experience in automation and tool development
You have at least 3 years of experience building Cloud services and distributed systems: deployment, monitoring, scaling, and debugging
You are proficient in multi-cloud environments: AWS, Azure
You have experience writing applications using Go, Python, or JavaScript
You have experience scaling high-efficiency services to their limits
You have designed resilient solutions to ensure reliability
You enjoy working with a large variety of services and technologies
You have provided detailed reporting and analysis through metrics and logs
You have experience with container technologies: Kubernetes, Docker
Experience with New Relic, Splunk, or Prometheus is a plus
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Search Engineer
Employment Type: Full time