The Red Hat OpenShift Cluster Infrastructure team is looking for a Senior Software Engineer to join us in India. Our team plays a pivotal role in bridging OpenShift's capabilities with major infrastructure service providers like Azure, AWS, GCP, and vSphere. We are responsible for core components such as the Machine API (MAPI) and Cluster API (CAPI), which manage the OpenShift lifecycle and its integration with cloud providers. We lead the development of MAPI and CAPI providers that enable customers to declaratively provision and operate clusters. Additionally, we own the Cloud Controller Manager (CCM), which ensures seamless integration with native cloud services like load balancers and networking. For this role, we are seeking a passionate engineer to focus on ensuring the exceptional quality and reliability of these critical integrations. You will apply your software engineering expertise to design, build, and scale sophisticated automation frameworks and quality-focused tooling that are vital to our success.
What will you do?
Design, build, and maintain scalable and robust test automation frameworks in Golang to validate the functionality, performance, and reliability of OpenShift's cloud integrations
Develop and integrate new automated tests into our Prow-based Continuous Integration (CI) system, analyzing results and improving signal reliability
Ensure the ongoing stability and security of components like the Machine API (MAPI) by addressing high-priority bugs, handling CVEs, and performing necessary backports and rebases
Collaborate closely with the development team to understand new features, identify potential quality gaps, and advocate for highly testable software design
Engage in deep troubleshooting and root cause analysis of complex issues found across the platform
Contribute to the core product codebase, with a focus on improving system testability, reliability, and overall quality
Mentor other engineers on automation best practices and help foster a culture of quality throughout the team
Participate in and contribute to relevant upstream open-source communities
What will you bring?
5+ years of professional experience in software engineering
Proficiency in at least one object-oriented or procedural programming language, preferably Golang (or C, C++, Python with a strong willingness to learn Go)
Demonstrable experience in designing and developing test automation frameworks or related tooling
Solid understanding of Linux/UNIX operating systems
Experience with container technologies (e.g., Docker, Podman) and a strong understanding of Kubernetes architecture and concepts
Excellent problem-solving, analytical, and troubleshooting skills
Proven ability to work collaboratively and communicate effectively in a distributed, global team environment
The following are considered a plus:
Experience with major cloud providers like Amazon Web Services (AWS), Google Cloud Platform (GCP), Microsoft Azure, or VMware vSphere
Familiarity with CI/CD systems, especially Prow or Jenkins
Contributions to relevant open-source projects
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time