Senior Site Reliability Engineer / Principal Site Reliability Engineer @ Highradius

Highradius
4 - 9 years
Hyderabad
11 months ago

Job Description

Job Summary:

We are looking for a highly skilled and adaptable Senior Site Reliability Engineer / Principal Site Reliability Engineer to become a key member of our Cloud Engineering team. In this crucial role, you will be instrumental in designing and refining our cloud infrastructure with a strong focus on reliability, security, and scalability. As an SRE, you'll apply software engineering principles to solve operational challenges, ensuring the overall operational resilience and continuous stability of our systems. This position requires a blend of managing live production environments and contributing to engineering efforts such as automation and system improvements.

Responsibilities

Cloud Infrastructure Architecture and Management: Design, build, and maintain resilient cloud infrastructure solutions to support the development and deployment of scalable and reliable applications. This includes managing and optimizing cloud platforms for high availability, performance, and cost efficiency.
Enhancing Service Reliability: Lead reliability best practices by establishing and managing monitoring and alerting systems to proactively detect and respond to anomalies and performance issues. Utilize SLI, SLO, and SLA concepts to measure and improve reliability. Identify and resolve potential bottlenecks and areas for enhancement.
Driving Automation and Efficiency: Contribute to the automation, provisioning, and standardization of infrastructure resources and system configurations. Identify and implement automation for repetitive tasks to significantly reduce operational overhead. Develop Standard Operating Procedures (SOPs) and automate workflows using tools like Rundeck or Jenkins.
Incident Response and Resolution: Participate in and help resolve major incidents, conduct thorough root cause analyses, and implement permanent solutions. Effectively manage incidents within the production environment using a systematic problem-solving approach.
Collaboration and Innovation: Work closely with diverse stakeholders and cross-functional teams, including software engineers, to integrate cloud solutions, gather requirements, and execute Proof of Concepts (POCs). Foster strong collaboration and communication. Guide designs and processes with a focus on resilience and minimizing manual effort. Promote the adoption of common tooling and components, and implement software and tools to enhance resilience and automate operations. Be open to adopting new tools and approaches as needed.

Requirements

Experience: 4 to 13 Years

Role: We have multiple roles; the final role will depend on the candidate's experience and credentials.

Education: BE/B.Tech/MCA/M.Sc./M.Tech/M.S.

Technology Stack: Linux Administration, Shell / Python Scripting, AWS Cloud Services (EC2, S3), Cloud Operations, Linux (CentOS, Rocky Linux), Jenkins, ArgoCD, Kubernetes Management, Ansible, Terraform, OS Patching, Release Management, Incident Management

Infrastructure Management: Proven proficiency in on-premises hosting and virtualization platforms (VMware, Hyper-V, or KVM). Solid understanding of storage internals (NAS, SAN, EFS, NFS) and protocols (FTP, SFTP, SMTP, NTP, DNS, DHCP). Experience with networking and firewall technologies. Strong hands-on experience with Linux internals and operating systems (RHEL, CentOS, Rocky Linux). Experience with Windows operating systems to support varied environments.
Service Reliability Concepts: Good understanding of SLI, SLO, SLA, and error budgeting.

Other Mandatory Requirements: 1) Excellent communication skills 2) Availability for 24/7 support in monthly rotating shifts

Job Classification

Industry: Software Product
Functional Area / Department: IT & Information Security
Role Category: IT Infrastructure Services
Role: IT Infrastructure Services - Other
Employment Type: Full time

Contact Details:

Company: Highradius
Location(s): Hyderabad

Keyskills: Linux Administration, Change Management, VMware, Cloud Operations, Cloud Infra, AWS Cloud Services, Cloud Infrastructure, Release Management, Jenkins, Terraform, Kubernetes Administration, Ansible, Infrastructure Management, OS Patching, Incident Management, ArgoCD, Python

This job posting is old and may have expired.

₹ Not Disclosed
