Job Description
Welcome to Veradigm! Our Mission is to be the most trusted provider of innovative solutions that empower all stakeholders across the healthcare continuum to deliver world-class outcomes. Our Vision is a Connected Community of Health that spans continents and borders. With the largest community of clients in healthcare, Veradigm is able to deliver an integrated platform of clinical, financial, connectivity and information solutions to facilitate enhanced collaboration and exchange of critical patient information.
Site Reliability Engineer
We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a crucial role in ensuring the reliability, performance, and availability of our systems and services. Your primary focus will be on managing and resolving incidents in real time, which requires in-depth knowledge of Azure and AWS cloud services. The ideal candidate for this position will have a passion for diagnosing and troubleshooting complex problems and will be self-driven and highly communicative.
Responsibilities:
Serve as an on-call engineer, responsible for managing and resolving incidents that affect the availability and performance of our systems.
Collaborate with development, operations, and infrastructure teams to design, implement, and maintain robust, scalable, and reliable systems.
Proactively monitor and analyze system metrics to identify potential issues and take necessary actions to prevent or mitigate them.
Conduct thorough root cause analysis of incidents, identifying underlying issues and implementing long-term solutions to prevent recurrence.
Automate manual processes and tasks to improve efficiency and reduce human error.
Participate in capacity planning and performance optimization efforts to ensure system scalability and reliability.
Stay updated with the latest industry trends and emerging technologies related to cloud services and site reliability engineering.
Requirements:
Bachelor's degree in computer science, engineering, or a related field (or equivalent work experience).
8+ years of experience in development, operations, and infrastructure.
Extensive experience in incident management and on-call support, preferably in a high-availability production environment.
Strong knowledge of cloud services, particularly Azure and AWS, including virtual machines, networking, storage, and load balancing.
Proficiency in scripting and automation using languages such as Python, Bash, or PowerShell.
Excellent troubleshooting and problem-solving skills, with keen attention to detail.
Self-driven and motivated, with the ability to work independently and prioritize tasks effectively.
Strong communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams.
Familiarity with DevOps practices and tools, such as CI/CD pipelines and infrastructure-as-code.
Experience with monitoring and logging tools, such as Prometheus, Grafana, the ELK stack, or similar.
Certifications in Azure or AWS are a plus.
Join our team and be a part of an exciting journey to build and maintain highly reliable and scalable systems. As an SRE, you will have the opportunity to make a significant impact while continuously learning and growing in a challenging and dynamic environment. Apply now to become a key contributor to our success!
We are an Equal Opportunity Employer. No job applicant or employee shall receive less favorable treatment or be disadvantaged because of their gender, marital or family status, color, race, ethnic origin, religion, disability or age; nor be subject to less favorable treatment or be disadvantaged on any other basis prohibited by applicable law. Veradigm is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse and inclusive workforce.
Thank you for reviewing this opportunity!
Does this look like a great match for your skill set? If so, please scroll down and tell us more about yourself.
Employment Category:
Employment Type: Full time
Industry: IT Services & Consulting
Role Category: Customer Service (International)
Functional Area: Not Applicable
Role/Responsibilities: Expert Software Engineer (SRE - AWS/Azure)
Key skills:
cloud services
Azure
AWS
incident management
virtual machines
networking
storage
load balancing
scripting
Python
Bash
PowerShell
troubleshooting
DevOps
monitoring
logging
Site Reliability Engineer
on-call support
problem-solving
CI/CD pipelines
infrastructure-as-code
Prometheus
Grafana
ELK stack
certifications