Job Description:
Cloud Infrastructure Management: Manage cloud infrastructure (e.g., AWS, Azure, GCP) and optimize resource usage. Implement cost-saving measures while maintaining scalability and reliability.
Configuration Management: Manage configuration management tools (e.g., Ansible, Puppet, Chef) to ensure consistency across environments. Automate configuration changes and updates.
Security and Compliance: Own security policies, implement best practices, and ensure compliance with industry standards. Lead efforts to secure infrastructure and applications, including patch management and access controls.
Collaboration with Development and Operations Teams: Foster collaboration between development and operations teams, promoting a DevOps culture. Be the go-to person for resolving cross-functional infrastructure issues and improving the development process.
Disaster Recovery and Business Continuity: Develop and maintain disaster recovery plans and procedures. Ensure business continuity in the event of system failures or other disruptions.
Documentation and Knowledge Sharing: Create and maintain comprehensive documentation for configurations, processes, and best practices. Share knowledge and mentor junior team members.
Technical Leadership and Innovation: Stay up to date with industry trends and emerging technologies. Lead efforts to introduce new tools and technologies that enhance DevOps practices.
Problem Resolution and Troubleshooting: Diagnose and resolve complex issues related to infrastructure and deployments. Implement preventive measures to reduce recurring problems.
Performance Optimization: Continuously improve system performance and resource utilization. Conduct capacity planning and scalability assessments.
Incident Response: Lead incident response activities, including root cause analysis and remediation. Be available for on-call support as needed.

Key Skills: Patch management, Cloud computing, Version control, Infrastructure management, DevOps, Configuration management, Disaster recovery, System design, Troubleshooting, Capacity planning