"Site Reliability Engineer (SRE) - Cloud Infrastructure & Data:
- Ensure reliable, scalable, and secure cloud-based data infrastructure.
- Design, implement, and maintain AWS infrastructure with a focus on data products.
- Automate infrastructure management using Pulumi, Terraform, and policy as code.
- Monitor system health, optimize performance, and manage Kubernetes (EKS) clusters.
- Implement security measures, ensure compliance, and mitigate risks.
- Collaborate with development teams on deployment and operation of data applications.
- Optimize data pipelines for efficiency and cost-effectiveness.
- Troubleshoot issues, participate in incident response, and drive continuous improvement.
- Experience with Kubernetes administration, data pipelines, and monitoring and observability tools.
- In-depth coding and debugging skills in Python and Unix shell scripting.
- Excellent communication and problem-solving skills.
- Self-driven and highly motivated, with the ability to work both independently and within a team.
- Operate effectively in a fast-paced development environment with dynamic changes, tight deadlines, and limited resources."
Key skills: AWS DevOps, AWS Infrastructure