AI Platform - SRE & DevOps (ML Framework)

Work timings: 2:30 PM to 11:00 PM

Required Skills:
Demonstrated ability to design, build, refactor, and release software written in Python.
Hands-on experience with ML frameworks such as PyTorch, TensorFlow, and Triton.
Ability to handle framework-related issues, version upgrades, and compatibility with data processing and model training environments.
Experience with AI/ML model training and inference platforms is a big plus.
Experience with LLM fine-tuning systems is a big plus.
Strong debugging and triaging skills.
Cloud technologies such as Kubernetes and Docker, plus Linux fundamentals.
Familiarity with DevOps practices and continuous testing.
DevOps pipelines and automation: application deployment, configuration, and performance monitoring.
Test automation and Jenkins CI/CD.
Excellent communication, presentation, and leadership skills, with the ability to collaborate with partners, customers, and engineering teams.
Well organized and able to manage multiple projects in a fast-paced, demanding environment.
Good spoken and written English, with strong reading comprehension.

Keyskills: CD, continuous integration, Kubernetes, Python, Triton, technical, digital manufacturing, presentation skills, artificial intelligence, Docker, analytics, TensorFlow, system, framework, DevOps, leadership, design, Linux, leadership skills, PyTorch, Jenkins, debugging, ML, communication skills