We are seeking a highly skilled Staff Service Reliability Engineer to lead the deployment and optimization of Splunk for monitoring non-production engineering research and development equipment and workloads. This individual contributor role requires a strategic thinker with extensive experience in diverse IT environments and a comprehensive understanding of various hardware and operating systems, including testing equipment.
Your Impact
Splunk Deployment: Lead the deployment and ongoing management of Splunk to monitor non-production research and development resources in labs.
Dashboard Optimization: Design and optimize dashboards to enhance data visualization and provide actionable insights.
Data Feed Integration: Manage the integration of multiple data feeds into Splunk to ensure effective monitoring and analysis.
AI Utilization: Leverage AI technologies to analyze alerting data, identify trends, and generate meaningful insights for operational improvements.
Best Practice Implementation: Implement industry best practices in monitoring and observability to enhance system reliability and performance.
Collaborative Engagement: Engage with cross-functional teams to align monitoring solutions with organizational objectives. Lead or participate in global cross-functional meetings and discussions. Flexible working hours outside the India Standard Time (IST) workday may be required.
Documentation: Develop and maintain comprehensive documentation of monitoring processes and system configurations.
Minimum Qualifications
Experience: B.E./M.Tech with 14+ years of experience in managing IT systems across a range of environments with diverse hardware and operating systems.
Splunk Expertise: Proven expertise in Splunk administration, including dashboard creation and data integration.
Problem Solving: Strong problem-solving skills with the ability to address complex technical challenges.
Communication: Excellent communication skills to effectively convey technical concepts to various stakeholders.
Team Collaboration: Ability to work collaboratively with team members and contribute to a positive team dynamic.
Preferred Qualifications
B.E./M.Tech in Computer Science or Information Technology with 14-17 years of hands-on experience.
Proficiency in Python with hands-on experience in developing machine learning, deep learning, and data science solutions.
Demonstrated expertise in Python scripting for developing, automating, and integrating scalable solutions.
Proven proficiency in Java with a solid foundation in object-oriented design and application development.
Extensive experience in implementing and tuning machine learning algorithms to solve complex problems.
Ability to integrate machine learning models within both Python and Java-based applications for seamless functionality.
Strong analytical skills with a proven track record of leveraging programming and ML techniques to drive data-driven decision-making.
Demonstrated proficiency in SQL for data manipulation and extraction, with a focus on optimizing complex queries for performance.
Extensive experience in designing, maintaining, and scaling relational databases to support enterprise applications and machine learning workflows.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Quality Assurance and Testing
Role: Blockchain Quality Assurance Engineer
Employment Type: Full time