In this role you will be responsible for the administration, maintenance, and enhancement of our observability platforms, ensuring optimal performance and availability for our critical security and business operations. You will:
Participate in observability architecture design, support, and platform management
Gather and analyze metrics from operating systems and applications to give development teams observability insights
Manage users and roles, monitor platform performance, and ensure security and high availability
Automate operational toil for observability-focused administrative tasks
Build and support automation for legal and compliance requirements
Support end users with training and technical guidance on observability tools and capabilities
Maintain accurate documentation of configurations, workflows, and procedures
Manage data ingestion and parsing to ensure data integrity and availability
Design and manage dashboards, reports, alerts, and visualizations
Implement strategies to increase observability system reliability and performance through on-call rotation and process optimization
Use observability tools to diagnose application and infrastructure issues and incidents
Do you have the right ingredients? (Requirements)
Polyglot technologist/generalist with a thirst for learning
Understanding of cloud and microservice architecture
Experience with tools and technologies such as APM, RUM, Synthetics, Splunk, OpenTelemetry (OTel), log pipelines, SIEM, and Terraform
Automation/scripting experience with Go, Python, or similar languages
Splunk power user/administrator experience preferred
At least 2 years of industry experience in observability, with a focus on SRE or observability platform management
Job Classification
Industry: Software Product
Functional Area / Department: Engineering - Hardware & Networks
Role Category: IT Network
Role: System Administrator / Engineer
Employment Type: Full time