Job Description
About Us (Ensono)
We care about your success, offering comprehensive strategic and managed services for mission-critical applications. Our Advisory and Consulting services can help upfront with an application strategy or find the right places for your applications, whether it's public, multi-, or hybrid cloud, or mainframe. And because we span all mission-critical platforms, we can meet you wherever you are in your digital transformation journey, with 24/7 support when you need it. We are your relentless ally, flexing with you when challenges emerge so you don't feel stuck in place. With cross-platform certifications and decades of experience, our technology experts have become an extension of your team, so you're continuously innovating, doing more with less while remaining secure. And that's just the beginning.
Role Summary
The L3 ARO Engineer ensures end-to-end reliability, resilience, and performance of critical applications. This role acts as the final technical escalation point, leads major incidents, performs deep diagnostics (especially for Java and .NET-based systems), drives permanent fixes, and influences architecture, automation, and operational standards. The engineer mentors L1/L2 teams and partners closely with Development, Architecture, Platform, and Security.
Key Responsibilities
Incident & Problem Management
- Lead major incident (MI) bridges and restore service with minimum business impact.
- Handle all L3 escalations; perform deep diagnostics across Java, JVM, middleware, OS, and infrastructure.
- Own technical RCAs; drive long-term, systemic remediation.
- Identify recurring failure patterns and risks.
Reliability Engineering
- Apply SRE principles: SLIs/SLOs, error budgets, resilience patterns.
- Tune JVM parameters, analyze thread/heap dumps, and improve performance.
- Influence application architecture for fault tolerance, scalability, and recoverability.
- Validate DR readiness, failover behavior, and resilience testing outcomes.
Change, Release & Risk
- Provide technical approval and risk assessment for high-risk changes.
- Enforce operational readiness for new apps and major releases.
- Ensure changes meet audit, compliance, and regulatory expectations.
Automation, Monitoring & Observability
- Build advanced automation using Shell/Python/PowerShell.
- Develop frameworks for health validation, automated recovery, and compliance checks.
- Define observability standards; optimize alerts and improve MTTR.
Leadership & Mentorship
- Mentor L1/L2 teams; review and approve runbooks, SOPs, and KB articles.
- Act as a trusted technical advisor to stakeholders and leadership.
Skills & Qualifications
Technical (Mandatory)
- Strong knowledge of application architecture, distributed systems, and middleware.
- Java expertise: JVM internals, GC, memory management, thread/heap dump analysis, performance tuning.
- .NET expertise: CLR internals, garbage collection, memory management, thread/dump analysis, and application performance tuning.
- Strong Unix/Linux skills, networking basics, and advanced scripting (Shell/Python/PowerShell/VBS).
- Advanced SQL and understanding of databases; AutoSys (or equivalent scheduler).
- Hands-on experience with observability tools: Splunk, AppDynamics/Dynatrace, ELK, Grafana, Prometheus.
Reliability & Operations
- Major incident leadership, deep RCA, change/release readiness, DR & resilience engineering.
- Experience in regulated production environments.
Soft Skills
- Strong technical leadership and decision-making.
- Clear communication during high-pressure incidents.
- Ownership mindset and business awareness.
Experience & Education
- 7-12+ years in Application Reliability, Production Support, SRE, or platform operations.
- Bachelor's degree in Computer Science/Engineering or equivalent.
- ITIL, cloud, or industry certifications (preferred).
- Banking/financial domain experience (preferred).
Working Conditions
- On-call and after-hours support as required.
- Fast-paced environment with multiple priorities.
- Hybrid working model.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Technical Lead
Employment Type: Full time
Contact Details:
Company: Ensono
Location(s): Hyderabad
Keyskills:
Unix
Performance tuning
Automation
Linux
Networking
Production support
Microsoft
Middleware
SQL
Python