Senior Architect, ML Engineering @ Icertis


Job Description

We are looking for a Senior Architect, Machine Learning to define and lead the architecture for enterprise-grade Generative AI and Agentic AI systems. This is a senior, hands-on architecture role focused on building reliable, scalable, secure, and cost-efficient AI platforms, covering RAG, agent orchestration, inference infrastructure, evaluation/guardrails, and production operations across multiple tenants. You will work at the intersection of research innovation and engineering reliability: enabling rapid experimentation while ensuring the system runs 24/7 with strong SLOs, governance, and predictable cost.
Responsibilities
  • Architecture & Technical Leadership: Own the end-to-end architecture for RAG + agentic workflows (Plan/Execute/Verify) across enterprise use cases (contracts, PDFs, knowledge bases).
  • Define architecture standards for multi-tenant isolation, API design, service boundaries, and integration patterns.
  • Lead technical decision-making: build vs buy, model strategy (hosted vs open-weights), tooling selection, and performance/cost tradeoffs.
  • Drive architecture reviews, mentor engineers/researchers, and raise the overall bar for engineering quality and research rigor.
  • RAG & Retrieval Systems (Enterprise-grade): Design retrieval pipelines that optimize grounded accuracy: chunking strategy, hybrid retrieval, reranking, query rewriting, and context construction.
  • Define document ingestion patterns (PDF parsing, OCR, structured extraction, metadata enrichment) and index lifecycle strategies.
  • Establish retrieval evaluation and regression frameworks (ground truth, offline/online evaluation, drift tracking).
  • Enable async and event-driven architectures for long-running tasks using queues/streams (Kafka/RabbitMQ/Redis Streams) and/or durable workflow engines (Temporal).
  • Inference Platform Engineering: Architect model serving for high throughput and low latency using engines like vLLM / TGI / Triton / TorchServe (as applicable).
  • Define GPU orchestration and capacity strategy on Kubernetes (AKS/EKS/GKE), including scale-to-zero, scheduling, and quota-based governance.
  • Design platform-level controls for rate limiting, caching, backpressure, and cost containment (tenant quotas, token budgets, throttling).
  • Safety, Guardrails, Security & Compliance: Own guardrail architecture for prompt injection defense, tool safety, policy enforcement, and PII handling (redaction patterns).
  • Define secure-by-default patterns: secrets management, data protection, audit logs, and safe prompt/tool execution boundaries.
  • Partner with security/compliance teams to meet enterprise standards (e.g., SOC2/GDPR expectations where relevant).
  • Observability, Reliability & Operational Excellence: Establish SLOs and production readiness standards: error budgets, runbooks, incident response patterns.
  • Define observability strategy across LLM calls and agent tools: tracing, metrics, logs, cost dashboards, and token usage reporting.
  • Build reliability patterns for dependency failure (model provider downtime, throttling): circuit breakers, fallbacks, degradation strategies.
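For orientation, the dependency-failure patterns named in the last bullet (circuit breakers with fallbacks) can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any system used for this role: the `CircuitBreaker` class, its parameters (`max_failures`, `reset_after`, `clock`), and the `primary`/`fallback` callables are all hypothetical names chosen for the example.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Hypothetical breaker: opens after `max_failures` consecutive errors,
    then allows a trial call once `reset_after` seconds have passed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at: Optional[float] = None

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        # Once the cool-down has elapsed, report closed so one trial call goes through.
        return self.clock() - self.opened_at < self.reset_after

    def call(self, primary: Callable[[], str], fallback: Callable[[], str]) -> str:
        if self.is_open():
            return fallback()  # skip the failing provider entirely
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # (re)open the circuit
            return fallback()
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In this sketch, `primary` might wrap a call to a hosted model provider and `fallback` a smaller self-hosted model or a cached answer; both are assumptions made for illustration.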
Experience
  • 13+ years of experience in ML systems / platform engineering / architecture roles, with ownership of production-grade systems.
  • Strong software engineering fundamentals: APIs, distributed systems patterns, testing, versioning, CI/CD, and operational readiness.
  • Hands-on experience with Kubernetes, Docker, and cloud-native design (Azure/AWS/GCP).
  • Strong experience designing event-driven and async architectures with durable execution patterns (queues/workflows).
  • Proven ability to lead architecture for complex systems involving ML/LLMs, data pipelines, and multi-service integration.
  • Strong Python proficiency; comfortable with async patterns and structured validation (e.g., Pydantic-style design).
Preferred Qualifications
  • Deep experience with RAG (retrieval + grounding + reranking) and evaluation techniques for hallucinations and answer quality.
  • Experience with agent frameworks and multi-step tool execution patterns (plan/execute/verify, tool routing, loop prevention).
  • Experience with open-weight models and adaptation methods (e.g., PEFT/LoRA), plus evaluation-driven iteration.
  • Experience with model inference optimization (throughput, batching, caching) and GPU efficiency management.
  • Experience operating observability stacks (OpenTelemetry, Prometheus/Grafana, Datadog) and LLM tracing tools.
Disclaimer: This job posting has been aggregated from an external source. Role details, content, and availability are subject to change. Applicants are advised to confirm the latest information directly on the company website before applying.

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Technical Architect
Employment Type: Full time

Contact Details:

Company: Icertis
Location(s): Pune


Keyskills: Automation, Circuit breakers, Contract management, GCP, Machine learning, Workflow, Scheduling, Distribution system, Auditing, Python

₹ Not Disclosed

Icertis

This is a manufacturing, trading and retail sales company.
Products: Door, Chokhat, Plywood, Board etc.