Job Description
Job Summary
We're seeking a hands-on GenAI & Computer Vision Engineer with 3-5 years of experience delivering production-grade AI solutions. You must be fluent in the core libraries, tools, and cloud services listed below, and able to own end-to-end model development, from research and fine-tuning through deployment, monitoring, and iteration. In this role, you'll tackle domain-specific challenges like LLM hallucinations, vector search scalability, real-time inference constraints, and concept drift in vision models.
Key Responsibilities
Generative AI & LLM Engineering
- Fine-tune and evaluate LLMs (Hugging Face Transformers, Ollama, LLaMA) for specialized tasks
- Deploy high-throughput inference pipelines using vLLM or Triton Inference Server
- Design agent-based workflows with LangChain or LangGraph, integrating vector databases (Pinecone, Weaviate) for retrieval-augmented generation
- Build scalable inference APIs with FastAPI or Flask, managing batching, concurrency, and rate-limiting (a minimal serving sketch follows this list)
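To give a concrete sense of the serving work above, here is a minimal sketch of an inference API that wraps a Hugging Face Transformers text-generation pipeline behind FastAPI. The model name, endpoint path, and request schema are assumptions chosen for illustration; a production service would typically sit in front of vLLM or Triton and add real batching, concurrency limits, and rate-limiting.

```python
# Minimal text-generation API: FastAPI wrapping a Hugging Face pipeline.
# Model name and request/response schema are illustrative assumptions.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Loaded once at startup; in production this would usually be replaced by
# a vLLM or Triton backend for batching and higher throughput.
generator = pipeline("text-generation", model="distilgpt2")

class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64

class GenerateResponse(BaseModel):
    completion: str

@app.post("/generate", response_model=GenerateResponse)
def generate(req: GenerateRequest) -> GenerateResponse:
    # Single-request path; real deployments add batching, concurrency
    # limits, and rate-limiting (e.g. via middleware or an API gateway).
    outputs = generator(req.prompt, max_new_tokens=req.max_new_tokens)
    return GenerateResponse(completion=outputs[0]["generated_text"])
```

Such a service can be run locally with uvicorn (e.g. `uvicorn main:app` if the file is saved as main.py) for quick iteration before containerizing.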
Computer Vision Development
- Develop and optimize CV models (YOLOv8, Mask R-CNN, ResNet, EfficientNet, ByteTrack) for detection, segmentation, classification, and tracking
- Implement real-time pipelines using NVIDIA DeepStream or OpenCV (cv2); optimize with TensorRT or ONNX Runtime for edge and cloud deployments (see the sketch after this list)
- Handle data challenges (augmentation, domain adaptation, semi-supervised learning) and mitigate model drift in production
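As an illustration of the real-time CV work above, here is a minimal capture-and-infer loop that feeds video frames through an ONNX Runtime session via OpenCV. The model path, the 640x640 input size, and the video source are illustrative assumptions; a tuned deployment would typically use a TensorRT engine or a DeepStream pipeline instead.

```python
# Minimal real-time inference loop: OpenCV capture + ONNX Runtime session.
# Model path, input size, and video source are illustrative assumptions.
import cv2
import onnxruntime as ort

# Prefer GPU if available, fall back to CPU.
session = ort.InferenceSession(
    "detector.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
input_name = session.get_inputs()[0].name

cap = cv2.VideoCapture(0)  # webcam; an RTSP stream would be typical in production
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Simple resize + NCHW float32 blob; real pipelines usually preserve
    # aspect ratio and normalize to the model's training statistics.
    blob = cv2.dnn.blobFromImage(frame, scalefactor=1 / 255.0,
                                 size=(640, 640), swapRB=True)
    outputs = session.run(None, {input_name: blob})
    # Post-processing (box decoding, NMS, drawing) is model-specific and
    # omitted here.
cap.release()
```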
MLOps & Deployment
- Containerize models and services with Docker; orchestrate with Kubernetes (KServe) or AWS SageMaker Pipelines
- Implement CI/CD for model/version management (MLflow, DVC), automated testing, and performance monitoring (Prometheus + Grafana); a minimal tracking sketch follows this list
- Manage scalability and cost by leveraging cloud autoscaling on AWS (EC2/EKS), GCP (Vertex AI), or Azure ML (AKS)
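As a small example of the experiment- and model-tracking side of this work, here is a minimal MLflow sketch. The experiment name, parameters, metric, and artifact path are placeholders chosen for illustration, not values from this posting.

```python
# Minimal MLflow experiment-tracking sketch; names and values are placeholders.
import mlflow

mlflow.set_experiment("cv-detector-training")

with mlflow.start_run():
    # Hyperparameters and metrics would come from the actual training loop.
    mlflow.log_param("learning_rate", 1e-4)
    mlflow.log_param("epochs", 50)
    mlflow.log_metric("val_mAP", 0.62)
    # Artifacts (weights, configs) can be logged for later promotion to a
    # model registry; the path below is a placeholder.
    mlflow.log_artifact("weights/best.pt")
```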
Cross-Functional Collaboration
- Define SLAs for latency, accuracy, and throughput alongside product and DevOps teams
- Evangelize best practices in prompt engineering, model governance, data privacy, and interpretability
- Mentor junior engineers on reproducible research, code reviews, and end-to-end AI delivery
Required Qualifications
You must be proficient in at least one tool from each category below:
- LLM Frameworks & Tooling:
  - Hugging Face Transformers, Ollama, vLLM, or LLaMA
  - LangChain or LangGraph; RAG with Pinecone, Weaviate, or Milvus
  - Triton Inference Server; FastAPI or Flask
- Computer Vision Frameworks & Libraries:
  - PyTorch or TensorFlow; OpenCV (cv2) or NVIDIA DeepStream
  - TensorRT; ONNX Runtime; Torch-TensorRT
- MLOps & Deployment:
  - Docker and Kubernetes (KServe, SageMaker); MLflow or DVC
- Monitoring & Observability:
  - Prometheus; Grafana
- Cloud Platforms:
  - AWS (SageMaker, EC2/EKS), GCP (Vertex AI, AI Platform), or Azure ML (AKS, ML Studio)
- Programming Languages:
  - Python (required); C++ or Go (preferred)
Additionally:
- Bachelor's or Master's in Computer Science, Electrical Engineering, AI/ML, or a related field
- 3-5 years of professional experience shipping both generative and vision-based AI models in production
- Strong problem-solving mindset; ability to debug issues like LLM drift, vector index staleness, and model degradation
- Excellent verbal and written communication skills
Typical Domain Challenges You'll Solve
- LLM Hallucination & Safety: Implement grounding, filtering, and classifier layers to reduce false or unsafe outputs
- Vector DB Scaling: Maintain low-latency, high-throughput similarity search as embeddings grow to millions
- Inference Latency: Balance batch sizing and concurrency to meet real-time SLAs on cloud and edge hardware
- Concept & Data Drift: Automate drift detection and retraining triggers in vision and language pipelines (see the sketch after this list)
- Multi-Modal Coordination: Seamlessly orchestrate data flow between vision models and LLM agents in complex workflows
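For the drift challenge above, a common first step is a statistical comparison between a training-time reference distribution and a recent production window. Here is a minimal sketch using a two-sample Kolmogorov-Smirnov test from SciPy; the threshold, the choice of confidence scores as the monitored signal, and the retraining hook are illustrative assumptions.

```python
# Minimal data-drift check: two-sample Kolmogorov-Smirnov test on one
# monitored signal (here, model confidence scores). Threshold and the
# retraining hook are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(reference: np.ndarray, current: np.ndarray,
                   p_threshold: float = 0.01) -> bool:
    """Return True when the current window differs significantly from the
    reference (training-time) distribution."""
    statistic, p_value = ks_2samp(reference, current)
    return p_value < p_threshold

# Stand-ins for a stored baseline and recent production traffic.
reference_scores = np.random.beta(8, 2, size=5000)
production_scores = np.random.beta(5, 3, size=1000)

if drift_detected(reference_scores, production_scores):
    # In a real pipeline this would emit a metric/alert or enqueue a
    # retraining job rather than just printing.
    print("Drift detected: trigger retraining workflow")
```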
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Machine Learning
Role: Machine Learning Engineer
Employment Type: Full time
Contact Details:
Company: Auriga IT
Location(s): Jaipur
Keyskills:
python
docker
tensorflow
pytorch
aws
kubernetes
cloud services
triton
c++
cnn
natural language processing
neural networks
aws sagemaker
aiml
machine learning
artificial intelligence
deep learning
computer vision
keras
flask
onnx
opencv
ml