Responsibilities:
Design and implement end-to-end RAG pipelines including ingestion, chunking, embeddings, and retrieval
Develop and optimize LLM-based applications using Python
Implement prompt engineering techniques to improve response quality and reduce hallucinations
Work with vector databases (FAISS/Chroma) for semantic search and retrieval
Build scalable APIs using FastAPI/Flask
Deploy AI solutions using Docker, Kubernetes, and cloud platforms (AWS/Azure/GCP)
Optimize retrieval performance, latency, and system scalability
Debug and improve LLM and RAG pipeline performance
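The ingestion → chunking → embedding → retrieval flow above can be sketched in a few lines of plain Python. This is a toy illustration, not a production design: a hashed bag-of-words stands in for a real embedding model, and a simple in-memory list stands in for a vector database such as FAISS or Chroma; all names and the sample document are hypothetical.

```python
import math
import zlib

def chunk(document):
    # toy chunker: one chunk per sentence
    # (real pipelines use token-aware splitters with overlap)
    return [s.strip() for s in document.split(".") if s.strip()]

def embed(text, dim=128):
    # hashed bag-of-words embedding; a deterministic stand-in
    # for a real embedding model
    vec = [0.0] * dim
    for tok in text.lower().split():
        vec[zlib.crc32(tok.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query, chunks, index, k=2):
    # cosine-similarity search over the in-memory index
    # (FAISS/Chroma play this role in production)
    q = embed(query)
    scores = [sum(a * b for a, b in zip(vec, q)) for vec in index]
    ranked = sorted(range(len(chunks)), key=lambda i: -scores[i])
    return [chunks[i] for i in ranked[:k]]

document = ("FAISS builds vector indexes for fast similarity search. "
            "Chroma is an open source vector store. "
            "FastAPI serves retrieval endpoints over HTTP.")
chunks = chunk(document)             # ingestion + chunking
index = [embed(c) for c in chunks]   # embedding step
top = retrieve("vector indexes for search", chunks, index, k=1)
```

Swapping the hash-based `embed` for a real model and the list scan for a FAISS or Chroma index upgrades this sketch toward the production pipeline described in the role.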
Required Skills:
Strong expertise in Python programming
Hands-on experience with RAG architectures
Experience working with LLMs and prompt engineering
Knowledge of embeddings, tokenization, and retrieval systems
Familiarity with LangChain / LlamaIndex / LangGraph
Experience deploying AI systems into production
Preferred Skills:
Experience with vector databases (FAISS, Chroma)
Knowledge of cloud platforms (AWS, Azure, GCP)
Understanding of LLM evaluation and optimization techniques
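One common retrieval-side evaluation technique referenced by the last bullet is recall@k: the fraction of known-relevant documents that appear in the top-k retrieved results. A minimal sketch, with hypothetical document IDs:

```python
def recall_at_k(retrieved, relevant, k):
    # fraction of relevant docs that appear in the top-k retrieved list
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

# hypothetical run: ranked doc ids vs. a gold relevant set
score = recall_at_k(["d3", "d1", "d7", "d2"], ["d1", "d2"], k=3)
```

Here only `d1` of the two relevant documents lands in the top 3, so recall@3 is 0.5. Tracking this metric while tuning chunk sizes or embedding models is a typical optimization loop.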
Employment Category:
Employment Type: Full time
Industry: IT Services & Consulting
Role Category: Application Programming / Maintenance
Functional Area: Not Specified
Role/Responsibilities: Gen AI Engineer