Job Description
Must Have
Key Responsibilities
1. Architect and implement complex GenAI applications leveraging LLMs, LangChain, and Retrieval-Augmented Generation (RAG) workflows.
2. Design scalable RAG pipelines including ingestion, chunking, embeddings, vector storage, retrieval, reranking, and synthesis using AWS-native components.
3. Lead LangChain development (chains, multi-agent systems, tool calling, memory architectures, prompt templates, end-to-end pipelines).
4. Build and optimize vector search systems using OpenSearch, Pinecone, or pgvector for high-performance semantic search.
5. Integrate with Amazon Bedrock to use state-of-the-art LLMs and embedding models; implement responsible-AI practices and guardrails.
6. Drive prompt engineering frameworks, versioning strategy, A/B testing, and structured output (JSON schema, tool calling).
7. Develop microservices & APIs using FastAPI/Flask on AWS Lambda, ECS, or EKS for serving GenAI capabilities in production.
8. Implement observability & evaluation for GenAI systems through LangSmith, Ragas/DeepEval, CloudWatch logs, traces, and dashboards.
9. Own CI/CD pipelines for GenAI apps using CodePipeline or GitHub Actions, with Terraform- or CDK-based infrastructure automation.
10. Collaborate with cross-functional teams (Product, Data Engineering, Cloud, QA, Architecture) to translate business use cases into high-quality AI solutions.
11. Drive POCs to production—lead prototypes, perform feasibility studies, and scale validated designs into robust enterprise systems.
12. Maintain thorough documentation—prompt catalogs, RAG architecture diagrams, API specs, decision records, and runbooks.
Required Skills & Experience
5+ years of software development experience, including strong Python skills (async, typing, design patterns).
Deep expertise in GenAI concepts, RAG architectures, and prompt engineering.
Hands-on mastery of LangChain, LangGraph, LangSmith, tool calling, and agentic workflows.
Production-grade AWS experience: Lambda, API Gateway, ECS/EKS, S3, Bedrock, CloudWatch, Step Functions.
Strong understanding of vector databases: OpenSearch, Pinecone, pgvector, or Weaviate.
Experience building secure, scalable microservices and asynchronous processing pipelines.
Solid understanding of distributed systems, caching layers, queues (SQS), containers, and networking.
Experience with CI/CD pipelines, IaC (Terraform/CDK), and container orchestration.
Proficiency in testing GenAI systems (prompt tests, dataset tests, regression runs).
Strong communication and leadership skills to drive architecture decisions and mentor junior engineers.