Title: MLOps Lead
KA, IN
Job Description
Must Have
About the Role
We are seeking a highly skilled MLOps Lead to design, build, and manage robust machine learning operations and data pipelines at scale. The ideal candidate will have deep expertise in Databricks, PySpark, MLflow, and cloud platforms (AWS & Azure), combined with a strong foundation in automation, deployment, and model monitoring.
As an MLOps Lead, you will be responsible for building scalable ML infrastructure, enabling seamless collaboration between Data Science, Engineering, and DevOps teams, and ensuring reliable model lifecycle management from development to production.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key Responsibilities
MLOps Strategy & Leadership:
Define and implement MLOps standards, best practices, and governance frameworks to streamline model lifecycle management.
Pipeline Development:
Build and optimize scalable data and ML pipelines using Databricks, PySpark, and SQL for both batch and streaming use cases.
Model Deployment & Automation:
Develop CI/CD pipelines using Jenkins for automated training, testing, and deployment of ML models across environments.
Model Tracking & Governance:
Use MLflow for experiment tracking, model versioning, and deployment management.
Infrastructure & Cloud Management:
Architect and manage ML infrastructure on AWS and Azure, leveraging services such as S3 and EKS (AWS), ADF and AKS (Azure), and Data Lake environments.
Model Monitoring & Maintenance:
Implement and maintain robust model monitoring frameworks to track model performance, detect data drift, and trigger automated retraining workflows.
Set up alerts, dashboards, and automated health checks for ML models in production to ensure continuous model reliability.
Collaboration & Documentation:
Work closely with data scientists, data engineers, and DevOps teams; maintain clear documentation using Jira and Confluence.
Automation & Continuous Improvement:
Identify opportunities for process automation, infrastructure optimization, and operational efficiency across the ML lifecycle.
----------------------------------------------------------------------------------------------------------------------------------
Required Qualifications
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
6+ years of experience in MLOps, Data Engineering, or Machine Learning Engineering, with at least 2 years in a lead or senior role.
Strong hands-on experience with Databricks (notebooks, jobs, clusters, Delta Lake).
Proficiency in Python, SQL, and PySpark.
Deep expertise in MLflow, Jenkins, and CI/CD automation.
Practical experience with AWS and Azure cloud ecosystems.
Strong understanding of data lake architectures, model deployment, and monitoring frameworks.
Experience working with Agile tools like Jira and documentation platforms like Confluence.
Excellent problem-solving, leadership, and cross-functional communication skills.