About the Role
We are seeking an experienced Senior Data Scientist to join our AI team and drive the development and deployment of advanced machine learning and generative AI solutions. This role will be instrumental in building intelligent systems that leverage both traditional ML techniques and cutting-edge generative AI technologies to deliver business value.
Key Responsibilities
- Design, develop, and deploy production-grade machine learning models and generative AI applications using state-of-the-art frameworks and methodologies
- Build and optimize Retrieval Augmented Generation (RAG) pipelines for enterprise knowledge systems and intelligent document processing
- Implement Model Context Protocol (MCP) and agent-to-agent (A2A) context engineering solutions for complex AI orchestration
- Develop feedback loops and monitoring systems to continuously improve model performance and ensure reliability
- Architect and maintain MLOps pipelines for model training, versioning, deployment, and monitoring
- Create AI agents using the LangChain and LangGraph frameworks for autonomous decision-making and workflow automation
- Collaborate with data engineering teams to build robust data pipelines and feature engineering workflows
- Mentor junior data scientists and contribute to the development of AI best practices and standards
Required Qualifications
- 5+ years of experience in data science, machine learning, or a related field
- Strong expertise in generative AI technologies including LLMs, prompt engineering, and context management
- Hands-on experience building RAG systems with vector databases and semantic search
- Proficiency with LangChain, LangGraph, and agent-based architectures
- Experience with Model Context Protocol (MCP) and A2A context engineering patterns
- Deep understanding of traditional machine learning algorithms (regression, classification, clustering, time series)
- Strong MLOps experience including CI/CD pipelines, model versioning, and monitoring frameworks
- Proficiency with the Hugging Face ecosystem (Transformers, Datasets, Hub)
- Experience with TensorFlow and/or PyTorch for model development
- Strong Python programming skills with experience writing production-grade code
- Experience designing and implementing feedback loops for continuous model improvement
- Knowledge of cloud platforms (AWS, Azure, or GCP) for ML deployment
- Excellent communication skills with the ability to explain complex technical concepts to non-technical stakeholders
Preferred Qualifications
- Experience in regulated industries (financial services, healthcare)
- Knowledge of data governance and model risk management frameworks
- Experience with distributed training and large-scale model deployment
- Familiarity with LLM provider APIs such as Anthropic's Claude API and the OpenAI API
- Experience with specific vector databases (Pinecone, Weaviate, Chroma)
- Understanding of prompt engineering and fine-tuning techniques
- Contributions to open-source ML/AI projects
Technical Skills
- Languages: Python, SQL
- ML Frameworks: TensorFlow, PyTorch, scikit-learn
- GenAI Tools: LangChain, LangGraph, Hugging Face
- MLOps: Docker, Kubernetes, MLflow, Weights & Biases
- Cloud: AWS/Azure/GCP machine learning services
- Data: Pandas, NumPy, vector databases
- Version Control and CI/CD: Git, CI/CD pipelines