
Machine Learning Engineer - Devops

Genentech
United States, California, South San Francisco
Sep 11, 2025
The Position

A healthier future. It's what drives us to innovate. To continuously advance science and ensure everyone has access to the healthcare they need today and for generations to come. Creating a world where we all have more time with the people we love. That's what makes us Roche.

Advances in AI, data, and computational sciences are transforming drug discovery and development. Roche's Research and Early Development organisations at Genentech (gRED) and Pharma (pRED) have demonstrated how these technologies accelerate R&D, leveraging data and novel computational models to drive impact. Seamless data sharing and access to models across gRED and pRED are essential to maximising these opportunities. The new Computational Sciences Center of Excellence (CoE) is a strategic, unified group whose goal is to harness this transformative power of data and Artificial Intelligence (AI) to assist our scientists in both pRED and gRED to deliver more innovative and transformative medicines for patients worldwide.

The Opportunity

At Roche's AI for Drug Discovery (AIDD) group (Prescient Design), we are revolutionizing drug discovery with cutting-edge machine learning techniques. We are seeking a highly motivated and skilled ML Infrastructure DevOps Engineer to join our growing team within Genentech Research and Early Development AI Drug Development (gRED AIDD). This role is crucial for building and maintaining the scalable and robust infrastructure that powers our machine learning initiatives. The ideal candidate will be proactive, user-facing, and possess a "get-it-done" attitude, while consistently adhering to corporate standards and best practices.

What you'll do

  • Design, implement, and maintain scalable and reliable ML infrastructure on AWS.

  • Automate deployment, monitoring, alerting, and operational tasks using tools like Terraform and Helm.

  • Manage and optimize CI/CD pipelines and Git repositories for ML projects, ensuring efficient version control to support collaboration and deployment.

  • Collaborate closely with ML engineers and data scientists to understand their infrastructure needs and provide solutions.

  • Troubleshoot and resolve infrastructure-related issues in a timely manner.

  • Implement and enforce security best practices for ML infrastructure.

  • Document infrastructure designs, processes, and operational procedures.

  • Contribute to initiatives independently as part of a team, delivering assigned outputs.

  • Proactively identify issues and gaps, proposing ideas and suggestions for improvements.

Who you are

  • Proven experience in designing, deploying, and managing infrastructure on Amazon Web Services (AWS), including services such as EC2, S3, RDS, EKS, SageMaker, etc.

  • Strong proficiency with Git and Git repository management.

  • Hands-on experience with Terraform for infrastructure provisioning and management.

  • Experience with Helm for deploying and managing applications on Kubernetes.

  • Proficiency in scripting languages (e.g., Python, Bash) for automation.

  • Excellent problem-solving skills and a strong ability to debug complex issues.

  • Strong communication and interpersonal skills to collaborate effectively with cross-functional teams and handle user-facing interactions.

  • Demonstrated ability to take initiative, anticipate needs, and drive projects to completion.

  • Ability to thrive in a fast-paced environment and adapt to evolving requirements while adhering to corporate guidelines.

  • Ability to write clean code with little syntax/convention feedback.

  • Applies software engineering best practices (linting automation, unit testing, documentation, CI/CD).

  • Familiarity with modern machine learning methods.

  • Knowledge of and experience with high-performance computing, distributed systems, and cloud computing.

Preferred

  • Experience with MLOps platforms and tools.

  • Familiarity with CI/CD pipelines for ML workflows.

  • Knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).

Relocation benefits are available for this job posting

The expected salary range for this position based on the primary location of California is $147,600 - $274,000. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. A discretionary annual bonus may be available based on individual and Company performance. This position also qualifies for the benefits detailed at the link provided below.

Benefits

#ComputationCoE

#tech4lifeComputationalScience

#tech4lifeAI

Genentech is an equal opportunity employer. It is our policy and practice to employ, promote, and otherwise treat any and all employees and applicants on the basis of merit, qualifications, and competence. The company's policy prohibits unlawful discrimination, including but not limited to discrimination on the basis of Protected Veteran status or disability status, consistent with all federal, state, and local laws.

If you have a disability and need an accommodation in relation to the online application process, please contact us by completing this form: Accommodations for Applicants.
