
Job posting has expired


Data Engineer

Massachusetts General Hospital
United States, Massachusetts, Boston
60 Blossom Street
Sep 04, 2025
Summary
Responsible for implementing methods to improve data reliability and quality. The data engineer combines raw information from different sources into consistent, machine-readable formats, and develops and tests architectures that enable data extraction and transformation for predictive or prescriptive modeling.

Does this position require Patient Care?
No

Essential Functions
-Design, develop, and implement data pipelines and ETL/ELT code to support business requirements.

-Maintain and optimize various components of the data pipeline architecture.

-Deliver high quality, efficient solutions to meet technical standards and industry best practices.

-Deliver optimal technical solutions for business and operational requirements.

-Participate in team design sessions and contribute options and solutions.

-Produce and support product documentation.

-Participate in ETL Quality circle discussions to explore, discuss, and arrive at efficient solutions and best practices.

The Brain Modulation Lab within the MGB Department of Neurosurgery is seeking a multi-talented individual to serve as the lab's data engineer. The Brain Modulation Lab studies brain electrophysiology and behavior in patients undergoing surgery for epilepsy, movement disorders, and psychiatric disease.

The Brain Modulation Lab Data Engineer will report directly to Mark Richardson, MD, PhD, Professor of Neurosurgery and lab director, and to Alan Bush, PhD, Instructor of Neurosurgery and lab co-director. The candidate will have experience and/or ability in data warehousing with both onsite servers and cloud-based integration services, and will contribute continuous improvements in functionality, infrastructure, and workflow operations. The role is a hands-on technical position that requires managing and running data pipelines, the ability to support several NIH-sponsored trials, a commitment to following HIPAA and security-related regulations, and a desire to contribute to translational neuroscience. The ideal candidate is a self-motivated, highly qualified individual who demonstrates attention to detail, works well both independently and with a team, and can quickly get up to speed on complex data infrastructure and sharing policies.

The data engineer will have a special responsibility for working with intracranial data collected in epilepsy patients, both from implanted responsive neurostimulators and from clinical stereo-EEG studies, along with associated data from the electronic medical record (EMR). Additionally, the data engineer will interface with colleagues in the Department of Brain and Cognitive Sciences at MIT to warehouse intracranial research data through the MGH-MIT InBRAIN collaboration. The data engineer is required to have experience in machine learning applications and will be expected to build an LLM-based tool for extracting specific types of data from the EMR. There may be additional opportunities to support a project that adapts BrainBERT, an AI model designed to produce high-quality embeddings from intracranial data, specifically to handle responsive neurostimulation data.

Core Responsibilities:

  • Data warehousing: Create, maintain and modify existing data pipelines for analysis and preprocessing, including cloud-based ETL pipelines that use Snowflake on Azure.
  • Implement pipelines for data deidentification and develop/maintain protocols for secure, HIPAA-compliant data transfer to collaborators.
  • Create, maintain and modify dashboards built on a SQL engine to query metadata related to the current data collected and their processing status.
  • Build LLM-based tools for data extraction from clinical notes and other documents in the EMR.

Requirements:

  • Ability to self-start and manage projects end-to-end.
  • Working knowledge of all of the following languages, with strong proficiency in at least one of the first three:
    • MATLAB
    • Python
    • R
    • Unix shell
    • SQL
  • Proficiency with git or other version control software and ability to assist in implementation across lab members.
  • Demonstrated ability to document procedures and protocols extensively, and familiarity with good coding practices.
  • Working knowledge of HIPAA policies and requirements for compliance related to sharing of clinical data for research.

Education
Bachelor's degree in Computer Science or a related field of study required

Can this role accept experience in lieu of a degree?
Yes

Licenses and Credentials

Experience
- Data warehousing development in large reporting environment(s): 2-3 years required.
- Experience developing data pipelines using Snowflake features (Snowpipe, SnowSQL, Snowsight, Data Streams) required.
- Hands-on development experience with ETL/ELT tools, such as dbt, Fivetran, or Informatica, required.
- Experience working in an Agile software development environment required.

Knowledge, Skills and Abilities
- Working knowledge of cloud computing platforms such as AWS, GCP, or Azure.
- Familiarity with enterprise data warehousing systems a plus.

Physical Requirements

  • Standing Occasionally (3-33%)
  • Walking Occasionally (3-33%)
  • Sitting Constantly (67-100%)
  • Lifting Occasionally (3-33%) 20lbs - 35lbs
  • Carrying Occasionally (3-33%) 20lbs - 35lbs
  • Pushing Rarely (Less than 2%)
  • Pulling Rarely (Less than 2%)
  • Climbing Rarely (Less than 2%)
  • Balancing Occasionally (3-33%)
  • Stooping Occasionally (3-33%)
  • Kneeling Rarely (Less than 2%)
  • Crouching Rarely (Less than 2%)
  • Crawling Rarely (Less than 2%)
  • Reaching Occasionally (3-33%)
  • Gross Manipulation (Handling) Constantly (67-100%)
  • Fine Manipulation (Fingering) Frequently (34-66%)
  • Feeling Constantly (67-100%)
  • Foot Use Rarely (Less than 2%)
  • Vision - Far Constantly (67-100%)
  • Vision - Near Constantly (67-100%)
  • Talking Constantly (67-100%)
  • Hearing Constantly (67-100%)


The General Hospital Corporation is an Equal Opportunity Employer. By embracing diverse skills, perspectives and ideas, we choose to lead. All qualified applicants will receive consideration for employment without regard to race, color, religious creed, national origin, sex, age, gender identity, disability, sexual orientation, military service, genetic information, and/or other status protected under law. We will ensure that all individuals with a disability are provided a reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment.
