Research Intern - AI Systems and Architecture
Microsoft
United States, California, Mountain View
Jan 03, 2025

Overview

Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

Join our Strategic Planning and Architecture (SPARC) team within Microsoft's Azure Hardware Systems & Infrastructure (AHSI) organization, the organization behind Microsoft's expanding cloud infrastructure and responsible for powering Microsoft's "Intelligent Cloud" mission. Microsoft delivers more than 200 online services to more than one billion individuals worldwide. AHSI delivers the core infrastructure and foundational technologies for Microsoft's cloud businesses, including Microsoft Azure, Bing, MSN, Office 365, OneDrive, Skype, Teams, and Xbox Live. The SPARC organization manages Azure's hardware roadmap from architecture concept through production for all of Microsoft's current and future online services.

Responsibilities

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world's best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research and are offered year-round, though they typically begin in the summer.

Additional Responsibilities

Develop and contribute to an in-house performance modeling tool for large-scale machine learning systems.
Evaluate ideas for performance improvement, along with bottleneck analysis and feature enhancement.
Build a framework for running large-scale parallel performance simulations using cloud-based compute infrastructure.
Develop a testing framework and testbenches that enable operator-level unit tests and end-to-end application tests for the performance model.
Integrate the performance model with the power and TCO models to project application-level Perf/W and Perf/$ metrics across workloads.
Develop a cloud-based performance simulation database for storing large-scale data from design-space exploration experiments.
Develop a data-analytics framework, along with debug tools and automation, for easier retrieval of performance data based on user queries.
Develop and maintain performance dashboards and visualization tools for improving the analysis framework.
Formalize and improve general software development practices, including codebase maintenance, code review, feature development, and software design reviews.
Integrate a CI/CD pipeline into the Azure DevOps software development process.
Handle general troubleshooting and debugging, including common performance bottlenecks, and develop performance comparison tools.
Collaborate with the larger team to define product requirements, feature improvements, and implementation.