Computer Systems Engineer 3
United States, California, Berkeley
1 Cyclotron Road
Lawrence Berkeley National Laboratory has multiple openings for a Computer Systems Engineer 3 in Berkeley, CA.

Duties: Linux system and HPC cluster maintenance and installations, operating system upgrades, system security hardening and intrusion detection, storage and file system management, system hardware and peripheral management, customization of user group working environment, troubleshooting, network monitoring, and crash recovery. Design and implement build, deployment, and configuration management. Build and test automation tools for infrastructure provisioning. Handle code deployments. Monitor metrics and develop ways to improve. Build and manage CI and CD tools. Assist users with program compilation, commercial and public domain software installation, and use of Linux tools. Configure, administer, and troubleshoot desktop, server and storage infrastructures as well as racking, installing, and maintaining systems in a datacenter. Plan, organize, prioritize and complete assigned tasks and projects in a timely manner. Frequently and clearly communicate task or project status to customers to either set or negotiate expectations. Market IT Division services to the scientific community by providing excellent customer service coupled with competent technical support skills. Participate in developing system administration, security, and network policies, documentation, and tools oriented towards efficient systems management. Provide cluster support to LBNL and UC researchers. Responsible for initial installation, integration and the ongoing maintenance of Linux High Performance Computing cluster systems. Lead technical efforts in one or more areas of HPC technologies such as job schedulers, high performance interconnects, parallel file systems, cybersecurity, cluster management, VM infrastructure, networking, performance tuning, support of scientific applications, or data center planning.
Lead group projects, of small to medium size and complexity, to implement and deploy new computing technologies and associated services to the research community. May telecommute.

Benefits: The full salary range for this position is $129,948 to $219,276 per year, depending on the candidate's full skills, knowledge, and abilities, including education, certifications, and years of experience.

Requirements: Employer will accept a Bachelor's degree in Computer Science, Engineering or related field followed by 8 years of progressive, post-baccalaureate experience in job offered or in a related occupation. Alternatively, employer will accept a Master's degree in Computer Science, Engineering or related field and 6 years of experience in job offered or in a related occupation.

Position requires:
1. Linux system administration experience in a large distributed computing environment;
2. Providing systems and end-user support for multiple scientific or computational research groups;
3. Red Hat Enterprise Linux (including derivatives such as CentOS and Scientific Linux), Debian, Ubuntu and use of large-scale system administration tools and configuration management tools such as Kickstart, Ansible, Puppet, Chef, CFEngine, or in-house developed systems management tools;
4. Support of common services such as NFS, LDAP, CIFS, MySQL, or Apache/Nginx HTTPD;
5. Implementing solutions based on Virtual Machines (VM) technologies (KVM, VMware, or OpenStack) and container technologies (Docker and Singularity);
6. HPC technologies: Linux operating systems, job schedulers, high performance interconnects, parallel file systems, cybersecurity, cluster management, VM infrastructure, networking, performance tuning, support of scientific applications, or data center planning;
7. Linux internals, TCP/IP networking, software programming, and cybersecurity concepts;
8. Python and Bash;
9. Building, optimizing and debugging scientific codes in C, C++, Fortran and Java;
10. Popular compilers (GCC or Intel), program debugging tools, use of Makefiles, use of version-control systems (git and Subversion);
11. High-performance computing schedulers (OpenPBS and Slurm) and package managers (Spack or similar); and
12. Technologies for heterogeneous large-scale computing (CUDA-aware MPI).

This position is eligible for LBNL's Employee Referral Program benefit(s).

Want to learn more about working at Berkeley Lab? Please visit: careers.lbl.gov

Misconduct Disclosure Requirement: As a condition of employment, the finalist will be required to disclose if they are subject to any final administrative or judicial decisions within the last seven years determining that they committed any misconduct, are currently being investigated for misconduct, left a position during an investigation for alleged misconduct, or have filed an appeal with a previous employer.