The Cloud Infrastructure Engineer is responsible for cloud solutions development, architectural design, and the overall IT operations and maintenance of a secure cloud infrastructure that spans multiple accounts and external interfaces and runs primarily on Linux, serverless, and containerized architectures, predominantly in AWS. The ideal candidate has professional experience operating multi-account cloud environments, interacting effectively with customers and development teams, and gathering and analyzing requirements to design new systems and infrastructure that satisfy business needs. The candidate must be able to see big-picture requirements, distill them into concrete tasks, and execute them independently and on time. The successful candidate will evaluate and implement improved approaches to delivering cloud-based infrastructure services, with built-in automation that increases efficiency and reduces human intervention and administrative effort. The candidate will have considerable AWS and Linux skills, as well as DevOps skills in CI/CD and experience with tools such as Ansible, Terraform, and CloudFormation. The candidate will be part of the operations team supporting the Development, Test, QA, Stage, and Production environments, and will manage technology vendor relationships effectively, ensuring that proper controls and oversight are in place.
Essential Functions
To perform this job successfully, an individual must be able to perform each essential duty and responsibility satisfactorily. Reasonable accommodations may be made to enable an individual with disabilities to perform the essential functions. Other duties may be assigned to meet business needs.
- Provide comprehensive systems administration on Amazon Web Services (AWS) infrastructure, including support of AWS products such as AWS Console root user administration, Key Management, EC2 Compute, S3 Storage, Relational Database Service (RDS), AWS Networking & Content Delivery (VPC, Route 53, ELB, etc.), Identity & Access Management, CloudWatch, CloudTrail, CloudFormation, Auto Scaling, Cost and Usage Reports, and more.
- Assist with the design and development of a multi-account, multi-region, highly available and highly automated AWS environment to support full software development life cycle and production of mission-critical applications.
- Administer the operation and maintenance of Amazon Web Services EC2 instances.
- Through direct actions or vendor oversight, ensure recurring system and security updates are applied to mitigate risk and improve overall infrastructure information security.
- Provide application support for product customers.
- Create new AWS cloud compute instances, storage, and other cloud services.
- Analyze EC2 instance and S3 storage performance and resize to meet performance requirements and to control costs.
- Evaluate, recommend, and implement cost-optimized enhancements to the current architecture and AWS services.
- Serve as the system administrator for the enterprise monitoring system (Datadog).
- Monitor infrastructure and proactively mitigate potential incidents before service degradation occurs.
- Proactively mitigate business service disruptions with designed redundancy, backups, and highly available solutions.
- Reactively troubleshoot outages, perform root cause analysis, and execute continual service improvements.
- Develop and adhere to technical standards, specifications, and best practices.
- Create system support documents, operational procedures, and build scripts/CloudFormation templates.
- Provide day-to-day operations and on-call escalation support for all AWS client, server, storage, and network services.
- Work with Development teams to install, test, implement and troubleshoot functionality.
- Create and maintain automation scripts to increase system efficiency and lower administrative level of effort.
- Complete ongoing performance tuning and system optimization to better meet business needs.
- Work in matrixed teams to assist with product development and delivery.
Additional Responsibilities
- Attend and host meetings and provide support in the form of targeted agendas, meeting notes, communications, and follow-up delivery.
- Maintain relevant and current professional knowledge via in-house training, online resources, attendance at professional events, and personal investment in continued education and certifications.
- Monitor industry trends for changes, risks, releases, and advancements in cloud services and technologies.
- Develop and maintain working relationships and collaborate with various vendors/other stakeholders.
- Other duties, as assigned.
Minimum Qualifications
To perform this job successfully, an individual should possess the knowledge, skills, and abilities listed and meet the amount of education, training and/or work experience required.
Education
- B.S. / B.A. degree or equivalent required.
Experience
- 8+ years of relevant professional system engineering or administration experience, with significant exposure to a variety of technologies and domains.
- 5+ years working with AWS, deep understanding of AWS concepts, and fluency with the AWS APIs/command line tools. Experience implementing and maintaining cloud-based systems and applications in Amazon Web Services (AWS).
- 5+ years of advanced working knowledge of the Linux operating system.
- Experience automating server configurations to include standard build installations and system security hardening.
- Experience developing Infrastructure as Code (IaC) solutions using Terraform and AWS CloudFormation.
- Python, Node.js, and Bash scripting experience is required.
- Experience with core network services to include: DHCP, DNS, VLAN, load balancing, etc.
- Experience writing standard operating procedures, system requirements, and other technical documents.
- Experience collaborating with cross functional teams to achieve a shared project goal.
- Experience centrally monitoring systems for alerts and incident management functions, preferably with Amazon CloudWatch.
- Experience with analysis, design, development and continual improvements of enterprise monitoring tools including Datadog.
- Experience with CI/CD tools and Git-based solutions such as AWS, GitLab, GitHub, and Bitbucket.
- Strong working knowledge of automation tools such as Puppet, Ansible, Jenkins, and Chef.
- Working knowledge of software development lifecycles, product packaging, and deployments.
- Working knowledge of RDS database engines such as PostgreSQL, Oracle, and MySQL.
Requirements
- Must be eligible to obtain or currently possess a U.S. Government clearance at the Public Trust (NACI) moderate level or higher.
- Must be an authorized United States citizen.
- AWS Professional Certification - Solutions Architect, SysOps Administrator or DevOps Engineer.
- Due to the nature of CSBS's business in support of state financial services supervision, all CSBS employees have the potential of interacting with confidential information related to the supervision of financial services companies ("Confidential Supervisory Information"). As a result, in addition to general business conflicts of interest, all CSBS employees are expected to disclose conflicts of interest in financial services companies on at least an annual basis and to proactively avoid such conflicts.
- Protect the confidentiality, integrity, and availability of CSBS information and information systems in accordance with CSBS policies and procedures.
Knowledge, Skills, and Abilities
- Working knowledge of project and portfolio management tools, such as Project and Jira.
- Excellent verbal and written communication skills and the ability to communicate effectively with people at all levels.
- Excellent time-management and prioritization skills, and an understanding of when to escalate.
- Excellent communications skills.
- Strong planning and task management skills.
Values Instilled Behaviors for Excellence (VIBEs)
Member/Customer Service
- Builds and values relationships
- Prioritizes work
- Advocates and advances member's goals
Teamwork
- Gives credit to others
- Has a "pitch in" attitude
- Learns from successes and setbacks
Respect/Trust
- Listens and learns from others
- Speaks the truth even when uncomfortable
- Honors the expertise of others
Collaboration
- Recognizes the contributions of others
- Consults and communicates effectively
- Desires to make others successful
Ownership/Engagement
- Perseveres through adversity
- Experiments and takes risks
- Plans ahead and is forward-thinking
Leadership Competencies
Achievement-Oriented Thinking
- Is a solutions-oriented thinker
- Has good time management skills
- Manages expectations of what is achievable
Change Management
- Actively engages and participates during change
- Asks questions and takes ownership for understanding why the change is happening and the risk of not changing
- Actively adopts the new habits, monitors own performance, checks self against the objectives, and seeks help when they don't match
- Identifies and communicates obstacles and resistance
Emotional Intelligence
- Manages own emotions productively to stay in role
- Handles emotionally charged situations productively and with empathy
- Asks for and openly accepts feedback; looks for opportunities to grow
- Conducts conversations courageously - hitting difficult issues head-on with an eye on maintaining relationships
Working Conditions
- Hybrid work environment with remote support.
- On-call rotation and issue escalation support.
- Occasional travel outside the Washington, D.C., area (1-3 days per quarter).