Senior Infrastructure Services and Management Analyst - Incident Management
Northwell Health
$91,400.00 - $158,100.00 / yr
United States, New York, Westbury
Dec 31, 2024
Job Description
This role requires a blend of reactive incident management skills and proactive problem-solving abilities, strong leadership, and excellent communication skills. The ideal candidate will be comfortable leading during high-pressure incidents while also possessing the analytical skills to prevent future occurrences.
Job Responsibility
Job Qualification
Highly Preferred Experience:
* Incident Management Expertise: Significant experience managing major incidents, preferably within a large, complex IT environment. Deep understanding of ITIL principles, particularly incident, problem, and change management. Proven ability to lead and coordinate large technical teams during critical, high-impact incidents. Experience developing and improving incident management processes.
* Problem Management & RCA Skills: Experience in problem management methodologies, including proactive problem identification, trend analysis, and known error management. Demonstrated expertise in conducting complex RCA investigations using various techniques (e.g., 5 Whys, fishbone diagrams, Kepner-Tregoe). Ability to identify systemic issues and recommend long-term solutions.
* Technical Proficiency: A strong understanding of IT infrastructure components (servers, networks, databases, applications) and their interdependencies. Familiarity with cloud technologies (AWS, Azure, GCP) is increasingly important. Experience with monitoring tools and platforms.
* Communication & Leadership: Exceptional leadership skills with the ability to command and control during high-pressure, complex situations. Outstanding communication skills (written and verbal) to effectively communicate with all stakeholders, including executive management, technical teams, and business representatives. Ability to articulate complex technical issues clearly and concisely to both technical and non-technical audiences. Experience presenting to senior leadership.
* Analytical & Problem-Solving Skills: Strong analytical and problem-solving skills to quickly identify root causes, develop effective solutions, and prevent future incidents. Ability to think strategically and make sound decisions under extreme pressure. Ability to identify patterns, trends, and systemic issues in incident data.
* Mentorship & Team Development: Proven ability to mentor and guide other incident management team members. Experience developing training materials and conducting training sessions. Ability to foster a collaborative and high-performing team environment.
* Process Improvement Mindset: A proactive approach to identifying areas for improvement in processes and systems to prevent recurring incidents.
* Availability: Availability to work outside regular business hours and on-call as needed.
Highly Preferred Skills:
* Leading the response to the most complex and critical major incidents, coordinating large technical teams, and ensuring timely resolution.
* Acting as the primary point of contact for executive management and other key stakeholders during major incidents.
* Facilitating post-incident reviews, leading RCA investigations, and driving the implementation of corrective actions.
* Developing, implementing, and continuously improving incident and problem management procedures and documentation.
* Mentoring and training other incident management team members, fostering a culture of continuous improvement.
* Leading the development and implementation of incident and problem management tools and technologies.
* Providing strategic direction for the incident management team and contributing to the overall IT strategy.
*Additional Salary Detail