Join our team and lead the charge in transforming production stability at JPMorgan Chase, where your problem-solving expertise drives innovation and reliability in a collaborative, high-impact environment.
As a Problem Manager at JPMorgan Chase within the Problem Management Team, you'll have the opportunity to influence production stability by contributing to the core problem management functions of Engage, Improve, Plan, and Measure. You’ll play a key role in transforming this domain and our services to enhance their scalability and reliability.
You'll collaborate with a dedicated team to deliver top-tier root cause and incident response analysis, identify and complete repair items, and cultivate a blameless learning culture that helps modernize our application services. You'll lead the charge in Problem Management conversations with precise investigation and urgency. You will partner with Site Reliability and Application Development Engineers to research production incidents and develop your post-incident analysis. Throughout the Problem Management lifecycle, you will communicate status and progress, completing the feedback loop with senior leadership. Your commitment to follow-through will ensure incidents are thoroughly addressed and preventative measures are implemented to avoid recurrence.
Job Responsibilities
Perform root cause analysis (RCA) on major impacting incidents, as well as standard incidents with potential for impact, ensuring root cause and tactical/strategic actions are identified.
Coordinate, convene, and facilitate major problem review meetings across the North America region, and other regions where needed.
Proactively analyze and define problem areas, developing strategic efforts across all levels of priority/severity. Apply RCA lessons learned across the technology environment.
Partner with business resources and develop actions to eliminate recurrence on “business-owned” incidents.
Collaborate with subject matter experts to refine operating processes and procedures to deliver and restore service more efficiently.
Ensure the problem records are accurate and progress through the Problem Management process in a timely and prioritized fashion.
Manage and maintain information in the ServiceNow tool and other artifacts as necessary.
Own and run various Stability and Service Level Improvement programs for applications/services as well as other initiatives in an agile approach.
Drive continuous improvement initiatives and implement best practices in Problem Management.
Required Qualifications, Capabilities, and Skills
5+ years of experience or equivalent expertise in troubleshooting, resolving, and maintaining information technology services.
Experience managing Root Cause Analysis (RCA) in a system of record such as ServiceNow.
Proficient in pattern recognition and data correlation, with strong analytical and problem-solving skills.
Advanced Excel knowledge with the ability to dissect large data files, utilizing formulas, minor scripting, and filtering.
Ability to navigate, interface, and work with multiple teams across regional boundaries and communication channels, demonstrating command and control.
Ability to influence and lead technical conversations with various application support groups that include technical leaders, IT professionals, developers, and architects.
Ability to continuously track progress to ensure deliverables are met within prescribed timelines until full problem closure.
Cross-technology background in disciplines such as Cloud Engineering, Networking, Site Reliability Engineering, or Technology Support.
Understanding of observability and monitoring tools and techniques.
Excellent communication, technical writing, presentation, and relationship management skills.
Preferred Qualifications, Capabilities, and Skills
Working knowledge of dashboard reporting using Tableau, PowerBI, Qlik, and other such tools.
Practical knowledge of engineering principles, design patterns, and failure mode and effects analysis.
Practical experience with public cloud.
ITIL Foundation certification or higher preferred, with exposure to processes in scope of the Information Technology Infrastructure Library (ITIL) framework.