AIOps at ING: Correlating Signals for Smarter Incident Response

At ING, ensuring uninterrupted service for our customers starts with answering one simple question: ‘Is the bank online?’ Our correlation engine prototype brings together alerts, metrics, and external signals into a clear, unified view of service health. This helps us detect and resolve issues before they impact customers.

This talk shows how these efforts strengthen our operational excellence by supporting deployment strategies, enhancing situational awareness, and improving incident response. We’ll focus on two key phases:

1) Before Deployment:
We proactively assure reliability by offering real-time observability into critical business services. Dashboards and the Change Reliability Indicator (CRI), an AIOps model, help identify risky changes early, reducing the likelihood of incidents and protecting ING’s reputation.

2) During & After Deployment:
We maintain real-time visibility and control, ensuring stability throughout deployments. When issues arise, timely alerts and historical insights accelerate root cause analysis and reduce Mean Time to Resolve (MTTR), supporting operational resilience.

Our correlation engine brings together diverse data (internal alerts, external monitoring, and customer reports) to answer whether ING is truly “online.” Early results show promise in detecting outages across multiple signals, though further refinement is needed.

This initiative builds on ING’s observability culture and demonstrates the power of a data-driven approach to incident detection and remediation. It’s just the beginning: by expanding to more signals, we aim for a future where we can confidently and quickly pinpoint issues, ensuring reliability at scale.

Eileen Kapel

Data Scientist at ING
