Abhijeet Chaudhuri

DevOps Engineer@RoboMQ

Jaipur, India

I am an infrastructure enthusiast with a strong foundation in Docker, Kubernetes, Jenkins, AWS, and Terraform, driven by a passion for creating impactful solutions that enhance efficiency and scalability.
I enjoy exploring innovative technologies and collaborating with like-minded professionals.

Let’s connect and share insights about the ever-evolving tech landscape!

Links

LinkedIn

Area of Expertise

Information & Communications Technology
Transports & Logistics

Topics

DevOps
SRE
Monitoring
Kubernetes
Monitoring and Observability
Docker
CNCF
KubeCon

Beyond Logs & Metrics: Building an Operational Dashboard for Proactive Monitoring

Traditional monitoring is reactive: by the time an alert fires, the damage is done. But what if we could detect issues before they impact users? In this talk, I’ll share how I built an SRE-driven operational dashboard that goes beyond basic monitoring to provide real-time insights and anomaly detection.

- Custom Exporters & Metrics – How we extended Prometheus with custom exporters to track critical system health.
- Log Optimization & Cost Reduction – The techniques we used to optimize Fluentd and Elasticsearch for efficient storage and faster queries.
- Meaningful Alerting & Query Optimization – How to fine-tune queries and reduce noise while maintaining high-signal alerts.
- Automating Incident Detection – How we combined metrics, logs, and traces into a single pane of glass for proactive issue resolution.

Attendees will walk away with actionable strategies for building an observability stack that SREs actually use, improving incident response, system reliability, and cost efficiency in Kubernetes environments.

Benefits to the Ecosystem

Observability in Kubernetes is often fragmented, with logs, metrics, and traces existing in silos, making proactive issue detection challenging. While existing solutions offer visibility, they often lack actionable insights for preventing incidents.

This talk presents a structured approach to building an SRE-driven operational dashboard that unifies logs, metrics, and alerts. By leveraging custom Prometheus exporters, optimized log ingestion pipelines, and real-time Grafana visualizations, teams can enhance system reliability.

Attendees will learn how to:

- Reduce alert fatigue with intelligent alerting and noise reduction.
- Improve log query performance for cost-effective storage and retrieval.
- Correlate metrics and logs for faster root cause analysis and incident resolution.

A centralized observability framework helps teams detect anomalies early, reduce cloud costs, and improve MTTR, enabling a proactive approach to reliability in Kubernetes environments.
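
One common way to reduce alert noise is to fire only when a condition holds across several consecutive evaluations, similar in spirit to the `for` clause in Prometheus alerting rules. The sketch below illustrates that idea only. It is not the alerting logic from the talk, and the class name, threshold, and evaluation count are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DebouncedAlert:
    threshold: float       # value above which a sample counts as a breach
    for_evaluations: int   # consecutive breaches required before firing
    _streak: int = 0       # current run of consecutive breaches

    def observe(self, value: float) -> bool:
        """Record one evaluation and report whether the alert is firing."""
        if value > self.threshold:
            self._streak += 1
        else:
            # Any healthy sample resets the streak, so brief spikes never fire.
            self._streak = 0
        return self._streak >= self.for_evaluations


# Hypothetical CPU-saturation samples, one per evaluation interval.
alert = DebouncedAlert(threshold=0.9, for_evaluations=3)
states = [alert.observe(v) for v in [0.95, 0.5, 0.95, 0.96, 0.97, 0.4]]
print(states)
```

The single spike at the start never fires. Only the third consecutive breach does, and the alert clears as soon as a healthy sample arrives.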

Target Audience: SREs, DevOps, and platform engineers.
Technical Level: Intermediate to Advanced (familiarity with Prometheus, Fluentd, and Grafana recommended).
Key Takeaways: Practical strategies to reduce alert fatigue, improve incident response, and enhance system reliability.
