Session
Optimizing Metrics and Alerts Management with Thanos
Imagine your organization runs a wide variety of IT components: multi-cloud environments, storage systems, hypervisors, container orchestrators, and applications, each exposing its own metrics. These components may come from different open-source projects or third-party vendors. As a DevOps team, you have to manage thousands of exporter endpoints and Prometheus nodes, along with millions of alert rules that must be maintained, updated, and kept working correctly.
At our company, we faced the same problems until we found Thanos. Thanos helped us centralize the metrics of many independent components into a single cluster. We also use ArgoCD for GitOps and change management across all alert rules and metric endpoint configurations.
This talk presents a case study of how we solved these problems. It gives an overview of the architecture and the scalability we achieved by combining Thanos with GitOps, explains how we rolled these solutions out across all our subsidiary teams, and discusses some of the trade-offs we encountered.
Sang Tran Quoc
Deputy Director of Cloud Infrastructure Service Development Center - FPT Smart Cloud
Ho Chi Minh City, Vietnam