Speaker

Milind Srivastava

PhD student at CMU, working on making your analytics and observability 100x faster and cheaper

Milind Srivastava is a PhD student at Carnegie Mellon University advised by Prof. Vyas Sekar. Milind is interested in using approximation techniques in a principled manner to reduce telemetry costs across the stack, including data collection, transmission, and analysis. He is also interested in seeing this research get adopted by industry practitioners. Previously, Milind got his Bachelor's and Master's degrees in Computer Science from IIT Madras in India. In his free time, Milind likes to cook, explore restaurants, and bike around Pittsburgh.

ASAPQuery: A drop-in sketch-based accelerator for ClickHouse

ClickHouse excels at exact real-time analytics. However, many query workloads can tolerate some approximation, especially when it brings a significant reduction in query cost and latency.
Our research at Carnegie Mellon University and University of Maryland explores this from first principles -- what if we build a query engine that treats approximation primitives as first-class citizens?

ASAPQuery is an open-source (https://github.com/ProjectASAP/ASAPQuery) drop-in accelerator that sits in front of ClickHouse. It requires no changes to your application or your queries — your existing SQL works as-is. ASAPQuery intercepts queries transparently, routes eligible ones to pre-computed sketch summaries, and falls back to ClickHouse for everything else. ASAPQuery thus gives ClickHouse users a new performance-accuracy tradeoff point with minimal effort. Need exact results? Use ClickHouse. Okay with approximation? Use ASAPQuery.
Early results show that ASAPQuery can improve query latency by more than 2x over ClickHouse while keeping query results over 99% accurate.

ASAPQuery's key technique is the principled use of "sketches".
Sketches are approximate data summaries that can estimate aggregates over large streams of data using very few resources. For instance, a 10KB sketch can estimate the percentiles of 100 million data points (~800MB) with > 99% accuracy - a four-orders-of-magnitude memory reduction!
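To make the memory claim concrete, here is a toy quantile sketch in the spirit of DDSketch (logarithmically spaced buckets with a fixed relative-error target). It is an illustrative sketch of the idea only, not ASAPQuery's implementation:

```python
import math
from collections import defaultdict

class TinyQuantileSketch:
    """Toy DDSketch-style quantile sketch for positive values.

    Values are bucketed by powers of gamma, so every value in a bucket is
    within a relative error of roughly `alpha` of the bucket's representative.
    """
    def __init__(self, alpha=0.01):
        self.gamma = (1 + alpha) / (1 - alpha)
        self.log_gamma = math.log(self.gamma)
        self.buckets = defaultdict(int)  # bucket index -> count
        self.count = 0

    def update(self, x):
        assert x > 0, "this toy handles positive values only"
        self.buckets[math.ceil(math.log(x) / self.log_gamma)] += 1
        self.count += 1

    def quantile(self, q):
        """Estimate the q-th quantile (0 <= q <= 1)."""
        rank = q * (self.count - 1)
        seen = 0
        for idx in sorted(self.buckets):
            seen += self.buckets[idx]
            if seen > rank:
                # Geometric midpoint of the bucket (gamma^(idx-1), gamma^idx].
                return 2 * self.gamma ** idx / (self.gamma + 1)
        raise ValueError("empty sketch")

# Summarize one million values in a few hundred small counters:
sk = TinyQuantileSketch(alpha=0.01)
for i in range(1, 1_000_001):
    sk.update(i)
p99 = sk.quantile(0.99)  # true value is 990000; estimate is within ~1%
```

One million values collapse into a few hundred counters, and the relative error of any quantile estimate stays bounded by the `alpha` parameter regardless of how much data streams through.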
ASAPQuery is conceptually similar to ClickHouse’s incremental materialized views — sketches are precomputed continuously at ingest time, so queries hit sketches rather than raw data. The key difference is that ASAPQuery does this automatically for sketch-based approximations, without query rewrites or manual view maintenance. While sketches aren't a new concept, using them well requires expert knowledge. ASAPQuery democratizes the benefits of sketches for ClickHouse users by automatically translating SQL queries into sketch-based query plans and executing them.

In this talk, we will provide an intuition on how sketches work, and how ASAPQuery uses them in query execution. We'll show benchmark results across real query workloads, characterize which query classes benefit most, and demonstrate ASAPQuery's drop-in deployment against an existing ClickHouse stack.

A Drop-in System to Accelerate Metrics Observability by 100x using Sketch-based Approximation

Metrics observability workloads are growing in scale, resulting in (a) higher cost to operate observability infrastructure, and (b) slower query latencies.

The usual approaches to deal with these are:
- sample data
- roll up data
- reduce data cardinality
- issue fewer queries

All of these approaches compromise the coverage of the observability infrastructure and can result in missing important anomalous behavior.

Through our research, we have developed a radically new approach to achieve large scale, low cost, and low latency without compromising the coverage of the observability infrastructure.

Our system reduces querying cost and latency by 100x using two key techniques:
- streaming precomputation
- sketch-based approximation

Our system is built as a drop-in accelerator for an existing Prometheus-Grafana stack, i.e., it requires no changes to, or replacement of, Prometheus or Grafana.

We will release an open-source prototype around the end of 2025.


SketchDB: Reducing your AWS analytics cost by 10x using principled approximation

Streaming analytics serve a variety of downstream applications. With rising volumes of data and an increased need for real-time analytics, analytics costs are ballooning out of control. Many downstream applications are amenable to working with approximate analytics and thus, methods like sampling are commonly used by practitioners to reduce costs. Unfortunately, in doing so, practitioners are faced with a cost-accuracy dichotomy, having to choose between low cost and high accuracy.

Sketches, a.k.a. sketching algorithms, provide an opportunity to break this cost-accuracy dichotomy. Sketches can accurately estimate statistical metrics over data streams while consuming extremely few resources and making only one pass over the data. Sketches also provide theoretical error guarantees backed by extensive scientific literature. Metrics that can be estimated by sketches include quantiles, heavy hitters (i.e., most frequent items), and cardinality (number of distinct items). While sketches seem like a panacea for ballooning costs, they (a) are difficult and unintuitive to use, (b) require tuning low-level knobs to get optimal performance, and (c) are not well integrated with analytics frameworks.
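As one concrete example of the genre, here is a minimal Count-Min sketch, a standard heavy-hitter sketch. This is an illustrative toy with made-up traffic, not SketchDB's code:

```python
import hashlib

class CountMinSketch:
    """Minimal Count-Min sketch. Frequencies are estimated from a small
    2-D counter array; estimates never undercount, and overcount by a
    bounded amount with high probability."""
    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, item, row):
        digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8)
        return int.from_bytes(digest.digest(), "big") % self.width

    def update(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._hash(item, row)] += count

    def estimate(self, item):
        return min(self.table[row][self._hash(item, row)]
                   for row in range(self.depth))

# One heavy source buried in a long tail of 50,000 distinct sources:
cms = CountMinSketch()
cms.update("10.0.0.1", 5000)                # the heavy hitter
for i in range(50_000):
    cms.update(f"10.1.{i >> 8}.{i & 255}")  # tail traffic, one packet each

est = cms.estimate("10.0.0.1")              # >= 5000, and close to it
```

The entire stream is summarized in 8,192 counters, yet the heavy hitter's count is recovered almost exactly; tuning `width` and `depth` trades memory for error, which is exactly the kind of low-level knob SketchDB aims to hide.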

To tackle these challenges, we design SketchDB, a drop-in sketch-based optimizer that sits in front of an existing database/streaming platform and provides high-accuracy low-latency analytics at a fraction of the cost, while also providing an easy-to-use high-level interface. SketchDB can estimate metrics with < 1% error and 10x lower latency, while consuming 10-30x lower memory at ingest and query time. SketchDB supports aggregate statistical metrics over entire data streams as well as subpopulations. To use SketchDB, an operator simply needs to configure SketchDB with the specific streams and corresponding metrics that should be accelerated. Our grand aim is to reduce analytics costs by multiple orders of magnitude and democratize the use of approximation primitives like sketches.

A Library of Sketching Algorithms Integrated into Apache Flink

Enterprises ingest massive volumes of streaming data into Flink to derive real-time insights. For instance, financial institutions process credit card transactions to monitor risk and detect fraud, while observability platforms ingest telemetry data to monitor application performance. While traditional Flink analytics pipelines have served us well so far, the rising scale and complexity of data are causing an untenable increase in cloud costs, as well as increased latency that prohibits real-time decision-making. Thus, there is a need to rethink the design of aggregate analytics pipelines.

Sketching algorithms provide an effective alternative to traditional aggregation by leveraging compact, probabilistic data structures to deliver highly accurate, low-cost analytics. These algorithms are designed to estimate aggregates like distinct counts, frequencies, and quantiles, and are amenable to massively parallel processing. Sketches are backed by extensive research and estimate aggregates with mathematically bounded errors. Unfortunately, implementations of these algorithms have not made it into the Flink ecosystem, preventing the Flink community from reaping their benefits.

We have provided a library of sketches for Flink by integrating Apache DataSketches, an open-source library of sketches, into the Flink ecosystem. Users can access the library through the Flink DataStream API or through a declarative YAML configuration where they specify the sketches to use, their parameters, the labels to key by, etc. We are integrating newer sketches like UnivMon, Hydra, and DDSketch, which provide novel capabilities. We are in the process of open-sourcing our implementation and initial benchmark results, and hope that the community can benefit from this effort.
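A declarative configuration could look something like the sketch below. Every field name here is a hypothetical placeholder for illustration, not the library's actual schema (though KLL and HLL, with their `k` and `lg_k` accuracy parameters, are real DataSketches families):

```yaml
# Hypothetical configuration -- field names are illustrative only.
streams:
  - name: credit_card_txns
    key_by: [merchant_id]
    sketches:
      - type: kll        # quantiles of transaction amounts
        field: amount
        k: 200           # KLL size/accuracy knob
      - type: hll        # distinct cardholders per merchant
        field: card_id
        lg_k: 12         # HLL size/accuracy knob
```

The intent of such a configuration is that users pick a sketch family and an accuracy knob per field, and the library handles keying, windowing, and merging inside Flink.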

Reducing Cloud Costs for Security Data Analytics by 10x Using Principled Approximation

Problem:
Security data analytics serves a variety of downstream situational awareness goals including network monitoring, anomaly detection, attack detection, and machine learning. For instance, operators want to routinely check for “heavy hitters”, “new application patterns” or “anomalous trends in packet/flow distributions” that may be indicative of attacks. With rising volumes of data and an increased need for real-time analytics, cloud costs for security observability and analytics are spiraling out of control. We argue that to tame this cost, the world of security data analytics needs a fundamental shift in how these use cases are served by big data stacks, to provide low cost and accurate analytics.

Opportunity:
Sketches, a.k.a. sketching algorithms, provide an opportunity to reduce the cost of analytics while keeping it accurate. These algorithms are designed to accurately estimate statistical aggregates over data, such as percentiles, heavy hitters (i.e., most frequent items), and cardinality (number of distinct items), at a fraction of the cost. Sketches also provide theoretical error guarantees backed by extensive scientific literature. Unfortunately, sketches (a) are difficult to use, (b) require tuning low-level knobs to get optimal performance, and (c) are not well integrated with analytics frameworks.
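To illustrate the cardinality case, here is a minimal "linear counting" sketch, a simpler cousin of HyperLogLog. It is an illustrative toy with made-up traffic, not our system's implementation:

```python
import hashlib
import math

class LinearCounter:
    """Minimal linear-counting cardinality sketch: hash each item to one
    slot of a fixed bitmap, then estimate the number of distinct items
    from the fraction of slots still zero. (A real implementation packs
    the bitmap into m/8 bytes.)"""
    def __init__(self, num_bits=8192):
        self.m = num_bits
        self.bits = bytearray(num_bits)  # one byte per "bit", for simplicity

    def add(self, item):
        digest = hashlib.blake2b(str(item).encode(), digest_size=8)
        self.bits[int.from_bytes(digest.digest(), "big") % self.m] = 1

    def estimate(self):
        zeros = self.m - sum(self.bits)
        return round(self.m * math.log(self.m / zeros))

# 5,000 distinct source IPs, each seen several times:
lc = LinearCounter()
for i in range(5_000):
    for _ in range(3):  # duplicates set the same slot, so they don't inflate
        lc.add(f"192.168.{i >> 8}.{i & 255}")
```

An 8,192-bit bitmap recovers the distinct count within a few percent, no matter how many duplicate packets each source sends.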

Contribution:
Our research re-imagines big data analytics from an approximation-first lens and takes a fundamentally new approach to data analytics. We design SketchDB, a drop-in sketch-based optimizer that integrates with an existing big data deployment. SketchDB provides high-accuracy analytics at a fraction of the cost and latency of existing systems. Our initial experiments show that SketchDB can estimate queries with < 1% error and 10x lower latency, while consuming 10-30x less memory at ingest and query time. Our grand vision is to reduce analytics costs by multiple orders of magnitude and democratize the use of approximation primitives like sketches.

Example Use Case and Deployment:
Consider an operator who wants to report the top 10 source IPs every minute, based on the volume of traffic sent to a datacenter in the last hour. A state-of-the-art deployment uses NetFlow to collect this data, which is then ingested into a monitoring or observability tool such as ThousandEyes or Prometheus. Every minute, this tool computes the top 10 source IPs over the last hour of data and serves the results. While accurate, this is extremely costly and inefficient. Instead, SketchDB deploys a streaming precompute layer (e.g., using Apache Flink) that performs lightweight computation on the data as it is streamed into the monitoring tool. When the query hits, SketchDB uses its query engine to quickly aggregate the precomputed results and answer the query, instead of computing over the raw data each time. This reduces CPU time, memory usage, query latency, and energy consumption, all while providing approximate yet highly accurate analytics!
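The precompute-then-merge pattern described above can be sketched as follows, using Count-Min sketches as the per-minute summaries. This is a toy illustration with made-up traffic, not our actual pipeline; in a real system, the candidate keys would themselves come from a heavy-hitter sketch rather than an exact set:

```python
import hashlib
import heapq

WIDTH, DEPTH = 1024, 4  # size of each per-minute Count-Min sketch

def _slot(key, row):
    digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8)
    return int.from_bytes(digest.digest(), "big") % WIDTH

def new_sketch():
    return [[0] * WIDTH for _ in range(DEPTH)]

def update(sketch, key, amount):
    for row in range(DEPTH):
        sketch[row][_slot(key, row)] += amount

def merge(sketches):
    """Count-Min sketches are linear: merging is element-wise addition."""
    out = new_sketch()
    for sk in sketches:
        for row in range(DEPTH):
            for col in range(WIDTH):
                out[row][col] += sk[row][col]
    return out

def estimate(sketch, key):
    return min(sketch[row][_slot(key, row)] for row in range(DEPTH))

# Ingest: one tiny sketch per minute (plus the candidate keys seen).
minute_sketches, seen_keys = [], set()
for minute in range(60):
    sk = new_sketch()
    update(sk, "10.0.0.1", 1000)      # a persistent heavy source
    seen_keys.add("10.0.0.1")
    for i in range(500):              # background traffic, 1 packet per key
        key = f"10.1.{i >> 8}.{i & 255}"
        update(sk, key, 1)
        seen_keys.add(key)
    minute_sketches.append(sk)

# Query: merge the 60 precomputed sketches instead of re-scanning raw flows.
hour = merge(minute_sketches)
top10 = heapq.nlargest(10, seen_keys, key=lambda k: estimate(hour, k))
```

Because the sketches are linear, answering the query means merging 60 tiny fixed-size summaries rather than scanning an hour of raw flow records, which is where the CPU, memory, and latency savings come from.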
