Session

Ingesting Data at Scale into Elasticsearch with Apache Pulsar

One of the best things about Elasticsearch is its ability to handle large amounts of data and serve that data with sub-millisecond latency, which makes it an ideal platform for analytics workloads. But as with any purpose-built database, there are trade-offs to consider. In Elasticsearch's case, the challenge is loading data continuously and at scale. One way to solve this problem is to use a buffer layer that stores events and forwards them to Elasticsearch. Apache Pulsar is a great option for implementing this layer.

This talk will explain how Pulsar can implement data ingestion, validation, aggregation, and storage, and push this data to Elasticsearch using the sink connector. It will provide the knowledge you need to ingest any type of data, such as logs, sensor data, and streaming events, into Elasticsearch for analytics and visualization.
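
As a rough illustration of the store-and-forward idea, the sketch below is not taken from the talk and does not use the Pulsar client or the actual sink connector. It assumes a plain in-memory queue standing in for a Pulsar topic, a hypothetical event schema (`sensor_id`, `temperature`), and a hypothetical index name. It only shows the shape of the pipeline: buffer events, validate them, and render a batch in the newline-delimited JSON format that the Elasticsearch `_bulk` endpoint accepts.

```python
import json
from collections import deque

# Hypothetical schema for illustration only; real events will differ.
REQUIRED_FIELDS = ("sensor_id", "temperature")


def is_valid(event):
    """Validation step: keep only events that carry every required field."""
    return all(field in event for field in REQUIRED_FIELDS)


def drain(buffer, batch_size):
    """Pop events from the buffer until it is empty or the batch is full.

    Invalid events are dropped and do not count toward the batch size.
    """
    batch = []
    while buffer and len(batch) < batch_size:
        event = buffer.popleft()
        if is_valid(event):
            batch.append(event)
    return batch


def to_bulk_body(events, index_name):
    """Render events as an Elasticsearch _bulk body: one action line
    followed by one document line per event, newline-terminated."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The deque stands in for a Pulsar topic (an assumption of this sketch).
    buffer = deque([
        {"sensor_id": "s1", "temperature": 21.5},
        {"message": "malformed event"},  # fails validation, gets dropped
        {"sensor_id": "s2", "temperature": 19.0},
    ])
    batch = drain(buffer, batch_size=100)
    print(to_bulk_body(batch, "sensor-readings"))
```

In a real deployment the buffering, delivery guarantees, and the write to Elasticsearch would be handled by Pulsar and its sink connector rather than by hand-written code like this.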

Timothy Spann

Principal Developer Advocate for Data in Motion @ Cloudera

Princeton, New Jersey, United States
