Session
Building An Efficient Streaming Data Pipeline with Apache Cassandra and Apache Pulsar
Event streaming is one of the most important software technologies of the current computing era, enabling systems to process huge volumes of data at high speed and in real time. It is the "glue" that lets data flow between the disparate systems and pipelines typical of cloud environments. Built on the Pub/Sub pattern for message flow and designed with the cloud in mind, Apache Pulsar has emerged in recent years as a powerful distributed messaging and event streaming platform. With its flexible, decoupled messaging style, it integrates well with many other modern libraries and frameworks.
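To make the Pub/Sub model concrete, here is a minimal sketch using the Pulsar Java client; the service URL, topic name, and subscription name are placeholder assumptions, not part of the workshop material.

```java
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class PubSubSketch {
    public static void main(String[] args) throws Exception {
        // Connect to a local broker (service URL is an assumption)
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        // Subscriber: creates its subscription before any message is published
        Consumer<String> consumer = client.newConsumer(Schema.STRING)
                .topic("demo-topic")
                .subscriptionName("demo-subscription")
                .subscribe();

        // Publisher: fully decoupled from the consumer
        Producer<String> producer = client.newProducer(Schema.STRING)
                .topic("demo-topic")
                .create();
        producer.send("hello, pulsar");

        // Receive and acknowledge the message
        Message<String> msg = consumer.receive();
        System.out.println("Received: " + msg.getValue());
        consumer.acknowledge(msg);

        producer.close();
        consumer.close();
        client.close();
    }
}
```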
In this workshop, we will build a modern, efficient streaming data pipeline using Apache Pulsar and Apache Cassandra. Apache Pulsar will handle data ingest; the incoming external data will then be processed by Pulsar Functions, which reference tables in Cassandra as lookup sources. The transformed results will be egressed to an Astra DB sink, as sketched below. We will also examine how to further optimize the entire processing pipeline.
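The heart of such a pipeline is a Pulsar Function that enriches each incoming record with a Cassandra lookup before the result flows on to the sink. The sketch below shows one way this could look, using the Pulsar Functions Java API and the DataStax Java driver; the keyspace, table, column names, and connection settings are hypothetical placeholders, not the workshop's actual code.

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.Row;
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

/**
 * Hypothetical enrichment function: looks up reference data in a
 * Cassandra table and emits the enriched record to the function's
 * output topic, which can be wired to an Astra DB sink connector.
 */
public class EnrichmentFunction implements Function<String, String> {

    private transient CqlSession session;

    @Override
    public String process(String input, Context context) {
        if (session == null) {
            // Connect lazily; contact point and datacenter are assumptions
            session = CqlSession.builder()
                    .withLocalDatacenter("datacenter1")
                    .build();
        }
        // Treat the incoming message as a lookup key (assumption)
        Row row = session.execute(
                "SELECT description FROM reference.lookup WHERE id = ?", input).one();
        String description = (row != null) ? row.getString("description") : "unknown";
        // Return value is published to the configured output topic
        return input + "," + description;
    }
}
```

In a real deployment, the function would typically be packaged and submitted with pulsar-admin, with its input topic fed by the ingest stage and its output topic consumed by the Astra DB sink.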
Mary Grygleski
AI Practice Lead, TED/x Speaker, Technical Advocate, Java Champion, President of Chicago-JUG, Chapter Co-Lead of AICamp-Chicago
Chicago, Illinois, United States