Building a ChatGPT Data Pipeline with RisingWave Stream Processor and Cassandra Vector Search

Enter the brave new world of GenAI by building a ChatGPT data pipeline. The pipeline uses RisingWave's efficient stream-processing write jobs to handle real-time data drawn from an X (formerly Twitter) feed, enriched with Astra/Cassandra's high-performance vector embeddings and similarity search.

We'll build an efficient data pipeline enriched with vector embeddings stored on the Cassandra-backed Astra DB platform, and show how it pairs with the RisingWave stream processor for its write jobs. We will illustrate a sample use case with live coding, as follows:

* Simulate a streaming data feed from X (formerly Twitter), using Kafka as the message broker for data ingestion
* RisingWave consumes the data stream and performs data analysis
* Construct prompts based on the top 3 hashtags identified by RisingWave
* Use the prompts for inference against a RAG-based bot built with Astra DB Vector
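
To make the middle steps of the list concrete, here is a minimal sketch of counting hashtags in a batch of simulated posts and turning the top 3 into a prompt. It is plain Python standing in for the analysis that RisingWave performs in the session, not RisingWave's or Astra DB's API. The function names `top_hashtags` and `build_prompt`, the sample posts, and the prompt wording are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches a hashtag and captures the tag text without the leading '#'.
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(posts, n=3):
    """Return the n most frequent hashtags (lowercased) across the posts."""
    counts = Counter(
        tag.lower()
        for text in posts
        for tag in HASHTAG_RE.findall(text)
    )
    return [tag for tag, _ in counts.most_common(n)]


def build_prompt(hashtags):
    """Build a prompt string from hashtags (hypothetical wording)."""
    tags = ", ".join(f"#{tag}" for tag in hashtags)
    return f"Summarize what people are saying about these trending topics: {tags}."


# Simulated feed (hypothetical data; the session streams posts through Kafka).
sample_posts = [
    "Loving #GenAI and #Kafka today",
    "#GenAI meets #Cassandra vector search",
    "Stream processing with #RisingWave, #Kafka and #GenAI",
    "#genai pipelines on #kafka",
    "#Cassandra plus #GenAI",
]

trending = top_hashtags(sample_posts)
prompt = build_prompt(trending)
print(prompt)
```

In the pipeline described above, the resulting prompt would then be sent to the RAG-based bot; that retrieval step depends on the Astra DB Vector setup and is not sketched here.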

Mary Grygleski

Senior Developer Advocate, Java Champion, President of Chicago-JUG, Chapter Co-Lead of AICamp-Chicago

Chicago, Illinois, United States


