Session
Build a ChatGPT RAG Data Pipeline with RisingWave Stream Processor and Vector Store
Enter the brave new world of GenAI by building a ChatGPT data pipeline that uses RisingWave's efficient stream processing jobs on real-time data drawn from an X (or Twitter) feed and enriched with vector embeddings and similarity search.
We'll explore the ChatGPT world by building an efficient data pipeline enriched with vector embeddings stored in a vector database (PgVector), and show how it pairs with the performant, cloud-based RisingWave stream processor for its write job. We will illustrate a sample use case with live coding, as follows:
* Simulate a streaming data feed from X (or Twitter), using Kafka as the message broker for data ingestion
* Consume the data stream with RisingWave and perform data analysis
* Construct prompts based on the top 3 hashtags identified by RisingWave
* Use the prompts for inference against a RAG-based bot built with PgVector
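
The steps above can be sketched in plain Python to show the data flow. This is a minimal illustration under stated assumptions, not the session's actual code: an in-memory list stands in for the Kafka topic, a `Counter` stands in for the RisingWave aggregation, and a small list of hand-written two-dimensional vectors stands in for PgVector. All function names, sample posts, and vector values are hypothetical.

```python
import math
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")

def top_hashtags(posts, k=3):
    """Count hashtags case-insensitively and return the k most frequent.

    Stand-in for the streaming aggregation; ties are broken alphabetically
    so the result is deterministic.
    """
    counts = Counter(
        tag.lower() for post in posts for tag in HASHTAG.findall(post)
    )
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, store, k=1):
    """Return the k texts whose vectors are most similar to query_vec.

    Stand-in for a vector-store similarity query.
    """
    ranked = sorted(
        store,
        key=lambda doc: cosine_similarity(query_vec, doc[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

def build_prompt(tags, context):
    """Assemble a prompt from the trending hashtags and retrieved context."""
    tag_list = ", ".join(f"#{tag}" for tag, _ in tags)
    context_lines = "\n".join(f"- {c}" for c in context)
    return (
        f"Trending hashtags: {tag_list}\n"
        f"Context:\n{context_lines}\n"
        "Summarize why these topics are trending."
    )

# Hypothetical sample posts standing in for the simulated feed.
posts = [
    "Loving #GenAI and #RAG demos",
    "#genai meets #Kafka streaming",
    "#RAG with #pgvector is neat",
    "#GenAI everywhere",
    "#Kafka tips",
]

# Hypothetical (text, embedding) pairs; real embeddings would come from a model.
store = [
    ("RisingWave is a streaming database.", [1.0, 0.0]),
    ("pgvector adds vector similarity search to PostgreSQL.", [0.0, 1.0]),
]

tags = top_hashtags(posts)
context = retrieve([0.1, 0.9], store)  # toy query vector, assumed for illustration
prompt = build_prompt(tags, context)
print(prompt)
```

In a real pipeline, each stand-in would be replaced by the corresponding service, and the prompt would be sent to the chat model rather than printed.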
Mary Grygleski
AI Practice Lead, TED/x Speaker, Technical Advocate, Java Champion, President of Chicago-JUG, Chapter Co-Lead of AICamp-Chicago
Chicago, Illinois, United States