Session

Building Copilots with Flink SQL, LLMs and vector databases

Generative AI applications such as copilots can serve as a vital link between foundation models and enterprise data, enhancing developer and employee productivity. Users can ask questions about streaming and batch data in natural language, making stream processing more accessible to developers, data professionals, and operational teams.

For structured data, a foundation model's accuracy at dynamically generating and running correct Flink SQL can be greatly improved by first annotating the schemas of Flink tables with LLMs, providing the best possible context for the prompt. These schemas, along with other unstructured data (such as text, JSON, and binary files), can be used to continuously generate embeddings, which are stored in vector databases for retrieval-augmented generation (RAG).
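The retrieval step described above can be sketched in a few lines of TypeScript. This is a toy illustration, not the session's actual implementation: the type and function names are hypothetical, and the embeddings are hard-coded where a real system would call an embedding model and query a vector database.

```typescript
// Toy RAG retrieval over LLM-annotated Flink table schemas.
// In practice, embeddings would come from an embedding model and
// live in a vector database; here they are inlined to keep the
// sketch self-contained.

interface SchemaDoc {
  table: string;      // Flink table name
  annotation: string; // LLM-generated description of the table
  embedding: number[]; // embedding vector of the annotation
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the top-k schema annotations most similar to the query embedding.
function retrieveSchemas(query: number[], docs: SchemaDoc[], k: number): SchemaDoc[] {
  return [...docs]
    .sort((x, y) =>
      cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```

The retrieved annotations would then be injected into the prompt, so the model generates Flink SQL against tables that actually exist rather than hallucinated ones.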

In a concrete, step-by-step example, we will demonstrate how to build and deploy a copilot in TypeScript/JavaScript with open-source tools, integrating Flink for stream processing, the latest OpenAI/Mistral models for inference, and vector stores for retrieval. We will also discuss how to select and integrate the best foundation model for a given use case, optimizing for cost, performance, and latency at inference.
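To make the overall flow concrete, a prompt-assembly step like the one such a copilot might use can be sketched as follows. The function name and prompt wording are illustrative assumptions, not the session's actual code; the schema context would come from the vector-store retrieval described above.

```typescript
// Assemble a RAG prompt asking an LLM to generate Flink SQL,
// grounded in retrieved, LLM-annotated table schemas.
// (Hypothetical helper: names and wording are illustrative.)
function buildFlinkSqlPrompt(question: string, schemaContext: string[]): string {
  return [
    "You are a Flink SQL assistant.",
    "Only use the tables described below.",
    "",
    ...schemaContext.map((s, i) => `Schema ${i + 1}: ${s}`),
    "",
    `Question: ${question}`,
    "Respond with a single Flink SQL statement.",
  ].join("\n");
}
```

The resulting string would be sent as the user or system message to the chosen foundation model (OpenAI, Mistral, or another), whose response is the candidate Flink SQL statement to validate and run.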

Steffen Hoellinger

CEO at Airy

San Francisco, California, United States