Total Recall. Get ready for a surprise.
You followed the tutorials. You embedded your docs, wired up a vector database, and shipped a shiny RAG chatbot, only to watch it confidently invent facts. The standard playbook is letting you down.
The twist? The wins in production are often simple and counter-intuitive. The secret isn't a bigger model, it's smarter retrieval.
This talk reveals the overlooked patterns that turn a POC into a production-ready system:
Surprise #1: Your chunks are wrong.
Stop splitting text naïvely. Chunk along your content's native structure (headers, JSON keys) so each embedding keeps its context intact.
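As a taste of what the talk covers, here is a minimal sketch of header-aware chunking (illustrative only; function and field names are my own, not from the session):

```python
import re

def chunk_by_headers(markdown_text: str) -> list[dict]:
    """Split markdown at headers so each chunk keeps its section context."""
    chunks: list[dict] = []
    current_header, current_lines = "", []
    for line in markdown_text.splitlines():
        if re.match(r"^#{1,6}\s", line):  # a new section begins
            if current_lines:
                chunks.append({"header": current_header,
                               "text": "\n".join(current_lines).strip()})
            current_header, current_lines = line.lstrip("# ").strip(), []
        else:
            current_lines.append(line)
    if current_lines:  # flush the final section
        chunks.append({"header": current_header,
                       "text": "\n".join(current_lines).strip()})
    return chunks

doc = "# Setup\nInstall the package.\n# Usage\nCall run() to start."
print(chunk_by_headers(doc))
```

Each chunk carries its header, so the embedding of "Install the package." is never separated from the "Setup" context it depends on.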
Surprise #2: Keywords still matter.
Hybrid dense+sparse (BM25) raises coverage and precision, especially for entities, IDs, and exact phrases.
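One common way to combine sparse and dense result lists is reciprocal rank fusion; a self-contained sketch (the doc IDs and the k=60 constant are illustrative assumptions):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked doc-id lists (e.g. BM25 and dense) by summed RRF score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits  = ["doc_ids", "doc_api", "doc_intro"]   # exact-match strengths
dense_hits = ["doc_intro", "doc_api", "doc_faq"]   # semantic strengths
print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
```

Documents that both retrievers agree on rise to the top, while BM25-only hits for exact IDs or phrases still make the fused list.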
Surprise #3: Retrieval is two-stage.
Add reranking with a stronger cross-encoder to sift the true signal from top-k noise.
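The two-stage shape is simple: a cheap retriever produces top-k candidates, then a stronger scorer reorders them. A sketch with a stand-in scorer (in production the scorer would be a cross-encoder model; the token-overlap function here is a hypothetical placeholder):

```python
def rerank(query: str, candidates: list[str], score_fn, top_n: int = 3) -> list[str]:
    """Second stage: score each (query, candidate) pair with a stronger
    model and keep only the best top_n."""
    return sorted(candidates, key=lambda doc: score_fn(query, doc),
                  reverse=True)[:top_n]

# Stand-in scorer: token overlap. A real pipeline would call a
# cross-encoder here instead of this toy function.
def overlap_score(query: str, doc: str) -> float:
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

candidates = ["reset your password via settings",
              "password reset requires email verification",
              "our office hours are 9 to 5"]
print(rerank("how to reset password", candidates, overlap_score, top_n=2))
```

The off-topic candidate scores lowest and is dropped, which is exactly the noise-filtering role the second stage plays.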
Surprise #4: You can’t steer blind.
Run evals (e.g. Recall@k) so iterations move the numbers, not just the "vibes."
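Recall@k itself is a one-liner worth internalizing; a minimal sketch with a hypothetical eval example (doc IDs are made up):

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant docs that appear in the top-k retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)

# Hypothetical single query: system ranking vs. gold relevant ids.
retrieved = ["d3", "d1", "d7", "d2", "d9"]
relevant  = {"d1", "d2", "d4"}
print(recall_at_k(retrieved, relevant, k=5))
```

Averaged over a query set, this single number tells you whether a chunking or fusion change actually helped, before any "vibes" enter the picture.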
No RAG 101 here. We'll walk through before/after metrics, failure modes, and the trade-offs that separate prototypes from production systems. Attendees leave with a concrete playbook for high-fidelity retrieval that actually remembers.
I'll teach developers how to design smarter retrieval pipelines and prove improvements with metrics. The focus is structure-aware chunking, hybrid dense+sparse search, two-stage retrieval with reranking, and evals that guide iteration beyond vibes.