Session
Bringing Vector Search to Production with OpenSearch
Vector search has moved well beyond experimentation: teams are now running semantic search and retrieval-augmented generation workloads in production. Many implementations, however, struggle once real traffic, cost constraints, and reliability requirements arrive.
In this session, I’ll walk through how to take a vector search setup from prototype to production using OpenSearch. I’ll start with a simple architecture overview (embedding generation, indexing strategies, and query flows), then dive into the decisions that matter most in real systems: choosing between dense and sparse vectors, shard and replica planning, balancing recall against latency, and combining vector search with traditional keyword ranking using hybrid approaches.
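As a taste of the hybrid approach, here is a minimal sketch of an OpenSearch query body that runs a BM25 keyword sub-query alongside an approximate k-NN sub-query. The field names (`title`, `title_embedding`) and the index layout are illustrative assumptions, and the `hybrid` query type additionally requires a search pipeline with a normalization processor configured on the cluster:

```python
def build_hybrid_query(text, vector, k=10, size=10):
    """Build an OpenSearch query body combining lexical and vector search.

    Assumes an index with a text field `title` and a knn_vector field
    `title_embedding` (both placeholder names). The "hybrid" query type
    runs each sub-query and lets a search pipeline's normalization
    processor blend the scores.
    """
    return {
        "size": size,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical (BM25) sub-query against the raw text field.
                    {"match": {"title": {"query": text}}},
                    # Approximate nearest-neighbour sub-query against
                    # the embedding field.
                    {"knn": {"title_embedding": {"vector": vector, "k": k}}},
                ]
            }
        },
    }
```

In practice you would send this body via a client such as `opensearch-py`; the point of the sketch is that hybrid ranking is expressed declaratively, so the lexical and vector sides can be tuned independently.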
I’ll also share how we evaluated performance using practical metrics, what we monitored in production, and how we handled reindexing and cost control as data grew. The talk includes a short demo and a checklist you can reuse to decide when approximate search is “good enough” and when exact search is worth the cost.
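One way to quantify when approximate search is “good enough” is recall@k: for a sample of queries, compare the approximate results against an exact (brute-force) search over the same data. A minimal sketch, with illustrative function and variable names:

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k neighbours that the approximate
    search also returned in its own top k.

    approx_ids: ranked document IDs from the ANN index.
    exact_ids:  ranked document IDs from an exact (brute-force) search,
                treated as ground truth.
    """
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k
```

Averaging this over a representative query sample gives a single number to trade off against latency: if recall@10 stays above your quality bar while p99 latency drops by an order of magnitude, approximate search wins; if not, exact search may be worth the cost for that workload.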