Session
Ship It: From Agent Demo to Production in Minutes
Your agent demo wowed the team. The VP nodded approvingly. "Ship it," they said. Six months later, you are still rewriting it for production. The agent forgets users between sessions. You have no idea why it failed at 3 AM. It cannot handle more than 10 concurrent requests. And last month's bill was four times the estimate. You are not building features anymore; you are rebuilding infrastructure.

The prototype-to-production gap is the graveyard of AI projects. It is not that the agent does not work. It works beautifully in a notebook. The problem is everything around it: memory that persists across sessions so users feel recognized, monitoring that tells you what happened without instrumenting every function, infrastructure that scales to 1,000 users without manual intervention, and cost controls that prevent a single runaway conversation from blowing your budget. These are not hard problems individually, but together they take months when they should take minutes.

In this talk, I will show you:
• How to add cross-session memory to any agent using a managed vector store, so your agent remembers users, their preferences, and past interactions without managing a database
• How to implement zero-code monitoring that captures every agent decision, tool call, and token count without modifying your agent logic
• How to deploy auto-scaling infrastructure that handles traffic spikes gracefully and scales to zero when idle
• How to apply cost optimization patterns that set per-conversation budgets, cache repeated tool calls, and alert before bills surprise you
• A live demo: taking a prototype agent from a notebook to a production endpoint with all four capabilities in under 15 minutes

You will walk away with:
• A production deployment checklist covering memory, monitoring, scaling, and cost, applicable to any agent framework
• Working infrastructure-as-code templates you can deploy to your own cloud account
• A cost estimation model that predicts monthly spend based on
Outline:
• The Six-Month Gap
• Cross-Session Memory with S3 Vectors
• Zero-Code Monitoring and Observability
• Auto-Scaling and Cost Optimization
• The Complete Picture and Resources
Elizabeth Fuentes Leone
Developer Advocate
San Francisco, California, United States