David vonThenen
AI/ML Leader | Keynote Speaker | Book in Progress | Agentic AI, Deep Learning, Production AI | OSS Engineer & Developer Advocate | Python, Go, C++
Long Beach, California, United States
David vonThenen is an AI/ML engineer who focuses on production AI systems, enterprise AI strategy, and architectures for explainable, governable, and reliable generative AI. His work spans Agentic AI, Graph RAG, Document RAG, AI memory systems, multi-agent architectures, OpenSearch, Neo4j, data lineage, provenance, and AI governance.
David has more than 20 years of experience building production software across AI/ML, speech and NLP, Kubernetes, cloud-native platforms, storage, virtualization, and backup/recovery. He combines hands-on engineering with technical strategy, open-source leadership, developer advocacy, and customer-facing architecture work.
He has spoken at NVIDIA GTC, Devoxx, ODSC, DeveloperWeek, WeAreDevelopers, All Things Open AI, and SCaLE on topics including explainable AI, adaptive RAG, agentic systems, model quantization, multimodal ML, voice cloning, and small language model training. His mission is to help teams move beyond AI prototypes and build systems that are observable, reproducible, trustworthy, and ready for production.
Topics
From Vector Search to Better Understanding: How Hybrid RAG Improves Answers, Not Just Matches
Retrieval-Augmented Generation is everywhere, and most teams start with vector search when building RAG agents. It works well when the goal is finding relevant text. It struggles when the task shifts to understanding, summarizing, or reasoning across multiple documents. Developers often discover this the hard way when their system retrieves "relevant" chunks but still produces shallow, inconsistent, or even contradictory answers.
This session introduces Hybrid RAG as a practical alternative. We'll walk through how combining vector retrieval with symbolic and keyword-based approaches changes what the model can actually do. You'll see why Hybrid RAG performs better for synthesis-heavy tasks, how it reduces failure modes common in embedding-only pipelines, and how to implement it in practice. The talk includes multiple live demos that show the differences side by side, using open-source code you can adapt to your own solutions.
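One common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is illustrative only and is not taken from the talk's demos: the document IDs, the two hit lists, and the constant k=60 are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents that rank well in more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from an embedding search and a keyword (BM25-style) search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents returned by both retrievers (doc_a and doc_c here) end up ahead of documents that only one retriever found, which is the basic effect a hybrid pipeline aims for.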
Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI
The AI industry is shifting from bigger to better. As companies chase efficiency and performance, quantization has emerged as one of the most effective ways to make models smaller, faster, and more affordable without crippling accuracy. With recent breakthroughs from teams like DeepSeek proving that optimization can shake entire markets, developers are rethinking what "efficient AI" really means. The real question isn't whether we can make models smarter... it's whether we can make them smarter per watt, per dollar, and per millisecond.
This session explores the full lifecycle of model quantization and how it powers the rise of Small Language Models (SLMs) and agentic AI systems. We'll cover how quantization works, when it pays off, and how it changes deployment tradeoffs across CPUs, GPUs, and AI accelerators. Attendees will walk away with practical techniques for compressing models, tuning quantization-aware training, and deploying specialized SLMs in multi-agent systems that use the Agent2Agent protocol. The end goal is to get the most out of available hardware and stay responsive without breaking the bank.
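As a rough illustration of what quantization does to a set of weights, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an assumption-laden toy, not the workflow presented in the session: the function names and sample weights are hypothetical, and real toolchains add calibration, per-channel scales, and quantization-aware training.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights; each one is stored in 8 bits instead of 32.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

The tradeoff is visible in the last line: storage drops to a quarter of float32, and the price is a rounding error of at most half the scale per weight.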
The Sound of Your Secrets: Teaching Your Model to Spy, So You Can Learn to Defend
Every keyboard has a sound signature. Every click and clack carries information. With deep learning and a decent microphone, that information can be weaponized. In this session, we'll explore how modern AI models can identify what you're typing just from the sound of your keyboard. Using a dataset of recorded keystrokes and an open source sound classification pipeline, we'll walk through building a model that can recover text with startling accuracy. You'll see firsthand how a few lines of Python and a trained network can turn your laptop into an acoustic fingerprint.
But this talk isn't about enabling surveillance... it's about understanding it to fight back. We'll unpack why uniform keyboard layouts and consistent typing styles make these attacks so effective, then explore real countermeasures: signal masking, password entropy, and environmental noise defenses. You'll leave with a practical understanding of how these attacks work, how to reproduce them for research or awareness, and how to harden your systems (and yourself) against them.
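Of the countermeasures listed, password entropy is the easiest to put a number on. The sketch below assumes a password drawn uniformly at random from a fixed alphabet, which is an idealization and not a claim about how the talk measures it; the function name and the two example configurations are hypothetical.

```python
import math


def password_entropy_bits(length, alphabet_size):
    """Entropy in bits of a password chosen uniformly at random."""
    return length * math.log2(alphabet_size)


# 8 lowercase letters versus 16 characters from the 94 printable ASCII symbols.
bits_lower = password_entropy_bits(8, 26)
bits_mixed = password_entropy_bits(16, 94)
print(round(bits_lower, 1), round(bits_mixed, 1))
```

Under this assumption, the more entropy a password carries, the larger the search space that remains when an attacker recovers only some of the keystrokes.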
OpenSearchCon North America 2026 Sessionize Event Upcoming
Render (RenderATL) 2026 Sessionize Event Upcoming
WeAreDevelopers World Congress 2026 - Europe Sessionize Event Upcoming
Devoxx Poland 2026 Upcoming
A Practical Guide to Training a Small Language Model: Tokenizers, Training, and Real-World Pitfalls
DevBcn 2026 Sessionize Event Upcoming
(Virtual) Toronto Machine Learning Society 2026 Upcoming
Why Your RAG Agent Is Confidently Wrong: Retrieval Choices That Actually Matter
NDC Copenhagen 2026 Sessionize Event
AI DevSummit + DeveloperWeek Management 2026 Sessionize Event
Open Data Science Conference (ODSC) AI East 2026
Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI
Open Data Science Conference (ODSC) AI East 2026
Train Your Own Small Language Model: A Hands-On Workshop in Model Design, Distillation, and Deployment
Devoxx Greece 2026
The Sound of Your Secrets: Teaching A Model to Spy, So You Can Learn to Defend
Devoxx Greece 2026
Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI
Devoxx France 2026
The Sound of Your Secrets: Teaching Your Model to Spy So You Can Learn to Defend
Southern California Linux Expo (SCaLE) 23x
A Practical Guide to Training a Small Language Model: Tokenizers, Training, and Real-World Pitfalls
Southern California Linux Expo (SCaLE) 23x
The Sound of Your Secrets: Teaching Your Model to Spy, So You Can Learn to Defend
Devoxx Morocco
Rethinking RAG: How MCP and Multi-Agents Will Transform the Future of Intelligent Search
KubeCon + CloudNativeCon North America 2025 Sessionize Event
Open Data Science Conference (ODSC) West
Rethinking RAG: How MCP and Agent2Agent Will Transform the Future of Intelligent Search
All Things Open
TinyML Meets PyTorch: Deploying AI at the Edge with Python Using ExecuTorch
California Technology Summit
Guardians of AI: Equipping Humans to Detect and Prevent Adversarial Manipulation
API World 2025 Sessionize Event
AI_dev: Open Source GenAI & ML Summit Europe 2025 Sessionize Event
WeAreDevelopers World Congress 2025 Sessionize Event
DevBcn 2025 Sessionize Event
Render (RenderATL) 2025 Sessionize Event
Open Data Science Conference
Workshop: Adaptive RAG Systems with Knowledge Graphs: Building Reinforcement-Learning-Driven AI Applications
Devoxx UK
The Rise of Agentic AI: Harnessing Open Source for Dynamic Decision-Making
Video: https://youtu.be/zYFYqXh1UGg
Devoxx France
Explaining the Unexplainable: Python Tools for AI Transparency using Captum
Video: https://youtu.be/DW1GWoBYZbw
All Things Open AI
Leveraging Knowledge Graphs for RAG: A Smarter Approach to Contextual AI Applications
NVIDIA GTC 2025
Crack the AI Black Box: Practical Techniques for Explainable AI
Video: https://youtu.be/umurTAD4x2Y
Southern California Linux Expo (SCaLE) 22x
Training Multi-Modal ML Classification Models for Real-Time Detection of Debilitating Disease
Southern California Linux Expo (SCaLE) 22x
Demystifying Building Natural Language Processing ML Models and How to Leverage Them By Example
Video: https://youtu.be/hAvCtpKnkLI
Developer Week 2025
KEYNOTE: The Sound of Innovation: Why Voice Cloning Will Redefine Human-Computer Interaction
Developer Week 2025
Navigating the Edge-Cloud Bridge: Building Resource-Optimized IoT/Edge Assistants with LLMs
Open Data Science Conference 2024
Workshop: Building Multiple Natural Language Processing Models to Work In Concert Together
Real Time Communications Conference & Expo 2024
Keynote: Training Machine Learning Classification Models for Creating Real-Time Data Points of Medical Conditions
Video: https://youtu.be/YgeinCCUBCk
Real Time Communications Conference & Expo 2024
Session: Building Multiple Natural Language Processing Models to Work In Concert Together
Video: https://youtu.be/0DHHS17mn_o
AI DevSummit 2024 Sessionize Event
SCaLE 21x (2024)
Title: Voice-Activated AI Collaborators: A Hands-On Guide Using LLMs in IoT & Edge Devices
Video: https://youtu.be/9Nj4hKy70yQ
IEEE RTC Conference
Title: Enhancing Real-Time WebRTC Conversation Understanding Using ChatGPT
Video: https://youtu.be/u-Q2TdzS7d8
IEEE RTC Conference
Title: Edge Devices as Interactive Personal Assistants: Unleashing the Power of Generative AI Agents
Video: https://youtu.be/ctyWBG-x9y8
Nexus x TPF GenAI Rush 2023
Title: Streamlining Communication Workflows
Video: https://www.youtube.com/watch?v=8gfWnN_hwGk
API World 2022
Title: Enabling Untapped Use Cases in ML/AI: Edge, Memory-Constrained, and Server-side Use Cases
Video: https://youtu.be/XYQPIHazMK8