Zero-cloud setups for LLM/GenAI observability

As GenAI shifts from cloud APIs to local, fine-tuned models running directly on developer workstations, traditional logging falls short. Large Language Models (LLMs) introduce complex failure modes: non-deterministic outputs, opaque agentic loops, hidden prompt injection risks, and unpredictable token-dependent latency. To build reliable applications, engineers need deep observability long before their code reaches a production cluster, and they need it without sending sensitive proprietary prompts to external cloud vendors.

This session is a practical, architecture-focused guide to implementing end-to-end LLM observability entirely on your local development machine. We will explore the modern open-source landscape, detailing how to capture spans, trace multi-turn conversations, evaluate retrieval-augmented generation (RAG) quality, and track system resources locally. Walk away with a clear blueprint for comparing and deploying three kinds of tooling: in-process diagnostic engines like Arize Phoenix for instant, notebook-based tracing; self-hosted Docker stacks such as Langfuse for comprehensive agent and prompt management; and standardized semantic conventions via OpenTelemetry, OpenLLMetry, and OpenLIT that cleanly decouple application logic from telemetry backends. Whether you are building basic wrapper scripts or complex autonomous agent networks on top of models served by Ollama or vLLM, this talk equips you with the tools to debug, profile, and optimize your local AI stack securely and efficiently.
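To make the idea of decoupling application logic from the telemetry backend concrete, here is a minimal sketch using only the Python standard library. It is not the API of OpenTelemetry, Phoenix, or Langfuse. `LocalTracer`, `Span`, `fake_generate`, and `answer` are hypothetical names invented for illustration, and the `gen_ai.*` attribute keys are assumed to follow the OpenTelemetry GenAI semantic conventions. Token counts are approximated by whitespace splitting, which a real tokenizer would replace.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List, Optional


@dataclass
class Span:
    """One timed unit of work; child spans point at their parent."""
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    attributes: Dict[str, object] = field(default_factory=dict)
    start: float = 0.0
    end: float = 0.0

    @property
    def duration_ms(self) -> float:
        return (self.end - self.start) * 1000.0


class LocalTracer:
    """Hypothetical tracer: the exporter is any callable taking a Span,
    so the backend can be swapped without touching application code."""

    def __init__(self, exporter: Callable[[Span], None]) -> None:
        self._exporter = exporter
        self._stack: List[Span] = []  # currently open spans, innermost last

    @contextmanager
    def span(self, name: str,
             attributes: Optional[Dict[str, object]] = None) -> Iterator[Span]:
        parent = self._stack[-1] if self._stack else None
        current = Span(
            name=name,
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            attributes=dict(attributes or {}),
            start=time.perf_counter(),
        )
        self._stack.append(current)
        try:
            yield current
        finally:
            current.end = time.perf_counter()
            self._stack.pop()
            self._exporter(current)  # a span is exported when it closes


# In-memory "backend": nothing leaves the machine.
collected: List[Span] = []
tracer = LocalTracer(exporter=collected.append)


def fake_generate(prompt: str) -> str:
    """Stand-in for a call to a locally served model (assumption)."""
    return "echo: " + prompt


def answer(question: str) -> str:
    with tracer.span("chat_turn"):
        with tracer.span("llm_call",
                         {"gen_ai.request.model": "local-model"}) as call:
            reply = fake_generate(question)
            # Whitespace split is a crude placeholder for real token counts.
            call.attributes["gen_ai.usage.input_tokens"] = len(question.split())
            call.attributes["gen_ai.usage.output_tokens"] = len(reply.split())
        return reply
```

Calling `answer("hello world")` appends two spans to `collected`, the inner `llm_call` first and the enclosing `chat_turn` second, both sharing one trace ID. Replacing `collected.append` with a function that writes to a local file or a self-hosted collector would change the backend without modifying `answer`.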

Cajetan Rodrigues

Software @ Volvo Cars

Göteborg, Sweden
