Session
Running MCP Fully Local: Private, Offline-Capable Agents with Ollama and Open Models
Over the last few months I've built and shipped 100+ agents in public, many wired into MCP servers running against local models through Ollama, LM Studio, and decentralized open-source inference. Examples include a draw.io MCP server driven by Llama 3.2, an Excalidraw MCP integration, and a Bright Data MCP briefing agent, all running end-to-end without a single token leaving the host.
This talk is a working engineer's tour of that stack. We'll walk through a minimal local MCP setup (server + client + Ollama) and then dig into the real-world failure modes: tool-selection collapse on 7B models, JSON-schema compliance gaps, capability-negotiation mismatches, and the surprisingly large quality delta between structured-output fine-tunes and general chat models. I'll share the prompt shapes, tool-description patterns, and schema-validation tricks that reliably push small open models from "demo-grade" to "I'd ship this internally."
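To give a flavor of the schema-validation idea, here is a minimal sketch and not the talk's actual code. It assumes the model emits a tool call as a JSON object with `name` and `arguments` keys, and it checks only a small subset of JSON Schema (required keys and primitive types). The `create_diagram` tool and its fields are hypothetical, made up for illustration.

```python
import json
from typing import List, Optional, Tuple

# Hypothetical tool registry; the tool name and fields are illustrative only.
TOOLS = {
    "create_diagram": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "node_count": {"type": "integer"},
        },
        "required": ["title"],
    }
}

# Mapping from JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_tool_call(raw: str) -> Tuple[Optional[dict], List[str]]:
    """Parse raw model output and return (call, errors).

    The call is returned only when no errors were found. The error strings
    are written so they can be fed back to the model in a retry prompt.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"output is not valid JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return None, ["output must be a JSON object with 'name' and 'arguments'"]

    name = call.get("name")
    schema = TOOLS.get(name) if isinstance(name, str) else None
    if schema is None:
        return None, [f"unknown tool {name!r}; choose one of {sorted(TOOLS)}"]

    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return None, ["'arguments' must be a JSON object"]

    errors = []
    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required argument '{key}'")
    for key, value in args.items():
        prop = schema["properties"].get(key)
        if prop is None:
            errors.append(f"unexpected argument '{key}'")
            continue
        expected = JSON_TYPES[prop["type"]]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_bool_as_number = isinstance(value, bool) and prop["type"] in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            errors.append(f"argument '{key}' must be of type {prop['type']}")
    return (call if not errors else None), errors
```

In a retry loop, a non-empty error list would be appended to the next prompt so a small model gets a concrete correction rather than a silent failure.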
Attendees will leave with a reference architecture for private MCP, a shortlist of open models that actually handle tool calls well today, and a set of design patterns for MCP servers that degrade gracefully when the client LLM has 8B parameters instead of a trillion.
Harish Kotra
Developer Relations, Hackathons Specialist & A No-Code Educator
Hyderābād, India