
Running MCP Fully Local: Private, Offline-Capable Agents with Ollama and Open Models

Over the last few months I've built and shipped 100+ agents in public, many wired into MCP servers running against local models through Ollama, LM Studio, and decentralized open-source inference. Examples include a draw.io MCP server driven by Llama 3.2, an Excalidraw MCP integration, and a Bright Data MCP briefing agent, all running end-to-end without a single token leaving the host.

This talk is a working engineer's tour of that stack. We'll walk through a minimal local MCP setup (server + client + Ollama) and then dig into the real-world failure modes: tool-selection collapse on 7B models, JSON-schema compliance gaps, capability-negotiation mismatches, and the surprisingly large quality delta between structured-output fine-tunes and general chat models. I'll share the prompt shapes, tool-description patterns, and schema-validation tricks that reliably push small open models from "demo-grade" to "I'd ship this internally."
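
To give a flavor of the schema-validation idea, here is a minimal sketch rather than the code from the talk. It assumes a hypothetical tool named `create_diagram` with two required arguments, and a `model` callable that takes a prompt string and returns the model's raw text; both are placeholders, and nothing here calls the real Ollama or MCP APIs. The loop validates each reply and, on failure, feeds the specific error back to the model before retrying.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical tool description; the name and argument types are
# illustrative assumptions, not a real MCP server's schema.
TOOL_SCHEMA = {
    "name": "create_diagram",
    "required": {"title": str, "nodes": list},
}


def validate_tool_call(raw: str, schema: dict) -> Tuple[Optional[dict], Optional[str]]:
    """Return (arguments, None) if the reply is a valid call, else (None, error)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(call, dict):
        return None, "tool call must be a JSON object"
    if call.get("name") != schema["name"]:
        return None, f"unknown tool {call.get('name')!r}"
    args = call.get("arguments")
    if not isinstance(args, dict):
        return None, "'arguments' must be an object"
    # Check presence and basic type of every required argument.
    for key, expected in schema["required"].items():
        if key not in args:
            return None, f"missing argument {key!r}"
        if not isinstance(args[key], expected):
            return None, f"argument {key!r} must be {expected.__name__}"
    return args, None


def call_with_repair(model: Callable[[str], str], prompt: str,
                     schema: dict, max_attempts: int = 3) -> dict:
    """Ask the model for a tool call, re-prompting with the error until it validates."""
    feedback = ""
    for _ in range(max_attempts):
        raw = model(prompt + feedback)
        args, error = validate_tool_call(raw, schema)
        if error is None:
            return args
        # Tell the model exactly what was wrong; small models often fix it on retry.
        feedback = f"\nYour last tool call was rejected: {error}. Reply with JSON only."
    raise ValueError(f"no valid tool call after {max_attempts} attempts")
```

In a real setup `model` would wrap a request to the local inference server, and the schema check would more likely use a full JSON Schema validator than this hand-rolled one.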

Attendees will leave with a reference architecture for private MCP, a shortlist of open models that actually handle tool calls well today, and a set of design patterns for MCP servers that degrade gracefully when the client LLM has 8B parameters instead of a trillion.

Harish Kotra

Developer Relations, Hackathons Specialist & A No-Code Educator

Hyderābād, India
