Ollama MCP Server (Python)

Supercharge your AI assistant with local LLM access

A Python implementation of the Model Context Protocol (MCP) server that exposes Ollama SDK functionality as MCP tools, enabling seamless integration between your local LLM models and MCP-compatible applications like Windsurf and VS Code.

This is a Python port of the TypeScript ollama-mcp project.


Features

  • ☁ Ollama Cloud Support — Full integration with Ollama's cloud platform
  • 🔧 8 Comprehensive Tools — Full access to Ollama's SDK functionality
  • 🔄 Hot-Swap Architecture — Automatic tool discovery with zero-config
  • 🎯 Type-Safe — Built with Pydantic models and type hints
  • 🚀 Minimal Dependencies — Lightweight and fast
  • 🔌 Drop-in Integration — Works with Windsurf, VS Code, and other MCP clients
  • 🌐 Web Search & Fetch — Real-time web search and content extraction via Ollama Cloud (planned)
  • 🔀 Hybrid Mode — Use local and cloud models seamlessly in one server
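The hot-swap architecture above is commonly built as a decorator-based registry: each tool function registers itself at import time, so the server discovers new tools with zero configuration. This is a minimal sketch of that pattern; the `tool` decorator and `TOOLS` registry are hypothetical illustrations, not this project's actual API.

```python
from typing import Callable, Dict

# Registry of discovered tools, keyed by function name.
TOOLS: Dict[str, Callable] = {}

def tool(fn: Callable) -> Callable:
    """Hypothetical decorator: registers a function as an MCP tool at import time."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def ollama_chat(model: str, prompt: str) -> str:
    # Placeholder body; the real server would forward this call to the Ollama SDK.
    return f"[{model}] response to: {prompt!r}"

@tool
def ollama_list() -> list:
    # Placeholder body; the real server would return the locally installed models.
    return []

# Adding another @tool-decorated function is all it takes: the server
# "discovers" it simply by importing the module that defines it.
print(sorted(TOOLS))
```

The same idea scales to scanning a package directory with `importlib` so that dropping a new module into a `tools/` folder is enough to expose it.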

Why Python?

This Python implementation provides the same functionality as the TypeScript version but with:

  • Python Native — No Node.js dependencies required
  • Poetry Package Management — Modern Python dependency management
  • Async/Await — Native Python async support
  • Pydantic Models — Robust data validation and type safety
  • Poetry Scripts — Easy installation and execution

Quick Example

In your MCP-compatible chat window, try prompts like:

  • MCP Tool: ollama / ollama_chat — Use model llava and tell me a bedtime story
  • MCP Tool: ollama / ollama_chat — Use model gpt-oss and tell me a bedtime story
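For prompts like these to work, the server must first be registered with your MCP client. The exact key names and file location vary by client (Windsurf and VS Code each have their own settings file), and the `ollama-mcp` command shown here is an assumed Poetry script name; a typical entry might look like:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "ollama-mcp"
    }
  }
}
```

Once the client restarts and loads this configuration, the server's tools appear under the `ollama` namespace.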

Next Steps