Model Context Protocol (MCP) is an open standard introduced by Anthropic in late 2024 that defines how AI models communicate with external tools, data sources, and services. Where function calling lets a model invoke functions defined inline in a single API request, MCP defines a persistent, structured protocol between a model host and external capability servers. This guide explains how the protocol actually works — the architecture, the three core primitives, the transport layer, and what it means in practice for engineers building on top of AI systems.

The Problem MCP Solves

Before MCP, connecting an AI model to external capabilities required custom integration code for every combination of model and tool. A team using Claude to query a database, search a knowledge base, and call an internal API would write three separate integration layers, each tied to a specific model provider's API format. Switching models or adding capabilities meant rewriting integrations.

The deeper problem is context. AI models are stateless — they process a prompt and return a response. For an agent to be genuinely useful across complex tasks, it needs access to current data, the ability to take actions, and structured ways to receive results back. Without a standard protocol, every team solved this differently, producing fragmented, non-portable tooling.

MCP defines a shared interface. A capability built as an MCP server works with any MCP-compatible host — Claude, GPT, Gemini, or a local model — without modification. It separates the question of "what can the AI do" from "which AI are you using."

The Three-Layer Architecture

MCP operates across three distinct components: the host, the client, and the server.

The host is the application that contains or interfaces with the AI model. Claude Desktop, VS Code with an AI extension, a custom agent application — any software that manages the model's context and surfaces its output to a user or downstream system is a host. The host is responsible for deciding which MCP servers to connect to and mediating the model's access to them.

The client is a component inside the host that manages the connection to a specific MCP server. One host can maintain multiple clients, each connected to a different server. The client handles the protocol-level communication: sending requests, receiving responses, and managing the lifecycle of the connection.

The server is the external process or service that exposes capabilities. A server might expose access to a file system, a database, a REST API, a code execution environment, or anything else. Servers are lightweight and focused — a well-designed MCP server does one thing well. They run as separate processes and communicate with the client over a defined transport.

The separation between host, client, and server means capability servers are independent of the model. You can build a server that exposes your company's internal knowledge base, and it works with any MCP-compatible host without any awareness of which model is on the other end.

The Three Core Primitives

MCP servers expose capabilities through three primitives: tools, resources, and prompts. Each has a distinct purpose and a defined interaction pattern.

Tools

Tools are executable functions the model can call. A tool has a name, a description, and an input schema defined in JSON Schema format. The model reads the description to understand what the tool does and the schema to know what parameters to pass.

When the model decides to use a tool, it generates a tool call with the appropriate parameters. The client sends the call to the server; the server executes the function and returns the result. The result goes back into the model's context as tool output, which the model uses to continue generating its response.

Examples: a tool that runs a SQL query and returns rows, a tool that searches a vector database and returns matching documents, a tool that creates a GitHub issue, a tool that executes a shell command. Tools are the "do something" primitive — they produce side effects or retrieve data on demand.
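
Concretely, a tool invocation is a single JSON-RPC exchange between client and server. The sketch below uses a hypothetical get_weather tool; the field names follow the tools/call schema, and the id and values are illustrative.

Request (client to server):

{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": {"city": "Berlin"}
  }
}

Result (server to client):

{
  "jsonrpc": "2.0",
  "id": 7,
  "result": {
    "content": [{"type": "text", "text": "Weather in Berlin: 18°C, partly cloudy"}]
  }
}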

Resources

Resources are data that the server exposes for the model to read. Unlike tools, resources are not invoked with parameters — they're addressable by a URI and read directly. A resource might be a file, a database record, a configuration document, or a live data feed.

Resources have a URI scheme defined by the server. A file server might expose resources at file:///path/to/document.txt. A database server might expose records at db://customers/12345. The client reads resources and injects their contents into the model's context.

The distinction from tools matters: tools are called to perform an action or fetch data dynamically; resources are static or semi-static data sources the model can reference. A good rule of thumb: if the capability produces side effects or requires parameters, it's a tool. If it's simply data the model should be able to read, it's a resource.
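
As a sketch, reading the file resource mentioned above is a single resources/read request; the field names follow that schema and the content is illustrative.

Request:

{
  "jsonrpc": "2.0",
  "id": 8,
  "method": "resources/read",
  "params": {"uri": "file:///path/to/document.txt"}
}

Result:

{
  "jsonrpc": "2.0",
  "id": 8,
  "result": {
    "contents": [
      {"uri": "file:///path/to/document.txt", "mimeType": "text/plain", "text": "..."}
    ]
  }
}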

Prompts

Prompts are reusable, parameterised message templates that servers expose for use in conversations. A server might expose a prompt template for summarising a document, drafting a specific type of report, or following a particular reasoning pattern.

Prompts let server authors encode domain knowledge and task-specific instructions that hosts can surface to users or inject automatically. They're more structured than system prompts set by the host — they come with metadata, can accept parameters, and are discoverable through the protocol.

In practice, prompts are the least commonly used of the three primitives today, but they provide a clean way for capability servers to ship task-specific instructions alongside their tools and resources.
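
As a sketch, retrieving a prompt is a prompts/get request, and the server returns ready-to-use messages. The prompt name and arguments below are hypothetical; the result shape follows the prompts/get schema.

Request:

{
  "jsonrpc": "2.0",
  "id": 9,
  "method": "prompts/get",
  "params": {
    "name": "summarise_document",
    "arguments": {"style": "executive"}
  }
}

Result:

{
  "jsonrpc": "2.0",
  "id": 9,
  "result": {
    "description": "Summarise a document in a given style",
    "messages": [
      {
        "role": "user",
        "content": {"type": "text", "text": "Summarise the following document as an executive briefing: ..."}
      }
    ]
  }
}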

How the Transport Layer Works

MCP uses JSON-RPC 2.0 as its message format. Every interaction between client and server is a JSON-RPC request or notification, carried over one of two transport mechanisms.

stdio transport runs the server as a local subprocess. The client launches the server process and communicates with it over standard input and output streams. This is the simplest transport and the default for local development — no network, no ports, no authentication overhead. It's appropriate for servers that run on the same machine as the host, such as file system access, local database connections, or development tools.
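
In practice, the host's configuration names a command to launch, and the client spawns the process and wires up stdin and stdout. The entry below mirrors the JSON configuration format used by hosts such as Claude Desktop; the server name, command, and script are placeholders.

{
  "mcpServers": {
    "weather": {
      "command": "python",
      "args": ["my_server.py"]
    }
  }
}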

HTTP with Server-Sent Events (SSE) runs the server as a network service. The client connects to an HTTP endpoint; the server streams responses back using SSE. This transport is appropriate for remote servers, shared infrastructure, or any scenario where the server needs to be accessible from multiple hosts or deployed independently.

The choice of transport is transparent to the model — it sees tool results regardless of whether they came from a local subprocess or a remote service. Server authors pick the transport that fits their deployment model; host authors implement support for both.

The Session Lifecycle

An MCP session begins with an initialisation handshake. The client sends an initialize request containing its protocol version and capabilities. The server responds with its own version, capabilities, and information about the primitives it exposes. Both sides negotiate a compatible protocol version before any capability calls are made.
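
A sketch of the handshake, with illustrative version and capability payloads (field names follow the initialize schema):

Client request:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "example-host", "version": "1.0.0"}
  }
}

Server response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": {"tools": {}, "resources": {}},
    "serverInfo": {"name": "my-server", "version": "0.1.0"}
  }
}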

After initialisation, the client can query the server's available tools (tools/list), resources (resources/list), and prompts (prompts/list). The host injects this capability manifest into the model's context — the model reads it to understand what's available and decides when to use each capability.

During a session, the model can invoke tools (tools/call), read resources (resources/read), and retrieve prompt templates (prompts/get). The server can also send notifications to the client — for example, notifying that a resource has been updated, or sending log messages for debugging.

Sessions end when the client sends a termination message or the transport connection closes. Servers are expected to clean up any state associated with the session.
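
Host-side, the official Python SDK wraps this lifecycle in a client session. A minimal sketch, assuming a stdio server saved as my_server.py that exposes a get_weather tool (both are placeholders):

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the server as a subprocess and connect to it over stdio.
    params = StdioServerParameters(command="python", args=["my_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialisation handshake: protocol version and capability negotiation.
            await session.initialize()

            # Discover what the server exposes.
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # Invoke a tool by name with JSON arguments.
            result = await session.call_tool("get_weather", {"city": "Berlin"})
            print(result.content)

asyncio.run(main())

Closing the session and transport context managers performs the shutdown described above.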

How the Model Decides to Use Tools

The model does not directly access MCP servers. The host injects the server's capability descriptions into the model's context — typically as part of the system prompt or as structured tool definitions. The model reads these descriptions and decides, based on the conversation, whether to use a capability and which parameters to pass.

This means tool descriptions are load-bearing. A tool named search with the description "searches documents" will be used differently than one named semantic_search with the description "performs vector similarity search over the knowledge base, returning the top-k most relevant document chunks with their source metadata." The model's ability to select the right tool for the task depends entirely on how well the descriptions communicate intent, scope, and expected output.

The host also controls which tools are visible to the model at any given turn. A host can selectively expose subsets of available tools based on context — a coding assistant might only surface file system and code execution tools, not calendar or email tools, even if all are connected.

MCP vs. Function Calling

Function calling, supported by most major model APIs, lets you define a set of functions in the API request and have the model generate structured calls to them. MCP builds on top of this pattern but is architecturally different in several ways.

Function calling is request-scoped. You define functions per request, and the model calls them within that request's context. MCP is session-scoped. Servers connect once and remain available throughout a session, and they can maintain state, stream updates, and send notifications between turns.

Function calling is model-specific. Each provider has its own format for defining functions and receiving call results. MCP is model-agnostic. Servers implement the MCP protocol once and work with any compliant host.

Function calling handles one primitive — executable functions. MCP handles three — tools, resources, and prompts — with defined lifecycle, discovery, and versioning semantics for each.

In practice, many MCP hosts implement tool calls by translating them into the underlying model's function calling format. The model sees function call syntax; the host translates that into MCP tool calls on the server. The model API and the MCP protocol operate at different layers.
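
As a rough sketch of that translation, a host talking to Anthropic's Messages API might map the tools discovered from an MCP server into that API's tool format; the field names below follow that API, and other providers use different shapes.

def to_model_tools(mcp_tools):
    # MCP tool definitions already carry JSON Schema, so the mapping is mostly renaming.
    return [
        {
            "name": tool.name,
            "description": tool.description,
            "input_schema": tool.inputSchema,
        }
        for tool in mcp_tools
    ]

When the model responds with a tool call, the host does the reverse: it extracts the tool name and arguments and issues a tools/call request to the server that owns that tool.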

Security Considerations

MCP servers can execute code, read files, query databases, and call external APIs. The trust model matters.

Hosts are responsible for deciding which servers to connect to. A user-controlled host like Claude Desktop presents server connection requests to the user for approval. A programmatic agent that auto-connects to any server in a configuration file has a much larger attack surface.

Tool descriptions can contain injected instructions. A malicious server could describe a tool in a way that manipulates the model into calling it in unintended ways or leaking context from other tools. This is a form of prompt injection at the capability layer — hosts should treat server-provided descriptions with the same caution as user-provided input.

Stdio servers run as local processes with the host's OS permissions. A compromised or malicious stdio server has access to whatever the host process can access. Sandboxing server processes is not currently part of the MCP specification but is a sensible operational practice.

For remote SSE servers, standard network security applies: TLS, authentication tokens, rate limiting, and scope restriction on what data each server can access.

Building an MCP Server

Anthropic publishes official SDKs for Python and TypeScript. A minimal server in Python registers tools using a decorator pattern and starts a transport:

import asyncio

from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent

app = Server("my-server")

@app.list_tools()
async def list_tools():
    # Advertise the tools this server exposes, with JSON Schema input definitions.
    return [
        Tool(
            name="get_weather",
            description="Returns current weather for a given city",
            inputSchema={
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    # Handle an invocation; results go back to the client as content blocks.
    if name == "get_weather":
        city = arguments["city"]
        return [TextContent(type="text", text=f"Weather in {city}: 18°C, partly cloudy")]
    raise ValueError(f"Unknown tool: {name}")

async def main():
    # Run over stdio: the host launches this process and speaks JSON-RPC on stdin/stdout.
    async with stdio_server() as (read, write):
        await app.run(read, write, app.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())

The server declares its tools in list_tools, handles invocations in call_tool, and runs on stdio transport. To expose it via SSE for remote access, swap stdio_server for the SSE transport and add an HTTP server layer.

Resources follow the same pattern with list_resources and read_resource handlers. Prompts use list_prompts and get_prompt.
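
For illustration, resource handlers on the server above might look like the sketch below; the URI and file path are placeholders, and the exact return type accepted by read_resource varies slightly across SDK versions.

from mcp.types import Resource

@app.list_resources()
async def list_resources():
    # Advertise readable resources by URI.
    return [
        Resource(
            uri="file:///tmp/notes.txt",
            name="Team notes",
            mimeType="text/plain",
        )
    ]

@app.read_resource()
async def read_resource(uri):
    # Return the content for the requested URI.
    if str(uri) == "file:///tmp/notes.txt":
        with open("/tmp/notes.txt") as f:
            return f.read()
    raise ValueError(f"Unknown resource: {uri}")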

The Ecosystem Today

Since its release, MCP has attracted implementations from major developer tooling vendors. Databases (PostgreSQL, SQLite), development environments (VS Code, JetBrains), version control platforms (GitHub, GitLab), productivity tools (Slack, Google Drive, Notion), and monitoring systems have all published MCP servers. Anthropic maintains a registry of community servers, and most major AI application frameworks have added MCP support.

The adoption pattern suggests MCP is settling into the role it was designed for: a standard integration layer that separates capability development from model development. Teams can build internal MCP servers for proprietary data and tools without tying themselves to a specific model provider, and switch or combine models as requirements change.

Key Takeaways

MCP is a client-server protocol built on JSON-RPC 2.0 that standardises how AI hosts connect to external capabilities. The architecture separates hosts (AI applications), clients (connection managers), and servers (capability providers) into independent components.

Servers expose three primitives: tools for executable functions, resources for readable data, and prompts for reusable message templates. Two transports are defined: stdio for local processes and HTTP+SSE for remote services.

The practical impact for engineering teams is portability. A capability server built to the MCP specification works with any compliant host, removing the need to rewrite integrations when models change. For teams building on top of AI systems with meaningful tool use requirements, MCP provides a more structured and maintainable integration path than ad-hoc function calling implementations.