MCP Sampling

Q: What is MCP Sampling?

MCP sampling enables MCP servers to make LLM requests through the client — the server delegates reasoning to the language model.

What Is MCP Sampling?

MCP sampling solves a core problem with AI agents: servers can independently delegate reasoning tasks to the language model without the client needing to be programmed for it. This enables more complex, autonomous workflows. For developers, sampling is the key to AI agents that can handle multi-step tasks independently.

MCP sampling is an advanced feature of the Model Context Protocol (MCP) that inverts the usual communication flow. Normally, the client (e.g., an AI application) queries the server for data or tools. With sampling, the MCP server asks the client to perform an LLM request — the server delegates reasoning to the client’s language model.

The flow works like this: an MCP server determines that it needs the capabilities of an LLM for a task — for example, to summarize a text, classify data, or make a decision. It sends a sampling request to the client with the desired prompt. The client executes the LLM request and returns the result to the server. Importantly, the client retains control and can filter or reject sampling requests — an important security mechanism.

For businesses developing their own MCP servers, sampling opens up new architectural possibilities. Servers can become more intelligent without needing to run their own LLM. Combined with MCP resources and tool calling, this creates a flexible infrastructure where data, tools, and reasoning are cleanly separated yet seamlessly connected.

In brief

What Is MCP Sampling?