Architecture

MCP follows a client-server architecture where an MCP host — an AI application like Claude Code, Codex, Antigravity etc — establishes connections to one or more MCP servers. The MCP host accomplishes this by creating one MCP client for each MCP server. Each MCP client maintains a dedicated connection with its corresponding MCP server.

Local MCP servers that use the STDIO transport typically serve a single MCP client, whereas remote MCP servers that use the Streamable HTTP transport will typically serve many MCP clients. The key participants in the MCP architecture are:

MCP Host — LLM application (such as Cursor) that manages connections
MCP Client — Maintains 1:1 connections with MCP servers
MCP Server — Provides context, tools, and capabilities to the LLMs

Layers

Data Layer

The data layer implements a JSON-RPC 2.0 based exchange protocol that defines the message structure and semantics. This layer includes:

Lifecycle management: Handles connection initialization, capability negotiation, and connection termination between clients and servers
Server features: Enables servers to provide core functionality including tools for AI actions, resources for context data, and prompts for interaction templates from and to the client
Client features: Enables servers to ask the client to sample from the host LLM, elicit input from the user, and log messages to the client
Utility features: Supports additional capabilities like notifications for real-time updates and progress tracking for long-running operations

Transport Layer

The transport layer manages communication channels and authentication between clients and servers. It handles connection establishment, message framing, and secure communication between MCP participants. MCP supports two transport mechanisms:

Stdio transport: Uses standard input/output streams for direct process communication between local processes on the same machine, providing optimal performance with no network overhead, this is generally used when you want to set up the MCP server locally.
Streamable HTTP transport: Uses HTTP POST for client-to-server messages with optional Server-Sent Events for streaming capabilities. This transport enables remote server communication and supports standard HTTP authentication methods including bearer tokens, API keys, and custom headers. MCP recommends using OAuth to obtain authentication tokens, this is generally used when you want to connect with some other servers on the Internet by different provides (like GitHub, Supabase etc.) or want to connect with your own MCP server that is running on the remote.

The transport layer abstracts communication details from the protocol layer, enabling the same JSON-RPC 2.0 message format across all transport mechanisms.

Transport and Data Layer

Lifecycle

MCP is a stateful protocol (A subset of MCP can be made stateless using the Streamable HTTP transport) that requires lifecycle management. The purpose of lifecycle management is to negotiate the capabilities (Features and operations that a client or server supports, such as tools, resources, or prompts) that both client and server support.

The lifecycle mainly consist of three phases :

Initialization: Capability negotiation and protocol version agreement
Operation: Normal protocol communication
Shutdown: Graceful termination of the connection

Primitives

MCP primitives are the most important concept within MCP. They define what clients and servers can offer each other. These primitives specify the types of contextual information that can be shared with AI applications and the range of actions that can be performed. MCP defines three core primitives that servers can expose:

Tools: Executable functions that AI applications can invoke to perform actions (e.g., file operations, API calls, database queries)
Resources: Data sources that provide contextual information to AI applications (e.g., file contents, database records, API responses)
Prompts: Reusable templates that help structure interactions with language models (e.g., system prompts, few-shot examples)

Each primitive type has associated methods for discovery (*/list), retrieval (*/get), and in some cases, execution (tools/call). MCP clients will use the */list methods to discover available primitives. For example, a client can first list all available tools (tools/list) and then execute them. This design allows listings to be dynamic.

As a concrete example, consider an MCP server that provides context about a database. It can expose tools for querying the database, a resource that contains the schema of the database, and a prompt that includes few-shot examples for interacting with the tools.

MCP also defines primitives that clients can expose. These primitives allow MCP server authors to build richer interactions.

Sampling: Allows servers to request language model completions from the client’s AI application. This is useful when server authors want access to a language model, but want to stay model-independent and not include a language model SDK in their MCP server. They can use the sampling/createMessage method to request a language model completion from the client’s AI application.
Elicitation: Allows servers to request additional information from users. This is useful when server authors want to get more information from the user, or ask for confirmation of an action. They can use the elicitation/create method to request additional information from the user.
Logging: Enables servers to send log messages to clients for debugging and monitoring purposes.

Besides server and client primitives, the protocol offers cross-cutting utility primitives that augment how requests are executed:

Tasks (Experimental): Durable execution wrappers that enable deferred result retrieval and status tracking for MCP requests (e.g., expensive computations, workflow automation, batch processing, multi-step operations)

Notifications

The protocol supports real-time notifications to enable dynamic updates between servers and clients. For example, when a server’s available tools change—such as when new functionality becomes available or existing tools are modified—the server can send tool update notifications to inform connected clients about these changes. Notifications are sent as JSON-RPC 2.0 notification messages (without expecting a response) and enable MCP servers to provide real-time updates to connected clients.