Architecture Overview
- In-Process: The tool is a Python async function running in your own process. The product of create_sdk_mcp_server communicates with the CLI via the SDK’s control channel without spawning a child process.
- External: You declare a child process or remote URL in the configuration; the CLI handles connection, discovery, and invocation.
Three Integration Methods
| Method | Config type | Process Boundary | Use Case |
|---|---|---|---|
| In-Process | 'sdk' (created by create_sdk_mcp_server) | Same process | Custom application tools that need direct access to host state |
| Stdio | 'stdio' (can be omitted) | Child process | Existing MCP toolkits (@modelcontextprotocol/server-*) |
| SSE / HTTP | 'sse' / 'http' | Remote | Remote services, SaaS tools, services requiring OAuth |
All three are declared in options.mcp_servers and work with both query() and QoderSDKClient.
💡 mcp_servers can also accept a str / pathlib.Path pointing to a JSON configuration file; the SDK passes it through to the CLI as --mcp-config <path>.
In-Process Server (Recommended)
In-process tools are the most straightforward extension method: define a regular async function, declare a schema with the decorator, and it becomes callable by the Agent. For detailed @tool() / schema / handler behavior, see tools.md; this section only covers parts related to MCP server assembly.
30-Second Getting Started
@tool() / create_sdk_mcp_server() Full Signatures
| Parameter | Description |
|---|---|
| name (tool) | Tool name; the fully-qualified name will be mcp__<server>__<name> |
| description | Description for the model, determining when the AI invokes it; clearly state What/When |
| input_schema | Three forms: simple dict / TypedDict / full JSON Schema dict; see Tools Reference - input_schema |
| annotations | MCP tool annotations; see table below |
| name (server) | Server name (determines tool prefix mcp__<name>__) |
| version | Defaults to '1.0.0' |
| tools | List of SdkMcpTool |
McpSdkServerConfig looks like {"type": "sdk", "name": ..., "instance": ...} and can be placed directly into options.mcp_servers.
Annotations Actually Consumed
The three fields below are consumed by the SDK and returned to the host via get_mcp_status().mcpServers[i].tools[i].annotations:
| Field | What it does | Host-side reads as |
|---|---|---|
| readOnlyHint | Declares the tool is read-only. Read-only tools may run concurrently (they don’t block each other in the same batch); the TUI renders a [read-only] badge in tool details | annotations.readOnly |
| destructiveHint | Declares the tool performs destructive operations. The TUI renders a [destructive] badge in tool details | annotations.destructive |
| openWorldHint | Declares the tool interacts with the outside world (e.g., web search, third-party API calls). The TUI renders an [open-world] badge in tool details | annotations.openWorld |
Note that host-side field names drop the Hint suffix: readOnlyHint → annotations.readOnly, and so on. The annotations object only contains fields that were explicitly set.
⚠️ These fields do NOT affect auto-mode permission decisions. The CLI treats server-declared annotations as unverifiable advisory metadata (servers can freely under- or over-declare) and intentionally keeps them out of the permission pipeline, because admitting them would launder authority for the server’s self-assessment. To hard-block specific tools, use disallowed_tools, the tools visibility allowlist, or hooks; annotations are for host-side identification (get_mcp_status) and TUI display only.
idempotentHint and title are not currently consumed by the SDK — passing them won’t error, but they neither affect CLI behavior nor appear in get_mcp_status(). If your application needs this information, maintain the mapping yourself on the host side.
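The renaming rule described above can be sketched as a small pure function. This is only an illustration of the documented mapping, not the SDK's code; host_side_annotations and CONSUMED_HINTS are hypothetical names.

```python
# Hypothetical helper: mirrors the documented mapping, not the SDK's implementation.
CONSUMED_HINTS = ("readOnlyHint", "destructiveHint", "openWorldHint")


def host_side_annotations(declared: dict) -> dict:
    """Return annotations in the shape the host is documented to read them."""
    result = {}
    for hint in CONSUMED_HINTS:
        if hint in declared:  # only explicitly set fields are reported
            result[hint[: -len("Hint")]] = declared[hint]  # drop the "Hint" suffix
    return result


# idempotentHint and title are accepted but not consumed, so they drop out.
declared = {"readOnlyHint": True, "idempotentHint": True, "title": "Lookup"}
print(host_side_annotations(declared))
```

If the host needs idempotentHint or title, it would have to keep its own mapping next to the tool definitions, as noted above.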
💡 About maxResultSizeChars: The Python SDK writes anthropic/maxResultSizeChars into the tool’s _meta via ToolAnnotations(maxResultSizeChars=...), and the CLI uses this to relax the default 50K return length limit. This field is a Python-side incremental capability (TS exposes it via the same-named annotation; wire format is identical).
Handler Return Value
On failure, return is_error: True instead of throwing an exception. The full content type description, along with several behavior differences between Python and TS (resource_link degrades to text, top-level _meta is not passed through, binary embedded resources are skipped), is in Tools Reference - CallToolResult.
Handler Cancellation Signal
The handler can optionally accept a second parameter, ToolInvocationContext, to cooperatively exit when the CLI cancels the current call via extra.signal.
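The cooperative-exit pattern can be sketched with the standard library alone. The real ToolInvocationContext is not shown in this section, so FakeInvocationContext below is a hypothetical stand-in that models the signal as an asyncio.Event; the actual attribute type may differ.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeInvocationContext:
    """Hypothetical stand-in; the real ToolInvocationContext may look different."""
    signal: asyncio.Event = field(default_factory=asyncio.Event)


async def slow_count(args: dict, extra: FakeInvocationContext) -> dict:
    """Count up to args['n'], stopping early once cancellation is signalled."""
    done = 0
    for _ in range(args["n"]):
        if extra.signal.is_set():  # cooperative exit point, checked once per step
            return {"content": [{"type": "text", "text": f"cancelled after {done} steps"}]}
        await asyncio.sleep(0)  # yield control so a canceller gets a chance to run
        done += 1
    return {"content": [{"type": "text", "text": f"counted {done}"}]}
```

The point of the pattern is that the handler checks the signal at safe points and returns promptly, rather than being torn down in the middle of a write.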
⚠️ Do not rely on reusing one server config across multiple query() calls to share state: each query binds an independent transport. Reusing the config has no side effects, but it does not give you “cross-query shared state” either; for shared state, place it in module scope outside the handler closure.
Stdio Server
Communicates with MCP servers via a child process’s stdin/stdout. The @modelcontextprotocol/server-* packages on npm are all stdio implementations.
If command is unreachable or fails to start, it does not bring down the entire query: that server’s status stays non-'connected', and other servers are unaffected.
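A stdio entry might look like the sketch below. Only type and command are named in this document; the args key follows the common MCP stdio config layout and is an assumption here, as is the server name filesystem and the example directory.

```python
# Assumed layout: "type" and "command" appear in this document; "args" follows
# the common MCP stdio config shape and is not confirmed by this section.
mcp_servers = {
    "filesystem": {
        "type": "stdio",  # may be omitted for stdio servers
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/workspace"],
    },
}
```

With this entry, the server's tools would be exposed under the mcp__filesystem__ prefix.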
SSE / HTTP Server
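If the remote endpoint is unreachable, it does not bring down the entire query: that server’s status stays non-'connected', and other servers are unaffected. For remote services requiring OAuth, see OAuth Authentication.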
Tool Naming and Allowlists
The CLI uniformly prefixes MCP tools when exposing them to the model: a server named my_tools with a tool named greet gives the model the tool name mcp__my_tools__greet. Server names may contain hyphens and other special characters (my-tools → mcp__my-tools__<tool>).
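When building allowlists, it can help to generate the prefixed names instead of typing them by hand. A minimal sketch of the naming rule above; qualified_tool_name and allowlist_for are hypothetical helpers, not SDK functions.

```python
MCP_PREFIX = "mcp__"


def qualified_tool_name(server: str, tool: str) -> str:
    """Build the model-facing name for an MCP tool: mcp__<server>__<tool>."""
    return f"{MCP_PREFIX}{server}__{tool}"


def allowlist_for(server: str, tools: list) -> list:
    """Expand short tool names into fully-qualified names for one server."""
    return [qualified_tool_name(server, tool) for tool in tools]


print(allowlist_for("my_tools", ["greet"]))
```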
tools: Restrict Which Tools the Model Can See
Use tools when you want the model to only see a subset of tools. The CLI adds every built-in tool not in the list to the disallow set — effectively a “visibility allowlist”:
⚠️ Omitting tools means everything is exposed: all built-in tools plus every tool from connected MCP servers reach the model. For production, list them explicitly to tighten scope.
allowed_tools: Pre-approval (Not a Visibility Allowlist)
allowed_tools adds listed tools to the “always-allow” rule set — calls skip the permission prompt, but unlisted tools are not hidden. Use it to whitelist low-risk MCP tools for unattended use:
⚠️ Omitting allowed_tools just means no pre-approval rules: the model still sees and can call every tool; write operations simply route through the regular permission_mode approval flow. See Permissions docs for full semantics.
allowed_mcp_server_names: Process-Server Allowlist
Only filters process-based (stdio/sse/http) servers; does not affect in-process servers. Combined with strict_mcp_config=True, it can prevent the CLI from loading additional local configurations:
⚠️ Omitting allowed_mcp_server_names means all process servers connect; to tighten, list them explicitly. In-process servers are never filtered by this field.
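The filter semantics described above can be modeled in a few lines. This is a sketch of the documented behavior, not the CLI's implementation; servers_allowed_to_connect is a hypothetical helper and the example configs are placeholders.

```python
PROCESS_TYPES = {"stdio", "sse", "http"}


def servers_allowed_to_connect(mcp_servers: dict, allowed_names: list) -> list:
    """Model of the documented allowlist filter; not the CLI's implementation."""
    allowed = []
    for name, config in mcp_servers.items():
        server_type = config.get("type", "stdio")  # 'stdio' may be omitted
        if server_type not in PROCESS_TYPES:
            allowed.append(name)  # in-process ('sdk') servers are never filtered
        elif not allowed_names or name in allowed_names:
            allowed.append(name)  # an empty allowlist leaves all process servers open
    return allowed


servers = {
    "local": {"type": "sdk"},
    "fs": {"command": "npx"},
    "remote": {"type": "http", "url": "https://example.com/mcp"},
}
print(servers_allowed_to_connect(servers, ["fs"]))
```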
Runtime Management (QoderSDKClient)
query() is a single-shot iterator and cannot change servers or auth mid-stream. For runtime management of MCP, use QoderSDKClient, which exposes status queries, OAuth, server add/remove, reconnect / toggle, etc., as public methods.
⚠️ Caching Principle: MCP server config / auth state changes rebuild the tools list, which breaks the prompt prefix cache mid-session. The SDK provides methods for “querying status + completing auth before the first message”; the server set itself should be configured once at startup via options.mcp_servers, creating a new QoderSDKClient when necessary.
Querying Status
💡 The MCP handshake occurs after the CLI completes initialize but before the first user message. QoderSDKClient.connect() already waits until initialize returns; handshake IO may take a few hundred milliseconds, so poll get_mcp_status() until connected if needed.
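A polling loop for this could look like the sketch below. It assumes get_mcp_status() is awaitable and returns a dict shaped like {"mcpServers": [{"name": ..., "status": ...}]}; the real return type may be an object instead. StubClient is a hypothetical stand-in used only to exercise the helper.

```python
import asyncio

SETTLED = ("connected", "failed", "needs-auth", "disabled")


async def wait_until_settled(client, name: str, timeout_s: float = 5.0, interval_s: float = 0.1) -> str:
    """Poll client.get_mcp_status() until `name` leaves 'pending'/'connecting'."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while True:
        status = await client.get_mcp_status()
        # Assumed shape: {"mcpServers": [{"name": ..., "status": ...}, ...]}
        current = next((s["status"] for s in status["mcpServers"] if s["name"] == name), None)
        if current in SETTLED:
            return current
        if loop.time() >= deadline:
            raise TimeoutError(f"{name} still {current!r} after {timeout_s}s")
        await asyncio.sleep(interval_s)


class StubClient:
    """Hypothetical stand-in for QoderSDKClient; replays a fixed status sequence."""

    def __init__(self, statuses):
        self._statuses = list(statuses)

    async def get_mcp_status(self):
        status = self._statuses.pop(0) if len(self._statuses) > 1 else self._statuses[0]
        return {"mcpServers": [{"name": "fs", "status": status}]}
```

Returning on every settled state, not only 'connected', lets the caller decide whether to reconnect or start an auth flow instead of waiting for the timeout.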
Subscribing to Status Changes
Pick one of two ways:
- Method 1 (recommended): Attach an on_mcp_status_change callback on options; it is called every time status changes.
- Method 2: Filter the message stream for system/mcp_status_change. The callback and the message stream share the same payload; the callback simply removes the need for filtering.
Runtime Add/Remove Server / Reconnect / Toggle
| Method | Purpose |
|---|---|
| client.set_mcp_servers(servers) | Replace the current MCP configuration with the full desired mapping; returns {added, removed, errors} |
| client.reconnect_mcp_server(name) | Reconnect a specific server, typically to recover from a 'failed' state |
| client.toggle_mcp_server(name, enabled) | Enable / disable a server; disabling disconnects it and removes its tools |
⚠️ All three methods trigger a tools-list rebuild and therefore break the prompt prefix cache. In production, prefer to configure mcp_servers fully at startup; reserve these APIs for debugging and local development.
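Because set_mcp_servers takes the full desired mapping rather than a delta, it can be useful to preview which names a call would add or remove. A minimal sketch; plan_server_change is a hypothetical helper that only compares names and cannot predict the errors the CLI might report.

```python
def plan_server_change(current: dict, desired: dict) -> dict:
    """Preview which server names a full replacement would add or remove."""
    return {
        "added": sorted(set(desired) - set(current)),
        "removed": sorted(set(current) - set(desired)),
    }


current = {"fs": {"command": "npx"}, "search": {"type": "http"}}
desired = {"fs": {"command": "npx"}, "tickets": {"type": "sse"}}
print(plan_server_change(current, desired))
```

Any server you want to keep has to appear in the desired mapping; leaving it out is what removes it.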
Controlling Request Timeout
When a control request exceeds control_request_timeout_ms, the SDK sends control_cancel_request and rejects the current Future.
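The millisecond timeout and the "0 disables it" convention can be illustrated with asyncio.wait_for. This is a standard-library analogy for how a caller experiences the timeout, not the SDK's internal mechanism; with_control_timeout is a hypothetical helper.

```python
import asyncio


async def with_control_timeout(awaitable, timeout_ms: int):
    """Await `awaitable`, giving up after timeout_ms; 0 disables the timeout."""
    if timeout_ms == 0:
        return await awaitable
    # wait_for cancels the inner awaitable and raises asyncio.TimeoutError on expiry.
    return await asyncio.wait_for(awaitable, timeout=timeout_ms / 1000)
```

A caller waiting on a user-driven step such as an OAuth consent screen would pass a larger value, or 0, instead of relying on the 60-second default.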
OAuth Authentication
Remote MCP servers (HTTP/SSE) often require OAuth. The CLI has a complete built-in OAuth 2.0 + PKCE + Dynamic Client Registration (RFC 7591) implementation. The Python SDK exposes both inbound (CLI proactively asks the host to complete OAuth) and outbound (host actively triggers OAuth) paths; choose based on your host shape.
⚠️ Caching Principle: After OAuth completes, the CLI reconnects to the server and rediscovers tools, which inevitably breaks the prompt prefix cache mid-session. Complete authentication before sending the first user message so the tools list stabilizes before the conversation starts.
💡 This section only covers CLI-driven OAuth: The CLI performs metadata discovery, PKCE, token exchange, and token persistence itself. There is another server-driven auth path — where the server uses MCP elicitation/create to have the client redirect to a URL for authorization. The two paths are independent and won’t trigger simultaneously. See Elicitation: Server Requests User Input.
Inbound: on_mcp_oauth_required Callback
When the CLI detects during handshake that a server requires OAuth, it pushes an McpOAuthRequest to the SDK via control_request, and the SDK invokes the host’s on_mcp_oauth_required callback. The host returns one of the following resolutions:
| Return type | Meaning |
|---|---|
| OAuthToken or {"token": OAuthToken} | The host runs the entire OAuth flow itself and injects the token directly into the CLI |
| {"callbackUrl": "..."} | The host returns the full callback URL (including code / state); the CLI parses it and exchanges for a token |
| {"code": "...", "state": "..."} | The host extracts the code itself and returns it to the CLI |
| None | Reject; the CLI marks that server as failed |
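For hosts that capture the browser redirect themselves, the code / state resolution can be produced with the standard library. A minimal sketch; resolution_from_callback_url is a hypothetical helper, and how it is wired into on_mcp_oauth_required depends on the callback signature, which this section does not show.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def resolution_from_callback_url(callback_url: str) -> Optional[dict]:
    """Turn a redirect URL into the {"code", "state"} resolution, or None to reject."""
    query = parse_qs(urlparse(callback_url).query)
    code = query.get("code", [None])[0]
    state = query.get("state", [None])[0]
    if not code or not state:
        return None  # documented meaning: reject; the CLI marks the server as failed
    return {"code": code, "state": state}


print(resolution_from_callback_url("http://localhost:8765/callback?code=abc&state=xyz"))
```

A host that prefers not to parse the URL can return it whole in the callbackUrl form and let the CLI do the extraction.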
Outbound: Host Actively Drives Authentication
When the host’s own UI has a “Sign in” entry, it can actively invoke:
| Method | Purpose | When to Call |
|---|---|---|
| client.mcp_authenticate(name, redirect_uri=None) | Initiate OAuth; returns {authUrl?, requiresUserAction}; on silent renewal, requiresUserAction=False and no UI is needed | Before the first user message |
| client.mcp_submit_oauth_callback_url(name, callback_url) | Submit the complete callback URL (with code/state) | Before the first user message |
| client.inject_mcp_token(name, token) | The host runs the entire OAuth flow itself and injects the OAuthToken into the CLI | Before the first user message |
| client.mcp_clear_auth(name) | Delete the OAuth credentials stored by the CLI (equivalent to “sign out”) | Any time; the next tool call will trigger re-authentication |
redirect_uri is optional, overriding the default OAuth callback target (Electron custom protocol, enterprise intranet callback addresses, etc.).
The CLI stores tokens in the system Keychain by default (macOS / Linux Secret Service), falling back to ~/.qoder/mcp-oauth-tokens.json (0o600 permissions + cross-process locking).
Elicitation: Server Requests User Input
MCP elicitation/create is a server → client request used to have the client display an interaction to the user (form mode collects structured input; url mode asks the user to visit a URL to complete an action).
✅ The Python SDK is now aligned with the TS SDK: QoderAgentOptions.on_elicitation accepts an async callback that returns ElicitationResult, with the same signature as the TS version. When the callback is not set, the SDK still defaults to answering {"action": "cancel"}. The Elicitation / ElicitationResult hook events still fire in parallel as a read-only observation channel.
⚠️ The current CLI does not advertise the elicitation.url capability. A server’s elicit({mode: 'url'}) is rejected directly by the CLI (MCP error -32602: Client does not support URL-mode elicitation requests), so URL-mode elicit will not reach the SDK, and the system/elicitation_complete notification will not fire on the current CLI either. Once the CLI enables the URL capability, this path will automatically become operational.
Responding to elicit with on_elicitation
- Field names follow the TS SDK’s camelCase (serverName / elicitationId / requestedSchema / displayName); the CLI’s snake_case payload is converted automatically by the SDK.
- Returning None is equivalent to {"action": "cancel"}, making it convenient for the host to bail out from a fallback path.
- You can also return a mcp.types.ElicitResult Pydantic model (the SDK calls model_dump).
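The points above can be combined into a small callback sketch. It assumes the request arrives as a dict with the camelCase keys listed above (the real object may be typed differently) and uses the MCP action values accept and cancel; KNOWN_ANSWERS is a hypothetical table of host-side defaults.

```python
import asyncio
from typing import Optional

KNOWN_ANSWERS = {"region": "eu-west-1"}  # hypothetical host-side defaults


async def on_elicitation(request: dict) -> Optional[dict]:
    """Accept only when every required field has a known answer; otherwise cancel."""
    schema = request.get("requestedSchema") or {}
    required = schema.get("required", [])
    if not all(name in KNOWN_ANSWERS for name in required):
        return None  # documented as equivalent to {"action": "cancel"}
    return {"action": "accept", "content": {name: KNOWN_ANSWERS[name] for name in required}}
```

Returning None for anything the host cannot answer keeps unattended runs from blocking on input nobody will provide.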
Observing elicitation (hook channel)
With on_elicitation in place, the Elicitation / ElicitationResult hooks still fire in parallel; they are a read-only observation channel and do not make decisions.
| Hook Event | Timing | Payload TypedDict |
|---|---|---|
| Elicitation | When server request arrives | ElicitationHookInput: mcp_server_name / message / mode / elicitation_id? / requested_schema? / url? / title? |
| ElicitationResult | After SDK / host completes its response | ElicitationResultHookInput: mcp_server_name / action / mode / elicitation_id? / content? |
Boundary with the OAuth Path
- CLI-driven OAuth (mcp_authenticate / inject_mcp_token / on_mcp_oauth_required): Token stored in qodercli Keychain; driven when get_mcp_status() shows needs-auth; does NOT trigger the Elicitation hook.
- Server-driven elicit: Token stays internal to the server; get_mcp_status() does not show needs-auth; decisions are made via the on_elicitation callback (the SDK auto-cancels when not registered).
Options Reference
| Field | Type | Default | Description |
|---|---|---|---|
| mcp_servers | dict[str, McpServerConfig] \| str \| Path | {} | Server name → config; or a JSON config file path |
| allowed_mcp_server_names | list[str] | [] | Process-based server allowlist (does not affect in-process); empty list means all are open |
| strict_mcp_config | bool | False | Prevent the CLI from loading additional MCP from user config files |
| tools | list[str] \| ToolsPreset \| None | None | Model-visible tool allowlist; omitting means every built-in + MCP tool is visible |
| allowed_tools | list[str] | [] | Pre-approval list (skip permission prompts; does not control visibility); empty list means no pre-approval rules |
| disallowed_tools | list[str] | [] | Explicit deny list; takes precedence over allow |
| control_request_timeout_ms | int | 60_000 | Control request timeout (including mcp series), 0 to disable |
| on_mcp_oauth_required | OnMcpOAuthRequired \| None | None | Triggered when the CLI detects that a server requires OAuth |
| on_mcp_status_change | OnMcpStatusChange \| None | None | Triggered on every server status change; equivalent to filtering the system/mcp_status_change stream |
| on_elicitation | OnElicitation \| None | None | Host decision when a server requests user input via MCP elicitation/create; the SDK auto-cancels if not set |
| hooks['Elicitation'] | list[HookMatcher] | – | Read-only observation hook when a server requests user input (decisions go through on_elicitation) |
Methods on QoderSDKClient
| Method | Description | When to Call |
|---|---|---|
| get_mcp_status() | Get current status of all MCP servers | Any time |
| set_mcp_servers(servers) | Replace the entire MCP server configuration; returns {added, removed, errors} | Any time (breaks the prefix cache) |
| reconnect_mcp_server(name) | Reconnect a specific server | Any time |
| toggle_mcp_server(name, enabled) | Enable / disable a server | Any time |
| mcp_authenticate(name, redirect_uri=None) | Actively initiate OAuth | Before the first user message |
| mcp_submit_oauth_callback_url(name, callback_url) | Submit OAuth callback URL | Before the first user message |
| inject_mcp_token(name, token) | Inject a token after running the full OAuth flow on the host | Before the first user message |
| mcp_clear_auth(name) | Delete stored OAuth credentials | Any time |
Type Reference
McpServerStatus.status enum (McpServerConnectionStatus):
| Value | Meaning |
|---|---|
| 'pending' | Registered, connection not yet started |
| 'connecting' | Handshaking |
| 'connected' | Connected, tools are callable |
| 'failed' | Connection failed (check the error field) |
| 'needs-auth' | Requires OAuth, proceed with auth flow |
| 'disabled' | Disabled (determined by CLI internal config or toggle_mcp_server) |
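A host UI often needs to turn these values into a next step. The mapping below is one possible reading of the table, offered as a sketch; NEXT_STEP and next_step are hypothetical names, and the suggested actions are editorial, not SDK behavior.

```python
# Hypothetical mapping from documented status values to a suggested host action.
NEXT_STEP = {
    "pending": "wait",
    "connecting": "wait",
    "connected": "use tools",
    "failed": "check the error field, then reconnect_mcp_server(name)",
    "needs-auth": "run the OAuth flow",
    "disabled": "toggle_mcp_server(name, True) if the server should be on",
}


def next_step(status: str) -> str:
    """Return a suggested action, tolerating status values added in the future."""
    return NEXT_STEP.get(status, "unknown status")


print(next_step("needs-auth"))
```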
Best Practices
- Write descriptions for the AI: The @tool description determines when the AI selects it. Clearly state “what it does, when to use it, what it should NOT be used for.”
- Add Annotated to parameters: In simple dict / TypedDict, write Annotated[type, "..."] on fields; the AI uses this information to construct call arguments.
- Use is_error: True for failures, don’t throw exceptions: Let the AI see the result. For a complete comparison, see Tools Guide - How the SDK handles tool errors.
- Prefer read-only + readOnlyHint: Be cautious with write operations; pair with can_use_tool or hooks for secondary confirmation.
- Keep server names short: They appear in tool prefixes; overly long names waste tokens.
- Place in-process shared state in module scope: Handlers are closures, but each query still reuses the same server instance.
- Complete OAuth before the first user message: Use mcp_authenticate + mcp_submit_oauth_callback_url, the inbound on_mcp_oauth_required callback, or inject_mcp_token. Completing auth mid-session inevitably breaks the prompt prefix cache.
- Pull MCP status with get_mcp_status() or on_mcp_status_change: The push channel (status change message) is retained; pick one as needed.
- Set a reasonable control_request_timeout_ms: Remote server handshakes may take seconds; the default 60s is usually sufficient. Increase it when waiting for user OAuth actions, and set it explicitly in CI environments.
- Use strict_mcp_config for isolation: Prevent MCP servers declared in the user’s local ~/.qoder/settings.json / .mcp.json from interfering with your application.