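Chapter 12: Subagent - Divide and Conquer via Context Isolation

The core value of a subagent is not parallelism but context isolation: it gives a subtask a clean messages[], keeps exploratory intermediate output from polluting the main conversation, and returns only a distilled summary to the parent agent.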

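12.1 Why Subagents Are Needed

The context pollution problem

An agent's messages[] is append-only. Every file read, command run, or document lookup leaves its output in the context permanently. A simple question such as "Which test framework does this project use?" may take 5 file reads and 3 commands and produce several thousand tokens of intermediate output, yet the parent agent needs only one word: "pytest".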

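Without an isolation mechanism, these intermediate products occupy the context window permanently, with two consequences:

Attention dilution (context rot): useful information is buried under redundant history, and the quality of the model's later decisions degrades
Token waste: every API call resends this history, so the input cost of each call grows linearly with the accumulated history, and the cumulative cost of a session grows faster still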
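The token-waste argument can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name and every number in it (a 1,000-token starting context, 500 tokens per exploratory step, a 20-token summary, a 200-token subagent prompt) are hypothetical values chosen for the example, not measurements.

```python
def cumulative_input_tokens(base_tokens: int, tokens_added_per_turn: list[int]) -> int:
    """Total input tokens sent over a session in which every call resends the full history."""
    total = 0
    history = base_tokens
    for added in tokens_added_per_turn:
        total += history   # this call resends everything accumulated so far
        history += added   # the new output then stays in the context for good
    return total


# Hypothetical workload: 8 exploratory steps, then 10 ordinary turns.
explore = [500] * 8
normal = [100] * 10

# Everything happens inline in the parent conversation.
inline_total = cumulative_input_tokens(1000, explore + normal)

# The exploration runs in a subagent; the parent only absorbs a ~20-token summary.
parent_total = cumulative_input_tokens(1000, [20] + normal)
subagent_total = cumulative_input_tokens(200, explore)
isolated_total = parent_total + subagent_total
```

Under these assumed numbers the isolated variant sends well under half the input tokens of the inline variant, because the 4,000 exploratory tokens are no longer resent on every later parent turn.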

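The need for heterogeneous tool sets

Different subtasks need different system prompts and tool sets. A subtask that only searches code should not have file-write permission; a subtask that reviews code should not be able to apply modifications. A subagent naturally supports an independent tools list and system prompt for each subtask.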
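One way to express per-subtask configuration is a small profile table from which the tool list is filtered. This is a minimal sketch: SubagentProfile, PROFILES, ALL_TOOLS, tools_for, and the prompt strings are all hypothetical names invented for illustration, and the tool schemas are reduced to their names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubagentProfile:
    system: str                    # system prompt used for this kind of subtask
    allowed_tools: frozenset[str]  # names of the tools the subagent may call


# Hypothetical full tool list of the parent agent (schemas shortened to names).
ALL_TOOLS = [
    {"name": "read_file"}, {"name": "grep"}, {"name": "write_file"},
    {"name": "edit_file"}, {"name": "bash"}, {"name": "task"},
]

PROFILES = {
    "search": SubagentProfile("You only search code. Never modify files.",
                              frozenset({"read_file", "grep"})),
    "review": SubagentProfile("You review code and report findings only.",
                              frozenset({"read_file", "grep"})),
    "build": SubagentProfile("You implement the requested change.",
                             frozenset({"read_file", "grep", "write_file",
                                        "edit_file", "bash"})),
}


def tools_for(profile_name: str) -> list[dict]:
    """Return the subset of ALL_TOOLS that the given profile is allowed to use."""
    profile = PROFILES[profile_name]
    return [tool for tool in ALL_TOOLS if tool["name"] in profile.allowed_tools]
```

None of the profiles lists the task tool, so a subagent built from them cannot spawn further subagents.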

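Failure isolation

A subtask may hit an infinite loop or an exception on, say, its 15th tool call. If that happens in the main conversation, the whole session's context is polluted. With a subagent, the failure is contained in a separate messages[]: the failed context is discarded, the parent agent receives an error summary, and the main conversation stays intact.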
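The failure boundary can be sketched as a thin wrapper around whatever function runs the subagent. The names run_isolated and flaky_subagent are hypothetical, and the wrapper assumes the subagent runner is a plain callable that takes a prompt and returns text.

```python
from typing import Callable


def run_isolated(run: Callable[[str], str], prompt: str) -> dict:
    """Run a subagent and turn any failure into a short summary for the parent."""
    try:
        return {"ok": True, "summary": run(prompt)}
    except Exception as exc:
        # The subagent's own message history is simply dropped; only this line survives.
        return {"ok": False, "summary": f"Subagent failed: {type(exc).__name__}: {exc}"}


def flaky_subagent(prompt: str) -> str:
    """Stand-in for a subagent that breaks partway through its work."""
    raise RuntimeError("stuck in a loop at tool call 15")


outcome = run_isolated(flaky_subagent, "Find every caller of login()")
```

The parent sees one short error string and can decide whether to retry, rephrase the task, or do the work itself.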

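12.2 Implementation

Core mechanism: independent messages[] + tool filtering + summary return

Implementing a subagent is surprisingly simple. Three things matter:

Independent message history: the subagent starts from messages = [{"role": "user", "content": prompt}] and inherits none of the parent agent's history
Optionally different tool set: the subagent's tools can be a subset of the parent's (usually with the task tool itself removed to prevent recursive spawning)
Return only the final text: the subagent may make 30 tool calls, but the parent agent receives only the final text summary as the tool_result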

python

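import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
# MODEL, SUBAGENT_SYSTEM, CHILD_TOOLS and TOOL_HANDLERS are assumed to be defined elsewhere.


def run_subagent(prompt: str) -> str:
    sub_messages = [{"role": "user", "content": prompt}]  # fresh context, no parent history
    for _ in range(30):  # safety cap on tool-use rounds
        response = client.messages.create(
            model=MODEL,
            system=SUBAGENT_SYSTEM,
            messages=sub_messages,
            tools=CHILD_TOOLS,  # does not include the task tool
            max_tokens=8000,
        )
        sub_messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            break
        results = []
        for block in response.content:
            if block.type == "tool_use":
                handler = TOOL_HANDLERS.get(block.name)
                try:
                    output = handler(**block.input) if handler else f"Unknown tool: {block.name}"
                except Exception as exc:  # report tool failures to the model instead of crashing
                    output = f"Error: {exc}"
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(output)[:50000],  # truncate oversized output
                })
        sub_messages.append({"role": "user", "content": results})
    # Return only the final text; the sub-context is discarded as a whole
    return "".join(
        b.text for b in response.content if hasattr(b, "text")
    ) or "(no summary)"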

On the parent side, the task tool is defined as an ordinary tool schema:

python

PARENT_TOOLS = CHILD_TOOLS + [
    {
        "name": "task",
        "description": "Spawn a subagent with fresh context. "
                       "It shares the filesystem but not conversation history.",
        "input_schema": {
            "type": "object",
            "properties": {
                "prompt": {"type": "string"},
                "description": {"type": "string"}
            },
            "required": ["prompt"]
        },
    },
]
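To connect the schema to the subagent loop, the parent's tool dispatcher needs one extra branch for task. The following is a sketch under assumptions: execute_tool_call is a hypothetical helper, and the subagent runner is passed in as a parameter so the example stays self-contained and does not call any API.

```python
from typing import Callable


def execute_tool_call(
    name: str,
    tool_input: dict,
    tool_use_id: str,
    handlers: dict[str, Callable[..., object]],
    run_subagent: Callable[[str], str],
) -> dict:
    """Execute one tool_use block from the parent and build the matching tool_result."""
    if name == "task":
        # Only the prompt crosses the boundary; the optional description is UI metadata.
        output = run_subagent(tool_input["prompt"])
    elif name in handlers:
        output = handlers[name](**tool_input)
    else:
        output = f"Unknown tool: {name}"
    return {"type": "tool_result", "tool_use_id": tool_use_id, "content": str(output)}


# Usage with a fake subagent, so nothing here touches the network.
result = execute_tool_call(
    "task",
    {"prompt": "Which test framework does this project use?", "description": "detect tests"},
    "toolu_01",
    handlers={},
    run_subagent=lambda prompt: "The project uses pytest.",
)
```

From the parent's point of view a subagent run is indistinguishable from any other tool call: one tool_use in, one tool_result out.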

The data flow looks like this:

mermaid

sequenceDiagram
    participant U as User
    participant P as Parent Agent
    participant S as Subagent

    U->>P: "Which test framework does this project use?"
    P->>P: Decides to call the task tool
    P->>S: dispatch(prompt="Analyze the project's test configuration...")

    rect rgb(240, 240, 255)
        Note over S: Independent messages[]
        S->>S: read_file(package.json)
        S->>S: read_file(pytest.ini)
        S->>S: bash(pip list | grep test)
        S->>S: Generate final summary
    end

    S-->>P: "The project uses pytest, configured in pytest.ini"
    Note over P: The subagent's 30+ messages of history<br/>are discarded entirely
    P-->>U: "This project uses pytest."

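Two modes of context isolation

Borrowing from the philosophy of concurrent programming, a subagent's context can be initialized in two ways:

Mode A: isolation by communication (share memory by communicating)

The subagent starts with a blank context, receives the task description through the prompt, and explores the environment on its own.

Pros: parent and child contexts are fully isolated, and the KV-cache hit rate is high (the subagent's prefix is stable)
Cons: if the task depends on a lot of context the parent agent already holds, the prompt can hardly convey all of it, and the subagent has to re-explore

Mode B: fork-style inheritance (communicate by sharing memory)

The subagent inherits a copy of the parent agent's full context (or part of its state).

Pros: the subagent has the full background and does not need to re-explore
Cons: token consumption is very high, and because the history differs on every fork, the KV-cache hit rate is very low

In production, Codex supports Mode B through the fork_context: true parameter: spawn_agent accepts a fork_parent_spawn_call_id and injects the parent thread's conversation history into the child thread's initial context. Mode A remains the recommended default; enable Fork only for deep-reasoning scenarios, such as a multi-file refactor that needs full understanding of the code.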
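The difference between the two modes comes down to how the subagent's first messages[] is built. The sketch below is an assumption-laden illustration, not Codex's implementation: initial_messages is a hypothetical helper, its fork_context flag merely borrows the parameter name mentioned above, and the fork branch assumes the parent history ends on a complete assistant turn.

```python
import copy


def initial_messages(prompt: str, parent_messages: list[dict],
                     fork_context: bool = False) -> list[dict]:
    """Build the starting message list for a subagent."""
    if fork_context:
        # Mode B: start from a private copy of the parent's history, then add the task.
        messages = copy.deepcopy(parent_messages)
        messages.append({"role": "user", "content": prompt})
        return messages
    # Mode A: a blank context that contains nothing but the task description.
    return [{"role": "user", "content": prompt}]


parent = [
    {"role": "user", "content": "Refactor the payment module."},
    {"role": "assistant", "content": "I found three call sites that need to change."},
]
mode_a = initial_messages("List the public functions in billing.py", parent)
mode_b = initial_messages("List the public functions in billing.py", parent, fork_context=True)
```

The deep copy matters in Mode B: the child may append freely to its own list without the parent's history changing underneath it.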

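Cost/benefit trade-offs

Dimension	Mode A (fresh, isolated)	Mode B (forked context)
Subagent's initial token count	Only system prompt + task prompt	The parent agent's entire history
KV-cache hit rate	High (stable prefix)	Low (different on every fork)
Information completeness	May lose findings the parent already has	Fully inherited
Suitable scenarios	Independent exploration tasks	Context-dependent deep reasoning
Recommendation	Default choice	Use only when necessary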

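12.3 Subagent Usage Patterns

Depending on the subtask's goal and permission needs, subagents fall into three typical patterns:

Research (Explore)

Read-only exploration that returns a research report. The subagent has only read-oriented tools (read_file, grep, glob, and bash restricted to read-only commands) and cannot modify any file.

Typical scenarios:

"Find all files that use the auth module"
"Analyze the project's dependency structure"
"Locate where the API endpoints are defined"

OpenCode's explore agent is a production implementation of this pattern. Its system prompt states its identity explicitly ("You are a file search specialist") and restricts its behavior: "Do not create any files, or run bash commands that modify the user's system state in any way."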
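A prompt-level restriction like the one quoted above can be backed by a check in the bash tool handler. The following is a deliberately conservative sketch, not how OpenCode enforces anything: READ_ONLY_COMMANDS, SHELL_METACHARS, and is_read_only are hypothetical names, and the allowlist is an illustrative assumption rather than a vetted security boundary.

```python
import shlex

# Hypothetical allowlist of commands that only read from the filesystem.
READ_ONLY_COMMANDS = {"ls", "cat", "head", "tail", "grep", "wc"}
# Characters that enable chaining, redirection, or substitution in a shell.
SHELL_METACHARS = set(";&|><`$\n")


def is_read_only(command: str) -> bool:
    """Accept a command only if it is a single allowlisted program with no shell tricks."""
    if any(ch in SHELL_METACHARS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar parse failures
        return False
    return bool(argv) and argv[0] in READ_ONLY_COMMANDS
```

Because this sketch rejects pipes outright, a command such as pip list | grep test would have to be split into separate calls or handled by a richer parser; a real deployment would more likely rely on an OS-level sandbox than on string inspection.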

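Execution (Code/Build)

Has full tool permissions, completes the subtask independently, and returns artifact paths plus a change summary.

Typical scenarios:

"Implement the user registration form"
"Fix the null pointer exception on line 47"
"Migrate this module from JavaScript to TypeScript"

An execution subagent's tools usually include write tools such as write_file, edit_file, and bash. The key design point: the subagent returns artifacts through the filesystem (not through messages), and the parent agent receives only a summary such as "Done, modified 3 files".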
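The artifact-through-filesystem idea can be shown with a stand-in for the subagent. In this sketch build_subagent is a hypothetical function that fakes the work by writing two small files to a temporary directory; the file names and contents are invented for the example.

```python
import tempfile
from pathlib import Path


def build_subagent(workdir: Path) -> str:
    """Stand-in for an execution subagent: write artifacts, return only a summary."""
    artifacts = {
        "form.py": "class RegistrationForm:\n    pass\n",
        "test_form.py": "def test_form():\n    assert True\n",
    }
    changed = []
    for name, body in artifacts.items():
        (workdir / name).write_text(body)  # the artifact lives on disk, not in messages
        changed.append(name)
    return f"Done. Modified {len(changed)} files: {', '.join(sorted(changed))}"


with tempfile.TemporaryDirectory() as tmp:
    summary = build_subagent(Path(tmp))
    artifact_exists = (Path(tmp) / "form.py").exists()
```

The parent's context grows by one short sentence regardless of how large the written files are; if it needs the contents, it reads them from disk on demand.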

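Verification (Code-Reviewer)

Checks the output of the main agent or of other subagents and returns review comments. It usually has read permissions only.

Typical scenarios:

Triggering a code review automatically once code has been written
Checking whether a generated configuration file follows the conventions
Verifying that the tests cover all boundary conditions

DeepAgents ships a built-in subagent of type code-reviewer, whose description is "Reviews code for quality and security issues". OpenCode also supports user-defined verification agents.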

Tool permissions of the three patterns compared

mermaid

graph LR
    subgraph "Research (Explore)"
        E_R[read_file]
        E_G[grep / glob]
        E_B[bash<br>read-only commands]
    end

    subgraph "Execution (Code/Build)"
        C_R[read_file]
        C_W[write_file]
        C_E[edit_file]
        C_B[bash<br>all commands]
    end

    subgraph "Verification (Reviewer)"
        V_R[read_file]
        V_G[grep / glob]
        V_B[bash<br>test commands]
    end

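12.4 Production-Grade Implementations

Codex: a multi-agent collaboration system

Codex's subagent implementation is the most complete production-grade design among the systems covered in this chapter. Its core is not a simple task tool but a full multi-agent collaboration framework with five collaboration tools:

Tool	Function
`spawn_agent`	Creates a new subagent, specifying agent_type, model, and reasoning_effort
`send_input`	Sends a message to an existing subagent
`wait`	Waits for one or more subagents to finish, with timeout support
`close_agent`	Closes a subagent and releases its resources
`resume_agent`	Resumes an interrupted subagent

Role system: Codex defines subagent types through role configuration files rather than hard-coding them. Each role is a TOML configuration that can override model, reasoning_effort, approval_policy, sandbox settings, and so on. At spawn time, the apply_role_to_config() function overlays the role configuration on the base configuration.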

text

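spawn_agent call chain:
1. Resolve agent_type (the role name)
2. Build the base config = a snapshot of the parent agent's runtime configuration
3. apply_role_to_config(): overlay role-specific configuration
4. apply_spawn_agent_runtime_overrides(): sync approval policy, sandbox, cwd
5. apply_spawn_agent_overrides(): check the depth limit; disable spawn once the limit is reached
6. agent_control.spawn_agent_with_metadata(): create an independent thread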

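Depth limit: the agent_max_depth setting controls the maximum nesting depth of subagents. When the limit is reached, the subagent's spawn_agent tool is disabled (disable(Feature::Collab)), which forces the model to finish the task itself rather than delegate further.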
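The role overlay and the depth limit can be pictured with plain dictionaries. This is a conceptual sketch only and is not Codex's code: overlay_role and child_config are hypothetical helpers, the depth and can_spawn keys are invented for the example, and the exact comparison used to decide when spawning is disabled is an assumption.

```python
def overlay_role(base: dict, role: dict) -> dict:
    """Return a copy of the base config with the role's explicitly set keys applied on top."""
    merged = dict(base)
    merged.update({key: value for key, value in role.items() if value is not None})
    return merged


def child_config(parent: dict, role: dict, max_depth: int) -> dict:
    """Derive a child's config: snapshot the parent, overlay the role, enforce the depth limit."""
    config = overlay_role(parent, role)
    config["depth"] = parent["depth"] + 1
    # Assumed rule: a child sitting at the maximum depth may not spawn further agents.
    config["can_spawn"] = config["depth"] < max_depth
    return config


root = {"model": "base-model", "reasoning_effort": "medium", "sandbox": "workspace-write",
        "depth": 0, "can_spawn": True}
reviewer_role = {"reasoning_effort": "high", "sandbox": "read-only", "model": None}

level1 = child_config(root, reviewer_role, max_depth=2)
level2 = child_config(level1, {}, max_depth=2)
```

Keys the role leaves unset fall through to the parent's snapshot, which is what lets a role file stay small.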

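Fork support: spawn_agent accepts a fork_context: true parameter that injects the parent thread's conversation history into the child thread, implementing Mode B context inheritance.

Concurrent waiting: the wait tool can wait on several subagents at once, and subagents running in parallel each execute in their own thread. Timeout control: minimum 10 seconds, default 30 seconds, maximum 1 hour.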
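Waiting on several subagents with a bounded timeout maps naturally onto asyncio.wait. The sketch below is an analogy in Python rather than Codex's implementation: clamp_timeout and wait_for_agents are hypothetical helpers, and only the three timeout bounds are taken from the text above.

```python
import asyncio
from typing import Optional

MIN_TIMEOUT_S, DEFAULT_TIMEOUT_S, MAX_TIMEOUT_S = 10, 30, 3600


def clamp_timeout(requested: Optional[float] = None) -> float:
    """Apply the default and keep a requested timeout inside the allowed range."""
    if requested is None:
        return DEFAULT_TIMEOUT_S
    return max(MIN_TIMEOUT_S, min(MAX_TIMEOUT_S, requested))


async def wait_for_agents(tasks: dict[str, asyncio.Task], timeout_s: float) -> dict[str, str]:
    """Wait for all agents up to timeout_s and report which ones have finished."""
    done, _pending = await asyncio.wait(tasks.values(), timeout=timeout_s)
    return {agent_id: (task.result() if task in done else "still running")
            for agent_id, task in tasks.items()}


async def _demo() -> dict[str, str]:
    async def agent(name: str, delay: float) -> str:
        await asyncio.sleep(delay)  # stands in for the subagent's real work
        return f"{name} finished"

    tasks = {"a": asyncio.create_task(agent("a", 0.01)),
             "b": asyncio.create_task(agent("b", 5))}
    # The demo passes a short timeout directly so it finishes quickly.
    status = await wait_for_agents(tasks, timeout_s=0.2)
    tasks["b"].cancel()
    return status


demo_status = asyncio.run(_demo())
```

A timeout does not cancel the slow agent; it only returns control to the caller, which can wait again later or close the agent explicitly.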

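OpenCode: a four-agent system of build + plan + explore + general

OpenCode's agent system is defined through the Agent.Info schema. Each agent has three key attributes:

mode: "primary" | "subagent" | "all", which determines whether the agent is a top-level agent or a subagent
permission: fine-grained permission control based on Permission.Ruleset
prompt: an optional dedicated system prompt

Permission design of the four core agents: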

text

build (primary)
├── all tools: allow
├── question: allow
├── plan_enter: allow     ← can switch to plan mode
└── plan_exit: deny

plan (primary)
├── all tools: allow
├── edit: deny            ← cannot edit code files
├── plan_exit: allow      ← can switch back to build mode
└── special case: may only edit .opencode/plans/*.md

explore (subagent)
├── grep/glob/list/bash/read: allow
├── all other tools: deny     ← read-only
└── websearch/webfetch: allow

general (subagent)
├── all tools: allow
├── todoread/todowrite: deny  ← does not manage todos
└── task: depends on configuration
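Permission trees like these can be evaluated as an ordered rule list. The sketch below is illustrative and does not reproduce OpenCode's Permission.Ruleset: the rule tuples, the decide function, and above all the "last matching rule wins" evaluation order are assumptions made for this example.

```python
from fnmatch import fnmatch

# Hypothetical rule lists mirroring two of the trees above: (tool pattern, action).
EXPLORE_RULES = [
    ("*", "deny"),
    ("grep", "allow"), ("glob", "allow"), ("list", "allow"),
    ("bash", "allow"), ("read", "allow"),
    ("websearch", "allow"), ("webfetch", "allow"),
]
PLAN_RULES = [("*", "allow"), ("edit", "deny"), ("plan_exit", "allow")]


def decide(rules: list[tuple[str, str]], tool: str) -> str:
    """Return the action of the last rule whose pattern matches; deny if none matches."""
    decision = "deny"
    for pattern, action in rules:
        if fnmatch(tool, pattern):
            decision = action
    return decision
```

Putting the wildcard first and the exceptions after it keeps each agent's policy readable as "default, then overrides", which is the shape the trees above already have.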

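plan → build switch: OpenCode implements a plan_exit tool; once the plan agent has finished the plan, it can prompt the user to switch to the build agent and start implementing. This is not a subagent in the traditional sense but an agent mode switch within the same session: messages[] is shared, while the system prompt and tool permissions change.

Session resumption: OpenCode's task tool supports a task_id parameter. If an existing task_id is supplied, the subagent resumes the earlier session and continues working instead of creating a fresh context. The return value includes the task_id: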

text

task_id: <session_id> (for resuming to continue this task if needed)

<task_result>
actual result text
</task_result>

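Permission approval: when the task tool is called, OpenCode requests user authorization through ctx.ask() (unless the user triggered the agent directly with @agent_name, in which case bypassAgentCheck is true).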
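The resume and approval behaviors described above can be sketched together in a few lines. Everything here is a hypothetical stand-in written in Python rather than OpenCode's own code: SESSIONS is an in-memory store, run_task fakes the subagent's work, and the ask and bypass_agent_check parameters only mimic the roles of ctx.ask() and bypassAgentCheck.

```python
import uuid
from typing import Callable, Optional

# Hypothetical in-memory session store: task_id -> prompts handled so far.
SESSIONS: dict[str, list[str]] = {}


def run_task(prompt: str, task_id: Optional[str] = None,
             ask: Callable[[], bool] = lambda: True,
             bypass_agent_check: bool = False) -> str:
    """Run or resume a subagent session, gated by user approval."""
    if not bypass_agent_check and not ask():
        return "Permission denied by user."
    if task_id is None or task_id not in SESSIONS:
        task_id = uuid.uuid4().hex  # unknown or missing id: start a fresh session
        SESSIONS[task_id] = []
    history = SESSIONS[task_id]
    history.append(prompt)          # a resumed session keeps its earlier context
    result = f"handled {len(history)} prompt(s) in this session"
    return (f"task_id: {task_id} (for resuming to continue this task if needed)\n\n"
            f"<task_result>\n{result}\n</task_result>")


first = run_task("Map the auth module")
resumed = run_task("Now list its callers", task_id=first.split()[1])
```

Returning the task_id inside the tool result is what makes resumption possible: the parent model can read it from its own context and pass it back on a later call.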

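DeepAgents: synchronous + asynchronous dual mode

DeepAgents supports two subagent modes:

Synchronous subagents (the task tool): the same pattern as described earlier, namely spawn, run, return, reconcile. Two types are built in: general-purpose and code-reviewer.

Asynchronous subagents (remote LangGraph servers): background tasks are managed through five tools: start_async_task, check_async_task, update_async_task, cancel_async_task, and list_async_tasks. Asynchronous subagents run on a remote LangGraph server; the main agent gets a task_id immediately and returns control to the user, and results are fetched later by checking the task on demand.

Key design constraints of the asynchronous mode (written into the system prompt):

After launching, return control to the user immediately; do not check automatically
Do not poll check_async_task; check at most once per user request
Any task status in the conversation history is always stale; call the tool to get the current status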

Choosing an agent type

Given a concrete task, how do you pick the right subagent type? The decision flow is as follows:

mermaid

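flowchart TD
    A[Subtask received] --> B{Needs to modify files?}
    B -->|No| C{Needs deep search?}
    B -->|Yes| D{Needs code review?}

    C -->|Yes| E[Explore type<br>read-only tools]
    C -->|No| F[Run directly in the parent agent<br>not worth a spawn]

    D -->|Yes| G[Run with the Code type first<br>then review with the Reviewer type]
    D -->|No| H[Code/Build type<br>all tools]

    E --> I{Task complexity?}
    I -->|Simple search| J[Parent agent finishes it directly<br>with grep/glob]
    I -->|Multi-step exploration| K[spawn explore<br>subagent]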

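OpenCode's task tool description explicitly lists scenarios where a subagent should not be used:

To read a file at a known path → use the Read tool directly
To search for a specific class definition → use the Glob tool directly
To search for code within 2-3 files → use the Read tool directly

Core principle: the cost of spawning a subagent (at least one full LLM call plus its system prompt) must be lower than the cost of doing the work directly in the parent agent (context bloat). Simple single-step operations are not worth a spawn.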
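The decision flow above is small enough to write down as a function. This is a sketch of the flowchart only: choose_strategy and its boolean parameters are hypothetical, and a real agent would make these judgments from the task text rather than from flags.

```python
def choose_strategy(needs_write: bool, needs_review: bool = False,
                    needs_deep_search: bool = False, multi_step: bool = False) -> list[str]:
    """Return the subagent types to spawn, in order; an empty list means 'do it in the parent'."""
    if needs_write:
        # Writing work goes to a Code/Build subagent, optionally followed by a reviewer.
        return ["code", "reviewer"] if needs_review else ["code"]
    if needs_deep_search and multi_step:
        return ["explore"]
    # A simple lookup or a single grep/glob is cheaper inside the parent agent.
    return []
```

The empty-list branch is the important one: it encodes the principle that a spawn has to pay for itself.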

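Parallel subagents

Production systems generally encourage launching several subagents in parallel. Concretely, a single assistant message contains multiple tool_use blocks, each calling the task tool once. Because the API returns all of these blocks in one response, the host runtime can execute the tool calls concurrently.

The first usage note in OpenCode's task tool description reads: "Launch multiple agents concurrently whenever possible, to maximize performance; to do that, use a single message with multiple tool uses."

The DeepAgents system prompt makes the same point: "Whenever you have independent steps to complete - make tool_calls, or kick off tasks (subagents) in parallel to accomplish them faster."

Parallel subagents require that the tasks have no data dependencies on one another. If task B needs the output of task A, the two must run serially: wait for A to return, then spawn B.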
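On the host side, running the task calls from one assistant message concurrently can be sketched with a thread pool. The helper run_tasks_in_parallel is hypothetical, the tool_use blocks are represented as plain dictionaries, and the subagent runner is injected so the example does not depend on any API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_tasks_in_parallel(tool_uses: list[dict], run_subagent: Callable[[str], str],
                          max_workers: int = 4) -> list[dict]:
    """Run every task call from one assistant message concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so each summary lines up with its call.
        summaries = list(pool.map(lambda call: run_subagent(call["input"]["prompt"]), tool_uses))
    return [
        {"type": "tool_result", "tool_use_id": call["id"], "content": summary}
        for call, summary in zip(tool_uses, summaries)
    ]


calls = [
    {"id": "toolu_a", "name": "task", "input": {"prompt": "Survey the frontend tests"}},
    {"id": "toolu_b", "name": "task", "input": {"prompt": "Survey the backend tests"}},
]
results = run_tasks_in_parallel(calls, run_subagent=lambda prompt: f"summary of: {prompt}")
```

All results go back to the model in a single user message, one tool_result per tool_use_id, so the parent sees the parallel runs as one ordinary turn.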
