minion-orchestrator

Minion Orchestrator

Contract

Minions is a Postgres-native job queue for durable, observable background work. This single skill handles two lanes:
  • Deterministic shell jobs (`gbrain jobs submit shell ...`)
  • LLM subagent jobs (`gbrain agent run ...`)
When to route to Minions: durable, observable work that must survive restarts, fan out across many parallel tasks, or persist across sessions. Routing policy is defined in `skills/conventions/subagent-routing.md`. The project default is `pain_triggered` (native subagents first, Minions after specific pain signals fire); Mode A (all-through-Minions) is opt-in.
Guarantees:
  • Jobs survive gateway restart (Postgres-backed)
  • Every job has structured progress, token accounting, and session transcripts
  • Running agents can be steered mid-flight via inbox messages
  • Jobs can be paused, resumed, or cancelled at any time
  • Parent-child DAGs with configurable failure policies

Route the Request: Shell Job vs Subagent

| Condition | Action |
|---|---|
| User asks for deterministic command/script run | Shell job (CLI: `gbrain jobs submit shell ...`) |
| User asks to "run in minions" + explicit command/argv | Shell job (CLI, `--params` with `cmd` or `argv`) |
| User asks for research/reasoning/iterative agent | Subagent job (CLI: `gbrain agent run`) |
| User asks to steer/pause/resume an agent | Subagent job lifecycle tools (MCP-callable) |
| Single simple operation under ~30s | Consider inline execution first |
| Needs restart durability/observability | Submit as Minion job |
| Parallel work (2+ streams) | `gbrain agent run --fanout-manifest` or parent + child subagents |

If intent is ambiguous, ask one clarification: "Do you want a deterministic shell command job, or an LLM agent job?"
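
The routing rules above can be read as a small decision function. The following is a minimal sketch, not part of gbrain: the `Request` fields, the returned labels, and the order in which the rules are checked are all assumptions made for illustration, since the table itself does not state a precedence.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical summary of a user request (field names are illustrative)."""
    has_explicit_command: bool = False     # deterministic command/script or argv given
    needs_reasoning: bool = False          # research / reasoning / iterative agent work
    wants_lifecycle_control: bool = False  # steer / pause / resume an existing agent
    parallel_streams: int = 1              # number of independent work streams
    est_seconds: float = 0.0               # rough runtime estimate, 0 if unknown
    needs_durability: bool = False         # must survive restarts / be observable


def route(req: Request) -> str:
    """Return a routing label; the check order here is an assumption."""
    if req.wants_lifecycle_control:
        return "subagent-lifecycle-tools"
    if req.has_explicit_command and req.needs_reasoning:
        # Both lanes match, so ask the single clarification question.
        return "ask-clarification"
    if req.parallel_streams >= 2:
        return "fanout"
    if 0 < req.est_seconds < 30 and not req.needs_durability:
        return "inline"
    if req.has_explicit_command:
        return "shell-job"
    if req.needs_reasoning:
        return "subagent-job"
    # Neither lane is clearly indicated.
    return "ask-clarification"
```

A request that matches neither lane, or both, falls through to the clarification question rather than guessing.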

Shell Jobs (Deterministic Scripts)

Use for reproducible command execution, ETL steps, cron work, and scriptable tasks where no LLM reasoning loop is needed.

Preconditions (read before submitting your first shell job)

  • `GBRAIN_ALLOW_SHELL_JOBS=1` must be set on the worker environment. Without it, the shell handler refuses to register and submissions sit in `waiting` silently. Gate lives in `src/core/minions/handlers/shell.ts`.
  • Security: flipping `GBRAIN_ALLOW_SHELL_JOBS=1` authorizes arbitrary command execution on the worker. On a shared queue, this is a remote code execution surface. Treat as privileged infrastructure authorization.
  • Execution mode (pick one):
    • Postgres + daemon: `gbrain jobs work` runs a persistent worker that claims and executes jobs from the queue.
    • PGLite + --follow: `gbrain jobs submit ... --follow` runs inline. The daemon mode is not available on PGLite (exclusive file lock). See `docs/guides/minions-shell-jobs.md`.
  • MCP boundary: shell-job submission is CLI-only. `submit_job name="shell"` over MCP throws an `OperationError` with code `permission_denied` ("'shell' jobs cannot be submitted over MCP") because `shell` is in `PROTECTED_JOB_NAMES`. Agents CAN observe shell jobs via `get_job` / `list_jobs` / `get_job_progress` (not protected), but cannot submit them. Operator or autopilot submits; agent observes.
  • Verify setup: after configuration, run `gbrain jobs stats` (CLI) to confirm the worker is registered and consuming the queue.

Submit (CLI, operator or autopilot)

Shell jobs take their command via `--params` as a JSON object with `cmd` (string) or `argv` (array), plus `cwd` and optional `env`.
Command string form:
gbrain jobs submit shell --params '{"cmd":"echo hello","cwd":"/abs/path"}'
Argv form (no shell expansion):
gbrain jobs submit shell --params '{"argv":["bash","-lc","echo hello"],"cwd":"/abs/path"}'
Inline execution on PGLite or any one-shot deployment:
gbrain jobs submit shell --params '{"cmd":"echo hello","cwd":"/tmp"}' --follow
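
Hand-quoting JSON inside a shell string is easy to get wrong, so a wrapper script may prefer to build the submission as an argument list. The sketch below is an illustration only: `build_submit_argv` is a hypothetical helper, and its checks (exactly one of `cmd` or `argv`, `cwd` required) are assumptions read off the examples above, not confirmed validation rules of the CLI.

```python
import json


def build_submit_argv(params: dict, follow: bool = False) -> list[str]:
    """Build an argv list for `gbrain jobs submit shell --params <json>`."""
    has_cmd = "cmd" in params
    has_argv = "argv" in params
    if has_cmd == has_argv:
        # Assumption: the two command forms are mutually exclusive.
        raise ValueError("provide exactly one of 'cmd' or 'argv'")
    if has_cmd and not isinstance(params["cmd"], str):
        raise TypeError("'cmd' must be a string")
    if has_argv and not (
        isinstance(params["argv"], list)
        and all(isinstance(a, str) for a in params["argv"])
    ):
        raise TypeError("'argv' must be a list of strings")
    if "cwd" not in params:
        raise ValueError("'cwd' is required")

    # Compact JSON, passed as a single argument so no shell quoting is needed.
    payload = json.dumps(params, separators=(",", ":"))
    argv = ["gbrain", "jobs", "submit", "shell", "--params", payload]
    if follow:
        argv.append("--follow")
    return argv
```

The returned list can be handed to `subprocess.run(argv, check=True)`, which bypasses the shell entirely and so sidesteps quoting problems in `cmd` strings that contain quotes.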
Queue/lifecycle flags exposed by `gbrain jobs submit --help`: `--queue`, `--priority`, `--delay`, `--max-attempts`, `--max-stalled`, `--backoff-type`, `--backoff-delay`, `--backoff-jitter`, `--timeout-ms`, `--idempotency-key`, `--dry-run`.

Monitor (agents or operator)

These operations are MCP-callable and safe for agent use:
list_jobs --name shell --status active
get_job ID
get_job_progress ID
Check structured result fields (exit code, stdout/stderr tails, attempts, timings) from `get_job`. Use `gbrain jobs stats` (CLI) for the worker/queue health dashboard.

Control (MCP-callable)

cancel_job id=ID
replay_job id=ID
`replay_job` is not protected; only shell submission is. Agents can cancel or replay a shell job without CLI access.
Use idempotency keys for recurring shell workloads to avoid duplicate runs.
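
One way to get a stable key for a recurring job is to hash the job parameters together with the schedule window. This is a sketch under assumptions: `idempotency_key` is a hypothetical helper, the hourly window and the `shell-` prefix are arbitrary choices, and the text above does not describe how the queue treats a repeated key, so the expectation that a repeated `--idempotency-key` suppresses the duplicate is also an assumption.

```python
import hashlib
import json
from datetime import datetime, timezone


def idempotency_key(params: dict, run_at: datetime, window: str = "%Y-%m-%dT%H") -> str:
    """Derive a stable key from job params plus the schedule window.

    `run_at` should be timezone-aware; it is normalized to UTC before bucketing.
    """
    # Sort keys so that dict ordering does not change the key.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    bucket = run_at.astimezone(timezone.utc).strftime(window)
    digest = hashlib.sha256(f"{bucket}|{canonical}".encode()).hexdigest()
    return f"shell-{bucket}-{digest[:16]}"
```

Two submissions of the same parameters within the same hour yield the same key, while the next hour's run gets a new one. The result would be passed on the command line as `--idempotency-key <key>`.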

Subagent Jobs (LLM Orchestration)

Use for open-ended reasoning, tool-using research, and fan-out synthesis.
User-facing entrypoint: `gbrain agent run <prompt>` is the canonical way to submit subagent work. It handles the elevated-trust plumbing: `subagent` and `subagent_aggregator` are both in `PROTECTED_JOB_NAMES`, so direct MCP submission requires `{allowProtectedSubmit: true}`, which `gbrain agent run` supplies.

Phase 1: Submit

gbrain agent run "Research Acme Corp revenue" --tools "search,query"
`--tools` accepts a comma-separated subset of `BRAIN_TOOL_ALLOWLIST` (see `src/core/minions/tools/brain-allowlist.ts`): `query`, `search`, `get_page`, `list_pages`, `file_list`, `file_url`, `get_backlinks`, `traverse_graph`, `resolve_slugs`, `get_ingest_log`, `put_page`. Anything outside the allow-list is rejected at submit time with `allowed_tools references unknown tool`.
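
A caller can check a `--tools` value locally before submitting, which gives faster feedback than waiting for the submit-time rejection. The sketch below is illustrative: `parse_tools` is a hypothetical helper, and `ALLOWED_TOOLS` is a hand copy of the names listed above, so it may drift from the real `BRAIN_TOOL_ALLOWLIST`.

```python
# Hand-copied from the list above (an assumption that it is complete and current);
# the source of truth remains BRAIN_TOOL_ALLOWLIST in the codebase.
ALLOWED_TOOLS = frozenset({
    "query", "search", "get_page", "list_pages", "file_list", "file_url",
    "get_backlinks", "traverse_graph", "resolve_slugs", "get_ingest_log",
    "put_page",
})


def parse_tools(csv: str) -> list[str]:
    """Split a comma-separated tool list and reject names outside the allow-list."""
    tools = [t.strip() for t in csv.split(",") if t.strip()]
    unknown = [t for t in tools if t not in ALLOWED_TOOLS]
    if unknown:
        raise ValueError(f"unknown tool(s): {', '.join(unknown)}")
    return tools
```

For example, `parse_tools("search,query")` returns `["search", "query"]`, while a list containing a name outside the set raises `ValueError` before anything is submitted.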
For parallel work with a fan-out manifest:
gbrain agent run --fanout-manifest companies.json
The manifest describes N children + 1 aggregator. Each child runs `name="subagent"` under the hood; the aggregator runs `name="subagent_aggregator"` and is claimed only after every child terminates. See `src/core/minions/handlers/subagent.ts` and `src/core/minions/handlers/subagent-aggregator.ts`.
Flags (from `src/commands/agent.ts`):
  • `--subagent-def <name>`: named subagent definition
  • `--model <id>`: override model
  • `--max-turns <N>`: cap the LLM loop
  • `--tools <csv>`: allow-listed brain tools (see above)
  • `--timeout-ms <N>`: hard timeout per job
  • `--fanout-manifest <file>`: N children + 1 aggregator
  • `--follow` / `--no-follow`: stream logs + wait (default on TTY)
  • `--detach`: submit and return immediately
Queue/priority/retry tuning is not exposed by `gbrain agent run`; submit the raw `subagent` handler via `gbrain jobs submit` (requires CLI trust) if you need those knobs.

Phase 2: Monitor

list_jobs --status active          # MCP — what's running?
get_job ID                         # MCP — full details + logs + tokens
get_job_progress ID                # MCP — structured progress snapshot
gbrain jobs stats                  # CLI — queue health dashboard
gbrain agent logs ID --follow      # CLI — streaming transcript + heartbeat
Progress includes: step count, total steps, message, token usage, last tool called.

Phase 3: Steer

Send a message to redirect a running agent:
send_job_message id=ID payload={"directive":"focus on revenue, skip headcount"}
The agent handler reads inbox messages on each iteration and injects them as context. Messages are acknowledged (read receipts tracked).
Only the parent job or admin can send messages (sender validation).

Phase 4: Lifecycle

pause_job id=ID                    # freeze without losing state
resume_job id=ID                   # pick up where it left off
cancel_job id=ID                   # hard stop
replay_job id=ID                   # re-run with same or modified params
replay_job id=ID data_overrides={"depth":"deep"}  # replay with changes
All lifecycle ops are MCP-callable.

Phase 5: Review Results

get_job ID                         # result, token counts, transcript
Token accounting: every job tracks `tokens_input`, `tokens_output`, `tokens_cache_read`. Child tokens roll up to parent automatically on completion.
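
The roll-up arithmetic can be illustrated with a small recursive sum. Minions performs this automatically, so the sketch is only a model of the result: the `Job` class is hypothetical, and it assumes each job's own counters exclude its children, which the text above does not state.

```python
from dataclasses import dataclass, field

TOKEN_FIELDS = ("tokens_input", "tokens_output", "tokens_cache_read")


@dataclass
class Job:
    """Hypothetical in-memory job record holding only its own token usage."""
    tokens_input: int = 0
    tokens_output: int = 0
    tokens_cache_read: int = 0
    children: list["Job"] = field(default_factory=list)


def rolled_up_tokens(job: Job) -> dict[str, int]:
    """Sum a job's own counters with those of all descendants."""
    totals = {name: getattr(job, name) for name in TOKEN_FIELDS}
    for child in job.children:
        child_totals = rolled_up_tokens(child)
        for name in TOKEN_FIELDS:
            totals[name] += child_totals[name]
    return totals
```

Because the sum is recursive, a grandchild's usage reaches the top-level parent through its own parent, matching a DAG in which each level reports upward.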

Output Format

When reporting job status to the user:
Job #ID (name) — status
Progress: step/total — last action
Tokens: input_count in / output_count out (+ cache_read cached)
Runtime: Xs
Children: N pending, M completed
When reporting completion:
Job #ID completed in Xs
Tokens used: input / output / cache_read
Result: <summary>
When reporting batch status (parent with children):
Parent #ID — waiting-children
  #A subagent(Acme) — active, 3/5 steps, 2.5k tokens
  #B subagent(Beta) — completed, 1.8k tokens
  #C subagent(Gamma) — paused
Total tokens so far: 4.3k
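
As an illustration of filling in these templates, here is a minimal sketch that renders the completion report. The dictionary keys `id`, `runtime_s`, and `summary` are hypothetical names chosen for the example; only the three token field names come from this document, and the actual shape of a `get_job` result may differ.

```python
def format_completion(job: dict) -> str:
    """Render the completion report template from a job record."""
    return "\n".join([
        f"Job #{job['id']} completed in {job['runtime_s']}s",
        "Tokens used: "
        f"{job['tokens_input']} / {job['tokens_output']} / {job['tokens_cache_read']}",
        f"Result: {job['summary']}",
    ])
```

Keeping the rendering in one function makes it easy to hold the three report shapes consistent as fields are added.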

Anti-Patterns

  • Don't spawn a Minion for a single search query (use search tool directly)
  • Don't fire-and-forget without checking results
  • Don't spawn > 5 concurrent agents without checking `gbrain jobs stats` first
  • For subagent work, don't use `sessions_spawn` with `runtime: "subagent"` when Minions is available (use `gbrain agent run` instead)
  • Don't poll `get_job` in a tight loop (use `get_job_progress` for lightweight checks)
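
For the last point, a polling loop with exponential backoff avoids hammering the queue. The sketch below is not a gbrain API: `fetch_progress` is a hypothetical callable standing in for however the caller invokes `get_job_progress`, the `status` key is an assumed field name, and the set of terminal statuses (in particular `cancelled`) is an assumption.

```python
import time
from typing import Callable

# Assumed terminal statuses; adjust to whatever the queue actually reports.
TERMINAL = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_progress: Callable[[], dict],
    *,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    factor: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status, backing off between checks."""
    delay = initial_delay
    deadline = time.monotonic() + timeout
    while True:
        snapshot = fetch_progress()
        if snapshot.get("status") in TERMINAL:
            return snapshot
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not reach a terminal status in time")
        sleep(delay)
        # Grow the wait geometrically, capped at max_delay.
        delay = min(delay * factor, max_delay)
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real waiting.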

Tools Used

  • Submit a background job: `submit_job` (MCP, non-protected names only; shell jobs are CLI-only, subagent jobs via `gbrain agent run`)
  • Get job details: `get_job` (MCP)
  • List jobs with filters: `list_jobs` (MCP)
  • Cancel a job: `cancel_job` (MCP)
  • Pause a job: `pause_job` (MCP)
  • Resume a paused job: `resume_job` (MCP)
  • Replay a completed/failed job: `replay_job` (MCP)
  • Send sidechannel message: `send_job_message` (MCP)
  • Get structured progress: `get_job_progress` (MCP)
  • Queue stats: `gbrain jobs stats` (CLI; no MCP equivalent)