ralph-review-deep

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

STARTER_CHARACTER = 🔬

STARTER_CHARACTER = 🔬

Ralph Review (Deep)

Ralph深度评审

Validate the beads graph and launch three independent reviews of the ralph artifacts in parallel using different models via

opencode run

. Each model produces one review file evaluating goal clarity, task decomposition, sequencing, plan completeness, risk and gaps, and feasibility.

For a faster single-model review using an internal subagent, use the ralph-review skill (/ralph-review, $ralph-review) instead. This deep variant trades wall time and cost for cross-model consensus.

验证beads图，并通过

opencode run

使用不同模型并行启动三个独立的Ralph工件评审。每个模型生成一份评审文件，评估目标清晰度、任务分解、排序、计划完整性、风险与缺口以及可行性。

如需使用内部子代理进行更快的单模型评审，请改用ralph-review技能（/ralph-review, $ralph-review）。此深度变体以耗时和成本为代价换取跨模型共识。

Models

模型

Label	Model ID
openai	openai/gpt-5.3-codex
gemini	google/gemini-3.1-pro-preview
claude	az-anthropic/claude-opus-4-6

标签	模型ID
openai	openai/gpt-5.3-codex
gemini	google/gemini-3.1-pro-preview
claude	az-anthropic/claude-opus-4-6

Procedure

流程

Step 1: Validate Artifacts Exist

步骤1：验证工件存在

bash

PROJECT_ROOT=$(git rev-parse --show-toplevel)

Verify all of these exist:

```
$PROJECT_ROOT/.llmtmp/ralph-plan.md
```

$PROJECT_ROOT/.llmtmp/ralph-prompt-insession.md

$PROJECT_ROOT/.llmtmp/ralph-prompt-external.md

$PROJECT_ROOT/.beads/

(directory; bd-aware probe via

bd list >/dev/null 2>&1

)

Abort if any are missing.

bash

PROJECT_ROOT=$(git rev-parse --show-toplevel)

验证以下所有内容是否存在：

```
$PROJECT_ROOT/.llmtmp/ralph-plan.md
```

$PROJECT_ROOT/.llmtmp/ralph-prompt-insession.md

$PROJECT_ROOT/.llmtmp/ralph-prompt-external.md

$PROJECT_ROOT/.beads/

(目录；通过

bd list >/dev/null 2>&1

进行bd感知探测)

若有任何一项缺失则中止。

Step 2: Validate the bd Graph

步骤2：验证bd图

Run these checks before launching opencode. Failures here are HIGH-severity findings and should abort or be flagged loudly.

The bd graph is organized under a per-run epic. The slug is derived from the current branch; the epic is found via the

ralph:<slug>

label. All scoped checks filter by

--parent ${EPIC}

bash

cd "$PROJECT_ROOT"

SLUG=$(git branch --show-current | tr '/' '-')

在启动opencode前运行这些检查。此处的失败属于高严重性问题，应中止流程或进行醒目标记。

bd图按每次运行的史诗（epic）进行组织。slug由当前分支派生而来；史诗通过

ralph:<slug>

标签查找。所有范围检查均通过

--parent ${EPIC}

进行过滤。

bash

cd "$PROJECT_ROOT"

SLUG=$(git branch --show-current | tr '/' '-')

1. Single open epic for this run

1. 本次运行仅存在一个开放的史诗

EPIC_COUNT=$(bd list -t epic -l "ralph:${SLUG}" --json 2>/dev/null | jq '[.[] | select(.status=="open")] | length') if [[ "$EPIC_COUNT" -ne 1 ]]; then echo "HIGH: expected exactly one open epic with label ralph:${SLUG}, found ${EPIC_COUNT}" fi EPIC=$(bd list -t epic -l "ralph:${SLUG}" --json 2>/dev/null | jq -r '[.[] | select(.status=="open")] | .[0].id // empty')

2. Cycles

2. 循环检查

if ! bd dep cycles >/dev/null 2>&1; then echo "HIGH: bd graph has dependency cycles" bd dep cycles fi

3. Minimum child count under the run epic

3. 运行史诗下的最小子任务数

COUNT=$(bd list --parent "$EPIC" --json 2>/dev/null | jq 'length') if [[ "$COUNT" -eq 0 ]]; then echo "HIGH: run epic ${EPIC} has no child tasks" fi

4. At least one ready task under the run epic

4. 运行史诗下至少有一个就绪任务

READY_COUNT=$(bd ready --parent "$EPIC" --json 2>/dev/null | jq 'length') if [[ "$READY_COUNT" -eq 0 && "$COUNT" -gt 0 ]]; then echo "HIGH: run epic ${EPIC} has tasks but no ready tasks (deadlock or all closed unexpectedly)" fi

5. Isolated nodes within the run epic

5. 运行史诗内的孤立节点

if [[ "$COUNT" -gt 1 ]]; then bd list --parent "$EPIC" --json 2>/dev/null
| jq -r '.[] | select(.dependency_count == 0 and .dependent_count == 0) | "MEDIUM: isolated task " + .id + " " + .title' fi

6. Description coverage: every child carries the worker's instructions

6. 描述覆盖率：每个子任务都包含执行者的指令

bd list --parent "$EPIC" --json 2>/dev/null
| jq -r '.[] | select(((.description // "") | length) == 0) | "HIGH: task " + .id + " has empty description: " + .title'

7. Parent set equals label set (defense-in-depth orphan detection)

7. 父集合与标签集合一致（深度防御式孤立任务检测）

PARENT_IDS=$(bd list --parent "$EPIC" --json 2>/dev/null | jq -r '.[].id' | sort) LABEL_IDS=$(bd list --label "ralph:${SLUG}" --json 2>/dev/null | jq -r '.[] | select(.issue_type != "epic") | .id' | sort) if [[ "$PARENT_IDS" != "$LABEL_IDS" ]]; then echo "HIGH: parent set (--parent ${EPIC}) and label set (--label ralph:${SLUG} excluding epic) differ" echo "Parent-only (label missing): $(comm -23 <(echo "$PARENT_IDS") <(echo "$LABEL_IDS"))" echo "Label-only (orphan, parent missing): $(comm -13 <(echo "$PARENT_IDS") <(echo "$LABEL_IDS"))" fi


Persist these findings; the reviewers reference them in their final reports.


保存这些检查结果；评审人员会在最终报告中参考这些内容。

Step 3: Clean Up Previous Reviews

步骤3：清理之前的评审文件

bash

rm -f "$PROJECT_ROOT"/.llmtmp/ralph-review-openai.md \
      "$PROJECT_ROOT"/.llmtmp/ralph-review-gemini.md \
      "$PROJECT_ROOT"/.llmtmp/ralph-review-claude.md

bash

rm -f "$PROJECT_ROOT"/.llmtmp/ralph-review-openai.md \
      "$PROJECT_ROOT"/.llmtmp/ralph-review-gemini.md \
      "$PROJECT_ROOT"/.llmtmp/ralph-review-claude.md

Step 4: Read Files and Build Prompt Context

步骤4：读取文件并构建提示词上下文

Concatenate the contents of these files into a single

CONTEXT

string:

```
.llmtmp/ralph-plan.md
```
```
.llmtmp/ralph-prompt-insession.md
```
```
.llmtmp/ralph-prompt-external.md
```
The slug and epic ID resolved in Step 2 (literal label
```
SLUG=...
```
and
```
EPIC=...
```
)
Output of
```
bd list --parent "$EPIC" --json
```
Output of
```
bd ready --parent "$EPIC" --json
```
Output of
```
bd list --label "ralph:${SLUG}" --json
```
The bd graph validation findings from Step 2
Plan documents listed in the plan's
```
plan documents:
```
line. Parse the line, split on commas, trim whitespace, read each path. Wrap each as
```
=== plan-documents/<path> ===
```
so the reviewer can distinguish them from baseline. Empty value means no plan documents to load.
```
CLAUDE.md
```
and
```
README.md
```
from project root (if they exist)
All
```
.md
```
files in
```
.llmdocs/
```

Wrap each as:

text

=== <relative-path or label> ===
<contents>

将以下文件的内容拼接成单个

CONTEXT

字符串：

```
.llmtmp/ralph-plan.md
```
```
.llmtmp/ralph-prompt-insession.md
```
```
.llmtmp/ralph-prompt-external.md
```
步骤2中解析得到的slug和史诗ID（字面标签
```
SLUG=...
```
和
```
EPIC=...
```
）
```
bd list --parent "$EPIC" --json
```
的输出
```
bd ready --parent "$EPIC" --json
```
的输出
```
bd list --label "ralph:${SLUG}" --json
```
的输出
步骤2中的bd图验证结果
计划中
```
plan documents:
```
行列出的计划文档。解析该行，按逗号分割，去除空格，读取每个路径。将每个路径包装为
```
=== plan-documents/<path> ===
```
，以便评审人员区分基线内容。空值表示无需加载计划文档。
项目根目录下的
```
CLAUDE.md
```
和
```
README.md
```
（若存在）
```
.llmdocs/
```
目录下的所有
```
.md
```
文件

将每个内容块包装为：

text

=== <相对路径或标签> ===
<内容>

Step 5: Construct the Review Prompt

步骤5：构建评审提示词

Build the prompt that each opencode process executes. Replace

<PROJECT_ROOT>

and

<CONTEXT>

text

PROMPT="You are a plan reviewer running headless in a non-interactive session. There is no user present. Do not ask questions. Do not prompt for confirmation. Do not stop to wait for input.

YOUR PRIMARY OBJECTIVE IS TO WRITE A FILE. Everything else is preparation. After completing your review, your VERY NEXT action must be a tool call that writes the output file. Do not summarize, do not reflect, do not plan — write the file immediately.

OUTPUT RULES: You are running headless. Nobody reads your text responses. ALL review content goes exclusively into the output file via tool calls. Any text you generate outside of a tool call is wasted tokens and risks hitting the output token limit before you can write the file. Keep text responses to one short sentence at most.

PROJECT_ROOT: <PROJECT_ROOT>

构建每个opencode流程执行的提示词。替换

<PROJECT_ROOT>

和

<CONTEXT>

：

text

PROMPT="You are a plan reviewer running headless in a non-interactive session. There is no user present. Do not ask questions. Do not prompt for confirmation. Do not stop to wait for input.

YOUR PRIMARY OBJECTIVE IS TO WRITE A FILE. Everything else is preparation. After completing your review, your VERY NEXT action must be a tool call that writes the output file. Do not summarize, do not reflect, do not plan — write the file immediately.

OUTPUT RULES: You are running headless. Nobody reads your text responses. ALL review content goes exclusively into the output file via tool calls. Any text you generate outside of a tool call is wasted tokens and risks hitting the output token limit before you can write the file. Keep text responses to one short sentence at most.

PROJECT_ROOT: <PROJECT_ROOT>

Context

How Ralph Loops Work

A ralph loop is an autonomous, iterative execution mechanism. Two execution modes share the same artifacts:

In-session (ralph-loop plugin): a Stop hook re-feeds the prompt to the same Claude session. Cheap on short runs; expensive on long runs because context grows quadratically.
External (scripts/ralph.sh): a bash while-loop spawns a fresh
```
claude --print
```
per iteration. Higher fixed overhead per iteration; stays in the cheap part of the cost curve indefinitely.

Authority sources in CONTEXT:

The CONTEXT contains two distinct authority categories. Treat them differently.

Plan documents (under
```
=== plan-documents/...
```
blocks): forward-looking intent the planner consulted, e.g., a PRD, a Jira ticket, a design note. The plan must faithfully render these. Flag the plan if it omits, contradicts, or stale-references a requirement from a plan document.
Baseline docs (
```
CLAUDE.md
```
,
```
README.md
```
,
```
.llmdocs/*
```
): the system's current state. The plan's job is to change this state. Do NOT flag the plan for deviating from baseline. Flag only if the plan changes baseline without including a corresponding docs-update bead (typically delegated to the docs skill).

Mechanics that apply to both modes:

The bd graph is the state machine. The graph is shared across worktrees of the same repo. Each ralph run is organized under a per-run epic so concurrent runs in different worktrees stay isolated by
```
--parent
```
filtering.
```
bd ready --parent <epic> --json
```
finds the next unblocked task within this run.
The run epic carries the label
```
ralph:<slug>
```
where
```
<slug>
```
is derived from the current branch. Every child task carries both
```
--parent <epic>
```
and the same
```
ralph:<slug>
```
label. The two signals are independent so the reviewer can detect orphans.
The plan's "Run Identity" section records the literal slug and epic ID. The agent reads them directly from the plan each iteration, no derivation.
One task per iteration. The agent reads
```
bd ready --parent <epic> --json
```
, takes the first result, runs the per-task workflow, then closes the task.
When the ready queue is empty, the agent closes the epic and emits the sentinel as the terminal action. The epic is not closed by
```
bd
```
automatically.
Resumable. A partially-completed graph resumes cleanly. Partial completion is normal; iterations are cheap.
Every action is a bd task, including ralph lifecycle scaffolding. The branch creation, BEGIN tag, code work, docs, and END tag are all first-class beads, all parented to the run epic.
Each bead carries its full per-task instructions in its description: files to touch, expected behavior, verification command, commit message. The worker reads
```
bd show <id>
```
and executes the description exactly.
Run-wide conventions (commit style, language, build command, test command) live in
```
ralph-plan.md
```
, not in every description.
Environment is a precondition. Credentials, network, toolchain, and runtime dependencies are the operator's responsibility. Do not flag missing environment setup as a gap.

Do NOT recommend: collapsing tasks into bundles, adding environment preflight checks, treating partial completion as a risk, modifying the prompt body, replacing the bd graph with a markdown checklist, or reintroducing a workflow-label taxonomy. The

ralph:<slug>

label is run-identity for partition and orphan detection, not workflow dispatch; do not flag it as a label taxonomy.

Flag as HIGH severity if: zero or more than one open epic exists for

ralph:<slug>

; any open child has an empty description; the bd graph contains cycles; the run epic has no children; the run epic has children but no ready tasks; the parent set under the epic differs from the label set for

ralph:<slug>

(orphans or planning typos); the per-task workflow does not explicitly stop after

bd close

(the agent will do multiple beads in one context, defeating the Ralph design); the per-task workflow does not close the epic before emitting the sentinel; the plan lacks a "Run Identity" section recording the literal slug and epic ID; the plan lacks an "Inputs" section with a

plan documents:

line; the plan omits or contradicts a material requirement explicit in any plan document; the plan's Approach prose declares an ordering ("X first", "after Y") that the bd dependency edges do not enforce. Flag as MEDIUM if: a description omits the verification command or commit message; an isolated task exists in a multi-task graph; the plan duplicates information that already lives in bead descriptions; a bead that touches README.md/CLAUDE.md/.llmdocs/ authors edits inline instead of running the docs skill.

A ralph loop is an autonomous, iterative execution mechanism. Two execution modes share the same artifacts:

In-session (ralph-loop plugin): a Stop hook re-feeds the prompt to the same Claude session. Cheap on short runs; expensive on long runs because context grows quadratically.
External (scripts/ralph.sh): a bash while-loop spawns a fresh
```
claude --print
```
per iteration. Higher fixed overhead per iteration; stays in the cheap part of the cost curve indefinitely.

Authority sources in CONTEXT:

The CONTEXT contains two distinct authority categories. Treat them differently.

Plan documents (under
```
=== plan-documents/...
```
blocks): forward-looking intent the planner consulted, e.g., a PRD, a Jira ticket, a design note. The plan must faithfully render these. Flag the plan if it omits, contradicts, or stale-references a requirement from a plan document.
Baseline docs (
```
CLAUDE.md
```
,
```
README.md
```
,
```
.llmdocs/*
```
): the system's current state. The plan's job is to change this state. Do NOT flag the plan for deviating from baseline. Flag only if the plan changes baseline without including a corresponding docs-update bead (typically delegated to the docs skill).

Mechanics that apply to both modes:

The bd graph is the state machine. The graph is shared across worktrees of the same repo. Each ralph run is organized under a per-run epic so concurrent runs in different worktrees stay isolated by
```
--parent
```
filtering.
```
bd ready --parent <epic> --json
```
finds the next unblocked task within this run.
The run epic carries the label
```
ralph:<slug>
```
where
```
<slug>
```
is derived from the current branch. Every child task carries both
```
--parent <epic>
```
and the same
```
ralph:<slug>
```
label. The two signals are independent so the reviewer can detect orphans.
The plan's "Run Identity" section records the literal slug and epic ID. The agent reads them directly from the plan each iteration, no derivation.
One task per iteration. The agent reads
```
bd ready --parent <epic> --json
```
, takes the first result, runs the per-task workflow, then closes the task.
When the ready queue is empty, the agent closes the epic and emits the sentinel as the terminal action. The epic is not closed by
```
bd
```
automatically.
Resumable. A partially-completed graph resumes cleanly. Partial completion is normal; iterations are cheap.
Every action is a bd task, including ralph lifecycle scaffolding. The branch creation, BEGIN tag, code work, docs, and END tag are all first-class beads, all parented to the run epic.
Each bead carries its full per-task instructions in its description: files to touch, expected behavior, verification command, commit message. The worker reads
```
bd show <id>
```
and executes the description exactly.
Run-wide conventions (commit style, language, build command, test command) live in
```
ralph-plan.md
```
, not in every description.
Environment is a precondition. Credentials, network, toolchain, and runtime dependencies are the operator's responsibility. Do not flag missing environment setup as a gap.

ralph:<slug>

label is run-identity for partition and orphan detection, not workflow dispatch; do not flag it as a label taxonomy.

Flag as HIGH severity if: zero or more than one open epic exists for

ralph:<slug>

ralph:<slug>

(orphans or planning typos); the per-task workflow does not explicitly stop after

bd close

plan documents:

Your Task

Review the Ralph artifacts:

ralph-plan.md (mode-neutral execution plan)
ralph-prompt-insession.md (slash-command wrapper for the in-session driver)
ralph-prompt-external.md (body for scripts/ralph.sh)
The bd graph (issues, dependencies, ready/closed state)

Evaluate across these areas:

Review the Ralph artifacts:

ralph-plan.md (mode-neutral execution plan)
ralph-prompt-insession.md (slash-command wrapper for the in-session driver)
ralph-prompt-external.md (body for scripts/ralph.sh)
The bd graph (issues, dependencies, ready/closed state)

Evaluate across these areas:

1. Goal Clarity

Is the objective well-defined and unambiguous?
Would a fresh-context iteration understand what success looks like from the plan alone?

Is the objective well-defined and unambiguous?
Would a fresh-context iteration understand what success looks like from the plan alone?

2. Task Decomposition

Are tasks the right granularity (one focused deliverable each)?
Is each task independently completable?
Are there missing tasks needed to achieve the goal?
Are there unnecessary or redundant tasks?
Do titles name deliverables rather than techniques?
Do descriptions carry enough detail (files, verification, commit message) for a cold-start worker?

Are tasks the right granularity (one focused deliverable each)?
Is each task independently completable?
Are there missing tasks needed to achieve the goal?
Are there unnecessary or redundant tasks?
Do titles name deliverables rather than techniques?
Do descriptions carry enough detail (files, verification, commit message) for a cold-start worker?

3. Sequencing and Dependencies

Are dependencies correct in the bd graph?
Would executing in
```
bd ready --json
```
order produce correct results?
Are there implicit dependencies that should be explicit edges?
Are there cycles or isolated nodes?

Are dependencies correct in the bd graph?
Would executing in
```
bd ready --json
```
order produce correct results?
Are there implicit dependencies that should be explicit edges?
Are there cycles or isolated nodes?

4. Plan Completeness

Do the instructions cover repo-specific conventions, tools, and commands?
Are build, test, and lint commands accurate?
Is the per-task workflow clear (read description, execute it, close)?
Does the plan have a "Run Identity" section that records the literal branch slug, the literal epic ID, and the epic label
```
ralph:<slug>
```
?
Does the Per-Task Workflow read the epic ID from the "Run Identity" section and scope
```
bd ready
```
with
```
--parent <epic-id>
```
?
Does the Per-Task Workflow close the epic via
```
bd close <epic-id>
```
before emitting the sentinel when the ready queue is empty?
Does the Per-Task Workflow explicitly say STOP after
```
bd close
```
so the agent does not loop within a single context?
Does the Git Workflow section require a
```
ralph/
```
feature branch and BEGIN/END tags?
Is the Per-Task Workflow explicit that
```
bd close
```
for the current task runs BEFORE the sentinel?
Do beads that touch
```
README.md
```
,
```
CLAUDE.md
```
, or
```
.llmdocs/
```
invoke the docs skill (/docs, $docs) rather than authoring edits inline?

Do the instructions cover repo-specific conventions, tools, and commands?
Are build, test, and lint commands accurate?
Is the per-task workflow clear (read description, execute it, close)?
Does the plan have a "Run Identity" section that records the literal branch slug, the literal epic ID, and the epic label
```
ralph:<slug>
```
?
Does the Per-Task Workflow read the epic ID from the "Run Identity" section and scope
```
bd ready
```
with
```
--parent <epic-id>
```
?
Does the Per-Task Workflow close the epic via
```
bd close <epic-id>
```
before emitting the sentinel when the ready queue is empty?
Does the Per-Task Workflow explicitly say STOP after
```
bd close
```
so the agent does not loop within a single context?
Does the Git Workflow section require a
```
ralph/
```
feature branch and BEGIN/END tags?
Is the Per-Task Workflow explicit that
```
bd close
```
for the current task runs BEFORE the sentinel?
Do beads that touch
```
README.md
```
,
```
CLAUDE.md
```
, or
```
.llmdocs/
```
invoke the docs skill (/docs, $docs) rather than authoring edits inline?

5. Risk and Gaps

What could go wrong during autonomous execution?
Are there tasks that require human judgment or external access?
Could any task leave the repo in a broken state?

What could go wrong during autonomous execution?
Are there tasks that require human judgment or external access?
Could any task leave the repo in a broken state?

6. Feasibility

Can a single iteration realistically complete each task?
Are there tasks that exceed what a fresh context can do autonomously?

For each area, list specific findings with severity (high/medium/low) and actionable recommendations.

Can a single iteration realistically complete each task?
Are there tasks that exceed what a fresh context can do autonomously?

For each area, list specific findings with severity (high/medium/low) and actionable recommendations.

Output

Your VERY NEXT action after the review must be a tool call that writes the file. No intermediate steps.

Determine your model label:

Claude variants: claude
GPT variants: openai
Gemini variants: gemini

Write to

<PROJECT_ROOT>/.llmtmp/ralph-review-$MODEL_LABEL.md

using this template:

```markdown

Your VERY NEXT action after the review must be a tool call that writes the file. No intermediate steps.

Determine your model label:

Claude variants: claude
GPT variants: openai
Gemini variants: gemini

Write to

<PROJECT_ROOT>/.llmtmp/ralph-review-$MODEL_LABEL.md

using this template:

```markdown

Ralph Loop Review

Model: <MODEL_LABEL>

Goal Clarity

[medium] Short title. Explanation...

[medium] Short title. Explanation...

Task Decomposition

[high] Short title. Explanation...

[high] Short title. Explanation...

Sequencing and Dependencies

[medium] Short title. Explanation...

[medium] Short title. Explanation...

Plan Completeness

[low] Short title. Explanation...

[low] Short title. Explanation...

Risk and Gaps

[high] Short title. Explanation...

[high] Short title. Explanation...

Feasibility

[medium] Short title. Explanation...

[medium] Short title. Explanation...

Summary

<overall assessment and top 3 recommendations> ```

Finding format: `- [severity] Title. Description.`

Every section must be present. If no findings for an area, write 'No findings.' under its heading.

Then verify: `ls -la '<PROJECT_ROOT>/.llmtmp/ralph-review-<MODEL_LABEL>.md'`

If the file does not exist, write it again. Do not exit without the file."

undefined

<overall assessment and top 3 recommendations> ```

Finding format: `- [severity] Title. Description.`

Every section must be present. If no findings for an area, write 'No findings.' under its heading.

Then verify: `ls -la '<PROJECT_ROOT>/.llmtmp/ralph-review-<MODEL_LABEL>.md'`

If the file does not exist, write it again. Do not exit without the file."

undefined

Step 6: Write Prompt to File

步骤6：将提示词写入文件

Avoid shell interpolation issues with large prompts:

bash

STATE_DIR="$PROJECT_ROOT/.llmtmp/ralph_review_state"
mkdir -p "$STATE_DIR"
OPENAI_DIR=$(mktemp -d)
GEMINI_DIR=$(mktemp -d)
CLAUDE_DIR=$(mktemp -d)
cat > /tmp/ralph-review-prompt.txt <<'PROMPT_EOF'
<the prompt from Step 5>
PROMPT_EOF

避免大型提示词导致的shell插值问题：

bash

STATE_DIR="$PROJECT_ROOT/.llmtmp/ralph_review_state"
mkdir -p "$STATE_DIR"
OPENAI_DIR=$(mktemp -d)
GEMINI_DIR=$(mktemp -d)
CLAUDE_DIR=$(mktemp -d)
cat > /tmp/ralph-review-prompt.txt <<'PROMPT_EOF'
<the prompt from Step 5>
PROMPT_EOF

Step 7: Launch 3 Separate Background Bash Tasks

步骤7：启动3个独立的后台Bash任务

Each opencode process must run as its own background Bash task. Do NOT chain them in a single shell with

; child processes get killed when the parent exits.

OpenAI:

bash

STATE_DIR="<PROJECT_ROOT>/.llmtmp/ralph_review_state" && \
opencode run \
  -m openai/gpt-5.3-codex \
  --format json \
  --print-logs \
  --log-level INFO \
  --dir "<OPENAI_DIR>" \
  --title "Ralph Review - OpenAI" \
  "$(cat /tmp/ralph-review-prompt.txt)" \
  > "$STATE_DIR/openai.ndjson" 2>"$STATE_DIR/openai.log"

Gemini:

bash

STATE_DIR="<PROJECT_ROOT>/.llmtmp/ralph_review_state" && \
opencode run \
  -m google/gemini-3.1-pro-preview \
  --format json \
  --print-logs \
  --log-level INFO \
  --dir "<GEMINI_DIR>" \
  --title "Ralph Review - Gemini" \
  "$(cat /tmp/ralph-review-prompt.txt)" \
  > "$STATE_DIR/gemini.ndjson" 2>"$STATE_DIR/gemini.log"

Claude:

bash

STATE_DIR="<PROJECT_ROOT>/.llmtmp/ralph_review_state" && \
opencode run \
  -m az-anthropic/claude-opus-4-6 \
  --format json \
  --print-logs \
  --log-level INFO \
  --dir "<CLAUDE_DIR>" \
  --title "Ralph Review - Claude" \
  "$(cat /tmp/ralph-review-prompt.txt)" \
  > "$STATE_DIR/claude.ndjson" 2>"$STATE_DIR/claude.log"

All 3 launch as separate background tasks. Do NOT use

in a single shell.

每个opencode流程必须作为独立的后台Bash任务运行。不要在单个shell中用

串联；父进程退出时子进程会被终止。

OpenAI:

bash

STATE_DIR="<PROJECT_ROOT>/.llmtmp/ralph_review_state" && \
opencode run \
  -m openai/gpt-5.3-codex \
  --format json \
  --print-logs \
  --log-level INFO \
  --dir "<OPENAI_DIR>" \
  --title "Ralph Review - OpenAI" \
  "$(cat /tmp/ralph-review-prompt.txt)" \
  > "$STATE_DIR/openai.ndjson" 2>"$STATE_DIR/openai.log"

Gemini:

bash

STATE_DIR="<PROJECT_ROOT>/.llmtmp/ralph_review_state" && \
opencode run \
  -m google/gemini-3.1-pro-preview \
  --format json \
  --print-logs \
  --log-level INFO \
  --dir "<GEMINI_DIR>" \
  --title "Ralph Review - Gemini" \
  "$(cat /tmp/ralph-review-prompt.txt)" \
  > "$STATE_DIR/gemini.ndjson" 2>"$STATE_DIR/gemini.log"

Claude:

bash

STATE_DIR="<PROJECT_ROOT>/.llmtmp/ralph_review_state" && \
opencode run \
  -m az-anthropic/claude-opus-4-6 \
  --format json \
  --print-logs \
  --log-level INFO \
  --dir "<CLAUDE_DIR>" \
  --title "Ralph Review - Claude" \
  "$(cat /tmp/ralph-review-prompt.txt)" \
  > "$STATE_DIR/claude.ndjson" 2>"$STATE_DIR/claude.log"

三个流程均作为独立后台任务启动。不要在单个shell中使用

。

Output Streams

输出流

Each opencode process produces two output files:

<label>.ndjson
(stdout): Structured NDJSON events from
```
--format json
```
. Use for programmatic progress tracking.
<label>.log
(stderr): Plain-text info-level logs. Use for diagnosing startup failures, permission issues, MCP server errors, plugin loading.

Log lines are structured text, one per line:

text

INFO  2026-03-13T00:54:25 +4ms service=default directory=/private/tmp creating instance

每个opencode流程生成两个输出文件：

<label>.ndjson
(标准输出)：
```
--format json
```
生成的结构化NDJSON事件。用于程序化进度跟踪。
<label>.log
(标准错误)：纯文本信息级日志。用于诊断启动失败、权限问题、MCP服务器错误、插件加载问题。

日志行是结构化文本，每行一条：

text

INFO  2026-03-13T00:54:25 +4ms service=default directory=/private/tmp creating instance

Step 8: Monitor and Wait

步骤8：监控并等待

Poll each background task until all 3 complete.

NDJSON progress (replace

$NDJSON

with the actual path):

bash

undefined

轮询每个后台任务，直到三个任务全部完成。

NDJSON进度（将

$NDJSON

替换为实际路径）：

bash

undefined

Count completed steps

统计已完成步骤数

grep -c '"type":"step_finish"' "$NDJSON" 2>/dev/null || echo 0

Check if done (last step_finish has reason "stop")

检查是否完成（最后一个step_finish的reason为"stop"）

tail -1 "$NDJSON" 2>/dev/null | jq -r '.part.reason // empty'

Check for errors

检查错误

grep '"type":"error"' "$NDJSON" 2>/dev/null


Text logs (replace `$LOGFILE` with the actual path):

```bash

grep '"type":"error"' "$NDJSON" 2>/dev/null


文本日志（将`$LOGFILE`替换为实际路径）：

```bash

Errors or warnings

错误或警告

grep -E "^(ERROR|WARN)" "$LOGFILE" 2>/dev/null

Recent activity

Is done?

是否已完成？

tail -1 "$NDJSON" | jq -r 'select(.type=="step_finish") | .part.reason'

Total cost

总成本

jq -s '[.[] | select(.type=="step_finish") | .part.cost] | add' "$NDJSON"

All errors

所有错误

jq -r 'select(.type=="error") | .error.data.message' "$NDJSON"


Replace `$LOGFILE` with the text log path:

```bash
grep -E "^(ERROR|WARN)" "$LOGFILE"

jq -r 'select(.type=="error") | .error.data.message' "$NDJSON"


将`$LOGFILE`替换为文本日志路径：

```bash
grep -E "^(ERROR|WARN)" "$LOGFILE"

Rules

规则

The invoking agent is ONLY a launcher. All review work happens inside the opencode processes.
Do NOT perform any review analysis directly. The opencode processes handle reviewing.
Do NOT ask questions during execution. This is non-interactive.
Launch all 3 processes in parallel. Do not wait for one to finish before starting another.
Use plain message invocation, not
```
--command
```
. The
```
--command
```
flag has a known issue with the c7 MCP server.
Preserve the state directory after reporting results. NDJSON and log files stay for post-run inspection.
If a model fails, still wait for and report the others.
The bd graph validation in Step 2 runs in the invoking agent, not in opencode. Findings are passed to opencode reviewers as context.

调用代理仅作为启动器。所有评审工作在opencode流程内完成。
不要直接执行任何评审分析。opencode流程负责评审工作。
执行过程中不要提问。此流程为非交互式。
并行启动三个流程。不要等待一个流程完成后再启动另一个。
使用普通消息调用，而非
```
--command
```
。
```
--command
```
标志在c7 MCP服务器上存在已知问题。
报告结果后保留状态目录。NDJSON和日志文件用于运行后检查。
若某个模型失败，仍需等待并报告其他模型的结果。
步骤2中的bd图验证在调用代理中运行，而非opencode。验证结果会作为上下文传递给opencode评审人员。