search-it-bulk-by-codex


Run large batches of small factual research questions with native Codex only: `codex exec`, built-in `--search`, and filesystem artifacts. Do not use MCPs, browser plugins, custom scrapers, or external research tools.

Default policy:

```text
Subagent model: gpt-5.4-mini
Subagent reasoning: medium
Orchestrator model: gpt-5.4
Orchestrator reasoning: medium
Search: codex --search
Session root: .agent-docs/qa-session/
```

Never leave reasoning unset. Every `codex` command in this workflow includes `-c model_reasoning_effort=medium`.


Sanity Check


Run these before dispatching a batch:

```bash
codex --help
codex exec --help
```

`codex --help` must show `exec`, `--search`, `-m/--model`, `-c/--config`, `-a/--ask-for-approval`, `-s/--sandbox`, and `-C/--cd`.

`codex exec --help` must show `--skip-git-repo-check`, `--output-last-message`/`-o`, `--json`, and `--sandbox`.
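
The two checks above can be scripted rather than read by eye. The following is a minimal sketch, not part of the original workflow: `missing_flags` is a hypothetical helper that reads help text on stdin and reports any expected token that does not appear in it. It is a plain substring match, so treat it as a rough presence check only.

```shell
# Hypothetical helper: read help text on stdin and report which of the
# expected tokens (given as arguments) do not appear in it.
missing_flags() {
  help_text="$(cat)"
  rc=0
  for token in "$@"; do
    # -F: fixed-string match; --: end of options, so "--search" is a pattern.
    if ! printf '%s\n' "$help_text" | grep -qF -- "$token"; then
      echo "MISSING: $token"
      rc=1
    fi
  done
  return "$rc"
}

# Intended use (requires the codex CLI on PATH):
#   codex --help      | missing_flags exec --search --model --config --ask-for-approval --sandbox --cd
#   codex exec --help | missing_flags --skip-git-repo-check --output-last-message --json --sandbox
```

The function returns non-zero when anything is missing, so it can gate a dispatch script.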

Important CLI quirks:

- `--search` is global; put it before `exec`.
- `--ask-for-approval` is global; use `-a never` before `exec`.
- Use `--skip-git-repo-check` outside git repos.
- Use `--sandbox workspace-write` when the agent must write answer files.
- `-o` captures the final message only; the answer file is the durable artifact.
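
To make the ordering rule concrete, here is a small dry-run sketch. `codex_args` is a hypothetical helper, not something the Codex CLI provides. It only prints the argument list, one argument per line, so the position of the global options relative to `exec` can be inspected without running Codex.

```shell
# Hypothetical helper: print the argument order for one subagent run.
# Global options come first, then the exec subcommand, then its own options.
codex_args() {
  sandbox="$1"    # e.g. read-only or workspace-write
  out_file="$2"   # where -o stores the final message
  set -- --search -m gpt-5.4-mini -c model_reasoning_effort=medium -a never \
    exec --skip-git-repo-check --sandbox "$sandbox" -o "$out_file"
  printf '%s\n' "$@"
}

codex_args workspace-write .agent-docs/qa-session/001-last-message.txt
```

Keeping the order in one place like this is one way to avoid placing `--search` or `-a never` after `exec` by accident.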

Verified native web check, run 2026-05-22:

```bash
codex --search \
  -m gpt-5.4-mini \
  -c model_reasoning_effort=medium \
  -a never \
  exec \
  --skip-git-repo-check \
  --sandbox read-only \
  -o /tmp/codex-web-check.txt \
  "Use native live web search. What is the headline and URL of the newest item currently listed on the OpenAI News page? Answer in one sentence with the date if shown."
```

The run reported `model: gpt-5.4-mini` and `reasoning effort: medium`, and used web search. The final answer was the OpenAI News item "OpenAI named a Leader in enterprise coding agents by Gartner" at https://openai.com/index/gartner-2026-agentic-coding-leader/, dated 2026-05-22. Re-run this check in the target environment because auth, feature flags, and network policy can differ.


File Protocol


Use one directory for the batch:

```bash
mkdir -p .agent-docs/qa-session
```

Question files are written by the orchestrator:

```text
.agent-docs/qa-session/001-question.md
.agent-docs/qa-session/002-question.md
```

Answer files are written by subagents:

```text
.agent-docs/qa-session/001-answer-correct.md
.agent-docs/qa-session/002-answer-not-clear.md
```

Valid status suffixes:

| Suffix | Use when |
| --- | --- |
| `correct` | Answer is confirmed and high-confidence |
| `findings` | Useful partial results, not fully conclusive |
| `incorrect` | Initial assumption was wrong; explain why |
| `not-clear` | Conflicting sources or insufficient signal |
| `timeout` | Search exhausted time/URL budget without resolution |
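
Because the status lives in the filename, an orchestrator can reject unexpected suffixes before merging. A minimal sketch, assuming a hypothetical helper named `answer_status` that is not part of the original workflow:

```shell
# Hypothetical helper: print the status suffix of an answer filename, or
# fail if the name does not end in one of the five documented suffixes.
answer_status() {
  name="$(basename "$1" .md)"
  status_part="${name#*-answer-}"
  # If nothing was stripped, the name has no "-answer-" marker at all.
  [ "$status_part" != "$name" ] || return 1
  case "$status_part" in
    correct|findings|incorrect|not-clear|timeout) printf '%s\n' "$status_part" ;;
    *) return 1 ;;
  esac
}

answer_status .agent-docs/qa-session/002-answer-not-clear.md   # prints: not-clear
```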

If an incoming plan uses `NNNq.md` or `NNNa-{status}.md`, normalize it to `NNN-question.md` and `NNN-answer-{status}.md`. Keep the numeric prefix stable; it is the join key.
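
The renaming rule can be sketched as a pure string function. `normalize_name` below is a hypothetical helper, written on the assumption that legacy names consist of a numeric prefix followed directly by `q.md` or `a-{status}.md`. It prints the new name and leaves the actual `mv` to the caller.

```shell
# Hypothetical helper: map legacy names to the canonical scheme.
#   001q.md          -> 001-question.md
#   001a-correct.md  -> 001-answer-correct.md
# Names already in canonical form, or unrecognized, are printed unchanged.
normalize_name() {
  name="$1"
  digits="${name%%[!0-9]*}"    # leading run of digits: the join key
  rest="${name#"$digits"}"
  if [ -z "$digits" ]; then
    printf '%s\n' "$name"
    return
  fi
  case "$rest" in
    q.md)   printf '%s-question.md\n' "$digits" ;;
    a-*.md) printf '%s-answer-%s\n' "$digits" "${rest#a-}" ;;
    *)      printf '%s\n' "$name" ;;
  esac
}

normalize_name 001q.md            # prints: 001-question.md
normalize_name 002a-not-clear.md  # prints: 002-answer-not-clear.md
```

The numeric prefix is copied through untouched, which keeps the join key stable.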

Question template:


```md
001 Question

Question: <one focused sentence>

Context:

<known facts>
<false positives or traps>

Expected answer:

<single line, yes/no, path, package name, year, owner, etc.>
```

Answer template:

```md
001 Answer

Sources:

<URL or exact search term> - <what it proved>

Notes: <Optional. Only include material interpretation notes.>
```

Keep answers short. Orchestrator context is finite; verbose answers pollute the merge loop.

Fan-Out Rule


A single broad search is incomplete. Every subagent must:

1. Decompose the question into 3-5 narrower sub-questions.
2. Search each sub-question independently with distinct keyword sets.
3. Cross-reference findings; agreement increases confidence, conflict triggers a targeted follow-up search.
4. Synthesize only after the sub-searches are done.

Budget is a ceiling, not a quota:

```text
Search up to 50 URLs if needed.
Search up to 50 keyword/search-term variants if needed.
Use fewer when the evidence is already strong.
```

Example: for "Does library X support feature Y?", do not only search that sentence. Fan out:

```text
library X changelog feature Y
library X GitHub issues feature Y
library X documentation Y API
library X Y workaround OR alternative
library X Y release notes
```
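
The same fan-out can be produced mechanically when an orchestrator wants to seed a question file with search angles. This is only an illustrative sketch; `fan_out_queries` is a hypothetical helper, and the five angles simply mirror the example above.

```shell
# Hypothetical helper: expand one library/feature pair into five
# search angles, one query per line.
fan_out_queries() {
  lib="$1"
  feature="$2"
  printf '%s changelog %s\n' "$lib" "$feature"
  printf '%s GitHub issues %s\n' "$lib" "$feature"
  printf '%s documentation %s API\n' "$lib" "$feature"
  printf '%s %s workaround OR alternative\n' "$lib" "$feature"
  printf '%s %s release notes\n' "$lib" "$feature"
}

fan_out_queries "library X" "Y"
```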

Prioritize official docs, changelogs, source repos, issue trackers, package indexes, platform records, and archived primary pages. Forums and blogs are supporting evidence. Snippets are leads, not proof.


Subagent Prompt


Use one prompt file per question:

```bash
cat > .agent-docs/qa-session/001-prompt.txt <<'PROMPT'
You are a Codex research subagent using native Codex web search only.
Do not use MCPs, browser plugins, custom scrapers, or external research tools.

Read .agent-docs/qa-session/001-question.md.
Choose exactly one status suffix from: correct, findings, incorrect,
not-clear, timeout.
Write exactly one answer file:
.agent-docs/qa-session/001-answer-<chosen-status>.md
Do not include the angle brackets literally.

Before searching, decompose the question into 3-5 narrower sub-questions. Do
not search the top-level question directly as your only search. Search up to
50 URLs and up to 50 keyword variants if needed; you may need fewer. Prioritize
the highest-signal search angles based on your own experience.

Use the required answer template. Keep it concise. No preamble, no extra
commentary, no markdown headers beyond the template.

Your answer file must contain these fields:
Question: <copy from question file>
Status: <chosen status>
Answer: <single line>
Confidence: <high|medium|low>
Sources:
- <URL or exact search term> - <what it proved>
Notes:
<optional, only if material>

If the answer likely does not exist, stop spinning and write not-clear or
incorrect with the best evidence.
PROMPT
```

Run it:

```bash
codex --search \
  -m gpt-5.4-mini \
  -c model_reasoning_effort=medium \
  -a never \
  exec \
  --skip-git-repo-check \
  --sandbox workspace-write \
  -C "$PWD" \
  -o .agent-docs/qa-session/001-last-message.txt \
  - < .agent-docs/qa-session/001-prompt.txt
```


Batch Dispatch


Use bounded concurrency. Start at 8 workers; raise only after observing rate limits and machine load.

```bash
cd /path/to/project
mkdir -p .agent-docs/qa-session

for q in .agent-docs/qa-session/*-question.md; do
  base="${q%-question.md}"     # e.g. .agent-docs/qa-session/001
  n="$(basename "$base")"      # numeric prefix, the join key
  prompt=".agent-docs/qa-session/${n}-prompt.txt"
  last=".agent-docs/qa-session/${n}-last-message.txt"

  # Unquoted heredoc delimiter: ${q} and ${base} expand into the prompt text.
  cat > "$prompt" <<PROMPT
You are a Codex research subagent using native Codex web search only.
Do not use MCPs, browser plugins, custom scrapers, or external research tools.
Read ${q}.
Choose exactly one status suffix from: correct, findings, incorrect, not-clear, timeout.
Write exactly one answer file: ${base}-answer-<chosen-status>.md
Do not include the angle brackets literally.
Before searching, decompose the question into 3-5 narrower sub-questions.
Search up to 50 URLs and up to 50 keyword variants if needed.
Your answer file must contain:
Question: <copy from question file>
Status: <correct|findings|incorrect|not-clear|timeout>
Answer: <single line>
Confidence: <high|medium|low>
Sources:
- <URL or exact search term> - <what it proved>
Notes:
<optional, only if material>
PROMPT

  # Launch one worker in the background; all worker output shares one log.
  (
    codex --search \
      -m gpt-5.4-mini \
      -c model_reasoning_effort=medium \
      -a never \
      exec \
      --skip-git-repo-check \
      --sandbox workspace-write \
      -C "$PWD" \
      -o "$last" \
      - < "$prompt"
  ) >> .agent-docs/qa-session/_dispatch.log 2>&1 &

  # Throttle: block while 8 workers are still running.
  while [ "$(jobs -pr | wc -l | tr -d ' ')" -ge 8 ]; do
    sleep 5
  done
done

# Wait for the remaining workers to finish.
wait
```
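
When a batch is re-run after an interruption, it may help to dispatch only the questions that still lack an answer file. The sketch below assumes that behavior is wanted; the original loop dispatches every question file. `pending_questions` is a hypothetical helper.

```shell
# Hypothetical helper: print question files that have no answer file yet.
pending_questions() {
  dir="$1"
  for q in "$dir"/*-question.md; do
    [ -e "$q" ] || continue            # glob matched nothing
    base="${q%-question.md}"
    found=0
    for a in "$base"-answer-*.md; do
      if [ -e "$a" ]; then
        found=1
        break
      fi
    done
    [ "$found" -eq 1 ] || printf '%s\n' "$q"
  done
}

# Intended use:
#   pending_questions .agent-docs/qa-session
```

Its output is the list a retry pass would iterate over instead of the full `*-question.md` glob.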


Progress Loop


During long runs, report progress from filenames:

```bash
# Run this in the same shell that launched the workers: `jobs` only sees
# background jobs started by the current shell.
delay=60
while jobs -pr | grep -q .; do
  sleep "$delay"
  echo "[progress] Checking answer files..."
  ls .agent-docs/qa-session/*-answer-*.md 2>/dev/null | sort || true
  echo "[progress] Pending questions:"
  for q in .agent-docs/qa-session/*-question.md; do
    base="${q%-question.md}"
    ls "${base}"-answer-*.md >/dev/null 2>&1 || echo "PENDING: $q"
  done
  # Back off: 60s, 120s, 240s, 480s, then every 600s.
  case "$delay" in
    60) delay=120 ;;
    120) delay=240 ;;
    240) delay=480 ;;
    *) delay=600 ;;
  esac
done
```
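
The polling interval in the loop above doubles from 60 seconds and then settles at 600. Isolated into a function (the name `next_delay` is hypothetical), the schedule is easy to check on its own:

```shell
# Hypothetical helper: same backoff schedule as the progress loop.
next_delay() {
  case "$1" in
    60)  echo 120 ;;
    120) echo 240 ;;
    240) echo 480 ;;
    *)   echo 600 ;;
  esac
}

delay=60
for i in 1 2 3 4 5; do
  printf '%s ' "$delay"
  delay="$(next_delay "$delay")"
done
echo
# prints: 60 120 240 480 600
```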

If a subagent times out, it writes `NNN-answer-timeout.md` and the orchestrator moves on. Missing files are not an acceptable terminal state.


Gap Check and Summary


Find unresolved questions:

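```bash
for q in .agent-docs/qa-session/*-question.md; do
  base="${q%-question.md}"
  # Print only questions that have no answer file; quote the prefix so
  # paths with spaces do not split.
  ls "${base}"-answer-*.md >/dev/null 2>&1 || echo "MISSING: $q"
done
```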

Retry missing answers, `timeout`, `not-clear`, and any answer that used only one search angle.

Build the parseable index after all workers settle:

```bash
out=.agent-docs/qa-session/_summary.tsv
printf 'id\tstatus\tanswer\tconfidence\tfile\n' > "$out"

for q in .agent-docs/qa-session/*-question.md; do
  base="${q%-question.md}"
  id="$(basename "$base")"
  answer_file="$(ls -t "${base}"-answer-*.md 2>/dev/null | head -1)"
  if [ -z "$answer_file" ]; then
    printf '%s\tmissing\t\t\t\n' "$id" >> "$out"
    continue
  fi
  answer_status="$(basename "$answer_file" | sed -E 's/^[0-9]+-answer-(.*)\.md$/\1/')"
  answer="$(grep -m1 '^Answer:' "$answer_file" | sed 's/^Answer:[[:space:]]*//')"
  confidence="$(grep -m1 '^Confidence:' "$answer_file" | sed 's/^Confidence:[[:space:]]*//')"
  printf '%s\t%s\t%s\t%s\t%s\n' "$id" "$answer_status" "$answer" "$confidence" "$answer_file" >> "$out"
done
```

Before reporting done:

```bash
echo "questions: $(ls .agent-docs/qa-session/*-question.md | wc -l | tr -d ' ')"
echo "answers:   $(ls .agent-docs/qa-session/*-answer-*.md 2>/dev/null | wc -l | tr -d ' ')"
grep -E $'\t(timeout|not-clear|missing)\t' .agent-docs/qa-session/_summary.tsv || true
grep -L '^Answer:' .agent-docs/qa-session/*-answer-*.md 2>/dev/null || true
```
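
A per-status tally can be read straight from the summary file. The sketch below is a hypothetical helper named `status_counts`; it assumes the five-column layout written above (id, status, answer, confidence, file) with a header row.

```shell
# Hypothetical helper: count rows per status in a summary TSV.
# Column 2 is the status; the first row is the header and is skipped.
status_counts() {
  awk -F '\t' 'NR > 1 { n[$2]++ } END { for (s in n) printf "%s\t%d\n", s, n[s] }' "$1" | sort
}

# Intended use:
#   status_counts .agent-docs/qa-session/_summary.tsv
```

The output is one `status<TAB>count` line per status, sorted by status name, which maps directly onto the counts the final report needs.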


Completion Rules


- Treat answer files as the artifacts; do not rely on subagent final messages.
- Regenerate `_summary.tsv` after late retries finish.
- Treat answer files without `Answer:` or `Confidence:` as malformed and retry with the full answer template in the prompt.
- If multiple answer files exist for one question, choose the newest only after checking whether an older answer has better sources.
- Report counts: question files, answer files, missing files, timeout/not-clear files, and deliberate unresolved rows.
- For non-trivial batches, run a fresh verifier that only reads `.agent-docs/qa-session/` and checks file presence, answer templates, summary consistency, and unresolved statuses.
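
One piece of such a verifier, the malformed-answer check, can be sketched in a few lines. `malformed_answers` is a hypothetical helper and covers only the `Answer:` and `Confidence:` rule; the file-presence and summary-consistency checks are left out.

```shell
# Hypothetical verifier step: print answer files that lack a required field.
malformed_answers() {
  dir="$1"
  for f in "$dir"/*-answer-*.md; do
    [ -e "$f" ] || continue            # glob matched nothing
    for field in 'Answer:' 'Confidence:'; do
      if ! grep -q "^$field" "$f"; then
        printf '%s\tmissing %s\n' "$f" "$field"
      fi
    done
  done
}

# Intended use:
#   malformed_answers .agent-docs/qa-session
```

Each output line names a file and the field it lacks, so the list can be fed back into a retry pass.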
