search-it-bulk-by-codex


Run large batches of small factual research questions with native Codex only: `codex exec`, built-in `--search`, and filesystem artifacts. Do not use MCPs, browser plugins, custom scrapers, or external research tools.

Default policy:

```text
Subagent model: gpt-5.4-mini
Subagent reasoning: medium
Orchestrator model: gpt-5.4
Orchestrator reasoning: medium
Search: codex --search
Session root: .agent-docs/qa-session/
```

Never leave reasoning unset. Every `codex` command in this workflow includes `-c model_reasoning_effort=medium`.

Sanity Check

Run these before dispatching a batch:

```bash
codex --help
codex exec --help
```

`codex --help` must show `exec`, `--search`, `-m/--model`, `-c/--config`, `-a/--ask-for-approval`, `-s/--sandbox`, and `-C/--cd`.

`codex exec --help` must show `--skip-git-repo-check`, `--output-last-message` / `-o`, `--json`, and `--sandbox`.

Important CLI quirks:
  • `--search` is global; put it before `exec`.
  • `--ask-for-approval` is global; use `-a never` before `exec`.
  • Use `--skip-git-repo-check` outside git repos.
  • Use `--sandbox workspace-write` when the agent must write answer files.
  • `-o` captures the final message only; the answer file is the durable artifact.
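The help-output requirements above can be checked mechanically instead of by eye. The following is a minimal sketch, not part of the original workflow: `missing_flags` is a hypothetical helper name, and it only checks that each flag appears as a whole word in the help text, so it cannot tell which subcommand a flag belongs to.

```bash
#!/usr/bin/env bash
# missing_flags HELP_TEXT FLAG...
# Hypothetical helper: print every FLAG that does not appear in HELP_TEXT.
# An empty result means all required flags were found.
missing_flags() {
  local help_text="$1" flag
  shift
  for flag in "$@"; do
    # -F: fixed string, -w: whole word, -q: quiet.
    # "--" stops option parsing so a pattern like "--search" is not read as an option.
    printf '%s\n' "$help_text" | grep -qFw -- "$flag" || printf '%s\n' "$flag"
  done
}

# Usage against the real CLI (requires codex on PATH):
#   missing_flags "$(codex --help 2>&1)" \
#     exec --search --model --config --ask-for-approval --sandbox --cd
#   missing_flags "$(codex exec --help 2>&1)" \
#     --skip-git-repo-check --output-last-message --json --sandbox
```

If either call prints anything, treat the installed CLI as incompatible with this workflow until the missing flag is accounted for.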
Verified native web check, run 2026-05-22:

```bash
codex --search \
  -m gpt-5.4-mini \
  -c model_reasoning_effort=medium \
  -a never \
  exec \
  --skip-git-repo-check \
  --sandbox read-only \
  -o /tmp/codex-web-check.txt \
  "Use native live web search. What is the headline and URL of the newest item currently listed on the OpenAI News page? Answer in one sentence with the date if shown."
```

The run reported `model: gpt-5.4-mini` and `reasoning effort: medium`, and used web search. The final answer was the OpenAI News item "OpenAI named a Leader in enterprise coding agents by Gartner" at https://openai.com/index/gartner-2026-agentic-coding-leader/, dated 2026-05-22. Re-run this check in the target environment because auth, feature flags, and network policy can differ.

File Protocol

Use one directory for the batch:

```bash
mkdir -p .agent-docs/qa-session
```

Question files are written by the orchestrator:

```text
.agent-docs/qa-session/001-question.md
.agent-docs/qa-session/002-question.md
```

Answer files are written by subagents:

```text
.agent-docs/qa-session/001-answer-correct.md
.agent-docs/qa-session/002-answer-not-clear.md
```

Valid status suffixes:

| Suffix | Use when |
| --- | --- |
| `correct` | Answer is confirmed and high-confidence |
| `findings` | Useful partial results, not fully conclusive |
| `incorrect` | Initial assumption was wrong; explain why |
| `not-clear` | Conflicting sources or insufficient signal |
| `timeout` | Search exhausted time/URL budget without resolution |

If an incoming plan uses `NNNq.md` or `NNNa-{status}.md`, normalize it to `NNN-question.md` and `NNN-answer-{status}.md`. Keep the numeric prefix stable; it is the join key.
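The normalization rule can be applied with a short rename pass. This is a sketch under assumptions: `normalize_names` is a hypothetical helper, it requires bash for `[[ =~ ]]`, and it uses `mv -n` so an existing target file is never overwritten.

```bash
#!/usr/bin/env bash
# normalize_names DIR
# Hypothetical helper: rename NNNq.md -> NNN-question.md and
# NNNa-STATUS.md -> NNN-answer-STATUS.md, leaving the numeric prefix unchanged.
normalize_names() {
  local dir="$1" f name
  for f in "$dir"/*.md; do
    [ -e "$f" ] || continue          # glob matched nothing
    name="$(basename "$f")"
    if [[ "$name" =~ ^([0-9]+)q\.md$ ]]; then
      mv -n -- "$f" "$dir/${BASH_REMATCH[1]}-question.md"
    elif [[ "$name" =~ ^([0-9]+)a-(.+)\.md$ ]]; then
      mv -n -- "$f" "$dir/${BASH_REMATCH[1]}-answer-${BASH_REMATCH[2]}.md"
    fi                               # already-normalized names fall through untouched
  done
}

# Usage:
#   normalize_names .agent-docs/qa-session
```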
Question template:

```md
# 001 Question

Question: <one focused sentence>
Context:
- <known facts>
- <false positives or traps>
Expected answer:
- <single line, yes/no, path, package name, year, owner, etc.>
```

Answer template:

```md
# 001 Answer

Question: <copy from 001-question.md>
Status: <correct|findings|incorrect|not-clear|timeout>
Answer: <single line, value, path, yes/no, or concise sentence>
Confidence: <high|medium|low>
Sources:
- <URL or exact search term> - <what it proved>
Notes: <Optional. Only include material interpretation notes.>
```

Keep answers short. Orchestrator context is finite; verbose answers pollute the merge loop.

Fan-Out Rule

A single broad search is incomplete. Every subagent must:
  1. Decompose the question into 3-5 narrower sub-questions.
  2. Search each sub-question independently with distinct keyword sets.
  3. Cross-reference findings; agreement increases confidence, conflict triggers a targeted follow-up search.
  4. Synthesize only after the sub-searches are done.

Budget is a ceiling, not a quota:

```text
Search up to 50 URLs if needed.
Search up to 50 keyword/search-term variants if needed.
Use fewer when the evidence is already strong.
```

Example: for "Does library X support feature Y?", do not only search that sentence. Fan out:

```text
library X changelog feature Y
library X GitHub issues feature Y
library X documentation Y API
library X Y workaround OR alternative
library X Y release notes
```

Prioritize official docs, changelogs, source repos, issue trackers, package indexes, platform records, and archived primary pages. Forums and blogs are supporting evidence. Snippets are leads, not proof.
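When an orchestrator wants to seed question files with suggested search angles, the fan-out example can be turned into a small template. This is an illustrative sketch only: `fan_out_queries` is a hypothetical helper, and its five angles mirror the example above rather than any fixed rule. Subagents are still expected to pick their own angles per question.

```bash
#!/usr/bin/env bash
# fan_out_queries LIBRARY FEATURE
# Hypothetical helper: print five distinct search angles, one per line.
fan_out_queries() {
  local lib="$1" feature="$2"
  printf '%s\n' \
    "$lib changelog $feature" \
    "$lib GitHub issues $feature" \
    "$lib documentation $feature API" \
    "$lib $feature workaround OR alternative" \
    "$lib $feature release notes"
}

# Usage:
#   fan_out_queries "library X" "feature Y"
```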

Subagent Prompt

Use one prompt file per question:

```bash
cat > .agent-docs/qa-session/001-prompt.txt <<'PROMPT'
You are a Codex research subagent using native Codex web search only.
Do not use MCPs, browser plugins, custom scrapers, or external research tools.

Read .agent-docs/qa-session/001-question.md.
Choose exactly one status suffix from: correct, findings, incorrect,
not-clear, timeout.
Write exactly one answer file:
.agent-docs/qa-session/001-answer-<chosen-status>.md
Do not include the angle brackets literally.

Before searching, decompose the question into 3-5 narrower sub-questions. Do
not search the top-level question directly as your only search. Search up to
50 URLs and up to 50 keyword variants if needed; you may need fewer. Prioritize
the highest-signal search angles based on your own experience.

Use the required answer template. Keep it concise. No preamble, no extra
commentary, no markdown headers beyond the template.

Your answer file must contain these fields:
Question: <copy from question file>
Status: <chosen status>
Answer: <single line>
Confidence: <high|medium|low>
Sources:
- <URL or exact search term> - <what it proved>
Notes:
<optional, only if material>

If the answer likely does not exist, stop spinning and write not-clear or
incorrect with the best evidence.
PROMPT
```

Run it:

```bash
codex --search \
  -m gpt-5.4-mini \
  -c model_reasoning_effort=medium \
  -a never \
  exec \
  --skip-git-repo-check \
  --sandbox workspace-write \
  -C "$PWD" \
  -o .agent-docs/qa-session/001-last-message.txt \
  - < .agent-docs/qa-session/001-prompt.txt
```

Batch Dispatch

Use bounded concurrency. Start at 8 workers; raise only after observing rate limits and machine load.

```bash
cd /path/to/project
mkdir -p .agent-docs/qa-session

for q in .agent-docs/qa-session/*-question.md; do
  base="${q%-question.md}"   # path without the -question.md suffix
  n="$(basename "$base")"    # numeric prefix, the join key
  prompt=".agent-docs/qa-session/${n}-prompt.txt"
  last=".agent-docs/qa-session/${n}-last-message.txt"

  # Unquoted heredoc delimiter: ${q} and ${base} expand into the prompt text.
  cat > "$prompt" <<PROMPT
You are a Codex research subagent using native Codex web search only.
Do not use MCPs, browser plugins, custom scrapers, or external research tools.
Read ${q}.
Choose exactly one status suffix from: correct, findings, incorrect, not-clear, timeout.
Write exactly one answer file: ${base}-answer-<chosen-status>.md
Do not include the angle brackets literally.
Before searching, decompose the question into 3-5 narrower sub-questions.
Search up to 50 URLs and up to 50 keyword variants if needed.
Your answer file must contain:
Question: <copy from question file>
Status: <correct|findings|incorrect|not-clear|timeout>
Answer: <single line>
Confidence: <high|medium|low>
Sources:
- <URL or exact search term> - <what it proved>
Notes:
<optional, only if material>
PROMPT

  # Launch one worker in the background; all workers append to a shared log.
  (
    codex --search \
      -m gpt-5.4-mini \
      -c model_reasoning_effort=medium \
      -a never \
      exec \
      --skip-git-repo-check \
      --sandbox workspace-write \
      -C "$PWD" \
      -o "$last" \
      - < "$prompt"
  ) >> .agent-docs/qa-session/_dispatch.log 2>&1 &

  # Gate: block while 8 or more workers are still running.
  while [ "$(jobs -pr | wc -l | tr -d ' ')" -ge 8 ]; do
    sleep 5
  done
done

wait   # let the last workers finish
```

Progress Loop

During long runs, report progress from filenames. The loop relies on `jobs`, which only sees background jobs of the current shell, so run it in the shell that launched the workers (for example, in place of the final `wait`):

```bash
delay=60
while jobs -pr | grep -q .; do
  sleep "$delay"
  echo "[progress] Checking answer files..."
  ls .agent-docs/qa-session/*-answer-*.md 2>/dev/null | sort || true
  echo "[progress] Pending questions:"
  for q in .agent-docs/qa-session/*-question.md; do
    base="${q%-question.md}"
    ls "${base}"-answer-*.md >/dev/null 2>&1 || echo "PENDING: $q"
  done
  # Back off: 60s, 120s, 240s, 480s, then every 600s.
  case "$delay" in
    60) delay=120 ;;
    120) delay=240 ;;
    240) delay=480 ;;
    *) delay=600 ;;
  esac
done
```

If a subagent times out, it writes `NNN-answer-timeout.md` and the orchestrator moves on. Missing files are not an acceptable terminal state.

Gap Check and Summary

Find unresolved questions:

```bash
for q in .agent-docs/qa-session/*-question.md; do
  base="${q%-question.md}"
  # Quote the prefix so paths with spaces survive; discard the listing and report only gaps.
  ls "${base}"-answer-*.md >/dev/null 2>&1 || echo "MISSING: $q"
done
```

Retry missing answers, `timeout`, `not-clear`, and any answer that used only one search angle.
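The filename-based part of the retry rule can be sketched as a single pass over the session directory. `retry_candidates` is a hypothetical helper, not part of the original workflow. It covers missing, `timeout`, and `not-clear` only; detecting answers that used a single search angle requires reading the `Sources:` section and is left out here.

```bash
#!/usr/bin/env bash
# retry_candidates DIR
# Hypothetical helper: print "<id> <reason>" for each question that needs another pass.
retry_candidates() {
  local dir="$1" q base id f status found
  for q in "$dir"/*-question.md; do
    [ -e "$q" ] || continue              # glob matched nothing
    base="${q%-question.md}"
    id="$(basename "$base")"
    found=0
    for f in "$base"-answer-*.md; do
      [ -e "$f" ] || continue
      found=1
      status="${f##*-answer-}"           # strip everything through "-answer-"
      status="${status%.md}"             # strip the extension
      case "$status" in
        timeout|not-clear) echo "$id $status" ;;
      esac
    done
    [ "$found" -eq 1 ] || echo "$id missing"
  done
}

# Usage:
#   retry_candidates .agent-docs/qa-session
```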
Build the parseable index after all workers settle:

```bash
out=.agent-docs/qa-session/_summary.tsv
printf 'id\tstatus\tanswer\tconfidence\tfile\n' > "$out"

for q in .agent-docs/qa-session/*-question.md; do
  base="${q%-question.md}"
  id="$(basename "$base")"
  answer_file="$(ls -t "${base}"-answer-*.md 2>/dev/null | head -1)"
  if [ -z "$answer_file" ]; then
    printf '%s\tmissing\t\t\t\n' "$id" >> "$out"
    continue
  fi
  answer_status="$(basename "$answer_file" | sed -E 's/^[0-9]+-answer-(.*)\.md$/\1/')"
  answer="$(grep -m1 '^Answer:' "$answer_file" | sed 's/^Answer:[[:space:]]*//')"
  confidence="$(grep -m1 '^Confidence:' "$answer_file" | sed 's/^Confidence:[[:space:]]*//')"
  printf '%s\t%s\t%s\t%s\t%s\n' "$id" "$answer_status" "$answer" "$confidence" "$answer_file" >> "$out"
done
```

Before reporting done:

```bash
echo "questions: $(ls .agent-docs/qa-session/*-question.md | wc -l | tr -d ' ')"
echo "answers:   $(ls .agent-docs/qa-session/*-answer-*.md 2>/dev/null | wc -l | tr -d ' ')"
grep -E $'\t(timeout|not-clear|missing)\t' .agent-docs/qa-session/_summary.tsv || true
grep -L '^Answer:' .agent-docs/qa-session/*-answer-*.md 2>/dev/null || true
```

Completion Rules

  • Treat answer files as the artifacts; do not rely on subagent final messages.
  • Regenerate `_summary.tsv` after late retries finish.
  • Treat answer files without `Answer:` or `Confidence:` as malformed and retry with the full answer template in the prompt.
  • If multiple answer files exist for one question, choose the newest only after checking whether an older answer has better sources.
  • Report counts: question files, answer files, missing files, timeout/not-clear files, and deliberate unresolved rows.
  • For non-trivial batches, run a fresh verifier that only reads `.agent-docs/qa-session/` and checks file presence, answer templates, summary consistency, and unresolved statuses.
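A read-only verifier for the template checks in the rules above might look like the sketch below. `verify_session` is a hypothetical helper. Comparing the filename suffix with the `Status:` field is an extra consistency check assumed here, not one the rules spell out. The sketch does not cover file presence per question or `_summary.tsv` consistency.

```bash
#!/usr/bin/env bash
# verify_session DIR
# Hypothetical helper: print one line per problem found in answer files.
# Returns 0 when every answer file is well-formed, 1 otherwise. Never writes to DIR.
verify_session() {
  local dir="$1" f status field bad=0
  for f in "$dir"/*-answer-*.md; do
    [ -e "$f" ] || continue
    status="${f##*-answer-}"
    status="${status%.md}"
    case "$status" in
      correct|findings|incorrect|not-clear|timeout) ;;
      *) echo "BAD-STATUS: $f"; bad=1 ;;
    esac
    grep -q '^Answer:' "$f" || { echo "NO-ANSWER: $f"; bad=1; }
    grep -q '^Confidence:' "$f" || { echo "NO-CONFIDENCE: $f"; bad=1; }
    # Assumed extra check: the Status: field should match the filename suffix.
    field="$(grep -m1 '^Status:' "$f" \
      | sed -e 's/^Status:[[:space:]]*//' -e 's/[[:space:]]*$//')"
    [ "$field" = "$status" ] || { echo "STATUS-MISMATCH: $f"; bad=1; }
  done
  return "$bad"
}

# Usage:
#   verify_session .agent-docs/qa-session || echo "retry the files listed above"
```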