/ork:swarm-migrate — Cross-Repo Migration Swarm
One command, N repos, one coordinator, one ledger.
Use when the same transformation needs to land in 3 or more repos with the same shape (workflow bump, dependency upgrade, codemod, lint-rule introduction, secret rotation, runbook header). Don't use it for one-repo work or for novel exploration; other skills cover those cases.
This skill exists because the 275-session insights showed 25 sessions burned coordinating PR cascades manually (M164 deploy-migration, M17 yg-mcp-core extraction, @v1 reusable workflow rollout across 14 repos). The pattern was always: pick a repo, branch, apply, push, watch CI, repeat. This automates the repeat.

    # swarm-specs/bump-actions-checkout-v4.yaml
    name: bump-actions-checkout-v4
    description: "Pin @actions/checkout to v4 across all repos"

    # Topology: repos in dependency order. Coordinator only proceeds
    # to a downstream repo after every upstream parent has merged green.
    repos:
      - path: ~/coding/yonatan-hq/platform
        upstream: []
      - path: ~/coding/yonatan-hq/ventures/jobscraper
        upstream: [platform]   # waits for platform to merge first

    # Transformation: applied identically per repo. The agent runs this
    # inside the isolated worktree, then verifies with the next field.
    # Note: sed -i '' is the BSD/macOS form; GNU sed takes -i with no argument.
    transform:
      type: codemod            # codemod | regex | command
      command: |
        grep -rl 'actions/checkout@v3' .github/workflows |
          xargs sed -i '' 's|actions/checkout@v3|actions/checkout@v4|g'

    # Verification: must pass before PR opens. Coordinator skips the repo
    # if it fails locally (records skip-reason in ledger).
    verify:
      - command: "git diff --quiet"
        expect: nonzero        # must have changes
      - command: "grep -r 'actions/checkout@v3' .github/workflows"
        expect: nonzero        # grep exits nonzero once no v3 references remain

    # PR shape: title, body, base branch
    pr:
      branch_prefix: chore/bump-checkout-v4
      title: "chore(ci): pin @actions/checkout to v4"
      body_file: swarm-specs/bump-actions-checkout-v4.pr.md
      base: main
      labels: [chore, ci]

    # CI gate: coordinator waits for required checks to pass before
    # moving downstream. Set to false for dry-run, or in repos without CI.
    ci_gate:
      required_checks: ["build", "test"]
      timeout_minutes: 20
      on_failure: pause        # pause | skip | abort

    max_parallel: 4
    abort_on_novel_failure: true


               ┌──────────────────────────────────┐
               │        COORDINATOR (you)         │
               │  reads spec → builds DAG →       │
               │  writes .swarm-state.json        │
               └────────────┬─────────────────────┘
                            │
         ┌──────────────────┼──────────────────┐
         ▼                  ▼                  ▼
    ┌──────────┐       ┌──────────┐       ┌──────────┐
    │ WORKER A │       │ WORKER B │       │ WORKER C │
    │ (repo 1) │       │ (repo 2) │       │ (repo 3) │
    └────┬─────┘       └────┬─────┘       └────┬─────┘
         │                  │                  │
         └─────────── isolated worktrees ──────┘
                            │  each: clone branch, transform,
                            │        verify, push, open PR,
                            │        wait for CI, report
                            ▼
     ┌─────────────────────────────────────────────┐
     │  .swarm-state.json                          │
     │  rolling ledger of {repo, status,           │
     │  pr_url, ci_state, last_action_at}          │
     └─────────────────────────────────────────────┘

Each worker is a subagent tool invocation, with one subagent type for plumbing and another for schema-flavored migrations. The coordinator (you, this skill) reads the ledger between waves and decides whether to release downstream waves or pause.
Phase 1 — Spec validation
- Every repos[].path exists and is a git repo.
- The transform.command returns 0 in a dry-run mode (or resolves to a known registered codemod).
- Every upstream reference points to a declared repo (no dangling deps).
- pr.body_file exists and is non-empty.
If any check fails, abort and print the table of failures. Do NOT proceed.
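
The static checks above could be sketched roughly as follows. This is an illustration, not the skill's actual code: it assumes the spec has already been parsed into a dict, that upstream entries name a repo by the last component of its path (as platform does in the example spec), and that a .git entry is enough to identify a git checkout. The names repo_name and validate_spec are hypothetical, and the dry-run of the transform command is left out.

```python
from pathlib import Path


def repo_name(repo: dict) -> str:
    # Assumption: repos are referenced by the last component of their path,
    # as in `upstream: [platform]` in the example spec.
    return Path(repo["path"]).expanduser().name


def validate_spec(spec: dict) -> list:
    """Return human-readable failures; an empty list means the spec passed."""
    failures = []
    repos = spec.get("repos", [])
    declared = {repo_name(r) for r in repos}

    for repo in repos:
        path = Path(repo["path"]).expanduser()
        # A `.git` entry (a directory, or a file in linked worktrees) marks a checkout.
        if not (path / ".git").exists():
            failures.append(f"{repo['path']}: not a git repo")
        for parent in repo.get("upstream", []):
            if parent not in declared:
                failures.append(f"{repo['path']}: upstream '{parent}' is not declared")

    body_file = Path(spec.get("pr", {}).get("body_file", ""))
    if not body_file.is_file() or body_file.stat().st_size == 0:
        failures.append(f"pr.body_file missing or empty: {body_file}")

    return failures
```

Collecting every failure before aborting, rather than stopping at the first, matches the instruction to print the whole table of failures.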
Phase 2 — Topology sort
Build a DAG from the upstream edges. Detect cycles → abort. Group nodes by topological wave (wave 0 = no deps, wave 1 = depends only on wave 0, …). Coordinator releases one wave at a time.
Write .swarm-state.json at the root of the repo where the skill is running:

    {
      "spec": "swarm-specs/bump-actions-checkout-v4.yaml",
      "started_at": "2026-05-16T17:00:00Z",
      "waves": [
        { "id": 0, "repos": ["platform"] },
        { "id": 1, "repos": ["jobscraper"] }
      ],
      "repos": {
        "platform":   { "status": "pending", "pr_url": null, "ci_state": null, "last_action_at": null },
        "jobscraper": { "status": "blocked", "blocked_on": ["platform"], "pr_url": null }
      }
    }

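
The wave grouping described in this phase can be sketched with a Kahn-style pass over the dependency map. This is a minimal illustration under stated assumptions: the input is a dict from repo name to its list of upstream names, and build_waves is a hypothetical name rather than anything the skill exposes.

```python
def build_waves(upstream: dict) -> list:
    """Group repos into waves: wave 0 has no deps, wave k depends only on earlier waves."""
    remaining = {repo: set(parents) for repo, parents in upstream.items()}
    done = set()
    waves = []
    while remaining:
        # A repo is ready once every one of its parents sits in an earlier wave.
        ready = sorted(r for r, parents in remaining.items() if parents <= done)
        if not ready:
            # Nothing can be released: the rest form a cycle or name an unknown parent.
            raise ValueError(f"cycle or undeclared upstream among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for repo in ready:
            del remaining[repo]
    return waves


waves = build_waves({"platform": [], "jobscraper": ["platform"]})
# waves == [["platform"], ["jobscraper"]]
```

Raising on an empty ready set covers the "detect cycles → abort" rule without a separate cycle-detection pass.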
Phase 3 — Dispatch wave
For each repo in the current wave, in parallel (bounded by max_parallel):
1. Worktree: create an isolated worktree at <repo>/../<repo>-swarm-<spec-name> off the base branch. Never mutate the live working tree.
2. Branch: git checkout -b <branch_prefix>-<short-sha>.
3. Transform: run transform.command (or apply the codemod). Capture stdout to .swarm-logs/<repo>-transform.log.
4. Verify: run each verify command and assert that its exit code matches expect. On mismatch, mark the repo skipped in the ledger with the reason; do not push.
5. Push + PR: push the branch and open the PR. Update the ledger with the PR URL.
6. Watch CI: poll every 45s up to ci_gate.timeout_minutes. Update ci_state in the ledger on every state transition.
Use the subagent tool with subagent_type: ork:git-operations-engineer for steps 1–5 to keep the main context lean. The coordinator only reads the ledger.
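
The verify step can be sketched as a small runner that compares each command's exit code with the expect field. This is an illustration only: run_verify is a hypothetical name, the checks are assumed to be the parsed verify list from the spec, and the value "zero" is inferred, since the example spec only shows "nonzero".

```python
import subprocess


def run_verify(checks: list, cwd: str) -> list:
    """Run each verify check in `cwd`; return (command, exit_code) pairs that did not match."""
    mismatches = []
    for check in checks:
        result = subprocess.run(
            check["command"],
            shell=True,  # spec commands are shell strings, e.g. "git diff --quiet"
            cwd=cwd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # Assumption: `expect` is "zero" or "nonzero"; only "nonzero" appears in the spec.
        wanted_zero = check["expect"] == "zero"
        if (result.returncode == 0) != wanted_zero:
            mismatches.append((check["command"], result.returncode))
    return mismatches
```

An empty result means the worker may push; any mismatch would be recorded in the ledger as the skip reason.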
Phase 4 — Wave gate
After every wave, check the ledger:
- All repos merged green → release the next wave.
- Any repo still in flight → keep polling.
- Any repo failed → consult ci_gate.on_failure:
  - pause → halt the swarm, write a summary, surface the failing logs, ask the user.
  - skip → mark the repo skipped, continue with siblings (but block downstream repos unless they explicitly don't depend on this repo).
  - abort → terminate the swarm, leave open PRs as-is, never merge.
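
The gate decision can be sketched as a pure function over the wave's ledger statuses. The status names used here ("merged", "skipped", "failed", anything else meaning still in flight) are assumptions for illustration, as is the name wave_gate. The sketch also makes one choice the text leaves open: a failure is acted on immediately, even if sibling repos are still in flight.

```python
SETTLED = ("merged", "skipped")  # hypothetical status names for finished repos


def wave_gate(statuses: list, on_failure: str) -> str:
    """Return the coordinator's next move: release, poll, pause, skip, or abort."""
    if on_failure not in ("pause", "skip", "abort"):
        raise ValueError(f"unknown on_failure: {on_failure!r}")
    if "failed" in statuses:
        # Defer to the spec's ci_gate.on_failure policy.
        return on_failure
    if all(s in SETTLED for s in statuses):
        return "release"
    return "poll"
```

Treating a skipped repo as settled lets its siblings' wave finish; the caller would still need to keep that repo's downstream dependents blocked.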
Phase 5 — Auto-rebase on conflicts
If a downstream repo's worker hits a merge conflict on rebase (because an upstream merged), the worker:
- Re-fetches the upstream's merge commit SHA.
- Attempts the rebase. If clean → push, ledger update.
- If conflicts → mark the conflict files in the ledger, do NOT auto-resolve, surface to the coordinator. Conflicts are the most common place auto-fixers ship broken code.
Phase 6 — Final report
When all waves complete (or the swarm pauses/aborts), emit a single markdown report under .swarm-logs/<spec-name>-report.md:

    Swarm report: bump-actions-checkout-v4

    Completed: 12/14 repos · paused: 2 · duration: 47 min

    | repo | status | PR | CI | duration |
    |---|---|---|---|---|
    | platform | merged | #3456 | green | 8 min |
    | jobscraper | merged | #281 | green | 6 min |
    | ... | | | | |
    | dormant-repo-1 | skipped | - | - | (no CI runner configured) |
    | trading-ai | paused | #99 | red | (novel failure, see logs) |

    Novel failures (escalated)

    - trading-ai #99: pyproject lockfile mismatch; see .swarm-logs/trading-ai-ci.log

- Never merge a PR. The swarm opens PRs; humans merge them. The user can arm auto-merge themselves post-swarm if they want.
- Never force-push. If a worker can't fast-forward, it pauses.
- Never roam outside the spec's declared repos, even if a transformation seems like it would help elsewhere.
- Always quarantine credentials. Workers run with the user's gh auth; the coordinator never logs tokens, just the URLs.
- Always respect existing branch protections. If a step fails because of required reviewers or other rules, that's a feature, not a bug to work around.
Failure modes you'll actually hit
| Mode | What it looks like | Mitigation |
|---|---|---|
| Stale lockfile | CI red after dependency bump | Spec includes a post_transform.command: npm install step |
| Branch protection blocks PR creation | PR creation exits non-zero | Coordinator flags the repo in the ledger, surfaces to user |
| Topology cycle | Phase 2 abort | Re-spec the upstream edges |
| Coordinator crash mid-flight | .swarm-state.json half-written | Skill is resumable: re-run with the same spec; it reads the ledger and skips repos that are already finished |
| Worker subagent hangs | No ledger update for >5 min | Coordinator times out the agent, flags the repo in the ledger, surfaces logs |
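
The hung-worker row can be sketched as a scan of the ledger for in-flight repos whose last_action_at is more than five minutes old. This is a rough illustration: stale_repos is a hypothetical name, and the sketch assumes that any status outside the ones this document shows for idle or finished repos (pending, blocked, merged, skipped, paused) means a worker is still running.

```python
from datetime import datetime, timedelta, timezone

# Statuses that appear in this document for repos with no active worker.
NOT_RUNNING = {"pending", "blocked", "merged", "skipped", "paused"}


def stale_repos(ledger: dict, now: datetime, limit: timedelta = timedelta(minutes=5)) -> list:
    """Return in-flight repos whose last ledger update is older than `limit`."""
    stale = []
    for name, entry in ledger["repos"].items():
        stamp = entry.get("last_action_at")
        if entry.get("status") in NOT_RUNNING or stamp is None:
            continue
        # The ledger example uses a trailing "Z"; normalise it for fromisoformat.
        last = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if now - last > limit:
            stale.append(name)
    return stale


# Usage: pass a timezone-aware clock reading, e.g. datetime.now(timezone.utc).
```

Passing the current time in as an argument keeps the check deterministic and easy to test.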
- Upstream: one skill to design the spec, another to ASCII-preview the DAG before dispatch.
- Downstream: one skill per repo after merge, one for an org-wide sweep, and one for when a worker hits a CI red.
- Composes with: a skill that each worker calls into, and one that bulk-updates labels/milestones post-swarm.
What this skill does NOT do
- Does not invent the spec. You write the spec; the skill executes it.
- Does not perform schema migrations across DBs (use a single-repo skill for that).
- Does not orchestrate production deploys. It opens PRs only; deploy is a separate gate (the platform's deploy-operator).
- Does not bypass the playground-gate rule: each PR body must include a playground reference if the repo enforces it.
Dry-run: build the DAG, verify spec, do NOT push or open PRs
/ork:swarm-migrate swarm-specs/bump-actions-checkout-v4.yaml --dry-run
Live: dispatch up to 4 workers in parallel
/ork:swarm-migrate swarm-specs/bump-actions-checkout-v4.yaml --max-parallel=4
Resume after pause: same command, the ledger remembers
/ork:swarm-migrate swarm-specs/bump-actions-checkout-v4.yaml
Why this exists (one paragraph)
You ran 25 sessions in a single month coordinating cross-repo PRs by hand. The 14-repo @v1 workflow rollout, the M17 yg-mcp-core extraction, the M164 deploy-migration. Every one of those sessions had the same shape: a coordinator (you) holding the DAG in your head, dispatching workers (you, sequentially) in different terminal tabs, hand-rolling a status table in your notes. This skill makes the coordinator a YAML file and the workers parallel subagents. The DAG, the ledger, the auto-rebase, the wave gating — all the bookkeeping you were doing manually — get codified once. You write the spec, you walk away, you come back to a report.