openclaw-github-dedupe

Issue/PR Cluster Deduper

Use this skill when a cluster of GitHub issues and pull requests has been reported for a common failure mode (Slack, iMessage, support threads, or manual list), and you need an evidence-based dedupe recommendation or execution.

Purpose

Provide a consistent, evidence-driven triage pass for issue and PR clusters so duplicate work is folded, contributor credit is preserved, and cleanup actions stay auditable.

When to use

- You have a cluster of suspected duplicates to classify as canonical/related/independent.
- You need a concise dedupe plan with explicit credit and rationale before taking action.
- The user asked to execute safe closure/label/comment steps (or return a vetted plan only).
- You need to avoid duplicate PR churn by identifying which change should stay canonical.

Inputs

- `cluster_refs` (required): list of issue/PR references as IDs or URLs.
- `mode` (optional): `plan` (default) or `execute`.
- `channel` (optional): source context like `slack`, `discord`, `support`, etc.
- `repo` (optional): explicit `owner/repo` when not using current checkout.
- `canonical_hint` (optional): explicit preference when ambiguity exists.
- `merge_guard` (optional): `high|medium` mergeability strictness; default `high`.
- `max_changed_files_for_canonical` (optional): default `30`.
- `max_delta_lines_for_canonical` (optional): default `2500`.
- `min_greptile_score` (optional): default `65` when available.
- `body_noise_mode` (optional): `strict|medium` for junk body tolerance; default `medium`.
- `reuse_copy_detection` (optional): `off|on` with default `on` for bot/copied-work checks.
- `bot_author_pattern` (optional): list of substrings for suspicious authors.
- `merge_tool_pref` (optional): `auto`, `gh`, `merge-skill`, or `land-skill`; default `auto`.
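
As a rough illustration, the documented defaults could be captured in a shell environment as sketched below. The `DEDUPE_*` variable names and the `is_allowed` / `validate_inputs` helpers are hypothetical (the skill does not define them); only the default values and allowed choices come from the input list.

```shell
#!/usr/bin/env bash
# Hypothetical DEDUPE_* variables mirroring the documented inputs.
# Only the default values and allowed choices come from the input list.
: "${DEDUPE_MODE:=plan}"
: "${DEDUPE_MERGE_GUARD:=high}"
: "${DEDUPE_MAX_CHANGED_FILES:=30}"
: "${DEDUPE_MAX_DELTA_LINES:=2500}"
: "${DEDUPE_MIN_GREPTILE_SCORE:=65}"
: "${DEDUPE_BODY_NOISE_MODE:=medium}"
: "${DEDUPE_REUSE_COPY_DETECTION:=on}"
: "${DEDUPE_MERGE_TOOL_PREF:=auto}"

# Return 0 when the first argument equals one of the remaining arguments.
is_allowed() {
  local value="$1"; shift
  local choice
  for choice in "$@"; do
    [ "$value" = "$choice" ] && return 0
  done
  return 1
}

# Return 0 only when every enumerated input holds a documented choice.
validate_inputs() {
  is_allowed "$DEDUPE_MODE" plan execute || return 1
  is_allowed "$DEDUPE_MERGE_GUARD" high medium || return 1
  is_allowed "$DEDUPE_BODY_NOISE_MODE" strict medium || return 1
  is_allowed "$DEDUPE_REUSE_COPY_DETECTION" off on || return 1
  is_allowed "$DEDUPE_MERGE_TOOL_PREF" auto gh merge-skill land-skill || return 1
}
```

Calling `validate_inputs` before any other step would reject a typo such as `mode=exec` early, before any GitHub command runs.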

Workflow

Fetch cluster evidence.
- For PRs: `gh pr view <ref> --json number,title,body,state,author,labels,createdAt,updatedAt,mergedAt,closedAt,mergeable,mergeStateStatus,isDraft,changedFiles,additions,deletions,statusCheckRollup,commits,url`
- For issues: `gh issue view <ref> --json number,title,body,state,labels,author,createdAt,updatedAt,url,comments`
- Pull quick file scope context if needed: `gh pr diff <ref> --name-only`
- Pull status checks: `gh pr checks <ref>`
- Pull review/AI-tool comments for score hints: `gh issue view <ref> --json comments`
- Record common bot/copied-work signals:
  - `author.login` ending with `[bot]` or matching `bot_author_pattern`.
  - PR body that advertises generated/copied content (`generated by`, `copy/pasted`, `cherry-picked from`, etc.).
  - Commit messages that indicate lifted or mechanical work without problem-specific reasoning.
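
The bot/copied-work signals above can be sketched as two small shell predicates. The function names are hypothetical, and the extra arguments to `is_bot_author` stand in for the `bot_author_pattern` substrings; this is an illustration, not the skill's actual detection logic.

```shell
#!/usr/bin/env bash
# Return 0 if the login looks like a bot: it ends with "[bot]" or contains
# any of the extra substrings (standing in for bot_author_pattern).
is_bot_author() {
  local login="$1"; shift
  case "$login" in
    *"[bot]") return 0 ;;
  esac
  local pattern
  for pattern in "$@"; do
    [ -n "$pattern" ] || continue
    case "$login" in
      *"$pattern"*) return 0 ;;
    esac
  done
  return 1
}

# Return 0 if the PR body advertises generated or copied content,
# using the three example phrases from the list above (case-insensitive).
body_mentions_copied_work() {
  printf '%s' "$1" | grep -qiE 'generated by|copy/pasted|cherry-picked from'
}
```

The login could be supplied from the evidence already fetched, for example `gh pr view <ref> --json author --jq .author.login`.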

Normalize signals into a guardrail matrix.
- Mergeability signal: mergeable state, draft status, merge-state blockers, and check conclusions.
- Churn signal: `changedFiles`, `additions`, and `deletions`.
- Body hygiene signal: body length, meaningful root-cause description, and presence of placeholder/junk patterns (e.g. `WIP`, `TODO`, blank text, automation template only, no concrete impact).
- AI review signal: Greptile or equivalent score-like comments. If score is present and below threshold, downgrade confidence.
- Source integrity signal: bot/copied-work risk and commit provenance.
  - Any PR matched by bot/copied-work signals is marked `review-heavy` and cannot become canonical without explicit manual confirmation.

Run full review-on-candidate for likely canonical PR(s).
- For the top canonical candidate(s), perform a complete pass:
  - `gh pr view <id> --json files` and `gh pr diff <id>`.
  - `gh pr checks <id>` and ensure required checks are passing or have explicit acceptable exceptions.
  - Quick duplicate/conflict scan against canonical cluster references via file overlap and root-cause text.
- If mergeability and hygiene are clean, treat this as the "winning PR" candidate and move to merge-prep.

Choose canonical outputs.
- Prefer merged + validated PRs with mergeability and a strong scope match.
- If a newer item is a strict superset and still merge-safe, prefer the newer item.
- If two items have equivalent scope, keep the most stable/least conflicting anchor.
- If uncertainty is high, do not close anything; return a report of open questions.

Apply hard guardrails before assigning `canonical` or `duplicate`.
- Mergeability hard-stop:
  - Reject as canonical if `isDraft` is true.
  - Reject as canonical if `mergeable` is false or `mergeStateStatus` indicates blocking.
  - If checks are failing, mark as `manual-review-required` unless user explicitly wants forced action.
- Churn hard-stop:
  - If `changedFiles > max_changed_files_for_canonical` or `additions + deletions > max_delta_lines_for_canonical`, do not auto-close other PRs as duplicate.
  - Treat as related/review-needed even if overlap appears strong.
- Body hygiene hard-stop:
  - For empty/junk bodies or no concrete root cause, classify confidence as low and require explicit manual confirmation before duplicate actions.
- AI review hard-stop:
  - If a recognized score (for example Greptile) is below `min_greptile_score`, require manual confirmation before duplicates and avoid auto-canonicalization.
- Bot/copy hard-stop:
  - If author appears as bot/auto-generated and work appears copied with low evidence of original validation, classify as `manual-review-required`.
  - Do not suppress original human credit; keep attribution to originator and include a provenance note.
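
The mergeability and churn hard-stops lend themselves to a mechanical check. The sketch below is one possible shape, under two labeled assumptions: that `gh` reports `mergeable` as an enum string such as `MERGEABLE`, `CONFLICTING`, or `UNKNOWN` rather than a boolean, and that `BLOCKED` and `DIRTY` are the merge states treated as blocking. The function name and the `MAX_FILES` / `MAX_DELTA` variables are hypothetical.

```shell
#!/usr/bin/env bash
# Print one blocker per line for a canonical candidate; print nothing if clean.
# Arguments: isDraft mergeable mergeStateStatus changedFiles additions deletions
# Optional environment: MAX_FILES (default 30), MAX_DELTA (default 2500).
canonical_blockers() {
  local is_draft="$1" mergeable="$2" merge_state="$3"
  local changed_files="$4" additions="$5" deletions="$6"
  local max_files="${MAX_FILES:-30}" max_delta="${MAX_DELTA:-2500}"
  local delta=$((additions + deletions))

  [ "$is_draft" = "true" ] && echo "mergeability: draft"
  # Assumption: anything other than MERGEABLE (including UNKNOWN) is a blocker.
  [ "$mergeable" = "MERGEABLE" ] || echo "mergeability: mergeable=$mergeable"
  # Assumption: BLOCKED and DIRTY are the blocking merge states.
  case "$merge_state" in
    BLOCKED|DIRTY) echo "mergeability: mergeStateStatus=$merge_state" ;;
  esac
  [ "$changed_files" -gt "$max_files" ] && echo "churn: changedFiles=$changed_files"
  [ "$delta" -gt "$max_delta" ] && echo "churn: delta=$delta"
  return 0
}
```

An empty result means these two guardrails pass; the body hygiene, AI review, and bot/copy hard-stops still need their own evaluation before an item is marked canonical.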

Map outcomes.
- Canonical issue/PR: keep open.
- Duplicate PRs/issues: close with explicit duplicate rationale and credit.
- Winning PR that passes all guardrails: perform final review and help merge.
- Related but not duplicate: keep open and add explicit relationship note.
- Unrelated: split into separate routing plan.
- For any hard-stop failures, return `manual-review-required` with blockers listed.

Merge and OpenClaw follow-up.
- If the user asked for `mode=execute` and the winning PR is green:
  - Use `merge_tool_pref`:
    - `merge-skill` / `land-skill`: invoke the local merge/land workflow/skill if available.
    - `gh`: run `gh pr merge <id> --auto --merge` (or `--squash`, `--rebase` based on repo policy).
    - `auto`: prefer the merge/land skill first if detected; otherwise fall back to `gh` merge.
  - Re-check checks after merge and report final state.
- If the canonical PR is merge-ready but not yet merged, add/verify the OpenClaw changelog entry in that PR:
  - Check presence of `CHANGELOG.md`.
  - If an entry exists, do not duplicate it; otherwise add a concise bullet under the current unreleased `Fixes` section with PR number + credits.
  - Prefer including this within the canonical PR before merge.
  - If not possible, create a follow-up squash commit after merge to add the changelog entry.
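
One way to picture the `merge_tool_pref` resolution is sketched below. Both function names are hypothetical. How a local merge/land skill is detected is left to the caller, and falling back to `gh` when a requested skill is unavailable is an assumption, not something the workflow specifies.

```shell
#!/usr/bin/env bash
# Resolve merge_tool_pref to "skill" or "gh".
# skill_available is "yes" or "no"; detection is the caller's job.
resolve_merge_tool() {
  local pref="${1:-auto}" skill_available="${2:-no}"
  case "$pref" in
    gh) echo gh ;;
    merge-skill|land-skill|auto)
      # Assumption: fall back to gh when no local skill is available.
      if [ "$skill_available" = "yes" ]; then echo skill; else echo gh; fi ;;
    *) echo "unknown merge_tool_pref: $pref" >&2; return 1 ;;
  esac
}

# Print the gh merge command for a PR id and strategy (merge, squash, rebase).
build_merge_command() {
  local pr_id="$1" strategy="${2:-merge}"
  case "$strategy" in
    merge|squash|rebase) ;;
    *) echo "unknown merge strategy: $strategy" >&2; return 1 ;;
  esac
  echo "gh pr merge $pr_id --auto --$strategy"
}
```

Printing the command rather than running it keeps the helper usable in both `plan` and `execute` modes.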

Run boundary check.
- If `mode=plan`, return a precise command plan and message drafts with confidence and blocker list.
- If `mode=execute`, run GH commands only for items with no blockers; otherwise stop and report.

Emit results with explicit evidence and attribution.
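
The boundary check can be sketched as a thin wrapper around every mutating command. `run_or_plan` is a hypothetical helper name; the `PLAN:` and `STOP` prefixes are illustrative output formats, not part of the skill.

```shell
#!/usr/bin/env bash
# Run a command only in execute mode with no blockers; otherwise print it.
# Arguments: mode blockers command...
run_or_plan() {
  local mode="$1" blockers="$2"; shift 2
  if [ "$mode" = "execute" ] && [ -z "$blockers" ]; then
    "$@"
  elif [ "$mode" = "execute" ]; then
    # Blockers present: report and refuse to mutate anything.
    echo "STOP (blockers: $blockers): $*"
    return 2
  else
    echo "PLAN: $*"
  fi
}
```

For example, `run_or_plan "$mode" "$blockers" gh pr close <id> --comment "..."` prints the command in `plan` mode and only runs it in `execute` mode when the blocker list is empty.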

Outputs

- A canonicality assessment for the cluster with evidence bullets.
- A `plan` table: item, role (canonical/duplicate/related/unrelated), rationale, and confidence.
- A risk register per item for: `mergeability`, `churn`, `body_hygiene`, `ai_review_score`.
- A merge-readiness checklist for each candidate (checks, approvals, conflicts, branch state, policy alignment).
- For chosen canonical PR: explicit merge command used, merge result, and OpenClaw changelog action status.
- Draft close/keep messages containing contributor credits.
- Command list for labels/comments/close steps (and execution status when run).
- Explicit escalation notes for manual review and uncertainty.

Message templates

Canonical PR credit line

This final fix is in #{canonical_pr} by @{canonical_author}. Earlier related work is credited to #{prior_pr} by @{prior_author}; thanks for that groundwork.

Close duplicate PR

Thanks for the earlier contribution. Closing as a duplicate of #{canonical_pr}. Your PR covered part of the same root cause and is credited in the canonical fix. If this is a mistake, please tell me and we'll reopen a review path.

Keep canonical issue open

Keeping this open as canonical for this {channel}-specific failure cluster.

Close duplicate issue

Closing as duplicate of #{canonical}. This matches the same root-cause path and behavior. Credit is tracked in #{canonical_pr} with foundational work from #{prior_pr}. If this is a mistake, please tell me and we'll reopen a review path.

Related (not duplicate)

This appears related but not a duplicate: symptom path diverges at {reason}. Keeping it open separately.

Winning PR full-review summary

I reviewed the winning PR for scope, checks, and mergeability. Conflicts: {conflict_status}. Check status: {check_status}. Final action: {merged|ready|blocked}.

Unrelated (routing only)

This appears separate from this cluster and should be tracked independently.
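
Filling the placeholders can be done with plain `printf`, which avoids accidental shell expansion inside message text. The sketch below renders two of the templates above; the function names are hypothetical and the wording is copied from the templates.

```shell
#!/usr/bin/env bash
# Render the canonical PR credit line.
# Arguments: canonical_pr canonical_author prior_pr prior_author
render_credit_line() {
  printf 'This final fix is in #%s by @%s. ' "$1" "$2"
  printf 'Earlier related work is credited to #%s by @%s; thanks for that groundwork.\n' "$3" "$4"
}

# Render the "Close duplicate PR" message for a canonical PR number.
render_close_duplicate_pr() {
  printf 'Thanks for the earlier contribution. Closing as a duplicate of #%s. ' "$1"
  printf 'Your PR covered part of the same root cause and is credited in the canonical fix. '
  printf "If this is a mistake, please tell me and we'll reopen a review path.\n"
}
```

The rendered text can then be passed as the `--comment` or `--body` value of the close and comment commands.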

Action commands

Run commands from repo checkout unless explicitly using full `owner/repo` URLs.

Add labels for closed duplicates

PR/issue duplicate: `gh issue edit <id> --add-label dedupe:child --add-label close:duplicate` (repo labels vary; use `duplicate` as a fallback as needed).

Canonical issue: `gh issue edit <canonical_id> --add-label dedupe:parent`
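
Because repo labels vary, the fallback to `duplicate` can be decided from the repository's label list. The helper below is a hypothetical sketch that reads label names on stdin and prints the flags to use. Requiring both `dedupe:child` and `close:duplicate` to exist before using them is an assumption.

```shell
#!/usr/bin/env bash
# Print the --add-label flags for a closed duplicate, given the repo's
# available label names on stdin (one per line). Falls back to "duplicate"
# unless both dedupe:child and close:duplicate exist.
duplicate_label_flags() {
  local labels
  labels="$(cat)"
  if printf '%s\n' "$labels" | grep -qxF 'dedupe:child' &&
     printf '%s\n' "$labels" | grep -qxF 'close:duplicate'; then
    echo "--add-label dedupe:child --add-label close:duplicate"
  else
    echo "--add-label duplicate"
  fi
}
```

The label names could come from something like `gh label list --json name --jq '.[].name'`, piped into the helper.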

Close with comment (issues)

Comment first, then close:

`gh issue comment <id> --body "..."`

`gh issue close <id> --reason "not planned"`

Close PR

For duplicate PR: `gh pr close <id> --comment "..."`.

Guardrail helper checks

Mergeability snapshot: `gh pr view <id> --json mergeable,mergeStateStatus,isDraft,statusCheckRollup`

Churn snapshot: `gh pr view <id> --json changedFiles,additions,deletions`

Greptile/AI review scan (optional): `gh issue view <id> --json comments`

Changelog policy

- If the changelog already has an entry, do not duplicate it.
- If adding a new entry, place it under the current Unreleased `### Fixes` section with PR number and contributor credits.
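
The no-duplicate rule can be checked before editing the file. In the sketch below both function names are hypothetical, and the bullet format printed by `changelog_entry` is an assumption, since the policy only requires a PR number and contributor credits.

```shell
#!/usr/bin/env bash
# Return 0 if the changelog file already mentions the PR number.
# The trailing group stops #2098 from matching inside #20988.
changelog_has_entry() {
  local file="$1" pr="$2"
  [ -f "$file" ] && grep -qE "#${pr}([^0-9]|\$)" "$file"
}

# Print a candidate entry line. The bullet format is an assumption.
# Arguments: summary pr_number contributor_login
changelog_entry() {
  printf '%s\n' "- $1 (#$2). Thanks @$3."
}
```

A caller would add the output of `changelog_entry` under the `### Fixes` heading only when `changelog_has_entry CHANGELOG.md <pr>` fails.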

Suggested merge helpers

scripts/alias.sh helper

Run the tiny helper directly from this skill directory for repetitive cluster operations:

`./scripts/alias.sh run-cluster /path/to/cluster.txt`

Minimal `cluster.txt` format:

`<type>:<id>|<action>|<target>`

Supported actions:

- `inspect` → fetch issue/PR views, diff file list, and checks.
- `close-pr-duplicate` → close PR as duplicate of target PR.
- `close-issue-duplicate` → comment + close issue as duplicate of target issue.
- `noop` → no action.

Typical example file:

pr:20988|inspect|
pr:20377|close-pr-duplicate|20988
issue:19839|close-issue-duplicate|20337
issue:12714|noop|
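
To make the line format concrete, here is a minimal sketch of how one `cluster.txt` line could be split and validated. It is not the implementation in `scripts/alias.sh`. The `parse_cluster_line` name and the `-` shown for an empty target are hypothetical.

```shell
#!/usr/bin/env bash
# Parse one cluster.txt line of the form <type>:<id>|<action>|<target>
# and print "type id action target" (target is "-" when empty).
# Returns 1 for an unknown type or an unsupported action.
parse_cluster_line() {
  local line="$1" ref action target type id
  IFS='|' read -r ref action target <<EOF
$line
EOF
  type="${ref%%:*}"
  id="${ref#*:}"
  case "$type" in
    pr|issue) ;;
    *) return 1 ;;
  esac
  case "$action" in
    inspect|close-pr-duplicate|close-issue-duplicate|noop) ;;
    *) return 1 ;;
  esac
  echo "$type $id $action ${target:--}"
}
```

Validating each line before dispatch means a typo in the action column is reported instead of silently skipped.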

Use `ODGH_DRY_RUN=1` for dry-run mode to print command intent without making mutations. You can also use it inline:

`./scripts/alias.sh --dry-run run-cluster /path/to/cluster.txt`

For safe editing without real references, copy `scripts/cluster-example.txt`.
