openclaw-github-dedupe

Issue/PR Cluster Deduper

Use this skill when a cluster of GitHub issues and pull requests has been reported for a common failure mode (via Slack, iMessage, support threads, or a manually compiled list) and you need an evidence-based dedupe recommendation, or need the dedupe executed.

Purpose

Provide a consistent, evidence-driven triage pass for issue and PR clusters so duplicate work is folded, contributor credit is preserved, and cleanup actions stay auditable.

When to use

  • You have a cluster of suspected duplicates to classify as canonical, duplicate, related, or unrelated.
  • You need a concise dedupe plan with explicit credit and rationale before taking action.
  • The user asked to execute safe closure/label/comment steps (or return a vetted plan only).
  • You need to avoid duplicate PR churn by identifying which change should stay canonical.

Inputs

  • cluster_refs (required): list of issue/PR references as IDs or URLs.
  • mode (optional): plan (default) or execute.
  • channel (optional): source context like slack, discord, support, etc.
  • repo (optional): explicit owner/repo when not using current checkout.
  • canonical_hint (optional): explicit preference when ambiguity exists.
  • merge_guard (optional): high|medium mergeability strictness; default high.
  • max_changed_files_for_canonical (optional): default 30.
  • max_delta_lines_for_canonical (optional): default 2500.
  • min_greptile_score (optional): default 65 when available.
  • body_noise_mode (optional): strict|medium for junk body tolerance; default medium.
  • reuse_copy_detection (optional): off|on with default on for bot/copied-work checks.
  • bot_author_pattern (optional): list of substrings for suspicious authors.
  • merge_tool_pref (optional): auto, gh, merge-skill, or land-skill; default auto.
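As a rough illustration, the documented defaults and allowed values could be applied in shell as follows. The uppercase variable names and both function names are hypothetical; the skill only defines the input names, not how they are stored.

```shell
# Sketch only: uppercase variable names are hypothetical stand-ins for the inputs.
apply_input_defaults() {
  MODE="${MODE:-plan}"
  MERGE_GUARD="${MERGE_GUARD:-high}"
  MAX_CHANGED_FILES_FOR_CANONICAL="${MAX_CHANGED_FILES_FOR_CANONICAL:-30}"
  MAX_DELTA_LINES_FOR_CANONICAL="${MAX_DELTA_LINES_FOR_CANONICAL:-2500}"
  MIN_GREPTILE_SCORE="${MIN_GREPTILE_SCORE:-65}"
  BODY_NOISE_MODE="${BODY_NOISE_MODE:-medium}"
  REUSE_COPY_DETECTION="${REUSE_COPY_DETECTION:-on}"
  MERGE_TOOL_PREF="${MERGE_TOOL_PREF:-auto}"
}

# Returns non-zero when a required input is missing or an enum value is unknown.
validate_inputs() {
  if [ -z "${CLUSTER_REFS:-}" ]; then
    echo "cluster_refs is required" >&2
    return 1
  fi
  case "$MODE" in
    plan|execute) ;;
    *) echo "mode must be plan or execute" >&2; return 1 ;;
  esac
  case "$MERGE_GUARD" in
    high|medium) ;;
    *) echo "merge_guard must be high or medium" >&2; return 1 ;;
  esac
  case "$MERGE_TOOL_PREF" in
    auto|gh|merge-skill|land-skill) ;;
    *) echo "unknown merge_tool_pref: $MERGE_TOOL_PREF" >&2; return 1 ;;
  esac
  return 0
}
```

Calling apply_input_defaults before validate_inputs keeps plan as the default mode, so nothing mutates unless execute is requested explicitly.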

Workflow

  1. Fetch cluster evidence.
    • For PRs:
      gh pr view <ref> --json number,title,body,state,author,labels,createdAt,updatedAt,mergedAt,closedAt,mergeable,mergeStateStatus,isDraft,changedFiles,additions,deletions,statusCheckRollup,commits,url
    • For issues:
      gh issue view <ref> --json number,title,body,state,labels,author,createdAt,updatedAt,url,comments
    • Pull quick file scope context if needed:
      gh pr diff <ref> --name-only
    • Pull status checks:
      gh pr checks <ref>
    • Pull review/AI-tool comments for score hints:
      gh issue view <ref> --json comments
    • Record common bot/copied-work signals:
      • author.login ending with [bot] or matching bot_author_pattern.
      • PR body that advertises generated/copied content (generated by, copy/pasted, cherry-picked from, etc.).
      • Commit messages that indicate lifted or mechanical work without problem-specific reasoning.
  2. Normalize signals into a guardrail matrix.
    • Mergeability signal: mergeable state, draft status, merge-state blockers, and check conclusions.
    • Churn signal: changedFiles, additions, and deletions.
    • Body hygiene signal: body length, meaningful root-cause description, and presence of placeholder/junk patterns (e.g. WIP, TODO, blank text, automation template only, no concrete impact).
    • AI review signal: Greptile or equivalent score-like comments. If a score is present and below threshold, downgrade confidence.
    • Source integrity signal: bot/copied-work risk and commit provenance.
      • Any PR matched by bot/copied-work signals is marked review-heavy and cannot become canonical without explicit manual confirmation.
  3. Run full review-on-candidate for likely canonical PR(s).
    • For the top canonical candidate(s), perform a complete pass:
      • Run gh pr view <id> --json files and gh pr diff <id>.
      • Run gh pr checks <id> and ensure required checks are passing or have explicit acceptable exceptions.
      • Quick duplicate/conflict scan against canonical cluster references via file overlap and root-cause text.
    • If mergeability and hygiene are clean, treat this as the "winning PR" candidate and move to merge-prep.
  4. Choose canonical outputs.
    • Prefer PRs that are merged and validated, or that are mergeable with a strong scope match.
    • If a newer item is a strict superset and still merge-safe, prefer the newer item.
    • If two items have equivalent scope, keep the most stable/least conflicting anchor.
    • If uncertainty is high, do not close anything; return a report of open questions.
  5. Apply hard guardrails before assigning canonical or duplicate.
    • Mergeability hard-stop:
      • Reject as canonical if isDraft is true.
      • Reject as canonical if mergeable is CONFLICTING (gh reports this field as MERGEABLE, CONFLICTING, or UNKNOWN, not as a boolean) or mergeStateStatus indicates blocking (for example BLOCKED or DIRTY).
      • If checks are failing, mark as manual-review-required unless the user explicitly wants forced action.
    • Churn hard-stop:
      • If changedFiles > max_changed_files_for_canonical or additions + deletions > max_delta_lines_for_canonical, do not auto-close other PRs as duplicate.
      • Treat as related/review-needed even if overlap appears strong.
    • Body hygiene hard-stop:
      • For empty/junk bodies or no concrete root cause, classify confidence as low and require explicit manual confirmation before duplicate actions.
    • AI review hard-stop:
      • If a recognized score (for example Greptile) is below min_greptile_score, require manual confirmation before duplicates and avoid auto-canonicalization.
    • Bot/copy hard-stop:
      • If the author appears to be a bot or auto-generated and the work appears copied with little evidence of original validation, classify as manual-review-required.
      • Do not suppress original human credit; keep attribution to the originator and include a provenance note.
  6. Map outcomes.
    • Canonical issue/PR: keep open.
    • Duplicate PRs/issues: close with explicit duplicate rationale and credit.
    • Winning PR that passes all guardrails: perform final review and help merge.
    • Related but not duplicate: keep open and add explicit relationship note.
    • Unrelated: split into separate routing plan.
    • For any hard-stop failures, return manual-review-required with blockers listed.
  7. Merge and OpenClaw follow-up.
    • If the user asked for mode=execute and the winning PR is green:
      • Use merge_tool_pref:
        • merge-skill / land-skill: invoke local merge/land workflow/skill if available.
        • gh: run gh pr merge <id> --auto --merge (or --squash, --rebase based on repo policy).
        • auto: prefer merge/land skill first if detected; otherwise fall back to gh merge.
      • Re-check checks after merge and report final state.
    • If canonical PR is merge-ready but not yet merged, add/verify OpenClaw changelog entry in that PR:
      • Check presence of CHANGELOG.md.
      • If entry exists, do not duplicate; otherwise add a concise bullet under current unreleased Fixes section with PR number + credits.
      • Prefer including this within the canonical PR before merge.
      • If not possible, create follow-up squash commit after merge to add changelog.
  8. Run boundary check.
    • If mode=plan, return a precise command plan and message drafts with confidence and blocker list.
    • If mode=execute, run GH commands only for items with no blockers; otherwise stop and report.
  9. Emit results with explicit evidence and attribution.
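The hard guardrails in step 5 can be sketched as a small shell check that lists blockers for one PR. This is a minimal sketch under stated assumptions: the function names, the argument order, the passing/failing summary of checks, and the choice of BLOCKED and DIRTY as the blocking mergeStateStatus values are all illustrative, not taken from the skill. Body hygiene and bot/copy checks are left out because they need human judgment.

```shell
# Assumed thresholds; they mirror the documented input defaults.
MAX_CHANGED_FILES_FOR_CANONICAL="${MAX_CHANGED_FILES_FOR_CANONICAL:-30}"
MAX_DELTA_LINES_FOR_CANONICAL="${MAX_DELTA_LINES_FOR_CANONICAL:-2500}"
MIN_GREPTILE_SCORE="${MIN_GREPTILE_SCORE:-65}"

# evaluate_guardrails IS_DRAFT MERGEABLE MERGE_STATE CHECKS CHANGED ADDITIONS DELETIONS [SCORE]
# Prints one blocker per line; prints nothing when every guardrail passes.
evaluate_guardrails() {
  is_draft=$1; mergeable=$2; merge_state=$3; checks=$4
  changed=$5; additions=$6; deletions=$7; score=${8:-}

  if [ "$is_draft" = "true" ]; then echo "mergeability: draft PR"; fi
  if [ "$mergeable" = "CONFLICTING" ]; then echo "mergeability: conflicting"; fi
  case "$merge_state" in
    BLOCKED|DIRTY) echo "mergeability: mergeStateStatus=$merge_state" ;;
  esac
  if [ "$checks" = "failing" ]; then echo "checks: failing"; fi
  if [ "$changed" -gt "$MAX_CHANGED_FILES_FOR_CANONICAL" ]; then
    echo "churn: changedFiles=$changed"
  fi
  delta=$(( $additions + $deletions ))
  if [ "$delta" -gt "$MAX_DELTA_LINES_FOR_CANONICAL" ]; then
    echo "churn: delta=$delta"
  fi
  # The AI review score is optional; skip the check when none was found.
  if [ -n "$score" ] && [ "$score" -lt "$MIN_GREPTILE_SCORE" ]; then
    echo "ai_review: score=$score"
  fi
  return 0
}

# verdict_for takes the same arguments and prints a verdict plus any blockers.
verdict_for() {
  blockers=$(evaluate_guardrails "$@")
  if [ -z "$blockers" ]; then
    echo "canonical-eligible"
  else
    echo "manual-review-required"
    printf '%s\n' "$blockers" | sed 's/^/  - /'
  fi
}
```

For example, verdict_for false MERGEABLE CLEAN passing 5 100 50 80 prints canonical-eligible, while a draft PR yields manual-review-required followed by its blocker list.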

Outputs

  • A canonicality assessment for the cluster with evidence bullets.
  • A plan table: item, role (canonical/duplicate/related/unrelated), rationale, and confidence.
  • A risk register per item for: mergeability, churn, body_hygiene, ai_review_score.
  • A merge-readiness checklist for each candidate (checks, approvals, conflicts, branch state, policy alignment).
  • For chosen canonical PR: explicit merge command used, merge result, and OpenClaw changelog action status.
  • Draft close/keep messages containing contributor credits.
  • Command list for labels/comments/close steps (and execution status when run).
  • Explicit escalation notes for manual review and uncertainty.

Message templates

Canonical PR credit line

This final fix is in #{canonical_pr} by @{canonical_author}. Earlier related work is credited to #{prior_pr} by @{prior_author}; thanks for that groundwork.

Close duplicate PR

Thanks for the earlier contribution. Closing as a duplicate of #{canonical_pr}. Your PR covered part of the same root cause and is credited in the canonical fix. If this is a mistake, please let us know and we'll reopen a review path.

Keep canonical issue open

Keeping this open as canonical for this {channel}-specific failure cluster.

Close duplicate issue

Closing as duplicate of #{canonical}. This matches the same root-cause path and behavior. Credit is tracked in #{canonical_pr} with foundational work from #{prior_pr}. If this is a mistake, please let us know and we'll reopen a review path.

Related (not duplicate)

This appears related but not a duplicate: symptom path diverges at {reason}. Keeping it open separately.

Winning PR full-review summary

I reviewed the winning PR for scope, checks, and mergeability. Conflicts: {conflict_status}. Check status: {check_status}. Final action: {merged|ready|blocked}.

Unrelated (routing only)

This appears separate from this cluster and should be tracked independently.
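
The templates above could be filled with small printf helpers, so the wording stays fixed and only the placeholders vary. The function names are hypothetical, and the author handles in the usage note are made up for illustration.

```shell
# render_credit_line CANONICAL_PR CANONICAL_AUTHOR PRIOR_PR PRIOR_AUTHOR
# Fills the "Canonical PR credit line" template.
render_credit_line() {
  printf 'This final fix is in #%s by @%s. Earlier related work is credited to #%s by @%s; thanks for that groundwork.\n' \
    "$1" "$2" "$3" "$4"
}

# render_related REASON
# Fills the "Related (not duplicate)" template.
render_related() {
  printf 'This appears related but not a duplicate: symptom path diverges at %s. Keeping it open separately.\n' "$1"
}
```

For instance, render_credit_line 20988 alice 20377 bob produces a credit line that can be passed to gh pr comment or gh issue comment via --body.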

Action commands

Run commands from the repo checkout unless explicitly using full owner/repo URLs.

Add labels for closed duplicates

  • PR/issue duplicate:
    gh issue edit <id> --add-label dedupe:child --add-label close:duplicate
    (repo labels vary; use duplicate as a fallback as needed).
  • Canonical issue:
    gh issue edit <canonical_id> --add-label dedupe:parent

Close with comment (issues)

  • Comment first, then close (quote the reason, since it contains a space):
    gh issue comment <id> --body "..."
    gh issue close <id> --reason "not planned"
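
A minimal sketch of the comment-then-close sequence with a dry-run switch is shown here. The run and close_issue_duplicate helpers are hypothetical, and reusing the ODGH_DRY_RUN variable name in this way is an assumption about convenient behavior, not a description of how the skill's own script handles it.

```shell
# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${ODGH_DRY_RUN:-0}" = "1" ]; then
    printf 'DRY-RUN:'
    for arg in "$@"; do printf ' %s' "$arg"; done
    printf '\n'
  else
    "$@"
  fi
}

# close_issue_duplicate ISSUE_ID CANONICAL_ID
# Comments first so the rationale is recorded, then closes the issue.
close_issue_duplicate() {
  id=$1; canonical=$2
  body="Closing as duplicate of #$canonical. This matches the same root-cause path and behavior."
  run gh issue comment "$id" --body "$body" || return 1
  run gh issue close "$id" --reason "not planned"
}
```

With ODGH_DRY_RUN=1 set, close_issue_duplicate 19839 20337 only prints the two gh commands. Stopping when the comment fails avoids closing an issue without an audit trail.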

Close PR

  • For duplicate PR:
    gh pr close <id> --comment "..."

Guardrail helper checks

  • Mergeability snapshot:
    gh pr view <id> --json mergeable,mergeStateStatus,isDraft,statusCheckRollup
  • Churn snapshot:
    gh pr view <id> --json changedFiles,additions,deletions
  • Greptile/AI review scan (optional):
    gh issue view <id> --json comments
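
As one possible use of the churn snapshot, the additions and deletions can be summed with gh's built-in --jq filter and compared against the delta limit. The helper names are hypothetical, and the 2500 fallback simply mirrors the documented default for max_delta_lines_for_canonical.

```shell
# churn_delta PR_ID: prints additions + deletions for the PR.
churn_delta() {
  gh pr view "$1" --json additions,deletions --jq '.additions + .deletions'
}

# exceeds_churn_limit PR_ID
# Exit 0 when the PR is over the limit, 1 when within it, 2 when gh failed.
exceeds_churn_limit() {
  delta=$(churn_delta "$1") || return 2
  [ "$delta" -gt "${MAX_DELTA_LINES_FOR_CANONICAL:-2500}" ]
}
```

A caller might write: if exceeds_churn_limit <id>; then treat the PR as related/review-needed rather than canonical.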

Changelog policy

  • If changelog already has an entry, do not duplicate it.
  • If adding a new entry, place it under the current Unreleased ### Fixes section with PR number and contributor credits.

Suggested merge helpers

  • Prefer local merge/land skill (merge_tool_pref=merge-skill or land-skill) when available and configured.
  • Otherwise use one of:
    gh pr merge <id> --merge --auto
    gh pr merge <id> --squash --auto
    gh pr merge <id> --rebase --auto
    Select merge mode based on repo policy and branch protection requirements.
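
A small sketch of how the merge mode might be chosen from a repo policy value follows. The merge_command helper and its policy argument are hypothetical; the function only prints the gh command so the caller can review it before running anything.

```shell
# merge_command PR_ID [POLICY]
# POLICY is one of merge, squash, rebase (assumed to come from repo policy).
merge_command() {
  id=$1; policy=${2:-merge}
  case "$policy" in
    merge|squash|rebase)
      printf 'gh pr merge %s --%s --auto\n' "$id" "$policy"
      ;;
    *)
      echo "unknown merge policy: $policy" >&2
      return 1
      ;;
  esac
}
```

Printing instead of executing fits mode=plan, where a command list is returned and nothing is merged.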

scripts/alias.sh helper

Run the tiny helper directly from this skill directory for repetitive cluster operations:
./scripts/alias.sh run-cluster /path/to/cluster.txt

Minimal cluster.txt format:
<type>:<id>|<action>|<target>

Supported actions:
  • inspect → fetch issue/PR views, diff file list, and checks.
  • close-pr-duplicate → close PR as duplicate of target PR.
  • close-issue-duplicate → comment + close issue as duplicate of target issue.
  • noop → no action.

Typical example file:
pr:20988|inspect|
pr:20377|close-pr-duplicate|20988
issue:19839|close-issue-duplicate|20337
issue:12714|noop|

Set ODGH_DRY_RUN=1 for dry-run mode to print command intent without making mutations. You can also pass --dry-run inline:
./scripts/alias.sh --dry-run run-cluster /path/to/cluster.txt

For safe editing without real references, copy:
scripts/cluster-example.txt
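
To make the cluster.txt line format concrete, here is a hedged sketch of a parser for one line. It is not the implementation in scripts/alias.sh, which is not shown here; parse_cluster_line and its output layout are assumptions for illustration.

```shell
# parse_cluster_line LINE
# Splits "<type>:<id>|<action>|<target>" and prints "type id action target",
# using "-" when the target field is empty.
parse_cluster_line() {
  line=$1
  ref=${line%%|*}      # "<type>:<id>"
  rest=${line#*|}      # "<action>|<target>"
  kind=${ref%%:*}
  id=${ref#*:}
  action=${rest%%|*}
  target=${rest#*|}

  case "$kind" in
    pr|issue) ;;
    *) echo "unknown type: $kind" >&2; return 1 ;;
  esac
  case "$action" in
    inspect|close-pr-duplicate|close-issue-duplicate|noop) ;;
    *) echo "unknown action: $action" >&2; return 1 ;;
  esac
  printf '%s %s %s %s\n' "$kind" "$id" "$action" "${target:--}"
}
```

Rejecting unknown types and actions up front means a typo in the cluster file stops the run before any close or comment command is attempted.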

Git cleanup option

  • To remove stale branch:
    git branch -D <branch>
    git push origin --delete <branch>

Safety / anti-patterns

  • Do not assume duplicate status from metadata alone; always inspect body + diff for scope overlap.
  • Do not fall back to default closure actions when hard-stop guardrails fail.
  • Do not invent non-existent GH close reasons (for example, --reason duplicate in gh issue close is invalid; use "not planned" plus label/comments).
  • Preserve evidence in each comment text for auditability.
  • Do not merge or close items while mergeability/churn/AI-review warnings are unresolved.