improve-skill

Skill Improver

Increase the effectiveness of an existing skill by modeling user intent, testing the skill against that intent through mental simulation and live doc validation, and proposing ranked improvements — new features, UX gains, accuracy fixes, efficiency wins.

This skill is about whether the skill accomplishes what users need. Structural correctness (malformed frontmatter, missing fields, violated conventions) is repair-skill's domain. If obvious structural issues are present, note them briefly, recommend `repair-skill`, and continue with effectiveness analysis.

Phase 0: Load the Skill

Read `$ARGUMENTS` as the path to a skill directory or SKILL.md file.

If a directory: read `SKILL.md`, then note which of `references/`, `scripts/`, `examples/`, `assets/` exist
If a file: read it directly, then discover sibling resource directories

If the path is missing or ambiguous, use AskUserQuestion to resolve before proceeding.

Note any obvious structural issues in one sentence ("this description is first-person — recommend repair-skill for structural fixes") and move on. Do not run a full structural audit.

Phase 0 is complete when SKILL.md is loaded and the sibling directory inventory is noted.
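
The Phase 0 loading rules can be sketched roughly as follows. This is a minimal illustration, not part of the skill itself: `load_skill` is a hypothetical helper, and it raises `ValueError` at the points where the skill would instead ask the user to resolve the path.

```python
from pathlib import Path

# Sibling resource directories named in Phase 0.
RESOURCE_DIRS = ("references", "scripts", "examples", "assets")


def load_skill(path_arg: str) -> tuple[Path, list[str]]:
    """Resolve a skill path and list which resource directories exist.

    Hypothetical helper: a ValueError stands in for the point where the
    skill would ask the user to clarify a missing or ambiguous path.
    """
    if not path_arg.strip():
        raise ValueError("path is missing")
    path = Path(path_arg)
    # A directory is expected to contain SKILL.md; a file is read directly.
    skill_md = path / "SKILL.md" if path.is_dir() else path
    if not skill_md.is_file():
        raise ValueError(f"no SKILL.md found at {skill_md}")
    root = skill_md.parent
    present = [name for name in RESOURCE_DIRS if (root / name).is_dir()]
    return skill_md, present
```

Both input forms resolve to the same pair: the SKILL.md path plus the inventory of sibling directories that Phase 0 asks to be noted.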

Phase 1: Understand User Intent

Before analyzing, establish what the user wants. Use AskUserQuestion:

"What specifically does this skill not do well?"
Offer options: a specific gap they've noticed, "I'm not sure — run a full effectiveness audit", "it works but I want new capabilities", "the workflow feels clunky"

Regardless of the answer, also infer the skill's purpose from its description and body. State your understanding of what problem it solves and for whom — one sentence — before proceeding. This grounds the entire analysis in the correct frame.

If the user named a specific complaint: orient the analysis toward that area and scan for related issues in the same workflow region. If the user is unsure: run the full Phase 2 audit across all five sub-analyses.

Phase 1 is complete when user intent is established and the skill's purpose is stated.

Phase 2: Effectiveness Analysis

Load `${CLAUDE_PLUGIN_ROOT}/skills/improve-skill/references/effectiveness-rubric.md` before starting. It contains the severity framework, improvement type definitions, and effort/impact calibration criteria used in Phase 3.

Run all five sub-analyses. For each finding, record: what the issue is, why it reduces effectiveness for the user, and the concrete improvement.

2a — Mental Simulation

Walk through the skill as Claude executing it with a concrete representative user request. Choose an input that exercises the main workflow path — not an edge case, but the typical use.

For each phase of the skill, evaluate:

Missing info: What does this step need that hasn't been provided or gathered yet? If the user hasn't specified something and the skill doesn't ask, what does Claude have to guess?
Divergence points: Where would two different Claude instances execute this instruction and arrive at meaningfully different outputs? These are underspecified steps.
Dead ends: Where does the skill's workflow stop but the user's actual goal isn't yet accomplished? What does the user have to do manually after the skill finishes?
Friction: Where does the skill pause the user at a low-value decision point (Claude could make a good default), or skip user input at a high-value moment (user has a strong preference)?

Document findings by type: missing info (stuck points), divergence points, dead ends, friction points.

2b — Live Doc Validation

Identify all factual claims in the skill that reference external standards: frontmatter field names, Claude tool names and behavior, API parameters, CLI flags, third-party service interfaces.

For each claim, verify against current documentation:

For Claude-specific claims (frontmatter options, tool names, model IDs):
  Use Task tool with subagent_type=claude-code-guide — faster and more accurate than WebSearch.

For third-party claims (npm packages, APIs, external CLIs):
  Use WebSearch or WebFetch against official documentation.

Flag drift between what the skill states and current reality. Severity: high if the claim produces broken output; medium if it produces outdated guidance; low if it's a naming change with no behavioral difference.
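
The severity rule above amounts to a three-way lookup. A minimal sketch, assuming a hypothetical `Drift` enum for the three effects the paragraph names (neither the enum nor the function comes from the skill):

```python
from enum import Enum


class Drift(Enum):
    """Effect of a stale claim, as described in the severity rule."""

    BROKEN_OUTPUT = "broken output"
    OUTDATED_GUIDANCE = "outdated guidance"
    RENAME_ONLY = "rename only"  # naming change, no behavioral difference


# Mapping taken directly from the rule: broken -> high, outdated -> medium,
# rename with no behavioral difference -> low.
SEVERITY = {
    Drift.BROKEN_OUTPUT: "high",
    Drift.OUTDATED_GUIDANCE: "medium",
    Drift.RENAME_ONLY: "low",
}


def drift_severity(effect: Drift) -> str:
    """Return the severity label for one verified drift."""
    return SEVERITY[effect]
```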

2c — Feature Adjacency Scan

Given the skill's purpose, identify capabilities that are absent but would be high-value:

Adjacent: Naturally extends what the skill already does — same domain, one step further. The user who just ran this skill would almost certainly want this next.
Complementary: Commonly needed right before or after this skill. The user does this manually today.
End-to-end gap: The skill starts a job the user finishes by hand — workflow stops at "here's a plan" when the user wanted "and apply it."

For each candidate: estimate implementation effort (one instruction change, new phase, or new script) and user value (rare edge case, common scenario, or blocks the skill in a key scenario).
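
One way to turn the effort and value estimates into an ordering is sketched below. The numeric ranks and the tie-breaking rule (highest value first, then lowest effort) are assumptions made for illustration; the skill's own calibration criteria live in the rubric file, which is not reproduced here.

```python
from dataclasses import dataclass

# Assumed ordinal scales, following the order in which the text lists them.
EFFORT = {"instruction change": 1, "new phase": 2, "new script": 3}
VALUE = {"rare edge case": 1, "common scenario": 2, "blocks key scenario": 3}


@dataclass(frozen=True)
class Candidate:
    name: str
    effort: str  # a key of EFFORT
    value: str   # a key of VALUE


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort by user value (descending), then by effort (ascending)."""
    return sorted(candidates, key=lambda c: (-VALUE[c.value], EFFORT[c.effort]))
```

For example, a candidate that blocks a key scenario sorts ahead of a common-scenario candidate even when it needs a new script rather than a single instruction change.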

2d — UX Flow Review

Evaluate the skill's interaction design:

Does the skill surface all necessary questions at the start (before heavy work), or does it interrupt mid-workflow with requests the user didn't anticipate?
Is the "I don't know" path explicit? If a user triggers the skill without a specific complaint, does the skill handle that gracefully, or does it assume the user knows?
Does the output format match how users consume it? A report users read once can be dense prose; one they apply iteratively needs more structure.
Are there steps where the skill makes a consequential decision without user input?

2e — Edge Case Stress Test

After walking the main path (2a), deliberately try to break the skill. Identify 3–5 adversarial inputs that test failure modes:

Missing or malformed input: What happens when expected files don't exist, paths are wrong, or arguments are empty/garbled?
Contradictory requirements: What if the user's stated goal conflicts with their input (e.g., "improve this skill" on a file that isn't a skill)?
Unsupported configurations: What if the skill encounters a structure it wasn't designed for — a different framework version, an edge-case project layout, an unexpected file format?
Boundary conditions: What if the input is very large (500-line SKILL.md), very small (empty file), or has unusual characters in paths/names?

For each adversarial input, evaluate: does the skill detect the problem and surface a useful error, silently produce wrong output, or crash the workflow? Map findings to improvement types: missing error handling → NEW FEATURE, poor failure message → UX IMPROVEMENT, undetected bad state → ACCURACY FIX.
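
The finding-to-type mapping in the paragraph above is small enough to write down as a table. A minimal sketch, where the function name and the `None` fallback for unlisted kinds are assumptions:

```python
from typing import Optional

# Mapping stated in the text for stress-test findings.
IMPROVEMENT_TYPE = {
    "missing error handling": "NEW FEATURE",
    "poor failure message": "UX IMPROVEMENT",
    "undetected bad state": "ACCURACY FIX",
}


def improvement_type(finding_kind: str) -> Optional[str]:
    """Return the improvement type for a finding kind, or None if unlisted."""
    return IMPROVEMENT_TYPE.get(finding_kind.strip().lower())
```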

Phase 2 is complete when all five sub-analyses are finished and findings are recorded.

Phase 3: Improvement Proposal

Load `${CLAUDE_PLUGIN_ROOT}/skills/improve-skill/references/effectiveness-report-template.md` for the output format before constructing the report. Reference `${CLAUDE_PLUGIN_ROOT}/skills/improve-skill/examples/sample-analysis.md` to calibrate depth and specificity if needed.

Present findings grouped by improvement type, because users think in terms of outcomes, not audit dimensions. Each entry must include: the sub-analysis code in brackets, what the gap is, why it matters to the user, and the specific fix. Calibrate severity using the criteria in `references/effectiveness-rubric.md`.

Ask: "Apply all improvements? Or select specific ones?"

Phase 3 is complete when the report is delivered and user selection is confirmed.
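
Grouping by improvement type while keeping the sub-analysis code on each entry might look like the sketch below. The `Finding` fields mirror the four required parts of an entry; the rendered line format is hypothetical, since the actual layout is defined by the report template file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    code: str              # sub-analysis code, e.g. "2a"
    improvement_type: str  # e.g. "NEW FEATURE"
    gap: str
    why: str
    fix: str


def render_report(findings: list[Finding]) -> str:
    """Group findings by improvement type; groups keep first-seen order."""
    groups: dict[str, list[Finding]] = defaultdict(list)
    for finding in findings:
        groups[finding.improvement_type].append(finding)
    lines: list[str] = []
    for type_name, items in groups.items():
        lines.append(type_name)
        for f in items:
            # Hypothetical entry layout: code in brackets, then gap, why, fix.
            lines.append(f"[{f.code}] {f.gap} | why: {f.why} | fix: {f.fix}")
    return "\n".join(lines)
```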

Phase 4: Apply Improvements

Apply confirmed items in order: new features → accuracy fixes → UX improvements → efficiency gains.
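
The application order can be enforced with a stable sort. A minimal sketch: the first three labels appear elsewhere in this document, while `"EFFICIENCY GAIN"` is an assumed label for the fourth type.

```python
# Order stated in Phase 4. "EFFICIENCY GAIN" is an assumed label.
APPLY_ORDER = ("NEW FEATURE", "ACCURACY FIX", "UX IMPROVEMENT", "EFFICIENCY GAIN")


def in_apply_order(items: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Sort (improvement_type, description) pairs into application order.

    sorted() is stable, so items of the same type keep their relative order.
    """
    return sorted(items, key=lambda item: APPLY_ORDER.index(item[0]))
```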

For each item:

State what is being changed and why — reference the effectiveness principle, not "you asked"
Make the edit or create the file
Confirm the change integrates cleanly with surrounding content

After applying, briefly explain:

What was changed and why
What was added and why
What was left out and why (effort outweighs benefit, or requires domain knowledge the user must supply)
What was not selected — note it remains available to apply later

Validation: After delivering the explanation, re-read the modified SKILL.md in full and confirm: all selected improvements are present, no surrounding content was inadvertently altered, phases are still numbered with exit conditions, and all references point to files that exist.
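
The last validation check, that every reference points to a file that exists, could be approximated as below. This is a rough sketch under two assumptions: references are recognised by a simple pattern over the four resource directory names, and every match is resolved relative to the directory containing SKILL.md. `missing_references` is a hypothetical helper, not something the skill ships.

```python
import re
from pathlib import Path

# Assumed pattern: a resource directory name followed by a relative path.
REF_PATTERN = re.compile(r"\b(?:references|scripts|examples|assets)/[\w-][\w./-]*")


def missing_references(skill_md: Path) -> list[str]:
    """Return referenced resource paths that do not exist next to SKILL.md."""
    text = skill_md.read_text(encoding="utf-8")
    root = skill_md.parent
    # Strip a trailing sentence period that the pattern may have captured.
    refs = sorted({match.rstrip(".") for match in REF_PATTERN.findall(text)})
    return [ref for ref in refs if not (root / ref).exists()]
```

An empty result is consistent with the "all references point to files that exist" condition; it does not cover the other validation checks, which still need a full re-read.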

Phase 4 is complete when all confirmed items are applied, the explanation is delivered, and the validation pass finds no integration failures.

Phase 5: Structural Lint

After applying effectiveness improvements, invoke the skill-lint agent for a structural quality pass:

Use Task tool with subagent_type=claude-skills:skill-lint:
"Lint the skill at <path-to-skill-directory>. Auto-apply critical and major fixes, report minor findings for user decision."

Wait for the agent to complete. If it auto-applied structural fixes, note them alongside the effectiveness changes from Phase 4. If it reports minor findings, present them to the user.

Phase 5 is complete when the lint agent returns and any user-selected minor fixes are applied.
