skill-creator


Build, validate, and iterate agent skills in this monorepo. Bakes in the conventions every skill here follows: kebab-case naming, "Use when" trigger phrases, selective XML for example boundaries, and a RED→GREEN→REFACTOR evaluation cycle (see `references/tdd-for-skills.md`).

Claude Code extensions (ignored by other agents): `argument-hint: '[<skill-name>]'`, `user-invocable: true`, `model-invocable: false` (manual-only authoring flow; the human drives the loop).

When to use


Verbatim trigger phrases the user might say:

- "build a skill for X"
- "create a new skill"
- "scaffold a skill"
- "add a skill that does Y"
- "make me a skill"
- "audit this skill against our rules"
- "refactor this skill to match repo conventions"


When NOT to use


- User is modifying source code, not skills
- User is debugging an existing skill (just edit it directly)
- User wants to install a third-party skill (`npx skills add <repo>`)
- User is writing non-skill markdown (docs, READMEs, etc.)


Workflow


1. Discover


Clarify what the skill should do. Answer these before scaffolding:

- What user request triggers this skill? Capture verbatim phrases.
- Does an existing skill in `skills/` already cover this? Run `ls skills/` and skim each `SKILL.md` description.

If overlap is >70%, propose extending the existing skill instead.


2. Name


Apply `references/naming.md`. Quick check:

- `kebab-case-with-hyphens` only
- Matches `^[a-z][a-z0-9-]+[a-z0-9]$`
- ≤64 chars
- No abbreviations like `bestpractices`; use `best-practices`

Prefer `<domain>-<focus>` (e.g., `ts-best-practices`) over generic `<thing>-rules`.
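The quick check can be sketched as a small function. This is a minimal sketch, not the repo's linter: the regex and the 64-character cap come from the list above, while `checkSkillName` and its messages are hypothetical names chosen for illustration.

```typescript
// Regex and length cap are taken from the quick-check list above.
const SKILL_NAME_PATTERN = /^[a-z][a-z0-9-]+[a-z0-9]$/;
const MAX_NAME_LENGTH = 64;

// Hypothetical helper: returns one message per violated rule, or [] when the name is fine.
function checkSkillName(name: string): string[] {
  const problems: string[] = [];
  if (name.length > MAX_NAME_LENGTH) {
    problems.push(`name is ${name.length} chars; the limit is ${MAX_NAME_LENGTH}`);
  }
  if (!SKILL_NAME_PATTERN.test(name)) {
    problems.push("name must be lowercase kebab-case (letters, digits, hyphens)");
  }
  return problems;
}

console.log(checkSkillName("ts-best-practices")); // empty list
console.log(checkSkillName("TS_BestPractices")); // one pattern problem
```

Note that the pattern alone accepts `bestpractices`, so the "use `best-practices`" rule still needs a human (or a word list) to catch.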


3. RED phase: write `evals.json` and run baselines


Pressure-test the gap before writing the skill. See `references/pressure-scenarios.md` for what makes a good scenario per skill type.

- Ask the user for 3+ pressure scenarios: realistic prompts (not an abstract "convert this PDF", but specific, messy, with personal context).
- For each scenario, decide assertions (regex / contains / file_exists); see `references/evals-json.md`.
- Write `skills/<name>/evals.json` from `templates/evals.json.template`.
- Invoke `/skill-eval <name>`, which dispatches Agent(general-purpose) for each scenario without the skill loaded and saves transcripts to `skills/<name>/.workspace/iteration-1/eval-K-name/without_skill/`.
- Inspect the without-skill transcripts (open one with `pnpm skill-tools view <name>`). If the baseline already passes the assertions cleanly, the skill is unnecessary: tell the user and stop. Better to skip than ship a no-op skill.

The transcripts are gitignored; `evals.json` is committed.
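To make the baseline decision concrete, here is a rough sketch of how the three assertion kinds could be checked against one run. The `Assertion` and `RunResult` shapes are assumptions made for illustration; the actual schema is defined in `references/evals-json.md` and may differ.

```typescript
// Hypothetical assertion shapes; the real schema lives in references/evals-json.md.
type Assertion =
  | { type: "regex"; pattern: string }
  | { type: "contains"; value: string }
  | { type: "file_exists"; path: string };

// Hypothetical summary of one scenario run.
interface RunResult {
  transcript: string; // text the agent produced
  createdFiles: string[]; // paths the agent wrote, relative to the workspace
}

function checkAssertion(assertion: Assertion, run: RunResult): boolean {
  switch (assertion.type) {
    case "regex":
      return new RegExp(assertion.pattern).test(run.transcript);
    case "contains":
      return run.transcript.indexOf(assertion.value) !== -1;
    case "file_exists":
      return run.createdFiles.indexOf(assertion.path) !== -1;
  }
}

// The baseline "passes cleanly" when every assertion holds without the skill loaded.
function baselinePasses(assertions: Assertion[], run: RunResult): boolean {
  return assertions.every((a) => checkAssertion(a, run));
}

const tomlAssertions: Assertion[] = [
  { type: "contains", value: "smol-toml" },
  { type: "file_exists", path: "out.json" },
];
const baselineRun: RunResult = { transcript: "I split each line on '=' by hand.", createdFiles: [] };
console.log(baselinePasses(tomlAssertions, baselineRun)); // false, so the skill has a gap to close
```

If `baselinePasses` returned `true` for every scenario, that would be the signal to stop rather than ship a no-op skill.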


4. Draft frontmatter


Skills here are agent-agnostic: `name` and `description` are universally required (the `skills` CLI rejects skills missing either); the others are Claude Code extensions kept for cross-agent compatibility (other agents ignore them). Full schema in `references/frontmatter.md`.

```yaml
---
name: <skill-name>
description: >-
  This skill should be used when [trigger condition]. Common triggers
  include "verbatim phrase 1", "verbatim phrase 2", and "verbatim phrase 3".
  [What it bakes in / what's distinctive]. Skip when [anti-trigger].
# --- Claude Code extensions (ignored by other agents) ---
argument-hint: '[<optional-arg>]'
user-invocable: true
model-invocable: true
---
```


`name` must exactly match the skill's directory name (kebab-case).

Description rules ([full list](references/description.md)):

- 80–1024 characters
- Contains `"Use when"` or `"This skill should be used when"`
- Lists ≥3 verbatim trigger phrases in double quotes
- No anti-shortcut words: `then`, `next`, `step 1`, `process`, `first` — these get followed as instructions instead of treated as triggers
- Includes a `Skip when` clause naming what the skill does NOT do
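The five description rules lend themselves to a mechanical check. The sketch below is one possible reading of them, not the enforced implementation: `checkDescription` and its rule ids are hypothetical, and the authoritative rules are in `references/description.md`.

```typescript
// Anti-shortcut words from the list above, matched as whole words, case-insensitively.
const ANTI_SHORTCUT = /\b(then|next|step 1|process|first)\b/i;

// Hypothetical helper: returns the ids of the rules a description fails.
function checkDescription(description: string): string[] {
  const failed: string[] = [];
  if (description.length < 80 || description.length > 1024) failed.push("length");
  if (!/Use when|This skill should be used when/.test(description)) failed.push("use-when");
  // Count double-quoted spans as candidate trigger phrases.
  const quoted = description.match(/"[^"]+"/g) ?? [];
  if (quoted.length < 3) failed.push("trigger-phrases");
  if (ANTI_SHORTCUT.test(description)) failed.push("anti-shortcut");
  if (description.indexOf("Skip when") === -1) failed.push("skip-when");
  return failed;
}

const vague = "This skill helps with TypeScript. First it analyzes the code, then it refactors it. The process involves several steps.";
console.log(checkDescription(vague)); // lists the failed rule ids
```

Whole-word matching is a design choice here so that words such as "processing" do not trip the anti-shortcut rule; whether the real linter is that lenient is not something this sketch can confirm.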


5. Draft body


Use Markdown headings (`## ...`, `### ...`) for structure. Use XML only for these tags (see `references/xml-usage.md` for when to use which):

- `<example>` for full scenarios
- `<good>` / `<bad>` for contrast pairs
- `<input>` / `<output>` for tool-call boundaries
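A skill body that uses these tags should open and close them in matching order. The function below is a rough sketch of such a balance check, assuming only the five tags listed above matter; `xmlBalances` is a hypothetical name and not part of `skill-tools`.

```typescript
// Hypothetical helper: true when the five known tags are properly nested and closed.
function xmlBalances(body: string): boolean {
  // Only the tags listed above are tracked; placeholders such as <name> are ignored.
  const tagPattern = /<(\/?)(example|good|bad|input|output)>/g;
  const open: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(body)) !== null) {
    const isClosing = match[1] === "/";
    const tag = match[2];
    if (!isClosing) {
      open.push(tag);
    } else if (open.pop() !== tag) {
      return false; // closing tag does not match the most recent opener
    }
  }
  return open.length === 0; // every opener was closed
}

console.log(xmlBalances("<example><input>a</input><output>b</output></example>")); // true
console.log(xmlBalances("<example><good>x</example>")); // false
```

One caveat of this sketch: a tag name mentioned in prose or inline code (for instance a sentence about the `<bad>` block) is counted like a real tag, so such mentions would need to be stripped before checking.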

Typical body sections:

- `## When to use`: verbatim trigger phrases
- `## When NOT to use`: anti-triggers
- `## Workflow`: numbered actions the agent takes
- `## Examples`: at least one `<example>` block
- `## References`: links to companion docs


6. Self-lint


Run `pnpm skill-tools lint <name>`. All `error`-severity findings must clear; `warn` and `info` are advisory. If any rule fails, fix the SKILL.md and re-run.

The full rule list lives in `references/lint-checklist.md`. The TS implementation in `packages/skill-tools/src/lib/lint.ts` is the enforcer.
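The severity gate described above can be illustrated in a few lines. The `Finding` shape and rule names here are assumptions for the sake of the example; the output format of `lint.ts` is not shown in this document and may look different.

```typescript
type Severity = "error" | "warn" | "info";

// Hypothetical finding shape, not the actual output of lint.ts.
interface Finding {
  rule: string;
  severity: Severity;
  message: string;
}

// Only error-severity findings block; warn and info are advisory.
function blockingFindings(findings: Finding[]): Finding[] {
  return findings.filter((f) => f.severity === "error");
}

function lintPasses(findings: Finding[]): boolean {
  return blockingFindings(findings).length === 0;
}

const sample: Finding[] = [
  { rule: "description-length", severity: "warn", message: "description is near the upper limit" },
  { rule: "name-matches-directory", severity: "error", message: "name differs from directory" },
];
console.log(lintPasses(sample)); // false, because one finding has error severity
```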


7. GREEN phase — re-run with the skill loaded


Invoke `/skill-eval <name>` again. This dispatches Agent(general-purpose) for each scenario with the new skill in context, saves to `skills/<name>/.workspace/iteration-1/eval-K-name/with_skill/`, then grades.

Acceptance: every eval that failed without the skill should now pass. If any still fail, the skill body is missing instructions — patch and rerun. If any regress (passed without, now fails with), the skill introduced a problem — also patch and rerun.
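The acceptance rule compares each eval's result without and with the skill. As a minimal sketch under assumed names (`EvalResult`, `classify`, and `accepted` are hypothetical, and the grading output of `/skill-eval` may be shaped differently), the three cases could be told apart like this:

```typescript
type Outcome = "fixed" | "still-failing" | "regressed" | "unchanged-pass";

// Hypothetical per-eval summary: did it pass without and with the skill loaded?
interface EvalResult {
  name: string;
  passedWithout: boolean;
  passedWith: boolean;
}

function classify(r: EvalResult): Outcome {
  if (!r.passedWithout && r.passedWith) return "fixed";
  if (!r.passedWithout && !r.passedWith) return "still-failing"; // skill body is missing instructions
  if (r.passedWithout && !r.passedWith) return "regressed"; // skill introduced a problem
  return "unchanged-pass";
}

// Accept only when nothing is still failing and nothing regressed.
function accepted(results: EvalResult[]): boolean {
  return results.every((r) => r.passedWith);
}

const results: EvalResult[] = [
  { name: "parse-config", passedWithout: false, passedWith: true },
  { name: "toml-to-json", passedWithout: true, passedWith: false },
];
console.log(results.map(classify)); // ["fixed", "regressed"]
console.log(accepted(results)); // false, so patch and rerun
```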


7.5. REFACTOR — capture rationalizations (discipline skills only)


This is the REFACTOR phase of the RED→GREEN→REFACTOR cycle. If this is a discipline skill (one that enforces rules the agent might rationalize skipping, e.g., "always run tests", "never use `any`", "always use Result"), read the with-skill transcripts. When the subagent skipped a rule and explained why, capture the excuse verbatim into a `## Rationalization table` section at the bottom of `SKILL.md`.

Format:


Rationalization table

| Skipped rule | Verbatim excuse | Why it's wrong |
| --- | --- | --- |
| Always run the test | "the change is tiny so I'll skip" | Tiny changes still break behavior; run the test |
| Use Result instead of throw | "this is just a quick prototype" | Prototypes leak into prod; use Result anyway |


Capturing excuses verbatim — not sanitized — is the point. Future agents recognize their own pattern. Skip this step only when the skill has no rules an agent could rationalize skipping (most reference skills, some pattern skills). Technique and discipline skills almost always benefit from a rationalization table.
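If the captured excuses are collected as data first, the section can be generated rather than hand-typed. This is only an illustrative sketch: `Rationalization` and `renderRationalizationTable` are hypothetical names, and nothing in the repo is known to generate the table this way.

```typescript
// Hypothetical record for one captured excuse.
interface Rationalization {
  skippedRule: string;
  verbatimExcuse: string; // copied from the transcript, not paraphrased
  whyWrong: string;
}

// Renders the section in the three-column format shown above.
function renderRationalizationTable(rows: Rationalization[]): string {
  const header = [
    "## Rationalization table",
    "",
    "| Skipped rule | Verbatim excuse | Why it's wrong |",
    "| --- | --- | --- |",
  ];
  const body = rows.map((r) => `| ${r.skippedRule} | "${r.verbatimExcuse}" | ${r.whyWrong} |`);
  return header.concat(body).join("\n");
}

console.log(
  renderRationalizationTable([
    {
      skippedRule: "Always run the test",
      verbatimExcuse: "the change is tiny so I'll skip",
      whyWrong: "Tiny changes still break behavior; run the test",
    },
  ]),
);
```

The sketch does not escape `|` characters, so an excuse containing a pipe would need escaping before it goes into a table cell.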


8. Package


Write to `skills/<name>/`:

- `SKILL.md`: the skill body
- `evals.json`: the test definitions (already created in step 3)
- `LICENSE`: MIT (matches repo root)
- `README.md`: human-facing summary

Optional companions for non-trivial skills:

- `references/<topic>.md`: deeper rules referenced from SKILL.md
- `templates/<thing>.template`: boilerplate the skill scaffolds from

The nested `<skill>/.workspace/` directory (transcripts, grading, benchmarks) is gitignored; only `evals.json` ships with the skill.
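A quick completeness check before committing might look like the sketch below. The required file names come from the list above; `missingFiles` is a hypothetical helper that takes the directory listing as input, so it makes no assumption about how the files are read.

```typescript
// Required files per the packaging list above.
const REQUIRED_FILES = ["SKILL.md", "evals.json", "LICENSE", "README.md"];

// Hypothetical helper: given the file names present in skills/<name>/, report what is missing.
function missingFiles(present: string[]): string[] {
  return REQUIRED_FILES.filter((file) => present.indexOf(file) === -1);
}

console.log(missingFiles(["SKILL.md", "README.md", "LICENSE"])); // ["evals.json"]
```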


Examples

<example>
<input>User says: "build me a skill for parsing TOML config files"</input>
<output>
1. Discover: confirm "Should this trigger on `*.toml` files? Or any time the user mentions TOML?" Check `skills/` for overlap (none).
2. Name: propose `toml-config-parser` (kebab-case, descriptive).
3. RED: write three scenarios, (a) "parse this config.toml", (b) "validate the toml schema", (c) "convert toml to json". Without the skill, agent uses ad-hoc string parsing.
4. Frontmatter: description includes "Use when", lists 3 trigger phrases, adds "Skip when working with YAML or JSON".
5. Body: markdown sections, one `<example>` showing parse-validate-output.
6. Self-lint: name matches regex; description = 412 chars, contains "Use when"; no anti-shortcut words; XML balances.
7. GREEN: rerun RED scenarios; agent now uses zod + smol-toml.
8. Package: write `skills/toml-config-parser/{SKILL.md, README.md, LICENSE}`.
</output>
</example>

<example>
<good>
description: >-
  This skill should be used when the user wants to refactor TypeScript code
  to follow functional patterns. Common triggers include "make this
  functional", "remove the class", and "use Result instead of throw". Bakes
  in factory functions over classes, Result<T,E> over exceptions, and
  immutable state. Skip when working with framework-required classes
  (PrismaClient, etc.).
</good>
<bad>
description: >-
  This skill helps with TypeScript. First it analyzes the code, then it
  refactors it. The process involves several steps.
</bad>

The `<bad>` example fails three rules: no "Use when" phrase, no verbatim trigger phrases in quotes, and anti-shortcut words ("first", "then", "process") that cause the agent to follow them as instructions instead of treating them as triggers.
</example>


References

- `references/evals-json.md`: `evals.json` schema and assertion types
- `references/pressure-scenarios.md`: how to write good pressure scenarios per skill type
- `references/tdd-for-skills.md`: RED → GREEN → REFACTOR cycle
- `references/frontmatter.md`: frontmatter schema
- `references/naming.md`: naming rules
- `references/description.md`: description rules + anti-shortcut patterns
- `references/xml-usage.md`: when to use XML vs Markdown
- `references/lint-checklist.md`: full self-lint checklist

Templates

- `templates/SKILL.md.template`: boilerplate with placeholders
- `templates/README.md.template`: readme boilerplate
- `templates/example-skill.md`: fully-worked example skill

