cs-refactor
AI has two consistent failure modes when refactoring code independently: first, it lacks awareness of the module's actual requirements and constraints, producing modifications that are not functionally equivalent; second, it takes on a scope that exceeds its context capacity and forgets earlier constraints as it proceeds. This process inserts a scan checklist and method library between "wanting to optimize" and "starting to modify", so that the AI only undertakes tasks it can reliably complete correctly and pauses honestly for the rest.
Full process:
scan (generate optimization point checklist) → design (confirm which items to implement and the order with the user) → apply (execute item by item, with manual approval for each step)
Core Discipline: Behavioral equivalence is the bottom line. If an action will alter externally observable behavior, do not use the refactor workflow; route to feature (requirement change) or issue (bug fix).
Fastforward Mode (use for small refactors)
When changes are obviously minor — single function, single component, 1-3 optimization points, self-verifiable with tests, no need for human visual inspection — going through the full three stages is overkill. In fastforward mode, the AI directly identifies the points, aligns with the user once, modifies in place, runs tests for self-verification, and produces no scan / design / checklist artifacts.
Trigger signals: the user says "small refactor", "quick refactor", "simply optimize XX function", "modify directly", or "skip the steps".
When not to use fastforward:
- Changes span > 1 file
- Expected modification points exceed 3
- Requires visual verification (frontend effects, performance perception)
- Modifies public interfaces (needs to use Parallel Change)
- No test coverage
- Cross-module
In such cases, advise users to follow the standard process. If fastforward starts and the task becomes complex, switch back to the full process starting from scan.
Where to place files
Refactor outputs are gathered under `codestable/refactors/`, with an independent directory for each refactor:
```
codestable/
└── refactors/
    └── {YYYY-MM-DD}-{slug}/
        ├── {slug}-scan.md            ← Optimization point checklist generated in Stage 1
        ├── {slug}-refactor-design.md ← Execution plan from Stage 2 (selected items, order, verification)
        ├── {slug}-checklist.yaml     ← Generated in Stage 2, used to advance Stage 3
        └── {slug}-apply-notes.md     ← Execution records from Stage 3 (what was done in each step, verification results, deviations)
```
Directory naming aligns with feature / issue: `YYYY-MM-DD-{english-slug}`. The date is set on the day of first creation and remains unchanged; the slug uses lowercase letters, numbers, and hyphens, and should be short while clearly indicating what is being modified.
Why use a separate directory instead of mixing with features/: Refactor outputs are "scans + execution records of the current code state", which are time-sensitive and their value decays over time; feature outputs are "why this capability is designed this way", which have weak timeliness. The archiving logic is different, so mixing them will make it hard to find content later.
Three Stages
| Stage | Sub-process | Output | Lead |
|---|---|---|---|
| 1 scan | Generate optimization point checklist | {slug}-scan.md | AI scans code + runs pre-checks, user selects items |
| 2 design | Finalize execution plan | {slug}-refactor-design.md + {slug}-checklist.yaml | AI drafts, user conducts overall review |
| 3 apply | Execute item by item | Code changes + {slug}-apply-notes.md | AI executes, manual approval for each step |
There are checkpoints between stages. The scan checklist cannot enter design until the user selects items; no code modifications are made until the design is approved by the user; items marked for HUMAN verification in apply cannot proceed to the next step without user confirmation.
Stage 1: scan (Generate optimization point checklist)
First run pre-checks (7 items); stop if any are hit
Run the pre-checks before starting the scan. If any item is hit, abort the scan and provide routing suggestions; do not force a checklist. The 7 checks and output format are in `reference/refusal-routing.md` in the same directory.
Zero valid outputs — if no worthwhile optimizations are found after scanning, state this honestly instead of forcing entries.
Lock scan scope
Before starting the scan, confirm one thing with the user: which files to scan this time. Default rules:
- User specifies specific files/components → scan only those
- User says "this page" → scan the page's entry component + directly imported internal modules, do not trace public dependencies
- User says "this module" → scan files in the module directory, do not go beyond module boundaries
- Scope > 15 files or > 3000 lines → trigger the 6th pre-check, ask the user to narrow the scope
Include test files in the scope (to judge test coverage for the 2nd pre-check).
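The scope threshold above is mechanical and can be checked before scanning starts. A minimal TypeScript sketch; the `scopeTooLarge` helper and the file data are illustrative assumptions, not part of this workflow:

```typescript
// Hypothetical helper: decide whether a scan scope is small enough,
// per the "> 15 files or > 3000 lines" pre-check threshold above.
interface ScopeFile {
  path: string;
  lines: number;
}

const MAX_FILES = 15;
const MAX_LINES = 3000;

function scopeTooLarge(files: ScopeFile[]): boolean {
  const totalLines = files.reduce((sum, f) => sum + f.lines, 0);
  return files.length > MAX_FILES || totalLines > MAX_LINES;
}

// Illustrative scope: 3 files but 3200 lines total → ask the user to narrow it
const scope: ScopeFile[] = [
  { path: "src/UserCard.vue", lines: 1200 },
  { path: "src/useUser.ts", lines: 800 },
  { path: "src/api/user.ts", lines: 1200 },
];
console.log(scopeTooLarge(scope)); // true: 3200 lines exceeds 3000
```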
What to look for during scanning
Use the four-layer classification of the method library as a template to search in the code:
- L1 Behavioral Equivalence Migration Signals: A function is called in many places but its interface/implementation needs modification → candidate for Parallel Change; a whole block of old logic needs to be replaced with a new implementation → candidate for Strangler Fig
- L2 Code-level Refactoring Signals: Overly long functions (> 50 lines / cyclomatic complexity > 10), repeated conditional fragments, mysterious temporary variables, deeply nested if-else
- L3 Structure Splitting Signals: Components > 300 lines, one file handles multiple tasks, container/presentation logic mixed, identical logic written separately in multiple components (frontend); Controller directly calls DB, missing Service layer, Repository bypassed (backend)
- L4 Performance Signals: Repeated calculations (memoizable), N+1 queries, list without virtualization/pagination, event listeners not cleaned up, deep reactivity for large objects (Vue)
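The L1 Parallel Change candidate can be made concrete. A minimal sketch with hypothetical names (`getUser` / `getUserByQuery` stand in for a widely called function whose interface must change without breaking callers):

```typescript
// Parallel Change (expand → migrate → contract), sketched on a hypothetical API.
// Old signature: a positional argument. Target: an options object.

interface User {
  id: string;
  name: string;
}

// Expand: introduce the new interface alongside the old one.
function getUserByQuery(query: { id: string; includeProfile?: boolean }): User {
  // New implementation; behavior observed by existing callers must stay identical.
  return { id: query.id, name: `user-${query.id}` };
}

// Migrate: the old interface stays in place and delegates, so every existing
// caller keeps its observable behavior while call sites move over one by one.
function getUser(id: string): User {
  return getUserByQuery({ id });
}

// Contract: once grep shows no remaining getUser callers, delete the old
// function in its own commit so that step can be reverted alone.
console.log(getUser("42").name); // prints "user-42"
```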
The complete method library list lives in a reference file in the same directory and must be fully loaded as a matching table during scanning.
Output format
- Top Overview (one paragraph): Scan scope / number of findings / distribution by category / distribution by risk / recommended priority items / recommended cautious items
- Checklist Items (one markdown block per item): field order and hard constraints are in `reference/scan-checklist-format.md` in the same directory
After scanning, submit the entire `{slug}-scan.md` to the user. The user selects items to implement (mark ✓) and flags questions or rejections (mark ✗, with reasons); then proceed to Stage 2.
Do not select items on behalf of the user.
Stage 2: design (Finalize execution plan)
Input
- `{slug}-scan.md` with user selections (✓ items are to be implemented this time; ✗ items are archived for traceability)
- Method library (each selected item must map to a method ID M-Ln-NN)
Tasks
- Sort order. Items with dependencies are placed first (e.g., L1 Parallel Change often needs to run first, followed by L2 extraction). Independent items are prioritized by "low risk + AI self-verifiable", and HUMAN verification items are grouped at the end.
- Add execution details for each item: Referenced method ID, specific steps, preconditions, exit signals, verification responsible party (AI / HUMAN), rollback strategy (how to restore if problems occur).
- Identify pre-dependencies: Items with insufficient test coverage need a pre-step of "supplement characterization tests"; items modifying public interfaces need a pre-step of "search for callers".
- Overall review: Submit the full draft of `{slug}-refactor-design.md` to the user. After user approval, change the `status` in the frontmatter from `draft` to `approved`.
- Extract checklist: Extract `{slug}-checklist.yaml` from the design, with steps corresponding to execution order and checks corresponding to each step's exit signals.
Design file structure
```markdown
---
doc_type: refactor-design
refactor: {YYYY-MM-DD}-{slug}
status: draft | approved
scope: {one sentence describing scan scope}
summary: {one sentence describing the items to be implemented}
---
# {slug} refactor design

## 1. Scope of this refactor
- Which items were selected from the scan (listed by number)
- Items explicitly not implemented (marked ✗) and reasons
- Estimated total workload / total risk level

## 2. Pre-dependencies
- Test coverage supplement actions (if needed)
- Caller search actions (if needed)
- Other one-time preparations

## 3. Execution order
List by step, one block per step:
- Step N: {one sentence action}
  - Referenced method: M-Ln-NN {method name}
  - Specific operations: {apply method library steps to specific files/functions in this project}
  - Exit signals: {tests AI runs / pages HUMAN checks}
  - Verification responsibility: AI self-verification | HUMAN
  - Rollback: {how to restore if problems occur, usually git revert the step}

## 4. Risks and key points
- Summary of high-risk steps (separately highlight steps with high risk in this design)
- Error-prone points (e.g., cross-step data flow changes)
```
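The exact schema of `{slug}-checklist.yaml` is left to Stage 2. One possible shape, with steps in execution order and checks mirroring each step's exit signals — every field name and value below is an illustrative assumption, not a fixed schema:

```yaml
# Illustrative sketch only — field names are assumptions, not a fixed schema.
refactor: "{YYYY-MM-DD}-{slug}"
steps:
  - id: 1
    action: "Supplement characterization tests for the target function"
    method: M-L2-03        # method ID from the library (illustrative)
    verify: AI             # AI self-verification | HUMAN
    checks:
      - "test suite passes"
      - "grep shows no residual old references"
    status: pending        # pending | done
  - id: 2
    action: "Parallel Change on the public interface"
    method: M-L1-01
    verify: HUMAN
    checks:
      - "user visually confirms the affected page"
    status: pending
```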
Stage 3: apply (Execute item by item)
Advancement rules
- One step at a time, no batch operations. Strictly follow the checklist order; do not start the next step until the current step is completed.
- Verify after each step:
- AI self-verification items: Run specified tests / type checks / lint / grep for no residual old references. Record in apply-notes if passed, then proceed to the next step.
- HUMAN verification items: Pause, report "Step N has been completed, please visually confirm at {specific page / operation steps}, I will continue after confirmation". Do not proceed without explicit "continue" from the user.
- Record deviations immediately: If unconsidered situations are found during execution (e.g., a caller in dynamic import), pause and report, do not act on your own. Align with the user, add the deviation to apply-notes, and return to Stage 2 to modify the design if necessary.
- Self-check for behavioral equivalence: After each step, ask yourself — "Could this step change externally observable behavior?". If in doubt, roll back the step and do not proceed.
apply-notes format
```markdown
---
doc_type: refactor-apply-notes
refactor: {YYYY-MM-DD}-{slug}
---
# {slug} apply notes

## Step 1: {action}
- Completion time: {date}
- Modified files: {file list}
- Verification result: {test output / HUMAN confirmation quote}
- Deviations: {none / specific description}

## Step 2: ...
```
After completion
- Run full tests + type checks + lint
- Ask the user for a final overall visual confirmation (especially for frontend: open key pages and test functions)
- After confirmation, finalize the commit, with the commit message referencing the refactor directory
Exit conditions
Common pitfalls
- AI forces checklist entries: Clearly hits pre-checks but finds excuses to bypass, and generates entries with non-quantifiable issues like "code can be more elegant" — should pause immediately and provide routing suggestions
- Includes behavior changes: "Incidentally fixed a bug / optimized a prompt" during refactoring — should pause and split into an independent issue or feature
- Combines cross-step actions: Submits 2-3 steps in one commit for speed — loses the ability to roll back a single step if problems occur
- Includes preference items: Naming preferences, quotes, arrow functions vs function — these go to decisions, not refactor
- Directly starts scanning a large module: Enters scan without splitting a scope of >15 files / >3000 lines, resulting in an unmanageable long checklist
- Skips HUMAN verification items: Frontend effects cannot be seen by AI; cannot replace manual visual inspection with "type checks passed"
- Proceeds with insufficient coverage: Modifies modules without tests, with "behavioral equivalence" only a verbal promise
Boundaries with adjacent workflows
- feature: Adding new capabilities / modifying requirements → feature. If "incidentally implement X" comes up during refactoring, pause and split it out.
- issue: Fixing bugs / correcting behavior → issue. Bugs found during refactoring should be recorded as new issues, not secretly fixed in the current PR.
- decisions: Project-wide long-term constraints ("use composable from now on", "disable mixin") → decisions. Refactor can reference existing decisions as basis, but does not produce decisions.
- architecture: Cross-module boundary restructuring / layer adjustment → architecture + decisions. A single refactor does not cross modules; cross-module work should be split into "update architecture documentation + record decisions + N module-level refactors".
- tricks / learning: Reusable techniques found during refactoring → tricks; pitfalls encountered → learning.
Related documents
- `codestable/reference/system-overview.md` — CodeStable system overview
- — Ultra-light channel for small refactors
- `reference/scan-checklist-format.md` — Fields, order, hard constraints, and anti-pattern samples for scan checklist items
- `reference/refusal-routing.md` — 7 scan pre-checks + routing table + rejection output format
- — Refactor method library (four-layer classification L1-L4, unified fields)
- `codestable/reference/shared-conventions.md` — Shared conventions across workflows
- Project architecture entry — Review before scanning to confirm module boundaries