Found 141 Skills
Evaluate the quality of the CAW (Cobo Agentic Wallet) Agent in local Claude Code and generate scoring data and analysis reports. Use when users want to run a CAW evaluation, conduct an evaluation, test a Skill, assess Agent quality, generate evaluation reports, or say "run evaluation", "evaluate CAW", "eval", or "score". For weak model / openclaw evaluation, use caw-eval-openclaw (only installed on openclaw servers).
Review changes for regressions, contract mismatches, quality gaps, and missing validation evidence.
Deep diagnostic of Claude/SDD configuration. Read-only. Produces audit-report.md consumed by /project-fix. Trigger: /project-audit, audit project, review claude config, project health check.
Assess documentation quality across readability, consistency, audience fit, and prose clarity. Produces a scored review with actionable findings. This skill should be used before releases, during doc reviews, or when documentation feels unclear or inconsistent.
Reflect on the previous response and output, using a self-refinement framework for iterative improvement with complexity triage and verification.
Performs ARA Seal Level 2 semantic epistemic review on Agent-Native Research Artifacts, scoring six dimensions (evidence relevance, falsifiability, scope calibration, argument coherence, exploration integrity, methodological rigor) and producing a constructive, severity-ranked report with a Strong Accept-to-Reject recommendation. Use after Level 1 structural validation passes, when an ARA needs an objective epistemic critique before publication or release.
Deep formal test smell audit based on academic research taxonomy (testsmells.org). Detects 19 categorized smell types — conditional logic, mystery guests, sensitive equality, eager tests, and more — with calibrated severity and research-backed remediation. Use for comprehensive test suite health assessments. For a quick pragmatic review, use test-anti-patterns instead. DO NOT USE FOR: writing new tests (use writing-mstest-tests), evaluating assertion quality specifically (use assertion-quality), or finding test duplication and boilerplate (use exp-test-maintainability).
Structural review of documents for gaps, clarity, completeness, and organization. Use when a brainstorm, plan, spec, ADR, or any doc needs polish before the next workflow step. For exploring new ideas from scratch, use brainstorming instead.
Analyze datasets to discover patterns, anomalies, and relationships. Use when exploring data files, generating statistical summaries, checking data quality, or creating visualizations. Supports CSV, Excel, JSON, Parquet, and more.
Evaluates agent skills against Anthropic's best practices. Use when asked to review, evaluate, assess, or audit a skill for quality. Analyzes SKILL.md structure, naming conventions, description quality, content organization, and identifies anti-patterns. Produces actionable improvement recommendations.
Retrieve git records for specified users and days, perform a code review of each commit, and generate detailed code review reports.
Review current git changes, compare them to the related plan if one exists, and identify bad engineering, over-engineering, or suboptimal solutions. Use when the user asks to review changes, check the git diff, validate implementation quality, or assess code changes.