PR Mining Coordinator Skill

Operator Context

This skill operates as an operator for PR mining coordination workflows, configuring Claude's behavior for background job management and tribal knowledge extraction. It implements the Pipeline architectural pattern -- Validate, Mine, Verify, Generate, Report -- with Domain Intelligence embedded in the mining methodology.

Hardcoded Behaviors (Always Apply)

CLAUDE.md Compliance: Read and follow repository CLAUDE.md before execution
Over-Engineering Prevention: Only implement what's directly requested. No speculative features
Background Execution: Mining jobs always run in background with
```
&
```

GitHub Token from Keychain: Uses

security find-internet-password -s github.com -w

Process Tracking: Always store and monitor background job PIDs
Sequential by Default: Run mining jobs one at a time to avoid API rate limits

Default Behaviors (ON unless disabled)

Communication Style: Report facts without self-congratulation. Show command output, not descriptions
Temporary File Cleanup: Remove coordination files and debug outputs at completion. Keep only mining results (JSON) and generated rules (markdown)
Progress Reporting: Show mining job progress every 30-60 seconds
Auto Rules Generation: Generate categorized markdown rules after successful mining
Error Detection: Monitor for API rate limits, auth failures, empty results
Confidence Scoring: Calculate HIGH/MEDIUM/LOW confidence for patterns

Optional Behaviors (OFF unless enabled)

Concurrent Mining: Run multiple repos simultaneously (risk: rate limits)
Historical Analysis: Mine specific date ranges with --since/--until flags
All Comments Mode: Use --all-comments for senior reviewers (default: imperative only)
Cross-Repo Merging: Combine patterns from multiple mining results into unified rules

What This Skill CAN Do

Coordinate background PR mining jobs with the pr-miner tool
Track running jobs and report progress to user
Generate categorized coding rules documents from mined data
Calculate pattern confidence from occurrence frequency
Handle API rate limits, auth failures, and empty result sets

What This Skill CANNOT Do

Mine without a valid GitHub token
Run multiple mining jobs in parallel (sequential by default)
Perform code review (use code-review skill instead)
Write coding standards from scratch without PR data
Skip prerequisite validation or result verification

Instructions

Phase 1: VALIDATE

Goal: Confirm prerequisites before starting any mining operation.

Step 1: Check miner script exists

bash

fish -c "ls ~/.claude/skills/pr-miner/scripts/miner.py"

Expected: File exists at path.

Step 2: Verify GitHub token

bash

fish -c "security find-internet-password -s github.com -w 2>/dev/null"

Expected: Token printed (ghp_...).

Step 3: Verify reviewer username (if filtering by reviewer)

bash

fish -c "gh pr list --repo {org/repo} --search 'reviewed-by:{username}' --limit 5"

Expected: PR results confirm username is valid and active.

Gate: Miner script exists, token available, reviewer verified. Proceed only when gate passes.

Phase 2: MINE

Goal: Execute mining job in background and track progress.

Step 1: Start mining job

bash

fish -c "set -x GITHUB_TOKEN (security find-internet-password -s github.com -w 2>/dev/null) && \
  cd ~/.claude/skills/pr-miner && \
  ./venv/bin/python3 scripts/miner.py {repos} mined_data/{output}.json {flags} --summary" &

Output naming:

{reviewer}_{repos}_{YYYY-MM-DD}.json

{repos}_all_{YYYY-MM-DD}.json

See

references/mining-commands.md

for full command patterns and flag reference.

Step 2: Track progress

Monitor background job with BashOutput tool. Check every 30-60 seconds. Report progress to user.

Step 3: Handle multiple repos

Run jobs sequentially. Wait for each to complete before starting next.

Gate: Mining job completes with non-zero interaction count. Proceed only when gate passes.

Phase 3: VERIFY

Goal: Confirm mining output is valid and contains usable data.

Step 1: Check output file exists and has content

bash

fish -c "cat ~/.claude/skills/pr-miner/mined_data/{output}.json | head -50"

Step 2: Validate structure

Confirm JSON matches expected schema:

json

{
  "metadata": {
    "repos": ["org/repo"],
    "reviewer": "username",
    "mined_at": "2025-11-29T10:30:00Z",
    "pr_count": 100,
    "interaction_count": 36
  },
  "interactions": [
    {
      "pr_number": 123,
      "pr_title": "Add feature X",
      "comment": "Use errors.Is() instead of comparing error strings",
      "code_before": "if err.Error() == \"not found\" {",
      "code_after": "if errors.Is(err, ErrNotFound) {"
    }
  ]
}

interaction_count

is 0, do not proceed -- see Error Handling for "0 interactions found".

Step 3: Check interaction quality

Verify interactions have: pr_number, pr_title, comment text, and ideally code_before/code_after pairs. Interactions without code pairs can still produce rules but lack concrete examples.

Gate: Output JSON is valid, contains interactions with usable data. Proceed only when gate passes.

Phase 4: GENERATE

Goal: Produce categorized coding rules document from mined data.

Step 1: Load and categorize patterns

Read mined JSON. Categorize interactions by topic using standard categories from

references/pattern-categories.md

Step 2: Score confidence

Level	Criteria	Action
HIGH	5+ occurrences from senior reviewers	Include as standard practice
MEDIUM	2-4 occurrences	Include with context caveats
LOW	Single occurrence	Place in "Additional Observations"

Step 3: Generate markdown rules document

Follow this structure for each pattern entry:

markdown

## {Category Name}

### {Pattern Name} ({CONFIDENCE} confidence)

**Pattern**: {Brief description}

**Good**:
\`\`\`{lang}
{good_example_code}
\`\`\`

**Bad**:
\`\`\`{lang}
{bad_example_code}
\`\`\`

**Rationale**: From PR #{number} review by {reviewer}:
"{comment_text}"

Order categories by total pattern count (most patterns first). Within each category, sort HIGH before MEDIUM before LOW.

Step 4: Save rules

bash

fish -c "cat > ~/.claude/skills/pr-miner/rules/{repos}_coding_rules.md"

Gate: Rules document is categorized, confidence-scored, and saved to disk.

Phase 5: REPORT

Goal: Deliver comprehensive results to user.

Provide:

PRs analyzed count
Interactions extracted count
File paths for mined data and generated rules
Top HIGH confidence patterns with occurrence counts
Summary of MEDIUM and LOW confidence pattern counts

Gate: User has all information needed to act on mining results.

Examples

Example 1: Mine Specific Reviewer

User says: "Mine senior-reviewer's patterns from go-libs" Actions:

Verify miner, token, and reviewer username (VALIDATE)
Run mining with --reviewer and --all-comments flags (MINE)
Check output JSON for valid interactions (VERIFY)
Categorize patterns and generate rules markdown (GENERATE)
Report top patterns and file locations (REPORT) Result: Categorized coding rules with confidence scores

Example 2: Team Standards Extraction

User says: "Get coding standards from service-a and service-b" Actions:

Verify miner and token, no reviewer to verify (VALIDATE)
Run mining without --reviewer to capture all reviewers (MINE)
Confirm output has interactions from multiple reviewers (VERIFY)
Generate team-wide rules document (GENERATE)
Report findings with reviewer distribution (REPORT) Result: Team-wide coding rules across both repositories

Error Handling

Error: "API rate limit exceeded"

Cause: GitHub API 5000 requests/hour exhausted by mining operations Solution:

Report remaining quota and reset time to user
Stop current job if rate limit is critically low (<150 remaining)
Wait for reset or cancel and retry later
For future runs: reduce --limit or mine fewer repos per job

Error: "Authentication failed"

Cause: GitHub token expired, revoked, or missing from keychain Solution:

Run

fish -c "security find-internet-password -s github.com -w 2>/dev/null"

to check token

If empty: token not in keychain. User must add it
If present but rejected: token expired or lacks repo scope
Guide user to update token with
```
security add-internet-password
```

Error: "0 interactions found"

Cause: Wrong reviewer username, no PR activity, or date range too narrow Solution:

Verify reviewer username with

gh pr list --search 'reviewed-by:{username}'

Re-run without --reviewer to confirm data exists
Widen date range by removing --since/--until
Check if repo has PR review comments (not just approvals)

Error: "Mining job timeout (>5 min)"

Cause: Large repo, many PRs, or slow API responses Solution:

Report current progress to user
Continue monitoring -- mining is still running
If stuck: check for network issues or API downtime
For future runs: reduce --limit to smaller batches

Anti-Patterns

Anti-Pattern 1: Mining Without Verifying Reviewer Username

What it looks like: Running

--reviewer senior-reviewer

without checking the actual GitHub username Why wrong: Job completes successfully with 0 interactions. Wastes API quota and 5-10 minutes. Username errors are silent. Do instead: Verify username with

gh pr list --search 'reviewed-by:{username}'

before mining.

Anti-Pattern 2: Running Multiple Mining Jobs in Parallel

What it looks like: Starting 3+ mining jobs simultaneously to save time Why wrong: Exhausts 5000 requests/hour rate limit across all jobs. Later jobs fail mid-execution. Cannot track which job consumed quota. Do instead: Run jobs sequentially. Wait for each to complete before starting the next.

Anti-Pattern 3: Generating Flat Rules Without Categorization

What it looks like: A numbered list of 50 patterns with no organization or confidence scoring Why wrong: Overwhelming to read. No way to find relevant patterns. Loses priority context. Do instead: Categorize by topic (Error Handling, Testing, API Design, etc.) and sort by confidence level within each category. See

references/pattern-categories.md

Anti-Pattern 4: Skipping --all-comments for Senior Reviewers

What it looks like: Mining a senior reviewer without the --all-comments flag and getting 0-2 interactions Why wrong: Senior reviewers use questions ("Why not use errors.Is here?") and suggestions instead of imperatives. Default mode misses the majority of their feedback. Do instead: Always use

--all-comments

when mining senior or experienced reviewers.

Anti-Pattern 5: Testing Multi-Repo Mining Without Single-Repo Validation

What it looks like: Mining 5 repos at once on the first attempt without verifying any individually Why wrong: If any repo has access issues, entire job fails after minutes of wasted time. Cannot determine which repo caused failure. Do instead: Test with a single repo and

--limit 10

first. Expand incrementally after confirming access.

References

This skill uses these reference files:

```
${CLAUDE_SKILL_DIR}/references/mining-commands.md
```
: Command patterns, flag reference, output naming conventions
```
${CLAUDE_SKILL_DIR}/references/pattern-categories.md
```
: Standard categories for coding rules (10 categories with examples)

${CLAUDE_SKILL_DIR}/references/reviewer-usernames.md

: Known GitHub usernames and verification methods

Domain-Specific Anti-Rationalization

Rationalization	Why It's Wrong	Required Action
"Username is probably right"	Probably = 0 interactions after 5 min	Verify with gh pr list first
"Parallel mining saves time"	Saves nothing when rate limit kills jobs	Run sequentially
"Just dump all patterns"	Flat lists are unusable at 50+ items	Categorize and score confidence
"Low limit is enough"	Small samples produce low-confidence rules	Use --limit 100+ for meaningful patterns

pr-mining-coordinator

NPX Install

Tags

SKILL.md Content

PR Mining Coordinator Skill

Operator Context

Hardcoded Behaviors (Always Apply)

Default Behaviors (ON unless disabled)

Optional Behaviors (OFF unless enabled)

What This Skill CAN Do

What This Skill CANNOT Do

Instructions

Phase 1: VALIDATE

Phase 2: MINE

Phase 3: VERIFY

Phase 4: GENERATE

Phase 5: REPORT

Examples

Example 1: Mine Specific Reviewer

Example 2: Team Standards Extraction

Error Handling

Error: "API rate limit exceeded"

Error: "Authentication failed"

Error: "0 interactions found"

Error: "Mining job timeout (>5 min)"

Anti-Patterns

Anti-Pattern 1: Mining Without Verifying Reviewer Username

Anti-Pattern 2: Running Multiple Mining Jobs in Parallel

Anti-Pattern 3: Generating Flat Rules Without Categorization

Anti-Pattern 4: Skipping --all-comments for Senior Reviewers

Anti-Pattern 5: Testing Multi-Repo Mining Without Single-Repo Validation

References

Domain-Specific Anti-Rationalization