Benchmark CodeGraph retrieval quality on a real codebase by comparing agent behavior with and without CodeGraph. Use when the user runs /agent-eval or asks to test, benchmark, audit, or validate a codegraph version (the local dev build or a published npm version) against a repo for a given language.
`npx skill4agent add colbymchenry/codegraph agent-eval`

Referenced: `scripts/agent-eval/`, `tmux`, `claude`, `node`, `git`

- [ ] 1. Pick version (local or npm)
- [ ] 2. Pick language
- [ ] 3. Pick repo by size
- [ ] 4. Pick harness (headless / tmux / both)
- [ ] 5. Run audit.sh in the background
- [ ] 6. Report results

Code references from the step details, in page order:

- `AskUserQuestion`, `0.7.10`, `local`, `latest`, `0.7.10`
- `.claude/skills/agent-eval/corpus.json`, `AskUserQuestion`, `excalidraw — Medium (~600 files)`, `repo`, `question`
- `AskUserQuestion`, `headless`, `claude -p`, `tmux`, `all`
- `scripts/agent-eval/audit.sh <VERSION> <repo-name> <repo-url> "<question>" <MODE>`
- `parse-run.mjs`, `Read`, `parse-session.mjs`
- `VERDICT: codegraph_explore used Nx | Read N | Grep/Bash N`, `TOKENS:`
- `audit.sh`, `.codegraph`, `audit.sh`, `codegraph`, `local-install.sh`, `/tmp/codegraph-corpus`
- `corpus.json`, `name`, `repo`, `size`, `files`, `question`
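When reporting results, it can help to turn the `VERDICT:` summary line into numbers that are easy to compare across runs. The following is a minimal sketch, not the skill's own parser: `parseVerdict` is a hypothetical helper and is not part of `parse-run.mjs` or `parse-session.mjs`. It assumes each `N` in the format `VERDICT: codegraph_explore used Nx | Read N | Grep/Bash N` is a non-negative integer. The sample lines are made up for illustration.

```javascript
// Hypothetical helper: parse one VERDICT summary line into tool-call counts.
// Assumes the shape "VERDICT: codegraph_explore used Nx | Read N | Grep/Bash N"
// with each N a non-negative integer. Returns null for any other line.
function parseVerdict(line) {
  const pattern =
    /^VERDICT:\s*codegraph_explore used (\d+)x\s*\|\s*Read (\d+)\s*\|\s*Grep\/Bash (\d+)$/;
  const match = pattern.exec(line.trim());
  if (match === null) return null;
  return {
    codegraphExplore: Number(match[1]), // times codegraph_explore was called
    read: Number(match[2]),             // Read tool calls
    grepOrBash: Number(match[3]),       // Grep/Bash tool calls
  };
}

// Made-up sample lines, not real benchmark output.
const sampleA = parseVerdict(
  "VERDICT: codegraph_explore used 3x | Read 2 | Grep/Bash 0"
);
const sampleB = parseVerdict(
  "VERDICT: codegraph_explore used 0x | Read 9 | Grep/Bash 14"
);
console.log(sampleA, sampleB);
```

Returning `null` for non-matching lines lets a caller skip other output, such as a `TOKENS:` line, without throwing.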