Use when CI tests fail on main branch after PR merge, or when investigating flaky test failures in CI environments
npx skill4agent add payloadcms/payload triage-ci-flake
Commands used in this workflow (against the main branch): pnpm dev $SUITE_NAME, pnpm prepare-run-test-against-prod, pnpm dev:prod $SUITE_NAME
digraph triage_ci {
"CI failure reported" [shape=box];
"Extract details from CI logs" [shape=box];
"Identify suite and test name" [shape=box];
"Run dev server: pnpm dev $SUITE" [shape=box];
"Run specific test by name" [shape=box];
"Did test fail?" [shape=diamond];
"Debug with dev code" [shape=box];
"Run prepare-run-test-against-prod" [shape=box];
"Run: pnpm dev:prod $SUITE" [shape=box];
"Run specific test again" [shape=box];
"Did test fail now?" [shape=diamond];
"Debug bundling issue" [shape=box];
"Unable to reproduce - check logs" [shape=box];
"Fix and verify" [shape=box];
"CI failure reported" -> "Extract details from CI logs";
"Extract details from CI logs" -> "Identify suite and test name";
"Identify suite and test name" -> "Run dev server: pnpm dev $SUITE";
"Run dev server: pnpm dev $SUITE" -> "Run specific test by name";
"Run specific test by name" -> "Did test fail?";
"Did test fail?" -> "Debug with dev code" [label="yes"];
"Did test fail?" -> "Run prepare-run-test-against-prod" [label="no"];
"Run prepare-run-test-against-prod" -> "Run: pnpm dev:prod $SUITE";
"Run: pnpm dev:prod $SUITE" -> "Run specific test again";
"Run specific test again" -> "Did test fail now?";
"Did test fail now?" -> "Debug bundling issue" [label="yes"];
"Did test fail now?" -> "Unable to reproduce - check logs" [label="no"];
"Debug with dev code" -> "Fix and verify";
"Debug bundling issue" -> "Fix and verify";
}
Example suite names: i18n, fields, lexical. Example test file: test/i18n/e2e.spec.ts
# ========================================
# STEP 2A: STOP ALL SERVERS
# ========================================
lsof -ti:3000 | xargs kill -9 2>/dev/null || echo "Port 3000 clear"
# ========================================
# STEP 2B: START DEV SERVER
# ========================================
# Start dev server with the suite (in background with run_in_background=true)
pnpm dev $SUITE_NAME
# ========================================
# STEP 2C: WAIT FOR SERVER READY
# ========================================
# Wait for server to be ready (REQUIRED - do not skip)
until curl -s http://localhost:3000/admin > /dev/null 2>&1; do sleep 1; done && echo "Server ready"
# ========================================
# STEP 2D: RUN SPECIFIC TEST
# ========================================
# Run ONLY the specific failing test using Playwright directly
# For E2E tests (DO NOT use pnpm test:e2e as it spawns its own server):
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts -g "exact test name"
# For integration tests:
pnpm test:int $SUITE_NAME -t "exact test name"
# ========================================
# STEP 3A: STOP ALL SERVERS (INCLUDING DEV SERVER FROM STEP 2)
# ========================================
lsof -ti:3000 | xargs kill -9 2>/dev/null || echo "Port 3000 clear"
# ========================================
# STEP 3B: BUILD AND PACK FOR PROD
# ========================================
# Build all packages and pack them (this takes time - be patient)
pnpm prepare-run-test-against-prod
# ========================================
# STEP 3C: START PROD SERVER
# ========================================
# Start prod dev server (in background with run_in_background=true)
pnpm dev:prod $SUITE_NAME
# ========================================
# STEP 3D: WAIT FOR SERVER READY
# ========================================
# Wait for server to be ready (REQUIRED - do not skip)
until curl -s http://localhost:3000/admin > /dev/null 2>&1; do sleep 1; done && echo "Server ready"
# ========================================
# STEP 3E: RUN SPECIFIC TEST
# ========================================
# Run the specific test again using Playwright directly
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts -g "exact test name"
# OR for integration tests:
pnpm test:int $SUITE_NAME -t "exact test name"
for i in {1..10}; do pnpm test:e2e...; done
Assertions and waits: toBeVisible(), toHaveText(), waitForFunction()
Cleanup hooks and helpers: afterEach, deleteAll
Timing: setTimeout, sleep, waitForPageStability()
Lint rules: playwright/no-networkidle (waitForLoadState('networkidle')), payload/no-wait-function (wait()), payload/no-flaky-assertions, playwright/prefer-web-first-assertions
# Ensure dev server is running on port 3000
# Run test multiple times to confirm stability
for i in {1..10}; do
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts -g "exact test name" || break
done
# Run full suite
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts
# If you modified bundled code, test with prod build
lsof -ti:3000 | xargs kill -9 2>/dev/null
pnpm prepare-run-test-against-prod
# Start prod server (in background with run_in_background=true)
pnpm dev:prod $SUITE_NAME
until curl -s http://localhost:3000/admin > /dev/null; do sleep 1; done
pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts
| Rationalization | Reality |
|---|---|
| "The logs show the exact error" | Logs show symptoms, not root cause. Reproduce. |
| "I can see the problem in the code" | You're guessing. Reproduce to confirm. |
| "This is obviously a race condition" | Maybe. Reproduce to be sure. |
| "I've seen this error before" | This might be different. Reproduce. |
| "The stack trace is clear" | Stack trace shows where, not why. Reproduce. |
| "Time pressure - need to fix fast" | Reproducing IS fast. Guessing wastes time. |
| "The test file shows the issue" | Reading ≠ running. Execute the commands. |
| "I'll analyze the code first" | Code analysis comes AFTER reproduction. |
| "Let me investigate the root cause" | Reproduction IS the investigation. |
| "I need to understand the error" | Understanding comes from seeing it fail. |
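For a flaky test, a single local pass does not show that the test cannot fail. One way to make a reproduction attempt more informative is to repeat the command and count the failures instead of stopping at the first result. The sketch below is an illustration only: `count_failures` is a hypothetical helper, not part of the Payload repo, and the run count in the usage line is an arbitrary choice.

```shell
# Hypothetical helper: run a command N times and print how many runs failed.
# Usage: count_failures <runs> <command> [args...]
count_failures() {
  local runs=$1
  shift
  local failures=0
  local i=1
  while [ "$i" -le "$runs" ]; do
    # Send the command's stdout to stderr so stdout carries only the count.
    if ! "$@" 1>&2; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
}
```

With the dev server already running on port 3000, a call might look like `count_failures 10 pnpm exec playwright test test/$SUITE_NAME/e2e.spec.ts -g "exact test name"`. A result of 0 over several runs is still weaker evidence than a reproduced failure, so the prod-build step remains worth running.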
User: CI failed on i18n suite - https://github.com/.../actions/runs/XXX
Agent:
1. Extracted details from CI logs:
- Suite: i18n
- Test: "ensure translations update correctly when switching language"
- File: test/i18n/e2e.spec.ts:124
- Error: "Error loading language: en" TypeError: network error
2. Executing reproduction workflow:
[Uses Bash tool to kill ports]
$ lsof -ti:3000 | xargs kill -9 2>/dev/null || echo "Port 3000 clear"
[Uses Bash tool with run_in_background=true]
$ pnpm dev i18n
[Waits for server to start with curl check]
$ until curl -s http://localhost:3000/admin > /dev/null; do sleep 1; done && echo "Server ready"
[Uses Bash tool to run Playwright directly]
$ pnpm exec playwright test test/i18n/e2e.spec.ts -g "ensure translations update correctly when switching language"
Output shows: Test PASSED locally
3. Test passed with dev code, trying bundled code:
[Kills ports again]
$ lsof -ti:3000 | xargs kill -9 2>/dev/null
[Uses Bash tool]
$ pnpm prepare-run-test-against-prod
[Uses Bash tool with run_in_background=true]
$ pnpm dev:prod i18n
[Waits for server]
$ until curl -s http://localhost:3000/admin > /dev/null; do sleep 1; done
[Uses Bash tool]
$ pnpm exec playwright test test/i18n/e2e.spec.ts -g "ensure translations update correctly when switching language"
Output shows: Test FAILED - reproduced!
4. Now that I've reproduced it, analyzing the failure...
[Code analysis, root cause identification]
5. Implementing fix: [specific changes]
6. Verification:
[Uses Bash tool]
$ for i in {1..10}; do pnpm exec playwright test test/i18n/e2e.spec.ts -g "ensure translations update correctly when switching language" || break; done
All 10 runs passed!
User: CI failed on i18n suite
Agent: Looking at the error logs, I can see this is a race condition...
[Analyzes code, proposes fix without running any commands]
❌ WRONG - This violates the skill. Must execute reproduction commands first.
| Mistake | Fix |
|---|---|
| Running full test suite first | Run specific test by name |
| Skipping dev code reproduction | Always try dev code first |
| Not testing with bundled code | If dev passes, test with pnpm dev:prod |
| Proposing fix without reproducing | Follow the workflow - reproduce first |
| Using fixed delays (setTimeout, sleep) | Use condition-based waiting with waitForFunction() |
| Adding arbitrary wait() calls | Use Playwright's built-in assertions and waits |
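Preferring a condition over a fixed delay also applies to the shell steps in this workflow. The `until curl ...; do sleep 1; done` loops do wait on a condition, but they never give up if the server fails to start. A bounded variant might look like the sketch below. `wait_until` is a hypothetical helper name, and the 120-second limit in the usage line is an assumed value, not something the skill specifies.

```shell
# Hypothetical helper: poll a command once per second until it succeeds
# or the timeout (in seconds) is reached.
# Usage: wait_until <timeout_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 on timeout.
# The timeout counts sleep seconds only, so real elapsed time can be
# longer if the polled command itself is slow.
wait_until() {
  local timeout=$1
  shift
  local waited=0
  until "$@" > /dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

For example, `wait_until 120 curl -s http://localhost:3000/admin && echo "Server ready"` behaves like the original loop when the server comes up, and exits with an error message instead of hanging when it does not.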
The fix has been verified and is ready for review. Would you like me to create a PR with these changes?
Summary of changes:
- [List files modified]
- [Brief description of the fix]
- [Verification results]