Implement Conformance Testing Script
This skill produces a single executable script that runs the conformance tests for a generated build folder, following a consistent, language-agnostic pattern.
The reference implementations are:
- assets/run_conformance_tests_java.sh — Java, install-inline variant.
- assets/run_conformance_tests_python.sh — Python, install-inline variant.
- assets/run_conformance_tests_<lang>.ps1 — Windows PowerShell equivalents.
Read both before writing anything — every script you produce must be a faithful translation of the same pattern into the target language's tooling and the user's shell environment.
How conformance scripts differ from unit-test scripts
A conformance script is structurally very close to a unit-test script (see the sibling skill implement-unit-testing-script) but with two important differences:
- Two positional arguments instead of one. A conformance script takes both the build folder (source under test) and a separate conformance tests folder (the tests to execute against that build).
- Tests are loaded from outside the working folder. The build is staged into .tmp/<lang>_$1 and the script cds into it, but the test command is pointed at the original $current_dir/<conformance_tests_folder>. Tests are never copied into the staging area.
Everything else — toolchain check, build staging, dependency isolation, exit codes — is the same.
Variant decision: install-inline vs. activate-only
Before writing anything, decide which variant to emit. Both variants share toolchain check, arg validation, cwd capture, test execution, and exit-code handling; they differ only in the middle (steps 4–7 of the pattern below).
| Look for an existing | Emit |
|---|---|
| prepare_environment_<lang>.sh / prepare_environment_<lang>.ps1 in the project (or wherever the project configuration's prepare-environment-script: key points) | Activate-only variant. Verifies the prepared env, activates it, and runs tests. Does not stage the build or install deps; prepare already did. |
| Nothing (no prepare script) | Install-inline variant. Stages the build, installs deps, and runs tests in one shot. |
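The file-lookup half of this decision can be sketched as below. This is only an illustration: `choose_variant` and its `scripts_dir` parameter are hypothetical names, and the sketch does not read any prepare-environment-script: configuration key.

```shell
# Sketch: pick the variant by looking for a prepare script.
# ASSUMPTION: scripts_dir is wherever the project keeps its scripts (hypothetical).
choose_variant() {
  local scripts_dir="$1" lang="$2"
  if [ -f "$scripts_dir/prepare_environment_${lang}.sh" ] ||
     [ -f "$scripts_dir/prepare_environment_${lang}.ps1" ]; then
    echo "activate-only"
  else
    echo "install-inline"
  fi
}
```

The function only reports which variant applies; emitting the script itself still follows the steps described in The Pattern.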
Why this split exists
The conformance runner is invoked once per functional spec by the renderer. Each functional spec in a module has its own conformance_tests/<module>/<spec>/ folder, and after the renderer finishes generating code for a new spec, it runs the conformance tests of every previous spec in the same module to detect regressions. For a module with N functional specs, this script is called on the order of N times per render, not once per render.
That per-spec invocation pattern is what makes the install step expensive. A naive runner that wipes, stages, and installs from scratch on every invocation pays the install cost N times per render. For anything beyond a toy project, that cost dominates wall-clock time.
The two variants are a direct response to this:
- Install-inline is correct only when N is small (a few specs) or dependencies are cheap. It is self-contained: stage, install, run, repeat from scratch every invocation.
- Activate-only is the production answer. prepare_environment_<lang> runs once per render and pays the install cost a single time, populating the working folder with the warmed environment. Each of the N conformance invocations then just attaches to that working folder and runs the tests: no install, no compile, just activate-and-go.
Why picking the right variant matters: if you emit the install-inline variant alongside an existing prepare script, prepare's work is wiped (by the script's working-folder wipe) or duplicated (by re-running install) on every run, defeating prepare's whole purpose. Conversely, emitting activate-only without a prepare script means the "verify prepared environment" check fails on every run because nothing has populated the working folder. See Anti-Patterns.
Pick the Shell First
Before writing anything, decide which shell flavor the script must target — it depends on the user's environment, not on the language:
- Bash (.sh): macOS, Linux, WSL, CI runners on Linux. Default unless the user is on native Windows.
- PowerShell (.ps1): native Windows / PowerShell-only environments.
If you can't tell from the project (no obvious OS hints, no existing scripts), ask the user.
The same pattern applies to both. Only the syntax changes.
The Pattern
Steps 1–3 and step 8 are identical in both variants. Steps 4–7 differ — pick the subsection below that matches the variant you decided on.
Common steps (both variants)
- Toolchain check. Verify that the required language runtime / build tool (and the required version, if any) is installed. If not, print an error and exit with the unrecoverable-error code (UNRECOVERABLE_ERROR_EXIT_CODE in the reference scripts).
- Argument validation. Require two positional arguments: the build folder and <conformance_tests_folder>. If either is missing, print usage and exit with the unrecoverable-error code.
- Capture original cwd. Store the current working directory in a variable (current_dir in Bash / $currentDir in PowerShell) before changing directories; the test command in step 8 needs it to resolve the conformance tests folder.
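A minimal Bash sketch of steps 1–3 follows. The function names are hypothetical, `<build_folder>` in the usage line is an illustrative placeholder name, and the numeric value assigned to UNRECOVERABLE_ERROR_EXIT_CODE is a placeholder only: take the real value from the reference scripts.

```shell
# ASSUMPTION: 99 is a placeholder, not the real value; copy it from the reference scripts.
UNRECOVERABLE_ERROR_EXIT_CODE=99

# Step 1: succeed only if the given tool is on PATH.
check_toolchain() {
  local tool="$1"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "Error: $tool is not installed or not on PATH." >&2
    return "$UNRECOVERABLE_ERROR_EXIT_CODE"
  fi
}

# Step 2: require both positional arguments.
validate_args() {
  if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
    echo "Usage: run_conformance_tests_<lang>.sh <build_folder> <conformance_tests_folder>" >&2
    return "$UNRECOVERABLE_ERROR_EXIT_CODE"
  fi
}

# Step 3: remember the invocation directory before any cd happens.
capture_cwd() {
  current_dir="$(pwd)"
}
```

In a real script these would be chained at the top, for example `check_toolchain python3 || exit $?`, then `validate_args "$@" || exit $?`, then `capture_cwd`.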
Steps 4–7 — install-inline variant (no prepare script)
- Working directory setup. Define a working folder at .tmp/<lang>_$1. Wipe it (rm -rf / Remove-Item -Recurse -Force) and recreate it. This folder, and only this folder, is where every subsequent write must land.
- Copy the build. Recursively copy everything from the build folder ($1) into the working folder. Do not copy the conformance tests; they stay where they are. After this step both $1 (build folder) and $2 (conformance tests folder) are treated as read-only for the rest of the script.
- Enter the working directory. cd / Set-Location into the working folder. If that fails, exit with the unrecoverable-error code. All remaining steps run from inside the working folder; they must never write back to $1 or $2.
- Install dependencies into an isolated environment inside the working folder. Set up a per-working-folder dependency location (a Python venv at .venv, a local node_modules, a project-scoped Maven repo at .m2, etc.) and install/resolve all dependencies into it. Never install into the source build folder ($1), the conformance tests folder ($2), the user's global caches or system-wide package locations, or anywhere outside the working folder. If the install command fails, propagate its exit code immediately and do not proceed to step 8. See Dependency isolation (install-inline).
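Steps 4 and 5 of the install-inline variant can be sketched as a single staging helper. `stage_build` is a hypothetical name, and the sketch assumes the build folder is given as a simple relative name so that .tmp/<lang>_$1 is a valid path.

```shell
# Steps 4-5: wipe and recreate the working folder, then copy the build into it.
# Prints the working folder path so the caller can cd into it (step 6).
# ASSUMPTION: build_folder is a simple relative folder name.
stage_build() {
  local build_folder="$1" lang="$2"
  local working_folder=".tmp/${lang}_${build_folder}"

  rm -rf "$working_folder"
  mkdir -p "$working_folder" || return 1
  # "/." copies the folder's contents (dotfiles included), not the folder itself.
  cp -R "$build_folder/." "$working_folder/" || return 1
  echo "$working_folder"
}
```

The caller would then enter the folder with something like `working_folder="$(stage_build "$1" python)" || exit ...` followed by a checked cd; from that point on the source build folder is never written to.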
Steps 4–7 — activate-only variant (prepare script exists)
- Verify the prepared environment. Both:
  - Check that the working folder exists.
  - Check that the language's isolation location inside it exists (e.g. .venv for Python, .m2 for Java, node_modules for Node, .gocache for Go, .cargo for Rust).
  If either check fails, print a helpful error ("Error: prepared environment missing (did you run prepare_environment_<lang>.<sh|ps1> first?)") and exit with the unrecoverable-error code. Do not silently fall back to creating it inline; that would mask a real misconfiguration and turn this script into the install-inline variant in disguise. After this step both $1 and $2 are treated as read-only for the rest of the script.
- Enter the working directory. cd / Set-Location into the working folder. If that fails, exit with the unrecoverable-error code. All remaining steps run from inside the working folder; they must never write back to $1 or $2.
- Activate the prepared dependency environment. Per-language:
  - Python: source .venv/bin/activate (must succeed; exit on failure).
  - Java: set MAVEN_LOCAL_REPO="$(pwd)/.m2" so it can be passed as -Dmaven.repo.local="$MAVEN_LOCAL_REPO" to mvn in step 8.
  - Node.js / Go / Rust: nothing to activate explicitly. The test command in step 8 just needs to receive the same isolation flag/env var that prepare used (node_modules is found by default; pass GOMODCACHE / CARGO_HOME).
  Activation is always relative to the working folder, never to $1 or $2. Prepare populated the working folder, and that is the only place to attach to.
- (There is no step 7 in this variant; install was prepare's job. Skip straight to step 8.)
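The verify step above might look like the following in Bash. `verify_prepared_env` is a hypothetical helper; it returns a plain failure status and leaves the mapping to the unrecoverable-error exit code to the caller.

```shell
# Activate-only step 4: fail loudly if prepare has not populated the working folder.
# Arguments: working folder, isolation location relative to it, language identifier.
verify_prepared_env() {
  local working_folder="$1" isolation_path="$2" lang="$3"
  if [ ! -d "$working_folder" ] || [ ! -e "$working_folder/$isolation_path" ]; then
    echo "Error: prepared environment missing (did you run prepare_environment_${lang}.sh first?)" >&2
    return 1
  fi
}
```

For Python this could be called as `verify_prepared_env ".tmp/python_$1" ".venv/bin/activate" python`. Note that the helper never creates anything, so a missing environment cannot silently turn into an inline install.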
Common step 8 (both variants)
- Run the conformance tests. Invoke the language's standard test command, pointed at $current_dir/<conformance_tests_folder> (the original cwd from step 3 + the second arg). The script's final exit code is whatever the test command returns, except for the "no tests discovered" case below.
  The test command is read-only with respect to $2. It loads test files from there, but any artifacts the runner produces (caches, JUnit XML, coverage reports, compiled test classes, etc.) must land inside the working folder, not next to the test files. If your chosen runner defaults to writing output beside the tests, pass an explicit output-directory flag pointing inside the working folder (e.g. pytest --basetemp=./.pytest_tmp, jest --cacheDirectory=./.jest_cache, Maven build output under the working folder via mvn -f "$current_dir/$2/pom.xml" -Dproject.build.directory="$(pwd)/target" test).
Read-only inputs — hard rule
A conformance script has two read-only inputs: the source build folder ($1) and the conformance tests folder ($2). Neither one may be written to under any circumstances. The script must never:
- install dependencies into $1 or $2 (no package-manager install run inside them, no build tool writing into them, no Cargo build artifacts ending up under them),
- write a virtualenv / node_modules / .m2 / .gocache / .cargo directory inside $1 or $2,
- run the test command with its working directory set to $1 or $2 (every test command runs from inside the working folder after the cd in step 6 / activate-only step 5),
- create logs, caches, build outputs, JUnit XML, coverage reports, compiled test classes, or temp files inside $1 or $2.
Why each input is read-only:
- $1 (build folder) is shared with the renderer and downstream tooling. Writing into it corrupts the renderer's view of "what was generated" and breaks subsequent renders. The whole point of staging into the working folder is so the source folder stays a clean, reproducible artifact of the render.
- $2 (conformance tests folder) is the user's authored test source, typically checked into version control. Writing into it pollutes the working tree, churns git status, and (with frameworks that auto-discover) can make subsequent runs pick up generated files as if they were tests.
If you find yourself about to issue any command whose working directory is $1 or $2, or whose target path starts with $1 or $2, stop. Either move the operation into the working folder, or you're doing something the script must not do.
"No tests discovered" detection
The Python reference script greps the test runner output for the runner's zero-tests message and exits with a dedicated "no tests discovered" code if no tests ran. Replicate the equivalent check for the target language wherever that language's test runner silently passes when given an empty test set:
- Python :
- Node.js :
- Go : /
- Rust :
- Java : usually fails loudly already; no extra check needed.
A silently-passing zero-test run is the most dangerous failure mode of a conformance runner — always guard against it. This applies to both variants.
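A hedged sketch of such a guard for Python's unittest runner is shown below. It assumes the runner reports "Ran 0 tests" when discovery finds nothing; the helper names are hypothetical, and NO_TESTS_EXIT_CODE is set to a placeholder value that must be replaced with the code the reference scripts use.

```shell
# ASSUMPTION: 98 is a placeholder, not the real value; copy it from the reference scripts.
NO_TESTS_EXIT_CODE=98

# True when the captured runner output reports that zero tests ran.
# ASSUMPTION: unittest prints "Ran 0 tests ..." for an empty test set.
no_tests_discovered() {
  printf '%s\n' "$1" | grep -q "Ran 0 tests"
}

# Run a test command, echo its output, and turn a silent zero-test pass into a failure.
run_with_guard() {
  local output status
  output="$("$@" 2>&1)"
  status=$?
  printf '%s\n' "$output"
  if no_tests_discovered "$output"; then
    return "$NO_TESTS_EXIT_CODE"
  fi
  return "$status"
}
```

A step 8 invocation could then read `run_with_guard python -m unittest discover -b -s "$current_dir/$2"; exit $?`, so ordinary test failures keep the runner's own exit code and only the zero-test case is remapped.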
Conventions
Shared across both shell flavors and both variants:
- Exit codes:
  - The unrecoverable-error code (the reference scripts' UNRECOVERABLE_ERROR_EXIT_CODE): unrecoverable invocation error, i.e. missing argument, missing toolchain, can't enter working folder, can't create venv (install-inline), or prepared environment missing/broken (activate-only).
  - The no-tests code: "no tests discovered" guard tripped (see above).
  - Any other non-zero code: propagated from the underlying test command.
- Working folder naming: .tmp/<lang>_$1 where <lang> is a short identifier for the language (java, python, ...). Use the first argument (the build folder) in the path, never the conformance tests folder. All dependency installs, build outputs, caches, test runner artifacts, and the test invocation itself live inside this folder. Nothing the script does should touch $1 after step 5 (install-inline) / step 4 (activate-only), or $2 at any point.
- Logging: print short progress lines ("Preparing <lang> build subfolder: ...", "Activating prepared virtual environment...", "Running <lang> conformance tests...") so failures are easy to triage. If matching the Python reference, wrap noisy "preparing" lines in the same conditional check it uses.
- Capture the original cwd before cd. This is the single most common bug in hand-written conformance scripts: forgetting that the conformance tests folder argument is relative to the invocation directory, not the working folder.
Dependency isolation (install-inline)
This section applies to install-inline scripts only. For activate-only scripts, the isolation location is set up by prepare; you just need to point the test command at it (see Activating a prepared environment).
The dependency environment must live inside the working folder so the test run can't be polluted by, or pollute, the user's global caches. Pick the most idiomatic isolation mechanism for the language:
| Language | Isolation mechanism | Install command (run inside the working folder) | Test command (point at $current_dir/$2) |
|---|---|---|---|
| Python | venv at .venv | python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt | python -m unittest discover -b -s "$current_dir/$2" (or the pytest equivalent) |
| Node.js | local node_modules (default) | npm ci (preferred) or npm install | npx jest --rootDir "$current_dir/$2" |
| Java | project-scoped Maven repo at .m2 | mvn -Dmaven.repo.local=./.m2 install -DskipTests (build + install artifact so the test pom can resolve it) | mvn -f "$current_dir/$2/pom.xml" -Dmaven.repo.local="$(pwd)/.m2" test |
| Go | module cache at .gocache | GOMODCACHE="$PWD/.gocache" go mod download (optional pre-warm) | GOMODCACHE="$PWD/.gocache" go test "$current_dir/$2/..." |
| Rust | cargo home at .cargo | CARGO_HOME="$PWD/.cargo" cargo fetch (optional pre-warm) | CARGO_HOME="$PWD/.cargo" cargo test --manifest-path "$current_dir/$2/Cargo.toml" |
Notes:
- Every path in the install command and test command is relative to the working folder. That's why the script cds into the working folder in step 6: from that point on, .venv, .m2, .gocache, etc. all resolve under the working folder, never under $1 or $2.
- Always pass the isolation flag/env var to both the install command and the test command. They must agree on where deps live, otherwise the test command will silently fall back to the global cache or (worse) write into $1 / $2.
- Python is the only ecosystem where the venv is mandatory to satisfy "into a virtual environment" literally. The others use language-native equivalents that achieve the same isolation.
- Propagate the install exit code immediately. In Bash: exit with the install command's status as soon as it fails. In PowerShell: check $LASTEXITCODE and exit with it if non-zero.
- Time the dependency setup (in both Bash and PowerShell) and print "Requirements setup completed in X.XX seconds". If this number is large, that's the signal to add a prepare_environment_<lang> script (and switch this script to the activate-only variant).
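The last two notes can be combined in one small wrapper, sketched here under the assumption that whole-second resolution from `date +%s` is acceptable; the reference scripts may measure time differently. `timed_install` is a hypothetical name.

```shell
# Run the install command, report how long it took, and hand back its exit status.
# ASSUMPTION: whole-second timing via date +%s is good enough for this log line.
timed_install() {
  local start end status
  start="$(date +%s)"
  "$@"
  status=$?
  end="$(date +%s)"
  printf 'Requirements setup completed in %.2f seconds\n' "$((end - start))"
  return "$status"
}
```

Used as `timed_install pip install -r requirements.txt || exit $?`, this stops the script before step 8 whenever the install fails while still printing the duration.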
Activating a prepared environment (activate-only)
This section applies to activate-only scripts only. The isolation location was created by prepare; conformance just needs to attach to it and pass the right flags to the test command.
| Language | Verify exists in step 4 | Activate in step 6 | Test command in step 8 (point at $current_dir/$2) |
|---|---|---|---|
| Python | .tmp/<lang>_$1/.venv/bin/activate | source .venv/bin/activate (after cd-ing into the working folder) | python -m unittest discover -b -s "$current_dir/$2" |
| Node.js | .tmp/<lang>_$1/node_modules/ | (nothing) | npx jest --rootDir "$current_dir/$2" |
| Java | .tmp/<lang>_$1/.m2/ | MAVEN_LOCAL_REPO="$(pwd)/.m2" | mvn -f "$current_dir/$2/pom.xml" -Dmaven.repo.local="$MAVEN_LOCAL_REPO" test |
| Go | .tmp/<lang>_$1/.gocache/ | export GOMODCACHE="$(pwd)/.gocache" | go test "$current_dir/$2/..." |
| Rust | .tmp/<lang>_$1/.cargo/ | export CARGO_HOME="$(pwd)/.cargo" | cargo test --manifest-path "$current_dir/$2/Cargo.toml" |
Notes:
- Verify, don't recreate. If the prepared environment is missing, exit with a clear "did you run prepare_environment first?" message; do not fall back to creating it inline. That would silently degrade a misconfigured project into the install-inline path and mask the real problem.
- Match prepare's isolation paths exactly. If prepare puts the venv at one path and you look for it at another, the verify step will always fail. Read implement-prepare-environment-script for the canonical paths.
- Don't time anything in this variant. The slow phase is prepare; conformance just runs the tests. Adding a duration log here is misleading — it makes the script look like it's doing the install when it isn't.
Bash specifics
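- Shebang: use the same shebang line as the reference scripts.
- File naming: run_conformance_tests_<lang>.sh, placed in assets/ (skill reference) or in the target project's script location (target project).
- Arguments: $1 = build folder, $2 = conformance tests folder.
- Make it executable: chmod +x the produced script.
- cd failure check: the reference scripts use a specific pattern to detect a failed cd. Keep it.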
PowerShell specifics
- No shebang. Use a param([Parameter(Mandatory=$true)][string]$BuildFolder, [Parameter(Mandatory=$true)][string]$ConformanceTestsFolder) block at the top instead.
- File naming: run_conformance_tests_<lang>.ps1.
- Exit codes: use the exit statement with the same numeric codes as the Bash variant (PowerShell honors them just like Bash).
- Toolchain check: prefer Get-Command <tool> -ErrorAction SilentlyContinue and, where a specific version is needed, parse the tool's output.
- Filesystem: use Test-Path, Remove-Item -Recurse -Force, New-Item -ItemType Directory, Copy-Item, Set-Location. Quote paths to handle spaces.
- Capture original cwd: $currentDir = (Get-Location).Path before any Set-Location call.
- No chmod step needed. If execution policy is likely to block the script, mention Set-ExecutionPolicy -Scope CurrentUser RemoteSigned to the user; don't bake it into the script.
Workflow
- Decide the variant. Look in the project for prepare_environment_<lang>.sh / prepare_environment_<lang>.ps1 (check the project's script location first, then any prepare-environment-script: key in the project configuration). If present → emit activate-only. If absent → emit install-inline. See Variant decision.
- Confirm the target language, shell flavor (Bash or PowerShell), and dependency manifest (requirements.txt, pom.xml, Cargo.toml, ...). Ask if any is unclear.
- Read assets/run_conformance_tests_java.sh and assets/run_conformance_tests_python.sh to refresh the exact structure. Both are install-inline references — for activate-only, follow steps 4–7 of the activate-only variant and the Activating a prepared environment table.
- Translate each step into the equivalent commands for the target language and shell. The toolchain check, dependency install/activate, and test invocation are the language-specific parts; the rest is mechanical translation between Bash and PowerShell syntax.
- Pick the right per-language row:
- Install-inline: Dependency isolation (install-inline) table — use the same flag/env var in steps 7 and 8.
- Activate-only: Activating a prepared environment table — use the matching verify, activate, and test-command columns in steps 4, 6, and 8.
- Add the language-appropriate "no tests discovered" guard from No tests discovered detection.
- Save the new script. For Bash, chmod +x it.
- For activate-only scripts only: smoke-test by running prepare_environment_<lang>.<sh|ps1> <build> && run_conformance_tests_<lang>.<sh|ps1> <build> <tests>. If the conformance script errors with "prepared environment missing" right after a successful prepare, the two scripts disagree on either the working-folder path or the isolation location. Fix that before declaring done.
Anti-Patterns
- (Hard mistake) Don't install into, build into, or otherwise write to the source build folder ($1) or the conformance tests folder ($2). Both arguments are read-only input. Every install, cache, build artifact, log, JUnit XML, coverage report, compiled test class, and temp file must land in the working folder. This includes never running install or build commands with $1 or $2 as their working directory or target; never letting a venv / node_modules / .m2 / .gocache / .cargo directory appear inside $1 or $2; and never running the test command from inside either folder. The whole point of staging into the working folder is so the build folder remains a clean artifact of the render and the conformance tests folder remains a clean tree under the user's version control. Writing to either one corrupts those guarantees.
- Don't emit the install-inline variant when a prepare_environment_<lang> script already exists. The conformance script's working-folder wipe will destroy everything prepare did, and the inline install will redo it from scratch on every run. Always run the Variant decision check first.
- Don't emit the activate-only variant when no prepare script exists. The "verify prepared environment" check will fail on every run because nothing has populated the working folder.
- Don't silently fall back from activate-only to install-inline when the prepared environment is missing. Exit with a clear error so the misconfiguration is visible. Silent fallback hides the real bug and produces inconsistent behavior between runs.
- Don't copy the conformance tests folder into the working folder. Only the build folder is staged (and only in install-inline). The test folder is read in place from $current_dir/$2.
- Don't compute the test path after cd. Capture the original cwd first; otherwise $2 will be resolved relative to the working folder and silently miss the tests.
- Don't skip the "no tests discovered" check. A conformance suite that finds zero tests and exits 0 is the worst possible failure mode: it looks like success in CI.
- Don't skip the toolchain check, even when "everyone has it installed". The unrecoverable-error exit code is what the calling system relies on to detect a missing runtime.
- Don't reuse the source folder in place (install-inline). Always copy into the working folder first; the renderer relies on this isolation.
- Don't change the exit-code contract. Other parts of the system branch on the unrecoverable-error and no-tests-discovered codes specifically, and these codes must be identical between the Bash and PowerShell variants.
- Don't write a cross-shell hybrid (e.g. a .sh that detects PowerShell, or vice versa). Ship one script per shell, named with the appropriate extension.
- Don't install dependencies into the user's global location (global caches, system-wide package directories, etc.) in the install-inline variant. Always isolate inside the working folder so concurrent runs and other projects can't interfere.
- Don't run the test command without first verifying the install / activation succeeded. A failed install (or missing prepared env) followed by a "test" run produces misleading errors that look like test failures.