Compile TensorRT-LLM on a compute node inside a Docker container. Use this when already on a compute node with GPUs visible.
`npx skill4agent add nvidia/skills exec-local-compile`

| Scenario | Use This Skill? |
|---|---|
| On a compute node with GPUs visible ( | Yes |
| On a SLURM login node (no GPUs) | No, use |
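The scenario check above could be automated roughly as follows. This is a minimal sketch, assuming that a successful `nvidia-smi` run is an acceptable signal that GPUs are visible; the function name `gpus_visible` and its injectable `run` and `which` parameters are hypothetical and not part of the skill.

```python
import shutil
import subprocess


def gpus_visible(run=subprocess.run, which=shutil.which):
    """Return True if nvidia-smi is on PATH and exits with status 0.

    Assumption: a zero exit status from nvidia-smi means GPUs are visible.
    `run` and `which` are injectable so the check can be exercised without GPUs.
    """
    if which("nvidia-smi") is None:
        return False  # tool not found, as expected on a login node without GPUs
    try:
        result = run(["nvidia-smi"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False  # could not launch nvidia-smi, or it hung
    return result.returncode == 0
```

Calling `gpus_visible()` with no arguments runs the real `nvidia-smi`; a `False` result suggests the current node is not a suitable place to run this skill.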
- `nvidia-smi`
- `/usr/local/tensorrt`
- `nvidia-smi`
- `cd`
- `git checkout main && git pull`
- `-c`, `--clean`
- `./scripts/build_wheel.py --trt_root /usr/local/tensorrt --benchmarks --use_ccache -a "<arch>" -f --nvtx`
- `<arch>`
- `nvidia-smi`
- `pip install -e .[devel]`
- `python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"`

| Flag | Description |
|---|---|
| `--trt_root /usr/local/tensorrt` | TensorRT installation path (standard in NVIDIA containers) |
| `--benchmarks` | Build the C++ benchmarks |
| `-a "<arch>"` | Target GPU architecture(s) |
| `--nvtx` | Enable NVTX markers for profiling |
| `--use_ccache` | Use ccache for faster recompilation |
| `-f` | Skip some kernels for faster dev compilation. Always use for dev builds. |
| `-c`, `--clean` | Clean build directory before building. Only when needed (see below). |
|  | Build in-place without creating a wheel file |
|  | Skip virtual environment creation |
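The flags described above can be assembled into the build command programmatically. The following is a rough sketch that uses only the flags that appear in this section; the helper name `build_wheel_command` is hypothetical, and `<arch>` is left as the same placeholder the section uses rather than a real architecture value.

```python
import shlex


def build_wheel_command(arch, trt_root="/usr/local/tensorrt", clean=False):
    """Assemble the build_wheel.py invocation as an argument list.

    Only flags named in this section are used. `clean=True` appends --clean,
    which the section recommends only when needed.
    """
    cmd = [
        "./scripts/build_wheel.py",
        "--trt_root", trt_root,  # TensorRT installation path
        "--benchmarks",          # build the C++ benchmarks
        "--use_ccache",          # reuse cached objects on recompilation
        "-a", arch,              # target GPU architecture(s)
        "-f",                    # skip some kernels for faster dev builds
        "--nvtx",                # enable NVTX markers for profiling
    ]
    if clean:
        cmd.append("--clean")
    return cmd


# Print a shell-quoted form of the command for review before running it.
print(shlex.join(build_wheel_command("<arch>")))
```

Returning a list rather than a single string lets the command be passed directly to `subprocess.run` without shell quoting concerns, which matters when the architecture value contains characters the shell would interpret.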
| Value | GPU Family |
|---|---|
|  | Blackwell (B200, GB200) |
|  | Hopper (H100, H200) |
|  | Ada Lovelace (L40S) |
|  | Ampere (A100) |
|  | Multiple architectures |
`-c`, `CMakeLists.txt`, `*.cmake`
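One way to decide when a clean build is warranted is to look at which files changed. The sketch below assumes, based only on the `-c` flag and the `CMakeLists.txt` and `*.cmake` names mentioned here, that changes to CMake files are the trigger; that rule is an inference rather than something this section states outright, and `needs_clean_build` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Assumption: edits to these files are what call for a clean (-c) build.
CMAKE_PATTERNS = ("CMakeLists.txt", "*.cmake")


def needs_clean_build(changed_paths):
    """Return True if any changed path is a CMake file by name."""
    for path in changed_paths:
        name = PurePosixPath(path).name  # match on the file name only
        if any(fnmatchcase(name, pattern) for pattern in CMAKE_PATTERNS):
            return True
    return False
```

The list of changed paths could come from `git diff --name-only` after pulling, so that `-c` is passed only when the check returns `True` and incremental builds stay fast otherwise.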