Found 3 Skills
Verify and set up the environment required for Triton operator development on the Ascend platform, including dependencies such as CANN, Python, torch, torch_npu, and triton-ascend, as well as the PATH environment variables. Use this skill when users need to configure the Triton operator development environment, check the installation of CANN/torch/triton-ascend, or verify that the environment is usable.
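As a rough illustration of the kind of check this skill describes, the sketch below reports whether the Python packages can be found and whether a CANN-related environment variable is set. The function name `check_environment` is hypothetical. The import name `triton` for triton-ascend and the variable name `ASCEND_HOME_PATH` are assumptions, not taken from the skill itself.

```python
import importlib.util
import os


def check_environment(modules=("torch", "torch_npu", "triton"),
                      env_var="ASCEND_HOME_PATH"):
    """Report which modules are importable and whether env_var is set.

    The default module names and env_var are assumptions for illustration.
    """
    report = {}
    for name in modules:
        try:
            # find_spec locates the module without importing it.
            report[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            report[name] = False
    # A non-empty value counts as "set".
    report[env_var] = bool(os.environ.get(env_var))
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'OK' if ok else 'MISSING'}")
```

A check like this only confirms that packages are discoverable. It does not confirm that versions are compatible or that an NPU device is reachable.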
Maintain JSONL-only profiler performance test cases under csrc/ops/<op>/test in ascend-kernel. Collect data using torch_npu.profiler (with fixed warmup=5 and active=5), aggregate the Total Time(us) from ASCEND_PROFILER_OUTPUT/op_statistic.csv, and output a unified Markdown comparison report (custom operator vs baseline) that includes a DType column. Do not generate perf_cases.json or *_profiler_results.json. Refer to examples/layer_norm_profiler_reference/ for the reference implementation.
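The aggregation step described above could look roughly like the following sketch, which sums the `Total Time(us)` column per operator and formats one Markdown comparison row with a DType column. The helper names, the `OP Type` column name, and the row layout are assumptions for illustration. The reference implementation in examples/layer_norm_profiler_reference/ may differ.

```python
import csv
import io
from collections import defaultdict


def total_time_by_op(csv_text, op_column="OP Type", time_column="Total Time(us)"):
    """Sum time_column per operator type from op_statistic.csv content.

    op_column is an assumed column name; adjust it to the actual CSV header.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[op_column].strip()] += float(row[time_column])
    return dict(totals)


def comparison_row(op_name, dtype, custom_us, baseline_us):
    """Format one Markdown table row: custom operator vs baseline."""
    ratio = custom_us / baseline_us if baseline_us else float("nan")
    return f"| {op_name} | {dtype} | {custom_us:.2f} | {baseline_us:.2f} | {ratio:.2f} |"


if __name__ == "__main__":
    # Made-up sample data, only to show the expected shape of the input.
    sample = (
        "OP Type,Core Type,Count,Total Time(us)\n"
        "LayerNorm,AI_VECTOR_CORE,5,50.0\n"
        "LayerNorm,AI_VECTOR_CORE,5,25.5\n"
    )
    custom = total_time_by_op(sample)["LayerNorm"]
    print("| Op | DType | Custom (us) | Baseline (us) | Ratio |")
    print("| --- | --- | --- | --- | --- |")
    print(comparison_row("layer_norm", "float16", custom, 151.0))
```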
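A Skill for Ascend NPU profiling collection and performance analysis in AI for Science scenarios. It uses torch_npu.profiler on Huawei Ascend NPUs to collect L0, L1, and L2 level performance data, analyzes operator time, call stacks, memory usage, and bottlenecks in training or inference, and guides subsequent tuning.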