Loading...
Found 1 Skills
Maintain JSONL-only profiler performance test cases under csrc/ops/<op>/test in ascend-kernel. Collect data using torch_npu.profiler (with fixed warmup=5 and active=5), aggregate the Total Time(us) from ASCEND_PROFILER_OUTPUT/op_statistic.csv, and output a unified Markdown comparison report (custom operator vs baseline) that includes a DType column. Do not generate perf_cases.json or *_profiler_results.json. Refer to examples/layer_norm_profiler_reference/ for the reference implementation.