Loading...
Loading...
Found 2 Skills
Evaluate the performance of Triton operators on Ascend NPU. It is used when users need to analyze operator performance bottlenecks, collect and compare operator performance using msprof/msprof op, diagnose Memory-Bound/Compute-Bound bottlenecks, measure hardware utilization metrics, and generate performance evaluation reports.
Optimize the performance of Triton operators optimized for Ascend NPU. This guide is for users who need to optimize the performance of Triton operators on Ascend NPU, resolve UB overflow, improve Cube unit utilization, and design Tiling strategies.