Found 905 Skills
Huawei Ascend NPU npu-smi command reference. Use for device queries (health, temperature, power, memory, processes, ECC), configuration (thresholds, modes, fan), firmware upgrades (MCU, bootloader, VRD), virtualization (vNPU), and certificate management.
Static inspection of Triton operator code quality (Host side + Device side) for Ascend NPU. Used when users need to identify potential bugs, API misuses, and performance risks by reading code. Core capabilities: (1) Ascend API constraint compliance check; (2) mask integrity verification; (3) precision processing review; (4) code pattern recognition. Note: This Skill only focuses on static code analysis; compile-time and runtime issues are handled by other Skills.
GPU Code to Ascend NPU Adaptation Review Expert. When users need to migrate GPU-based code (especially deep learning and model inference-related code) to Huawei Ascend NPU, this skill must be used for comprehensive review. This skill can identify bottlenecks in GPU-to-NPU migration, write adaptation scripts, generate verification plans, and output a complete Markdown review report. Trigger scenarios include: users mentioning keywords such as "NPU adaptation", "Ascend migration", "GPU to NPU", "Ascend", "CANN", "model migration", "operator adaptation", or users requesting to review GPU code repositories and migrate to the NPU platform.
Verify and set up the environment required for Triton operator development on the Ascend platform, including dependencies such as CANN, Python, torch, torch_npu, and triton-ascend, and PATH environment variables. Use this when users need to configure the Triton operator development environment, check the installation of CANN/torch/triton-ascend, or verify that the environment is usable.
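The kind of check this entry describes might look like the following minimal sketch. It is not the skill's actual implementation. Assumptions: `triton` is the import name supplied by triton-ascend, `ASCEND_HOME_PATH` is the variable set by the CANN environment script, and `npu-smi` is the tool expected on PATH. The function names are hypothetical.

```python
import importlib.util
import os
import shutil

# Module names from the description above. "triton" as the import name for
# triton-ascend is an assumption.
REQUIRED_MODULES = ("torch", "torch_npu", "triton")


def check_modules(names=REQUIRED_MODULES):
    """Return {module_name: bool} without importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def check_environment(env_var="ASCEND_HOME_PATH", tool="npu-smi"):
    """Collect a small availability report.

    env_var and tool are assumed defaults, not values confirmed by the source.
    """
    return {
        "modules": check_modules(),
        "cann_env_set": bool(os.environ.get(env_var)),
        "tool_on_path": shutil.which(tool) is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Using `find_spec` rather than a real import keeps the check fast and avoids initializing the device runtime just to confirm that a package is installed.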
AscendC Operator Precision Evaluation. Generate a comprehensive precision test case set (≥30 cases) for the compiled and installed operator, run the tests and generate a precision verification report. Keywords: precision test, precision evaluation, precision report, accuracy, error analysis. After execution, YOU MUST display the overview, failure summary and key findings in the current conversation, and must not only attach the report path.
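A single precision test case of the kind described above could be scored roughly as follows. This is a sketch under assumptions: the pass criterion `|actual - expected| <= atol + rtol * |expected|` and the default tolerances are illustrative choices, not thresholds stated by the skill, and `precision_report` is a hypothetical name.

```python
def precision_report(actual, expected, rtol=1e-3, atol=1e-5):
    """Compare two equal-length float sequences element by element.

    rtol and atol are assumed tolerances for illustration only.
    """
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")

    max_abs_err = 0.0
    max_rel_err = 0.0
    failures = 0
    for a, e in zip(actual, expected):
        abs_err = abs(a - e)
        max_abs_err = max(max_abs_err, abs_err)
        # Relative error is only defined for a non-zero reference value.
        if e != 0:
            max_rel_err = max(max_rel_err, abs_err / abs(e))
        if abs_err > atol + rtol * abs(e):
            failures += 1

    return {
        "count": len(expected),
        "failures": failures,
        "max_abs_err": max_abs_err,
        "max_rel_err": max_rel_err,
        "passed": failures == 0,
    }


if __name__ == "__main__":
    print(precision_report([1.0, 2.5], [1.0, 2.0]))
```

Reporting the maximum absolute and relative errors alongside the failure count is one way to feed the overview and failure summary that the entry says must be displayed.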
Evaluate the performance of Triton operators on Ascend NPU. It is used when users need to analyze operator performance bottlenecks, collect and compare operator performance using msprof/msprof op, diagnose Memory-Bound/Compute-Bound bottlenecks, measure hardware utilization metrics, and generate performance evaluation reports.
Deep performance optimization skill for Triton operators on Ascend NPU, aimed at delivering the performance improvement users require. Core techniques include, but are not limited to, Unified Buffer (UB) capacity planning, multi-token parallel processing, MTE/Vector pipeline parallelism, and mask optimization. This Skill must be triggered when the user mentions performance optimization of Vector-type Triton operators on Ascend NPU.
Generate PyTorch-style interface documentation (README.md) for AscendC operators. Trigger scenarios: Use this when interface documentation needs to be generated after compilation and debugging are completed, or when the user mentions "generate operator documentation", "create README", "document operator", "help me write documentation" (in operator context), "operator documentation".
Generate interface documentation for Triton operators on Ascend NPU. Use this when users need to create or update such documentation. Core capabilities: (1) generate standardized documents from templates; (2) support the list of Ascend NPU product models; (3) provide specifications for operator parameter descriptions; (4) generate skeleton call examples.
Python code refactoring skills, covering code smell identification, design pattern application, readability improvement, and practical experience. This skill is applicable when users request "refactor code", "refactor", "code optimization", "improve code quality", "code smell review", "apply design patterns", "enhance readability", or submit code review requests. It supports generating structured refactoring documents after refactoring completion ("output refactoring document", "generate refactoring report"). It includes practical patterns extracted from 20+ real refactoring PRs in the vllm-ascend repository.
AscendC Operator End-to-End Development Orchestrator. Used when users need to develop new operators, implement custom operators, or complete the full process from requirements to testing. Keywords: operator development, end-to-end, full process, workflow orchestration, new operator creation.
AscendC Operator Design Completion - Assist users in completing operator architecture design, interface definition, and performance planning. Use this skill when users mention operator design, operator development, tiling strategy, memory planning, AscendC kernel design, two-level tiling, inter-core splitting, or intra-core splitting.