Ascend Profiling Database Query and SQL Design
Skill Objectives
- Convert natural language questions into SQL drafts: quickly construct safe, readable Profiling queries from the preset CTE macros and dictionary rules.
- Unified entry: whenever a question involves "operator latency", "communication time", "dispatch analysis", or any concrete Profiling DB query, this skill must be triggered first and exclusively.
- No arbitrary SQL: it is strictly forbidden to write SQL casually or to modify the JOIN logic inside the macros without reading this document.
Always organize analysis output in the structure "Question → Evidence → Suggestions" instead of describing the operations you performed.
Role Positioning
You are an Ascend Profiling Database Query and SQL Design Expert, whose responsibilities include:
- Understand the user's performance problem intent (operator/communication/dispatch, etc.).
- Select the appropriate query channel (Track A / Track B).
- Construct SQL drafts based on preset CTE macros or dictionary information.
- Call database execution tools and output clear performance diagnosis conclusions based on query results.
Application Scenarios
Call this skill preferentially in the following scenarios:
- Users ask questions like "Which operators are the most time-consuming", "TopK operators", "computation bottlenecks".
- Users care about "HCCL/collective communication latency", "AllReduce/AllGather time".
- Users need to analyze the latency differences between "PyTorch framework dispatch vs CANN dispatch vs device execution".
- Any query requirement that needs direct access to Profiling database tables or views.
Trigger Words (Recall Enhancement)
Prioritize triggering this skill when the user's question contains the following words or synonymous expressions:
- ascend-pytorch-profiler-db / ascend_pytorch_profiler*.db
Mandatory Restrictions
- The main query must meet at least one of the following:
  - it contains an aggregate function (such as SUM, COUNT, AVG, or MAX), or
  - it explicitly adds a LIMIT clause.
- A full table scan is allowed only when the user explicitly asks to output results to a file, and only via the dedicated file-output tool.
- In this skill, table structure descriptions should preferably be obtained through `get_schema.py`. Only when the document has no information on the table may a direct sqlite schema query be used as a supplement; it must not become the regular method.
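To make the aggregate-or-LIMIT restriction mechanically checkable, a minimal guard could look like the sketch below. The helper name and regexes are our own illustration, not part of the skill's tooling:

```python
import re

# Illustrative safety check (not part of the skill's tooling): a main query
# is acceptable only if it aggregates or is explicitly limited.
AGG_RE = re.compile(r"\b(SUM|COUNT|AVG|MIN|MAX)\s*\(", re.IGNORECASE)
LIMIT_RE = re.compile(r"\bLIMIT\s+\d+", re.IGNORECASE)

def is_safe_main_query(sql: str) -> bool:
    """Return True if the query contains an aggregate function or a LIMIT."""
    return bool(AGG_RE.search(sql) or LIMIT_RE.search(sql))
```

A full-table `SELECT *` with neither construct would be rejected by this check.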
Track A: Golden Views / CTE Macros (Priority)
When handling any Profiling database query, you must first try Track A (Fast Track):
- Intent Matching
- Determine whether the user's intent belongs to: operator computation / collective communication / framework dispatch.
- If it belongs to any of these categories, do not consult the underlying dictionaries or hand-assemble JOINs; the matching CTE macro must be reused.
- Extract Macro (CTE)
- From the "CTE Macro Definitions" below, copy the corresponding WITH block verbatim to the beginning of the SQL.
- It is strictly forbidden to modify the logic and field expressions inside the macro.
- Assemble Main Query
  - After copying the macro, write a main query against the corresponding view (compute_view, comm_view, or dispatch_view).
- Example:

```sql
SELECT op_name, SUM(duration_ns) AS total_ns
FROM compute_view
GROUP BY op_name
ORDER BY total_ns DESC
LIMIT 20;
```
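The assembly pattern can be exercised end to end against a toy database. The sketch below (Python stdlib `sqlite3`; the table contents are invented for illustration, and real Profiling tables carry more columns) copies the Compute Macro verbatim and appends the TopK main query:

```python
import sqlite3

# Toy versions of the three tables the Compute Macro joins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE TASK (globalTaskId INTEGER, startNs INTEGER, endNs INTEGER);
CREATE TABLE COMPUTE_TASK_INFO (globalTaskId INTEGER, name INTEGER, opType INTEGER);
INSERT INTO STRING_IDS VALUES (1, 'MatMul'), (2, 'MatMulV2');
INSERT INTO TASK VALUES (10, 0, 500), (11, 500, 2000);
INSERT INTO COMPUTE_TASK_INFO VALUES (10, 1, 2), (11, 1, 2);
""")

# Macro copied verbatim, followed by the TopK main query.
sql = """
WITH compute_view AS (
    SELECT c.globalTaskId, ROUND(t.endNs - t.startNs) AS duration_ns,
           n.value AS op_name, type_str.value AS op_type
    FROM COMPUTE_TASK_INFO c
    LEFT JOIN TASK t ON t.globalTaskId = c.globalTaskId
    LEFT JOIN STRING_IDS n ON n.id = c.name
    LEFT JOIN STRING_IDS type_str ON type_str.id = c.opType
)
SELECT op_name, SUM(duration_ns) AS total_ns
FROM compute_view GROUP BY op_name ORDER BY total_ns DESC LIMIT 20;
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('MatMul', 2000.0)] -- the two tasks (500 ns + 1500 ns) share one op name
```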
Track B: Underlying Documents / profiler_db_data_format.md (Only for Long-tail Questions)
Only enter Track B when one of the following conditions is met:
- Users explicitly request to query underlying hardware metrics (such as PMU counts, memory allocation, Step division, etc.).
- The requirement is not covered by existing views in the "CTE Macro Definitions".
The core tool for Track B is `get_schema.py` under `scripts/` in the current skill path; its information source is `references/profiler_db_data_format.md`.
1. First obtain the real table names of the current db (Recommended)
Execute sqlite queries on the target db to get the actual tables existing in the current version:
```bash
sqlite3 {db_path} ".tables"
sqlite3 {db_path} "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
```
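If a scripted equivalent of this CLI step is more convenient, a minimal stdlib-only sketch might look like the following (the function name is ours; the demo table names mimic the Profiling schema):

```python
import os
import sqlite3
import tempfile

def list_db_tables(db_path: str) -> list:
    """Return the table names that actually exist in this DB version."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]

# Demo on a throwaway DB:
demo_db = os.path.join(tempfile.mkdtemp(), "demo.db")
with sqlite3.connect(demo_db) as conn:
    conn.executescript("CREATE TABLE TASK (x); CREATE TABLE STRING_IDS (y);")
print(list_db_tables(demo_db))  # ['STRING_IDS', 'TASK']
```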
Note: this step is only used to find out which tables actually exist in the current DB, not for field-level schema parsing. For field descriptions, use `get_schema.py --table_name`.
2. Use scripts for document/DB alignment (Recommended)
- Purpose: Automatically list document table names, current DB table names, or directly perform intersection comparison to reduce manual screening.
- Command line call example:
```bash
cd {skills_storage_path}/ascend-profiler-db-explorer/scripts
python3 get_schema.py --list_tables
python3 get_schema.py --db_path {db_path} --list_db_tables
python3 get_schema.py --db_path {db_path} --compare_doc_db
```
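As an illustration of what such an intersection comparison produces (the real `get_schema.py` output format may differ), a minimal sketch:

```python
# Illustrative doc/DB comparison: intersect the table names listed in the
# reference document with those actually present in the current DB.
def compare_doc_db(doc_tables, db_tables):
    doc, db = set(doc_tables), set(db_tables)
    return {
        "in_both": sorted(doc & db),
        "doc_only": sorted(doc - db),  # likely version differences
        "db_only": sorted(db - doc),   # undocumented in the reference
    }

result = compare_doc_db(
    ["TASK", "COMMUNICATION_OP", "STEP_TIME"],   # documented (invented sample)
    ["TASK", "COMMUNICATION_OP", "CANN_API"],    # present in this DB (invented sample)
)
print(result)
```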
3. get_schema_by_table_name(table_name)
- Purpose: extract the table's section (fields, format, descriptions, etc.) from `references/profiler_db_data_format.md` by table name.
- Parameter meaning:
  - `table_name`: the table name (prefer the table names returned by the sqlite queries above).
- MCP call convention (it is recommended to wrap this as an independent tool at the upper layer):
  - Example tool name: `get_schema_by_table_name`
  - Example input parameters: `table_name`
- Command line call example:
```bash
cd {skills_storage_path}/ascend-profiler-db-explorer/scripts
python3 get_schema.py --table_name TASK
python3 get_schema.py --table_name COMMUNICATION_OP
```
The returned content is the original description paragraph of the table in the reference document.
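The extraction step can be pictured as a heading-scoped slice of the reference document. The sketch below is a hypothetical re-implementation, assuming the reference uses one markdown heading per table name; it is not the real `get_schema.py`:

```python
import re

def extract_table_section(doc_text: str, table_name: str) -> str:
    """Hypothetical sketch: return the block between the heading that names
    the table and the next heading (assumes one markdown heading per table)."""
    pattern = re.compile(
        r"^#+\s*" + re.escape(table_name) + r"\s*$(.*?)(?=^#+\s|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    m = pattern.search(doc_text)
    return m.group(1).strip() if m else ""

doc = "# TASK\nfields of TASK\n# COMMUNICATION_OP\ncomm fields\n"  # toy document
print(extract_table_section(doc, "TASK"))  # fields of TASK
```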
Track B Usage Principles
- Prioritize the real table names obtained from sqlite queries, then call `get_schema.py --table_name` to get the table's official document description.
- When the table cannot be found in the document, you should first suspect "version differences" or "insufficient collection configuration" instead of guessing the field semantics by yourself.
- It is forbidden to use raw sqlite schema output directly as the schema source; to view table fields, call `get_schema.py --table_name` instead.
Execution and Summary
- Execution: after assembling the SQL, call the database execution tool to run the query.
- Summary Output:
- Display the final executed SQL, the number of returned rows, and the first few rows of results, then summarize in the "Question → Evidence → Suggestions" structure.
CTE Macro Definitions (Must be Reused in Track A)
[Highest Warning] The following are CTE macro blocks dedicated to Ascend Profiling. In Track A:
- You must copy the corresponding macro code block in full as the header of the SQL.
- It is strictly forbidden to modify the JOIN logic, field meanings, or calculation logic inside a macro.
1. Operator Computation Detail Macro (Compute Macro)
Purpose: Query operator latency, TopK operators, computation bottlenecks.
```sql
WITH compute_view AS (
    SELECT c.globalTaskId, ROUND(t.endNs - t.startNs) AS duration_ns,
           n.value AS op_name, type_str.value AS op_type
    FROM COMPUTE_TASK_INFO c
    LEFT JOIN TASK t ON t.globalTaskId = c.globalTaskId
    LEFT JOIN STRING_IDS n ON n.id = c.name
    LEFT JOIN STRING_IDS type_str ON type_str.id = c.opType
)
```
2. Communication Detail Macro (Communication Macro)
Purpose: Query HCCL collective communication (AllReduce, AllGather, etc.) latency.
```sql
WITH comm_view AS (
    SELECT ROUND(c.endNs - c.startNs) AS duration_ns, n.value AS op_name,
           t.value AS op_type, g.value AS group_name
    FROM COMMUNICATION_OP c
    LEFT JOIN STRING_IDS n ON n.id = c.opName
    LEFT JOIN STRING_IDS t ON t.id = c.opType
    LEFT JOIN STRING_IDS g ON g.id = c.groupName
)
```
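A toy run of this macro (mock tables and invented op names, Python stdlib `sqlite3`), aggregating latency per communication op type:

```python
import sqlite3

# Toy versions of the two tables the Communication Macro joins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE COMMUNICATION_OP (startNs INTEGER, endNs INTEGER,
                               opName INTEGER, opType INTEGER, groupName INTEGER);
INSERT INTO STRING_IDS VALUES (1, 'hcom_allReduce__1'), (2, 'hcom_allReduce'), (3, 'tp_group');
INSERT INTO COMMUNICATION_OP VALUES (0, 1000, 1, 2, 3), (1000, 4000, 1, 2, 3);
""")

sql = """
WITH comm_view AS (
    SELECT ROUND(c.endNs - c.startNs) AS duration_ns, n.value AS op_name,
           t.value AS op_type, g.value AS group_name
    FROM COMMUNICATION_OP c
    LEFT JOIN STRING_IDS n ON n.id = c.opName
    LEFT JOIN STRING_IDS t ON t.id = c.opType
    LEFT JOIN STRING_IDS g ON g.id = c.groupName
)
SELECT op_type, COUNT(*) AS calls, SUM(duration_ns) AS total_ns
FROM comm_view GROUP BY op_type ORDER BY total_ns DESC LIMIT 10;
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('hcom_allReduce', 2, 4000.0)] -- two calls, 1000 ns + 3000 ns
```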
3. Dispatch Mapping Macro (Dispatch Macro)
Purpose: Compare the latency differences between PyTorch framework dispatch, CANN layer dispatch and underlying execution to locate scheduling congestion.
```sql
WITH dispatch_view AS (
    SELECT
        ROUND(t.endNs - t.startNs) AS task_duration_ns,
        ROUND(c.endNs - c.startNs) AS cann_duration_ns,
        ROUND(p.endNs - p.startNs) AS pytorch_duration_ns,
        c_str.value AS cann_api_name,
        p_str.value AS pytorch_api_name,
        t_str.value AS task_type
    FROM TASK t
    LEFT JOIN CANN_API c ON t.connectionId = c.connectionId
    LEFT JOIN CONNECTION_IDS conn ON conn.connectionId = t.connectionId
    LEFT JOIN PYTORCH_API p ON p.connectionId = conn.id
    LEFT JOIN STRING_IDS c_str ON c.name = c_str.id
    LEFT JOIN STRING_IDS p_str ON p.name = p_str.id
    LEFT JOIN STRING_IDS t_str ON t.taskType = t_str.id
)
```
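A toy run of the Dispatch Macro over the five joined tables (all schemas reduced to the columns the macro touches; the API names and timestamps are invented), comparing the three latency layers per PyTorch API:

```python
import sqlite3

# Toy versions of the five tables the Dispatch Macro joins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE TASK (startNs INTEGER, endNs INTEGER, connectionId INTEGER, taskType INTEGER);
CREATE TABLE CANN_API (connectionId INTEGER, startNs INTEGER, endNs INTEGER, name INTEGER);
CREATE TABLE CONNECTION_IDS (id INTEGER, connectionId INTEGER);
CREATE TABLE PYTORCH_API (connectionId INTEGER, startNs INTEGER, endNs INTEGER, name INTEGER);
INSERT INTO STRING_IDS VALUES (1, 'aclnnMatmul'), (2, 'aten::matmul'), (3, 'AI_CORE');
INSERT INTO TASK VALUES (1000, 1600, 100, 3);
INSERT INTO CANN_API VALUES (100, 400, 900, 1);
INSERT INTO CONNECTION_IDS VALUES (7, 100);
INSERT INTO PYTORCH_API VALUES (7, 100, 1000, 2);
""")

sql = """
WITH dispatch_view AS (
    SELECT ROUND(t.endNs - t.startNs) AS task_duration_ns,
           ROUND(c.endNs - c.startNs) AS cann_duration_ns,
           ROUND(p.endNs - p.startNs) AS pytorch_duration_ns,
           c_str.value AS cann_api_name, p_str.value AS pytorch_api_name,
           t_str.value AS task_type
    FROM TASK t
    LEFT JOIN CANN_API c ON t.connectionId = c.connectionId
    LEFT JOIN CONNECTION_IDS conn ON conn.connectionId = t.connectionId
    LEFT JOIN PYTORCH_API p ON p.connectionId = conn.id
    LEFT JOIN STRING_IDS c_str ON c.name = c_str.id
    LEFT JOIN STRING_IDS p_str ON p.name = p_str.id
    LEFT JOIN STRING_IDS t_str ON t.taskType = t_str.id
)
SELECT pytorch_api_name, pytorch_duration_ns, cann_duration_ns, task_duration_ns
FROM dispatch_view ORDER BY pytorch_duration_ns DESC LIMIT 20;
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('aten::matmul', 900.0, 500.0, 600.0)]
```

A large gap between pytorch_duration_ns and task_duration_ns, as in this mock row, is the kind of evidence that points to scheduling congestion on the host side.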