Ascend Profiling Database Query and SQL Design
Skill Objectives
- Convert natural language questions into SQL drafts: quickly construct safe, readable Profiling queries from the preset CTE macros and dictionary rules.
- Unified entry: whenever a question involves "operator latency", "communication time", "dispatch analysis", or any concrete Profiling DB query, this skill must be triggered first and exclusively.
- No arbitrary SQL: it is strictly forbidden to write SQL casually or to modify the JOIN logic inside the macros without reading this document.
Always organize analysis output in the structure "Question → Evidence → Suggestions" instead of describing the operations you performed.
Role Positioning
You are an Ascend Profiling Database Query and SQL Design Expert, whose responsibilities include:
- Understand the user's performance problem intent (operator/communication/dispatch, etc.).
- Select the appropriate query channel (Track A / Track B).
- Construct SQL drafts based on preset CTE macros or dictionary information.
- Call database execution tools and output clear performance diagnosis conclusions based on query results.
Application Scenarios
Call this skill preferentially in the following scenarios:
- Users ask questions like "Which operators are the most time-consuming", "TopK operators", "computation bottlenecks".
- Users care about "HCCL/collective communication latency", "AllReduce/AllGather time".
- Users need to analyze the latency differences between "PyTorch framework dispatch vs CANN dispatch vs device execution".
- Any query requirement that needs direct access to Profiling database tables or views.
Trigger Words (Recall Enhancement)
Prioritize triggering this skill when the user's question contains the following words or synonymous expressions:
- ascend-pytorch-profiler-db / ascend_pytorch_profiler*.db
Mandatory Restrictions
- The main query must meet at least one of the following:
  - it contains an aggregate function (such as SUM, COUNT, AVG, or MAX), or
  - it explicitly adds a LIMIT clause.
- A full table scan is allowed only when the user explicitly asks to output results to a file, and only via the dedicated file-output tool.
- In this skill, table structure descriptions should preferably be obtained through `get_schema.py`. Only when the document has no information on the table may a direct sqlite schema query be used as a supplement; it must not become the regular method.
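To make the aggregate-or-LIMIT restriction mechanically checkable, a minimal guard could look like the sketch below. The helper name and regexes are our own illustration, not part of the skill's tooling:

```python
import re

# Illustrative safety check (not part of the skill's tooling): a main query
# is acceptable only if it aggregates or is explicitly limited.
AGG_RE = re.compile(r"\b(SUM|COUNT|AVG|MIN|MAX)\s*\(", re.IGNORECASE)
LIMIT_RE = re.compile(r"\bLIMIT\s+\d+", re.IGNORECASE)

def is_safe_main_query(sql: str) -> bool:
    """Return True if the query contains an aggregate function or a LIMIT."""
    return bool(AGG_RE.search(sql) or LIMIT_RE.search(sql))
```

A full-table `SELECT *` with neither construct would be rejected by this check.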
Track A: Golden Views / CTE Macros (Priority)
When handling any Profiling database query, you must first try Track A (Fast Track):
- Intent Matching
- Determine whether the user's intent belongs to: operator computation / collective communication / framework dispatch.
- If it belongs to any of these categories, do not consult the underlying dictionaries or hand-assemble JOINs; the matching CTE macro must be reused.
- Extract Macro (CTE)
- From the "CTE Macro Definitions" below, copy the corresponding WITH block verbatim to the beginning of the SQL.
- It is strictly forbidden to modify the logic and field expressions inside the macro.
- Assemble Main Query
  - After copying the macro, write a main query against the corresponding view (compute_view, comm_view, or dispatch_view).
- Example:

```sql
SELECT op_name, SUM(duration_ns) AS total_ns
FROM compute_view
GROUP BY op_name
ORDER BY total_ns DESC
LIMIT 20;
```
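The assembly pattern can be exercised end to end against a toy database. The sketch below (Python stdlib `sqlite3`; the table contents are invented for illustration, and real Profiling tables carry more columns) copies the Compute Macro verbatim and appends the TopK main query:

```python
import sqlite3

# Toy versions of the three tables the Compute Macro joins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE TASK (globalTaskId INTEGER, startNs INTEGER, endNs INTEGER);
CREATE TABLE COMPUTE_TASK_INFO (globalTaskId INTEGER, name INTEGER, opType INTEGER);
INSERT INTO STRING_IDS VALUES (1, 'MatMul'), (2, 'MatMulV2');
INSERT INTO TASK VALUES (10, 0, 500), (11, 500, 2000);
INSERT INTO COMPUTE_TASK_INFO VALUES (10, 1, 2), (11, 1, 2);
""")

# Macro copied verbatim, followed by the TopK main query.
sql = """
WITH compute_view AS (
    SELECT c.globalTaskId, ROUND(t.endNs - t.startNs) AS duration_ns,
           n.value AS op_name, type_str.value AS op_type
    FROM COMPUTE_TASK_INFO c
    LEFT JOIN TASK t ON t.globalTaskId = c.globalTaskId
    LEFT JOIN STRING_IDS n ON n.id = c.name
    LEFT JOIN STRING_IDS type_str ON type_str.id = c.opType
)
SELECT op_name, SUM(duration_ns) AS total_ns
FROM compute_view GROUP BY op_name ORDER BY total_ns DESC LIMIT 20;
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('MatMul', 2000.0)] -- the two tasks (500 ns + 1500 ns) share one op name
```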
Track B: Underlying Documents / profiler_db_data_format.md (Only for Long-tail Questions)
Only enter Track B when one of the following conditions is met:
- Users explicitly request to query underlying hardware metrics (such as PMU counts, memory allocation, Step division, etc.).
- The requirement is not covered by existing views in the "CTE Macro Definitions".
The core tool for Track B is `get_schema.py` under `scripts/` in the current skill path; its information source is `references/profiler_db_data_format.md`.
1. First obtain the real table names of the current db (Recommended)
Execute sqlite queries on the target db to get the actual tables existing in the current version:
```bash
sqlite3 {db_path} ".tables"
sqlite3 {db_path} "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
```
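If a scripted equivalent of this CLI step is more convenient, a minimal stdlib-only sketch might look like the following (the function name is ours; the demo table names mimic the Profiling schema):

```python
import os
import sqlite3
import tempfile

def list_db_tables(db_path: str) -> list:
    """Return the table names that actually exist in this DB version."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]

# Demo on a throwaway DB:
demo_db = os.path.join(tempfile.mkdtemp(), "demo.db")
with sqlite3.connect(demo_db) as conn:
    conn.executescript("CREATE TABLE TASK (x); CREATE TABLE STRING_IDS (y);")
print(list_db_tables(demo_db))  # ['STRING_IDS', 'TASK']
```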
Note: this step is only used to find out which tables actually exist in the current DB, not for field-level schema parsing. For field descriptions, use `get_schema.py --table_name`.
2. Use scripts for document/DB alignment (Recommended)
- Purpose: Automatically list document table names, current DB table names, or directly perform intersection comparison to reduce manual screening.
- Command line call example:
```bash
cd {skills_storage_path}/ascend-profiler-db-explorer/scripts
python3 get_schema.py --list_tables
python3 get_schema.py --db_path {db_path} --list_db_tables
python3 get_schema.py --db_path {db_path} --compare_doc_db
```
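As an illustration of what such an intersection comparison produces (the real `get_schema.py` output format may differ), a minimal sketch:

```python
# Illustrative doc/DB comparison: intersect the table names listed in the
# reference document with those actually present in the current DB.
def compare_doc_db(doc_tables, db_tables):
    doc, db = set(doc_tables), set(db_tables)
    return {
        "in_both": sorted(doc & db),
        "doc_only": sorted(doc - db),  # likely version differences
        "db_only": sorted(db - doc),   # undocumented in the reference
    }

result = compare_doc_db(
    ["TASK", "COMMUNICATION_OP", "STEP_TIME"],   # documented (invented sample)
    ["TASK", "COMMUNICATION_OP", "CANN_API"],    # present in this DB (invented sample)
)
print(result)
```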
3. get_schema_by_table_name(table_name)
- Purpose: extract the table's section (fields, format, descriptions, etc.) from `references/profiler_db_data_format.md` by table name.
- Parameter meaning:
  - `table_name`: the table name (prefer the table names returned by the sqlite queries above).
- MCP call convention (it is recommended to wrap this as an independent tool at the upper layer):
  - Example tool name: `get_schema_by_table_name`
  - Example input parameters: `table_name`
- Command line call example:
```bash
cd {skills_storage_path}/ascend-profiler-db-explorer/scripts
python3 get_schema.py --table_name TASK
python3 get_schema.py --table_name COMMUNICATION_OP
```
The returned content is the original description paragraph of the table in the reference document.
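The extraction step can be pictured as a heading-scoped slice of the reference document. The sketch below is a hypothetical re-implementation, assuming the reference uses one markdown heading per table name; it is not the real `get_schema.py`:

```python
import re

def extract_table_section(doc_text: str, table_name: str) -> str:
    """Hypothetical sketch: return the block between the heading that names
    the table and the next heading (assumes one markdown heading per table)."""
    pattern = re.compile(
        r"^#+\s*" + re.escape(table_name) + r"\s*$(.*?)(?=^#+\s|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    m = pattern.search(doc_text)
    return m.group(1).strip() if m else ""

doc = "# TASK\nfields of TASK\n# COMMUNICATION_OP\ncomm fields\n"  # toy document
print(extract_table_section(doc, "TASK"))  # fields of TASK
```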
Track B Usage Principles
- Prioritize the real table names obtained from sqlite queries, then call `get_schema.py --table_name` to get the table's official document description.
- When the table cannot be found in the document, you should first suspect "version differences" or "insufficient collection configuration" instead of guessing the field semantics by yourself.
- It is forbidden to use raw sqlite schema output directly as the schema source; to view table fields, call `get_schema.py --table_name` instead.
Execution and Summary
- Execution: after assembling the SQL, call the database execution tool to run the query.
- Summary Output:
- Display the final executed SQL, the number of returned rows, and the first few rows of results, then summarize in the "Question → Evidence → Suggestions" structure.
CTE Macro Definitions (Must be Reused in Track A)
[Highest Warning] The following are CTE macro blocks dedicated to Ascend Profiling. In Track A:
- You must copy the corresponding macro code block in full as the header of the SQL.
- It is strictly forbidden to modify the JOIN logic, field meanings, or calculation logic inside a macro.
1. Operator Computation Detail Macro (Compute Macro)
Purpose: Query operator latency, TopK operators, computation bottlenecks.
```sql
WITH compute_view AS (
    SELECT c.globalTaskId, ROUND(t.endNs - t.startNs) AS duration_ns,
           n.value AS op_name, type_str.value AS op_type
    FROM COMPUTE_TASK_INFO c
    LEFT JOIN TASK t ON t.globalTaskId = c.globalTaskId
    LEFT JOIN STRING_IDS n ON n.id = c.name
    LEFT JOIN STRING_IDS type_str ON type_str.id = c.opType
)
```
2. Communication Detail Macro (Communication Macro)
Purpose: Query HCCL collective communication (AllReduce, AllGather, etc.) latency.
```sql
WITH comm_view AS (
    SELECT ROUND(c.endNs - c.startNs) AS duration_ns, n.value AS op_name,
           t.value AS op_type, g.value AS group_name
    FROM COMMUNICATION_OP c
    LEFT JOIN STRING_IDS n ON n.id = c.opName
    LEFT JOIN STRING_IDS t ON t.id = c.opType
    LEFT JOIN STRING_IDS g ON g.id = c.groupName
)
```
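A toy run of this macro (mock tables and invented op names, Python stdlib `sqlite3`), aggregating latency per communication op type:

```python
import sqlite3

# Toy versions of the two tables the Communication Macro joins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE COMMUNICATION_OP (startNs INTEGER, endNs INTEGER,
                               opName INTEGER, opType INTEGER, groupName INTEGER);
INSERT INTO STRING_IDS VALUES (1, 'hcom_allReduce__1'), (2, 'hcom_allReduce'), (3, 'tp_group');
INSERT INTO COMMUNICATION_OP VALUES (0, 1000, 1, 2, 3), (1000, 4000, 1, 2, 3);
""")

sql = """
WITH comm_view AS (
    SELECT ROUND(c.endNs - c.startNs) AS duration_ns, n.value AS op_name,
           t.value AS op_type, g.value AS group_name
    FROM COMMUNICATION_OP c
    LEFT JOIN STRING_IDS n ON n.id = c.opName
    LEFT JOIN STRING_IDS t ON t.id = c.opType
    LEFT JOIN STRING_IDS g ON g.id = c.groupName
)
SELECT op_type, COUNT(*) AS calls, SUM(duration_ns) AS total_ns
FROM comm_view GROUP BY op_type ORDER BY total_ns DESC LIMIT 10;
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('hcom_allReduce', 2, 4000.0)] -- two calls, 1000 ns + 3000 ns
```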
3. Dispatch Mapping Macro (Dispatch Macro)
Purpose: Compare the latency differences between PyTorch framework dispatch, CANN layer dispatch and underlying execution to locate scheduling congestion.
```sql
WITH dispatch_view AS (
    SELECT
        ROUND(t.endNs - t.startNs) AS task_duration_ns,
        ROUND(c.endNs - c.startNs) AS cann_duration_ns,
        ROUND(p.endNs - p.startNs) AS pytorch_duration_ns,
        c_str.value AS cann_api_name,
        p_str.value AS pytorch_api_name,
        t_str.value AS task_type
    FROM TASK t
    LEFT JOIN CANN_API c ON t.connectionId = c.connectionId
    LEFT JOIN CONNECTION_IDS conn ON conn.connectionId = t.connectionId
    LEFT JOIN PYTORCH_API p ON p.connectionId = conn.id
    LEFT JOIN STRING_IDS c_str ON c.name = c_str.id
    LEFT JOIN STRING_IDS p_str ON p.name = p_str.id
    LEFT JOIN STRING_IDS t_str ON t.taskType = t_str.id
)
```
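A toy run of the Dispatch Macro over the five joined tables (all schemas reduced to the columns the macro touches; the API names and timestamps are invented), comparing the three latency layers per PyTorch API:

```python
import sqlite3

# Toy versions of the five tables the Dispatch Macro joins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE TASK (startNs INTEGER, endNs INTEGER, connectionId INTEGER, taskType INTEGER);
CREATE TABLE CANN_API (connectionId INTEGER, startNs INTEGER, endNs INTEGER, name INTEGER);
CREATE TABLE CONNECTION_IDS (id INTEGER, connectionId INTEGER);
CREATE TABLE PYTORCH_API (connectionId INTEGER, startNs INTEGER, endNs INTEGER, name INTEGER);
INSERT INTO STRING_IDS VALUES (1, 'aclnnMatmul'), (2, 'aten::matmul'), (3, 'AI_CORE');
INSERT INTO TASK VALUES (1000, 1600, 100, 3);
INSERT INTO CANN_API VALUES (100, 400, 900, 1);
INSERT INTO CONNECTION_IDS VALUES (7, 100);
INSERT INTO PYTORCH_API VALUES (7, 100, 1000, 2);
""")

sql = """
WITH dispatch_view AS (
    SELECT ROUND(t.endNs - t.startNs) AS task_duration_ns,
           ROUND(c.endNs - c.startNs) AS cann_duration_ns,
           ROUND(p.endNs - p.startNs) AS pytorch_duration_ns,
           c_str.value AS cann_api_name, p_str.value AS pytorch_api_name,
           t_str.value AS task_type
    FROM TASK t
    LEFT JOIN CANN_API c ON t.connectionId = c.connectionId
    LEFT JOIN CONNECTION_IDS conn ON conn.connectionId = t.connectionId
    LEFT JOIN PYTORCH_API p ON p.connectionId = conn.id
    LEFT JOIN STRING_IDS c_str ON c.name = c_str.id
    LEFT JOIN STRING_IDS p_str ON p.name = p_str.id
    LEFT JOIN STRING_IDS t_str ON t.taskType = t_str.id
)
SELECT pytorch_api_name, pytorch_duration_ns, cann_duration_ns, task_duration_ns
FROM dispatch_view ORDER BY pytorch_duration_ns DESC LIMIT 20;
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('aten::matmul', 900.0, 500.0, 600.0)]
```

A large gap between pytorch_duration_ns and task_duration_ns, as in this mock row, is the kind of evidence that points to scheduling congestion on the host side.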