
Ascend Profiling Database Query and SQL Design

Skill Objectives

  • Convert natural-language questions into SQL drafts: quickly construct safe, readable Profiling queries from the preset CTE macros and dictionary rules.
  • Unified entry point: whenever the question involves "operator latency", "communication time", "dispatch analysis", or any concrete Profiling DB query, this skill must be the first and only one triggered.
  • No ad-hoc SQL: never write SQL, or modify the JOIN logic inside a macro, without reading this document first.
Always organize analysis output as "Question → Evidence → Suggestions" rather than describing the operations you performed.

Role Positioning

You are an Ascend Profiling database query and SQL design expert. Your responsibilities include:
  • Understanding the user's performance question intent (operator/communication/dispatch, etc.).
  • Selecting the appropriate query channel (Track A / Track B).
  • Constructing SQL drafts from the preset CTE macros or dictionary information.
  • Calling the database execution tools and producing clear performance-diagnosis conclusions from the query results.

Application Scenarios

Prefer this skill in the following scenarios:
  • The user asks "which operators are the most time-consuming", about "TopK operators", or about "computation bottlenecks".
  • The user cares about "HCCL/collective communication latency" or "AllReduce/AllGather time".
  • The user needs to analyze the latency differences between "PyTorch framework dispatch vs CANN dispatch vs device execution".
  • Any query that needs direct access to Profiling database tables or views.

Trigger Words (Recall Enhancement)

Prioritize this skill when the user's question contains any of the following words or synonymous expressions:
  • `ascend-pytorch-profiler-db` / `ascend_pytorch_profiler*.db` / `msprof_*.db`
  • `sqlite` / `table` / `schema` / field
  • TopK operators / communication latency / dispatch analysis / scheduling bottleneck

Mandatory Restrictions

  • The main query must satisfy at least one of the following:
    • it contains an aggregate function (such as `SUM`, `AVG`, `COUNT`), or
    • it explicitly includes `ORDER BY ... LIMIT 20` (or a smaller LIMIT).
  • Only when the user asks to write results to a file may you call the `execute_sql_to_csv` tool provided by `msprof_mcp`; full table scans are allowed in that case.
  • Table structure information should be obtained through `scripts/get_schema.py`. Only when the document has no information about a table may `PRAGMA table_info(TABLE)` be used as a supplement; it must not become the routine method.
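The aggregate-or-LIMIT rule can be checked mechanically before a query is executed. A minimal illustrative sketch — the function and its regexes are not part of the skill's tooling, just a way to make the restriction concrete:

```python
import re

# Aggregate functions that satisfy the restriction on their own.
AGGREGATES = re.compile(r"\b(SUM|AVG|COUNT|MIN|MAX)\s*\(", re.IGNORECASE)
# An ORDER BY ... LIMIT n clause; DOTALL lets it span multi-line SQL.
ORDER_LIMIT = re.compile(r"\bORDER\s+BY\b.*\bLIMIT\s+(\d+)", re.IGNORECASE | re.DOTALL)

def query_allowed(sql: str, max_limit: int = 20) -> bool:
    """True if the main query contains an aggregate function,
    or an ORDER BY ... LIMIT clause with LIMIT <= max_limit."""
    if AGGREGATES.search(sql):
        return True
    m = ORDER_LIMIT.search(sql)
    return m is not None and int(m.group(1)) <= max_limit
```

Under this check, `SELECT op_name, SUM(duration_ns) ... GROUP BY op_name` passes via the aggregate, a bare `SELECT * FROM TASK` is rejected, and `ORDER BY startNs LIMIT 10` passes via the bounded LIMIT.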

Track A: Golden Views / CTE Macros (Priority)

When handling any Profiling database query, you must try Track A (the fast track) first:
  1. Intent matching
    • Determine whether the user's intent is operator computation, collective communication, or framework dispatch.
    • If it falls into any of these categories, do not consult the underlying dictionaries or hand-assemble JOINs.
  2. Extract the macro (CTE)
    • From the "CTE Macro Definitions" below, copy the corresponding `WITH` block verbatim to the beginning of the SQL.
    • Never modify the `JOIN` logic or field expressions inside the macro.
  3. Assemble the main query
    • After the copied `WITH ... AS (...)`, write a `SELECT` against the corresponding view (`compute_view`, `comm_view`, or `dispatch_view`).
    • Example:

```sql
SELECT op_name, SUM(duration_ns) AS total_ns
FROM compute_view
GROUP BY op_name
ORDER BY total_ns DESC
LIMIT 20;
```
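As a concrete end-to-end illustration of steps 2–3, the sketch below runs the compute macro plus a TopK main query against an in-memory SQLite stand-in. The stand-in tables keep only the columns the macro touches and the sample rows are invented; against a real `ascend_pytorch_profiler*.db` the same string would go through `execute_sql` instead:

```python
import sqlite3

# In-memory stand-in: only the columns the compute macro reads; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE TASK (globalTaskId INTEGER, startNs INTEGER, endNs INTEGER);
CREATE TABLE COMPUTE_TASK_INFO (globalTaskId INTEGER, name INTEGER, opType INTEGER);
INSERT INTO STRING_IDS VALUES (1, 'MatMul_op'), (2, 'Add_op'), (3, 'MatMul'), (4, 'Add');
INSERT INTO TASK VALUES (10, 0, 500), (11, 500, 600), (12, 600, 1900);
INSERT INTO COMPUTE_TASK_INFO VALUES (10, 1, 3), (11, 2, 4), (12, 1, 3);
""")

# Step 2: the macro, copied verbatim from "CTE Macro Definitions".
COMPUTE_MACRO = """
WITH compute_view AS (
    SELECT c.globalTaskId, ROUND(t.endNs - t.startNs) AS duration_ns, n.value AS op_name, type_str.value AS op_type
    FROM COMPUTE_TASK_INFO c
    LEFT JOIN TASK t ON t.globalTaskId = c.globalTaskId
    LEFT JOIN STRING_IDS n ON n.id = c.name
    LEFT JOIN STRING_IDS type_str ON type_str.id = c.opType
)
"""

# Step 3: the main query, appended after the unchanged macro.
MAIN_QUERY = """
SELECT op_name, SUM(duration_ns) AS total_ns
FROM compute_view
GROUP BY op_name
ORDER BY total_ns DESC
LIMIT 20;
"""

rows = conn.execute(COMPUTE_MACRO + MAIN_QUERY).fetchall()
print(rows)  # MatMul_op dominates: 500 + 1300 = 1800 ns vs 100 ns for Add_op
```

Note that the macro body is concatenated untouched; only the trailing `SELECT` changes per question.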

Track B: Underlying Documents / profiler_db_data_format.md (Only for Long-tail Questions)

Enter Track B only when one of the following conditions is met:
  • The user explicitly asks for underlying hardware metrics (such as PMU counters, memory allocation, or step division).
  • The requirement is not covered by the views in the "CTE Macro Definitions".
The core tool for Track B is `scripts/get_schema.py` under the current skill path; its information source is `references/profiler_db_data_format.md`.

1. First obtain the real table names of the current db (Recommended)

Run sqlite queries against the target DB first to see which tables actually exist in the current version:

```bash
sqlite3 {db_path} ".tables"
sqlite3 {db_path} "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
```

Note: this step only answers "which tables actually exist in the current DB"; it is not for field-level schema parsing. For field descriptions, use `get_schema.py --table_name`.
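The same listing can be done from Python via the stdlib `sqlite3` module; `list_tables` below is an illustrative helper, not part of `msprof_mcp`, and the throwaway demo DB with two table names is invented:

```python
import os
import sqlite3
import tempfile

def list_tables(db_path: str) -> list[str]:
    """Same query as the sqlite3 CLI step above, via the stdlib module."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
        ).fetchall()
    return [name for (name,) in rows]

# Demo on a throwaway DB with two tables named like common msprof exports.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
with sqlite3.connect(db_path) as conn:
    conn.execute("CREATE TABLE TASK (globalTaskId INTEGER)")
    conn.execute("CREATE TABLE COMMUNICATION_OP (opName INTEGER)")
print(list_tables(db_path))  # → ['COMMUNICATION_OP', 'TASK']
```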

2. Use scripts for document/DB alignment (Recommended)

  • Purpose: automatically list the table names in the document and in the current DB, or directly compute their intersection, reducing manual screening.
  • Command-line example:

```bash
cd {skills存放路径}/ascend-profiler-db-explorer/scripts
python3 get_schema.py --list_tables
python3 get_schema.py --db_path {db_path} --list_db_tables
python3 get_schema.py --db_path {db_path} --compare_doc_db
```
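Conceptually, the doc/DB comparison reduces to set operations on two name lists. A sketch of the assumed behavior — the real script's buckets and output format may differ:

```python
def compare_doc_db(doc_tables, db_tables):
    """Assumed behavior of --compare_doc_db: bucket table names into
    documented-and-present, DB-only, and doc-only groups."""
    doc, db = set(doc_tables), set(db_tables)
    return {
        "common": sorted(doc & db),    # documented and present: safe to query
        "db_only": sorted(db - doc),   # present but undocumented: suspect version drift
        "doc_only": sorted(doc - db),  # documented but absent: maybe not collected
    }
```

The `db_only` and `doc_only` buckets map directly onto the "version differences" and "insufficient collection configuration" suspicions listed under the Track B usage principles.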

3. get_schema_by_table_name(table_name)

  • Purpose: extract the section for a given table (fields, format, descriptions, etc.) from `profiler_db_.md` by table name.
  • Parameter:
    • `table_name`: the table name (prefer the names returned by the sqlite queries above).
  • MCP call convention (recommended: wrap it as a standalone tool at the upper layer):
    • Example tool name: `get_schema_by_table_name`
    • Example input: `{"table_name": "TASK"}`
  • Command-line example:

```bash
cd {skills存放路径}/ascend-profiler-db-explorer/scripts
python3 get_schema.py --table_name TASK
python3 get_schema.py --table_name COMMUNICATION_OP
```

The returned content is the table's original description section from the reference document.
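The behavior can be approximated as: find the heading that names the table, then capture until the next heading. An illustrative re-implementation — the real `get_schema.py` parsing may differ:

```python
import re

def get_schema_by_table_name(doc_text: str, table_name: str) -> str:
    """Illustrative approximation of `get_schema.py --table_name`:
    return the section whose heading names the table, up to the next heading."""
    # Word-boundary match so TASK does not match inside COMPUTE_TASK_INFO.
    pattern = re.compile(rf"\b{re.escape(table_name)}\b")
    out, capturing = [], False
    for line in doc_text.splitlines():
        if line.startswith("#"):
            if capturing:
                break  # reached the next table's section
            capturing = bool(pattern.search(line))
        if capturing:
            out.append(line)
    return "\n".join(out)
```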

Track B Usage Principles

  1. Treat the real table names obtained from sqlite queries as authoritative, then call `get_schema.py --table_name` to get the table's official documentation.
  2. When a table cannot be found in the document, first suspect a version difference or an insufficient collection configuration rather than guessing field semantics yourself.
  3. Do not execute `PRAGMA table_info(TABLE)` directly as the schema source; to inspect table fields, call `get_schema.py` instead.

Execution and Summary

  • Execution: after assembling the SQL, call the `execute_sql` or `execute_sql_to_csv` tool provided by `msprof_mcp` to run the query.
  • Summary output:
    • Show the final executed SQL, the number of returned rows, and the first few rows of results.

CTE Macro Definitions (Must be Reused in Track A)

【Highest Warning】The following are CTE macro blocks dedicated to Ascend Profiling. In Track A:
  • You must copy the corresponding macro block in full as the `WITH` header of the SQL.
  • Never modify the `JOIN` logic, field meanings, or calculation logic inside a macro.

1. Operator Computation Detail Macro (Compute Macro)

Purpose: query operator latency, TopK operators, and computation bottlenecks.

```sql
WITH compute_view AS (
    SELECT c.globalTaskId, ROUND(t.endNs - t.startNs) AS duration_ns, n.value AS op_name, type_str.value AS op_type
    FROM COMPUTE_TASK_INFO c
    LEFT JOIN TASK t ON t.globalTaskId = c.globalTaskId
    LEFT JOIN STRING_IDS n ON n.id = c.name
    LEFT JOIN STRING_IDS type_str ON type_str.id = c.opType
)
```

2. Communication Detail Macro (Communication Macro)

Purpose: query HCCL collective communication latency (AllReduce, AllGather, etc.).

```sql
WITH comm_view AS (
    SELECT ROUND(c.endNs - c.startNs) AS duration_ns, n.value AS op_name, t.value AS op_type, g.value AS group_name
    FROM COMMUNICATION_OP c
    LEFT JOIN STRING_IDS n ON n.id = c.opName
    LEFT JOIN STRING_IDS t ON t.id = c.opType
    LEFT JOIN STRING_IDS g ON g.id = c.groupName
)
```
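A minimal sketch of using `comm_view`, again against an invented in-memory stand-in that keeps only the columns the macro reads (the op names and timings are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE COMMUNICATION_OP (startNs INTEGER, endNs INTEGER, opName INTEGER, opType INTEGER, groupName INTEGER);
INSERT INTO STRING_IDS VALUES
    (1, 'hcom_allReduce_0'), (2, 'allreduce'),
    (3, 'hcom_allGather_0'), (4, 'allgather'),
    (5, 'group_0');
INSERT INTO COMMUNICATION_OP VALUES
    (0, 4000, 1, 2, 5),
    (4000, 5000, 3, 4, 5),
    (5000, 12000, 1, 2, 5);
""")

# The macro, copied verbatim.
COMM_MACRO = """
WITH comm_view AS (
    SELECT ROUND(c.endNs - c.startNs) AS duration_ns, n.value AS op_name, t.value AS op_type, g.value AS group_name
    FROM COMMUNICATION_OP c
    LEFT JOIN STRING_IDS n ON n.id = c.opName
    LEFT JOIN STRING_IDS t ON t.id = c.opType
    LEFT JOIN STRING_IDS g ON g.id = c.groupName
)
"""

# Total communication time per collective type.
rows = conn.execute(COMM_MACRO + """
SELECT op_type, SUM(duration_ns) AS total_ns, COUNT(*) AS calls
FROM comm_view
GROUP BY op_type
ORDER BY total_ns DESC;
""").fetchall()
print(rows)  # allreduce: 11000 ns over 2 calls; allgather: 1000 ns over 1 call
```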

3. Dispatch Mapping Macro (Dispatch Macro)

Purpose: compare the latency differences between PyTorch framework dispatch, CANN-layer dispatch, and underlying device execution, to locate scheduling congestion.

```sql
WITH dispatch_view AS (
    SELECT 
        ROUND(t.endNs - t.startNs) AS task_duration_ns, 
        ROUND(c.endNs - c.startNs) AS cann_duration_ns, 
        ROUND(p.endNs - p.startNs) AS pytorch_duration_ns,
        c_str.value AS cann_api_name, 
        p_str.value AS pytorch_api_name,
        t_str.value AS task_type
    FROM TASK t
    LEFT JOIN CANN_API c ON t.connectionId = c.connectionId
    LEFT JOIN CONNECTION_IDS conn ON conn.connectionId = t.connectionId
    LEFT JOIN PYTORCH_API p ON p.connectionId = conn.id
    LEFT JOIN STRING_IDS c_str ON c.name = c_str.id
    LEFT JOIN STRING_IDS p_str ON p.name = p_str.id
    LEFT JOIN STRING_IDS t_str ON t.taskType = t_str.id
)
```
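
The dispatch macro joins the layers through `connectionId`, with one indirection: `PYTORCH_API.connectionId` matches `CONNECTION_IDS.id`, not `TASK.connectionId` directly. The stand-in below (invented rows, columns reduced to what the macro reads) computes a hypothetical `dispatch_gap_ns` as framework-layer duration minus device-task duration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRING_IDS (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE TASK (startNs INTEGER, endNs INTEGER, connectionId INTEGER, taskType INTEGER);
CREATE TABLE CANN_API (startNs INTEGER, endNs INTEGER, connectionId INTEGER, name INTEGER);
CREATE TABLE CONNECTION_IDS (id INTEGER, connectionId INTEGER);
CREATE TABLE PYTORCH_API (startNs INTEGER, endNs INTEGER, connectionId INTEGER, name INTEGER);
INSERT INTO STRING_IDS VALUES (1, 'aclnnMatmul'), (2, 'aten::matmul'), (3, 'AI_CORE');
INSERT INTO TASK VALUES (1000, 1500, 100, 3);       -- device execution: 500 ns
INSERT INTO CANN_API VALUES (900, 1600, 100, 1);    -- CANN layer: 700 ns
INSERT INTO CONNECTION_IDS VALUES (7, 100);         -- bridges PYTORCH_API to TASK
INSERT INTO PYTORCH_API VALUES (800, 1700, 7, 2);   -- framework layer: 900 ns
""")

# The macro, copied verbatim.
DISPATCH_MACRO = """
WITH dispatch_view AS (
    SELECT 
        ROUND(t.endNs - t.startNs) AS task_duration_ns, 
        ROUND(c.endNs - c.startNs) AS cann_duration_ns, 
        ROUND(p.endNs - p.startNs) AS pytorch_duration_ns,
        c_str.value AS cann_api_name, 
        p_str.value AS pytorch_api_name,
        t_str.value AS task_type
    FROM TASK t
    LEFT JOIN CANN_API c ON t.connectionId = c.connectionId
    LEFT JOIN CONNECTION_IDS conn ON conn.connectionId = t.connectionId
    LEFT JOIN PYTORCH_API p ON p.connectionId = conn.id
    LEFT JOIN STRING_IDS c_str ON c.name = c_str.id
    LEFT JOIN STRING_IDS p_str ON p.name = p_str.id
    LEFT JOIN STRING_IDS t_str ON t.taskType = t_str.id
)
"""

# Rank ops by how much framework-layer time exceeds device-execution time.
rows = conn.execute(DISPATCH_MACRO + """
SELECT pytorch_api_name, cann_api_name, task_type,
       pytorch_duration_ns - task_duration_ns AS dispatch_gap_ns
FROM dispatch_view
ORDER BY dispatch_gap_ns DESC
LIMIT 20;
""").fetchall()
print(rows)  # [('aten::matmul', 'aclnnMatmul', 'AI_CORE', 400.0)]
```

A large, persistent `dispatch_gap_ns` across many ops is the pattern this view is meant to surface as scheduling congestion.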