nocobase-data-analysis

Goal

Use NocoBase MCP tools to locate the right collection, query business data safely, and produce reliable summaries or grouped analysis.

Prerequisite

  • NocoBase MCP must already be reachable and authenticated.
  • If MCP returns authentication errors such as `Auth required`, stop and ask the user to refresh MCP authentication before continuing.

Default strategy

  1. Inspect the data source path first.
  2. Prefer the `main` data source.
  3. If the target collection is not in `main`, inspect other enabled data sources.
  4. Once the collection is located, use that `dataSource` explicitly in every subsequent `resource_*` call.
  5. Prefer `resource_query` for counts and grouped analysis only after confirming the query parameter contract.
  6. Fall back to `resource_list` plus manual counting when query results are suspicious or need cross-checking.

Useful references:
  • `references/analysis-patterns.md` for common business analysis shapes
  • `references/metric-checklist.md` for metric definition and scope checks
  • `references/entity-mapping.md` for mapping business terms to collections and fields

Data source discovery

  • If the user explicitly names a data source, use it directly.
  • Otherwise start with `main`.
  • When the target collection is not found in `main`, call `data_sources:list_enabled` and inspect other enabled data sources one by one.
  • If multiple data sources contain the same collection name:
    • default to `main` when `main` is one of them;
    • otherwise present the candidates and explain which one you are using.
  • In the final answer, state which data source was used.
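
The selection rules above can be sketched as a small pure function. This is only an illustration: `choose_data_source` and the `collections_by_source` mapping are hypothetical names, and the mapping is assumed to have been assembled beforehand from the MCP discovery calls. Picking the first candidate alphabetically when `main` is absent is also an assumption; the rule above only requires that the candidates are shown and the choice explained.

```python
from typing import Dict, List, Optional, Set


def choose_data_source(
    collection: str,
    collections_by_source: Dict[str, Set[str]],
    requested: Optional[str] = None,
) -> Dict[str, object]:
    """Pick a data source for `collection` (hypothetical helper)."""
    if requested is not None:
        # A data source named explicitly by the user is used directly.
        return {"dataSource": requested, "candidates": [requested], "ambiguous": False}

    candidates: List[str] = sorted(
        name for name, names in collections_by_source.items() if collection in names
    )
    if not candidates:
        return {"dataSource": None, "candidates": [], "ambiguous": False}
    if "main" in candidates:
        # `main` wins whenever it is one of the candidates.
        return {"dataSource": "main", "candidates": candidates, "ambiguous": False}
    # Assumption: fall back to the first candidate and flag the ambiguity
    # so the caller can present the alternatives.
    return {
        "dataSource": candidates[0],
        "candidates": candidates,
        "ambiguous": len(candidates) > 1,
    }
```

When `ambiguous` is true, the final answer should list the candidates and say which one was used.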

Collection discovery

  • If the user gives a collection name, verify it exists before querying.
  • If the user uses business terms such as "leads", "users", "orders", or "opportunities", inspect collection metadata to map the business term to the actual collection name.
  • Prefer `collections:listMeta` for a fast overview.
  • Then use `collections:get` with `appends: ["fields"]` when you need exact field names, relation targets, or enum options.
  • Use `references/entity-mapping.md` as a reusable heuristic for common business nouns and likely field categories.
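
As a rough illustration of mapping a business term to candidate collections, the sketch below matches a term against collection names and titles. The metadata shape (a list of dicts with `name` and `title` keys) is an assumption made for this example and is not the documented response format of `collections:listMeta`; the singularization is deliberately naive.

```python
from typing import Dict, List


def _singular(word: str) -> str:
    """Very naive singularization; good enough for a first-pass match."""
    if word.endswith("ies"):
        return word[:-3] + "y"
    if word.endswith("s"):
        return word[:-1]
    return word


def match_collections(term: str, collections_meta: List[Dict[str, str]]) -> List[str]:
    """Return names of collections whose name or title contains the term."""
    needle = _singular(term.strip().lower())
    matches: List[str] = []
    for meta in collections_meta:
        haystack = f"{meta.get('name', '')} {meta.get('title', '')}".lower()
        if needle and needle in haystack and "name" in meta:
            matches.append(meta["name"])
    return matches
```

A match is only a candidate: confirm the collection and its fields with `collections:get` before querying.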

Query contract checks

Before using `resource_query`, verify the request shape matches the real backend contract:
  • `measures[].aggregation`, not `aggregate`
  • `orders[].order`, not `direction`
  • `field` should usually be passed as a field path array such as `["id"]` or `["owner", "nickname"]`
  • Use `alias` whenever the result will be referenced in output or `having`

Correct examples:

```json
{
  "resource": "lead",
  "dataSource": "main",
  "measures": [
    { "aggregation": "count", "field": ["id"], "alias": "lead_count" }
  ]
}
```

```json
{
  "resource": "lead",
  "dataSource": "main",
  "dimensions": [
    { "field": ["status"], "alias": "status" }
  ],
  "measures": [
    { "aggregation": "count", "field": ["id"], "alias": "lead_count" }
  ],
  "orders": [
    { "field": ["status"], "alias": "status", "order": "asc" }
  ]
}
```
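
A request can be checked against these rules before it is sent. The function below is a minimal sketch, not part of NocoBase: `check_query_request` is a hypothetical helper, and it treats the "field path array" guideline as strict even though the text says "usually".

```python
from typing import Any, Dict, List


def check_query_request(request: Dict[str, Any]) -> List[str]:
    """Return a list of contract problems; an empty list means none were found."""
    problems: List[str] = []
    if "dataSource" not in request:
        problems.append("dataSource is not set explicitly")

    # Every field reference should be a path array such as ["id"].
    for section in ("dimensions", "measures", "orders"):
        for i, item in enumerate(request.get(section, [])):
            if not isinstance(item.get("field"), list):
                problems.append(f"{section}[{i}].field should be a field path array")

    for i, measure in enumerate(request.get("measures", [])):
        if "aggregate" in measure or "aggregation" not in measure:
            problems.append(f"measures[{i}] should use 'aggregation', not 'aggregate'")

    for i, order in enumerate(request.get("orders", [])):
        if "direction" in order or "order" not in order:
            problems.append(f"orders[{i}] should use 'order', not 'direction'")
    return problems
```

Both correct examples above pass this check with no problems reported.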

Recommended workflow

1. Confirm reachability

  • Use `auth:check`.
  • If authentication fails, stop.

2. Find the collection

  • First inspect `main`.
  • If not found, inspect other enabled data sources.
  • Read fields before querying if field names or relations are uncertain.

3. Start with simple counts

  • Use `resource_query` with a single `count(id)` measure.
  • Keep the first query minimal so you can validate the result shape quickly.

4. Add grouped analysis

Common grouped views:
  • by status
  • by owner
  • by source
  • by department
  • by created date or month
For relation labels, use field paths such as:
  • `["owner", "nickname"]`
  • `["mainDepartment", "title"]`
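
A grouped request over a relation label can be assembled the same way as the flat examples shown earlier. The builder below is a sketch: `build_grouped_count` is a hypothetical helper, and the request shape simply mirrors the `resource_query` examples in this document.

```python
from typing import Any, Dict, List


def build_grouped_count(
    resource: str,
    data_source: str,
    group_path: List[str],
    group_alias: str,
    count_alias: str = "record_count",
) -> Dict[str, Any]:
    """Build a count-by-dimension request body for resource_query."""
    return {
        "resource": resource,
        "dataSource": data_source,  # always explicit once the collection is located
        "dimensions": [{"field": list(group_path), "alias": group_alias}],
        "measures": [{"aggregation": "count", "field": ["id"], "alias": count_alias}],
        "orders": [{"field": list(group_path), "alias": group_alias, "order": "asc"}],
    }


# Example: leads grouped by the owner's nickname.
request = build_grouped_count("lead", "main", ["owner", "nickname"], "owner_name", "lead_count")
```

The relation path must come from the collection's real field metadata, not from a guess.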

5. Cross-check when needed

Re-check with `resource_list` when:
  • the grouped rows look duplicated unexpectedly;
  • the numeric result looks like record IDs instead of counts;
  • totals do not match between summary and grouped output;
  • the collection may be affected by ACL scope or hidden filters.
When cross-checking:
  • fetch enough rows to cover the visible dataset or use pagination;
  • count and group manually from the returned records;
  • compare the manual result with `resource_query`.
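
The manual cross-check can be sketched as follows. It assumes the records from `resource_list` and the rows from `resource_query` have already been collected into plain dictionaries, with the query rows keyed by their aliases; both shapes and the helper name `find_count_mismatches` are assumptions for illustration.

```python
from collections import Counter
from typing import Any, Dict, Hashable, List, Optional, Tuple


def find_count_mismatches(
    records: List[Dict[str, Any]],
    query_rows: List[Dict[str, Any]],
    group_key: str,
    count_alias: str,
) -> Dict[Hashable, Tuple[int, Optional[int]]]:
    """Map each group to (manual_count, reported_count) where the two differ."""
    manual = Counter(record.get(group_key) for record in records)
    reported = {row.get(group_key): row.get(count_alias) for row in query_rows}
    mismatches: Dict[Hashable, Tuple[int, Optional[int]]] = {}
    for group in set(manual) | set(reported):
        if manual.get(group, 0) != reported.get(group):
            mismatches[group] = (manual.get(group, 0), reported.get(group))
    return mismatches


records = [
    {"id": 36, "status": "new"},
    {"id": 54, "status": "new"},
    {"id": 80, "status": "won"},
]
rows = [
    {"status": "new", "lead_count": 2},
    {"status": "won", "lead_count": 80},  # looks like a record ID, not a count
]
print(find_count_mismatches(records, rows, "status", "lead_count"))
# {'won': (1, 80)}
```

An empty result means the manual counts agree with the query for every group that was fetched; it says nothing about rows hidden by ACL scope or pagination.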

6. Present the result

Report:
  • the collection used;
  • the data source used;
  • the total count;
  • the key grouped breakdowns;
  • any caveat such as ACL scope, null values, or fallback to manual verification.
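
One possible way to keep the report items together is a small formatter. This is an illustrative sketch only; `format_report` and its plain-text layout are not prescribed by this document.

```python
from typing import Dict, Iterable, Optional


def format_report(
    collection: str,
    data_source: str,
    total: int,
    breakdown: Dict[Optional[str], int],
    caveats: Iterable[str] = (),
) -> str:
    """Render the collection, data source, total, breakdown, and caveats as text."""
    lines = [
        f"Collection: {collection}",
        f"Data source: {data_source}",
        f"Total count: {total}",
        "Breakdown:",
    ]
    for label, count in breakdown.items():
        shown = "(null)" if label is None else label  # make null groups visible
        lines.append(f"  {shown}: {count}")
    for caveat in caveats:
        lines.append(f"Caveat: {caveat}")
    return "\n".join(lines)
```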

Analysis entry points

Classify the user request before querying:
  • `overview` for current totals and main distributions
  • `distribution` for grouped counts by status, owner, source, team, or department
  • `funnel` for stage-based business progression
  • `trend` for date or month-based change over time
  • `ranking` for top owners, sources, accounts, or products
  • `quality-check` for missing values, null-heavy fields, suspicious statuses, or orphaned relations

Use `references/analysis-patterns.md` for the recommended query shapes for each pattern.

Metric definition checks

Before returning an answer, verify the metric scope:
  • what time range is included
  • which time field drives the range
  • whether the metric is total count, distinct count, sum, or average
  • whether archived, inactive, null, or other terminal states should be included
  • whether grouped totals reconcile with the grand total
Use `references/metric-checklist.md` when the user request is ambiguous or the metric may be interpreted in more than one way.
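
The reconciliation check in the list above is simple arithmetic: the grouped counts should sum to the grand total. A minimal sketch, assuming grouped rows are dictionaries keyed by the measure alias (the helper name `reconcile_totals` is hypothetical):

```python
from typing import Any, Dict, List


def reconcile_totals(
    grand_total: int,
    grouped_rows: List[Dict[str, Any]],
    count_alias: str,
) -> Dict[str, int]:
    """Compare the grand total with the sum of the grouped counts."""
    grouped_total = sum(row[count_alias] for row in grouped_rows)
    return {
        "grand_total": grand_total,
        "grouped_total": grouped_total,
        "difference": grand_total - grouped_total,
    }
```

A non-zero `difference` is not necessarily an error, but it should be explained in the answer, for example by a different filter or ACL scope applied to one of the two queries.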

Common pitfalls

  • Do not assume the collection is in `main`; check `main` first, then search other enabled data sources.
  • Do not omit `dataSource` after the collection has been located.
  • Do not use `aggregate` in query measures; the backend expects `aggregation`.
  • Do not use `direction` in query orders; the backend expects `order`.
  • Do not assume suspicious aggregate output is correct.
  • If a "count" result looks like `36`, `54`, `80`, or another plausible record ID, verify whether aggregation was actually applied.
  • Relation label grouping requires the real relation path and target field, not guessed labels.

Verification checklist

  • MCP is authenticated.
  • The collection exists in the chosen data source.
  • The fields used in `dimensions`, `measures`, `orders`, and `filter` actually exist.
  • `resource_query` uses `aggregation` and `order`.
  • Summary totals match grouped totals, or any mismatch is explained.
  • The final answer states the data source and any verification fallback used.