alibabacloud-cms-dataset

CMS Dataset Lifecycle Management and Querying

Manage CloudMonitor (CMS) dataset resources (list, inspect, create, update, and delete datasets, and execute dataset-level queries) using the `aliyun` CLI.
Architecture: CMS Workspace + Dataset (Schema + Fields) + ExecuteQuery

Installation

Install Aliyun CLI

Run `aliyun version` to verify that the version is >= 3.3.3. If the CLI is not installed or is outdated, follow references/cli-installation-guide.md to install or update it.
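The version check can be scripted. The following is a minimal sketch, assuming that the output of `aliyun version` contains a dotted version number and that `sort -V` (version sort, as in GNU coreutils) is available. `version_ge` and `check_cli_version` are hypothetical helper names, not part of the CLI.

```shell
# version_ge A B: succeed when dotted version A is >= B.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_cli_version: compare the installed CLI against the 3.3.3 minimum.
# Assumes the output of `aliyun version` contains a version such as 3.3.3.
check_cli_version() {
  installed=$(aliyun version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if [ -z "$installed" ]; then
    echo "aliyun CLI not found; see references/cli-installation-guide.md" >&2
    return 1
  fi
  if ! version_ge "$installed" "3.3.3"; then
    echo "aliyun CLI $installed is older than 3.3.3; please update" >&2
    return 1
  fi
  echo "aliyun CLI $installed is recent enough"
}
```

A plain string comparison would rank 3.10.0 below 3.3.3, which is why the sketch relies on version sort.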

Ensure plugins up-to-date

[MUST] Run `aliyun configure set --auto-plugin-install true` to enable automatic plugin installation.
[MUST] Run `aliyun plugin update` to keep any existing plugins up to date.

AI-Mode Lifecycle

At the start of the Core Workflow (before any CLI invocation): [MUST] Enable AI-Mode. AI-mode is required for Agent Skill execution.
```bash
aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-cms-dataset"
```
[MUST] Disable AI-Mode at EVERY exit point. Before delivering the final response for ANY reason, always disable AI-mode first. This applies to ALL exit paths: workflow success, workflow failure, error/exception, user cancellation, session end, or any other scenario where no further CLI commands will be executed.
```bash
aliyun configure ai-mode disable
```
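When the workflow runs as a script, one way to cover every exit path is a shell `EXIT` trap. The sketch below is an illustration only: `run_with_ai_mode` is a hypothetical helper, and it handles normal success and failure of the wrapped command. Whether the trap also fires on signals depends on the shell.

```shell
# run_with_ai_mode CMD [ARGS...]: run CMD with AI-mode enabled and disable
# AI-mode again when the subshell ends, whether CMD succeeded or failed.
# The function body is a subshell, so the trap does not leak to the caller.
run_with_ai_mode() (
  trap 'aliyun configure ai-mode disable' EXIT
  aliyun configure ai-mode enable || exit 1
  aliyun configure ai-mode set-user-agent \
    --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-cms-dataset" || exit 1
  "$@"
)
```

For example, `run_with_ai_mode aliyun cms get-workspace --api-version 2024-03-30 --workspace <workspace>` would return the exit status of the wrapped command after AI-mode has been disabled.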

Authentication

Pre-check: Alibaba Cloud Credentials Required
Security Rules:
  • NEVER read, echo, or print AK/SK values (e.g., `echo $ALIBABA_CLOUD_ACCESS_KEY_ID` is FORBIDDEN)
  • NEVER ask the user to input AK/SK directly in the conversation or command line
  • NEVER use `aliyun configure set` with literal credential values
  • ONLY use `aliyun configure list` to check credential status
```bash
aliyun configure list
```
Check the output for a valid profile (AK, STS, or OAuth identity).
If no valid profile exists, STOP here.
  1. Obtain credentials from the Alibaba Cloud Console
  2. Configure credentials outside of this session (via `aliyun configure` in a terminal or environment variables in the shell profile)
  3. Return and re-run after `aliyun configure list` shows a valid profile

RAM Policy

See references/ram-policies.md for the full permission list.
Minimum required actions: `cms:ListDatasets`, `cms:GetDataset`, `cms:CreateDataset`, `cms:UpdateDataset`, `cms:DeleteDataset`, `cms:ExecuteQuery`.
[MUST] Permission Failure Handling: When any command or API call fails due to permission errors at any point during execution, follow this process:
  1. Read `references/ram-policies.md` to get the full list of permissions required by this SKILL
  2. Use the `ram-permission-diagnose` skill to guide the user through requesting the necessary permissions
  3. Pause and wait until the user confirms that the required permissions have been granted

Parameter Confirmation

IMPORTANT: Parameter Confirmation. Before executing any command or API call, ALL user-customizable parameters (e.g., RegionId, instance names, CIDR blocks, passwords, domain names, resource specifications, etc.) MUST be confirmed with the user. Do NOT assume or use default values without explicit user approval.

| Parameter | Required/Optional | Description | Default Value |
| --- | --- | --- | --- |
| `workspace` | Required | CMS workspace ID | |
| `dataset-name` | Required (except ListDatasets) | Dataset name (4-63 chars) | |
| `region` | Optional | Region ID | CLI profile region |
| `schema` | Required (CreateDataset) | Schema JSON object | |
| `description` | Optional | Dataset description | |
| `query` | Required (ExecuteQuery) | Query text | |
| `type` | Auto-filled (ExecuteQuery) | Query type placeholder. Always `SQL`, do not ask user | `SQL` |
| `max-results` | Optional (ListDatasets) | Max results per page | |
| `next-token` | Optional (ListDatasets) | Pagination token for next page | |

Naming Rules

Dataset names and schema field names must satisfy:
  • Length: 4-63 characters
  • Start with a lowercase letter
  • Only lowercase letters, digits, and underscores
  • No leading/trailing underscore, no consecutive underscores
Regex: `^[a-z](?!.*__)[a-z0-9_]{2,61}[a-z0-9]$`
Invalid names must NOT be silently normalized. Ask the user for a valid replacement.
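The naming rules can be checked before any API call. The regex above uses a lookahead, which POSIX extended regular expressions do not support, so this sketch splits the rule in two: a `case` pattern rejects consecutive underscores, and `grep -E` checks the rest. `is_valid_name` is a hypothetical helper name.

```shell
# is_valid_name NAME: succeed when NAME satisfies the naming rules.
# The function body is a subshell, so the locale change stays local.
is_valid_name() (
  LC_ALL=C; export LC_ALL    # keep [a-z] limited to ASCII lowercase
  case "$1" in
    *[!a-z0-9_]*) exit 1 ;;  # a character outside the allowed set
    *__*) exit 1 ;;          # consecutive underscores
  esac
  # First char a letter, last char a letter or digit, 4-63 chars in total.
  printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9_]{2,61}[a-z0-9]$'
)
```

With this helper, `is_valid_name app_logs` succeeds, while `app__logs`, `Logs`, `abc`, and `logs_` all fail. On failure the caller should ask the user for a replacement rather than rewrite the name.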

Core Workflow

[CRITICAL] Product Binding: All dataset operations in this skill use the `cms` product ONLY. The correct command pattern is `aliyun cms <action> --api-version 2024-03-30`. DO NOT use any other product for dataset operations: `dataworks`, `adb`, `sls`, `maxcompute`, `opensearch`, `pai`, `dashvector`, and every other service are WRONG for this scenario. If a command fails, check parameters and permissions; do NOT switch to a different product.
All commands use `--api-version 2024-03-30`. Do not pass `--endpoint` by default. Use `--region` (not `--region-id`) when specifying a region.
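One way to keep every call on the `cms` product and the pinned API version is a thin wrapper. This is a sketch under assumptions: `cms_call` and the `CMS_REGION` variable are hypothetical names introduced here for illustration.

```shell
# cms_call ACTION [ARGS...]: run a CMS action with the pinned API version.
# When CMS_REGION (hypothetical variable) is set, --region is added;
# otherwise the CLI profile region applies.
cms_call() {
  action=$1
  shift
  if [ -n "${CMS_REGION:-}" ]; then
    aliyun cms "$action" --api-version 2024-03-30 --region "$CMS_REGION" "$@"
  else
    aliyun cms "$action" --api-version 2024-03-30 "$@"
  fi
}
```

A call such as `cms_call get-dataset --workspace <workspace> --dataset-name <dataset-name>` then expands to the command pattern described above, with no `--endpoint` flag.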

0. Verify Workspace Exists

[MUST] Before executing any dataset operation, call `get-workspace` to verify the workspace exists. Do NOT skip this step or use ListDatasets to infer workspace existence.
```bash
aliyun cms get-workspace --api-version 2024-03-30 \
  --workspace <workspace>
```
If the workspace does not exist (the call returns an error), create it via `put-workspace`:
```bash
aliyun cms put-workspace --api-version 2024-03-30 \
  --workspace-name <workspace> \
  --sls-project <sls-project>
```
`--sls-project` is required. If the user does not specify one, use the same value as the workspace name.
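The verify-then-create step can be sketched as a single function. `ensure_workspace` is a hypothetical helper, and it assumes the workspace name has already been confirmed with the user. It also treats any `get-workspace` failure as "workspace missing", which is a simplification: a permission error fails the same way, so a real run should inspect the error before creating anything.

```shell
# ensure_workspace WORKSPACE [SLS_PROJECT]: verify the workspace and create
# it only when the lookup fails. SLS_PROJECT defaults to the workspace name.
ensure_workspace() {
  workspace=$1
  sls_project=${2:-$1}
  if aliyun cms get-workspace --api-version 2024-03-30 \
       --workspace "$workspace" >/dev/null 2>&1; then
    echo "workspace $workspace exists"
    return 0
  fi
  echo "workspace $workspace not found; creating it with SLS project $sls_project"
  aliyun cms put-workspace --api-version 2024-03-30 \
    --workspace-name "$workspace" \
    --sls-project "$sls_project"
}
```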

1. List Datasets

```bash
aliyun cms list-datasets --api-version 2024-03-30 \
  --workspace <workspace> \
  [--dataset-name <filter>] \
  [--max-results <n>] \
  [--next-token <token>]
```
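The square brackets mark optional flags, so a script has to add each one only when a value is present. A minimal sketch, with `list_datasets` as a hypothetical helper that takes the optional values as positional arguments:

```shell
# list_datasets WORKSPACE [FILTER] [MAX_RESULTS] [NEXT_TOKEN]
# Each optional flag is appended only when its argument is non-empty.
list_datasets() {
  workspace=$1 filter=${2:-} max_results=${3:-} next_token=${4:-}
  set -- --workspace "$workspace"
  [ -n "$filter" ] && set -- "$@" --dataset-name "$filter"
  [ -n "$max_results" ] && set -- "$@" --max-results "$max_results"
  [ -n "$next_token" ] && set -- "$@" --next-token "$next_token"
  aliyun cms list-datasets --api-version 2024-03-30 "$@"
}
```

To fetch a further page, pass the pagination token from the previous response as the fourth argument. The name of the response field that carries it is not specified here, so read it from the actual JSON output.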

2. Get Dataset

```bash
aliyun cms get-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name>
```

3. Create Dataset

Safety: Before creating, check whether the dataset already exists via ListDatasets. If it does, inform the user and ask whether to proceed (the API returns an error for duplicates). Always call CreateDataset when the user requests creation; do not silently skip it.
Pass the schema JSON directly as a single-quoted string:
```bash
aliyun cms create-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name> \
  --schema '{"message_text":{"type":"text","chn":true,"embedding":"text-embedding-v4"},"service_name":{"type":"text","chn":false}}' \
  [--description "<description>"]
```
Schema rules:
  • The `--schema` value is the field definitions object directly (not wrapped in an API body).
  • Each top-level key is a field name. Each value has: `type`, optional `chn`, optional `embedding`, optional `jsonKeys`.
  • Only `type: "text"` fields may enable `embedding`. Reject embedding on non-text fields.
  • `jsonKeys` defines nested JSON key structures. Nested keys support `type` and `chn` only (no `embedding`):
```json
{
  "event_data": {
    "type": "text",
    "chn": false,
    "jsonKeys": {
      "source": {
        "type": "text",
        "chn": false
      },
      "message": {
        "type": "text",
        "chn": true
      }
    }
  }
}
```

4. Update Dataset

Safety: Read and show the current description before updating.
Limitation: UpdateDataset can only modify the description. Schema cannot be updated through this API. If the user needs to change the schema, they must delete and recreate the dataset.
```bash
# Show current state
aliyun cms get-dataset --api-version 2024-03-30 \
  --workspace <workspace> --dataset-name <dataset-name>

# Update after user confirms
aliyun cms update-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name> \
  --description "<new-description>"
```

5. Delete Dataset

Safety: Read and show the dataset, then ask for explicit confirmation identifying workspace and dataset name before deleting.
```bash
# Show dataset to confirm
aliyun cms get-dataset --api-version 2024-03-30 \
  --workspace <workspace> --dataset-name <dataset-name>

# Delete after explicit confirmation
aliyun cms delete-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name>
```
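In an interactive script, the show-then-confirm rule for deletion might look like the sketch below. `confirm_and_delete` is a hypothetical helper, and requiring the user to type back the `workspace/dataset` pair is one possible form of explicit confirmation, not a prescribed one.

```shell
# confirm_and_delete WORKSPACE DATASET: show the dataset, then delete it
# only if the user types back the exact "workspace/dataset" pair.
confirm_and_delete() {
  workspace=$1 dataset=$2
  aliyun cms get-dataset --api-version 2024-03-30 \
    --workspace "$workspace" --dataset-name "$dataset" || return 1
  printf 'Type %s/%s to confirm deletion: ' "$workspace" "$dataset"
  read -r answer || answer=""
  if [ "$answer" != "$workspace/$dataset" ]; then
    echo "Confirmation did not match; nothing was deleted."
    return 1
  fi
  aliyun cms delete-dataset --api-version 2024-03-30 \
    --workspace "$workspace" --dataset-name "$dataset"
}
```

Because the function returns early when `get-dataset` fails, a typo in the dataset name never reaches `delete-dataset`.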

6. Execute Query

Safety: When possible, inspect the dataset schema first via GetDataset so field names come from the actual schema.
```bash
aliyun cms execute-query --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name> \
  --type SQL \
  --query '<query>'
```
  • `--type` is a required placeholder. Always pass `SQL`.
  • If the user provides a complete query, preserve it except for safe shell quoting.
  • Natural-language to query: When the user describes an analysis intent in natural language instead of providing a query, first call GetDataset to retrieve the actual schema and field names, then generate a query based on those field names. Never guess field names without inspecting the schema.
  • Present the full JSON response first, then summarize: progress, returned rows, affected rows, and elapsed time.
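A user-supplied query often contains single quotes, which break the `--query '<query>'` form if pasted in directly. One way to preserve the query unchanged is to hold it in a variable and expand it inside double quotes. `run_query` below is a hypothetical helper sketching that approach.

```shell
# run_query WORKSPACE DATASET QUERY: execute QUERY against a dataset.
# "$3" is expanded inside double quotes, so the shell passes the query to
# the CLI as one argument without re-interpreting quotes or spaces in it.
run_query() {
  aliyun cms execute-query --api-version 2024-03-30 \
    --workspace "$1" \
    --dataset-name "$2" \
    --type SQL \
    --query "$3"
}
```

A quoted heredoc (`<<'EOF'`) is one convenient way to load a multi-line query into the variable before calling `run_query <workspace> <dataset-name> "$query"`.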

Success Verification

See references/verification-method.md for step-by-step verification commands for each operation.

Cleanup

To remove a dataset created during this session:
```bash
aliyun cms delete-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name>
```

Output Expectations

  • Show complete JSON first for any API response.
  • Then provide a short human-readable summary.
  • For write previews: include workspace, dataset name, region, description, and full schema.
  • For query results: include progress, row count, affected rows, and elapsed time.

Command Tables

See references/related-commands.md for the full command reference.

| Command | Description |
| --- | --- |
| `aliyun cms get-workspace` | Verify workspace exists |
| `aliyun cms put-workspace` | Create or update a workspace |
| `aliyun cms list-datasets` | List datasets in a workspace |
| `aliyun cms get-dataset` | Get dataset details and schema |
| `aliyun cms create-dataset` | Create a new dataset with schema |
| `aliyun cms update-dataset` | Update dataset description |
| `aliyun cms delete-dataset` | Delete a dataset |
| `aliyun cms execute-query` | Execute a query against a dataset |

Best Practices

  1. Always specify `--api-version 2024-03-30`; the default CMS version (`2019-01-01`) does not support dataset operations.
  2. Validate dataset names and field names against the naming regex before calling CreateDataset.
  3. Use GetDataset to inspect the schema before generating queries; use actual field names, not guesses.
  4. Pass schema JSON directly as a single-quoted string to avoid shell quoting issues.
  5. Always confirm write operations (Create, Update, Delete) with the user before execution.
  6. Check dataset existence before creating to avoid duplicates.
  7. Use `--region` (not `--region-id`) when specifying a region explicitly.
  8. Do not pass `--endpoint` unless explicitly required; if needed, use `cms.<region>.aliyuncs.com`.
  9. For ExecuteQuery, always pass `--type SQL` as a required placeholder.
  10. Prefer inline JSON for `--schema` to avoid temporary file management.
  11. Never switch products. If a `cms` command fails, debug parameters/permissions; do not try `dataworks`, `adb`, `sls`, `maxcompute`, or other products. The "workspace" in this skill is a CMS workspace, not a DataWorks/SLS/MaxCompute project.
  12. Set explicit timeouts. Use `--read-timeout 30 --connect-timeout 10` for metadata operations (list/get/create/update/delete). For ExecuteQuery use `--read-timeout 120 --connect-timeout 10`, as queries may take longer.
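The timeout guidance in item 12 can be applied mechanically by choosing the read timeout from the action name. A sketch, with `cms_with_timeouts` as a hypothetical helper name:

```shell
# cms_with_timeouts ACTION [ARGS...]: run a CMS action with the pinned API
# version and the timeouts suggested above: 120 s read timeout for
# execute-query, 30 s for every other action, 10 s connect timeout for all.
cms_with_timeouts() {
  action=$1
  shift
  case "$action" in
    execute-query) read_timeout=120 ;;
    *) read_timeout=30 ;;
  esac
  aliyun cms "$action" --api-version 2024-03-30 "$@" \
    --read-timeout "$read_timeout" --connect-timeout 10
}
```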

Reference Links

| Reference | Description |
| --- | --- |
| references/ram-policies.md | RAM permission requirements |
| references/related-commands.md | Full CLI command reference |
| references/verification-method.md | Success verification steps |
| references/acceptance-criteria.md | Correct/incorrect pattern examples |
| references/cli-installation-guide.md | CLI installation guide |