alibabacloud-cms-dataset
CMS Dataset Lifecycle Management and Querying
Manage CloudMonitor (CMS) dataset resources (list, inspect, create, update, and delete datasets, and execute dataset-level queries) using the `aliyun` CLI.

Architecture: `CMS Workspace + Dataset (Schema + Fields) + ExecuteQuery`

Installation
Install Aliyun CLI
Run `aliyun version` to verify that the version is >= 3.3.3. If the CLI is not installed or is outdated, follow references/cli-installation-guide.md to install or update it.
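The version check can be scripted. The following is a minimal sketch, assuming `aliyun version` prints a bare dotted version such as `3.3.3` with purely numeric components (confirm this against the actual CLI output). The helper name `version_ge` is hypothetical and is not part of the Aliyun CLI.

```bash
# version_ge HAVE WANT: succeed when dotted version HAVE is >= WANT.
# Compares up to three numeric components; missing components count as 0.
version_ge() (
  have=$1 want=$2
  for i in 1 2 3; do
    h=${have%%.*}
    w=${want%%.*}
    [ "${h:-0}" -gt "${w:-0}" ] && exit 0
    [ "${h:-0}" -lt "${w:-0}" ] && exit 1
    # Drop the component just compared; leave empty when none remain.
    case $have in *.*) have=${have#*.} ;; *) have= ;; esac
    case $want in *.*) want=${want#*.} ;; *) want= ;; esac
  done
  exit 0
)

# Hypothetical usage (assumes the bare-version output format noted above):
#   version_ge "$(aliyun version)" 3.3.3 || echo "Install or update the CLI first"
```

Comparing component by component avoids the trap of a plain string comparison, which would rank `3.10.0` below `3.3.3`.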
Ensure plugins up-to-date

[MUST] Run `aliyun configure set --auto-plugin-install true` to enable automatic plugin installation.
[MUST] Run `aliyun plugin update` to ensure that any existing plugins are always up-to-date.
AI-Mode Lifecycle
At the start of the Core Workflow (before any CLI invocation):

[MUST] Enable AI-Mode. AI-mode is required for Agent Skill execution.

```bash
aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-cms-dataset"
```

[MUST] Disable AI-Mode at EVERY exit point. Before delivering the final response for ANY reason, always disable AI-mode first. This applies to ALL exit paths: workflow success, workflow failure, error/exception, user cancellation, session end, or any other scenario where no further CLI commands will be executed.

```bash
aliyun configure ai-mode disable
```
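When the workflow is driven from a script, one way to honor the "disable at every exit point" rule is to wrap the work in a function that always runs the disable command afterwards. The sketch below uses only the `aliyun configure ai-mode` commands shown above; the wrapper name `with_ai_mode` is hypothetical. It does not cover the shell being killed by a signal, which would need an additional `trap`.

```bash
# with_ai_mode CMD [ARGS...]: enable AI-mode, run CMD, then always disable
# AI-mode, returning CMD's exit status (or the status of a failed setup step).
with_ai_mode() {
  local rc=0
  if aliyun configure ai-mode enable &&
     aliyun configure ai-mode set-user-agent \
       --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-cms-dataset"; then
    "$@"
    rc=$?
  else
    rc=$?
  fi
  # Runs on success and on failure alike.
  aliyun configure ai-mode disable
  return "$rc"
}

# Hypothetical usage, where run_workflow is a function defined by the caller:
#   with_ai_mode run_workflow
```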
Authentication
Pre-check: Alibaba Cloud Credentials Required

Security Rules:
- NEVER read, echo, or print AK/SK values (e.g., `echo $ALIBABA_CLOUD_ACCESS_KEY_ID` is FORBIDDEN)
- NEVER ask the user to input AK/SK directly in the conversation or command line
- NEVER use `aliyun configure set` with literal credential values
- ONLY use `aliyun configure list` to check credential status

```bash
aliyun configure list
```

Check the output for a valid profile (AK, STS, or OAuth identity).

If no valid profile exists, STOP here.
- Obtain credentials from Alibaba Cloud Console
- Configure credentials outside of this session (via `aliyun configure` in terminal or environment variables in shell profile)
- Return and re-run after `aliyun configure list` shows a valid profile
RAM Policy
See references/ram-policies.md for the full permission list.

Minimum required actions: `cms:ListDatasets`, `cms:GetDataset`, `cms:CreateDataset`, `cms:UpdateDataset`, `cms:DeleteDataset`, `cms:ExecuteQuery`.

[MUST] Permission Failure Handling: When any command or API call fails due to a permission error at any point during execution, follow this process:
- Read `references/ram-policies.md` to get the full list of permissions required by this SKILL
- Use the `ram-permission-diagnose` skill to guide the user through requesting the necessary permissions
- Pause and wait until the user confirms that the required permissions have been granted
Parameter Confirmation
IMPORTANT: Parameter Confirmation — Before executing any command or API call, ALL user-customizable parameters (e.g., RegionId, instance names, CIDR blocks, passwords, domain names, resource specifications, etc.) MUST be confirmed with the user. Do NOT assume or use default values without explicit user approval.
| Parameter | Required/Optional | Description | Default Value |
|---|---|---|---|
| `--workspace` | Required | CMS workspace ID | — |
| `--dataset-name` | Required (except ListDatasets) | Dataset name (4-63 chars) | — |
| `--region` | Optional | Region ID | CLI profile region |
| `--schema` | Required (CreateDataset) | Schema JSON object | — |
| `--description` | Optional | Dataset description | — |
| `--query` | Required (ExecuteQuery) | Query text | — |
| `--type` | Auto-filled (ExecuteQuery) | Query type placeholder. Always `SQL` | `SQL` |
| `--max-results` | Optional (ListDatasets) | Max results per page | — |
| `--next-token` | Optional (ListDatasets) | Pagination token for next page | — |
Naming Rules
Dataset names and schema field names must satisfy:
- Length: 4-63 characters
- Start with a lowercase letter
- Only lowercase letters, digits, and underscores
- No leading/trailing underscore, no consecutive underscores
Regex: `^[a-z](?!.*__)[a-z0-9_]{2,61}[a-z0-9]$`

Invalid names must NOT be silently normalized. Ask the user for a valid replacement.
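The naming regex relies on a lookahead, which POSIX shell patterns and `grep -E` do not support. The sketch below expresses the same four rules as separate checks instead. The function name `is_valid_name` is hypothetical. It only reports validity and never rewrites the name, in line with the rule against silent normalization.

```bash
# is_valid_name NAME: succeed only if NAME satisfies the naming rules.
# Runs in a subshell so the locale change does not leak to the caller.
is_valid_name() (
  LC_ALL=C   # byte-wise ranges, so [a-z] cannot match uppercase letters
  name=$1
  [ "${#name}" -ge 4 ] || exit 1       # minimum length
  [ "${#name}" -le 63 ] || exit 1      # maximum length
  case $name in
    [!a-z]*)      exit 1 ;;  # must start with a lowercase letter
    *[!a-z0-9_]*) exit 1 ;;  # only lowercase letters, digits, underscores
    *_)           exit 1 ;;  # no trailing underscore
    *__*)         exit 1 ;;  # no consecutive underscores
  esac
  exit 0
)

# Example: reject early and ask the user for a replacement.
#   is_valid_name "$dataset_name" || echo "Invalid name: $dataset_name"
```

A leading underscore is already excluded by the first pattern, so it needs no separate check.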
Core Workflow
[CRITICAL] Product Binding: All dataset operations in this skill use the `cms` product ONLY. The correct command pattern is `aliyun cms <action> --api-version 2024-03-30`. DO NOT use any other product for dataset operations: `dataworks`, `adb`, `sls`, `maxcompute`, `opensearch`, `pai`, `dashvector`, or any other service are WRONG for this scenario. If a command fails, check parameters and permissions; do NOT switch to a different product.

All commands use `--api-version 2024-03-30`. Do not pass `--endpoint` by default. Use `--region` (not `--region-id`) when specifying a region.

0. Verify Workspace Exists
[MUST] Before executing any dataset operation, call `get-workspace` to verify the workspace exists. Do NOT skip this step or use ListDatasets to infer workspace existence.

```bash
aliyun cms get-workspace --api-version 2024-03-30 \
  --workspace <workspace>
```

If the workspace does not exist (returns error), create it via `put-workspace`:

```bash
aliyun cms put-workspace --api-version 2024-03-30 \
  --workspace-name <workspace> \
  --sls-project <sls-project>
```

1. List Datasets
```bash
aliyun cms list-datasets --api-version 2024-03-30 \
  --workspace <workspace> \
  [--dataset-name <filter>] \
  [--max-results <n>] \
  [--next-token <token>]
```

2. Get Dataset
```bash
aliyun cms get-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name>
```

3. Create Dataset
Safety: Before creating, check whether the dataset already exists via ListDatasets. If the dataset already exists, inform the user and ask whether to proceed (the API will return an error for duplicates). Always call CreateDataset when the user requests creation — do not silently skip it.
Pass the schema JSON directly as a single-quoted string:
```bash
aliyun cms create-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name> \
  --schema '{"message_text":{"type":"text","chn":true,"embedding":"text-embedding-v4"},"service_name":{"type":"text","chn":false}}' \
  [--description "<description>"]
```

Schema rules:
- The `--schema` value is the field definitions object directly (not wrapped in an API body).
- Each top-level key is a field name. Each value has: `type`, optional `chn`, optional `embedding`, optional `jsonKeys`.
- Only `type: "text"` fields may enable `embedding`. Reject embedding on non-text fields.
- `jsonKeys` defines nested JSON key structures. Nested keys support `type` and `chn` only (no `embedding`):

```json
{
  "event_data": {
    "type": "text",
    "chn": false,
    "jsonKeys": {
      "source": {
        "type": "text",
        "chn": false
      },
      "message": {
        "type": "text",
        "chn": true
      }
    }
  }
}
```
4. Update Dataset
Safety: Read and show the current description before updating.
Limitation: UpdateDataset can only modify the description. Schema cannot be updated through this API. If the user needs to change the schema, they must delete and recreate the dataset.
```bash
# Show current state
aliyun cms get-dataset --api-version 2024-03-30 \
  --workspace <workspace> --dataset-name <dataset-name>

# Update after user confirms
aliyun cms update-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name> \
  --description "<new-description>"
```

5. Delete Dataset
Safety: Read and show the dataset, then ask for explicit confirmation identifying workspace and dataset name before deleting.
```bash
# Show dataset to confirm
aliyun cms get-dataset --api-version 2024-03-30 \
  --workspace <workspace> --dataset-name <dataset-name>

# Delete after explicit confirmation
aliyun cms delete-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name>
```
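For an interactive terminal session, the show-then-confirm sequence could be sketched as below. This assumes confirmation is typed on standard input; an agent would instead collect the confirmation in conversation. The function name `confirm_and_delete` and the "workspace/dataset" confirmation phrase are hypothetical choices, not part of the CLI.

```bash
# confirm_and_delete WORKSPACE DATASET: show the dataset, then delete it only
# if the operator types back the exact "workspace/dataset" pair.
confirm_and_delete() {
  local workspace="$1" dataset="$2" answer
  # Show the dataset first; stop if it cannot be read.
  aliyun cms get-dataset --api-version 2024-03-30 \
    --workspace "$workspace" --dataset-name "$dataset" || return
  printf 'Type "%s/%s" to delete: ' "$workspace" "$dataset" >&2
  IFS= read -r answer || return 1
  if [ "$answer" != "$workspace/$dataset" ]; then
    echo "Confirmation did not match; nothing deleted." >&2
    return 1
  fi
  aliyun cms delete-dataset --api-version 2024-03-30 \
    --workspace "$workspace" --dataset-name "$dataset"
}
```

Requiring both identifiers in the typed answer makes it harder to delete a same-named dataset in the wrong workspace.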
6. Execute Query
Safety: When possible, inspect the dataset schema first via GetDataset so field names come from the actual schema.
```bash
aliyun cms execute-query --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name> \
  --type SQL \
  --query '<query>'
```

- `--type` is a required placeholder. Always pass `SQL`.
- If the user provides a complete query, preserve it except for safe shell quoting.
- Natural-language to query: When the user describes an analysis intent in natural language instead of providing a query, first call GetDataset to retrieve the actual schema and field names, then generate a query based on those field names. Never guess field names without inspecting the schema.
- Present the full JSON response first, then summarize: progress, returned rows, affected rows, and elapsed time.
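"Safe shell quoting" matters when a user query itself contains single quotes, since the command template wraps `--query` in single quotes. A minimal sketch of one common approach follows: wrap the text in single quotes and rewrite each embedded single quote as `'\''`. The helper name `shell_quote` and the sample query are hypothetical.

```bash
# shell_quote STRING: print STRING wrapped in single quotes, with each
# embedded single quote rewritten as '\'' so the query text is unchanged
# once the shell has parsed the command line.
shell_quote() {
  printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Hypothetical usage when composing a command line as text:
#   query="SELECT * FROM logs WHERE service_name = 'api'"
#   echo "aliyun cms execute-query ... --type SQL --query $(shell_quote "$query")"
```

When the query is already held in a shell variable inside a script, passing `--query "$query"` is enough; explicit quoting is only needed when the command is assembled as a string.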
Success Verification
See references/verification-method.md for step-by-step verification commands for each operation.
Cleanup
To remove a dataset created during this session:
```bash
aliyun cms delete-dataset --api-version 2024-03-30 \
  --workspace <workspace> \
  --dataset-name <dataset-name>
```

Output Expectations
- Show complete JSON first for any API response.
- Then provide a short human-readable summary.
- For write previews: include workspace, dataset name, region, description, and full schema.
- For query results: include progress, row count, affected rows, and elapsed time.
Command Tables
See references/related-commands.md for the full command reference.
| Command | Description |
|---|---|
| `get-workspace` | Verify workspace exists |
| `put-workspace` | Create or update a workspace |
| `list-datasets` | List datasets in a workspace |
| `get-dataset` | Get dataset details and schema |
| `create-dataset` | Create a new dataset with schema |
| `update-dataset` | Update dataset description |
| `delete-dataset` | Delete a dataset |
| `execute-query` | Execute a query against a dataset |
Best Practices
- Always specify `--api-version 2024-03-30`; the default CMS version (`2019-01-01`) does not support dataset operations.
- Validate dataset names and field names against the naming regex before calling CreateDataset.
- Use GetDataset to inspect schema before generating queries; use actual field names, not guesses.
- Pass schema JSON directly as a single-quoted string to avoid shell quoting issues.
- Always confirm write operations (Create, Update, Delete) with the user before execution.
- Check dataset existence before creating to avoid duplicates.
- Use `--region` (not `--region-id`) when specifying a region explicitly.
- Do not pass `--endpoint` unless explicitly required; if needed, use `cms.<region>.aliyuncs.com`.
- For ExecuteQuery, always pass `--type SQL` as a required placeholder.
- Prefer inline JSON for `--schema` to avoid temporary file management.
- Never switch products. If a `cms` command fails, debug parameters/permissions; do not try `dataworks`, `adb`, `sls`, `maxcompute`, or other products. The "workspace" in this skill is a CMS workspace, not a DataWorks/SLS/MaxCompute project.
- Set explicit timeouts. Use `--read-timeout 30 --connect-timeout 10` for metadata operations (list/get/create/update/delete). For ExecuteQuery use `--read-timeout 120 --connect-timeout 10` as queries may take longer.
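Several of these practices (pinned API version, per-operation timeouts, staying on the `cms` product) can be centralized in one wrapper so individual calls cannot forget them. This is a sketch only; the function name `cms_call` is hypothetical, and the flags are the ones listed above.

```bash
# cms_call ACTION [ARGS...]: run `aliyun cms ACTION` with the pinned API
# version and the timeouts from the best practices list.
cms_call() {
  local action="$1" read_timeout=30
  shift
  case $action in
    execute-query) read_timeout=120 ;;  # queries may take longer
  esac
  aliyun cms "$action" --api-version 2024-03-30 \
    --read-timeout "$read_timeout" --connect-timeout 10 "$@"
}

# Hypothetical usage:
#   cms_call get-dataset --workspace "$workspace" --dataset-name "$dataset"
#   cms_call execute-query --workspace "$workspace" --dataset-name "$dataset" \
#     --type SQL --query "$query"
```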
Reference Links
| Reference | Description |
|---|---|
| references/ram-policies.md | RAM permission requirements |
| references/related-commands.md | Full CLI command reference |
| references/verification-method.md | Success verification steps |
| references/acceptance-criteria.md | Correct/incorrect pattern examples |
| references/cli-installation-guide.md | CLI installation guide |