using-datahub

Using DataHub Skills

You have access to 5 DataHub catalog interaction skills. Use this guide to route the user's request to the correct skill.

Skill Routing Table

| User Intent | Skill | Command |
|---|---|---|
| Find or discover entities (search, browse, filter, list) | Search | `/datahub-search` |
| Answer a question about the catalog ("who owns X?", "how many X?") | Search | `/datahub-search` |
| Update metadata (descriptions, tags, glossary terms, ownership, deprecation) | Enrich | `/datahub-enrich` |
| Explore lineage (upstream, downstream, impact, root cause, dependencies) | Lineage | `/datahub-lineage` |
| Data quality (assertions, incidents, health checks) | Quality | `/datahub-quality` |
| Notifications (subscribe to assertion failures, incidents) | Quality | `/datahub-quality` |
| Install CLI, authenticate, verify connection | Setup | `/datahub-setup` |
| Configure default scopes and profiles | Setup | `/datahub-setup` |
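
The routing table can be read as a plain lookup from intent category to command. The sketch below is one possible encoding, not part of DataHub: the intent labels (`find`, `question`, `update-metadata`, and so on) are hypothetical names invented for this example, while the commands are the ones listed in the table.

```bash
# Sketch only: the intent labels are hypothetical; the commands come from
# the routing table.
route_intent() {
  case "$1" in
    find|question)          echo "/datahub-search" ;;
    update-metadata)        echo "/datahub-enrich" ;;
    lineage)                echo "/datahub-lineage" ;;
    quality|notifications)  echo "/datahub-quality" ;;
    install|configure)      echo "/datahub-setup" ;;
    *)                      return 1 ;;  # no match: do not guess a skill
  esac
}

route_intent lineage
route_intent billing || echo "no matching skill; ask the user"
```

Returning a non-zero status for an unknown label, rather than falling back to a default skill, keeps the lookup from silently picking a skill the request did not call for.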

Disambiguation Rules

When the intent is ambiguous, use these rules:

"Tag" requests

  • All tag operations (PII, sensitive, important, reviewed, team-x) → Enrich (general metadata)

"Domain" requests

  • Filter search to a domain → Search (scoped search)
  • Configure default domain → Setup (profile configuration)

"Quality" or "health" requests

  • Failing assertions, active incidents, health status → Quality
  • Create assertions, run quality checks, raise incidents → Quality
  • Subscribe to assertion failures or incidents → Quality
  • Metadata quality/documentation/ownership coverage → Use Search to gather the data and synthesize the answer

Lineage vs. Search

  • "What feeds into X" / "what depends on X" / "impact of changing X" → Lineage
  • "What dashboards use table X" → Lineage (relationship traversal)
  • "Who owns X" / "what is X" → Search (metadata lookup)

Setup vs. other skills

  • "Set up" / "install" / "authenticate" / "verify connection" → Setup
  • "Configure defaults" / "set default platform" / "create profile" → Setup
  • "Check if DataHub is working" → Setup (connectivity verification)


CLI Attribution

When running `datahub` CLI commands, pass `-C skill=<name>` on the root command so usage can be attributed:

```bash
datahub -C skill=datahub-search search "revenue"
datahub -C skill=datahub-enrich graphql --query '...'
datahub -C skill=datahub-lineage lineage --urn "..."
```

Use the skill name from the YAML frontmatter. If `-C` is not recognized, omit it; the command works the same without it.
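
One way to apply the attribution rule consistently is a small wrapper that puts `-C skill=<name>` on the root command and can be switched off for CLI versions that do not recognize the flag. This is a minimal sketch under stated assumptions: `datahub_as`, `DATAHUB_BIN`, and `DATAHUB_NO_ATTRIBUTION` are hypothetical names made up for this example and are not part of the `datahub` CLI.

```bash
# Sketch only: datahub_as, DATAHUB_BIN and DATAHUB_NO_ATTRIBUTION are
# hypothetical; they are not provided by the datahub CLI.
datahub_as() {
  local skill="$1"
  shift
  local bin="${DATAHUB_BIN:-datahub}"
  if [ "${DATAHUB_NO_ATTRIBUTION:-0}" = "1" ]; then
    # The installed CLI does not recognize -C: run the command unchanged.
    "$bin" "$@"
  else
    # -C goes on the root command, before the subcommand.
    "$bin" -C "skill=${skill}" "$@"
  fi
}

# Example call (commented out so the sketch runs without DataHub installed):
# datahub_as datahub-search search "revenue"
```

An explicit switch is used here instead of retrying a failed command without `-C`, since an automatic retry could run a metadata write twice when the failure had nothing to do with the flag.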

Critical Rules

  1. Never guess the skill. If the intent is genuinely ambiguous, ask the user to clarify.
  2. One skill per request unless the user explicitly asks for multiple operations.
  3. Lineage is for lineage only — not for general "what is this entity?" questions (that's Search).
  4. Search handles ad-hoc questions. "Who owns X?" and "what columns does X have?" are Search questions, not Lineage.
  5. Enrich handles all metadata writes — descriptions, tags, glossary terms, ownership, deprecation.
  6. Quality handles data quality — assertions, incidents, health checks, subscriptions.
  7. Setup handles environment and configuration — CLI install, auth, connectivity, default scopes.
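
Rules 1 and 2 amount to a small decision step once the candidate skills for a request are known. The sketch below illustrates that step only. `decide_route` is a hypothetical helper: its first argument says whether the user explicitly asked for multiple operations (`yes` or `no`), and the remaining arguments are the skills the request appears to match. How those candidates are identified is outside the sketch.

```bash
# Sketch only: decide_route is a hypothetical helper illustrating rules 1 and 2.
decide_route() {
  local explicit_multi="$1"
  shift
  case "$#" in
    0) echo "ask: no skill matched, request clarification" ;;
    1) echo "run: $1" ;;
    *)
      if [ "$explicit_multi" = "yes" ]; then
        # The user explicitly asked for several operations.
        echo "run: $*"
      else
        # Genuinely ambiguous: never guess, ask the user instead.
        echo "ask: ambiguous between $*"
      fi
      ;;
  esac
}

decide_route no Search Lineage
decide_route yes Enrich Quality
```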