sf-datacloud-prepare


sf-datacloud-prepare: Data Cloud Prepare Phase

Use this skill when the user needs ingestion and lake preparation work: data streams, Data Lake Objects, transforms, or DocAI-based extraction.

When This Skill Owns the Task

Use
sf-datacloud-prepare
when the work involves:
  • sf data360 data-stream *
  • sf data360 dlo *
  • sf data360 transform *
  • sf data360 docai *
  • choosing how data should enter Data Cloud
Delegate elsewhere when the user is:
  • still creating/testing source connections → sf-datacloud-connect
  • mapping to DMOs or designing IR/data graphs → sf-datacloud-harmonize
  • querying ingested data → sf-datacloud-retrieve


Required Context to Gather First

Ask for or infer:
  • target org alias
  • source connection name
  • source object / dataset
  • desired stream type
  • DLO naming expectations
  • whether the user is creating, updating, running, or deleting a stream


Core Operating Rules

核心操作规则

  • Verify the external plugin runtime before running Data Cloud commands.
  • Run the shared readiness classifier before mutating ingestion assets:
    node ~/.claude/skills/sf-datacloud/scripts/diagnose-org.mjs -o <org> --phase prepare --json
  • Prefer inspecting existing streams and DLOs before creating new ingestion assets.
  • Suppress linked-plugin warning noise with 2>/dev/null for normal usage.
  • Treat DLO naming and field naming as Data Cloud-specific, not CRM-native.
  • Hand off to Harmonize only after ingestion assets are clearly healthy.
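
As a quick runtime check for the first rule, a minimal sketch (the plugin name data360 is inferred from the commands in this document; confirm the exact install steps against ../sf-datacloud/references/plugin-setup.md):

```bash
# List installed Salesforce CLI plugins and look for the data360 plugin;
# warn if it is missing so Data Cloud commands are not run against a bare CLI.
sf plugins 2>/dev/null | grep -i data360 \
  || echo "data360 plugin not found - see plugin-setup.md before continuing"
```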


Recommended Workflow

1. Classify readiness for prepare work

bash
node ~/.claude/skills/sf-datacloud/scripts/diagnose-org.mjs -o <org> --phase prepare --json

2. Inspect existing ingestion assets

bash
sf data360 data-stream list -o <org> 2>/dev/null
sf data360 dlo list -o <org> 2>/dev/null
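
When an org has many assets, the list output can be narrowed with standard shell filtering (a sketch, assuming the list commands print one plain-text row per asset; <connection> is the source connection name gathered earlier):

```bash
# Narrow the stream list to entries mentioning a given source connection,
# so the relevant stream can be passed to `data-stream get` in the next step.
sf data360 data-stream list -o <org> 2>/dev/null | grep -i "<connection>"
```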

3. Create or inspect streams intentionally

bash
sf data360 data-stream get -o <org> --name <stream> 2>/dev/null
sf data360 data-stream create-from-object -o <org> --object Contact --connection SalesforceDotCom_Home 2>/dev/null
sf data360 data-stream create -o <org> -f stream.json 2>/dev/null

4. Check DLO shape

bash
sf data360 dlo get -o <org> --name Contact_Home__dll 2>/dev/null

5. Only then move into harmonization

Once the stream and DLO are healthy, hand off to sf-datacloud-harmonize.


High-Signal Gotchas

  • CRM-backed stream behavior is not the same as fully custom connector-framework ingestion.
  • Stream deletion can also delete the associated DLO unless the delete mode says otherwise.
  • DLO field naming differs from CRM field naming.
  • Query DLO record counts with Data Cloud SQL instead of assuming list output is sufficient.
  • A CdpDataStreams error means the stream module is gated for the current org/user; guide the user to a provisioning/permissions review instead of retrying blindly.
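
For the record-count gotcha above, a minimal sketch of the verification query (Data Cloud SQL; the DLO name reuses the illustrative example from step 4 and should be replaced with the real DLO):

```sql
-- Count the records actually landed in the DLO
-- rather than trusting the list output.
SELECT COUNT(*) FROM Contact_Home__dll;
```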


Output Format

text
Prepare task: <stream / dlo / transform / docai>
Source: <connection + object>
Target org: <alias>
Artifacts: <stream names / dlo names / json definitions>
Verification: <passed / partial / blocked>
Next step: <harmonize or retrieve>


References

  • README.md
  • ../sf-datacloud/assets/definitions/data-stream.template.json
  • ../sf-datacloud/references/plugin-setup.md
  • ../sf-datacloud/references/feature-readiness.md