carto-create-workflow
CARTO Workflows is a visual DAG builder that compiles to warehouse SQL. Each workflow runs inside a connected warehouse; no CARTO compute is involved at execution time. This skill covers the full lifecycle: building the DAG (the bulk of this file), operating it via the CLI (CRUD, schedules), and cross-profile copy (`dev → prod` promotion, customer-segregated workspaces via `carto workflows copy`). See the references below.

For one-off ad-hoc SQL, use `carto-query-datawarehouse`; workflows are for repeatable, scheduled, multi-step DAGs.

Bundle structure, component schemas, input formats, and gotchas are all served by the CLI. Never hardcode or assume them. The CLI is the source of truth.

Live introspection commands (use these before reaching for any reference file):
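
| Command | What it serves |
|---|---|
| `carto workflows schema` | Index of all bundle/DAG schema sections |
| `carto workflows schema bundle` | Top-level bundle shape (id, title, connectionId, config, privacy, tags) |
| `carto workflows schema config` | Full DAG config (schemaVersion, connectionProvider enum, nodes, edges, variables, viewport, useCache, executionSettings, schedule) |
| `carto workflows schema node` | Generic node shape |
| `carto workflows schema node.source` | Source node shape |
| `carto workflows schema node.customsql` | Full customsql node spec |
| `carto workflows schema customsql` | Copy-paste customsql node template |
| `carto workflows schema edge` | Edge shape |
| `carto workflows schema handles` | Edge handle naming reference: sourceHandle/targetHandle by node type, by operator, by component. Critical for valid edges. |
| | Variable (parameter) shape |
| | Declarative schedule metadata fields |
| `carto workflows schema enums` | All valid enums (node types, providers, privacies, schedule frequencies) |
| `carto workflows components list --connection <connection> --json` | Component catalog for the connected warehouse |
| `carto workflows components get <name1>,<name2> --connection <connection> --json` | Per-component `inputs`, `outputs`, and `notes` |
| `carto workflows components get <name1>,<name2> --connection <connection> --input-formats --json` | Input-type `format`, `examples`, and `pitfalls` |
| | Full command reference, including schedule-expression dialects per engine |
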
References (only for what the CLI doesn't serve):
- `references/providers/`: per-warehouse details (BigQuery, Snowflake, Databricks), covering identifier quoting, column casing, and AT path.
- `references/scheduling.md`: `add` vs `update` semantics, bundle-level schedule warning, activity-log verification.
- `references/mcp-and-api-publish.md`: publishing a workflow as an MCP tool or callable API endpoint. Covers bundle requirements (`native.mcptooloutput` + scoped variables + draft descriptions), `{{@var}}` vs `@var` substitution syntax, the `LIMIT` Number → FLOAT64 gotcha, and post-publish verification.
- `references/cross-profile-copy.md`: `workflows copy` mechanics, connection mapping (`--connection-mapping`/`--connection`), `--skip-source-validation`, why copies are always new workflows.
- `references/schedule-readd.md`: schedules don't transfer across `workflows copy`; how to re-add them, including dialect translation when source and destination engines differ.

`connectionProvider` must match the connection.

`config.connectionProvider` (enum in `schema enums`) must match the connection's actual provider; mismatches generate the wrong SQL dialect and error at runtime. Look it up with `carto connections list --search <name> --json` (`connections get` requires a UUID).
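
The provider check above can be sketched in a few lines of Python. This is a minimal illustration, assuming the bundle is already loaded as a dict and that the connection's provider was read by hand from the `carto connections list` output (its JSON shape is not documented here, so it is passed in as a plain string). `provider_mismatch` is a hypothetical helper, not part of the CLI.

```python
from typing import Optional


def provider_mismatch(bundle: dict, connection_provider: str) -> Optional[str]:
    """Return a message if config.connectionProvider differs from the connection's provider."""
    declared = bundle.get("config", {}).get("connectionProvider")
    if declared == connection_provider:
        return None
    return (
        f"config.connectionProvider is {declared!r} "
        f"but the connection's provider is {connection_provider!r}"
    )


# Stand-in bundle containing only the field under test.
bundle = {"config": {"connectionProvider": "bigquery"}}
print(provider_mismatch(bundle, "bigquery"))   # None
print(provider_mismatch(bundle, "snowflake"))  # a mismatch message
```

A check like this only catches the mismatch earlier; `validate` and `verify-remote` remain the authoritative gates.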
Development process
Follow these 6 phases in order for every workflow request. Do not skip or reorder them.
Phase 1 — Gather information
- Identify data sources. If the user named tables, note them. Otherwise discover what's available with `carto connections list` and `carto connections describe <connection> "<fqn>"`.
- Clarify the goal. What transformation? What output? What filters/conditions?
- Determine the connection. Run `carto connections list | head -n 20`. Note its `provider` (`bigquery`/`snowflake`/`databricks`); you will need it for the next step.
- Read the provider reference.
  <critical-rule id="read-provider-reference"> Before writing any node, you MUST open `references/providers/<provider>.md` (e.g. `references/providers/bigquery.md`) and read it end-to-end. This is non-negotiable.<why>It contains identifier-quoting rules, column-casing behaviour, Analytics Toolbox path, schedule-expression dialect, and customsql `$a`/`$b` placeholder requirements that `validate` cannot catch. These only surface later as `verify-remote` failures or runtime SQL errors, and are the single most common cause of late-stage rework.</why><do-not>Do not skip this step because the next phases look concrete. Do not rely on memory of a previous run; provider files change.</do-not> </critical-rule>
- Fetch the component catalog. `carto workflows components list --connection <connection> --json` is your only source of truth for component names.

Phase 2 — Design the approach
- Select components from the catalog you fetched.
- Fetch schemas for every component you plan to use. `carto workflows components get <name1>,<name2>,<name3> --connection <connection> --json` returns `inputs`, `outputs`, and `notes`. Read the `notes` array carefully; it contains gotchas.
- Fetch input type formats. `carto workflows components get <component1>,<component2> --connection <connection> --input-formats --json` returns `format`, `examples`, and `pitfalls` for each input/output type. Pass component names (e.g. `native.buffer`), NOT input-type names.
- Design principles:
  - Preserve identifier and spatial columns throughout.
  - Prefer native components over `native.customsql`. This is not a soft preference. See Native-first rule.
  - H3/Quadbin columns work for visualization without geometry extraction.
  - Use standard names for visualization: `geom`, `h3`, `quadbin`.
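
The two `components get` calls above differ only in the `--input-formats` flag and both take a comma-joined list of component names. A small Python sketch of building those argument lists, assuming you drive the CLI from a script; `components_get_argv` and the connection name `my-connection` are hypothetical, and only flags quoted in this document are used.

```python
import shlex
from typing import List


def components_get_argv(names: List[str], connection: str,
                        input_formats: bool = False) -> List[str]:
    """Build the `components get` argument list for one or more component names."""
    if not names:
        raise ValueError("pass at least one component name")
    argv = ["carto", "workflows", "components", "get", ",".join(names),
            "--connection", connection]
    if input_formats:
        argv.append("--input-formats")
    argv.append("--json")
    return argv


# Take real component names from the catalog; these two appear in this document.
names = ["native.buffer", "native.select"]
print(shlex.join(components_get_argv(names, "my-connection")))
print(shlex.join(components_get_argv(names, "my-connection", input_formats=True)))
```

Passing the list to `subprocess.run` instead of printing it would execute the command; that step is left out here because it needs an authenticated CLI.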
Phase 3 — Present plan, surface gaps, confirm
Present the workflow plan (components, data flow, decisions). Then explicitly enumerate every gap before building:
- Unresolved parameters — thresholds, radii, filter values, time windows, k for k-NN, aggregation columns, output table names, etc.
- Analytical decisions left to the user — significance levels, distance metrics, join types, null-handling, dedup keys, CRS, H3/quadbin resolution.
- Ambiguities in the request — anything where you had to guess intent.
For each gap, propose a sensible default with its rationale (e.g. "p-value threshold: suggest `0.05`, the conventional significance level", "buffer distance: suggest `1000m`, which matches the city-block scale of the input"), and ask the user to confirm or override. Never silently pick a value for a user-facing analytical parameter. Wait for confirmation before building.
Phase 4 — Build the workflow
- Create the workflow file. Get the bundle/node/edge/variable shapes from `carto workflows schema [section]` (start with `bundle`, then `node`, `node.source`, `node.customsql`, `edge`, `handles`). For customsql nodes, copy the template from `carto workflows schema customsql`.

  If you set the optional top-level `privacy`, it must be an object, not a string: `"privacy": { "privacy": "private" }` (the field name nests). Omit the field entirely if you don't need it; `"privacy": "private"` will fail `validate`.

  Source nodes (`type: "source"`): treat `ReadTable` like any other component and fetch its spec with `carto workflows components get ReadTable --connection <conn> --json` to get the canonical `inputs[*].title` and `inputs[*].description`. (`ReadTable` is hidden from `components list` because it's grouped `__internal`, but `get` returns it normally.) Two source-only rules `get` cannot tell you, both from `schema node.source`:
  - The canvas display name lives in `data.label`, NOT `data.title`. Generic nodes use `title`; source nodes use `label`.
  - `data.id` and `data.inputs[0].value` must be the same FQN.

  Canvas layout & naming: apply on every node, every workflow. None of this affects execution, but the user opens the DAG in Builder and a sloppy canvas reads as low quality. The numbers are small and stable; just apply them.
  - Snap grid is 16 px. Every `x` and `y` you write must be `% 16 == 0`. Builder snaps drags to this grid; off-grid values look subtly misaligned next to anything the user nudged.
  - Card widths are fixed by node type: source nodes render at 192 px (12 cells), generic components at 64 px (4 cells). Knowing this is what lets you reason about gaps.
  - Card heights are fixed: every component card and source card is 80 px (5 cells) tall, with a 16 px label rendered below the card body. The label is not part of the card; it lives in the gap to the next card.
  - Canonical inter-card gap (right edge → next left edge): 80 px (5 cells) for tight linear placement; 128 px (8 cells) at a fan-in (a join's left input, where an edge from another row needs room). The gap is the constant; left-edge-to-left-edge Δx differs across patterns only because cards have different widths. So a generic→generic linear step is Δx=144 (9 cells); a source→generic step at the same gap is Δx=272 (17 cells); a generic→generic fan-in step is Δx=192 (12 cells).
  - Canonical vertical gap (card body bottom → next card body top): 80 px (5 cells), of which the first 16 px is the card's label and the remaining 64 px is whitespace. The label always sits inside the gap, never inside the card. So a stacked-card step is top-to-top Δy = 160 px (10 cells): 80 (body) + 16 (label) + 64 (whitespace).
  - Layout. Source nodes stack at the leftmost column with the same `x`, Δy = 144 px (9 cells). The main pipeline runs at the y-midline of the source rows, e.g. sources at y=80 and y=224 → pipeline at y=160 (the exact midline, 152, is off the 16 px grid; 160 is an adjacent grid line). Joins on the midline visually receive both inputs symmetrically.
  - `data.title` and `data.label` are different fields; never duplicate. `title` = short instance-specific verb (≤ 15 chars) describing what this node does in this DAG (`"Rank"`, `"Join to score"`, `"To H3"`). `label` = the component's canonical type name as Builder shows it on a fresh drop (`"Join"`, `"Create Column"`, `"H3 from GeoPoint"`), read from `carto workflows components get <name> --json` → `components[0].title`. Source nodes only render `data.label` on canvas (treat it as a short alias for the table: `"Candidates"`, `"Score grid C"`).

- Run `validate` after every write to the file. It's offline, fast, and catches structural errors immediately:

  ```bash
  carto workflows validate workflow.json --json
  ```

  Treat any save without a passing `validate` as broken; fix before continuing to the next node/edge. `validate` is authoritative. If a component schema from `components get` disagrees with what `validate` accepts, trust `validate` and adjust the bundle to satisfy it. Do not "fix" the bundle to match the schema if it's already passing validation.

- Run `verify` at branch boundaries, not on every save. It hits the warehouse (slower, requires auth), so reserve it for whole sub-DAGs once their structure validates clean, and once at the end before presenting:

  ```bash
  carto workflows verify-remote workflow.json --connection <connection-name> --json
  ```

  `verify` is what catches column-type mismatches, missing tables, and AT resolution, things `validate` cannot see.

- Fix errors silently; don't expose implementation details to the user.

- Iterate until complete, with both `validate` and a final `verify` clean.
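
The canvas arithmetic from the layout rules above can be captured in a few helpers. This is a sketch in Python using only the pixel values stated in this section; the helper names are hypothetical, and rounding halves up in `snap` is an assumption chosen because it reproduces the y=160 example.

```python
GRID = 16                                    # snap grid, px
CARD_WIDTH = {"source": 192, "generic": 64}  # fixed card widths, px
GAP_LINEAR = 80                              # right edge -> next left edge, linear step
GAP_FAN_IN = 128                             # same gap at a join's fan-in
SOURCE_DY = 144                              # top-to-top step between stacked sources


def snap(value: int) -> int:
    """Round to the nearest multiple of GRID (halves round up; an assumption)."""
    return (value + GRID // 2) // GRID * GRID


def next_x(x: int, kind: str, fan_in: bool = False) -> int:
    """Left edge of the card placed to the right of a `kind` card whose left edge is x."""
    gap = GAP_FAN_IN if fan_in else GAP_LINEAR
    return x + CARD_WIDTH[kind] + gap


def source_ys(first_y: int, count: int) -> list:
    """y positions for `count` source cards stacked in the leftmost column."""
    return [first_y + i * SOURCE_DY for i in range(count)]


def pipeline_y(ys: list) -> int:
    """Midline of the source rows, snapped to the grid."""
    return snap((min(ys) + max(ys)) // 2)


ys = source_ys(80, 2)
print(ys, pipeline_y(ys))          # [80, 224] 160
print(next_x(0, "generic"))        # 144
print(next_x(0, "source"))         # 272
print(next_x(0, "generic", True))  # 192
```

Starting from on-grid coordinates, every value these helpers return is a multiple of 16, because all widths, gaps, and steps in the rules are multiples of 16.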
Phase 5 — Present result
Summarize what was built. Confirm validation success. Wait for user confirmation.
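
Before confirming success, a few Phase 4 rules that are easy to get wrong can also be spot-checked locally. The Python sketch below is a hypothetical pre-flight check, not a replacement for `validate`, which stays authoritative. It assumes nodes live under `config.nodes` and uses only the field names quoted in this document (`privacy`, `type`, `data.label`, `data.id`, `data.inputs[0].value`).

```python
from typing import List


def lint_bundle(bundle: dict) -> List[str]:
    """Return human-readable problems for the privacy and source-node rules."""
    problems = []
    # privacy is optional, but when present it must be the nested object form.
    if "privacy" in bundle and not isinstance(bundle["privacy"], dict):
        problems.append('privacy must be an object such as {"privacy": "private"}')
    for index, node in enumerate(bundle.get("config", {}).get("nodes", [])):
        if node.get("type") != "source":
            continue
        data = node.get("data", {})
        inputs = data.get("inputs") or [{}]
        # Source nodes show data.label on the canvas, not data.title.
        if not data.get("label"):
            problems.append(f"node {index}: source node has no data.label")
        # data.id and data.inputs[0].value must be the same FQN.
        if data.get("id") != inputs[0].get("value"):
            problems.append(f"node {index}: data.id and data.inputs[0].value differ")
    return problems


# Stand-in bundle with two deliberate mistakes; table names are made up.
bundle = {
    "privacy": "private",
    "config": {"nodes": [
        {"type": "source",
         "data": {"label": "Candidates", "id": "proj.ds.a",
                  "inputs": [{"value": "proj.ds.b"}]}},
    ]},
}
for problem in lint_bundle(bundle):
    print(problem)
```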
Phase 6 — Upload to CARTO
- Ask if the user wants to upload.
- Upload and provide the URL:

  ```bash
  carto workflows create --file workflow.json --verify
  ```

  The connection comes from `connectionId` inside the bundle; no `--connection` flag here.
- Do NOT auto-execute unless explicitly requested.
Native-first rule
Reach for `native.customsql` only when at least one of the following holds:
- The native chain would require more than ~4-5 nodes to express the same logic.
- A specific operation has no native equivalent at all (verified via `carto workflows components list`).
- The expression genuinely needs raw warehouse SQL (e.g., `ST_UNION_AGG`, `LOGICAL_OR`, `ML.PREDICT`, last-N windowing).

Common operations and their native equivalents — try these first:

| If you'd write SQL like… | Use natives |
|---|---|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| H3 binning / boundary / center / polyfill | |
| | |
| z-score / standardization | |
| weighted composite score | |
| Getis-Ord Gi*, GWR, isolines | |
| Save final node to a table | |

Signals you're reaching for customsql too early (stop and look for a native chain instead):
- The customsql is just a `WHERE` clause, a single `JOIN`, a `GROUP BY` with one or two aggregates, or a column projection.
- It wraps a single warehouse function (`ST_BUFFER`, `H3_FROMGEOGPOINT`, etc.) for which a dedicated native exists.
- Its only purpose is to project/rename/re-cast columns. Use `native.select` (free-form SELECT body, one node) for multiple columns; `native.selectexpression` is for adding a single computed column.
- You're chaining customsql outputs through more customsql nodes. Chain natives instead.

When customsql is genuinely the right call, the per-warehouse SQL-dialect footguns live in the matching `references/providers/*.md` (BigQuery backticks, Snowflake casing, Databricks identifiers).
Fetching component & input information
Do not rely on memorized component schemas or input formats. Always fetch live data from the CLI.

| Command | Purpose |
|---|---|
| `carto workflows components list --connection <connection> --json` | List all available components |
| `carto workflows components get <name1>,<name2> --connection <connection> --json` | Component schemas with `inputs`, `outputs`, and `notes` |
| `carto workflows components get <name1>,<name2> --connection <connection> --input-formats --json` | Input type `format`, `examples`, and `pitfalls` |

What to look for in the response:
- Component `notes`: gotcha strings covering non-obvious behavior, deprecated status, output column naming.
- Input `format`: prose describing the expected value shape.
- Input `examples`: concrete JSON snippets showing correct usage.
- Input `pitfalls`: common mistakes, evaluation order, format quirks.
- Component `version`: copy verbatim into the authored node's `data.version` (string). Generic nodes without it are flagged OUTDATED in Builder.
- Input `options` (Selection / Enum): the engine matches values exactly. Copy each option string verbatim; preserve case, never paraphrase or Title-Case (e.g. spatialjoin's `jointype` accepts `"inner"`, not `"Inner"`).

For values that may evolve over time (component versions, bundle/config defaults, enum option lists), treat the CLI's `components get` / `schema` output as the single source of truth; never hardcode values in your own templates. Specifically:
- `config.schemaVersion`: read the current default from `carto workflows schema config --json` → `properties.schemaVersion.default`. Today it's `"1.0.0"` (string), but resolve at author time so future bumps don't require a skill update.
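
Two of the rules above, resolving `schemaVersion` at author time and matching option strings verbatim, can be sketched in Python as follows. The helper names are hypothetical, the sample dict only mirrors the `properties.schemaVersion.default` path named above, and the option list is a placeholder; real values come from the CLI output.

```python
from typing import List, Optional


def schema_version_default(schema_config: dict) -> str:
    """Read properties.schemaVersion.default from parsed `schema config --json` output."""
    return schema_config["properties"]["schemaVersion"]["default"]


def check_option(value: str, options: List[str]) -> Optional[str]:
    """Return None when value matches an option verbatim, otherwise a hint."""
    if value in options:
        return None
    near = [option for option in options if option.lower() == value.lower()]
    if near:
        return f"{value!r} is not an option; did you mean {near[0]!r}?"
    return f"{value!r} is not an option; valid options: {options}"


# Stand-in data; "other" is a placeholder option, not a documented value.
schema_config = {"properties": {"schemaVersion": {"default": "1.0.0"}}}
options = ["inner", "other"]

print(schema_version_default(schema_config))  # 1.0.0
print(check_option("inner", options))         # None
print(check_option("Inner", options))         # hint pointing at 'inner'
```

The case-insensitive fallback exists only to produce a useful hint; the accepted value is still the verbatim string.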
Provider-specific notes
Different warehouses have different SQL dialects, table-naming conventions, and column-casing rules. Always check the matching provider guide:
- `references/providers/bigquery.md`
- `references/providers/snowflake.md`
- `references/providers/databricks.md`

Input-type formats (`Table`, `Column`, `ColumnsForJoin`, `SelectColumnAggregation`, etc.) and per-component gotchas (including the "AT components need `verify`, not `validate`" rule) are served by the CLI itself; see Fetching component & input information.
Operating a workflow (after it's built)
Once a workflow exists in CARTO, the CLI exposes CRUD and schedule management. Quick reference:

```bash
# List / inspect
carto workflows list --json
carto workflows get <id>

# Update with edited JSON
carto workflows update <id> --file workflow.json

# Add / remove a schedule
carto workflows schedule add <id> --expression "every day 08:00"
carto workflows schedule remove <id>
```

Always-on guidance:
- **Workflows run on the connection's warehouse.** A workflow with a BigQuery connection cannot use Snowflake-specific SQL.
- **Schedule expression syntax depends on the engine** — natural-language for BQ/CARTO DW (`"every day 08:00"`), cron for Snowflake/Postgres (`"0 8 * * *"`), Quartz cron for Databricks (`"0 0 8 * * ?"`). See [`references/scheduling.md`](references/scheduling.md). Picking the wrong dialect fails at schedule-add time.
- **Copying a workflow across profiles** (dev → prod, customer-segregated workspaces) is covered in [`references/cross-profile-copy.md`](references/cross-profile-copy.md). Schedules don't transfer — see [`references/schedule-readd.md`](references/schedule-readd.md).
- **Deleting a workflow doesn't delete its outputs.** Tables/views the workflow created in the warehouse persist; clean them up with `carto sql job` if needed.
- **`workflows update` replaces the whole DAG.** There's no per-node patch. Always `get` first, edit, then `update`.
- **Workflow execution status** lives in the activity log (`WorkflowRun`, `WorkflowExecutionComplete` event types). For health monitoring of scheduled workflows, query that log via [`carto-query-datawarehouse`](../carto-query-datawarehouse); see `references/activity-queries.md` in that skill.
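
The schedule-dialect rule in the guidance above can be illustrated with a short Python sketch that builds a "daily at HH:MM" expression per engine. `daily_schedule_expression` is a hypothetical helper. The provider keys `carto_dw` and `postgres` are assumed identifiers, and the natural-language form is generalized from the single `"every day 08:00"` example, so check `references/scheduling.md` before relying on it.

```python
NATURAL_LANGUAGE = {"bigquery", "carto_dw"}  # "carto_dw" key is an assumption
CRON = {"snowflake", "postgres"}             # "postgres" key is an assumption
QUARTZ = {"databricks"}


def daily_schedule_expression(provider: str, hour: int, minute: int = 0) -> str:
    """Return a daily schedule expression in the dialect the engine expects."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    if provider in NATURAL_LANGUAGE:
        return f"every day {hour:02d}:{minute:02d}"
    if provider in CRON:
        # Five-field cron: minute hour day-of-month month day-of-week
        return f"{minute} {hour} * * *"
    if provider in QUARTZ:
        # Quartz cron: seconds minutes hours day-of-month month day-of-week
        return f"0 {minute} {hour} * * ?"
    raise ValueError(f"unknown provider: {provider!r}")


for provider in ("bigquery", "snowflake", "databricks"):
    print(provider, "->", daily_schedule_expression(provider, 8))
# bigquery -> every day 08:00
# snowflake -> 0 8 * * *
# databricks -> 0 0 8 * * ?
```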