carto-spatial-enrichment

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Spatial Enrichment in CARTO Workflows

CARTO工作流中的空间富集

Prerequisites: Load

carto-create-workflow

for the development process.

This skill covers the universal pattern for enriching spatial data with demographics, risk scores, or any variable from a spatial features dataset.

前置条件：开发过程中需加载

carto-create-workflow

。

本技能涵盖了使用人口统计数据、风险评分或空间特征数据集中的任意变量来富集空间数据的通用模式。

Instructions

操作步骤

Follow the 5-step universal enrichment pattern. Each step maps to one or more workflow components.

遵循以下5步通用富集模式，每个步骤对应一个或多个工作流组件。

Step 1: Load source data

步骤1：加载源数据

Load the business entities (stores, points, polygons) that need enrichment.

Points or polygons from a table: use a
```
ReadTable
```
source node
Custom geometry: use
```
native.tablefromgeojson
```
to inline a GeoJSON polygon

加载需要进行富集的业务实体（门店、点、多边形）。

来自表格的点或多边形：使用
```
ReadTable
```
源节点
自定义几何图形：使用
```
native.tablefromgeojson
```
嵌入GeoJSON多边形

Step 2: Define target area

步骤2：定义目标区域

Choose one method based on the use case:

Method	Component	When to use
Buffer	`native.buffer`	Distance-based area (e.g. 1km around each store)
Isochrones	`native.isolines`	Drive-time or walk-time areas
Direct polygon	(none needed)	Source data is already polygons
Direct points	(none needed)	Skip to Step 4 with `native.enrichpoints`

根据使用场景选择以下方法之一：

方法	组件	使用场景
缓冲区	`native.buffer`	基于距离的区域（例如每个门店周边1公里范围）
等时线	`native.isolines`	驾车或步行可达区域
直接使用多边形	（无需组件）	源数据本身已是多边形
直接使用点	（无需组件）	跳过步骤3，直接使用 `native.enrichpoints`

Step 3: Spatial indexing (polyfill)

步骤3：空间索引（多边形填充）

Convert areas into a grid for enrichment. This step is required when using grid-based enrichment (
native.h3enrich
) or manual JOIN.

H3 grid (most common):
```
native.h3polyfill
```
-- set
```
resolution
```
to match the enrichment dataset
Quadbin grid:
```
native.quadbinpolyfill
```
-- set
```
resolution
```
to match the enrichment dataset

Key decision -- index type: Use H3 unless the enrichment data is natively in Quadbin.

将区域转换为网格以进行富集。当使用基于网格的富集（
native.h3enrich
）或手动JOIN时，此步骤为必填项。

H3网格（最常用）：
```
native.h3polyfill
```
—— 设置
```
resolution
```
以匹配富集数据集
Quadbin网格：
```
native.quadbinpolyfill
```
—— 设置
```
resolution
```
以匹配富集数据集

关键决策——索引类型：除非富集数据原生为Quadbin格式，否则默认使用H3。

Step 4: Enrich

步骤4：执行富集

Two approaches, each with different column naming:

A) CARTO ENRICH procedures (recommended for Data Observatory or spatial features data):

```
native.h3enrich
```
-- enrich an H3 grid
```
native.enrichpoints
```
-- enrich points directly (skip Step 3)
```
native.enrichpolygons
```
-- enrich polygons directly (skip Step 3)

Output columns are named

{variable}_{aggregation}

(e.g.

population_sum

air_quality_avg

)

B) Manual JOIN on the spatial index column:

Use
```
native.join
```
with the H3/Quadbin column as the join key
Output columns from the secondary table get a
```
_joined
```
suffix
Default is INNER JOIN (silently drops unmatched cells)

Aggregation method guidance:

```
SUM
```
-- population counts, totals
```
MAX
```
/
```
MIN
```
-- risk scores, thresholds
```
AVG
```
-- quality metrics, indices

有两种方法，各自的列命名规则不同：

A) CARTO ENRICH 流程（推荐用于数据观测站或空间特征数据）：

```
native.h3enrich
```
—— 富集H3网格
```
native.enrichpoints
```
—— 直接富集点数据（跳过步骤3）
```
native.enrichpolygons
```
—— 直接富集多边形数据（跳过步骤3）

输出列命名格式为

{variable}_{aggregation}

（例如

population_sum

、

air_quality_avg

）

B) 基于空间索引列的手动JOIN：

使用
```
native.join
```
，以H3/Quadbin列作为连接键
从次级表输出的列会添加
```
_joined
```
后缀
默认使用INNER JOIN（自动丢弃未匹配的单元格）

聚合方法指南：

```
SUM
```
—— 人口统计、总计类数据
```
MAX
```
/
```
MIN
```
—— 风险评分、阈值类数据
```
AVG
```
—— 质量指标、指数类数据

Step 5: Save results

步骤5：保存结果

Use

native.saveastable

to persist the enriched output.

If the goal is per-entity enrichment (e.g. population per store), add a second JOIN + GROUP BY to aggregate grid-level results back to the source entity level.

Success: The workflow loads source data, defines areas, indexes to a grid (if needed), enriches with the target variables using the correct aggregation, and saves the result. Column names in downstream references match the enrichment method used.

使用

native.saveastable

保存富集后的输出结果。

如果目标是实现按实体富集（例如每个门店的人口数据），需添加第二次JOIN + GROUP BY操作，将网格级别的结果聚合回源实体级别。

成功标准：工作流需完成源数据加载、区域定义、网格索引（如需要）、使用正确聚合方式富集目标变量，并保存结果。下游引用中的列名需与所使用的富集方法匹配。

Gotchas

注意事项

Provider casing & SQL dialect. This skill documents output column names in lowercase (
```
population_sum
```
,
```
air_quality_avg
```
,
```
<column>_joined
```
, etc.) — BigQuery / Databricks / Postgres / Redshift convention. On Snowflake, unquoted identifiers surface UPPERCASE — reference them as
```
POPULATION_SUM
```
,
```
AIR_QUALITY_AVG
```
,
```
<COLUMN>_JOINED
```
. See
```
carto-create-workflow/references/providers/<provider>.md
```
for casing rules and SQL dialect equivalents.
Resolution alignment is critical. The polyfill resolution MUST match the enrichment dataset's native resolution (e.g. H3 resolution 8 with resolution 8 spatial features). A mismatch produces zero JOIN matches with NO error.
Manual JOIN drops unmatched cells.
```
native.join
```
defaults to INNER JOIN, silently dropping cells with no enrichment data. Use LEFT JOIN if completeness matters.
Deduplicate after polyfill. Use
```
SELECT DISTINCT
```
or GROUP BY on the index column to remove duplicate cells. If you need to preserve source identity (e.g. which store each cell came from), set
```
includecols: true
```
in the polyfill node.
Column naming differs by method. ENRICH procedures produce
```
{variable}_{aggregation}
```
columns. Manual JOIN produces
```
{column}_joined
```
columns. Plan downstream SQL references accordingly.
Buffer distance is in meters. Isoline range units depend on type: seconds for time-based, meters for distance-based.
Re-aggregation needed for entity-level results. After grid enrichment, data is at the cell level. To get per-store or per-location totals, add a second JOIN + GROUP BY step to roll cell-level values back to the source entity.

提供商大小写规则与SQL方言：本技能文档中的输出列名为小写格式（
```
population_sum
```
、
```
air_quality_avg
```
、
```
<column>_joined
```
等）—— 这是BigQuery / Databricks / Postgres / Redshift的约定。在Snowflake中，未加引号的标识符会显示为大写——需以
```
POPULATION_SUM
```
、
```
AIR_QUALITY_AVG
```
、
```
<COLUMN>_JOINED
```
的形式引用。请查看
```
carto-create-workflow/references/providers/<provider>.md
```
获取大小写规则和SQL方言对应说明。
分辨率对齐至关重要：多边形填充的分辨率必须与富集数据集的原生分辨率匹配（例如H3分辨率8对应分辨率8的空间特征数据）。不匹配会导致JOIN无匹配结果且无错误提示。
手动JOIN会丢弃未匹配单元格：
```
native.join
```
默认使用INNER JOIN，会自动丢弃无富集数据的单元格。如果需要保证完整性，请使用LEFT JOIN。
多边形填充后需去重：对索引列使用
```
SELECT DISTINCT
```
或GROUP BY来移除重复单元格。如果需要保留源实体标识（例如每个单元格来自哪个门店），请在多边形填充节点中设置
```
includecols: true
```
。
列命名因方法而异：ENRICH流程生成
```
{variable}_{aggregation}
```
格式的列，手动JOIN生成
```
{column}_joined
```
格式的列。请据此规划下游SQL引用。
缓冲区距离单位为米：等时线范围单位取决于类型：基于时间的为秒，基于距离的为米。
实体级结果需重新聚合：网格富集后的数据为单元格级别。如需获取每个门店或位置的总计数据，需添加第二次JOIN + GROUP BY步骤，将单元格级别的值汇总回源实体级别。

Reference Templates

参考模板

Templates included in this skill folder (from the CARTO Workflows template repository):

File	Pattern	Description
enrich_grid.json	GeoJSON polygon -> H3 polyfill -> ENRICH_GRID	Enrich a custom area with sociodemographic H3 data
enrich_points.json	Filter points -> ENRICH_POINTS	Enrich point locations with polygon-based risk data
estimate_population_around_retail_stores.json	Points -> Buffer -> H3 polyfill -> JOIN -> GROUP BY	Full entity-level enrichment with re-aggregation

本技能文件夹中包含以下模板（来自CARTO工作流模板仓库）：

文件	模式	描述
enrich_grid.json	GeoJSON多边形 -> H3多边形填充 -> ENRICH_GRID	使用社会人口统计H3数据富集自定义区域
enrich_points.json	过滤点 -> ENRICH_POINTS	使用基于多边形的风险数据富集点位置
estimate_population_around_retail_stores.json	点 -> 缓冲区 -> H3多边形填充 -> JOIN -> GROUP BY	包含重新聚合的完整实体级富集流程

Common Variations

常见变体

Variation	Steps used	Key differences
Enrich points directly	1 -> 4 -> 5	Skip grid; use `native.enrichpoints`
Enrich polygons directly	1 -> 4 -> 5	Skip grid; use `native.enrichpolygons`
Buffer + grid enrichment	1 -> 2 -> 3 -> 4 -> 5	`native.buffer` then polyfill then enrich
Isochrone + grid enrichment	1 -> 2 -> 3 -> 4 -> 5	`native.isolines` then polyfill then enrich
Re-aggregate to source entity	1 -> 2 -> 3 -> 4 -> JOIN + GROUP BY -> 5	Add second JOIN to map cells back to source entities

变体	使用步骤	核心差异
直接富集点数据	1 -> 4 -> 5	跳过网格；使用 `native.enrichpoints`
直接富集多边形数据	1 -> 4 -> 5	跳过网格；使用 `native.enrichpolygons`
缓冲区+网格富集	1 -> 2 -> 3 -> 4 -> 5	先使用 `native.buffer` ，再进行多边形填充，最后富集
等时线+网格富集	1 -> 2 -> 3 -> 4 -> 5	先使用 `native.isolines` ，再进行多边形填充，最后富集
重新聚合至源实体	1 -> 2 -> 3 -> 4 -> JOIN + GROUP BY -> 5	添加第二次JOIN，将单元格映射回源实体