carto-spatial-autocorrelation

Spatial Autocorrelation with Moran's I

Builds CARTO Workflows that measure spatial autocorrelation using Moran's I, determining whether a variable exhibits clustering, dispersion, or randomness, and classifying each location into HH/HL/LH/LL quadrants.

Prerequisites: Load `carto-create-workflow` for the development process, JSON structure, and validation commands.

When to use Moran's I vs Getis-Ord Gi*:

Moran's I: "Is there clustering?" + classify into cluster types (HH, HL, LH, LL) + identify spatial outliers (HL, LH)
Getis-Ord Gi*: "Where are the hotspots/coldspots?" + magnitude of clustering (z-scores)

Instructions

A Moran's I workflow follows this pipeline:

Source Data -> (Filter) -> Spatial Indexing (H3) -> Aggregation -> Moran's I -> (Filter Significant) -> Save

Step 1: Load Source Data

Use `native.gettablebyname`. The input table typically contains point geometries or pre-indexed grid data.

Success: Node outputs a table with a geometry column (e.g. `geom`) or an existing spatial index column.

Step 2: Filter (if needed)

Use `native.wheresimplified` or `native.where` to narrow the dataset (e.g. filter by category, date range, non-null values).

Success: Output contains only the subset relevant to the analysis.

Step 3: Spatial Indexing

Convert point geometries to H3 cells using `native.h3frompoint`.

Resolution guidance -- higher resolution = smaller cells = more local patterns:

Resolution	Approx. cell size (average)	Use case
H3 res 7	~1.4 km edge (~5.2 km² area)	District/city-level patterns
H3 res 8	~0.5 km edge (~0.7 km² area)	Neighborhood-level
H3 res 9	~0.2 km edge (~0.1 km² area)	Street-level (used in Berlin POI tutorial)

Success: Every row has a spatial index column (e.g. `h3`).
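
To get a feel for how resolution drives grid size before running the workflow, the cell count can be estimated from the study area. The following is a minimal sketch, not part of the workflow itself: the per-resolution areas are rounded averages from the public H3 resolution table, and `estimate_cell_count` and the 900 km² study area are hypothetical names and values chosen for illustration.

```python
# Approximate average H3 cell areas in km^2 (rounded from the public H3 resolution table).
AVG_CELL_AREA_KM2 = {7: 5.161, 8: 0.737, 9: 0.105}


def estimate_cell_count(area_km2: float, resolution: int) -> int:
    """Rough number of H3 cells needed to cover an area at a given resolution."""
    return round(area_km2 / AVG_CELL_AREA_KM2[resolution])


# Hypothetical 900 km^2 study area (roughly the footprint of a large city).
study_area_km2 = 900.0
estimates = {res: estimate_cell_count(study_area_km2, res) for res in (7, 8, 9)}

for res, cells in estimates.items():
    print(f"H3 res {res}: ~{cells} cells")
```

Each resolution step multiplies the number of cells by roughly seven, so a one-level increase in resolution has a large effect on the size of the Moran's I input.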

Step 4: Aggregate per Cell

Use `native.groupby` to produce one row per cell with a numeric value:

Group by: the spatial index column (`h3`)
Aggregation: `geoid,count` (or `value_col,sum` / `value_col,avg`)

Success: Output has exactly one row per unique cell with a numeric column (e.g. `geoid_count`).
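
The effect of this aggregation can be illustrated outside CARTO with plain Python. This is only a sketch of the group-by logic, not what `native.groupby` executes: the input rows are invented, and the output names `value_col_sum` and `value_col_avg` are assumed by analogy with `geoid_count`.

```python
from collections import Counter, defaultdict

# Hypothetical input rows after spatial indexing: (h3, geoid, value_col).
rows = [
    ("cell_a", "poi_1", 10.0),
    ("cell_a", "poi_2", 30.0),
    ("cell_b", "poi_3", 5.0),
]

# Analogue of the `geoid,count` aggregation: number of features per cell.
geoid_count = Counter(h3 for h3, _geoid, _value in rows)

# Analogue of the `value_col,sum` and `value_col,avg` aggregations.
value_sum = defaultdict(float)
for h3, _geoid, value in rows:
    value_sum[h3] += value
value_avg = {h3: value_sum[h3] / geoid_count[h3] for h3 in value_sum}

# One output row per unique cell, each carrying numeric columns only.
aggregated = [
    {
        "h3": h3,
        "geoid_count": geoid_count[h3],
        "value_col_sum": value_sum[h3],  # assumed output name
        "value_col_avg": value_avg[h3],  # assumed output name
    }
    for h3 in sorted(geoid_count)
]
```

Whichever aggregation is chosen, the resulting numeric column is what gets passed to Moran's I as `valuecol`.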

Step 5: Run Moran's I

Use `native.moransi` with:

Input	Description	Default
`indexcol`	Column with H3/Quadbin indexes	`h3`
`valuecol`	Numeric column to test for autocorrelation	`geoid_count`
`size`	K-ring neighborhood radius (in hops)	`3`
`decay`	Distance decay function for spatial weights	`uniform`

Decay options: `uniform`, `inverse`, `inverse_square`, `exponential`.

`uniform`: Equal weight to all neighbors within the k-ring
`exponential`: Weight decreases exponentially with distance (used in Berlin POI tutorial)

K-ring size: Larger = broader neighborhood = smoother global patterns. Smaller = more localized assessment. The choice of neighborhood size significantly affects results.

Success: Output contains `index`, `morans_i`, `p_value`, and `quadrant` columns for every cell. (See the Provider casing note in Gotchas: Snowflake surfaces these UPPERCASE.)
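
To make the interaction between `size` and `decay` concrete, the sketch below assigns a weight to each hop distance in a k-ring. The formulas are assumptions chosen to match the option names (1, 1/d, 1/d², e^-d); the Analytics Toolbox may define or normalize them differently, and `decay_weight` and `kring_weights` are hypothetical helpers, not CARTO functions.

```python
import math


def decay_weight(distance: int, decay: str = "uniform") -> float:
    """Illustrative weight for a neighbor `distance` hops away (distance >= 1).

    The formulas are assumptions, not the Analytics Toolbox definitions.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1 hop")
    if decay == "uniform":
        return 1.0
    if decay == "inverse":
        return 1.0 / distance
    if decay == "inverse_square":
        return 1.0 / distance**2
    if decay == "exponential":
        return math.exp(-distance)
    raise ValueError(f"unknown decay: {decay}")


def kring_weights(size: int, decay: str) -> dict[int, float]:
    """Weight per hop distance for a k-ring of radius `size`."""
    return {d: decay_weight(d, decay) for d in range(1, size + 1)}


# Compare the four decay options at the default size of 3.
weights_by_decay = {
    name: kring_weights(3, name)
    for name in ("uniform", "inverse", "inverse_square", "exponential")
}
```

With `uniform`, a cell three hops away counts as much as an adjacent one; with the other options the outer rings contribute progressively less, so a larger `size` changes the result less.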

Step 6: Filter Significant Results (recommended)

Use `native.where` to keep only statistically significant cells. Quadrant classification is only meaningful for significant cells.

Common filters:

`p_value < 0.05` -- all significant cells (95% confidence)
`p_value < 0.05 AND quadrant = 'HH'` -- high-value clusters only
`p_value < 0.05 AND (quadrant = 'HL' OR quadrant = 'LH')` -- spatial outliers only

Success: Only cells with statistically meaningful spatial patterns remain.
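
The same filters can be checked on a small sample before wiring them into the workflow. This is a sketch only: the rows are invented to mimic the output columns, and `filter_cells` is a hypothetical helper that mirrors the `native.where` expressions above in plain Python.

```python
# Hypothetical Moran's I output rows (values invented for illustration).
results = [
    {"index": "cell_a", "morans_i": 1.8, "p_value": 0.01, "quadrant": "HH"},
    {"index": "cell_b", "morans_i": 0.2, "p_value": 0.40, "quadrant": "HH"},
    {"index": "cell_c", "morans_i": -0.9, "p_value": 0.03, "quadrant": "HL"},
    {"index": "cell_d", "morans_i": 1.1, "p_value": 0.04, "quadrant": "LL"},
]


def filter_cells(rows, alpha=0.05, quadrants=None):
    """Keep rows with p_value < alpha, optionally restricted to some quadrants."""
    return [
        row
        for row in rows
        if row["p_value"] < alpha and (quadrants is None or row["quadrant"] in quadrants)
    ]


significant = filter_cells(results)                         # p_value < 0.05
clusters_hh = filter_cells(results, quadrants={"HH"})       # ... AND quadrant = 'HH'
outliers = filter_cells(results, quadrants={"HL", "LH"})    # ... AND quadrant IN ('HL', 'LH')
```

Note that `cell_b` carries an `HH` label but is dropped by every filter because its p-value is not significant.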

Step 7: Save

Use `native.saveastable` to persist results. The H3/Quadbin column is directly visualizable in CARTO Builder without geometry conversion.

Success: A validated workflow that can be uploaded via `carto workflows create`.

Output Columns

Column	Meaning
`index`	Spatial index cell ID (H3 or Quadbin)
`morans_i`	Local Moran's I value -- positive = similar neighbors, negative = dissimilar neighbors
`p_value`	Statistical significance -- lower = more confident
`quadrant`	Cluster classification: `HH`, `HL`, `LH`, or `LL`

The engine declares these lowercase. See the Provider casing note in Gotchas for Snowflake.

Interpreting Results

Global Moran's I (overall pattern):

> 0 = spatial clustering (similar values near each other)
< 0 = spatial dispersion (dissimilar values near each other)
Near 0 = spatial randomness

Local quadrants (per-cell classification):

Quadrant	Meaning	Interpretation
HH	High value surrounded by high values	Cluster core
LL	Low value surrounded by low values	Low-value cluster
HL	High value surrounded by low values	Spatial outlier (high anomaly)
LH	Low value surrounded by high values	Spatial outlier (low anomaly)
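
The quadrant logic can be reproduced on a toy example. The sketch below uses the textbook local Moran's I with uniform, row-standardized weights over adjacent cells on a line; it is not the Analytics Toolbox implementation, it computes no p-values, and the cell names and values are invented.

```python
import statistics

# Hypothetical cells arranged on a line; neighbors are the adjacent cells.
values = {"a": 10, "b": 10, "c": 10, "d": 2, "e": 1, "f": 1, "g": 1, "h": 12}
order = list(values)
neighbors = {
    cell: [order[j] for j in (i - 1, i + 1) if 0 <= j < len(order)]
    for i, cell in enumerate(order)
}

# Standardize values to z-scores (population standard deviation).
mean = statistics.fmean(list(values.values()))
std = statistics.pstdev(list(values.values()))
z = {cell: (value - mean) / std for cell, value in values.items()}

# Spatial lag: average z-score of each cell's neighbors.
lag = {cell: statistics.fmean([z[n] for n in neighbors[cell]]) for cell in order}

# Local Moran's I: own z-score times the spatial lag.
local_i = {cell: z[cell] * lag[cell] for cell in order}


def quadrant(z_value: float, lag_value: float) -> str:
    """First letter: the cell itself; second letter: its neighborhood."""
    return ("H" if z_value > 0 else "L") + ("H" if lag_value > 0 else "L")


quadrants = {cell: quadrant(z[cell], lag[cell]) for cell in order}

# With row-standardized weights, the global statistic is the mean of the local values.
global_i = statistics.fmean(list(local_i.values()))
```

Here cells `a` to `c` come out as `HH`, `d` to `f` as `LL`, `g` as `LH` (a low value next to the high `h`), and `h` as `HL`; the global value is positive because the clustered cells outweigh the two outliers.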

Gotchas

Provider casing & SQL dialect. This skill documents columns in lowercase (BigQuery / Databricks / Postgres / Redshift convention). On Snowflake, unquoted identifiers surface UPPERCASE: reference `H3`, `INDEX`, `MORANS_I`, `P_VALUE`, `QUADRANT`, `GEOID_COUNT` in expressions. See `carto-create-workflow/references/providers/<provider>.md` for casing rules and SQL dialect equivalents.
The Moran's I component requires the Analytics Toolbox. Always run `carto workflows verify-remote --connection <conn>` to ensure the AT path is resolved. `carto workflows validate` is offline and cannot resolve the AT location.
The output column is named `index`, not `h3` or `quadbin`. If you need to join back to the original data, rename it (e.g. with `native.renamecolumn`). This is the same behavior as Getis-Ord.
The `valuecol` must be numeric. If you are counting features, the group-by step must produce a count column -- do not pass the raw index column as the value.
Resolution too high + large area = very many cells, which can be slow or hit memory limits. Start with a moderate resolution and refine.
Moran's I is sensitive to the definition of neighborhood. Both k-ring size and decay function choice materially affect results. Document your choices and consider testing alternatives.
Quadrant classification is only meaningful for statistically significant cells. Always filter by `p_value` before interpreting quadrants -- non-significant cells may show any quadrant label by chance.
The decay input parameter is named `decay` (not `kernel`). Check the component schema if unsure.

Reference Templates

Resource	Description
BQ Tutorial	Computing spatial autocorrelation of POI locations in Berlin (BigQuery)
SF Tutorial	Same tutorial for Snowflake
Workflow template	"Computing the spatial auto-correlation of point of interest locations" (available in CARTO Workspace)

Common Variations

Variant	How
Pre-indexed data	Skip Step 3 if data already has H3/Quadbin column
Polygon input instead of points	Use `native.h3polyfill` instead of `native.h3frompoint`
Complete grid (no gaps)	Polyfill study area boundary first, then enrich with data (same approach as hotspot analysis)
Combine with Getis-Ord	Run both analyses on the same aggregated grid, then join results for a richer picture
Filter to outliers only	Keep `HL` and `LH` quadrants to find anomalous locations
