fiftyone-embeddings-visualization


Embeddings Visualization in FiftyOne

Key Directives

ALWAYS follow these rules:

1. Set context first

```python
set_context(dataset_name="my-dataset")
```

2. Launch FiftyOne App

Brain operators are delegated and require the App:

```python
launch_app()
```

Wait 5-10 seconds for initialization.

3. Discover operators dynamically

```python
# List all brain operators
list_operators(builtin_only=False)

# Get schema for a specific operator
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
```

4. Compute embeddings before visualization

Embeddings are required for dimensionality reduction:

```python
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "img_sim",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)
```

5. Close app when done

```python
close_app()
```

Complete Workflow

Step 1: Setup

```python
# Set context
set_context(dataset_name="my-dataset")

# Launch the App (required for brain operators)
launch_app()
```

Step 2: Verify Brain Plugin

```python
# Check if the brain plugin is available
list_plugins(enabled=True)
```

If not installed:

```python
download_plugin(
    url_or_repo="voxel51/fiftyone-plugins",
    plugin_names=["@voxel51/brain"]
)
enable_plugin(plugin_name="@voxel51/brain")
```

Step 3: Discover Brain Operators

```python
# List all available operators
list_operators(builtin_only=False)

# Get the schema for compute_visualization
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")
```

Step 4: Check for Existing Embeddings or Compute New Ones

First, check whether the dataset already has embeddings by looking at the operator schema:

```python
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# Look for existing embeddings fields in the "embeddings" choices
# (e.g., "clip_embeddings", "dinov2_embeddings")
```
**If embeddings exist:** Skip to Step 5 and use the existing embeddings field.

**If no embeddings exist:** Compute them:
```python
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "img_viz",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",  # Field name to store embeddings
        "backend": "sklearn",
        "metric": "cosine"
    }
)
```

Required parameters for `compute_similarity`:

- `brain_key` - Unique identifier for this brain run
- `model` - Model from the FiftyOne Model Zoo used to generate embeddings
- `embeddings` - Field name where the embeddings will be stored
- `backend` - Similarity backend (use `"sklearn"`)
- `metric` - Distance metric (use `"cosine"` or `"euclidean"`)

Recommended embedding models:

- `clip-vit-base32-torch` - Best for general visual + semantic similarity
- `dinov2-vits14-torch` - Best for visual similarity only
- `resnet50-imagenet-torch` - Classic CNN features
- `mobilenet-v2-imagenet-torch` - Fast, lightweight option
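For reference, the `"cosine"` metric above is standard cosine distance. A minimal NumPy illustration of what that metric computes (the helper name is ours, not FiftyOne's internals):

```python
import numpy as np

def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b), as selected by metric="cosine"."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Parallel vectors have distance 0; orthogonal vectors have distance 1
print(cosine_distance([1, 0], [2, 0]))  # → 0.0
print(cosine_distance([1, 0], [0, 1]))  # → 1.0
```

Because cosine distance ignores vector magnitude, it is usually a better default than Euclidean for normalized deep embeddings.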

Step 5: Compute 2D Visualization

Use an existing embeddings field OR the `brain_key` from Step 4.

Option A: Use an existing embeddings field (e.g., `clip_embeddings`)

```python
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "img_viz",
        "embeddings": "clip_embeddings",  # Use existing field
        "method": "umap",
        "num_dims": 2
    }
)
```

Option B: Use the `brain_key` from `compute_similarity`

```python
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "img_viz",  # Same key used in compute_similarity
        "method": "umap",
        "num_dims": 2
    }
)
```

**Dimensionality reduction methods:**
- `umap` - (Recommended) Preserves local and global structure, faster. Requires `umap-learn` package.
- `tsne` - Better local structure, slower on large datasets. No extra dependencies.
- `pca` - Linear reduction, fastest but less informative
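For intuition, here is a minimal NumPy sketch of what the `pca` option does conceptually: project centered embeddings onto their top-2 principal components. This is illustrative only, not FiftyOne's implementation:

```python
import numpy as np

def pca_2d(embeddings):
    """Project (n_samples, n_dims) embeddings onto their top-2 principal axes."""
    X = np.asarray(embeddings, dtype=float)
    X = X - X.mean(axis=0)                    # center the data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T                       # (n_samples, 2) coordinates

# Points lying on a line in 3D collapse onto the first component
points = np.array([[0, 0, 0], [1, 1, 0], [2, 2, 0], [3, 3, 0]], dtype=float)
coords = pca_2d(points)
print(coords.shape)  # → (4, 2)
```

UMAP and t-SNE replace this linear projection with nonlinear neighborhood-preserving layouts, which is why they reveal cluster structure that PCA can miss.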

Step 6: Direct User to Embeddings Panel

After computing the visualization, direct the user to open the FiftyOne App at http://localhost:5151/ and:

1. Click the Embeddings panel icon (a scatter-plot icon that looks like a grid of dots) in the top toolbar
2. Select the brain key (e.g., `img_viz`) from the dropdown
3. Points represent samples in the 2D embedding space
4. Use the "Color by" dropdown to color points by a field (e.g., `ground_truth`, `predictions`)
5. Click points to select samples; use the lasso tool to select groups

IMPORTANT: Do NOT use `set_view(exists=["brain_key"])` - this filters samples and is not needed for visualization. The Embeddings panel automatically shows all samples with computed coordinates.

Step 7: Explore and Filter (Optional)

To filter samples while viewing the Embeddings panel:

```python
# Filter to a specific class
set_view(filters={"ground_truth.label": "dog"})

# Filter by tag
set_view(tags=["validated"])

# Clear filters to show all samples
clear_view()
```

These filters update the Embeddings panel to show only matching samples.

Step 8: Find Outliers

Outliers appear as isolated points far from clusters:

```python
# Compute uniqueness scores (higher = more unique/outlier)
execute_operator(
    operator_uri="@voxel51/brain/compute_uniqueness",
    params={"brain_key": "img_viz"}
)

# View the most unique samples (potential outliers)
set_view(sort_by="uniqueness", reverse=True, limit=50)
```
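For intuition only, the idea behind uniqueness-style scoring can be sketched in plain NumPy: samples far from their nearest neighbors in embedding space score higher. This is an illustrative approximation, not the actual `compute_uniqueness` algorithm:

```python
import numpy as np

def uniqueness_scores(embeddings, k=3):
    """Score each sample by its mean distance to its k nearest neighbors,
    normalized to [0, 1]. Higher = more isolated = more outlier-like."""
    X = np.asarray(embeddings, dtype=float)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)         # ignore self-distance
    knn = np.sort(dists, axis=1)[:, :k]     # k nearest-neighbor distances
    scores = knn.mean(axis=1)
    return scores / scores.max()

# A tight cluster plus one far-away point: the outlier scores highest
emb = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [5, 5]], dtype=float)
print(uniqueness_scores(emb, k=2).argmax())  # → 4 (the outlier)
```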

Step 9: Find Clusters

Use the App's Embeddings panel to visually identify clusters, then:

Option A: Lasso selection in the App

1. Use the lasso tool to select a cluster
2. Selected samples are highlighted
3. Tag or export the selected samples

Option B: Use similarity to find cluster members

```python
# Sort by similarity to a representative sample
execute_operator(
    operator_uri="@voxel51/brain/sort_by_similarity",
    params={
        "brain_key": "img_viz",
        "query_id": "sample_id_from_cluster",
        "k": 100
    }
)
```
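Conceptually, `sort_by_similarity` ranks samples by their similarity to a query embedding. A minimal NumPy sketch of that ranking (illustrative only, not the operator's implementation):

```python
import numpy as np

def top_k_similar(embeddings, query, k=100):
    """Return indices of the k samples most cosine-similar to the query."""
    E = np.asarray(embeddings, dtype=float)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    q = np.asarray(query, dtype=float)
    q = q / np.linalg.norm(q)
    sims = E @ q                    # cosine similarity to the query
    return np.argsort(-sims)[:k]    # most similar first

emb = np.array([[1, 0], [0, 1], [1, 1], [-1, 0]], dtype=float)
print(top_k_similar(emb, [1, 0], k=2))  # → [0 2]
```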

Step 10: Clean Up

```python
close_app()
```

Available Tools

可用工具

Session View Tools

| Tool | Description |
| --- | --- |
| `set_view(filters={...})` | Filter samples by field values |
| `set_view(tags=[...])` | Filter samples by tags |
| `set_view(sort_by="...", reverse=True)` | Sort samples by a field |
| `set_view(limit=N)` | Limit to N samples |
| `clear_view()` | Clear filters, show all samples |

Brain Operators for Visualization

Use `list_operators()` to discover operators and `get_operator_schema()` to see their parameters:

| Operator | Description |
| --- | --- |
| `@voxel51/brain/compute_similarity` | Compute embeddings and a similarity index |
| `@voxel51/brain/compute_visualization` | Reduce embeddings to 2D/3D for visualization |
| `@voxel51/brain/compute_uniqueness` | Score samples by uniqueness (outlier detection) |
| `@voxel51/brain/sort_by_similarity` | Sort by similarity to a query sample |

Common Use Cases

Use Case 1: Basic Dataset Exploration

Visualize dataset structure and explore clusters:

```python
set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in the schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If embeddings exist (e.g., clip_embeddings), use them directly:
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "exploration",
        "embeddings": "clip_embeddings",
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)
```

Direct the user to the App's Embeddings panel at http://localhost:5151/:

1. Click the Embeddings panel icon
2. Select "exploration" from the dropdown
3. Use "Color by" to color by ground_truth or predictions

Use Case 2: Find Outliers in Dataset

Identify anomalous or mislabeled samples:

```python
set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in the schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "outliers",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

# Compute uniqueness scores
execute_operator(
    operator_uri="@voxel51/brain/compute_uniqueness",
    params={"brain_key": "outliers"}
)

# Generate the visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "outliers",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)
```

Direct the user to the App at http://localhost:5151/:

1. Click the Embeddings panel icon
2. Select "outliers" from the dropdown
3. Outliers appear as isolated points far from clusters
4. Optionally sort by the uniqueness field in the App sidebar

Use Case 3: Compare Classes in Embedding Space

See how different classes cluster:

```python
set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in the schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "class_viz",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

# Generate the visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "class_viz",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)
```

Direct the user to the App at http://localhost:5151/:

1. Click the Embeddings panel icon
2. Select "class_viz" from the dropdown
3. Use the "Color by" dropdown to color by ground_truth or predictions

Look for:

- Well-separated clusters = good class distinction
- Overlapping clusters = similar classes or confusion
- Scattered points = high variance within a class

Use Case 4: Analyze Model Predictions

Compare ground truth vs predictions in embedding space:

```python
set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in the schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If no embeddings exist, compute them:
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "pred_analysis",
        "model": "clip-vit-base32-torch",
        "embeddings": "clip_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

# Generate the visualization (use existing embeddings field or brain_key)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "pred_analysis",
        "embeddings": "clip_embeddings",  # Use existing field if available
        "method": "umap",  # or "tsne" if umap-learn not installed
        "num_dims": 2
    }
)
```

Direct the user to the App at http://localhost:5151/:

1. Click the Embeddings panel icon
2. Select "pred_analysis" from the dropdown
3. Color by ground_truth to see the true class distribution
4. Color by predictions to see the model's view
5. Look for mismatches to find errors

Use Case 5: t-SNE for Publication-Quality Plots

Use t-SNE for better local structure (no extra dependencies):

```python
set_context(dataset_name="my-dataset")
launch_app()

# Check for existing embeddings in the schema
get_operator_schema(operator_uri="@voxel51/brain/compute_visualization")

# If no embeddings exist, compute them (DINOv2 for visual similarity):
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={
        "brain_key": "tsne_viz",
        "model": "dinov2-vits14-torch",
        "embeddings": "dinov2_embeddings",
        "backend": "sklearn",
        "metric": "cosine"
    }
)

# Generate the t-SNE visualization (no umap-learn dependency needed)
execute_operator(
    operator_uri="@voxel51/brain/compute_visualization",
    params={
        "brain_key": "tsne_viz",
        "embeddings": "dinov2_embeddings",  # Use existing field if available
        "method": "tsne",
        "num_dims": 2
    }
)
```

Direct the user to the App at http://localhost:5151/:

1. Click the Embeddings panel icon
2. Select "tsne_viz" from the dropdown
3. t-SNE provides better local cluster structure than UMAP

Troubleshooting

Error: "No executor available"

- Cause: Delegated operators require the App executor
- Solution: Ensure `launch_app()` was called and wait 5-10 seconds

Error: "Brain key not found"

- Cause: Embeddings not computed
- Solution: Run `compute_similarity` first with a `brain_key`

Error: "Operator not found"

- Cause: Brain plugin not installed
- Solution: Install it with `download_plugin()` and `enable_plugin()`

Error: "You must install the umap-learn>=0.5 package"

- Cause: The UMAP method requires the `umap-learn` package
- Solutions:
  1. Install umap-learn: Ask the user if they want to run `pip install umap-learn`
  2. Use t-SNE instead: Change `method` to `"tsne"` (no extra dependencies)
  3. Use PCA instead: Change `method` to `"pca"` (fastest, no extra dependencies)
- After installing umap-learn, restart Claude Code/MCP server and retry

Visualization is slow

- Use UMAP instead of t-SNE for large datasets
- Use a faster embedding model: `mobilenet-v2-imagenet-torch`
- Process a subset first: `set_view(limit=1000)`

Embeddings panel not showing

- Ensure the visualization was computed (not just the embeddings)
- Check that the brain_key matches in both compute_similarity and compute_visualization
- Refresh the App page

Points not colored correctly

- Verify the field exists on the samples
- Check that the field type is compatible (Classification, Detections, or string)

Best Practices

1. Discover dynamically - Use `list_operators()` and `get_operator_schema()` to get current operator names and parameters
2. Choose the right model - CLIP for semantic similarity, DINOv2 for visual similarity
3. Start with UMAP - Faster and often better than t-SNE for exploration
4. Use uniqueness for outliers - More reliable than visual inspection alone
5. Store embeddings - Reuse them for multiple visualizations via the `brain_key`
6. Subset large datasets - Compute on a subset first, then on the full dataset

Performance Notes

Embedding computation time:

- 1,000 images: ~1-2 minutes
- 10,000 images: ~10-15 minutes
- 100,000 images: ~1-2 hours

Visualization computation time:

- UMAP: ~30 seconds for 10,000 samples
- t-SNE: ~5-10 minutes for 10,000 samples
- PCA: ~5 seconds for 10,000 samples

Memory requirements:

- ~2KB per image for embeddings
- ~16 bytes per image for 2D coordinates
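These memory figures can be sanity-checked with NumPy, assuming 512-dimensional float32 embeddings (the output size of CLIP ViT-B/32) and two float64 coordinates per sample; the exact dtypes FiftyOne stores are an assumption here:

```python
import numpy as np

# Assumed storage layout: 512 float32 values per embedding, 2 float64 coords
embedding = np.zeros(512, dtype=np.float32)  # one image's embedding: 512 * 4 bytes
coords_2d = np.zeros(2, dtype=np.float64)    # one image's 2D point: 2 * 8 bytes

print(embedding.nbytes)  # → 2048 (≈2KB per image)
print(coords_2d.nbytes)  # → 16
```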

Resources
