apify-generate-output-schema

# Generate Actor Output Schema


You are generating output schema files for an Apify Actor. The output schema tells Apify Console how to display run results. You will analyze the Actor's source code, create `dataset_schema.json`, `output_schema.json`, and `key_value_store_schema.json` (if the Actor uses the key-value store), and update `actor.json`.
## Core Principles


- **Analyze code first**: Read the Actor's source to understand what data it actually pushes to the dataset — never guess
- **Every field is nullable**: APIs and websites are unpredictable — always set `"nullable": true`
- **Anonymize examples**: Never use real user IDs, usernames, or personal data in examples
- **Verify against code**: If TypeScript types exist, cross-check the schema against both the type definition AND the code that produces the values
- **Reuse existing patterns**: Before generating schemas, check if other Actors in the same repository already have output schemas — match their structure, naming conventions, description style, and formatting
- **Don't reinvent the wheel**: Reuse existing type definitions, interfaces, and utilities from the codebase instead of creating duplicate definitions


## Phase 1: Discover Actor Structure


**Goal**: Locate the Actor and understand its output

Initial request: $ARGUMENTS

Actions:

1. Create a todo list with all phases
2. Find the `.actor/` directory containing `actor.json`
3. Read `actor.json` to understand the Actor's configuration
4. Check if `dataset_schema.json`, `output_schema.json`, and `key_value_store_schema.json` already exist
5. Search for existing schemas in the repository: look for other `.actor/` directories or schema files (e.g., `**/dataset_schema.json`, `**/output_schema.json`, `**/key_value_store_schema.json`) to learn the repo's conventions — match their description style, field naming, example formatting, and overall structure
6. Find all places where data is pushed to the dataset:
   - JavaScript/TypeScript: search for `Actor.pushData(`, `dataset.pushData(`, `Dataset.pushData(`
   - Python: search for `Actor.push_data(`, `dataset.push_data(`, `Dataset.push_data(`
7. Find all places where data is stored in the key-value store:
   - JavaScript/TypeScript: search for `Actor.setValue(`, `keyValueStore.setValue(`, `KeyValueStore.setValue(`
   - Python: search for `Actor.set_value(`, `key_value_store.set_value(`, `KeyValueStore.set_value(`
8. Find output type definitions — reuse them directly instead of recreating from scratch:
   - TypeScript: look for output type interfaces/types (e.g., in `src/types/` or `src/types/output.ts`). If an interface or type already defines the output shape, derive the schema fields from it — do not create a parallel definition
   - Python: look for TypedDict, dataclass, or Pydantic model definitions. Use the existing field names, types, and docstrings as the source of truth
9. Check for existing shared schema utilities or helper functions in the codebase that handle schema generation or validation — reuse them rather than creating new logic
10. If inline `storages.dataset` or `storages.keyValueStore` config exists in `actor.json`, note it for migration

Present findings to the user: list all discovered dataset output fields, key-value store keys, their types, and where they come from.
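The discovery in steps 6–7 is just a text search. A minimal sketch in Python, using the exact search patterns listed above — the directory layout and file extensions are assumptions, not part of any Apify API:

```python
import re
from pathlib import Path

# Search patterns from steps 6-7: dataset pushes and key-value store writes,
# covering both the JS/TS and Python SDK method names.
PUSH_RE = re.compile(r"\b(?:Actor|dataset|Dataset)\.(?:pushData|push_data)\(")
KV_RE = re.compile(
    r"\b(?:Actor|keyValueStore|KeyValueStore|key_value_store)\.(?:setValue|set_value)\("
)

def find_output_calls(root: str) -> dict[str, list[str]]:
    """Return source file paths grouped by the kind of output call they contain."""
    hits: dict[str, list[str]] = {"dataset": [], "keyValueStore": []}
    for path in sorted(Path(root).rglob("*")):
        if path.suffix not in {".js", ".ts", ".py"} or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if PUSH_RE.search(text):
            hits["dataset"].append(str(path))
        if KV_RE.search(text):
            hits["keyValueStore"].append(str(path))
    return hits
```

The returned map tells you which files to read in detail for steps 6–8; it does not replace reading the code around each call.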


## Phase 2: Generate `dataset_schema.json`


**Goal**: Create a complete dataset schema with field definitions and display views

### File structure


```json
{
    "actorSpecification": 1,
    "fields": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            // ALL output fields here — every field the Actor can produce,
            // not just the ones shown in the overview view
        },
        "required": [],
        "additionalProperties": true
    },
    "views": {
        "overview": {
            "title": "Overview",
            "description": "Most important fields at a glance",
            "transformation": {
                "fields": [
                    // 8-12 most important field names
                ]
            },
            "display": {
                "component": "table",
                "properties": {
                    // Display config for each overview field
                }
            }
        }
    }
}
```

### Consistency with existing schemas


If existing output schemas were found in the repository during Phase 1 (step 5), follow their conventions:

- Match the description writing style (sentence case vs. lowercase, period vs. no period, etc.)
- Match the field naming convention (camelCase vs. snake_case) — this must also match the actual keys produced by the Actor code
- Match the example value style (e.g., date formats, URL patterns, placeholder names)
- Match the view structure (number of fields in overview, display format choices)
- Match the JSON formatting (indentation, property ordering, spacing) — all schemas in the same repository must use identical formatting, including standalone Actors

When the Actor code already has well-defined TypeScript interfaces or Python type classes, derive fields directly from those types rather than re-analyzing `pushData`/`push_data` calls from scratch. The type definition is the canonical source.

### Hard rules (no exceptions)


| Rule | Detail |
| --- | --- |
| All fields in `properties` | The `fields.properties` object must contain every field the Actor can output, not just the fields shown in the overview view. The views section selects a subset for display — the `properties` section must be the complete superset |
| `"nullable": true` | On every field — APIs are unpredictable |
| `"additionalProperties": true` | On the top-level `fields` object AND on every nested object within `properties`. This is the most commonly missed rule — it must appear at both levels |
| `"required": []` | Always an empty array — on the top-level `fields` object AND on every nested object within `properties` |
| Anonymized examples | No real user IDs, usernames, or content |
| `"type"` required with `"nullable"` | AJV rejects `nullable` without a `type` on the same field |

Warning — most common mistakes:

1. Only including fields that appear in the overview view. `fields.properties` must list ALL output fields, even if they are not in the `views` section.
2. Only adding `"required": []` and `"additionalProperties": true` on nested object-type properties but forgetting them on the top-level `fields` object. Both levels need them.

Note: `nullable` is an Apify-specific extension to JSON Schema draft-07. It is intentional and correct.
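Because the two-level rule is the most commonly missed one, it is worth checking mechanically. A minimal sketch, assuming the `fields` object has already been loaded from `dataset_schema.json` as a Python dict — this is an illustrative helper, not part of any Apify tooling:

```python
def check_hard_rules(obj: dict, path: str = "fields") -> list[str]:
    """Verify the hard rules on a fields object and, recursively, on every
    nested object-type property. Returns a list of human-readable problems."""
    problems: list[str] = []
    if obj.get("required") != []:
        problems.append(f'{path}: "required" must be an empty array')
    if obj.get("additionalProperties") is not True:
        problems.append(f'{path}: "additionalProperties" must be true')
    for name, prop in obj.get("properties", {}).items():
        where = f"{path}.{name}"
        if prop.get("nullable") is not True:
            problems.append(f'{where}: missing "nullable": true')
        elif "type" not in prop:
            problems.append(f'{where}: "nullable" without "type" (AJV rejects this)')
        if prop.get("type") == "object":
            # Nested objects need "required": [] and "additionalProperties": true too.
            problems.extend(check_hard_rules(prop, where))
    return problems
```

An empty return value means the schema satisfies the structural rules above; it does not check descriptions, examples, or field completeness against the source code.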

### Field type patterns


String field:

```json
"title": {
    "type": "string",
    "description": "Title of the scraped item",
    "nullable": true,
    "example": "Example Item Title"
}
```

Number field:

```json
"viewCount": {
    "type": "number",
    "description": "Number of views",
    "nullable": true,
    "example": 15000
}
```

Boolean field:

```json
"isVerified": {
    "type": "boolean",
    "description": "Whether the account is verified",
    "nullable": true,
    "example": true
}
```

Array field:

```json
"hashtags": {
    "type": "array",
    "description": "Hashtags associated with the item",
    "items": { "type": "string" },
    "nullable": true,
    "example": ["#example", "#demo"]
}
```

Nested object field:

```json
"authorInfo": {
    "type": "object",
    "description": "Information about the author",
    "properties": {
        "name": { "type": "string", "nullable": true },
        "url": { "type": "string", "nullable": true }
    },
    "required": [],
    "additionalProperties": true,
    "nullable": true,
    "example": { "name": "Example Author", "url": "https://example.com/author" }
}
```

Enum field:

```json
"contentType": {
    "type": "string",
    "description": "Type of content",
    "enum": ["article", "video", "image"],
    "nullable": true,
    "example": "article"
}
```

Union type (e.g., TypeScript `ObjectType | string`):

```json
"metadata": {
    "type": ["object", "string"],
    "description": "Structured metadata object, or error string if unavailable",
    "nullable": true,
    "example": { "key": "value" }
}
```

### Anonymized example values


Use realistic but generic values. Follow platform ID format conventions:

| Field type | Example approach |
| --- | --- |
| IDs | Match platform format and length (e.g., 11 chars for YouTube video IDs) |
| Usernames | `"exampleuser"`, `"sampleuser123"` |
| Display names | `"Example Channel"`, `"Sample Author"` |
| URLs | Use the platform's standard URL format with fake IDs |
| Dates | `"2025-01-15T12:00:00.000Z"` (ISO 8601) |
| Text content | Generic descriptive text, e.g., `"This is an example description."` |

### Views section


- `transformation.fields`: list the 8–12 most important field names (order = column order in the UI)
- `display.properties`: one entry per overview field, with `label` and `format`
- Available formats: `"text"`, `"number"`, `"date"`, `"link"`, `"boolean"`, `"image"`, `"array"`, `"object"`

Pick fields that give users the most useful at-a-glance summary of the data.
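Combining these rules with the field patterns above, a hypothetical filled-in `views` section might look like this (the field names and labels are illustrative, reusing the earlier examples, not taken from any real Actor):

```json
"views": {
    "overview": {
        "title": "Overview",
        "description": "Most important fields at a glance",
        "transformation": {
            "fields": ["title", "authorInfo", "viewCount", "isVerified", "contentType"]
        },
        "display": {
            "component": "table",
            "properties": {
                "title": { "label": "Title", "format": "text" },
                "authorInfo": { "label": "Author", "format": "object" },
                "viewCount": { "label": "Views", "format": "number" },
                "isVerified": { "label": "Verified", "format": "boolean" },
                "contentType": { "label": "Type", "format": "text" }
            }
        }
    }
}
```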


## Phase 3: Generate `key_value_store_schema.json` (if applicable)


**Goal**: Define key-value store collections if the Actor stores data in the key-value store

Skip this phase if no `Actor.setValue()`/`Actor.set_value()` calls were found in Phase 1 (beyond the default `INPUT` key).

### File structure


```json
{
    "actorKeyValueStoreSchemaVersion": 1,
    "title": "<Descriptive title — what the key-value store contains>",
    "description": "<One sentence describing the stored data>",
    "collections": {
        "<collectionName>": {
            "title": "<Human-readable title>",
            "description": "<What this collection contains>",
            "keyPrefix": "<prefix->"
        }
    }
}
```

### How to identify collections


Group the discovered `setValue`/`set_value` calls by key pattern:

1. Fixed keys (e.g., `"RESULTS"`, `"summary"`) — use `"key"` (exact match)
2. Dynamic keys with a prefix (e.g., `"screenshot-${id}"`, `f"image-{name}"`) — use `"keyPrefix"`

Each group becomes a collection.
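The fixed-vs-prefixed split can be sketched as a small classifier. This is an illustrative helper, assuming the key expressions were collected as raw string literals during Phase 1 — it treats a JS `${...}` or Python f-string `{...}` marker as the start of the dynamic part:

```python
import re

def classify_key(key_expr: str) -> tuple[str, str]:
    """Classify a setValue/set_value key expression as ("key", exact_key)
    or ("keyPrefix", prefix), mirroring rules 1-2 above."""
    # Match the literal text before a "${" (JS template) or "{" (Python f-string).
    m = re.match(r"^([^${}]*?)(?:\$\{|\{)", key_expr)
    if m and m.group(1):
        return ("keyPrefix", m.group(1))
    return ("key", key_expr)
```

So `"screenshot-${id}"` classifies as a `keyPrefix` collection with prefix `screenshot-`, while `"RESULTS"` becomes an exact-`key` collection.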

### Collection properties


| Property | Required | Description |
| --- | --- | --- |
| `title` | Yes | Shown in UI tabs |
| `description` | No | Shown in UI tooltips |
| `key` | Conditional | Exact key for single-key collections (use `key` OR `keyPrefix`, not both) |
| `keyPrefix` | Conditional | Prefix for multi-key collections (use `key` OR `keyPrefix`, not both) |
| `contentTypes` | No | Restrict allowed MIME types (e.g., `["image/jpeg"]`, `["application/json"]`) |
| `jsonSchema` | No | JSON Schema draft-07 for validating `application/json` content |

### Examples


Single file output (e.g., a report):

```json
{
    "actorKeyValueStoreSchemaVersion": 1,
    "title": "Analysis Results",
    "description": "Key-value store containing analysis output",
    "collections": {
        "report": {
            "title": "Report",
            "description": "Final analysis report",
            "key": "REPORT",
            "contentTypes": ["application/json"]
        }
    }
}
```

Multiple files with prefix (e.g., screenshots):

```json
{
    "actorKeyValueStoreSchemaVersion": 1,
    "title": "Scraped Files",
    "description": "Key-value store containing downloaded files and screenshots",
    "collections": {
        "screenshots": {
            "title": "Screenshots",
            "description": "Page screenshots captured during scraping",
            "keyPrefix": "screenshot-",
            "contentTypes": ["image/png", "image/jpeg"]
        },
        "documents": {
            "title": "Documents",
            "description": "Downloaded document files",
            "keyPrefix": "doc-",
            "contentTypes": ["application/pdf", "text/html"]
        }
    }
}
```


## Phase 4: Generate `output_schema.json`


**Goal**: Create the output schema that tells Apify Console where to find results

For most Actors that push data to a dataset, this is a minimal file:

```json
{
    "actorOutputSchemaVersion": 1,
    "title": "<Descriptive title — what the Actor returns>",
    "description": "<One sentence describing the output data>",
    "properties": {
        "dataset": {
            "type": "string",
            "title": "Results",
            "description": "Dataset containing all scraped data",
            "template": "{{links.apiDefaultDatasetUrl}}/items"
        }
    }
}
```

Critical: each property entry must include `"type": "string"` — this is an Apify-specific convention. The Apify meta-validator rejects properties without it (and rejects `"type": "object"` — only `"string"` is valid here).

If `key_value_store_schema.json` was generated in Phase 3, add a second property:

```json
"files": {
    "type": "string",
    "title": "Files",
    "description": "Key-value store containing downloaded files",
    "template": "{{links.apiDefaultKeyValueStoreUrl}}/keys"
}
```

### Available template variables


- `{{links.apiDefaultDatasetUrl}}` — API URL of the default dataset
- `{{links.apiDefaultKeyValueStoreUrl}}` — API URL of the default key-value store
- `{{links.publicRunUrl}}` — public run URL
- `{{links.consoleRunUrl}}` — Console run URL
- `{{links.apiRunUrl}}` — API run URL
- `{{links.containerRunUrl}}` — URL of the web server running inside the run
- `{{run.defaultDatasetId}}` — ID of the default dataset
- `{{run.defaultKeyValueStoreId}}` — ID of the default key-value store


## Phase 5: Update `actor.json`


**Goal**: Wire the schema files into the Actor configuration

Actions:

1. Read the current `actor.json`
2. Add or update the `storages.dataset` reference:

    ```json
    "storages": {
        "dataset": "./dataset_schema.json"
    }
    ```

3. If `key_value_store_schema.json` was generated, add the reference:

    ```json
    "storages": {
        "dataset": "./dataset_schema.json",
        "keyValueStore": "./key_value_store_schema.json"
    }
    ```

4. Add or update the `output` reference:

    ```json
    "output": "./output_schema.json"
    ```

5. If `actor.json` had inline `storages.dataset` or `storages.keyValueStore` objects (not string paths), migrate their content into the respective schema files and replace the inline objects with file path strings


## Phase 6: Review and Validate


**Goal**: Ensure correctness and completeness

Checklist:

- Every output field from the source code is in `dataset_schema.json` `fields.properties` — not just the overview view fields but ALL fields the Actor can produce
- Every field has `"nullable": true`
- The top-level `fields` object has both `"additionalProperties": true` and `"required": []`
- Every nested object within `properties` also has `"additionalProperties": true` and `"required": []`
- Every field has a `"description"` and an `"example"`
- All example values are anonymized
- `"type"` is present on every field that has `"nullable"`
- Views list the 8–12 most useful fields with correct display formats
- `output_schema.json` has `"type": "string"` on every property
- If the key-value store is used: `key_value_store_schema.json` has collections matching all `setValue`/`set_value` calls
- If the key-value store is used: each collection uses either `key` or `keyPrefix` (not both)
- `actor.json` references all generated schema files
- Schema field names match the actual keys in the code (camelCase/snake_case consistency)
- If existing schemas were found in the repo, the new schema follows their conventions (description style, example format, view structure)
- Schema fields are derived from existing type definitions (interfaces, TypedDicts, dataclasses) where available — no duplicated or divergent field definitions

Present the generated schemas to the user for review before writing them.


## Phase 7: Summary


**Goal**: Document what was created

Report:

- Files created or updated
- Number of fields in the dataset schema
- Number of collections in the key-value store schema (if generated)
- Fields selected for the overview view
- Any fields that need user clarification (ambiguous types, unclear nullability)
- Suggested next steps (test locally with `apify run`, verify the Output tab in Console)