apify-sdk-integration

Apify SDK Integration

Add Apify Actor execution to an existing application. This skill covers the `apify-client` package for JS/TS and Python, plus the REST API for other languages.

When to Use This Skill

Adding web scraping or automation to an existing app
Calling Apify Actors programmatically from application code
Building a product that uses Apify as a backend service
Integrating Actor results into a data pipeline

Critical: Package Naming

`apify-client` is the API client for calling Actors from your app. `apify` is the SDK for building Actors (wrong package for this use case).
Always install `apify-client`. Never install `apify` for integration work.

Prerequisites

The user needs an `APIFY_TOKEN`. Direct them to Console > Settings > Integrations at https://console.apify.com/settings/integrations to create one. If they don't have an account: https://console.apify.com/sign-up (free, no credit card).

Store the token securely — environment variable or secrets manager, never hardcoded.
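
A minimal sketch of reading the token from the environment and failing early when it is missing. The helper name `load_apify_token` is hypothetical and not part of `apify-client`; only the `APIFY_TOKEN` variable name comes from the text above.

```python
import os


def load_apify_token(env=None):
    """Return the Apify token from the environment, or raise if it is absent.

    Hypothetical helper for illustration; not an apify-client API.
    """
    env = os.environ if env is None else env
    token = env.get('APIFY_TOKEN', '').strip()
    if not token:
        raise RuntimeError(
            'APIFY_TOKEN is not set; create one under '
            'Console > Settings > Integrations and export it.'
        )
    return token
```

Calling `load_apify_token()` once at startup surfaces a missing token immediately, rather than as an authentication error on the first Actor call.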

Finding the Right Actor

Before writing integration code, find the Actor that fits the user's needs. Use the MCP tools if available:

`search-actors`: search the Apify Store by keyword
`fetch-actor-details`: get the Actor's input schema, output format, and pricing

Alternatively, browse https://apify.com/store. Append `.md` to any Actor's Store URL to get its docs in markdown.
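
The `.md` convention can be wrapped in a one-line helper. This is only a sketch: `actor_docs_url` is a hypothetical name, and it assumes the Store URL carries no query string or fragment.

```python
def actor_docs_url(store_url):
    """Build the markdown docs URL for an Actor's Store page.

    Hypothetical helper: it only appends `.md`, as described above.
    """
    url = store_url.rstrip('/')
    return url if url.endswith('.md') else url + '.md'
```

For example, `actor_docs_url('https://apify.com/apify/web-scraper')` yields the same URL with `.md` appended.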

JavaScript / TypeScript

Install

```bash
npm install apify-client
```

Synchronous Execution (wait for results)

```typescript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('apify/web-scraper').call({
    startUrls: [{ url: 'https://example.com' }],
    maxPagesPerCrawl: 10,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
```

`.call()` blocks until the Actor finishes. Use it for short-running Actors (under a few minutes).

Asynchronous Execution (start and poll/retrieve later)

```typescript
const run = await client.actor('apify/web-scraper').start({
    startUrls: [{ url: 'https://example.com' }],
});

// Poll for completion
const finishedRun = await client.run(run.id).waitForFinish();

// Retrieve results
const { items } = await client.dataset(finishedRun.defaultDatasetId).listItems();
```

Use `.start()` followed by `.waitForFinish()` for long-running Actors or when you need the run ID immediately.

Retrieving Results

```typescript
// Dataset items (structured data from pushData)
const { items } = await client.dataset(run.defaultDatasetId).listItems({
    limit: 100,
    offset: 0,
});

// Key-value store (files, screenshots, etc.)
const record = await client.keyValueStore(run.defaultKeyValueStoreId).getRecord('OUTPUT');
```

Error Handling

```typescript
try {
    const run = await client.actor('apify/web-scraper').call(input);

    if (run.status !== 'SUCCEEDED') {
        const log = await client.log(run.id).get();
        throw new Error(`Actor failed with status ${run.status}: ${log}`);
    }

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
} catch (error) {
    if (error.message?.includes('not found')) {
        // Actor ID is wrong or Actor was deleted
    } else if (error.statusCode === 401) {
        // Invalid or missing APIFY_TOKEN
    }
    throw error;
}
```

Python

Install

```bash
pip install apify-client
```

Synchronous Execution

```python
from apify_client import ApifyClient
import os

client = ApifyClient(token=os.environ['APIFY_TOKEN'])

run = client.actor('apify/web-scraper').call(run_input={
    'startUrls': [{'url': 'https://example.com'}],
    'maxPagesPerCrawl': 10,
})

items = client.dataset(run['defaultDatasetId']).list_items().items
```
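
The status check from the JavaScript error-handling example can be mirrored in Python before reading the dataset. This is a sketch under assumptions: `ensure_succeeded` is a hypothetical helper, and the `None` branch is a defensive guess about what `.call()` may return rather than documented behavior.

```python
def ensure_succeeded(run):
    """Return the default dataset ID of a run, or raise if it did not succeed.

    Hypothetical helper; `run` is the dict returned by `.call()`.
    """
    if run is None:
        # Assumption: guard against a missing run object.
        raise RuntimeError('Actor call returned no run details')
    status = run.get('status')
    if status != 'SUCCEEDED':
        raise RuntimeError(f"Actor run {run.get('id')} ended with status {status}")
    return run['defaultDatasetId']
```

With this in place, the last line of the example becomes `client.dataset(ensure_succeeded(run)).list_items().items`, so a failed run raises instead of silently returning a partial or empty dataset.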

Asynchronous Execution

```python
run = client.actor('apify/web-scraper').start(run_input={
    'startUrls': [{'url': 'https://example.com'}],
})

# Poll for completion
finished_run = client.run(run['id']).wait_for_finish()

items = client.dataset(finished_run['defaultDatasetId']).list_items().items
```

Async Client (asyncio)

```python
import asyncio
import os

from apify_client import ApifyClientAsync


async def main():
    client = ApifyClientAsync(token=os.environ['APIFY_TOKEN'])

    run = await client.actor('apify/web-scraper').call(run_input={
        'startUrls': [{'url': 'https://example.com'}],
    })

    # list_items() is a coroutine here, so await it before reading .items
    return (await client.dataset(run['defaultDatasetId']).list_items()).items


items = asyncio.run(main())
```

REST API (Any Language)

For languages without an official client, use the REST API directly.

Start a Run

```
POST https://api.apify.com/v2/acts/{actorId}/runs
Authorization: Bearer <APIFY_TOKEN>
Content-Type: application/json

{ "startUrls": [{ "url": "https://example.com" }] }
```
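
To make the request shape concrete, here is a sketch that builds the same POST using only Python's standard library (Python stands in here for any language without a client). `build_start_run_request` is a hypothetical name, and replacing `/` with `~` in the Actor ID is an assumption about how the REST API expects `username/actor-name` to be written in a URL path.

```python
import json
import urllib.request

API_BASE = 'https://api.apify.com/v2'


def build_start_run_request(actor_id, token, run_input):
    """Build (but do not send) the POST request that starts an Actor run."""
    # Assumption: URL paths use "username~actor-name" instead of a slash.
    path_id = actor_id.replace('/', '~')
    return urllib.request.Request(
        url=f'{API_BASE}/acts/{path_id}/runs',
        data=json.dumps(run_input).encode('utf-8'),
        headers={
            'Authorization': f'Bearer {token}',
            'Content-Type': 'application/json',
        },
        method='POST',
    )
```

Passing the result to `urllib.request.urlopen()` sends it. Keeping construction separate from sending makes the URL, headers, and body easy to inspect in tests without touching the network.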

Get Run Status

```
GET https://api.apify.com/v2/acts/{actorId}/runs/{runId}
Authorization: Bearer <APIFY_TOKEN>
```
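
Without a client library there is no `waitForFinish()`, so the caller polls this endpoint. The loop below is a sketch: `poll_until_finished` and its parameters are hypothetical, and the set of terminal statuses is an assumption to check against the API reference.

```python
import time

# Assumption: statuses after which a run no longer changes.
TERMINAL_STATUSES = {'SUCCEEDED', 'FAILED', 'ABORTED', 'TIMED-OUT'}


def poll_until_finished(get_status, interval_secs=5.0, max_attempts=60, sleep=time.sleep):
    """Call `get_status()` until it returns a terminal status.

    `get_status` is any zero-argument callable returning the run's status
    string, for example a function wrapping the GET request above.
    """
    for attempt in range(max_attempts):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_attempts - 1:
            # Wait before the next check, but not after the final one.
            sleep(interval_secs)
    raise TimeoutError(f'Run still not finished after {max_attempts} checks')
```

The attempt cap bounds the total wait, and the injectable `sleep` lets tests run the loop instantly.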

Get Dataset Items

```
GET https://api.apify.com/v2/datasets/{datasetId}/items?format=json
Authorization: Bearer <APIFY_TOKEN>
```

Full API reference: https://docs.apify.com/api/v2
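
A small sketch of building the items URL with optional paging parameters. `dataset_items_url` is a hypothetical helper, and treating `limit` and `offset` as query parameters of this endpoint is an assumption carried over from the client's `listItems` options; confirm the names in the API reference.

```python
from urllib.parse import urlencode


def dataset_items_url(dataset_id, limit=None, offset=None):
    """Build the dataset items URL, adding paging parameters when given."""
    params = {'format': 'json'}
    if limit is not None:
        params['limit'] = limit
    if offset is not None:
        params['offset'] = offset
    return f'https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}'
```

Using `urlencode` rather than string concatenation keeps the query string valid if more parameters are added later.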

Best Practices

Set timeouts: Pass `timeoutSecs` in the Actor input or use `waitSecs` on `.call()` to avoid indefinite waits.
Paginate large datasets: Use `limit` and `offset` when retrieving dataset items. Default limit is 250K items.
Reuse clients: Create one `ApifyClient` instance and reuse it across calls.
Handle Actor-specific input: Every Actor has its own input schema. Use the `fetch-actor-details` MCP tool or append `.md` to the Actor's Store URL to get the schema before constructing input.
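
The pagination advice can be sketched as a generator that walks a dataset one page at a time. `iter_dataset_items` and the default `page_size` are hypothetical; the function only assumes that the object passed in has a `list_items(limit=..., offset=...)` method returning a page with an `.items` list, as the Python dataset client does in the examples above.

```python
def iter_dataset_items(dataset_client, page_size=1000):
    """Yield every item of a dataset, fetching one page at a time."""
    offset = 0
    while True:
        page = dataset_client.list_items(limit=page_size, offset=offset)
        if not page.items:
            # An empty page means the end of the dataset was reached.
            break
        yield from page.items
        offset += len(page.items)
```

Usage would look like `for item in iter_dataset_items(client.dataset(run['defaultDatasetId'])): ...`, which keeps at most one page in memory at a time.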

Documentation

Apify API client for JS: https://docs.apify.com/api/client/js
Apify API client for Python: https://docs.apify.com/api/client/python
REST API reference: https://docs.apify.com/api/v2
Apify docs (LLM-friendly): https://docs.apify.com/llms.txt
Apify docs (full): https://docs.apify.com/llms-full.txt

If the Apify MCP server is available, use the `search-apify-docs` and `fetch-apify-docs` tools for contextual documentation lookups during development.
