apify-sdk-integration
Apify SDK Integration
Add Apify Actor execution to an existing application. This skill covers the apify-client package for JS/TS and Python, plus the REST API for other languages.
When to Use This Skill
- Adding web scraping or automation to an existing app
- Calling Apify Actors programmatically from application code
- Building a product that uses Apify as a backend service
- Integrating Actor results into a data pipeline
Critical: Package Naming
apify-client is the API client for calling Actors from your app. apify is the SDK for building Actors (wrong package for this use case). Always install apify-client. Never install apify for integration work.
Prerequisites
The user needs an APIFY_TOKEN. Direct them to Console > Settings > Integrations at https://console.apify.com/settings/integrations to create one. If they don't have an account: https://console.apify.com/sign-up (free, no credit card).
Store the token securely in an environment variable or a secrets manager; never hardcode it.
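As a minimal sketch of reading the token from the environment, the helper below fails fast when APIFY_TOKEN is missing. The function name load_apify_token and the error message are illustrative assumptions, not part of any Apify package.

```python
import os
from typing import Mapping, Optional


def load_apify_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Apify token from the environment, failing fast if it is absent.

    `load_apify_token` is a hypothetical helper name used only for this sketch.
    """
    source = os.environ if env is None else env
    token = source.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "APIFY_TOKEN is not set; create one under Console > Settings > Integrations"
        )
    return token
```

Calling load_apify_token() once at startup and passing the result to the client keeps the token out of source code and surfaces a missing variable before any Actor is called.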
Finding the Right Actor
Before writing integration code, find the Actor that fits the user's needs. Use the MCP tools if available:
- search-actors: search the Apify Store by keyword
- fetch-actor-details: get the Actor's input schema, output format, and pricing
Alternatively, browse https://apify.com/store. Append .md to any Actor's Store URL to get its docs in markdown.
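The .md trick can be sketched as a small URL builder. This assumes a Store URL has the form https://apify.com/<username>/<actor-name>; the helper name actor_docs_url is hypothetical.

```python
STORE_BASE_URL = "https://apify.com"  # assumption: Store pages live directly under this host


def actor_docs_url(actor_id: str) -> str:
    """Build the markdown docs URL for an Actor given as 'username/actor-name'."""
    username, sep, name = actor_id.strip("/").partition("/")
    if not (username and sep and name):
        raise ValueError(f"expected 'username/actor-name', got {actor_id!r}")
    # Appending .md to the Store URL requests the docs in markdown
    return f"{STORE_BASE_URL}/{username}/{name}.md"
```

For example, actor_docs_url('apify/web-scraper') yields the Store URL of the Actor used throughout this skill with .md appended.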
JavaScript / TypeScript
Install
bash
npm install apify-client
Synchronous Execution (wait for results)
typescript
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('apify/web-scraper').call({
    startUrls: [{ url: 'https://example.com' }],
    maxPagesPerCrawl: 10,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
.call() starts the Actor and waits for the run to finish before returning.
Asynchronous Execution (start and poll/retrieve later)
typescript
const run = await client.actor('apify/web-scraper').start({
    startUrls: [{ url: 'https://example.com' }],
});
// Poll for completion
const finishedRun = await client.run(run.id).waitForFinish();
// Retrieve results
const { items } = await client.dataset(finishedRun.defaultDatasetId).listItems();
Use .start() + .waitForFinish() for long-running Actors or when you need the run ID immediately.
Retrieving Results
typescript
// Dataset items (structured data from pushData)
const { items } = await client.dataset(run.defaultDatasetId).listItems({
    limit: 100,
    offset: 0,
});
// Key-value store (files, screenshots, etc.)
const record = await client.keyValueStore(run.defaultKeyValueStoreId).getRecord('OUTPUT');
Error Handling
typescript
try {
    // `input` is the Actor-specific input object built elsewhere
    const run = await client.actor('apify/web-scraper').call(input);
    if (run.status !== 'SUCCEEDED') {
        const log = await client.log(run.id).get();
        throw new Error(`Actor failed with status ${run.status}: ${log}`);
    }
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
} catch (error: any) {
    // Typed as `any` so the property checks below compile under strict TypeScript
    if (error.message?.includes('not found')) {
        // Actor ID is wrong or Actor was deleted
    } else if (error.statusCode === 401) {
        // Invalid or missing APIFY_TOKEN
    }
    throw error;
}
Python
Install
bash
pip install apify-client
Synchronous Execution
python
import os
from apify_client import ApifyClient
client = ApifyClient(token=os.environ['APIFY_TOKEN'])
run = client.actor('apify/web-scraper').call(run_input={
    'startUrls': [{'url': 'https://example.com'}],
    'maxPagesPerCrawl': 10,
})
items = client.dataset(run['defaultDatasetId']).list_items().items
Asynchronous Execution
python
run = client.actor('apify/web-scraper').start(run_input={
    'startUrls': [{'url': 'https://example.com'}],
})
# Poll for completion
finished_run = client.run(run['id']).wait_for_finish()
items = client.dataset(finished_run['defaultDatasetId']).list_items().items
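Before reading the dataset, it can help to confirm the run actually succeeded, mirroring the status check in the JavaScript error-handling example. The sketch below assumes the run is a plain dict with 'status' and 'id' keys (as the dict-style access in this section suggests) and that a missing run may come back as None. ActorRunFailed and ensure_succeeded are hypothetical names.

```python
from typing import Any, Mapping, Optional


class ActorRunFailed(RuntimeError):
    """Raised when a run is missing or did not end with status SUCCEEDED."""


def ensure_succeeded(run: Optional[Mapping[str, Any]]) -> Mapping[str, Any]:
    """Return the run unchanged if it succeeded, otherwise raise ActorRunFailed."""
    if run is None:
        # Assumption: a run that cannot be found is reported as None
        raise ActorRunFailed("Run not found")
    status = run.get("status")
    if status != "SUCCEEDED":
        raise ActorRunFailed(f"Actor run {run.get('id')} ended with status {status}")
    return run
```

With this in place, the polling step could read finished_run = ensure_succeeded(client.run(run['id']).wait_for_finish()), so a failed or timed-out run raises before any dataset access.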
Async Client (asyncio)
python
import asyncio
import os
from apify_client import ApifyClientAsync
async def main():
    client = ApifyClientAsync(token=os.environ['APIFY_TOKEN'])
    # call() starts the Actor and waits for the run to finish
    run = await client.actor('apify/web-scraper').call(run_input={
        'startUrls': [{'url': 'https://example.com'}],
    })
    return (await client.dataset(run['defaultDatasetId']).list_items()).items
items = asyncio.run(main())
REST API (Any Language)
For languages without an official client, use the REST API directly.
Start a Run
POST https://api.apify.com/v2/acts/{actorId}/runs
Authorization: Bearer <APIFY_TOKEN>
Content-Type: application/json
{ "startUrls": [{ "url": "https://example.com" }] }
Get Run Status
GET https://api.apify.com/v2/acts/{actorId}/runs/{runId}
Authorization: Bearer <APIFY_TOKEN>
Get Dataset Items
GET https://api.apify.com/v2/datasets/{datasetId}/items?format=json
Authorization: Bearer <APIFY_TOKEN>
Full API reference: https://docs.apify.com/api/v2
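The requests above can be sketched with only the Python standard library, which shows the shape any other language's HTTP client would need to reproduce. The helper names are hypothetical, and the "username~actor-name" form of the Actor ID in the URL path is an assumption to verify against the API reference. The functions only build the requests; nothing is sent.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Mapping, Optional

API_BASE_URL = "https://api.apify.com/v2"


def build_start_run_request(
    actor_id: str, token: str, run_input: Mapping[str, Any]
) -> urllib.request.Request:
    """Build the POST request that starts a run; the JSON body is the Actor input."""
    # Assumption: 'username/actor-name' is written as 'username~actor-name' in the path
    path_id = urllib.parse.quote(actor_id.replace("/", "~"), safe="~")
    return urllib.request.Request(
        url=f"{API_BASE_URL}/acts/{path_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_dataset_items_request(
    dataset_id: str, token: str, offset: int = 0, limit: Optional[int] = None
) -> urllib.request.Request:
    """Build the GET request that retrieves dataset items as JSON."""
    query = {"format": "json", "offset": offset}
    if limit is not None:
        query["limit"] = limit
    return urllib.request.Request(
        url=f"{API_BASE_URL}/datasets/{dataset_id}/items?{urllib.parse.urlencode(query)}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Either request can be sent with urllib.request.urlopen(request) and the response body parsed with json.loads. Check the exact response envelope in the API reference before relying on specific fields.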
Best Practices
- Set timeouts: Pass timeoutSecs in the Actor input or use waitSecs on .call() to avoid indefinite waits.
- Paginate large datasets: Use limit and offset when retrieving dataset items. Default limit is 250K items.
- Reuse clients: Create one ApifyClient instance and reuse it across calls.
- Handle Actor-specific input: Every Actor has its own input schema. Use the fetch-actor-details MCP tool or append .md to the Actor's Store URL to get the schema before constructing input.
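The pagination advice can be sketched as a generator that walks a dataset page by page with limit and offset. The name iter_dataset_items and the fetch_page callback are assumptions made for illustration; the callback keeps the sketch independent of any particular client.

```python
from typing import Any, Callable, Iterator, List


def iter_dataset_items(
    fetch_page: Callable[[int, int], List[Any]], page_size: int = 1000
) -> Iterator[Any]:
    """Yield every item by calling fetch_page(offset, limit) until the data runs out."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return  # empty page: nothing left to read
        yield from page
        if len(page) < page_size:
            return  # short page: this was the last one
        offset += len(page)
```

With the Python client, fetch_page could be a lambda such as lambda offset, limit: client.dataset(dataset_id).list_items(offset=offset, limit=limit).items, where client and dataset_id come from the surrounding application.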
Documentation
- Apify API client for JS: https://docs.apify.com/api/client/js
- Apify API client for Python: https://docs.apify.com/api/client/python
- REST API reference: https://docs.apify.com/api/v2
- Apify docs (LLM-friendly): https://docs.apify.com/llms.txt
- Apify docs (full): https://docs.apify.com/llms-full.txt
If the Apify MCP server is available, use the search-apify-docs and fetch-apify-docs tools for contextual documentation lookups during development.