nova-act

Onboarding

Official Documentation: The Nova Act GitHub README is the source of truth.

Step 1: Authentication Setup

Nova Act supports two authentication methods. You MUST ask the user which method they want to use!
  • API Key (Quick Start): Best for development and testing. Generate a key at https://nova.amazon.com/act?tab=dev_tools, then run `export NOVA_ACT_API_KEY="your_key"`. No decorator needed.
  • AWS Credentials + Workflow Definition (Production): Best for IAM-based access and S3 export. Requires AWS credentials and the `@workflow` decorator. See `references/workflow_definitions.md`.

Full setup details: `references/authentication.md`
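
Once the user has chosen a method, a script can confirm that matching credentials are actually present before starting a session. The following is a minimal sketch, not part of the Nova Act SDK: `detect_auth_method` is a hypothetical helper, and it assumes AWS credentials are exposed through the standard AWS environment variables (shared config files and instance roles are not checked).

```python
import os
from typing import Mapping, Optional


def detect_auth_method(env: Mapping[str, str]) -> Optional[str]:
    """Return "api_key", "aws", or None based on which credentials are present."""
    # NOVA_ACT_API_KEY is the variable named in the API Key option above.
    if env.get("NOVA_ACT_API_KEY"):
        return "api_key"
    # Assumption: AWS credentials come from the standard AWS environment
    # variables; profiles in config files or instance roles are not inspected.
    if env.get("AWS_PROFILE") or (
        env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY")
    ):
        return "aws"
    return None


if __name__ == "__main__":
    method = detect_auth_method(os.environ)
    print(method or "no credentials found; ask the user which method to set up")
```

This check does not replace asking the user; it only verifies that the chosen method has something to work with.
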

Step 2: Start Using Nova Act

Route to the right mode:

| User wants to... | Route to |
|---|---|
| Explore a website interactively | `references/browser_cli.md` (Option A) |
| Build a coding agent with browser | `references/browser_cli.md` (Option A) |
| Write a repeatable test script | `references/qa_tests.md` (Option B) |
| Write a Python automation script | `references/qa_tests.md` (Option B) + `references/data_extraction.md` |
| Convert manual tests to automated | `references/gherkin_testing.md` |
| Understand a codebase via its UI | `references/flow_discovery.md` |
| Reproduce a bug | `references/bug_reproduction.md` |
| Iterate on / refine automation prompts | `references/workflow_refinement.md` |
| Generate mock sites from real site recordings | `references/mock_generation.md` |
| Deploy to production | `references/deployment_cli.md` |

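
The routing above is a plain lookup, which could be sketched as a dictionary. The goal labels used as keys here are hypothetical shorthand invented for illustration; only the file paths come from this document. Falling back to `references/browser_cli.md` for an unrecognized goal is an assumption, based on this document naming that file as the quickstart default.

```python
# Hypothetical goal labels mapped to the reference files listed above.
ROUTES: dict[str, list[str]] = {
    "explore": ["references/browser_cli.md"],
    "coding_agent": ["references/browser_cli.md"],
    "test_script": ["references/qa_tests.md"],
    "python_automation": [
        "references/qa_tests.md",
        "references/data_extraction.md",
    ],
    "convert_manual_tests": ["references/gherkin_testing.md"],
    "understand_codebase": ["references/flow_discovery.md"],
    "reproduce_bug": ["references/bug_reproduction.md"],
    "refine_prompts": ["references/workflow_refinement.md"],
    "generate_mocks": ["references/mock_generation.md"],
    "deploy": ["references/deployment_cli.md"],
}

# Assumed fallback: the Browser CLI guide.
DEFAULT_ROUTE = ["references/browser_cli.md"]


def route(goal: str) -> list[str]:
    """Return the reference files to read for a goal label."""
    return ROUTES.get(goal, DEFAULT_ROUTE)
```
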
**Option A: Browser CLI (recommended for exploration and agent tool-use)**
```bash
pip install 'nova-act[cli]'
act browser execute "Go to https://example.com, find the pricing page, and extract plan names and prices" --session-id work
```

```bash
# If you need to observe current state:
act browser ask "What page am I on?" --session-id work

# If you need a zero-inference jump:
act browser goto https://example.com/specific-page --session-id work
```

Default to `execute` with a detailed plan. Use individual commands (`goto`, `ask`, etc.) for recovery, observation, or zero-inference actions. Pass NovaAct constructor or method args with `--nova-arg`: e.g. `--nova-arg max_steps=5 --nova-arg headless=true`. Run any command with `--help` for all available options.

> **IMPORTANT**: Before running any browser command, ask the user whether they want headed (visible browser) or headless mode, unless they have already specified. Use `--headed` or `--headless` accordingly.

> ⚠️ **CRITICAL: NEVER kill Chrome or Chromium processes** (e.g., `pkill chrome`, `kill -9 $(pgrep chrome)`, `killall Google Chrome`). Nova Act manages its own browser lifecycle. Killing browser processes externally will corrupt session state and break automation. If a browser appears stuck, use `act browser sessions` to check status, or start a new session.

> 💡 **Localhost HTTPS Testing**: When testing against a local HTTPS server (e.g., `https://localhost:8443`), Chrome will reject self-signed certificates by default. Add `--launch-arg=--ignore-certificate-errors --ignore-https-errors` to bypass this:
> ```bash
> act browser goto https://localhost:8443 --session-id dev --launch-arg=--ignore-certificate-errors --ignore-https-errors
> ```
> ⚠️ Only use `--ignore-certificate-errors` for **localhost** development servers. For non-local URLs, valid certificates should be used unless you have a specific reason to bypass validation.
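
The localhost-only rule can be enforced in a wrapper script before the flags are added. The sketch below is illustrative: `is_local_url` and `launch_args_for` are hypothetical helpers, not part of the Nova Act CLI, and the flag tokens are copied verbatim from the tip above.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_url(url: str) -> bool:
    """Return True when the URL's host is localhost or a loopback address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so it is an ordinary (non-local) hostname.
        return False


def launch_args_for(url: str) -> list[str]:
    """Return the certificate-bypass arguments only for local targets."""
    if is_local_url(url):
        # Tokens copied from the command shown above.
        return ["--launch-arg=--ignore-certificate-errors", "--ignore-https-errors"]
    return []
```

For a non-local URL the helper returns an empty list, so certificate validation stays on by default.
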

Full command reference: `references/browser_cli.md`
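
When an agent drives the CLI from code, it may help to assemble the argument list in one place so the session ID, the headed/headless choice, and any `--nova-arg` pairs are always passed consistently. This is a rough sketch under assumptions: `build_act_command` is a hypothetical helper, it uses only the flags named in this section, and it assumes the CLI accepts the flags in this order after the positional argument.

```python
import shlex
from typing import Mapping, Optional


def build_act_command(
    subcommand: str,
    target: str,
    session_id: str,
    headless: bool,
    nova_args: Optional[Mapping[str, object]] = None,
) -> list[str]:
    """Assemble an `act browser` invocation as an argument list."""
    cmd = ["act", "browser", subcommand, target, "--session-id", session_id]
    # The user's answer to the headed/headless question selects the flag.
    cmd.append("--headless" if headless else "--headed")
    # Each constructor or method argument becomes its own --nova-arg key=value.
    for key, value in (nova_args or {}).items():
        cmd += ["--nova-arg", f"{key}={value}"]
    return cmd


if __name__ == "__main__":
    cmd = build_act_command(
        "goto",
        "https://example.com/specific-page",
        "work",
        headless=True,
        nova_args={"max_steps": 5},
    )
    # Print a shell-safe rendering; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a string avoids shell quoting problems when the target is a multi-word prompt such as the `execute` plan.
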

**Option B: Python Script (recommended for repeatable automation)**
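```python
from nova_act import NovaAct

# Open a browser session that starts on the given page.
with NovaAct(starting_page="https://example.com") as nova:
    # act() carries out an action described in natural language.
    nova.act("click the first link on the page")
    # act_get() asks a question about the page and returns the answer.
    result = nova.act_get("What is the page title?")
    print(f"Page title: {result.response}")
```

Full testing guide: `references/qa_tests.md`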
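
For repeatable tests, the check on `result.response` can be pulled into a small function and exercised without a browser. The sketch below is an illustration only: `FakeNova` and `FakeResult` are hypothetical stand-ins that mimic just the `act()` / `act_get()` calls and the `.response` attribute used in the example above; they are not part of the Nova Act SDK, and the real return type may carry more fields.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in for the object returned by act_get()."""

    response: str


class FakeNova:
    """Hypothetical stub that mimics only the two calls used above."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.actions: list[str] = []

    def act(self, prompt: str) -> None:
        # Record the prompt instead of driving a browser.
        self.actions.append(prompt)

    def act_get(self, prompt: str) -> FakeResult:
        # Always answer with the canned title.
        return FakeResult(response=self.title)


def page_title_matches(nova, expected: str) -> bool:
    """Ask the session for the page title and compare it case-insensitively."""
    result = nova.act_get("What is the page title?")
    return expected.lower() in str(result.response).lower()


def test_title_check_with_stub() -> None:
    nova = FakeNova(title="Example Domain")
    nova.act("click the first link on the page")
    assert page_title_matches(nova, "example domain")
    assert nova.actions == ["click the first link on the page"]
```

In a real test the same `page_title_matches` function would receive the `nova` object from the `with NovaAct(...)` block instead of the stub.
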

Step 3: Validate Installation

  • Python 3.10+: `python --version`
  • Nova Act SDK: `pip install 'nova-act[cli]'` / `pip show nova-act`
  • Google Chrome (Optional): `playwright install chrome`
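
The first two checks can also be run from Python, which is handy inside a setup script. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the distribution name `nova-act` is taken from the `pip show` command above. It does not check for Chrome.

```python
import sys
from importlib import metadata
from typing import Optional, Sequence


def meets_min_python(version_info: Sequence[int], minimum=(3, 10)) -> bool:
    """Return True when the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python 3.10+:", meets_min_python(sys.version_info))
    print("nova-act:", installed_version("nova-act") or "not installed")
```
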

Steering Files

Quickstart: When given a URL to explore or automate, start with `references/browser_cli.md`.

| File | What It Covers |
|---|---|
| `references/browser_cli.md` | Browser CLI: commands, sessions, config, page exploration. Default for interactive work and agent tool-use. You MUST read the ENTIRE file; it contains critical CLI best practices, command decision guides, and example transcripts that are essential for correct usage |
| `references/qa_tests.md` | Script writing: QA testing with pytest/unittest, act() vs act_get(), schemas, assertion patterns |
| `references/deployment_cli.md` | Deployment CLI: deploy workflows to AgentCore Runtime on AWS |
| `references/authentication.md` | API key vs IAM, session management |
| `references/workflow_definitions.md` | Workflow definitions with AWS CLI |
| `references/data_extraction.md` | Structured extraction with Pydantic |
| `references/hitl.md` | Human approval and UI takeover patterns |
| `references/tool_use.md` | External tools (@tool decorator, MCP servers) |
| `references/agentcore_browser.md` | Remote browser via AgentCore |
| `references/playwright_interop.md` | Direct Playwright page access, sensitive input, file downloads |
| `references/session_logs.md` | Logs, screenshots, and video recordings |
| `references/trajectory_analysis.md` | Understanding Nova Act trajectories |
| `references/visual_reporting.md` | Visual reporting: post-session markdown reports from steps.yaml, snapshots, and screenshots |
| `references/gherkin_testing.md` | Gherkin testing: `.feature` file writing, `qa-plan` compilation, session export → Gherkin conversion |
| `references/flow_discovery.md` | Flow discovery: live app exploration → codebase mapping, developer onboarding docs |
| `references/bug_reproduction.md` | Bug reproduction: reproduce → capture evidence → export → regression test |
| `references/parallel_sessions.md` | Parallel sessions: run multiple browser sessions concurrently with subagents, chunking, session limits |
| `references/workflow_refinement.md` | Workflow refinement: analyze chain-of-thought, identify failure patterns, revise and validate browser automation workflows |
| `references/mock_generation.md` | Mock generation: generate static HTML mock sites from real site trajectories for fast, offline iteration |

This skill uses the Nova Act SDK, which is licensed under the Apache-2.0 license.