nova-act
Onboarding
Official Documentation: Nova Act GitHub README is the source of truth.
Step 1: Authentication Setup
Nova Act supports two authentication methods. You MUST prompt the user for which method they want to use!
- API Key (Quick Start): Best for development/testing. Generate at https://nova.amazon.com/act?tab=dev_tools, then `export NOVA_ACT_API_KEY="your_key"`. No decorator needed.
- AWS Credentials + Workflow Definition (Production): Best for IAM-based access, S3 export. Requires AWS creds + `@workflow` decorator. See `references/workflow_definitions.md`.

Full setup details: `references/authentication.md`
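
Before prompting, it can help to know which credentials are already present in the environment. The following is a minimal sketch, not part of the Nova Act SDK: the helper name `available_auth_methods` is hypothetical, `NOVA_ACT_API_KEY` comes from the Quick Start option above, and the AWS variable names are the standard ones read by AWS SDKs. It only reports what it finds; the choice of method still belongs to the user.

```python
import os
from typing import Mapping


def available_auth_methods(env: Mapping[str, str]) -> list[str]:
    """Return the auth methods whose credentials appear in `env` (hypothetical helper)."""
    methods = []
    # NOVA_ACT_API_KEY is the variable named in the API Key option above.
    if env.get("NOVA_ACT_API_KEY"):
        methods.append("api_key")
    # Standard AWS variables only; credentials from config files, SSO, or
    # instance roles are not detected by this simple check.
    if env.get("AWS_PROFILE") or (
        env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY")
    ):
        methods.append("aws")
    return methods


if __name__ == "__main__":
    found = available_auth_methods(os.environ)
    print("Detected:", ", ".join(found) if found else "none")
```

An empty result means neither method is configured yet, so the prompt should also cover setup.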
Step 2: Start Using Nova Act
Route to the right mode:
| User wants to... | Route to |
|---|---|
| Explore a website interactively | |
| Build a coding agent with browser | |
| Write a repeatable test script | |
| Write a Python automation script | |
| Convert manual tests to automated | |
| Understand a codebase via its UI | |
| Reproduce a bug | |
| Iterate on / refine automation prompts | |
| Generate mock sites from real site recordings | |
| Deploy to production | |
**Option A: Browser CLI (recommended for exploration and agent tool-use)**
```bash
pip install 'nova-act[cli]'
act browser execute "Go to https://example.com, find the pricing page, and extract plan names and prices" --session-id work
```
If you need to observe current state:
```bash
act browser ask "What page am I on?" --session-id work
```
If you need a zero-inference jump:
```bash
act browser goto https://example.com/specific-page --session-id work
```
Default to `execute` with a detailed plan. Use individual commands (`goto`, `ask`, etc.) for recovery, observation, or zero-inference actions. Pass NovaAct constructor or method args with `--nova-arg`: e.g. `--nova-arg max_steps=5 --nova-arg headless=true`. Run any command with `--help` for all available options.
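
When an agent or script drives the CLI, building the argument list programmatically avoids shell-quoting mistakes in long plans. The sketch below is illustrative only: `build_execute_command` is a hypothetical helper, and it uses just the `execute` subcommand, `--session-id`, and `--nova-arg key=value` forms shown above.

```python
import shlex
from typing import Mapping, Optional


def build_execute_command(
    plan: str,
    session_id: str,
    nova_args: Optional[Mapping[str, object]] = None,
) -> list[str]:
    """Build the argv list for `act browser execute` without running it."""
    argv = ["act", "browser", "execute", plan, "--session-id", session_id]
    for key, value in (nova_args or {}).items():
        # Booleans are lowercased to match the `headless=true` form shown above.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        argv += ["--nova-arg", f"{key}={text}"]
    return argv


argv = build_execute_command(
    "Go to https://example.com and list the plan names",
    session_id="work",
    nova_args={"max_steps": 5, "headless": True},
)
# Print a shell-safe rendering for logging or review.
print(shlex.join(argv))
```

The resulting list can be passed directly to `subprocess.run(argv)`, which bypasses the shell, so the plan text needs no extra escaping.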
> **IMPORTANT**: Before running any browser command, ask the user whether they want headed (visible browser) or headless mode, unless they have already specified. Use `--headed` or `--headless` accordingly.
> ⚠️ **CRITICAL: NEVER kill Chrome or Chromium processes** (e.g., `pkill chrome`, `kill -9 $(pgrep chrome)`, `killall Google Chrome`). Nova Act manages its own browser lifecycle. Killing browser processes externally will corrupt session state and break automation. If a browser appears stuck, use `act browser sessions` to check status, or start a new session.
> 💡 **Localhost HTTPS Testing**: When testing against a local HTTPS server (e.g., `https://localhost:8443`), Chrome will reject self-signed certificates by default. Add `--launch-arg=--ignore-certificate-errors --ignore-https-errors` to bypass this:
> ```bash
> act browser goto https://localhost:8443 --session-id dev --launch-arg=--ignore-certificate-errors --ignore-https-errors
> ```
> ⚠️ Only use `--ignore-certificate-errors` for **localhost** development servers. For non-local URLs, valid certificates should be used unless you have a specific reason to bypass validation.
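
The localhost-only rule above can be enforced in code rather than by convention. This is a rough sketch under stated assumptions: the helper names `is_local_url` and `launch_args_for` and the set of hostnames treated as local are illustrative, and the returned flags simply mirror the form shown in the note above.

```python
from urllib.parse import urlparse

# Hostnames treated as local in this sketch; adjust for your environment.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_url(url: str) -> bool:
    """Return True only when the URL's host is a loopback name or address."""
    host = urlparse(url).hostname
    return host is not None and host in LOCAL_HOSTS


def launch_args_for(url: str) -> list[str]:
    """Return the certificate-bypass flags for local URLs, otherwise nothing."""
    if is_local_url(url):
        return ["--launch-arg=--ignore-certificate-errors", "--ignore-https-errors"]
    return []


print(launch_args_for("https://localhost:8443"))
print(launch_args_for("https://example.com"))
```

Matching on the parsed hostname instead of a substring keeps hosts such as `localhost.example.com` from slipping through.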
Full command reference: `references/browser_cli.md`
**Option B: Python Script (recommended for repeatable automation)**
```python
from nova_act import NovaAct

# Open a browser session that starts on the given page.
with NovaAct(starting_page="https://example.com") as nova:
    # act() performs an action; act_get() asks a question and returns a result.
    nova.act("click the first link on the page")
    result = nova.act_get("What is the page title?")
    print(f"Page title: {result.response}")
```
Full testing guide: `references/qa_tests.md`
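
For repeatable scripts, one option is to pass the session factory in as a parameter so the surrounding logic can be dry-run without a browser. The sketch below assumes only the `starting_page` argument, `act_get`, and `.response` used in the example above. `fetch_page_title`, `FakeNova`, and `fake_factory` are hypothetical names introduced for illustration and are not part of the SDK.

```python
from contextlib import contextmanager
from types import SimpleNamespace


def fetch_page_title(nova_factory, url: str) -> str:
    """Open `url` with the given factory and return the reported page title."""
    with nova_factory(starting_page=url) as nova:
        result = nova.act_get("What is the page title?")
        return result.response


# Hypothetical stand-in used only for dry runs; it is not part of the SDK.
class FakeNova:
    def act_get(self, prompt: str):
        return SimpleNamespace(response="Example Domain")


@contextmanager
def fake_factory(starting_page: str):
    # Yields a fake session so no browser is launched.
    yield FakeNova()


title = fetch_page_title(fake_factory, "https://example.com")
print(title)
```

For a real run, `NovaAct` itself would be passed as `nova_factory`, assuming it behaves as in the example above.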
Step 3: Validate Installation
- Python 3.10+: `python --version`
- Nova Act SDK: `pip install 'nova-act[cli]'` / `pip show nova-act`
- Google Chrome (Optional): `playwright install chrome`
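
The first two checks can also be scripted. This is a minimal sketch using only the standard library: the function names are illustrative, and it reports whether a `nova-act` distribution is installed without verifying that the CLI or a browser works.

```python
import sys
from importlib import metadata
from typing import Optional, Sequence


def python_ok(version: Sequence[int] = sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.10 or newer."""
    return tuple(version[:2]) >= (3, 10)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("nova-act:", installed_version("nova-act") or "not installed")
```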
Steering Files
Quickstart: When given a URL to explore or automate, start with `references/browser_cli.md`.

| File | What It Covers |
|---|---|
| | Browser CLI: commands, sessions, config, page exploration. Default for interactive work and agent tool-use. You MUST read the ENTIRE file: it contains critical CLI best practices, command decision guides, and example transcripts that are essential for correct usage. |
| | Script writing: QA testing with pytest/unittest, act() vs act_get(), schemas, assertion patterns |
| | Deployment CLI: deploy workflows to AgentCore Runtime on AWS |
| | API key vs IAM, session management |
| | Workflow definitions with AWS CLI |
| | Structured extraction with Pydantic |
| | Human approval and UI takeover patterns |
| | External tools (@tool decorator, MCP servers) |
| | Remote browser via AgentCore |
| | Direct Playwright page access, sensitive input, file downloads |
| | Logs, screenshots, and video recordings |
| | Understanding Nova Act trajectories |
| | Visual reporting: post-session markdown reports from steps.yaml, snapshots, and screenshots |
| | Gherkin testing |
| | Flow discovery: live app exploration → codebase mapping, developer onboarding docs |
| | Bug reproduction: reproduce → capture evidence → export → regression test |
| | Parallel sessions: run multiple browser sessions concurrently with subagents, chunking, session limits |
| | Workflow refinement: analyze chain-of-thought, identify failure patterns, revise and validate browser automation workflows |
| | Mock generation: generate static HTML mock sites from real site trajectories for fast, offline iteration |
This skill uses the Nova Act SDK which is licensed under the Apache-2.0 license.