Electron App Automation
Automate any Electron desktop app using agent-browser. Electron apps are built on Chromium and expose a Chrome DevTools Protocol (CDP) port that agent-browser can connect to, enabling the same snapshot-interact workflow used for web pages.
- Launch the Electron app with remote debugging enabled
- Connect agent-browser to the CDP port
- Snapshot to discover interactive elements
- Interact using element refs
- Re-snapshot after navigation or state changes
Launch an Electron app with remote debugging
open -a "Slack" --args --remote-debugging-port=9222
Connect agent-browser to the app
agent-browser connect 9222
Standard workflow from here
agent-browser snapshot -i
agent-browser click @e5
agent-browser screenshot slack-desktop.png
Launching Electron Apps with CDP
Every Electron app supports the `--remote-debugging-port` flag since it's built into Chromium.
macOS
open -a "Slack" --args --remote-debugging-port=9222
open -a "Visual Studio Code" --args --remote-debugging-port=9223
open -a "Discord" --args --remote-debugging-port=9224
open -a "Figma" --args --remote-debugging-port=9225
open -a "Notion" --args --remote-debugging-port=9226
open -a "Spotify" --args --remote-debugging-port=9227
Linux
slack --remote-debugging-port=9222
code --remote-debugging-port=9223
discord --remote-debugging-port=9224
Windows
"C:\Users\%USERNAME%\AppData\Local\slack\slack.exe" --remote-debugging-port=9222
"C:\Users\%USERNAME%\AppData\Local\Programs\Microsoft VS Code\Code.exe" --remote-debugging-port=9223
Important: If the app is already running, quit it first, then relaunch with the flag. The `--remote-debugging-port` flag must be present at launch time.
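The macOS and Linux launch commands differ only in how the app is started, so they can be wrapped in one small helper. This is a minimal sketch, not part of agent-browser: the function name `cdp_launch_command` is hypothetical, the app is assumed to be identified by its macOS display name or by an executable on `PATH` on Linux, and Windows is left out because install paths vary.

```shell
# Hypothetical helper: print the launch command for a platform.
# Arguments: app name, port, optional OS name (defaults to `uname -s`).
# It only prints the command so it can be reviewed before running.
cdp_launch_command() {
  local app="$1" port="$2" os="${3:-$(uname -s)}"
  case "$os" in
    Darwin) printf 'open -a "%s" --args --remote-debugging-port=%s\n' "$app" "$port" ;;
    Linux)  printf '%s --remote-debugging-port=%s\n' "$app" "$port" ;;
    *)      echo "unsupported platform: $os" >&2; return 1 ;;
  esac
}
```

Once the printed command looks right, it can be run with `eval "$(cdp_launch_command Slack 9222)"`.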
Connect to a specific port
agent-browser connect 9222
Or use --cdp on each command
agent-browser --cdp 9222 snapshot -i
Auto-discover a running Chromium-based app
agent-browser --auto-connect snapshot -i
After `connect`, all subsequent commands target the connected app without needing `--cdp`.
Electron apps often have multiple windows or webviews. Use tab commands to list and switch between them:
List all available targets (windows, webviews, etc.)
agent-browser tab
Switch to a specific tab by index
agent-browser tab 2
Switch by URL pattern
agent-browser tab --url "settings"
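It can help to look at the raw targets before switching tabs. Chromium's remote debugging port also serves a small HTTP API, and `/json/list` returns the open targets as JSON with fields such as `type`, `title`, and `url`. The sketch below assumes `curl` is installed and the app listens on localhost; `list_cdp_targets` is a hypothetical name, not an agent-browser command.

```shell
# Hypothetical helper: dump the CDP targets exposed on a debugging port.
# Returns non-zero if nothing answers on that port.
list_cdp_targets() {
  local port="${1:-9222}"
  # -f: fail on HTTP errors, -s: no progress output, --max-time: do not hang
  curl -fs --max-time 2 "http://127.0.0.1:${port}/json/list"
}
```

The order of entries in this output is not guaranteed to match the tab indices agent-browser reports, so treat it as a way to find URL patterns rather than index numbers.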
Inspect and Navigate an App
open -a "Slack" --args --remote-debugging-port=9222
sleep 3 # Wait for app to start
agent-browser connect 9222
agent-browser snapshot -i
Read the snapshot output to identify UI elements
agent-browser click @e10 # Navigate to a section
agent-browser snapshot -i # Re-snapshot after navigation
Take Screenshots of Desktop Apps
agent-browser connect 9222
agent-browser screenshot app-state.png
agent-browser screenshot --full full-app.png
agent-browser screenshot --annotate annotated-app.png
Extract Data from a Desktop App
agent-browser connect 9222
agent-browser snapshot -i
agent-browser get text @e5
agent-browser snapshot --json > app-state.json
Fill Forms in Desktop Apps
agent-browser connect 9222
agent-browser snapshot -i
agent-browser fill @e3 "search query"
agent-browser press Enter
agent-browser wait 1000
agent-browser snapshot -i
Run Multiple Apps Simultaneously
Use named sessions to control multiple Electron apps at the same time:
Connect to Slack
agent-browser --session slack connect 9222
Connect to VS Code
agent-browser --session vscode connect 9223
Interact with each independently
agent-browser --session slack snapshot -i
agent-browser --session vscode snapshot -i
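When several apps are involved, the per-session `connect` calls can be driven from a list of `session:port` pairs. The following is a minimal sketch under stated assumptions: `connect_sessions` and the `AGENT_BROWSER` override variable are hypothetical names introduced here (the override only exists so the loop can be dry-run with `echo`), and the only agent-browser syntax used is the `--session NAME connect PORT` form shown above.

```shell
# Hypothetical helper: connect one named session per "session:port" argument.
# Set AGENT_BROWSER=echo to print the commands instead of running them.
connect_sessions() {
  local ab="${AGENT_BROWSER:-agent-browser}" pair
  for pair in "$@"; do
    # ${pair%%:*} is the session name, ${pair##*:} is the port
    "$ab" --session "${pair%%:*}" connect "${pair##*:}" || return 1
  done
}
```

For example, `connect_sessions slack:9222 vscode:9223` issues the same two connect commands as the manual steps, stopping at the first failure.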
Playwright overrides the color scheme to `light` by default when connecting via CDP. To preserve dark mode:
agent-browser connect 9222
agent-browser --color-scheme dark snapshot -i
Or set it globally:
AGENT_BROWSER_COLOR_SCHEME=dark agent-browser connect 9222
"Connection refused" or "Cannot connect"
- Make sure the app was launched with `--remote-debugging-port=NNNN`
- If the app was already running, quit and relaunch with the flag
- Check that the port isn't in use by another process, for example with `lsof -i :9222` on macOS or Linux
App launches but connect fails
- Wait a few seconds after launch before connecting (for example, `sleep 3`)
- Some apps take time to initialize their webview
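A fixed sleep can be too short for slow apps and wasteful for fast ones. One alternative is to poll the debugging port until it answers. The sketch below assumes `curl` is available and relies on the `/json/version` HTTP endpoint that Chromium serves on the remote debugging port; the function name `wait_for_cdp` and its default timeout are hypothetical choices, not agent-browser features.

```shell
# Hypothetical helper: wait until the CDP port answers or a timeout expires.
# Arguments: port (default 9222), timeout in seconds (default 15).
# Returns 0 once the endpoint responds, 1 if the timeout is reached.
wait_for_cdp() {
  local port="${1:-9222}" timeout="${2:-15}" waited=0
  until curl -fs --max-time 1 "http://127.0.0.1:${port}/json/version" >/dev/null 2>&1; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}
```

A typical use on macOS would be `open -a "Slack" --args --remote-debugging-port=9222 && wait_for_cdp 9222 && agent-browser connect 9222`. An answering port does not guarantee every webview has finished loading, so a re-snapshot may still be needed.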
Elements not appearing in snapshot
- The app may use multiple webviews. Use `agent-browser tab` to list targets and switch to the right one
- Use `agent-browser snapshot -i -C` to include cursor-interactive elements (divs with onclick handlers)
Cannot type in input fields
- Try `agent-browser keyboard type "text"` to type at the current focus without a selector
- Some Electron apps use custom input components; use `agent-browser keyboard inserttext "text"` to bypass key events
Any app built on Electron works, including:
- Communication: Slack, Discord, Microsoft Teams, Signal
- Development: VS Code, GitHub Desktop, Postman, Insomnia
- Design: Figma, Notion, Obsidian
- Media: Spotify, Tidal
- Productivity: Todoist, Linear, 1Password
If an app is built with Electron, it supports `--remote-debugging-port` and can be automated with agent-browser.