screenclaw
Core Rules
- Coordinates must come from the `XxY` grid intersections on ScreenClaw screenshots; never use internal visual coordinates or guesses.
- Each session uses one fixed `session_id`, and all public calls go through `scripts/screenclaw.py|ps1|sh` only.
- Read `references/api/{endpoint}.md` before calling an endpoint; on a parameter error or API error, go back to that document and correct the call.
- For first-time dynamic coordinates, high-risk coordinates, or an unclear target or number, first crop and zoom or verify with a marker.
- API success only means the instruction was sent, not that the interface reached the goal; always take a screenshot to verify after an operation.
- When you receive `SELF_CHECK_REQUIRED`, read and run the self-check procedure in `references/self_check.md` to reload key context, then use `self_check` to summarize what has been done and retry the screenshot.
Mental Model
- External facts come first: screenshots, markers, API responses, and the window list are facts; your intuition and preset conclusions must yield to them. If the marker is not on the target, the coordinates are wrong.
- Falsify first, then operate: assume every candidate coordinate may be wrong. Use crop or marker to look for counterexamples and confirm where the marked point actually lands before operating.
- On failure, go back to the screenshot: when an operation fails, the result is not what you expected, or repeated fine-tuning has no effect, re-take the screenshot, crop, and re-read the documents and parameters. Do not keep guessing coordinates or repeating clicks.
Fixed Work Cycle
```text
Understand goal -> health -> config -> get_window_list -> screenshot -> read coordinates -> marker verification -> operate -> screenshot verification
```
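The tail of the cycle (screenshot, read coordinates, marker verification, operate, verify) can be sketched as a small retry loop. This is a minimal sketch, not ScreenClaw's implementation: `run_cycle` and the three callables passed to it are hypothetical stand-ins for the real screenshot, marker, and operation calls.

```python
from typing import Callable


def run_cycle(
    read_coordinate: Callable[[], str],
    marker_on_target: Callable[[str], bool],
    operate_and_verify: Callable[[str], bool],
    max_attempts: int = 3,
) -> bool:
    """Repeat screenshot -> read -> marker check -> operate -> verify.

    All three callables are hypothetical placeholders for real API calls.
    Returns True once a post-operation screenshot confirms the goal.
    """
    for _ in range(max_attempts):
        coord = read_coordinate()        # fresh screenshot, read a grid coordinate
        if not marker_on_target(coord):  # marker missed the target: coordinate is wrong
            continue                     # go back to the screenshot, do not operate
        if operate_and_verify(coord):    # operate, then verify with a new screenshot
            return True
    return False
```

The point of the structure is that a failed marker check never reaches the operation step; the loop restarts from a new screenshot instead.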
1. Initialization
- Reply in the user's language.
- Read `references/config.md` for the `api_url`, `token`, `ai_app_type`, and `session_id` rules.
- Read `scripts/README.md` to learn the unified script entry point and the dot-path format.
- Call `health` to confirm the service is available.
- Search `references/scenarios/` for a matching template.
- For complex, multi-step, or high-risk tasks, first maintain a short plan of 2-5 steps; simple single-step tasks do not need a to-do list.
2. Get Target Window
- Call `get_window_list` to find the main window and any child windows.
- When the process is new or the window is uncertain, screenshot the candidate windows and record each window's content and the usable `window_id/main_window_id`.
- When a later operation fails, first check whether the wrong window was selected before switching operation mode.
3. Screenshot and Read Coordinates
- To locate coordinates, use `screenshot coordinate_type=grid`.
- To analyze content or show the image to the user, `coordinate_type=no` is enough.
- By default, rely on the server-side adaptive grid parameters first.
- When no intersection covers the target, read `references/api/screenshot.md` and adjust `grid.density_x/y`.
- When numbers or elements are hard to read, use `crop_zoom_screenshot` or adjust the number parameters.
- For first-time dynamic coordinates or high-risk coordinates, first use `crop_zoom_screenshot` to see the local area clearly, then verify in reverse with `marker.0.x/y`.
- For marker verification, first find where the marked point actually lands on the image and describe what is there, then judge whether it matches the target; do not start by assuming the candidate coordinate is correct.
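Parameter names such as `grid.density_x` and `marker.0.x` read as dot paths into a nested request body. The sketch below shows one plausible expansion; it is an assumption for illustration only (`set_dot_path` is a hypothetical helper, numeric segments are assumed to be list indices, and the values are made up), and `scripts/README.md` remains the authority on the real format.

```python
from typing import Any


def set_dot_path(payload: dict, path: str, value: Any) -> dict:
    """Set a value in a nested payload using a dot path such as "marker.0.x".

    Assumption: a numeric segment is a list index; any other segment is a dict key.
    """
    keys = path.split(".")
    node: Any = payload
    for i, key in enumerate(keys):
        last = i == len(keys) - 1
        # The next container is a list if the following segment is numeric.
        child: Any = value if last else ([] if keys[i + 1].isdigit() else {})
        if key.isdigit():
            idx = int(key)
            while len(node) <= idx:  # grow the list up to the requested index
                node.append(None)
            if last or node[idx] is None:
                node[idx] = child
            node = node[idx]
        else:
            if last or key not in node:
                node[key] = child
            node = node[key]
    return payload


params: dict = {}
set_dot_path(params, "coordinate_type", "grid")
set_dot_path(params, "marker.0.x", 50)  # illustrative values only
set_dot_path(params, "marker.0.y", 35)
```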
4. Operation and Verification
- Prefer the `background` operation mode.
- Consider `hijack` only when `background` has no effect or physical input is required.
- When the user explicitly asks for it, or in scenarios that need continuous physical input such as real-time game control or a Chinese IME candidate panel, read `references/api/delegated.md` and then enter delegated mode.
- Use single-step calls while exploring; use `batch` only when the flow is stable and you need to observe a hover, menu, or operation result at the instant it appears.
- Take a screenshot to verify after every operation; if verification fails, go back to the screenshot and coordinate-reading step.
- When you receive `SELF_CHECK_REQUIRED`, you must update the current plan or the next action.
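The mode rules above reduce to a short decision function. This is a sketch under stated assumptions: `choose_mode` and its three flags are hypothetical names, and the returned strings simply mirror the mode names used in this document.

```python
def choose_mode(
    background_works: bool = True,
    needs_physical_input: bool = False,
    delegation_scenario: bool = False,
) -> str:
    """Pick an operation mode following the priority described above.

    delegation_scenario: the user asked for delegation, or the task needs
    continuous physical input (real-time game control, IME candidate panel).
    """
    if delegation_scenario:
        return "delegated"   # read references/api/delegated.md before entering
    if needs_physical_input or not background_works:
        return "hijack"      # only when background is ineffective or input must be physical
    return "background"      # default and preferred mode
```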
Coordinate Concept
Coordinates on a screenshot use the format `XxY`; for example, `50x35` means 50% from the left edge and 35% from the top edge.
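To make the percentage reading of `XxY` concrete, the sketch below parses a label and maps it onto a window of a given size. Both functions are hypothetical helpers written for illustration; the 0-100 range check and the pixel conversion are assumptions, since the API itself is addressed with grid coordinates.

```python
from typing import Tuple


def parse_grid_coordinate(label: str) -> Tuple[float, float]:
    """Split an "XxY" label into (percent from left, percent from top)."""
    x_text, sep, y_text = label.partition("x")
    if not sep:
        raise ValueError(f"expected 'XxY', got {label!r}")
    x, y = float(x_text), float(y_text)  # raises ValueError on malformed numbers
    if not (0 <= x <= 100 and 0 <= y <= 100):  # assumed valid range for percentages
        raise ValueError(f"coordinate out of range: {label!r}")
    return x, y


def to_pixels(label: str, width: int, height: int) -> Tuple[int, int]:
    """Convert a percentage coordinate to pixel offsets inside a window."""
    x, y = parse_grid_coordinate(label)
    return round(width * x / 100), round(height * y / 100)


print(to_pixels("50x35", 1920, 1080))  # (960, 378)
```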
API Index
Before calling an API, read its document at `references/api/{endpoint}.md`.
| API | method | Applicable Scenario | Reference Document |
|---|---|---|---|
| health | GET | Check service availability before starting tasks | |
| get_window_list | POST | Find the target window to be controlled | |
| screenshot | POST | With grid for coordinate positioning. Without grid for interface analysis and record keeping. With marker points to preview coordinate positions | |
| crop_zoom_screenshot | POST | Crop any screenshot and zoom in to see details (such as coordinate numbers) | |
| scroll_screenshot | POST | Scroll to take long screenshots, record long pages and content, and understand the target window as a whole | |
| click | POST | Single click to trigger buttons/enter pages | |
| long_press | POST | Long press to trigger certain functions | |
| swipe | POST | Touch-style swipe to move pages up, down, left, or right | |
| drag | POST | Drag elements by holding the mouse and moving | |
| scroll | POST | Mouse wheel scroll to move pages up or down | |
| right_click | POST | Right click to open context menus | |
| hover | POST | Trigger a hover effect; pair with a screenshot to capture it | |
| mouse_move | POST | Move the mouse, e.g. for game camera control; hijack/delegated mode only | |
| input_text | POST | Input text. With coordinates, clicks first and then types; without coordinates, types directly | |
| press_key | POST | Press a key or key combination. With coordinates, clicks first and then presses; without coordinates, presses directly | |
| wait | POST | Wait for UI animations/page loading | |
| batch | POST | Combine instructions to execute continuous steps. Multiple single-step operations can be combined to improve efficiency | |
| delegated | POST | User actively requests to enter/exit delegation mode | |
Script Fallback
Fallback path:
```text
scripts/screenclaw.py -> scripts/screenclaw.ps1 -> scripts/screenclaw.sh -> curl
```
Determine the cause before falling back:
| Error Type | Handling |
|---|---|
| Parameter error | Correct the parameters and re-run the same script; do not fall back |
| API business error | Read the corresponding API document and the server message; do not fall back |
| Environment error, such as Python not being installed | Fall back to PowerShell or shell |
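The table can be read as a rule for choosing the next entry point after a failure. A minimal sketch, assuming the caller has already classified the error; the `error_kind` labels and the `next_entry` function are hypothetical names, not part of the ScreenClaw scripts.

```python
FALLBACK_CHAIN = [
    "scripts/screenclaw.py",
    "scripts/screenclaw.ps1",
    "scripts/screenclaw.sh",
    "curl",
]


def next_entry(current: str, error_kind: str) -> str:
    """Return the entry point to use after a failure.

    error_kind is one of "parameter", "api", "environment" (hypothetical labels).
    """
    if error_kind in ("parameter", "api"):
        return current  # fix the call and re-run the same script; no fallback
    if error_kind != "environment":
        raise ValueError(f"unknown error kind: {error_kind!r}")
    idx = FALLBACK_CHAIN.index(current)
    if idx + 1 >= len(FALLBACK_CHAIN):
        raise RuntimeError("no fallback left after curl")
    return FALLBACK_CHAIN[idx + 1]  # environment error: move one step down the chain
```

Only environment errors move down the chain; parameter and API errors keep the same script so that the real cause gets fixed instead of masked.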
Reference Documents
- `references/config.md` - Connection configuration, `ai_app_type`, `session_id`
- `scripts/README.md` - Unified script entry point and dot-path format
- `references/self_check.md` - Self-check reload checklist for long-running tasks
- `references/api/*.md` - Parameters and troubleshooting for each API
- `references/scenarios/` - Scenario templates and application knowledge