glmocr-handwriting

GLM-OCR Handwriting Recognition Skill

Recognize handwritten text from images and PDFs using the Zhipu GLM-OCR layout parsing API.

When to Use

- Extract text from handwritten notes, letters, or documents
- Convert handwriting to editable text
- Recognize mixed handwritten and printed content
- Read handwritten formulas, labels, or annotations
- The user mentions "handwriting OCR", "recognize handwriting", "手写识别", "手写体OCR", or "识别手写字"

Key Features

- Multi-style support: handles various handwriting styles, including cursive and print
- Multi-language: supports Chinese, English, and mixed-language handwriting
- Mixed content: recognizes documents that contain both handwritten and printed text
- Local file & URL: accepts both local files and remote URLs

Resource Links

| Resource | Link |
| --- | --- |
| Get API Key | Zhipu Open Platform API Keys |
| API Docs | Layout Parsing |

Prerequisites

API Key Setup (Required)

This script reads the key from the `ZHIPU_API_KEY` environment variable. Reusing the same key across Zhipu skills is optional.

Get Key: Visit the Zhipu Open Platform API Keys page to create or copy your key.

Setup options (choose one):

Global config (recommended): Set the key once in `openclaw.json` under `env.vars`, and all Zhipu skills will share it:

```json
{
  "env": {
    "vars": {
      "ZHIPU_API_KEY": "your-api-key"
    }
  }
}
```

Skill-level config: Set the key for this skill only in `openclaw.json`:

```json
{
  "skills": {
    "entries": {
      "glmocr-handwriting": {
        "env": {
          "ZHIPU_API_KEY": "your-api-key"
        }
      }
    }
  }
}
```

Shell environment variable: Add the following line to `~/.zshrc`:

```bash
export ZHIPU_API_KEY="your-api-key"
```

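💡 If you have already configured a key for other Zhipu skills (such as `glmocr`, `glmv-caption`, or `glm-image-generation`), they all share the same `ZHIPU_API_KEY`, so no further configuration is needed.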
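
Whichever setup option is used, the key has to reach the script through the `ZHIPU_API_KEY` environment variable. The following is a minimal sketch for checking that before running the CLI. The helper name `resolve_api_key` is hypothetical and is not part of the skill's script.

```python
import os


def resolve_api_key(env=None):
    """Return the key stored in ZHIPU_API_KEY, or None if it is unset or blank.

    `env` defaults to the process environment; passing a plain dict makes the
    check easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    key = (env.get("ZHIPU_API_KEY") or "").strip()
    return key or None


# Report whether the current shell session exposes the key.
if resolve_api_key() is None:
    print("ZHIPU_API_KEY is not set")
else:
    print("ZHIPU_API_KEY is set")
```

The check only looks at the process environment, so it presumably reflects the shell option directly; whether values from `openclaw.json` appear there depends on how the host injects them.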

Security & Transparency

Environment variables used:
- `ZHIPU_API_KEY` (required)
- `GLM_OCR_TIMEOUT` (optional, timeout in seconds)

Fixed endpoint: https://open.bigmodel.cn/api/paas/v4/layout_parsing

- No custom API URL override: this avoids accidental key exfiltration via redirected endpoints.
- Raw upstream response is optional and not returned by default: use `--include-raw` only when needed for debugging.
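
The optional `GLM_OCR_TIMEOUT` variable listed above could be read along the following lines. This is a sketch only: the helper name `read_timeout` and the 60-second fallback are assumptions, since this document does not state the script's actual default or parsing rules.

```python
import math
import os

# Assumed fallback; the real default used by the script is not documented here.
DEFAULT_TIMEOUT_SECONDS = 60.0


def read_timeout(env=None, default=DEFAULT_TIMEOUT_SECONDS):
    """Parse GLM_OCR_TIMEOUT as a positive, finite number of seconds."""
    env = os.environ if env is None else env
    raw = env.get("GLM_OCR_TIMEOUT", "")
    try:
        value = float(raw)
    except ValueError:
        # Unset, empty, or non-numeric values fall back to the default.
        return default
    if not math.isfinite(value) or value <= 0:
        return default
    return value
```

Falling back to a default on bad input keeps a typo in the environment from turning into a zero or negative timeout.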

⛔ MANDATORY RESTRICTIONS ⛔

- ONLY use the GLM-OCR API: execute the script `python scripts/glm_ocr_cli.py`
- NEVER parse handwriting yourself: do NOT try to read handwritten text using built-in vision or any other method
- NEVER offer alternatives: do NOT suggest "I can try to read it" or similar
- IF the API fails: display the error message and STOP immediately
- NO fallback methods: do NOT attempt handwriting recognition any other way

📋 Output Display Rules

After running the script, present the OCR result clearly and safely.

- Show the extracted handwritten text (`text`) in full
- Summarization is allowed, but do not hide important extraction failures
- If the result file is saved, tell the user the file path
- Show the raw upstream response only when explicitly requested or when debugging (`--include-raw`)

How to Use

Recognize from URL

```bash
python scripts/glm_ocr_cli.py --file-url "https://example.com/handwriting.jpg"
```

Recognize from Local File

```bash
python scripts/glm_ocr_cli.py --file /path/to/notes.png
```

Save Result to File

```bash
python scripts/glm_ocr_cli.py --file notes.png --output result.json --pretty
```

Include Raw Upstream Response (Debug Only)

```bash
python scripts/glm_ocr_cli.py --file notes.png --output result.json --include-raw
```

CLI Reference

```
python {baseDir}/scripts/glm_ocr_cli.py (--file-url URL | --file PATH) [--output FILE] [--pretty] [--include-raw]
```

| Parameter | Required | Description |
| --- | --- | --- |
| `--file-url` | One of the two | URL to image/PDF |
| `--file` | One of the two | Local file path to image/PDF |
| `--output`, `-o` | No | Save result JSON to file |
| `--pretty` | No | Pretty-print JSON output |
| `--include-raw` | No | Include raw upstream API response in `result` field (debug only) |
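
When the CLI is driven from another Python program, the flags in the table above can be assembled programmatically. The sketch below uses two hypothetical helpers, `build_command` and `run_ocr`. It assumes the script lives at `scripts/glm_ocr_cli.py` relative to the working directory, and that it still writes the `--output` file when recognition fails; neither is confirmed here.

```python
import json
import subprocess
import sys

SCRIPT = "scripts/glm_ocr_cli.py"  # assumed path relative to the working directory


def build_command(file_url=None, file=None, output=None, pretty=False, include_raw=False):
    """Build the argv list for the CLI from the documented flags."""
    # --file-url and --file are mutually exclusive, and one of them is required.
    if (file_url is None) == (file is None):
        raise ValueError("pass exactly one of file_url or file")
    cmd = [sys.executable, SCRIPT]
    cmd += ["--file-url", file_url] if file_url is not None else ["--file", file]
    if output is not None:
        cmd += ["--output", output]
    if pretty:
        cmd.append("--pretty")
    if include_raw:
        cmd.append("--include-raw")
    return cmd


def run_ocr(output, **kwargs):
    """Run the CLI with --output and load the saved JSON result.

    Assumes the script writes the output file even when recognition fails.
    """
    subprocess.run(build_command(output=output, **kwargs), check=False)
    with open(output, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `run_ocr("result.json", file="notes.png", pretty=True)` corresponds to the "Save Result to File" command. Passing the arguments as a list avoids shell quoting problems with paths or URLs.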

Response Format

```json
{
  "ok": true,
  "text": "Recognized handwritten text in Markdown...",
  "layout_details": [...],
  "result": null,
  "error": null,
  "source": "/path/to/file",
  "source_type": "file",
  "raw_result_included": false
}
```

Key fields:

- `ok`: whether recognition succeeded
- `text`: extracted text in Markdown (use this for display)
- `layout_details`: layout analysis details
- `error`: error details on failure
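
The key fields above, together with the output display rules, suggest a simple way to turn a parsed result into user-facing text. This is a minimal sketch: `render_result` is a hypothetical helper, and because the exact shape of `error` (string or object) is not specified here, it is formatted as-is.

```python
def render_result(result, saved_path=None):
    """Format a parsed CLI result for display.

    Shows `text` in full on success, surfaces `error` on failure, and
    mentions the saved file path when one is given.
    """
    lines = []
    if result.get("ok"):
        lines.append(result.get("text") or "")
    else:
        # Never hide a failure: show whatever the script reported.
        lines.append(f"OCR failed: {result.get('error')}")
    if saved_path:
        lines.append(f"Result saved to: {saved_path}")
    return "\n".join(lines)


print(render_result({"ok": True, "text": "Dear Anna, ..."}, saved_path="result.json"))
```

The raw `result` field is deliberately left out, in line with the rule that the upstream response is shown only on request or when debugging.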

Error Handling

API key not configured:

```
ZHIPU_API_KEY not configured. Get your API key at: https://bigmodel.cn/usercenter/proj-mgmt/apikeys
```

→ Show the exact error to the user and guide them through configuration.

Authentication failed (401/403): the API key is invalid or expired → reconfigure it.

Rate limit (429): quota exhausted → tell the user to wait.

File not found: the local file is missing → check the path.
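
The status-code cases above can be expressed as a small lookup. This sketch assumes the caller already has the HTTP status code; how the script exposes it inside `error` is not documented here, and `guidance_for_status` is a hypothetical helper name.

```python
def guidance_for_status(status):
    """Map an HTTP status code to the user guidance described above."""
    if status in (401, 403):
        return "API key invalid or expired: reconfigure ZHIPU_API_KEY."
    if status == 429:
        return "Quota exhausted: wait before trying again."
    # Any other failure: report it and stop, with no fallback recognition.
    return "Show the error message to the user and stop."


print(guidance_for_status(429))
```

Every branch ends in reporting and stopping, in line with the mandatory restriction against fallback recognition methods.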
