ocr

OCR Image Text Extraction Skill

Extract text from images using Tesseract OCR engine.

Capabilities

  • Extract text from image files (PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP)
  • Support for 100+ languages
  • Optional image preprocessing for better accuracy
  • Output in plain text or JSON format with confidence scores

Usage

Basic OCR

python3 scripts/ocr.py <image_file> <output_file>
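
The contents of scripts/ocr.py are not shown here, so the following is only a minimal sketch of what a basic image-to-text step could look like with pytesseract and Pillow. The function names and the `ocr` parameter are hypothetical; `ocr` exists only so the function can be exercised without Tesseract installed.

```python
from pathlib import Path
from typing import Callable, Optional


def _tesseract_ocr(image_path: str, lang: str) -> str:
    """Run Tesseract on one image via pytesseract (imported lazily)."""
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)


def ocr_to_file(
    image_path: str,
    output_path: str,
    lang: str = "eng",
    ocr: Optional[Callable[[str, str], str]] = None,
) -> str:
    """Extract text from image_path, write it to output_path, and return it.

    `ocr` is a hypothetical injection point; it defaults to pytesseract.
    """
    engine = ocr or _tesseract_ocr
    text = engine(image_path, lang)
    Path(output_path).write_text(text, encoding="utf-8")
    return text
```

Calling `ocr_to_file("image.png", "text.txt")` would roughly correspond to the command above, assuming the script behaves this way.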

With Options

Specify language (default: eng)

python3 scripts/ocr.py image.png text.txt --lang eng

Chinese text

python3 scripts/ocr.py image.png text.txt --lang chi_sim

Multiple languages

python3 scripts/ocr.py image.png text.txt --lang eng+chi_sim

With image preprocessing (improves accuracy)

python3 scripts/ocr.py image.png text.txt --preprocess
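
The Parameters section describes preprocessing as grayscale conversion plus thresholding. One way this might be done with Pillow is sketched below; the fixed threshold of 128 and both function names are assumptions, and the actual script may use a different method (for example, an adaptive threshold).

```python
def threshold_lut(threshold: int = 128) -> list:
    """Build a 256-entry lookup table mapping each gray level to black or white."""
    return [255 if level > threshold else 0 for level in range(256)]


def preprocess(image_path: str, threshold: int = 128):
    """Return a grayscale, binarized copy of the image (Pillow imported lazily)."""
    from PIL import Image

    with Image.open(image_path) as img:
        gray = img.convert("L")  # 8-bit grayscale copy
    # Apply a fixed global threshold through the lookup table.
    return gray.point(threshold_lut(threshold))
```

A global threshold tends to work for evenly lit scans; photos with shadows may need a different cutoff.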

JSON output with confidence scores

python3 scripts/ocr.py image.png output.json --format json
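
The JSON schema produced by the script is not documented here. As a rough illustration, per-word confidence can be read from pytesseract's `image_to_data` output; the report layout (`text`, `mean_confidence`, `words`) and the function names below are hypothetical.

```python
import json


def build_report(data: dict) -> dict:
    """Turn a pytesseract image_to_data dict into a JSON-friendly report."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)
        # Tesseract reports conf == -1 for layout rows that carry no word.
        if conf < 0 or not text.strip():
            continue
        words.append({"text": text, "confidence": conf})
    mean = sum(w["confidence"] for w in words) / len(words) if words else 0.0
    return {
        "text": " ".join(w["text"] for w in words),
        "mean_confidence": round(mean, 2),
        "words": words,
    }


def ocr_to_json(image_path: str, output_path: str, lang: str = "eng") -> dict:
    """Run OCR and write the report as JSON (third-party imports are lazy)."""
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        data = pytesseract.image_to_data(
            img, lang=lang, output_type=pytesseract.Output.DICT
        )
    report = build_report(data)
    with open(output_path, "w", encoding="utf-8") as fh:
        json.dump(report, fh, ensure_ascii=False, indent=2)
    return report
```

`ensure_ascii=False` keeps non-Latin text such as Chinese readable in the output file.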

Download and OCR from URL

OCR from remote image

python3 scripts/ocr_url.py <image_url> <output_file>

With options

python3 scripts/ocr_url.py https://example.com/image.jpg text.txt --lang eng --preprocess
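
How scripts/ocr_url.py fetches the image is not shown. A plausible sketch using only the standard library is to download the URL to a temporary file and pass that path to the OCR step. The helper names, the 30-second timeout, and the `.img` fallback suffix are assumptions.

```python
import os
import tempfile
import urllib.request
from urllib.parse import urlparse


def suffix_from_url(url: str, default: str = ".img") -> str:
    """Pick a file extension from the URL path, ignoring any query string."""
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    return ext or default


def download_image(url: str, timeout: float = 30.0) -> str:
    """Download url to a temporary file and return that file's path."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = response.read()
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix_from_url(url)) as tmp:
        tmp.write(payload)
        return tmp.name
```

The caller is responsible for deleting the temporary file (for example with `os.remove`) once OCR has finished.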

Parameters

  • image_file / image_url (required): Path to a local image, or an image URL
  • output_file (required): Path to the output text/JSON file
  • --lang: Language code (e.g., eng, chi_sim, jpn, fra, deu). Default: eng
  • --preprocess: Apply image preprocessing (grayscale, thresholding) for better accuracy
  • --format: Output format (text or json). Default: text
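
The parameters above map naturally onto an `argparse` definition. The sketch below mirrors the documented names and defaults, but it is an assumption about how the script parses its arguments, not a copy of it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented positional arguments and flags."""
    parser = argparse.ArgumentParser(
        description="Extract text from an image with Tesseract."
    )
    parser.add_argument("image_file", help="path to the local image")
    parser.add_argument("output_file", help="path to the output text/JSON file")
    parser.add_argument(
        "--lang", default="eng", help="language code, e.g. eng or eng+chi_sim"
    )
    parser.add_argument(
        "--preprocess",
        action="store_true",
        help="apply grayscale and thresholding before OCR",
    )
    parser.add_argument(
        "--format", choices=["text", "json"], default="text", help="output format"
    )
    return parser
```

For example, `build_parser().parse_args(["image.png", "output.json", "--format", "json"])` yields `lang == "eng"` and `preprocess == False` by default.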

Common Languages

Language                Code
English                 eng
Chinese (Simplified)    chi_sim
Chinese (Traditional)   chi_tra
Japanese                jpn
Korean                  kor
French                  fra
German                  deu
Spanish                 spa
Russian                 rus
Arabic                  ara
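
A language code only works if the matching Tesseract language data is installed, which is often a separate step from installing the engine. The sketch below checks a `--lang` value such as `eng+chi_sim` against the installed languages; the helper names are hypothetical, and `pytesseract.get_languages` requires a reasonably recent pytesseract release.

```python
def split_languages(lang: str) -> list:
    """Split a Tesseract language string such as 'eng+chi_sim' into codes."""
    return [code for code in lang.split("+") if code]


def missing_languages(lang: str, installed) -> list:
    """Return the requested codes that are not in the installed set."""
    available = set(installed)
    return [code for code in split_languages(lang) if code not in available]


def installed_languages() -> list:
    """Ask Tesseract which languages are installed (pytesseract imported lazily)."""
    import pytesseract

    return pytesseract.get_languages(config="")
```

A script could call `missing_languages(args_lang, installed_languages())` before running OCR and report any missing codes to the user.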

Supported Image Formats

PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP
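
A simple extension check against this list could look like the following sketch. The names are hypothetical, and the set contains only the extensions listed above (so `.tif`, for instance, is deliberately not included).

```python
from pathlib import Path

# Extensions taken from the list of supported formats above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of the listed image formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

This checks the file name only; it does not verify that the file actually contains image data of that type.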

Dependencies

  • Python 3.8+
  • pytesseract
  • Pillow (PIL)
  • tesseract-ocr (system package)

Installation

Python packages

pip install pytesseract Pillow

Tesseract OCR engine

sudo apt-get install tesseract-ocr   # Ubuntu/Debian
sudo yum install tesseract           # CentOS/RHEL
brew install tesseract               # macOS
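
After installation, a quick check along these lines can confirm that the dependencies are visible from Python. This is a sketch using only the standard library; the function name and the keys of the returned dictionary are made up for illustration.

```python
import importlib.util
import shutil
import sys


def check_dependencies() -> dict:
    """Report which of the listed dependencies appear to be available."""
    return {
        "python>=3.8": sys.version_info >= (3, 8),
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "Pillow": importlib.util.find_spec("PIL") is not None,
        # The Tesseract engine is a system binary, so look for it on PATH.
        "tesseract": shutil.which("tesseract") is not None,
    }
```

Any entry reported as `False` points to the install step that still needs attention.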