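Extracts figures and sub-figures from academic PDF papers. Supported caption types are Fig/Figure, Scheme, Chart, Supplementary Figure, Extended Data Figure (Nature), and their Chinese equivalents: 图 (figure), 方案 (scheme), 示意图 (schematic), 附图 (appended figure), and 补充图 (supplementary figure). Sub-figure labels are recognized in six formats: "(a)", "(A)", "a)", "(i)", "(1)", and "a.". Output is high-quality PNG at a configurable DPI. Use when the user asks to "extract figure", "截取文献图片" (capture a figure from a paper), "提取子图" (extract a sub-figure), or "get figure from paper", or mentions "Scheme", "方案图" (scheme figure), "补充图" (supplementary figure), "Supplementary Figure", or "Extended Data".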
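The six sub-figure label formats listed above can be told apart with a handful of regular expressions. The following is a minimal sketch, not the tool's actual implementation: the format names (`paren_lower`, `paren_roman`, and so on) and the function `classify_label` are hypothetical, and the rule that `(i)`, `(v)`, and `(x)` count as roman numerals rather than letters is an assumption made here to resolve the ambiguity.

```python
import re

# Hypothetical format names; the tool's real identifiers are not shown in this document.
# Order matters: the roman pattern is tried first so that "(i)" is not read as a letter.
# Assumption: single "(v)" and "(x)" are also treated as roman numerals.
LABEL_PATTERNS = [
    ("paren_roman", re.compile(r"\(([ivx]+)\)")),   # (i), (ii), (iv)
    ("paren_lower", re.compile(r"\(([a-z])\)")),    # (a)
    ("paren_upper", re.compile(r"\(([A-Z])\)")),    # (A)
    ("paren_digit", re.compile(r"\((\d+)\)")),      # (1)
    ("rparen_lower", re.compile(r"([a-z])\)")),     # a)
    ("dot_lower", re.compile(r"([a-z])\.")),        # a.
]


def classify_label(text):
    """Return (format_name, bare_label) for a label token, or None if it is not one."""
    token = text.strip()
    for name, pattern in LABEL_PATTERNS:
        match = pattern.fullmatch(token)
        if match:
            return name, match.group(1)
    return None


if __name__ == "__main__":
    for sample in ["(a)", "(B)", "c)", "(ii)", "(3)", "d.", "Figure"]:
        print(repr(sample), "->", classify_label(sample))
```

Matching the whole token with `fullmatch` keeps ordinary words that merely end in a period or parenthesis from being mistaken for labels.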
npx skill4agent add shzhao27208/aut_sci_write sci-figure

cd ${SKILL_DIR}
pip install -e .

sh-sci-fig

winget install UB-Mannheim.TesseractOCR
apt install tesseract-ocr
brew install tesseract

# Check project-level first
test -f .baoyu-skills/sci-figure/EXTEND.md && echo "project"
# Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)
test -f "$HOME/.baoyu-skills/sci-figure/EXTEND.md" && echo "user"

sh-sci-fig <input.pdf> [options]

| Option | Short | Description | Default |
|---|---|---|---|
| <input.pdf> | | PDF file path | Required |
| | -f | Figure number (1, 2, 3...) | Required (except --list/--all) |
| | -s | Sub-figure label (a, b, c...) | None (returns whole figure) |
| | -o | Output directory | Current directory |
| | -d | Output resolution (DPI) | 600 |
| --list | | List all available figure numbers | false |
| --all | | Extract all figures | false |
| | | Output format (png/jpg) | png |
| | | Suppress info messages | false |
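For scripted or batch use, the command line can be assembled as an argument list and handed to `subprocess.run`. The sketch below uses only the flags that appear in this document's examples (`-f`, `-s`, `-o`, `-d`, `--list`, `--all`). The helper `build_command` is hypothetical, and combining `--all` with `-o` or `-d` is an assumption rather than documented behavior.

```python
def build_command(pdf, figure=None, subfigure=None, out_dir=None, dpi=None,
                  list_only=False, extract_all=False):
    """Build an argv list for sh-sci-fig (hypothetical helper, not part of the tool)."""
    argv = ["sh-sci-fig", str(pdf)]
    if list_only:
        # --list needs no figure number
        return argv + ["--list"]
    if extract_all:
        argv.append("--all")
    else:
        if figure is None:
            raise ValueError("a figure number is required unless --list or --all is used")
        argv += ["-f", str(figure)]
        if subfigure:
            argv += ["-s", str(subfigure)]
    # Assumption: output directory and DPI may also be combined with --all.
    if out_dir:
        argv += ["-o", str(out_dir)]
    if dpi:
        argv += ["-d", str(dpi)]
    return argv


if __name__ == "__main__":
    print(build_command("paper.pdf", figure=2, subfigure="c", out_dir="./output/", dpi=300))
    # To actually run it: subprocess.run(build_command(...), check=True)
```

Passing a list instead of a shell string avoids quoting problems with PDF paths that contain spaces.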
# Extract Figure 2, sub-figure c
sh-sci-fig paper.pdf -f 2 -s c
# Extract entire Figure 3
sh-sci-fig paper.pdf -f 3
# List all available figures in a PDF
sh-sci-fig paper.pdf --list
# Extract all figures
sh-sci-fig paper.pdf --all
# Custom output directory and DPI
sh-sci-fig paper.pdf -f 2 -s c -o ./output/ -d 300

Extracted: figure_2c.png (1920x1080, 600 DPI)

| Scenario | Behavior |
|---|---|
| Figure number not found | Error + list all available figure numbers |
| OCR recognition failed | Return entire figure region |
| Sub-figure split failed | Return entire figure region |
| No sub-figure labels found | Return entire figure region |
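The fallback rules in the table above amount to "when anything goes wrong at the sub-figure stage, return the whole figure". The sketch below shows one way that control flow could look. It is an illustration under assumptions, not the tool's code: `select_figure`, `crop_with_fallback`, and the `find_labels` / `split_region` callables are all hypothetical names.

```python
def select_figure(figures, number):
    """Return the entry for `number`, or raise an error listing the available numbers."""
    if number not in figures:
        available = ", ".join(str(n) for n in sorted(figures))
        raise LookupError(f"Figure {number} not found. Available figures: {available}")
    return figures[number]


def crop_with_fallback(figure_region, label, find_labels, split_region):
    """Return the sub-figure region for `label`, or the whole figure region on failure.

    find_labels(region) and split_region(region, labels, label) are hypothetical
    callables standing in for the OCR and splitting stages.
    """
    if label is None:
        return figure_region                 # no sub-figure requested
    try:
        labels = find_labels(figure_region)
    except Exception:
        return figure_region                 # OCR recognition failed
    if not labels:
        return figure_region                 # no sub-figure labels found
    try:
        sub_region = split_region(figure_region, labels, label)
    except Exception:
        return figure_region                 # sub-figure split failed
    return sub_region if sub_region is not None else figure_region


if __name__ == "__main__":
    whole = (0, 0, 100, 100)
    print(crop_with_fallback(whole, "a", lambda r: [], lambda r, ls, l: None))
```

Only the missing-figure case raises an error. Every sub-figure failure degrades to the whole figure, so the caller always receives an image.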

| Library | Role |
|---|---|
| pdfplumber | Text + coordinate extraction (locate "Figure X" labels) |
| PyMuPDF (fitz) | PDF → high-quality image rendering (600 DPI) |
| opencv-python | Boundary detection, contour analysis |
| Pillow | Final cropping, format conversion |
| pytesseract | OCR for sub-figure label recognition |

| Field | Type | Description |
|---|---|---|
| | int | Figure number |
| | int | Page index (0-based) |
| | tuple | Crop region in pixels |
| | tuple | Crop region in PDF points |
| | str | Caption text (truncated to 200 chars) |
| | str | Full caption text (no truncation) |
| | tuple | Caption bounding box in PDF points |
| | list[str] | Sub-figure labels, e.g. |
| | list[dict] | Labels with detected format, e.g. |
| | str | One of: |
| | bool | True for |
| | ndarray | Cropped figure image (numpy array) |
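The table lists crop regions both in pixels and in PDF points. A PDF point is 1/72 inch, so the two are related by a factor of DPI / 72. The sketch below shows that conversion. The function names and the `(x0, y0, x1, y1)` ordering are assumptions made for illustration, since the tool's actual field layout is not shown here.

```python
POINTS_PER_INCH = 72  # one PDF point is 1/72 inch


def points_to_pixels(bbox_pt, dpi=600):
    """Convert an (x0, y0, x1, y1) box from PDF points to pixels at the given DPI."""
    return tuple(round(v * dpi / POINTS_PER_INCH) for v in bbox_pt)


def pixels_to_points(bbox_px, dpi=600):
    """Convert an (x0, y0, x1, y1) box from pixels at the given DPI to PDF points."""
    return tuple(v * POINTS_PER_INCH / dpi for v in bbox_px)


if __name__ == "__main__":
    # A US Letter page is 612 x 792 points.
    print(points_to_pixels((0, 0, 612, 792), dpi=600))  # (0, 0, 5100, 6600)
```

At the default 600 DPI each point spans about 8.3 pixels, so a full page renders to a large image, and lowering the DPI shrinks the output proportionally.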