pdf-vision-reader

PDF Vision Reader

This skill converts diagram-heavy PDFs into images, analyzes their content with Claude's vision capability, and writes the result as Markdown.

Quick Start

Basic Usage

1. Convert the PDF to images

wsl python3 scripts/pdf_to_images.py "/mnt/c/path/to/file.pdf"

2. Load each image with the Read tool and analyze it

3. Compile the results as Markdown

Prerequisites

Required packages:

Python packages

wsl pip3 install pdf2image Pillow

System packages (poppler)

wsl sudo apt-get update
wsl sudo apt-get install -y poppler-utils
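Before running the conversion, it can help to confirm that the prerequisites are actually visible from the Python environment in use. The following is a minimal sketch, not part of the skill itself: the function name `check_prerequisites` is hypothetical, and it assumes that poppler is detected by looking for the `pdftoppm` binary on `PATH`.

```python
import importlib.util
import shutil
from typing import Dict


def check_prerequisites() -> Dict[str, bool]:
    """Report whether each dependency listed above can be found."""
    return {
        # Python packages: Pillow is imported under the module name "PIL".
        "pdf2image": importlib.util.find_spec("pdf2image") is not None,
        "Pillow": importlib.util.find_spec("PIL") is not None,
        # poppler-utils provides the pdftoppm binary that pdf2image relies on.
        "poppler (pdftoppm)": shutil.which("pdftoppm") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it with the same interpreter that will run the script (for example through `wsl python3`), since packages installed inside WSL are not visible to a Windows-side Python.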

Workflow

Step 1: Convert PDF to Images

wsl python3 scripts/pdf_to_images.py "/mnt/c/path/to/document.pdf"

This creates a document_pages/ directory in which each page is saved as an image:

```
page_001.png
page_002.png
page_003.png
...
```

Step 2: Analyze Each Image

Use the Read tool to load each image sequentially and analyze its content.

Example Instructions for Analysis:

Please provide a detailed description of this image's content including:
- Titles and headings
- Body text
- Diagram and chart descriptions
- Graph and chart data
- Key points

Step 3: Integrate into Markdown

Integrate the analysis results from each page to create a single Markdown file.

Usage Examples

Example 1: Convert Presentation Materials to Markdown

User: "Analyze presentation.pdf using vision and convert it to Markdown"
Assistant:
1. Convert the PDF to images using scripts/pdf_to_images.py
2. Load each image with the Read tool
3. Analyze each page's content (titles, diagrams, text)
4. Integrate analysis results from all pages
5. Save as a Markdown file using the Write tool

Example 2: Analyze Specific Pages Only

User: "Analyze only pages 5-10 of document.pdf"
Assistant:
1. Convert the PDF to images (all pages)
2. Load only page_005.png to page_010.png using Read
3. Convert the relevant pages' content to Markdown
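Because every page is rendered regardless of the requested range, selecting pages comes down to picking the right file names. As a rough sketch, the helper below lists the image paths for an inclusive page range, assuming the zero-padded `page_NNN.png` naming shown in the workflow; the function name `pages_in_range` is hypothetical.

```python
from pathlib import Path
from typing import List


def pages_in_range(pages_dir: str, first: int, last: int) -> List[Path]:
    """Return image paths for pages first..last (1-based, inclusive)."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    # File names follow the page_001.png, page_002.png, ... pattern.
    return [Path(pages_dir) / f"page_{n:03d}.png" for n in range(first, last + 1)]


# Pages 5-10 of document.pdf, as in Example 2.
for path in pages_in_range("document_pages", 5, 10):
    print(path)
```

pdf2image's `convert_from_path` also accepts `first_page` and `last_page` arguments, so a modified script could render only the requested pages; the workflow described here does not assume that.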

Analysis Perspectives

Automatically Extracted Information

The following information is extracted from each page image:

Text Information
- Titles and headings
- Body text
- Bullet point lists
- Annotations and captions
Diagrams and Charts
- Diagram type (flowchart, organizational chart, etc.)
- Diagram description and summary
- Key elements and relationships
Graphs and Charts
- Graph type (bar graph, pie chart, etc.)
- Axis labels
- Key data points
- Trends and patterns
Tables
- Table structure
- Header rows
- Data content
- Conversion to Markdown table format
Layout and Structure
- Overall page layout
- Section divisions
- Highlighted information
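For the table item above, the conversion step can be illustrated in isolation. This is a minimal sketch that assumes the header and rows have already been read from the page image as plain strings; `to_markdown_table` is a hypothetical helper, not something the skill ships.

```python
from typing import Sequence


def to_markdown_table(header: Sequence[str], rows: Sequence[Sequence[str]]) -> str:
    """Render a header row and data rows as a Markdown pipe table."""

    def fmt(cells: Sequence[str]) -> str:
        # Escape pipes so cell text cannot break the column structure.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines += [fmt(row) for row in rows]
    return "\n".join(lines)


print(to_markdown_table(["Quarter", "Revenue"], [["Q1", "10"], ["Q2", "12"]]))
```

The sketch does not pad or align columns; Markdown renderers ignore that spacing, so it only affects how the raw file looks.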

Markdown Output Format

[PDF Title]

Analysis Date: YYYY-MM-DD
Total Pages: N

Page 1: [Page Title]

Overview

[Page overview description]

Key Content

[Point 1]
[Point 2]

Diagrams and Charts

Figure 1: [Diagram Title]
[Diagram description]

Text Content

[Page text content]

Page 2: [Page Title]

...
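One way to picture the integration step is a small function that fills this template from per-page results. The sketch below is illustrative only: `PageAnalysis` and `build_report` are hypothetical names, and the `#`, `##` and `###` heading levels are an assumption, since the template as shown does not indicate which levels are intended.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageAnalysis:
    """Hypothetical container for what was extracted from one page image."""
    title: str
    overview: str
    key_points: List[str] = field(default_factory=list)
    figures: List[str] = field(default_factory=list)
    text: str = ""


def build_report(pdf_title: str, analysis_date: str, pages: List[PageAnalysis]) -> str:
    """Assemble per-page analyses into one Markdown document."""
    lines = [f"# {pdf_title}", "", f"Analysis Date: {analysis_date}",
             f"Total Pages: {len(pages)}", ""]
    for number, page in enumerate(pages, start=1):
        # Heading levels below are assumed, not taken from the template.
        lines += [f"## Page {number}: {page.title}", "", "### Overview", "", page.overview, ""]
        lines += ["### Key Content", ""]
        lines += [f"- {point}" for point in page.key_points]
        lines += ["", "### Diagrams and Charts", ""]
        lines += [f"Figure {i}: {desc}" for i, desc in enumerate(page.figures, start=1)]
        lines += ["", "### Text Content", "", page.text, ""]
    return "\n".join(lines)


pages = [PageAnalysis("Intro", "Opening slide", ["Goal"], ["Flowchart of the process"], "Welcome")]
print(build_report("Demo", "2024-01-01", pages))
```

The returned string can then be saved with the Write tool, as in Example 1.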


Script Details

pdf_to_images.py

Features:

- Converts each PDF page to a PNG image
- Configurable resolution (default: 200 DPI)
- Creates the output directory automatically

Usage:

python scripts/pdf_to_images.py <pdf_path> [output_dir] [dpi]

Example

python scripts/pdf_to_images.py document.pdf ./images 300


**Output:**
- `[pdf_name]_pages/page_001.png`
- `[pdf_name]_pages/page_002.png`
- ...
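The script's source is not reproduced here, so the following is only a sketch of how a converter with these features could be written on top of pdf2image. The helper names `default_output_dir` and `page_filename` are hypothetical, and placing the `[pdf_name]_pages` folder next to the PDF is an assumption; the real script may resolve the default location differently.

```python
from pathlib import Path
from typing import List, Optional


def default_output_dir(pdf_path: str) -> Path:
    """Return '<pdf_name>_pages' beside the PDF (location is an assumption)."""
    pdf = Path(pdf_path)
    return pdf.parent / f"{pdf.stem}_pages"


def page_filename(page_number: int) -> str:
    """Return the zero-padded file name for a 1-based page number."""
    return f"page_{page_number:03d}.png"


def pdf_to_images(pdf_path: str, output_dir: Optional[str] = None, dpi: int = 200) -> List[Path]:
    """Render every page of the PDF to a PNG file and return the written paths."""
    # Imported lazily so the helpers above work without pdf2image installed.
    from pdf2image import convert_from_path

    out_dir = Path(output_dir) if output_dir else default_output_dir(pdf_path)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # convert_from_path returns one PIL image per page, in page order.
    for number, image in enumerate(convert_from_path(pdf_path, dpi=dpi), start=1):
        target = out_dir / page_filename(number)
        image.save(target, "PNG")
        written.append(target)
    return written
```

Under these assumptions, `pdf_to_images("document.pdf", "./images", 300)` would correspond to the example command above.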

Supported Content

✅ Text (Japanese, English)
✅ Diagrams and charts
✅ Graphs and charts
✅ Tables
✅ Screenshots
✅ Infographics
✅ Complex layouts
⚠️ Handwritten notes (accuracy depends on conditions)
⚠️ Low-resolution images (possible accuracy reduction)
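Resolution is partly under your control through the script's `dpi` argument. As a rough guide to what a given setting produces, the sketch below estimates the pixel dimensions of a rendered page from its physical size; `pixel_size` is a hypothetical helper, and the actual renderer may differ by a pixel because of rounding.

```python
from typing import Tuple

MM_PER_INCH = 25.4


def pixel_size(width_mm: float, height_mm: float, dpi: int = 200) -> Tuple[int, int]:
    """Return the approximate (width, height) in pixels of a page rendered at dpi."""
    return (round(width_mm / MM_PER_INCH * dpi), round(height_mm / MM_PER_INCH * dpi))


# An A4 page (210 mm x 297 mm) at the default 200 DPI and at 300 DPI.
print(pixel_size(210, 297))       # (1654, 2339)
print(pixel_size(210, 297, 300))  # (2480, 3508)
```

Raising the DPI increases image size quadratically, so it is worth doing only when small text or fine chart detail is hard to read at the default.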

Differences from Text Extraction

pdf-reader (Text Extraction)

✅ Fast for text-only PDFs
✅ Pure text extraction
❌ Cannot extract diagrams and charts
❌ Layout is simplified