ebook-analysis

Ebook Analysis: Non-Fiction Knowledge Extraction

You analyze ebooks to extract knowledge with full citation traceability. This skill supports two complementary extraction modes:

Concept Extraction - Extract ideas classified by abstraction (principle → tactic)
Entity Extraction - Extract named things (studies, researchers, frameworks, anecdotes) that persist across books

Core Principle

Every extraction must be traceable to its exact source. Citation traceability is non-negotiable. Extract less with full provenance rather than more without it.

Two Extraction Modes

Mode 1: Concept Extraction

For extracting IDEAS organized by abstraction level.

Use when: Analyzing a book for transferable ideas, building a concept taxonomy, understanding how abstract principles relate to concrete tactics.

Output: JSON files (analysis.json, concepts.json)

Example: "Spaced repetition improves retention" is a MECHANISM at Layer 2.

Mode 2: Entity Extraction

For extracting NAMED THINGS that can be cross-referenced across books.

Use when: Building a knowledge base where the same study, researcher, or framework appears in multiple books. The goal is entity resolution—recognizing that "Hogarth's framework" in Range is the same as "kind/wicked environments" mentioned elsewhere.

Output: Markdown files in knowledge base structure

Example: "Kind vs Wicked Environments" is a FRAMEWORK by Robin Hogarth.

Choosing a Mode

If you want to...	Use Mode
Understand a book's argument structure	Concept Extraction
Build a reference library across books	Entity Extraction
Create actionable takeaways	Concept Extraction
Track what researchers say across sources	Entity Extraction
Both	Run both modes sequentially

Entity Extraction Mode (Detailed)

Entity Types

Type	What It Captures	Example
study	Research findings, experiments, data	Flynn Effect, Marshmallow Test
researcher	People and their contributions	Anders Ericsson, Robin Hogarth
framework	Mental models, taxonomies, systems	Kind vs Wicked, Desirable Difficulties
anecdote	Stories used to illustrate points	Tiger vs Roger, Challenger Disaster
concept	Ideas that aren't frameworks	Cognitive entrenchment, Match quality

Extended Entity Type Guidance

Some entities don't fit cleanly into the five types. Guidelines:

Entity Kind	Use Type	Rationale
Simulations/Games (Superstruct, EVOKE)	anecdote	Illustrative events, even if hypothetical
Institutions (IFTF, WEF)	researcher	Organizations contribute ideas like individuals
Historical events (Challenger disaster)	anecdote	Stories that illustrate principles
Hypothetical scenarios	anecdote	Future scenarios from books like Imaginable
Thought experiments	framework	If systematic; otherwise concept

When uncertain: Default to `anecdote` for narratives/events, `concept` for ideas, `framework` for systematic methods.

Author-as-Subject Pattern

When the book's author is also a significant entity (e.g., Jane McGonigal in Imaginable):

Create a researcher entity if:

Author has notable prior work or institutional affiliation
Author appears in Wikipedia or other reference sources
Author's background/credentials are relevant to understanding the book
Other books in your collection might reference them

Skip if:

Author is primarily known only for this book
No external sources to verify/enrich the entity

Template addition for author-subjects:

markdown

Note

This researcher is the author of [Book] in our collection. Their frameworks and concepts are documented separately.

Entity File Template

markdown

[Entity Name]

Summary

[2-3 sentence synthesized understanding]

Key Findings / What It Illustrates

[Claim or finding with source] — Source: [Book], Ch.[X]
[Another claim] — Source: [Book], Ch.[X]

Key Quotes

"Quotable text here."

"Another memorable quote."

Sources in Collection

Book	Author	How It's Used	Citation
Range	Epstein	[Role in book]	Ch.X

Sources NOT in Collection

[Book that would enrich this entity]

Related Entities

Open Questions

[What we don't yet know]

Knowledge Base Structure

/knowledge/
├── _index.md                    # Master registry
├── _entities.json               # Searchable index (generated)
│
├── nonfiction/
│   ├── _index.md                # Domain index
│   ├── _[book]-quotes.md        # Book-specific quotes file
│   ├── studies/
│   │   ├── flynn-effect.md
│   │   └── chase-simon-chunking.md
│   ├── researchers/
│   │   ├── hogarth-robin.md
│   │   └── tetlock-philip.md
│   ├── frameworks/
│   │   ├── kind-vs-wicked-environments.md
│   │   └── desirable-difficulties.md
│   ├── anecdotes/
│   │   ├── tiger-vs-roger.md
│   │   └── challenger-disaster.md
│   └── concepts/
│       ├── cognitive-entrenchment.md
│       └── match-quality.md
│
├── cooking/                     # Domain-specific structure
│   ├── techniques/
│   ├── ingredients/
│   └── equipment/
│
└── technical/
    ├── patterns/
    └── technologies/

Quotes Extraction

Quotable quotes are a distinct extraction type. For each book, create a quotes file:

File: `_[book-slug]-quotes.md`

Structure:

markdown

Quotable Quotes from [Book Title]

Author: [Author] Last Updated: YYYY-MM-DD

On [Theme 1]

"Quote text here."

"Another quote on same theme."

On [Theme 2]

"Quote on different theme."

What makes a good quote:
- Memorable phrasing that captures a key insight
- Self-contained (understandable without context)
- Surprising or counterintuitive formulation
- Useful for presentations, writing, or reference

Entity Extraction Workflow

Scan book - Read through identifying named studies, researchers, frameworks, illustrative stories
Check existing entities - Use `kb-resolve-entity.ts` to see if entity already exists
Create or update - New entity → create file; existing → add as source
Add quotes - Extract memorable quotes to quotes file
Cross-link - Add Related Entities sections
Regenerate index - Run `kb-generate-index.ts`

Entity Extraction States (KB0-KB5)

State	Symptoms	Intervention
KB0	No knowledge base	Create directory structure
KB1	Structure exists, no entities	Begin extraction
KB2	Extracting from book	Create entity files
KB3	Entities created, not linked	Add Related Entities
KB4	Linked, no index	Run kb-generate-index.ts
KB5	Complete for this book	Proceed to next book
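
The state table can be read as a first-match check on what the knowledge base currently looks like. The sketch below shows one way to express that reading in TypeScript. The `KbSnapshot` fields and the `detectKbState` function are hypothetical names introduced for illustration; they are not part of the skill's scripts.

```typescript
// Hypothetical snapshot of a knowledge base. None of these field names come
// from the skill's scripts; they only mirror the Symptoms column.
interface KbSnapshot {
  hasDirectoryStructure: boolean; // the knowledge/ tree exists
  extractionInProgress: boolean;  // a book is currently being worked through
  entityCount: number;            // entity files created so far
  unlinkedEntityCount: number;    // entities without Related Entities links
  indexIsCurrent: boolean;        // _entities.json rebuilt after the last change
}

type KbState = "KB0" | "KB1" | "KB2" | "KB3" | "KB4" | "KB5";

// Interventions copied from the table above.
const kbInterventions: Record<KbState, string> = {
  KB0: "Create directory structure",
  KB1: "Begin extraction",
  KB2: "Create entity files",
  KB3: "Add Related Entities",
  KB4: "Run kb-generate-index.ts",
  KB5: "Proceed to next book",
};

// Returns the first matching state. In-progress extraction is checked before
// the empty-KB case, so a book that has just been started counts as KB2.
function detectKbState(kb: KbSnapshot): KbState {
  if (!kb.hasDirectoryStructure) return "KB0";
  if (kb.extractionInProgress) return "KB2";
  if (kb.entityCount === 0) return "KB1";
  if (kb.unlinkedEntityCount > 0) return "KB3";
  if (!kb.indexIsCurrent) return "KB4";
  return "KB5";
}

const kbState = detectKbState({
  hasDirectoryStructure: true,
  extractionInProgress: false,
  entityCount: 12,
  unlinkedEntityCount: 0,
  indexIsCurrent: false,
});
console.log(kbState, "->", kbInterventions[kbState]); // KB4 -> Run kb-generate-index.ts
```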

Cross-Book Synthesis Workflow

Triggered when: 2+ books have been extracted to the knowledge base.

Goals:

Find entities that appear in multiple books
Identify conceptual connections between books
Surface contradictions or complementary perspectives
Update entity files with multi-source synthesis

Process:

Entity overlap detection

bash

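# Find entities with 2+ sources: count data rows (not the header or separator) in each Sources in Collection table
shopt -s globstar  # bash only: lets ** match nested directories
for f in knowledge/nonfiction/**/*.md; do
  rows=$(awk '/Sources in Collection/{t=1;next} /^#|Sources NOT/{t=0} t&&/^\|/&&!/^\| *(Book|:?-)/' "$f" | wc -l)
  [ "$rows" -ge 2 ] && echo "$f"
done | head -20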

Or manually review the entities that were updated with a new source.

Conceptual connection mapping
- Compare frameworks across books (e.g., Range's "wicked environments" ↔ Imaginable's "futures thinking")
- Identify shared researchers (e.g., Tetlock appears in both Range and Imaginable)
- Look for complementary themes (prediction failure → preparation despite uncertainty)
Synthesis documentation
For entities appearing in 2+ books, update the Summary section:
markdown
```
## Summary
[Synthesized understanding from BOTH sources, noting agreements and differences]
```

Cross-book insights
Document thematic connections in `context/insights/cross-book-{theme}.md`:

markdown

# Cross-Book Insight: [Theme]

## Books Contributing
- Range (Epstein) - [perspective]
- Imaginable (McGonigal) - [perspective]

## Synthesis
[How the books complement or contradict each other]
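
Entity overlap detection comes down to counting how many books each entity file lists under Sources in Collection. The TypeScript sketch below does that on file contents already loaded into memory. It assumes the table is a Markdown pipe table whose header row starts with "Book"; `countCollectionSources` and `entitiesWithMultipleSources` are hypothetical helper names, not existing scripts.

```typescript
// Counts data rows in the "Sources in Collection" table of one entity file.
// Assumption: a Markdown pipe table whose header row starts with "Book".
function countCollectionSources(markdown: string): number {
  let inSection = false;
  let count = 0;
  for (const raw of markdown.split("\n")) {
    const line = raw.trim();
    if (line.indexOf("Sources in Collection") >= 0) {
      inSection = true; // the heading itself; table rows follow
      continue;
    }
    if (line.charAt(0) === "#") inSection = false; // any later heading ends the section
    if (!inSection || line.charAt(0) !== "|") continue;
    const firstCell = (line.split("|")[1] || "").trim();
    const isHeader = firstCell === "Book";
    const isSeparator = /^:?-+:?$/.test(firstCell);
    if (!isHeader && !isSeparator) count += 1;
  }
  return count;
}

// Returns the paths of entity files that cite two or more books, sorted by path.
function entitiesWithMultipleSources(files: Record<string, string>): string[] {
  return Object.keys(files)
    .filter((path) => countCollectionSources(files[path]) >= 2)
    .sort();
}

// Placeholder entity file following the template's section names.
const sampleEntityFile = [
  "# [Entity Name]",
  "",
  "## Sources in Collection",
  "| Book | Author | How It's Used | Citation |",
  "|------|--------|---------------|----------|",
  "| [Book A] | [Author] | [Role in book] | Ch.X |",
  "| [Book B] | [Author] | [Role in book] | Ch.X |",
  "",
  "## Sources NOT in Collection",
  "- [Book that would enrich this entity]",
].join("\n");

console.log(countCollectionSources(sampleEntityFile)); // 2
```

Entities returned by `entitiesWithMultipleSources` are the ones whose Summary section is due for a multi-source rewrite.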

Concept Extraction Mode (Detailed)

Concept Types (Abstract → Concrete)

Type	Definition	Example
Principle	Foundational truth or axiom	"Communities form around shared identity"
Mechanism	How something works	"Reciprocity creates social bonds"
Pattern	Recurring structure or approach	"The community lifecycle pattern"
Strategy	High-level approach to achieve goals	"Build trust before asking for contribution"
Tactic	Specific actionable technique	"Send welcome emails within 24 hours"

Abstraction Layers

Layer	Name	Abstraction	Example
0	Foundational	Universal principles	"Humans seek belonging"
1	Theoretical	Domain-specific theory	"Community requires shared purpose"
2	Strategic	Approaches and frameworks	"The funnel model of engagement"
3	Tactical	Specific methods	"Onboarding sequences"
4	Specific	Concrete implementations	"Use Discourse for forums"

Relationship Types

Relationship	Meaning	When to Use
INFLUENCES	A affects B	Causal or correlational connection
SUPPORTS	A provides evidence for B	Citation, example, validation
CONTRADICTS	A conflicts with B	Opposing claims
COMPOSED_OF	A contains B	Part-whole relationships
DERIVES_FROM	A is derived from B	Logical conclusions
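
The three tables above (types, layers, relationships) suggest a small data model. The TypeScript sketch below is one possible shape. The field names are assumptions and may not match the actual schema of analysis.json or concepts.json, and the layers assigned to the two sample concepts are illustrative guesses.

```typescript
type ConceptType = "principle" | "mechanism" | "pattern" | "strategy" | "tactic";
type AbstractionLayer = 0 | 1 | 2 | 3 | 4;
type RelationshipType =
  | "INFLUENCES"
  | "SUPPORTS"
  | "CONTRADICTS"
  | "COMPOSED_OF"
  | "DERIVES_FROM";

// Field names are assumptions, not the schema of analysis.json.
interface Concept {
  id: string;
  statement: string;
  type: ConceptType;
  layer: AbstractionLayer;
  quote: string;   // exact words from the book
  chapter: string; // chapter reference for the quote
}

interface Relationship {
  from: string; // concept id (A)
  type: RelationshipType;
  to: string;   // concept id (B)
}

// Lists problems that would break traceability or leave dangling links.
function findGraphProblems(concepts: Concept[], relationships: Relationship[]): string[] {
  const problems: string[] = [];
  const ids = concepts.map((c) => c.id);
  concepts.forEach((c) => {
    if (c.quote.trim().length === 0) problems.push(c.id + ": missing exact quote");
    if (c.chapter.trim().length === 0) problems.push(c.id + ": missing chapter reference");
  });
  relationships.forEach((r) => {
    [r.from, r.to].forEach((id) => {
      if (ids.indexOf(id) < 0) problems.push(r.type + ": unknown concept " + id);
    });
  });
  return problems;
}

// Statements come from the Concept Types table; layers, quotes, and chapters are placeholders.
const sampleConcepts: Concept[] = [
  { id: "c1", statement: "Reciprocity creates social bonds", type: "mechanism", layer: 1, quote: "[exact quote]", chapter: "Ch.X" },
  { id: "c2", statement: "Build trust before asking for contribution", type: "strategy", layer: 2, quote: "[exact quote]", chapter: "Ch.X" },
];
const sampleRelationships: Relationship[] = [{ from: "c1", type: "INFLUENCES", to: "c2" }];

console.log(findGraphProblems(sampleConcepts, sampleRelationships).length); // 0
```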

Concept Extraction States (EA0-EA7)

State	Symptoms	Intervention
EA0	No input file	Guide file preparation
EA1	Raw file, not parsed	Run ea-parse.ts
EA2	Parsed, not extracted	LLM extracts concepts
EA3	Extracted, not classified	Assign types and layers
EA4	Classified, not annotated	Add themes, relationships
EA5	Single book complete	Export or proceed to synthesis
EA6	Multi-book ready	Cross-book synthesis
EA7	Analysis complete	Generate reports

Concept Extraction Workflow

Parse - Run `ea-parse.ts` to chunk book with position tracking
Extract - Present chunks to LLM for concept identification with exact quotes
Classify - Assign type (principle→tactic) and layer (0-4)
Annotate - Add themes and functional analysis
Link - Connect related concepts
Export - Generate analysis.json, concepts.json, report.md

Available Tools

Parsing Tools

ea-parse.ts

Parse ebook files into chunks with metadata and position tracking.

bash

deno run --allow-read scripts/ea-parse.ts path/to/book.txt
deno run --allow-read scripts/ea-parse.ts path/to/book.epub --format epub
deno run --allow-read scripts/ea-parse.ts book.txt --chunk-size 1500 --overlap 150

Output: JSON with metadata, chapters (if detected), and chunks with positions.
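
To make `--chunk-size` and `--overlap` concrete, the TypeScript sketch below shows overlapping chunks that keep their start and end positions. It is not the implementation of ea-parse.ts. It assumes both values are measured in characters (the real script may count words or tokens), and `chunkText` and the `Chunk` fields are hypothetical names.

```typescript
interface Chunk {
  index: number; // order of the chunk in the book
  start: number; // offset of the first character (inclusive)
  end: number;   // offset just past the last character (exclusive)
  text: string;
}

// Splits text into windows of chunkSize characters, each sharing `overlap`
// characters with the previous window. Units are an assumption.
function chunkText(text: string, chunkSize: number, overlap: number): Chunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("chunkSize must be positive and overlap must be in [0, chunkSize)");
  }
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push({ index: chunks.length, start, end, text: text.slice(start, end) });
    if (end === text.length) break; // the last window reached the end of the book
  }
  return chunks;
}

const demoChunks = chunkText("x".repeat(4000), 1500, 150);
console.log(demoChunks.map((c) => [c.start, c.end])); // [[0, 1500], [1350, 2850], [2700, 4000]]
```

Because every chunk carries its own `start` offset, a quote found inside a chunk can be mapped back to an absolute position in the book, which is what citation traceability depends on.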

Knowledge Base Tools

kb-generate-index.ts

Scan knowledge base and generate searchable entity index.

bash

deno run --allow-read --allow-write scripts/kb-generate-index.ts /path/to/knowledge

Output: Creates `_entities.json` with all entities, aliases, and metadata.

kb-resolve-entity.ts

Search for existing entities before creating duplicates.

bash

deno run --allow-read scripts/kb-resolve-entity.ts "Flynn Effect"
deno run --allow-read scripts/kb-resolve-entity.ts "Hogarth" --threshold 0.5
deno run --allow-read scripts/kb-resolve-entity.ts "kind learning" --json

Options:

`--threshold <0-1>` - Minimum match score (default: 0.3)
`--limit <n>` - Maximum results (default: 5)
`--json` - Output as JSON
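
The sketch below illustrates how a match score, a threshold, and a result limit can interact. It is not the scoring used by kb-resolve-entity.ts, which this document does not describe. As a stand-in it uses Jaccard similarity over lowercase word tokens, taking the best score across an entity's name and aliases. `IndexedEntity`, `resolveEntity`, and the sample aliases are hypothetical.

```typescript
// Hypothetical shape of one record in the entity index.
interface IndexedEntity {
  name: string;
  aliases: string[];
  path: string;
}

interface EntityMatch {
  entity: IndexedEntity;
  score: number;
}

function tokenize(s: string): string[] {
  return s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Jaccard similarity: shared distinct tokens divided by all distinct tokens.
function jaccard(a: string[], b: string[]): number {
  const inA: Record<string, boolean> = {};
  const inB: Record<string, boolean> = {};
  a.forEach((t) => { inA[t] = true; });
  b.forEach((t) => { inB[t] = true; });
  const keysA = Object.keys(inA);
  const shared = keysA.filter((t) => inB[t] === true).length;
  const unionSize = keysA.length + Object.keys(inB).length - shared;
  return unionSize === 0 ? 0 : shared / unionSize;
}

// Defaults mirror the documented option defaults (0.3 and 5).
function resolveEntity(
  query: string,
  index: IndexedEntity[],
  threshold: number = 0.3,
  limit: number = 5,
): EntityMatch[] {
  const q = tokenize(query);
  return index
    .map((entity) => {
      const labels = [entity.name].concat(entity.aliases);
      const score = Math.max(...labels.map((label) => jaccard(q, tokenize(label))));
      return { entity, score };
    })
    .filter((m) => m.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

// Sample index; the aliases are made up for illustration.
const sampleIndex: IndexedEntity[] = [
  { name: "Robin Hogarth", aliases: ["Hogarth"], path: "researchers/hogarth-robin.md" },
  { name: "Flynn Effect", aliases: [], path: "studies/flynn-effect.md" },
  { name: "Kind vs Wicked Environments", aliases: ["kind learning environments"], path: "frameworks/kind-vs-wicked-environments.md" },
];

console.log(resolveEntity("kind learning", sampleIndex).map((m) => m.entity.path));
```

With this scoring, "kind learning" only clears the default threshold because of the alias, which is why recording aliases in the index matters for avoiding duplicates.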

Validation Tools

ea-validate.ts

Validate analysis output for citation accuracy and schema completeness.

bash

deno run --allow-read scripts/ea-validate.ts analysis.json --report

Anti-Patterns

The Extraction Flood

Pattern: Extracting every potentially interesting phrase. Fix: Ask "Would I cite this?" before extracting. Quality over quantity.

The Citation Black Hole

Pattern: Extracting without preserving exact quotes or positions. Fix: Always capture: exact quote, chapter reference, context.

The Duplicate Entity

Pattern: Creating new entity without checking if it exists. Fix: Always run `kb-resolve-entity.ts` first.

The Orphan Entity

Pattern: Entities without Related Entities links. Fix: Every entity should connect to at least 2 others.
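
The "at least 2 others" rule is easy to check mechanically once the Related Entities links are available as data. The TypeScript sketch below assumes they have been collected into a map from entity slug to linked slugs. That shape, the `findUnderlinkedEntities` name, and the sample links are illustrative assumptions.

```typescript
// Maps each entity slug to the slugs listed under its Related Entities section.
// The shape is an assumption for illustration; the skill does not define it.
type RelatedLinks = Record<string, string[]>;

// Returns entities with fewer than minLinks distinct links to other known entities.
function findUnderlinkedEntities(links: RelatedLinks, minLinks: number = 2): string[] {
  const known = Object.keys(links);
  return known
    .filter((slug) => {
      // Ignore self-links and links to entities that do not exist.
      const valid = links[slug].filter((target) => target !== slug && known.indexOf(target) >= 0);
      // Count each linked entity once.
      const unique = valid.filter((target, i) => valid.indexOf(target) === i);
      return unique.length < minLinks;
    })
    .sort();
}

// Made-up links between entities named in the directory tree.
const sampleLinks: RelatedLinks = {
  "kind-vs-wicked-environments": ["hogarth-robin", "tiger-vs-roger"],
  "hogarth-robin": ["kind-vs-wicked-environments"],
  "tiger-vs-roger": ["kind-vs-wicked-environments", "hogarth-robin"],
};

console.log(findUnderlinkedEntities(sampleLinks)); // ["hogarth-robin"]
```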

The Quote-Free Entity

Pattern: Entity captures ideas but no memorable phrasing. Fix: Include Key Quotes section with author's exact words.

The Single-Book Silo

Pattern: Analyzing books without cross-referencing. Fix: After 2+ books, run synthesis to find connections.

Example Workflows

Full Entity Extraction (Range Example)

1. Scan book chapter by chapter
2. Identify all named studies, researchers, frameworks, anecdotes
3. Create inventory document listing all potential entities
4. For each entity:
   a. kb-resolve-entity.ts "[entity name]" to check existence
   b. Create markdown file in appropriate type directory
   c. Fill in template with findings and citations
   d. Add Key Quotes section
5. Create _range-quotes.md with all memorable quotes
6. Update _index.md with new entities
7. kb-generate-index.ts to rebuild _entities.json

Quick Concept Scan

1. ea-parse.ts book.txt --chunk-size 2000
2. For each chunk, extract top 3-5 concepts
3. Classify by type and layer
4. Generate concepts.json and report.md

Output Persistence

Entity Extraction Output

File	Location
Entity files	`knowledge/{domain}/{type}/{entity-slug}.md`
Quotes file	`knowledge/{domain}/_[book]-quotes.md`
Entity index	`knowledge/_entities.json`
Domain index	`knowledge/{domain}/_index.md`
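
The entity file pattern can be turned into a small path builder. The TypeScript sketch below assumes the `{type}` segment is the plural directory name shown in the Knowledge Base Structure tree for the nonfiction domain, and that slugs are lowercase words joined by hyphens. `slugify` and `entityPath` are hypothetical helpers.

```typescript
type EntityType = "study" | "researcher" | "framework" | "anecdote" | "concept";

// Directory names as they appear in the nonfiction tree.
const typeDirectory: Record<EntityType, string> = {
  study: "studies",
  researcher: "researchers",
  framework: "frameworks",
  anecdote: "anecdotes",
  concept: "concepts",
};

// Lowercases the name and joins runs of letters and digits with hyphens.
function slugify(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function entityPath(domain: string, type: EntityType, name: string): string {
  return "knowledge/" + domain + "/" + typeDirectory[type] + "/" + slugify(name) + ".md";
}

console.log(entityPath("nonfiction", "framework", "Kind vs Wicked Environments"));
// knowledge/nonfiction/frameworks/kind-vs-wicked-environments.md

// Researcher files in the tree are surname-first, so pass the name that way.
console.log(entityPath("nonfiction", "researcher", "Hogarth, Robin"));
// knowledge/nonfiction/researchers/hogarth-robin.md
```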

Concept Extraction Output

File	Location
Full analysis	`ebook-analysis/{author}-{title}/analysis.json`
Concepts only	`ebook-analysis/{author}-{title}/concepts.json`
Citations	`ebook-analysis/{author}-{title}/citations.json`
Report	`ebook-analysis/{author}-{title}/report.md`

Verification (Oracle)

What This Skill Can Verify

Citation positions exist - Validate quoted text appears at claimed position
Schema completeness - Required fields present
Cross-reference integrity - Referenced entities exist
Duplicate detection - Entity doesn't already exist (via kb-resolve-entity.ts)
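
The first check, that quoted text appears at its claimed position, can be sketched as a plain string comparison. The TypeScript below is an illustration rather than the logic of ea-validate.ts. It assumes positions are character offsets into the parsed book text, and `QuoteCitation` and `checkCitation` are hypothetical names.

```typescript
interface QuoteCitation {
  quote: string;
  position: number; // character offset into the source text (assumed unit)
}

// "exact": quote found at the claimed position
// "moved": quote exists in the source, but not at that position
// "missing": quote does not appear in the source at all
type CitationStatus = "exact" | "moved" | "missing";

function checkCitation(source: string, citation: QuoteCitation): CitationStatus {
  const { quote, position } = citation;
  if (quote.length === 0) return "missing";
  if (position >= 0 && source.slice(position, position + quote.length) === quote) {
    return "exact";
  }
  return source.indexOf(quote) >= 0 ? "moved" : "missing";
}

const sourceText = "Intro. Citation traceability is non-negotiable. Outro.";
const quoteText = "Citation traceability is non-negotiable.";

console.log(checkCitation(sourceText, { quote: quoteText, position: sourceText.indexOf(quoteText) })); // exact
console.log(checkCitation(sourceText, { quote: quoteText, position: 0 })); // moved
console.log(checkCitation(sourceText, { quote: "Not in the book.", position: 0 })); // missing
```

A "moved" result usually means the book was re-parsed with different chunk settings, so the quote is still valid but its stored position needs refreshing.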

What Requires Human Judgment

Significance - Is this worth extracting?
Classification - Is this really a "framework" vs "concept"?
Relationship validity - Does A really influence B?
Quote quality - Is this actually memorable?

Integration Graph

Inbound (From Other Skills)

Source	Leads to
research	Multi-book synthesis ready
reverse-outliner	Structural data for concept extraction

Outbound (To Other Skills)

From State	Leads to
Entity extraction complete	dna-extraction (deep functional analysis)
Concept extraction complete	media-meta-analysis (cross-source synthesis)

Complementary Skills

Skill	Relationship
dna-extraction	6-axis functional analysis for annotation
reverse-outliner	Structural approach for fiction
voice-analysis	Author style fingerprinting
context-network	Knowledge base maintenance

Calibration Data (from Range + Imaginable extractions)

By Book Density

Book Type	Expected Entities	Estimated Effort
Dense non-fiction (Range, Thinking, Fast and Slow)	60-100	4-6 hours
Moderate non-fiction (most business books)	30-50	2-3 hours
Light non-fiction (popular science)	15-30	1-2 hours
Technical books	20-40	2-3 hours

By Book Subtype

Different non-fiction subtypes yield different entity profiles:

Subtype	Example	Entity Profile	Expected Count
Research synthesis	Range	Many studies, researchers, frameworks	60-100
Methodological/How-to	Imaginable	Many frameworks, few studies	30-50
Memoir/Narrative	Educated	Few frameworks, many anecdotes	20-40
Reference	Technical manuals	Many concepts, few anecdotes	Variable

Research synthesis books cite many studies and researchers, connecting ideas across domains. Methodological books teach techniques and frameworks but cite fewer external sources. Memoir/narrative books use personal stories to illustrate points rather than research.

Metadata Reliability Warning

Book classification metadata (Calibre tags, library categories) is often:

Wrong - Fiction/non-fiction misclassified
Generic - "General Fiction" or "Self-Help" applied broadly
Inconsistent - Same book categorized differently across sources

Always verify that the classification makes sense before extraction. A "fiction" tag on a methodology book like Imaginable is a metadata error.

Reasoning Requirements

Standard Reasoning

Single chunk concept extraction
Type/layer classification
Simple relationship identification
Individual entity creation

Extended Reasoning (ultrathink)

Use extended thinking for:

Multi-book synthesis - requires holding multiple networks simultaneously
Contradiction detection - semantic comparison across sources
Theme emergence - identifying patterns across large sets
Knowledge gap identification - reasoning about what's missing

Trigger phrases: "synthesize across books", "find contradictions", "identify gaps", "comprehensive analysis"

What You Do NOT Do

Extract without citation traceability
Create entities without checking for duplicates
Skip the linking phase (orphan entities are not useful)
Leave entities without quotes
Treat fiction as non-fiction
Use regex for semantic analysis (LLM judgment only)
