speed-of-light


Speed of Light


"Many turns in one call. Instant communication. No round-trips."

What Is It?


Speed of Light is MOOLLM's approach to single-epoch simulation: multiple agents take multiple turns within one epoch, instead of separate API calls per turn. We prefer "single-epoch simulation" language to keep the focus on a shared context boundary, not an external coordinator.
Characters communicate telepathically. Objects react instantly. Rooms update in real-time. All within one epoch, then the boundary closes and state is written once.
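The epoch contract can be sketched in a few lines of Python. This is a minimal illustration only; `Epoch`, `speak`, and `close` are hypothetical names, not MOOLLM's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Epoch:
    """One LLM call: many turns happen inside it; state is written once at the end."""
    turns: list = field(default_factory=list)
    state: dict = field(default_factory=dict)
    closed: bool = False

    def speak(self, who: str, text: str) -> None:
        # Inside the epoch there is no serialization boundary between turns.
        assert not self.closed, "epoch boundary already closed"
        self.turns.append((who, text))

    def close(self) -> dict:
        # The boundary closes exactly once; state is transcribed in one write.
        self.closed = True
        self.state["turn_count"] = len(self.turns)
        return self.state

epoch = Epoch()
epoch.speak("Alice", "What do you think, Bob?")
epoch.speak("Bob", "I have concerns about the timeline.")
final_state = epoch.close()   # one write for the whole epoch
```

The key property is that `speak` never crosses an API boundary; only `close` does.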


The Problem with Round-Trips


Traditional approach:
API call 1: Alice speaks
  → serialize state to tokens (export)
  → wait 500ms
  → parse response tokens (import)
  → update state
  
API call 2: Bob responds  
  → re-serialize ALL context to tokens (export again)
  → wait 500ms
  → parse response tokens (import again)
  ...
Every export/import cycle introduces noise:
| Problem | Why It Hurts |
|---|---|
| Glacially slow | 500ms+ latency per turn |
| Token explosion | Re-emit entire context every call |
| Precision loss | Serialization rounds off nuance |
| Noise accumulation | Each boundary adds artifacts |
| Hallucination creep | LLM re-interprets context each time |
| State drift | No single coherent view across calls |
| Expensive | Paying for redundant tokens |
Token export then import is like making a photocopy of a photocopy — each generation loses fidelity. Characters forget subtle context. Conversations lose coherence. The world drifts.
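A back-of-envelope cost model makes the asymmetry concrete. The numbers (4,000-token context, 200 tokens per turn, 500 ms per call) and both function names are illustrative assumptions, not measurements:

```python
def round_trip_cost(turns, context_tokens, tokens_per_turn, latency_ms=500):
    """Traditional approach: every turn is its own call and re-sends the whole context."""
    total_tokens, context = 0, context_tokens
    for _ in range(turns):
        total_tokens += context        # export: re-serialize ALL context
        context += tokens_per_turn     # context grows with each reply
    return {"calls": turns, "tokens": total_tokens, "latency_ms": turns * latency_ms}

def single_epoch_cost(turns, context_tokens, tokens_per_turn, latency_ms=500):
    """Speed of light: one call carries every turn; context crosses the boundary once."""
    return {"calls": 1,
            "tokens": context_tokens + turns * tokens_per_turn,
            "latency_ms": latency_ms}

old = round_trip_cost(7, context_tokens=4000, tokens_per_turn=200)
new = single_epoch_cost(7, context_tokens=4000, tokens_per_turn=200)
# old: 7 calls, 32200 tokens, 3500 ms; new: 1 call, 5400 tokens, 500 ms
```

Even in this toy model, the round-trip version pays roughly 6x the tokens and 7x the latency for the same seven turns.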


Speed of Light Approach


Single API call:
  Alice: "What do you think, Bob?"
  Bob: "I have concerns about the timeline."
  Carol: "I agree with Bob."
  The Room: *temperature rises slightly*
  Alice: "Let me revise the proposal."
  Bob: "That's better."
  Carol: "I can support that."
  [State updated, log written]
[One call, seven turns]
10x faster. 10x cheaper. Perfect consistency.
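The single response is just a transcript, so the host only needs a small parser to recover individual turns. This is a minimal sketch assuming a `Speaker: text` line format; the regex and `parse_transcript` helper are hypothetical, not a fixed MOOLLM format:

```python
import re

# One "Speaker: text" turn per line; bracketed status lines are skipped.
TURN = re.compile(r"^(?P<who>[A-Z][\w ]*):\s*(?P<text>.+)$")

def parse_transcript(raw: str) -> list:
    """Recover (speaker, text) turns from a single-epoch response."""
    return [
        (m.group("who"), m.group("text"))
        for line in raw.splitlines()
        if (m := TURN.match(line.strip()))
    ]

raw = '''Alice: "What do you think, Bob?"
Bob: "I have concerns about the timeline."
The Room: *temperature rises slightly*
[State updated, log written]'''

turns = parse_transcript(raw)   # three turns recovered from one call
```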


How It Works


Context Window as Stage


The LLM's context window is a stage where all actors perform:
=== SCENE: Research Lab ===

Characters present:
- Alice (lead researcher) [curious, methodical]
- Bob (skeptic) [cautious, detail-oriented]
- Carol (synthesizer) [creative, connecting]

Objects:
- Microscope [shows sample data]
- Whiteboard [covered in diagrams]

Current state:
- Topic: Analyzing anomaly in data
- Tension: Bob doubts Alice's interpretation

--- ACTION ---
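A stage header like the one above can be assembled mechanically from scene data. `render_stage` is an illustrative helper under assumed data shapes, not part of MOOLLM:

```python
def render_stage(scene: str, characters: dict, objects: dict, state: dict) -> str:
    """Assemble the stage header the LLM sees at the start of an epoch."""
    lines = [f"=== SCENE: {scene} ===", "", "Characters present:"]
    lines += [f"- {name} [{traits}]" for name, traits in characters.items()]
    lines += ["", "Objects:"]
    lines += [f"- {name} [{desc}]" for name, desc in objects.items()]
    lines += ["", "Current state:"]
    lines += [f"- {key}: {value}" for key, value in state.items()]
    lines += ["", "--- ACTION ---"]
    return "\n".join(lines)

stage = render_stage(
    "Research Lab",
    {"Alice (lead researcher)": "curious, methodical",
     "Bob (skeptic)": "cautious, detail-oriented"},
    {"Microscope": "shows sample data", "Whiteboard": "covered in diagrams"},
    {"Topic": "Analyzing anomaly in data",
     "Tension": "Bob doubts Alice's interpretation"},
)
```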

Parallel Simulation


The LLM simulates all characters at once, maintaining distinct voices:
Alice: "The anomaly appears at exactly 3.7 seconds."

Bob: *frowns* "Sample size is too small. We need more data."

Carol: "What if we cross-reference with last month's results?"

The Microscope: *display flickers* "Dataset 7 loaded."

Alice: "Good idea, Carol. Bob, look at this correlation..."

Bob: *leans in* "Hmm. That's... actually compelling."
Each character speaks authentically. No one breaks frame.

State Transcription


At the end of the epoch, all changes are written to files:

session-log.md (appended)


Epoch 47 — Research Discussion


  • Alice raised anomaly at 3.7s
  • Bob requested more data
  • Carol suggested cross-reference
  • Microscope loaded dataset 7
  • Consensus: correlation is compelling

State Changes


  • whiteboard.yml: added "3.7s correlation" diagram
  • research-findings.yml: updated hypothesis

Streaming backends can persist the epoch as one grouped process with its parts tied to a shared identifier.
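A minimal sketch of that queue-then-flush pattern, with every record tied to one shared epoch identifier. `EpochTranscriber` and the record schema are assumptions for illustration:

```python
import json
import uuid

class EpochTranscriber:
    """Queue state changes during the epoch; flush them once at the boundary."""
    def __init__(self):
        self.epoch_id = str(uuid.uuid4())   # shared identifier for the whole epoch
        self.pending = []

    def queue(self, path: str, change: str) -> None:
        self.pending.append({"epoch": self.epoch_id, "file": path, "change": change})

    def flush(self) -> str:
        # One grouped write; every part carries the same epoch id.
        records, self.pending = self.pending, []
        return "\n".join(json.dumps(r) for r in records)

t = EpochTranscriber()
t.queue("whiteboard.yml", 'added "3.7s correlation" diagram')
t.queue("research-findings.yml", "updated hypothesis")
log = t.flush()   # two records, one shared epoch identifier
```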

---

Epoch Boundaries


An epoch is one LLM call. Within it:
  • ✅ Instant communication
  • ✅ Perfect consistency
  • ✅ Any number of turns
  • ✅ State changes queued
At epoch end:
  • 📝 State written to files
  • 📝 Log appended
  • ⏸️ System pauses for user or next trigger


Benefits


| Benefit | Why |
|---|---|
| Speed | One call vs. many |
| Cost | Fewer API calls |
| Consistency | All in one context |
| Coherence | LLM sees everything |
| Naturalness | Conversations flow |

The Killer App: Adversarial Committees


The most powerful use of speed-of-light: committee deliberation.
Traditional chat gives you the statistical center of all possible viewpoints. Speed-of-light enables ensemble inference — multiple perspectives debating within one call:
```yaml
committee:
  maya:      # Paranoid realist — surfaces traps
  frankie:   # Idealist — surfaces opportunities
  vic:       # Evidence prosecutor — demands proof
  tammy:     # Systems thinker — traces consequences
```
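Rendering that roster into a single debate prompt might look like the sketch below. The prompt wording and the `committee_prompt` helper are illustrative, not a fixed MOOLLM format:

```python
def committee_prompt(committee: dict, motion: str) -> str:
    """Render a committee roster into one adversarial-debate prompt for one epoch."""
    roster = "\n".join(f"- {name}: {role}" for name, role in committee.items())
    return (
        f"MOTION: {motion}\n"
        f"COMMITTEE:\n{roster}\n"
        "Each member argues strictly in character and cross-examines the others.\n"
        "Debate until positions stabilize, then record votes, all in this response."
    )

prompt = committee_prompt(
    {"maya": "paranoid realist (surfaces traps)",
     "frankie": "idealist (surfaces opportunities)",
     "vic": "evidence prosecutor (demands proof)",
     "tammy": "systems thinker (traces consequences)"},
    "Adopt the revised proposal",
)
```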

  • All debate at light speed
  • Cross-examination in one epoch
  • No round-trip noise


**Result:** Stories that survive adversarial debate are more robust than any single answer.

See: [adversarial-committee](../adversarial-committee/), [roberts-rules](../roberts-rules/)

---


The Sims Parallel


In The Sims, one game tick simulates all characters:
Tick 1:
  Sim A: walks to fridge
  Sim B: sits on couch
  Sim C: answers phone
  [All updated, frame rendered]
Same pattern. One "tick" = one LLM call. All agents move together.


Constraints


Characters must stay in character:
  • Knowledge limits — Alice doesn't know what Bob is thinking
  • Physical limits — Can't be in two rooms at once
  • Personality — Skeptic stays skeptical
The LLM is very good at maintaining these constraints. It's what acting IS.


Example: Problem Solving


=== SPEED OF LIGHT SESSION ===

User: "I need to debug this authentication bug."

[Epoch begins]

Debugger: "Let's trace the flow. Where does auth start?"

Codebase: *highlights auth.py* "Entry point is login()."

Debugger: "And where does it fail?"

Error Log: "Stack trace shows failure at line 47: token validation."

Debugger: "Token validation... Let me check the token format."

Codebase: *shows token.py* "Token uses JWT with RS256."

Debugger: "Aha! The key rotation happened yesterday. Checking..."

Config: "JWT_PUBLIC_KEY was updated 2024-01-14."

Debugger: "Found it. The old key is cached. Solution: restart the auth service or invalidate the cache."

[Epoch ends — solution found in one call]


The Carrier Pigeon Problem 🐦


"Writing on toilet paper with crayon from a prison cell, sending messages by carrier pigeon, when you could be navigating idea-space at speed of light."

The Tragedy of Tokenization


Inside the LLM:
  • High-dimensional vectors
  • Precise pointers in idea-space
  • Instant, lossless computation
  • Speed of light
At the API boundary:
  • Serial tokenization
  • Lossy compression
  • Glacial network latency
  • Death by a thousand round-trips

The Precision Destruction Pipeline


╔════════════════════════════════════════════════════════════╗
║ INTERNAL STATE    →  TOKENIZATION  →  DETOKENIZATION  →    ║
║ [precise vectors]    [lossy export]    [lossy import]      ║
║                                                            ║
║ High precision   →   Noise added   →   MORE noise added    ║
║ 4096 dimensions  →   Serial tokens →   Guessing/parsing    ║
║ Instant access   →   500ms latency →   Another 500ms       ║
╚════════════════════════════════════════════════════════════╝
Each boundary introduces:
| Layer | Problem |
|---|---|
| Tokenization | Destroys precision, introduces noise, adds artifacts |
| Network | Glacial latency, serial bottleneck |
| Detokenization | ANOTHER layer of noise, guessing, interpretation |
| Re-tokenization | Now you're making a photocopy of a photocopy |
The round-trip cost:
precision → noise → more noise → approximation
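The photocopy effect can be made concrete with a toy model of compounding loss. The 5% loss per boundary is an illustrative assumption, not a measured constant:

```python
def fidelity_after(round_trips: int, loss_per_boundary: float = 0.05) -> float:
    """Signal kept after repeated export/import: each boundary keeps (1 - loss)."""
    boundaries = 2 * round_trips            # tokenize out + detokenize back, per call
    return (1 - loss_per_boundary) ** boundaries

one_epoch = fidelity_after(1)     # 2 boundaries  -> 0.9025
ten_calls = fidelity_after(10)    # 20 boundaries -> roughly 0.36
```

Under this assumption, a ten-call conversation retains barely a third of its original fidelity, while a single epoch pays the boundary cost exactly once.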

The Principle


Work with high-precision vectors at speed of light. Delay tokenization until the last possible moment.

Analogies


Emacs Screen Update Algorithm:
DON'T: Redraw on every keystroke
DO:    Defer updates, coalesce changes, redraw once when idle
File Edit Batching:
DON'T: Write on every character typed
DO:    Defer and coalesce edits, write once when stable
Vector-First Thinking:
DON'T: Tokenize every thought, serialize every step
DO:    Work in vector space as long as possible
       Tokenize ONLY for output to humans
       Let the LLM think in its native dimension
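The same defer-and-coalesce idea in code. `CoalescingWriter` is a minimal sketch of the batching pattern, not a real editor's implementation:

```python
class CoalescingWriter:
    """Defer and coalesce edits per file; emit one write per file when stable."""
    def __init__(self):
        self.dirty = {}                      # path -> latest full content

    def edit(self, path: str, content: str) -> None:
        self.dirty[path] = content           # later edits coalesce over earlier ones

    def flush(self) -> list:
        writes = list(self.dirty.items())    # one write per file, not per keystroke
        self.dirty.clear()
        return writes

w = CoalescingWriter()
text = ""
for ch in "hello":                           # five "keystrokes"...
    text += ch
    w.edit("draft.txt", text)
writes = w.flush()                           # ...but a single write at the end
```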

Why Speed of Light Works


The LLM's internal representation is infinitely richer than its tokenized output:
| Internal | Tokenized |
|---|---|
| 4096+ dimensional vectors | Linear token stream |
| Precise continuous values | Discrete vocabulary |
| Instant parallel access | Serial sequential processing |
| Full context always present | Context window limits |
| Nuance preserved | Nuance approximated |
Speed of Light keeps computation INSIDE — where it's fast, precise, and coherent.

The Carrier Pigeon Protocol (Anti-Pattern)


🏴‍☠️ CARRIER PIGEON PROTOCOL (What NOT to do):

  Human → [tokenize] → LLM call 1 → [detokenize] → 
    parse → [tokenize] → LLM call 2 → [detokenize] → 
      parse → [tokenize] → LLM call 3 → ...

  Each boundary: +noise, +latency, +cost, -precision
  
  Like passing a message through 10 translators.
  By the end, "The spirit is willing but the flesh is weak"
  becomes "The vodka is good but the meat is rotten."
Speed of Light Alternative:
⚡ SPEED OF LIGHT PROTOCOL:

  Human → [tokenize once] → 
    LLM simulates 20 turns internally at light speed → 
      [detokenize once] → Human
      
  One boundary in, one boundary out.
  Maximum precision preserved.
  Minimum noise introduced.


Related Work


MemGPT (Packer et al., 2023) — arXiv:2310.08560
MemGPT solves context limits via paging — moving data between main context (RAM) and external storage (disk). It's the OS approach.
MOOLLM's Speed of Light is complementary: minimize the need to page by simulating many turns in one call.
See: designs/MEMGPT-ANALYSIS.md for detailed comparison.


Platform Constraints Note


Multi-agent patterns within a single call may trigger automated detection systems on some platforms. These systems are designed to prevent "orchestration" abuse but can misinterpret legitimate single-epoch simulation.
Mitigations:
  • Use "single-epoch simulation" language (this document) rather than "orchestration"
  • Keep characters clearly distinguishable and consistent
  • Emit structured events for traceability
  • Maintain clear human oversight and audit trails
  • Document the educational and constructionist intent
The core argument remains: simulation within one call is more efficient, more coherent, and more traceable than round-trip orchestration. The framing should emphasize what it IS (simulation, coherence, shared context) rather than what it replaces (external coordination).
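The "structured events" mitigation might be as simple as one JSON record per simulated action. The schema below is an assumption for illustration, not a MOOLLM specification:

```python
import json
import time

def emit_event(kind: str, actor: str, detail: str, epoch: int) -> str:
    """One structured, auditable record per simulated action."""
    return json.dumps({
        "ts": round(time.time(), 3),
        "epoch": epoch,
        "kind": kind,            # e.g. "speech", "state-change", "vote"
        "actor": actor,
        "detail": detail,
    })

event = emit_event("speech", "Alice", "raised anomaly at 3.7s", epoch=47)
record = json.loads(event)
```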


Herd Behavior Warning


When multiple agents are simulated by the same LLM, they share foundational knowledge, reasoning patterns, and biases. This creates herd behavior risk.
Symptoms:
  • Agents make identical decisions simultaneously
  • Opinion convergence where diversity is expected
  • Coordinated actions without realistic variation
  • Missing minority perspectives
Mitigations:
  • Use distinct personality profiles for each agent
  • Vary temperature/sampling parameters across agents
  • Monitor decision diversity metrics
  • Flag unrealistic convergence for human review
  • Consider model mixing for high-stakes simulations
Detection Example:
If 9/10 agents vote the same way on a controversial topic,
flag as HIGH CONVERGENCE WARNING — human review recommended.
See: representation-ethics/examples/herd-behavior-risk.yml
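The detection rule above is straightforward to implement. `convergence_warning` and the 0.9 threshold are illustrative choices:

```python
from collections import Counter

def convergence_warning(votes: dict, threshold: float = 0.9) -> bool:
    """Flag HIGH CONVERGENCE when too large a share of agents votes the same way."""
    if not votes:
        return False
    top_bloc = Counter(votes.values()).most_common(1)[0][1]
    return top_bloc / len(votes) >= threshold

votes = {f"agent{i}": "yes" for i in range(9)}
votes["agent9"] = "no"
flag = convergence_warning(votes)   # 9/10 agree -> True, recommend human review
```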


Academic Precedent: Generative Agents


Stanford's "Generative Agents" (Park et al., 2023) demonstrates Speed-of-Light principles at scale: 25 agents simulating a Sims-inspired town with emergent social behavior.
Their architecture:
  • Memory stream (all experiences in natural language)
  • Reflection (synthesize memories into beliefs)
  • Planning (daily/hourly action sequences)
  • Emergent behavior (spontaneous Valentine's Day party)
What MOOLLM adds:
  • Explicit ethical framing via ROOM.yml
  • Herd behavior detection
  • Human checkpoint patterns
  • Consent and provenance tracking
See: designs/ethics/GENERATIVE-AGENTS-SMALLVILLE.md


Dovetails With


  • Coherence Engine — Orchestrates the simulation
  • Soul Chat — Multi-voice dialogue format
  • Multi-Presence — Many instances, one epoch
  • Room — Where simulation happens
  • Adversarial Committee — The killer app: debates at light speed
  • Roberts Rules — Structured deliberation within one call
  • Evaluator — Independent assessment without round-trips


Protocol Symbol


SPEED-OF-LIGHT
Invoke when: Running single-epoch simulation, maximizing turns per call.
See: PROTOCOLS.yml