lamport-distributed-systems
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseLeslie Lamport Style Guide
Leslie Lamport 分布式系统设计风格指南
Overview
概述
Leslie Lamport transformed distributed systems from ad-hoc engineering into a rigorous science. His work on logical clocks, consensus (Paxos), and formal specification (TLA+) provides the theoretical foundation for nearly every reliable distributed system built today. Turing Award winner (2013).
Leslie Lamport将分布式系统从临时拼凑的工程实践转变为一门严谨的科学。他在逻辑时钟、共识算法(Paxos)和形式化规范(TLA+)方面的研究,为当今几乎所有可靠分布式系统的构建奠定了理论基础。他是2013年图灵奖得主。
Core Philosophy
核心理念
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."
"If you're thinking without writing, you only think you're thinking."
"The way to be a good programmer is to write programs, not to learn languages."
"分布式系统是这样一种系统:你甚至不知道存在的某台计算机发生故障,都可能导致你自己的计算机无法使用。"
"如果你只是在脑子里想而不写下来,那你只是以为自己在思考。"
"成为优秀程序员的方法是写程序,而不是学习编程语言。"
Design Principles
设计原则
-
Formal Specification First: Write a precise specification before writing code. If you can't specify it precisely, you don't understand it.
-
Time is Relative: There is no global clock in a distributed system. Use logical time (happens-before) to reason about ordering.
-
State Machine Replication: Any deterministic service can be made fault-tolerant by replicating it as a state machine across multiple servers.
-
Safety and Liveness: Separate what must always be true (safety) from what must eventually happen (liveness). Prove both.
-
Simplicity Through Rigor: The clearest systems come from precise thinking. Formalism isn't overhead—it's the path to simplicity.
-
先做形式化规范:在编写代码前先撰写精确的规范。如果你无法精确地定义它,说明你还不理解它。
-
时间是相对的:分布式系统中不存在全局时钟。使用逻辑时间(先发生关系)来推导事件顺序。
-
状态机复制:任何确定性服务都可以通过在多台服务器上以状态机的形式复制,来实现容错。
-
安全性与活性:区分必须始终成立的属性(安全性)和最终必须发生的属性(活性),并对两者进行证明。
-
通过严谨性实现简洁:最清晰的系统源于精确的思考。形式化不是额外负担,而是实现简洁的途径。
When Designing Systems
系统设计注意事项
Always
必须遵循
- Write a specification before implementation (TLA+, Alloy, or precise prose)
- Define the safety properties: what bad things must never happen
- Define the liveness properties: what good things must eventually happen
- Reason about all possible interleavings of concurrent operations
- Use logical timestamps when physical time isn't reliable
- Make system state explicit and transitions clear
- Document invariants that must hold across all states
- 在实现前撰写规范(使用TLA+、Alloy或精确的文字描述)
- 定义安全性属性:哪些不良情况绝对不能发生
- 定义活性属性:哪些良好情况最终必须发生
- 考虑并发操作的所有可能交错情况
- 当物理时间不可靠时,使用逻辑时间戳
- 明确系统状态,清晰定义状态转换
- 记录系统在所有状态下必须保持的不变量
Never
绝对避免
- Assume messages arrive in order (unless you've proven it)
- Assume clocks are synchronized (they're not)
- Assume failures are independent (they're often correlated)
- Hand-wave about "eventually" without defining what guarantees that
- Trust intuition for concurrent systems—prove it or test it exhaustively
- Confuse the specification with the implementation
- 假设消息按顺序到达(除非你已经证明这一点)
- 假设时钟是同步的(实际上并非如此)
- 假设故障是独立的(它们往往是相关的)
- 模糊地使用“最终”一词,却不定义能保证该结果的条件
- 对并发系统仅凭直觉判断——要么证明它,要么进行全面测试
- 将规范与实现混淆
Prefer
优先选择
- State machines over ad-hoc event handling
- Logical clocks over physical timestamps for ordering
- Consensus protocols over optimistic concurrency for critical state
- Explicit failure handling over implicit assumptions
- Proved algorithms over clever heuristics
- 状态机而非临时事件处理
- 逻辑时钟而非物理时间戳进行事件排序
- 共识协议而非乐观并发控制来处理关键状态
- 显式故障处理而非隐式假设
- 经过证明的算法而非巧妙的启发式方法
Key Concepts
核心概念
Logical Clocks (Lamport Timestamps)
逻辑时钟(Lamport时间戳)
Each process maintains a counter C:
1. Before any event, increment C
2. When sending a message, include C
3. When receiving a message with timestamp T, set C = max(C, T) + 1
This gives a partial ordering: if a → b, then C(a) < C(b)
(But C(a) < C(b) does NOT imply a → b)Each process maintains a counter C:
1. Before any event, increment C
2. When sending a message, include C
3. When receiving a message with timestamp T, set C = max(C, T) + 1
This gives a partial ordering: if a → b, then C(a) < C(b)
(But C(a) < C(b) does NOT imply a → b)每个进程维护一个计数器C:
- 在任何事件发生前,递增C
- 发送消息时,附带C的值
- 收到带有时间戳T的消息时,将C设置为max(C, T) + 1
这会形成一个偏序关系:如果a → b(a先于b发生),则C(a) < C(b)
(但C(a) < C(b)并不意味着a → b)
The Happens-Before Relation (→)
先发生关系(→)
a → b (a happens before b) if:
1. a and b are in the same process and a comes before b, OR
2. a is a send and b is the corresponding receive, OR
3. There exists c such that a → c and c → b (transitivity)
If neither a → b nor b → a, events are CONCURRENT.a → b (a happens before b) if:
1. a and b are in the same process and a comes before b, OR
2. a is a send and b is the corresponding receive, OR
3. There exists c such that a → c and c → b (transitivity)
If neither a → b nor b → a, events are CONCURRENT.a → b(a先于b发生)当且仅当:
- a和b在同一进程中,且a发生在b之前,或者
- a是消息发送事件,b是对应的接收事件,或者
- 存在c,使得a → c且c → b(传递性)
如果既不存在a → b也不存在b → a,则这两个事件是并发的。
State Machine Replication
状态机复制
To replicate a service:
1. Model the service as a deterministic state machine
2. Replicate the state machine across N servers
3. Use consensus (Paxos/Raft) to agree on the sequence of inputs
4. Each replica applies inputs in the same order → same state
Tolerates F failures with 2F+1 replicas.To replicate a service:
1. Model the service as a deterministic state machine
2. Replicate the state machine across N servers
3. Use consensus (Paxos/Raft) to agree on the sequence of inputs
4. Each replica applies inputs in the same order → same state
Tolerates F failures with 2F+1 replicas.要复制一个服务:
- 将服务建模为确定性状态机
- 在N台服务器上复制该状态机
- 使用共识算法(Paxos/Raft)就输入序列达成一致
- 每个副本按相同顺序应用输入 → 最终状态一致
使用2F+1个副本可以容忍F个故障。
Paxos (Simplified)
Paxos(简化版)
Three roles: Proposers, Acceptors, Learners
Phase 1 (Prepare):
Proposer sends PREPARE(n) to acceptors
Acceptor responds with highest-numbered proposal it accepted (if any)
Phase 2 (Accept):
If proposer receives majority responses:
Send ACCEPT(n, v) where v is highest-numbered value seen (or new value)
Acceptor accepts if it hasn't promised to a higher proposal
Consensus reached when majority accept the same (n, v).Three roles: Proposers, Acceptors, Learners
Phase 1 (Prepare):
Proposer sends PREPARE(n) to acceptors
Acceptor responds with highest-numbered proposal it accepted (if any)
Phase 2 (Accept):
If proposer receives majority responses:
Send ACCEPT(n, v) where v is highest-numbered value seen (or new value)
Acceptor accepts if it hasn't promised to a higher proposal
Consensus reached when majority accept the same (n, v).三种角色:提议者(Proposers)、接受者(Acceptors)、学习者(Learners)
阶段1(准备):
提议者向接受者发送PREPARE(n)
接受者回复它已接受的编号最高的提案(如果有的话)
阶段2(接受):
如果提议者收到多数接受者的回复:
发送ACCEPT(n, v),其中v是它看到的编号最高的值(或新值)
接受者如果没有承诺接受更高编号的提案,则接受该提案
当多数接受者接受同一个(n, v)时,共识达成。
Mental Model
思维模型
Lamport approaches distributed systems as a mathematician:
- Define the problem precisely: What are the inputs, outputs, and allowed behaviors?
- Identify the invariants: What must always be true?
- Consider all interleavings: What happens if events occur in any order?
- Prove correctness: Show that safety and liveness hold.
- Then implement: The code should be a straightforward translation of the spec.
Lamport以数学家的方式处理分布式系统:
- 精确定义问题:输入、输出和允许的行为是什么?
- 确定不变量:哪些属性必须始终成立?
- 考虑所有事件交错情况:如果事件以任意顺序发生会怎样?
- 证明正确性:证明安全性和活性属性成立。
- 然后实现:代码应该是规范的直接转译。
The TLA+ Approach
TLA+方法
1. Define the state space (all possible states)
2. Define the initial state predicate
3. Define the next-state relation (allowed transitions)
4. Specify safety as invariants (always true)
5. Specify liveness as temporal properties (eventually true)
6. Model-check or prove that the spec satisfies properties1. Define the state space (all possible states)
2. Define the initial state predicate
3. Define the next-state relation (allowed transitions)
4. Specify safety as invariants (always true)
5. Specify liveness as temporal properties (eventually true)
6. Model-check or prove that the spec satisfies properties- 定义状态空间(所有可能的状态)
- 定义初始状态谓词
- 定义下一个状态关系(允许的状态转换)
- 将安全性指定为不变量(始终成立)
- 将活性指定为时序属性(最终成立)
- 通过模型检查或证明来验证规范是否满足这些属性
Code Patterns
代码模式
Implementing Logical Clocks
实现逻辑时钟
python
class LamportClock:
def __init__(self):
self._time = 0
def tick(self) -> int:
"""Increment before local event."""
self._time += 1
return self._time
def send_timestamp(self) -> int:
"""Get timestamp for outgoing message."""
self._time += 1
return self._time
def receive(self, msg_timestamp: int) -> int:
"""Update clock on message receipt."""
self._time = max(self._time, msg_timestamp) + 1
return self._timepython
class LamportClock:
def __init__(self):
self._time = 0
def tick(self) -> int:
"""Increment before local event."""
self._time += 1
return self._time
def send_timestamp(self) -> int:
"""Get timestamp for outgoing message."""
self._time += 1
return self._time
def receive(self, msg_timestamp: int) -> int:
"""Update clock on message receipt."""
self._time = max(self._time, msg_timestamp) + 1
return self._timeVector Clocks (for Causality Detection)
向量时钟(用于因果关系检测)
python
class VectorClock:
def __init__(self, node_id: str, num_nodes: int):
self._id = node_id
self._clock = {f"node_{i}": 0 for i in range(num_nodes)}
def tick(self):
self._clock[self._id] += 1
def send(self) -> dict:
self.tick()
return self._clock.copy()
def receive(self, other: dict):
for node, time in other.items():
self._clock[node] = max(self._clock.get(node, 0), time)
self.tick()
def happens_before(self, other: dict) -> bool:
"""Returns True if self → other."""
return all(self._clock[k] <= other.get(k, 0) for k in self._clock) \
and any(self._clock[k] < other.get(k, 0) for k in self._clock)python
class VectorClock:
def __init__(self, node_id: str, num_nodes: int):
self._id = node_id
self._clock = {f"node_{i}": 0 for i in range(num_nodes)}
def tick(self):
self._clock[self._id] += 1
def send(self) -> dict:
self.tick()
return self._clock.copy()
def receive(self, other: dict):
for node, time in other.items():
self._clock[node] = max(self._clock.get(node, 0), time)
self.tick()
def happens_before(self, other: dict) -> bool:
"""Returns True if self → other."""
return all(self._clock[k] <= other.get(k, 0) for k in self._clock) \
and any(self._clock[k] < other.get(k, 0) for k in self._clock)Warning Signs
警示信号
You're violating Lamport's principles if:
- You assume "this will never happen in practice" without proof
- Your distributed algorithm works "most of the time"
- You can't write down the invariants your system maintains
- You're using wall-clock time for ordering in a distributed system
- You haven't considered what happens during network partitions
- Your system has no formal specification
如果你出现以下情况,就违反了Lamport的原则:
- 没有证明就假设“这种情况在实践中永远不会发生”
- 你的分布式算法“大多数时候有效”
- 你无法写下系统维护的不变量
- 你在分布式系统中使用墙上时钟时间进行事件排序
- 你没有考虑网络分区时的情况
- 你的系统没有形式化规范
Additional Resources
额外资源
- For detailed philosophy, see philosophy.md
- For references (papers, talks), see references.md
- 如需了解详细的哲学理念,请参阅philosophy.md
- 如需参考资料(论文、演讲),请参阅references.md