Apache Cassandra
Cassandra is a wide-column store database designed for scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.
When to Use
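- High Write Throughput: Write capacity scales close to linearly with node count; large clusters can ingest millions of writes per second.
- Always On: No single point of failure. Writes still succeed while some nodes are down, as long as enough replicas respond for the chosen consistency level (eventual consistency).
- Multi-Region: Active-active multi-region replication is built in.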
Quick Start (CQL)
```sql
CREATE TABLE users (
    user_id UUID PRIMARY KEY,
    name text,
    email text
);

INSERT INTO users (user_id, name) VALUES (uuid(), 'Alice');
```
Core Concepts
Partition Key & Clustering Key
- Partition Key: Hashed to determine which nodes (replicas) hold the data.
- Clustering Key: Sets the sort order of rows within a partition on disk.
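As a minimal sketch of how the two keys fit together, the table below uses a hypothetical chat-message schema (the table and column names are illustrative assumptions, not part of the original). The inner parentheses of the primary key mark the partition key; the remaining column is the clustering key.

```sql
-- Hypothetical schema: all messages of one chat live in a single partition.
CREATE TABLE messages_by_chat (
    chat_id uuid,        -- partition key: decides which replicas store the rows
    sent_at timestamp,   -- clustering key: rows are stored sorted by this column
    sender text,
    body text,
    PRIMARY KEY ((chat_id), sent_at)
) WITH CLUSTERING ORDER BY (sent_at DESC);

-- Reads one partition and returns rows already in clustering order (newest first).
SELECT sent_at, sender, body
FROM messages_by_chat
WHERE chat_id = 123e4567-e89b-12d3-a456-426614174000
LIMIT 10;
```

Because rows are stored newest first, "latest N messages" is a sequential read from the start of the partition rather than a sort at query time.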
Tunable Consistency
You choose the consistency level per query.
- ANY: Fastest, weakest guarantee. Writes only.
- QUORUM: A majority of replicas must acknowledge. Balanced.
- ALL: Every replica must acknowledge. Slowest, safest.
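One way to try this interactively is the `CONSISTENCY` command in cqlsh, sketched below. Note that `CONSISTENCY` is a cqlsh shell command rather than a CQL statement; client drivers set the level on each statement instead. The example reuses the `users` table from the Quick Start, and the UUID literal is a placeholder.

```sql
-- cqlsh shell command (not CQL itself): applies to later statements in this session.
CONSISTENCY QUORUM

-- This write returns once a majority of replicas acknowledge it.
INSERT INTO users (user_id, name)
VALUES (123e4567-e89b-12d3-a456-426614174000, 'Alice');

-- This read also waits for a majority of replicas to respond.
SELECT name, email
FROM users
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
```

With a replication factor of 3, QUORUM means 2 replicas. Reading and writing at QUORUM guarantees the two sets overlap on at least one replica, so a read sees the latest acknowledged write.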
Vector Search (5.0+)
Cassandra 5.0 adds native vector search based on approximate nearest neighbor (ANN) queries, so it can serve as a vector database for AI applications.
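A minimal sketch of what this looks like in CQL on Cassandra 5.0+, assuming a hypothetical `documents` table and a toy 3-dimensional embedding (real embeddings typically have hundreds of dimensions; all names and values here are illustrative).

```sql
-- Hypothetical table: the vector dimension is fixed in the column type.
CREATE TABLE documents (
    doc_id uuid PRIMARY KEY,
    body text,
    embedding vector<float, 3>
);

-- ANN queries require a Storage-Attached Index on the vector column.
CREATE INDEX documents_embedding_idx ON documents (embedding) USING 'sai';

INSERT INTO documents (doc_id, body, embedding)
VALUES (uuid(), 'intro to wide-column stores', [0.12, 0.80, 0.31]);

-- Return the 5 rows whose embeddings are nearest to the query vector.
SELECT doc_id, body
FROM documents
ORDER BY embedding ANN OF [0.10, 0.75, 0.30]
LIMIT 5;
```

The query vector must have the same dimension as the column, and `LIMIT` is what bounds the ANN search.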
Best Practices (2025)
Do:
- Query by Partition Key: Always. Avoid full-table scans in production; they touch every node in the cluster.
- Use SAI (Storage-Attached Indexes): New in 5.0. Generally a better choice than legacy secondary indexes.
- Denormalize: Optimize the schema for reads. It is okay to duplicate data into 3 tables to satisfy 3 different query patterns.
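The sketch below illustrates the denormalization and SAI points with two hypothetical tables holding the same user data under different partition keys (table, column, and index names are assumptions chosen for illustration).

```sql
-- Query pattern 1: look up a user by id.
CREATE TABLE users_by_id (
    user_id uuid PRIMARY KEY,
    name text,
    email text
);

-- Query pattern 2: look up a user by email. Same data, different partition key.
CREATE TABLE users_by_email (
    email text PRIMARY KEY,
    user_id uuid,
    name text
);

-- Write both copies together; a logged batch ensures that both inserts eventually apply.
BEGIN BATCH
    INSERT INTO users_by_id (user_id, name, email)
    VALUES (123e4567-e89b-12d3-a456-426614174000, 'Alice', 'alice@example.com');
    INSERT INTO users_by_email (email, user_id, name)
    VALUES ('alice@example.com', 123e4567-e89b-12d3-a456-426614174000, 'Alice');
APPLY BATCH;

-- Each read targets exactly one partition.
SELECT user_id, name FROM users_by_email WHERE email = 'alice@example.com';

-- Alternative for a less frequent lookup (Cassandra 5.0+): an SAI index instead of a third table.
CREATE INDEX users_by_id_name_idx ON users_by_id (name) USING 'sai';
SELECT user_id, email FROM users_by_id WHERE name = 'Alice';
```

A dedicated table is usually the better fit for hot query paths, while an SAI index trades some read efficiency for not having to maintain another copy of the data.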
Don't:
- Don't rely on joins: Cassandra does not support them. Denormalize, or join in the application.
- Don't use large partitions: Keep partitions under 100 MB to avoid compaction issues.
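One common way to keep partitions bounded is to add a time bucket to the partition key. The sketch below assumes a hypothetical sensor workload where one day of readings per sensor stays well under the size limit; the right bucket width depends on the actual write rate.

```sql
-- Hypothetical schema: (sensor_id, day) is a composite partition key,
-- so each sensor gets a fresh partition every day instead of one that grows forever.
CREATE TABLE readings_by_sensor_day (
    sensor_id uuid,
    day date,
    reading_time timestamp,
    temperature double,
    PRIMARY KEY ((sensor_id, day), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);

INSERT INTO readings_by_sensor_day (sensor_id, day, reading_time, temperature)
VALUES (123e4567-e89b-12d3-a456-426614174000, '2025-01-15', '2025-01-15 08:30:00+0000', 21.5);

-- Queries must supply every partition key column, here both the sensor and the day.
SELECT reading_time, temperature
FROM readings_by_sensor_day
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000
  AND day = '2025-01-15';
```

Reading a range that spans several days means issuing one query per bucket from the application, which is the price of keeping each partition small.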