elasticsearch-expert
Elasticsearch Expert
Expert guidance for Elasticsearch, search optimization, ELK stack, and distributed search systems.
Core Concepts
- Full-text search and inverted indexes
- Document-oriented storage
- RESTful API
- Distributed architecture with sharding
- ELK stack (Elasticsearch, Logstash, Kibana)
- Aggregations and analytics
Index Management
```python
from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])

# Create index with mapping
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "english"},
            "content": {"type": "text"},
            "author": {"type": "keyword"},
            "created_at": {"type": "date"},
            "views": {"type": "integer"}
        }
    }
}
es.indices.create(index='articles', body=mapping)
```
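The mapping above leaves shard and replica counts at their defaults. A minimal sketch of how index settings could be combined with a mapping, and how index creation could be made safe to re-run, might look like the following. The helper names `build_index_body` and `ensure_index` are hypothetical, the shard and replica values are placeholders rather than recommendations, and `es` is assumed to be a connected `Elasticsearch` client.

```python
def build_index_body(mappings, shards=1, replicas=1):
    """Combine index settings with a mappings dict into one create-index body."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": mappings,
    }


def ensure_index(es, name, body):
    """Create the index only if it does not exist yet; return True if created."""
    if es.indices.exists(index=name):
        return False
    es.indices.create(index=name, body=body)
    return True


# Shortened version of the article mapping, for illustration only.
article_mappings = {
    "properties": {
        "title": {"type": "text", "analyzer": "english"},
        "author": {"type": "keyword"},
    }
}
index_body = build_index_body(article_mappings, shards=1, replicas=1)
```

Calling `ensure_index(es, 'articles', index_body)` a second time returns `False` instead of raising an "index already exists" error.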
```python
# Index document
doc = {
    "title": "Elasticsearch Guide",
    "content": "Complete guide to Elasticsearch",
    "author": "John Doe",
    "created_at": "2024-01-01",
    "views": 100
}
es.index(index='articles', id=1, body=doc)
```
```python
# Bulk indexing
from elasticsearch.helpers import bulk

# `documents` is assumed to be an iterable of dicts shaped like `doc`
actions = [
    {"_index": "articles", "_id": i, "_source": doc}
    for i, doc in enumerate(documents)
]
bulk(es, actions)
```
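For larger inputs, the list comprehension above holds every action in memory at once. A possible variation, sketched under the assumption that documents arrive as any iterable of dicts, streams actions from a generator and collects per-document failures instead of raising on the first one. The names `generate_actions` and `bulk_index` and the chunk size of 500 are illustrative choices, not values from the original guide.

```python
def generate_actions(documents, index="articles"):
    """Yield one bulk action per document, using the position as _id."""
    for i, doc in enumerate(documents):
        yield {"_index": index, "_id": i, "_source": doc}


def bulk_index(es, documents, index="articles", chunk_size=500):
    """Send documents in chunks and return (success_count, errors)."""
    # Imported here so generate_actions can be used without the client installed.
    from elasticsearch.helpers import bulk

    return bulk(
        es,
        generate_actions(documents, index),
        chunk_size=chunk_size,
        raise_on_error=False,  # collect failed items instead of raising
    )


# Hypothetical sample input, only to show the shape of the actions.
sample_docs = [{"title": "First"}, {"title": "Second"}]
sample_actions = list(generate_actions(sample_docs))
```

With `raise_on_error=False`, the second element of the returned tuple is a list of the items that failed, which can be logged or retried.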
Search Queries
```python
# Full-text search
query = {
    "query": {
        "match": {
            "content": "elasticsearch guide"
        }
    }
}
results = es.search(index='articles', body=query)
```
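The search call returns a nested response, and the matching documents sit under `hits.hits`. As a rough sketch, the helper below (a hypothetical name, `summarize_hits`) pulls out the total count and a few fields per hit. It assumes the Elasticsearch 7+ response shape, where `hits.total` is an object with a `value` key; the sample response is hand-written for illustration and is not real server output.

```python
def summarize_hits(response):
    """Return the total hit count and a list of (id, score, title) tuples."""
    hits = response["hits"]
    total = hits["total"]["value"]
    rows = [
        (hit["_id"], hit["_score"], hit["_source"].get("title"))
        for hit in hits["hits"]
    ]
    return total, rows


# Hand-written sample in the shape of a search response (not real output).
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {"_id": "1", "_score": 0.5, "_source": {"title": "Elasticsearch Guide"}}
        ],
    }
}
total, rows = summarize_hits(sample_response)
```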
```python
# Boolean query
bool_query = {
    "query": {
        "bool": {
            # must: has to match, contributes to the score
            "must": [
                {"match": {"content": "elasticsearch"}}
            ],
            # filter: has to match, does not affect the score
            "filter": [
                {"range": {"views": {"gte": 100}}}
            ],
            # should: optional, boosts the score when it matches;
            # keyword fields match the stored value exactly
            "should": [
                {"term": {"author": "John Doe"}}
            ],
            # must_not: excludes matching documents;
            # `status` is assumed to exist on the documents (it is not in the mapping)
            "must_not": [
                {"term": {"status": "draft"}}
            ]
        }
    }
}
```
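The boolean query above mixes scoring clauses with non-scoring ones. One way to keep exact-match conditions consistently in filter context is a small builder like the sketch below. The function name `build_bool_query` and its parameters are hypothetical; the field names follow the `articles` examples in this guide.

```python
def build_bool_query(text, exact=None, min_views=None, exclude=None):
    """Build a bool query: full text in `must`, exact conditions in `filter`."""
    bool_body = {"must": [{"match": {"content": text}}]}

    # Exact-match and range conditions do not need scoring, so they go in filter.
    filters = [{"term": {field: value}} for field, value in (exact or {}).items()]
    if min_views is not None:
        filters.append({"range": {"views": {"gte": min_views}}})
    if filters:
        bool_body["filter"] = filters

    if exclude:
        bool_body["must_not"] = [
            {"term": {field: value}} for field, value in exclude.items()
        ]
    return {"query": {"bool": bool_body}}


example_query = build_bool_query(
    "elasticsearch",
    exact={"author": "John Doe"},
    min_views=100,
    exclude={"status": "draft"},
)
```

The resulting dict can be passed to a search call in the same way as the hand-written queries in this section.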
```python
# Multi-match query
multi_match = {
    "query": {
        "multi_match": {
            "query": "elasticsearch guide",
            "fields": ["title^2", "content"],  # Boost title
            "type": "best_fields"
        }
    }
}
```
```python
# Fuzzy search
fuzzy = {
    "query": {
        "fuzzy": {
            "title": {
                "value": "elasticseerch",
                "fuzziness": "AUTO"
            }
        }
    }
}
```
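The `fuzzy` query is a term-level query, so its value is not run through the field's analyzer before matching. For analyzed `text` fields such as `title`, a `match` query with a `fuzziness` option is a common alternative. The sketch below shows one possible shape; the helper name `fuzzy_match_query` and the default `prefix_length` of 1 are illustrative assumptions.

```python
def fuzzy_match_query(field, text, fuzziness="AUTO", prefix_length=1):
    """Build a match query that tolerates typos in analyzed text."""
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": fuzziness,
                    # Leading characters that must match exactly; limits expansions.
                    "prefix_length": prefix_length,
                }
            }
        }
    }


typo_query = fuzzy_match_query("title", "elasticseerch guide")
```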
Aggregations
```python
# Aggregation query
agg_query = {
    "aggs": {
        "authors": {
            "terms": {
                "field": "author",
                "size": 10
            }
        },
        "avg_views": {
            "avg": {
                "field": "views"
            }
        },
        "views_histogram": {
            "histogram": {
                "field": "views",
                "interval": 100
            }
        },
        "date_histogram": {
            "date_histogram": {
                "field": "created_at",
                "calendar_interval": "month"
            }
        }
    }
}
result = es.search(index='articles', body=agg_query)
```
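Aggregation results come back under the `aggregations` key, keyed by the names chosen in the request (`authors`, `avg_views`, and so on). A minimal sketch of reading two of them is shown below; `summarize_aggregations` is a hypothetical helper and the sample response is hand-written to show the shape, not real output. When only the aggregations are needed, adding `"size": 0` at the top level of the request body skips returning the hits themselves.

```python
def summarize_aggregations(response):
    """Return ({author: doc_count}, average_views) from an aggregation response."""
    aggs = response["aggregations"]
    # Bucket aggregations such as `terms` return a list of buckets.
    authors = {b["key"]: b["doc_count"] for b in aggs["authors"]["buckets"]}
    # Single-value metric aggregations such as `avg` return a `value`.
    avg_views = aggs["avg_views"]["value"]
    return authors, avg_views


# Hand-written sample in the shape of an aggregation response (not real output).
sample_agg_response = {
    "aggregations": {
        "authors": {"buckets": [{"key": "John Doe", "doc_count": 3}]},
        "avg_views": {"value": 150.0},
    }
}
authors, avg_views = summarize_aggregations(sample_agg_response)
```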
Best Practices
- Design mappings carefully
- Use appropriate analyzers
- Implement proper sharding strategy
- Monitor cluster health
- Use bulk operations
- Implement pagination with search_after
- Cache frequently used queries
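As an illustration of the `search_after` recommendation, the sketch below walks through all results page by page, feeding the sort values of the last hit into the next request. The function name `scan_with_search_after` is hypothetical, `es` is assumed to be a connected client, and the caller is assumed to supply a `sort` that orders documents deterministically.

```python
def scan_with_search_after(es, index, query, sort, page_size=100):
    """Yield every hit for `query`, fetching pages with search_after."""
    body = {"query": query, "sort": sort, "size": page_size}
    while True:
        hits = es.search(index=index, body=body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # The sort values of the last hit mark where the next page starts.
        body["search_after"] = hits[-1]["sort"]
```

A call might look like `scan_with_search_after(es, 'articles', {"match_all": {}}, [{"created_at": "asc"}])`. In practice the sort usually needs a unique tiebreaker field as well, so that documents with equal sort values are neither skipped nor repeated.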
Anti-Patterns
❌ Deep pagination with from/size
❌ Wildcard queries with a leading wildcard (no fixed prefix)
❌ No replica shards
❌ Over-sharding
❌ Not using filters for exact matches
❌ Ignoring cluster yellow/red status
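To avoid overlooking a yellow or red cluster, a periodic check against the cluster health API can surface problems early. The following is a minimal sketch, assuming `es` is a connected client; `cluster_problems` is a hypothetical helper name, and what counts as a problem here (any non-green status, any unassigned shard) is an illustrative choice.

```python
def cluster_problems(es):
    """Return a list of human-readable problems found in cluster health."""
    health = es.cluster.health()
    problems = []
    if health["status"] != "green":
        problems.append(f"status is {health['status']}")
    if health["unassigned_shards"] > 0:
        problems.append(f"{health['unassigned_shards']} unassigned shards")
    return problems
```

An empty list means the check found nothing to report; a non-empty list can be logged or sent to an alerting system.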
Resources
- Elasticsearch Guide: https://www.elastic.co/guide/
- ELK Stack: https://www.elastic.co/elk-stack