Elasticsearch Expert

Expert guidance for Elasticsearch, search optimization, ELK stack, and distributed search systems.

Core Concepts

Full-text search and inverted indexes
Document-oriented storage
RESTful API
Distributed architecture with sharding
ELK stack (Elasticsearch, Logstash, Kibana)
Aggregations and analytics
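
To make the inverted-index idea concrete, here is a toy sketch in plain Python. It is only an illustration under simplifying assumptions (whitespace tokenization, lowercasing, no stemming or scoring) and is not how Elasticsearch or Lucene implement it; the names `build_inverted_index`, `search_all_terms`, and `toy_docs` are made up for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            postings[token].add(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


def search_all_terms(index, query):
    """Return IDs of documents that contain every token in the query (AND semantics)."""
    id_sets = [set(index.get(token, [])) for token in query.lower().split()]
    if not id_sets:
        return []
    return sorted(set.intersection(*id_sets))


# Hypothetical sample documents, keyed by document ID.
toy_docs = {
    1: "Elasticsearch Guide",
    2: "Complete guide to Elasticsearch",
    3: "Kibana dashboards",
}
toy_index = build_inverted_index(toy_docs)
matches = search_all_terms(toy_index, "elasticsearch guide")
```

The lookup touches only the posting lists for the query terms rather than scanning every document, which is the property that makes full-text search fast.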

Index Management

from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])

Create index with mapping

mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "english"},
            "content": {"type": "text"},
            "author": {"type": "keyword"},
            "created_at": {"type": "date"},
            "views": {"type": "integer"}
        }
    }
}

es.indices.create(index='articles', body=mapping)

Index document

doc = {
    "title": "Elasticsearch Guide",
    "content": "Complete guide to Elasticsearch",
    "author": "John Doe",
    "created_at": "2024-01-01",
    "views": 100
}

es.index(index='articles', id=1, body=doc)

Bulk indexing

from elasticsearch.helpers import bulk

actions = [
    {"_index": "articles", "_id": i, "_source": doc}
    for i, doc in enumerate(documents)  # documents: an iterable of dicts shaped like doc above (not defined in this guide)
]

bulk(es, actions)
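
For larger data sets, one option is to produce the bulk actions lazily instead of building the whole list in memory. The sketch below assumes the same action shape as above; `generate_actions` and `sample_documents` are hypothetical names introduced for illustration, and the commented call shows how the generator might be handed to the `bulk` helper on a live cluster.

```python
def generate_actions(documents, index_name):
    """Yield one bulk action per document, using the position in the iterable as the ID."""
    for i, doc in enumerate(documents):
        yield {"_index": index_name, "_id": i, "_source": doc}


# Hypothetical sample data standing in for a real document source.
sample_documents = [
    {"title": "Elasticsearch Guide", "views": 100},
    {"title": "Kibana Basics", "views": 40},
]
sample_actions = list(generate_actions(sample_documents, "articles"))

# With a live cluster, the generator could be passed straight to the helper:
#   from elasticsearch.helpers import bulk
#   success_count, errors = bulk(es, generate_actions(sample_documents, "articles"))
```

Because the helper consumes any iterable, the generator form keeps memory use roughly constant regardless of how many documents are indexed.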

Search Queries

Full-text search

query = {
    "query": {
        "match": {
            "content": "elasticsearch guide"
        }
    }
}

results = es.search(index='articles', body=query)

Boolean query

bool_query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"content": "elasticsearch"}}
            ],
            "filter": [
                {"range": {"views": {"gte": 100}}}
            ],
            "should": [
                {"term": {"author": "john-doe"}}
            ],
            "must_not": [
                {"term": {"status": "draft"}}
            ]
        }
    }
}
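
In a bool query, `must` and `should` clauses contribute to the relevance score, while `filter` and `must_not` clauses only include or exclude documents. A small builder can keep that split explicit. This is a sketch, not part of the Elasticsearch client; `build_bool_query` and its parameters are hypothetical, and the field names follow the examples in this guide.

```python
def build_bool_query(text, min_views=None, exclude_status=None):
    """Build a bool query: a scored full-text match plus optional non-scoring constraints."""
    bool_clause = {"must": [{"match": {"content": text}}]}
    if min_views is not None:
        # Range constraint in filter context: no effect on the score.
        bool_clause["filter"] = [{"range": {"views": {"gte": min_views}}}]
    if exclude_status is not None:
        # Exact-value exclusion: also evaluated without scoring.
        bool_clause["must_not"] = [{"term": {"status": exclude_status}}]
    return {"query": {"bool": bool_clause}}


built_query = build_bool_query("elasticsearch", min_views=100, exclude_status="draft")
text_only_query = build_bool_query("elasticsearch")
```

The resulting dictionary can be passed to a search call in the same way as `bool_query` above.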

Multi-match query

multi_match = {
    "query": {
        "multi_match": {
            "query": "elasticsearch guide",
            "fields": ["title^2", "content"],  # Boost title
            "type": "best_fields"
        }
    }
}

Fuzzy search

fuzzy = {
    "query": {
        "fuzzy": {
            "title": {
                "value": "elasticseerch",
                "fuzziness": "AUTO"
            }
        }
    }
}
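
To see why the misspelled term above can still match, it helps to count the edits involved. With `"fuzziness": "AUTO"` and default settings, terms of 1 to 2 characters must match exactly, terms of 3 to 5 characters allow one edit, and longer terms allow two. The sketch below is a plain Levenshtein distance written for illustration; Elasticsearch itself also treats a transposition of adjacent characters as a single edit by default, which this simplified version does not. The function names are hypothetical.

```python
def levenshtein(a, b):
    """Count the single-character insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def auto_fuzziness(term):
    """Maximum edits allowed for a term under the default AUTO thresholds (3 and 6)."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


distance = levenshtein("elasticseerch", "elasticsearch")
allowed = auto_fuzziness("elasticseerch")
within_budget = distance <= allowed
```

Here the typo is a single substitution, well inside the two edits that AUTO permits for a 13-character term.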

Aggregations

Aggregation query

agg_query = {
    "aggs": {
        "authors": {
            "terms": {"field": "author", "size": 10}
        },
        "avg_views": {
            "avg": {"field": "views"}
        },
        "views_histogram": {
            "histogram": {"field": "views", "interval": 100}
        },
        "date_histogram": {
            "date_histogram": {"field": "created_at", "calendar_interval": "month"}
        }
    }
}

result = es.search(index='articles', body=agg_query)
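
Aggregation results come back under the `aggregations` key of the search response: bucket aggregations such as `terms` return a `buckets` list of `key`/`doc_count` pairs, and metric aggregations such as `avg` return a `value`. The sketch below reads those two shapes; `summarize_aggregations` is a hypothetical helper and `sample_response` is made-up data standing in for a real response.

```python
def summarize_aggregations(response):
    """Extract per-author document counts and the average view count from a search response."""
    aggs = response["aggregations"]
    top_authors = {bucket["key"]: bucket["doc_count"] for bucket in aggs["authors"]["buckets"]}
    return {"top_authors": top_authors, "avg_views": aggs["avg_views"]["value"]}


# Made-up response fragment with the same shape a real cluster would return.
sample_response = {
    "aggregations": {
        "authors": {
            "buckets": [
                {"key": "John Doe", "doc_count": 3},
                {"key": "Jane Roe", "doc_count": 1},
            ]
        },
        "avg_views": {"value": 125.0},
    }
}
summary = summarize_aggregations(sample_response)
```

With a live cluster, the `result` returned by the search call above could be passed in place of `sample_response`.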

Best Practices

Design mappings carefully
Use appropriate analyzers
Implement proper sharding strategy
Monitor cluster health
Use bulk operations
Implement pagination with search_after
Cache frequently used queries
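
As one way to approach the `search_after` recommendation, the sketch below pages through results by feeding the `sort` values of the last hit on each page into the next request. It assumes a hypothetical unique, sortable field named `article_id` as the sort key (the mapping in this guide does not define one), and it takes the search function as a parameter so that the loop can be exercised without a cluster; `build_page_body`, `iterate_hits`, and `fake_search` are illustrative names.

```python
def build_page_body(page_size, search_after=None):
    """Build a request body sorted on a unique field, resuming after the given sort values."""
    body = {
        "size": page_size,
        "query": {"match_all": {}},
        "sort": [{"article_id": "asc"}],  # article_id is an assumed unique field
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body


def iterate_hits(search, page_size=100):
    """Yield every hit, one page at a time; search maps a request body to a response dict."""
    search_after = None
    while True:
        hits = search(build_page_body(page_size, search_after))["hits"]["hits"]
        if not hits:
            return
        yield from hits
        search_after = hits[-1]["sort"]  # sort values of the last hit on this page


def fake_search(body):
    """Stand-in for a real cluster: pages through five in-memory hits."""
    all_hits = [{"_id": str(n), "sort": [n]} for n in range(5)]
    after = body.get("search_after")
    remaining = [hit for hit in all_hits if after is None or hit["sort"] > after]
    return {"hits": {"hits": remaining[: body["size"]]}}


collected_ids = [hit["_id"] for hit in iterate_hits(fake_search, page_size=2)]

# With a live cluster, something like this could replace fake_search:
#   iterate_hits(lambda body: es.search(index="articles", body=body))
```

Unlike `from`/`size`, each request only asks for the next page after a known position, so the cost does not grow with how deep the pagination goes.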

Anti-Patterns

❌ Deep pagination with from/size
❌ Wildcard queries without prefix
❌ No replica shards
❌ Over-sharding
❌ Not using filters for exact matches
❌ Ignoring cluster yellow/red status
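
On the last point, the cluster health response carries a `status` field: green means all shards are assigned, yellow means all primaries are assigned but at least one replica is not, and red means at least one primary is unassigned. A minimal sketch of turning that into an actionable message follows; `describe_health` is a hypothetical helper and the sample dictionaries are made up.

```python
def describe_health(health):
    """Translate a cluster health response into a short severity message."""
    status = health["status"]
    if status == "green":
        return "ok: all primary and replica shards are assigned"
    if status == "yellow":
        return f"warning: {health['unassigned_shards']} replica shard(s) unassigned"
    return "critical: at least one primary shard is unassigned"


# Made-up fragments of a health response for illustration.
yellow_message = describe_health({"status": "yellow", "unassigned_shards": 2})
red_message = describe_health({"status": "red", "unassigned_shards": 5})

# With a live cluster, the input could come from:
#   describe_health(es.cluster.health())
```

A yellow status on a single-node development cluster is expected when replicas are configured, but in production it signals reduced redundancy and is worth alerting on.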

Resources

Elasticsearch Guide: https://www.elastic.co/guide/
ELK Stack: https://www.elastic.co/elk-stack
