What is ElasticSearch
📖 Overview
ElasticSearch (ES) is a distributed, near-real-time search and analytics engine built on Apache Lucene. It extends the single-node Lucene search library into a distributed architecture, providing powerful full-text search, structured search, and analytics capabilities. ES is widely used for log analysis, application search, product recommendation, and similar scenarios.
🔍 Core Storage Structures in Detail
1. Inverted Index
The inverted index is the core data structure behind ElasticSearch's efficient keyword search.
How it works
```mermaid
graph LR
    A[Raw documents] --> B[Analyzer]
    B --> C[Terms]
    C --> D[Inverted index]
    D --> E[Doc ID lists]
    subgraph "Inverted index structure"
        F[Term Dictionary] --> G[Posting List]
        G --> H[Doc ID + position info]
    end
```
Example
```json
// Raw documents
{
  "doc1": "Elasticsearch is a search engine",
  "doc2": "Lucene is the core of Elasticsearch",
  "doc3": "Search engines use inverted index"
}
// Inverted index after analysis
{
  "elasticsearch": [1, 2],  // appears in docs 1 and 2
  "search":        [1, 3],  // appears in docs 1 and 3
  "engine":        [1, 3],  // appears in docs 1 and 3
  "lucene":        [2],     // only in doc 2
  "core":          [2],     // only in doc 2
  "inverted":      [3],     // only in doc 3
  "index":         [3]      // only in doc 3
}
```
Time complexity
With an inverted index, lookup cost drops from brute-force O(N*M) (N documents, M average document length) to roughly O(logN) for locating the term, plus the cost of walking its posting list:
```java
// Naive full-text scan - O(N*M)
public List<Document> bruteForceSearch(String keyword, List<Document> docs) {
    List<Document> results = new ArrayList<>();
    for (Document doc : docs) {                  // O(N)
        if (doc.content.contains(keyword)) {     // O(M)
            results.add(doc);
        }
    }
    return results;
}

// Inverted-index search - O(logN)
public List<Document> invertedIndexSearch(String keyword, InvertedIndex index) {
    // Locate the term via a skip-list or B+-tree style structure - O(logN)
    PostingList postingList = index.getPostingList(keyword);
    return postingList.getDocuments();
}
```
Physical storage layout
```text
Physical layout of an inverted index:
Term Dictionary:
├── Term: "elasticsearch" → pointer to posting list
├── Term: "lucene"        → pointer to posting list
└── Term: "search"        → pointer to posting list
Posting Lists:
elasticsearch → [DocID:1, Freq:1, Positions:[0]] → [DocID:2, Freq:1, Positions:[5]]
lucene        → [DocID:2, Freq:1, Positions:[0]]
search        → [DocID:1, Freq:1, Positions:[3]] → [DocID:3, Freq:1, Positions:[0]]
```
2. Term Index - in-memory prefix tree
The Term Index is an in-memory structure built on shared prefixes; it accelerates lookups into the on-disk Term Dictionary.
FST (Finite State Transducer) structure
```mermaid
graph TB
    Root --> |e| Node1
    Node1 --> |l| Node2
    Node2 --> |a| Node3
    Node3 --> |s| Node4
    Node4 --> |t| Node5
    Node5 --> |i| Node6
    Node6 --> |c| Node7
    Node7 --> |...| Node8[Final: elasticsearch]
    Root --> |l| Node9
    Node9 --> |u| Node10
    Node10 --> |c| Node11
    Node11 --> |e| Node12
    Node12 --> |n| Node13
    Node13 --> |e| Node14[Final: lucene]
```
Why prefix sharing helps
```java
// Classic trie - one node per character
class TrieNode {
    char character;
    Map<Character, TrieNode> children;
    boolean isEndOfWord;
    // Memory cost: one node per character
}

// FST - shared prefixes, compressed storage
class FSTNode {
    String sharedPrefix;               // shared prefix
    Map<String, FSTNode> transitions;
    Object output;                     // offset into the Term Dictionary
    // Memory cost: much lower, especially when terms share prefixes
}
```
Lookup flow
```text
Looking up the term "elasticsearch":
1. In-memory Term Index (FST):
   Root → e → el → ela → elas → elast → elasti → elastic → elastics → elasticsearch
   yields the on-disk offset: offset_123
2. Seek into the on-disk Term Dictionary:
   seek(offset_123) → read the posting-list pointer for "elasticsearch"
3. Load the posting list:
   fetch the doc IDs and position info for every document containing "elasticsearch"
```
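The three steps above can be sketched in Java. Here a sorted in-memory map stands in for the FST, and the returned offset plays the role of the Term Dictionary pointer; all names and the `offset_123`-style values are illustrative, not Lucene's actual API.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Minimal sketch of the two-level lookup: an in-memory sorted map stands in
// for the FST-based Term Index; the long "offset" plays the role of the
// pointer into the on-disk Term Dictionary.
public class TermLookupSketch {
    private final NavigableMap<String, Long> termIndex = new TreeMap<>();

    public void put(String term, long dictOffset) {
        termIndex.put(term, dictOffset);
    }

    // Step 1: resolve the term to a Term Dictionary offset in memory.
    // Steps 2-3 (disk seek + posting-list load) would follow.
    public long lookupOffset(String term) {
        Long offset = termIndex.get(term);
        return offset == null ? -1L : offset;
    }
}
```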
3. Stored Fields - row-oriented storage
Stored Fields keep the complete document content in row-oriented form, used to return the original data in search results.
Layout
```text
Document storage layout (row-oriented):
Doc 1: [field1: "value1", field2: "value2", field3: "value3"]
Doc 2: [field1: "value4", field2: "value5", field3: "value6"]
Doc 3: [field1: "value7", field2: "value8", field3: "value9"]
Each row is stored contiguously, so a full document can be fetched quickly by DocID.
```
Access pattern
```java
// Fetch the full document by DocID (simplified: real implementations keep a
// per-document offset index rather than an average-size estimate)
public Document getStoredDocument(int docId) {
    // 1. Look up the document's offset in the file
    long offset = docId * averageDocSize + indexOffset;
    // 2. Read the raw document bytes from disk
    byte[] docData = readFromFile(offset, docLength);
    // 3. Deserialize into a Document object
    return deserialize(docData);
}
```
Compression
```text
Compression strategies for Stored Fields:
1. Document-level compression:
   - LZ4 / DEFLATE codecs
   - compress in ~16KB blocks, balancing ratio against decompression speed
2. Field-level optimizations:
   - variable-length encoding for numeric fields
   - dictionary compression for strings
   - delta encoding for dates
```
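A minimal sketch of the document-level strategy, using the JDK's DEFLATE codec as a stand-in for Lucene's LZ4/DEFLATE stored-fields codecs; the block size and class names are illustrative, not Lucene's actual implementation.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch of document-level block compression. Real Lucene compresses stored
// fields in ~16KB chunks; here one round-trip through DEFLATE illustrates
// the ratio-vs-decompression-cost trade-off.
public class StoredFieldsCompressionSketch {
    static final int BUF_SIZE = 16 * 1024;

    public static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[BUF_SIZE];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static byte[] decompress(byte[] compressed) {
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[BUF_SIZE];
            while (!inflater.finished()) {
                out.write(buf, 0, inflater.inflate(buf));
            }
            inflater.end();
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new RuntimeException(e); // corrupt block
        }
    }
}
```

Repetitive field data (logs, repeated JSON keys) compresses well, which is why the per-block trade-off pays off in practice.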
4. Doc Values - column-oriented storage
Doc Values store each field's values together in columnar form, purpose-built for fast sorting and aggregations.
Row vs. column layout
```text
Row-oriented (Stored Fields):
Doc1: [name:"Alice", age:25, city:"NYC"]
Doc2: [name:"Bob",   age:30, city:"LA"]
Doc3: [name:"Carol", age:28, city:"NYC"]
Column-oriented (Doc Values):
name column: ["Alice", "Bob", "Carol"]
age column:  [25, 30, 28]
city column: ["NYC", "LA", "NYC"]
```
Per-type optimizations
```java
// Numeric Doc Values
class NumericDocValues {
    // Bit packing: store each value in the minimum number of bits
    // required by the field's value range
    private PackedInts.Reader values;
    public long get(int docId) {
        return values.get(docId); // O(1) access
    }
}

// String Doc Values
class SortedDocValues {
    private PackedInts.Reader ordinals; // doc → ordinal mapping
    private BytesRef[] terms;           // deduplicated, sorted strings
    public BytesRef get(int docId) {
        int ord = (int) ordinals.get(docId);
        return terms[ord];
    }
}
```
Aggregation performance
```java
// Efficient aggregation on top of Doc Values
public class AggregationOptimization {
    // Numeric aggregation - the columnar layout gives cache-friendly access
    public double calculateAverage(NumericDocValues ageValues, int[] docIds) {
        long sum = 0;
        for (int docId : docIds) {
            sum += ageValues.get(docId); // sequential, cache-friendly access
        }
        return (double) sum / docIds.length;
    }

    // Group-by aggregation - built on sorted Doc Values
    public Map<String, List<Integer>> groupByCity(SortedDocValues cityValues,
                                                  int[] docIds) {
        Map<String, List<Integer>> groups = new HashMap<>();
        for (int docId : docIds) {
            String city = cityValues.get(docId).utf8ToString();
            groups.computeIfAbsent(city, k -> new ArrayList<>()).add(docId);
        }
        return groups;
    }
}
```
🗂️ Core Lucene Concepts
1. Segment - the smallest search unit
A Segment is Lucene's smallest self-contained search unit; each one holds a complete set of search data structures.
Segment files
```text
Files making up a segment:
├── .tip (Term Index) - FST index, loaded into memory
├── .tim (Term Dictionary) - on-disk term dictionary
├── .doc (Frequencies) - term frequencies
├── .pos (Positions) - term positions
├── .pay (Payloads) - custom payload data
├── .fdt (Stored Fields Data) - stored field data
├── .fdx (Stored Fields Index) - stored field index
├── .dvd (Doc Values Data) - columnar data
├── .dvm (Doc Values Metadata) - columnar metadata
└── .si (Segment Info) - segment metadata
```
Immutability
```java
public class ImmutableSegment {
    private final String segmentName;
    private final int maxDoc;
    private final Map<String, InvertedIndex> fieldIndexes;

    // A segment is never modified once written
    public ImmutableSegment(String name, Document[] docs) {
        this.segmentName = name;
        this.maxDoc = docs.length;
        this.fieldIndexes = buildIndexes(docs); // fixed at build time, read-only after
    }

    // Updates produce a new segment instead of mutating this one
    public ImmutableSegment addDocuments(Document[] newDocs) {
        Document[] allDocs = ArrayUtils.addAll(this.getDocs(), newDocs);
        return new ImmutableSegment(generateNewName(), allDocs);
    }
}
```
Search within a segment
```java
public class SegmentSearcher {
    public SearchResult search(Query query, Segment segment) {
        // 1. Resolve each term via the Term Index to a Term Dictionary offset
        List<String> terms = query.extractTerms();
        Map<String, PostingList> termPostings = new HashMap<>();
        for (String term : terms) {
            // fast lookup - O(logN)
            long offset = segment.getTermIndex().getOffset(term);
            PostingList posting = segment.getTermDict().getPosting(offset);
            termPostings.put(term, posting);
        }
        // 2. Merge the posting lists and score the candidates
        Set<Integer> candidateDocs = intersectPostings(termPostings);
        // 3. Fetch document contents from Stored Fields
        List<Document> results = new ArrayList<>();
        for (Integer docId : candidateDocs) {
            Document doc = segment.getStoredFields().getDocument(docId);
            results.add(doc);
        }
        return new SearchResult(results);
    }
}
```
2. Segment Merging
Segment merging is Lucene's key background optimization: small segments are periodically combined into larger ones.
Merge policy
```java
public class TieredMergePolicy {
    private static final double DEFAULT_SEGS_PER_TIER = 10.0;
    private static final int DEFAULT_MAX_MERGE_AT_ONCE = 10;

    public MergeSpecification findMerges(List<SegmentInfo> segments) {
        // 1. Group segments into tiers by size
        List<List<SegmentInfo>> tiers = groupBySize(segments);
        // 2. Find tiers that have too many segments
        MergeSpecification mergeSpec = new MergeSpecification();
        for (List<SegmentInfo> tier : tiers) {
            if (tier.size() > DEFAULT_SEGS_PER_TIER) {
                // Pick the N smallest segments in the tier to merge
                List<SegmentInfo> toMerge = selectSmallestSegments(tier,
                    DEFAULT_MAX_MERGE_AT_ONCE);
                mergeSpec.add(new OneMerge(toMerge));
            }
        }
        return mergeSpec;
    }

    // Execute a merge
    public SegmentInfo executeMerge(List<SegmentInfo> segments) {
        SegmentWriter writer = new SegmentWriter();
        // Merge field by field
        for (String fieldName : getAllFields(segments)) {
            // merge inverted indexes
            mergeInvertedIndex(writer, segments, fieldName);
            // merge Stored Fields
            mergeStoredFields(writer, segments, fieldName);
            // merge Doc Values
            mergeDocValues(writer, segments, fieldName);
        }
        return writer.commit();
    }
}
```
What merging buys
```text
Before merging:
Segment1 (1000 docs, 10MB)
Segment2 (1200 docs, 12MB)
Segment3 (800 docs, 8MB)
Segment4 (1500 docs, 15MB)
Total: 4 files, 45MB, 4 disk seeks per query
After merging:
Segment_merged (4500 docs, 42MB)  // slightly smaller after compression
Total: 1 file, 42MB, 1 disk seek per query
Query-side gains:
- fewer open files: 4 → 1
- fewer disk seeks: 4 → 1
- better cache hit rate: contiguous storage
- lower memory overhead: one set of index structures
```
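One concrete detail behind "merging index structures": every segment numbers its documents from 0, so when posting lists for the same term are concatenated during a merge, the doc IDs of later segments must be shifted by a running docBase. A sketch, with illustrative names rather than Lucene's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// postings: per-segment sorted local doc IDs for one term.
// segmentSizes: maxDoc of each segment, in the same order.
// Each segment's IDs are remapped by the cumulative size of the
// segments before it, producing one globally sorted posting list.
public class PostingMergeSketch {
    public static List<Integer> mergePostings(List<int[]> postings, int[] segmentSizes) {
        List<Integer> merged = new ArrayList<>();
        int docBase = 0;
        for (int i = 0; i < postings.size(); i++) {
            for (int localId : postings.get(i)) {
                merged.add(docBase + localId); // remap to a global doc ID
            }
            docBase += segmentSizes[i];        // next segment starts here
        }
        return merged; // already sorted: segments are appended in order
    }
}
```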
🌐 From Lucene to ElasticSearch
1. Distributed architecture
ElasticSearch scales single-node Lucene out into a distributed search engine.
Architecture comparison
```mermaid
graph TB
    subgraph "Single-node Lucene"
        A[Application] --> B[Lucene Library]
        B --> C[Local Index Files]
    end
    subgraph "Distributed ElasticSearch"
        D[Client] --> E[Coordinating Node]
        E --> F[Data Node 1]
        E --> G[Data Node 2]
        E --> H[Data Node 3]
        F --> I[Shard 1]
        F --> J[Shard 2]
        G --> K[Shard 3]
        G --> L[Replica 1]
        H --> M[Replica 2]
        H --> N[Replica 3]
    end
```
Key changes
```java
// Single-node Lucene search
public class LuceneSearcher {
    private IndexSearcher searcher;
    public TopDocs search(Query query, int numHits) {
        return searcher.search(query, numHits); // handled on one machine
    }
}

// Distributed ElasticSearch search
public class DistributedSearcher {
    private List<ShardSearcher> shards;
    public SearchResponse search(SearchRequest request) {
        // 1. Fan the query out to every shard
        List<Future<ShardSearchResult>> futures = new ArrayList<>();
        for (ShardSearcher shard : shards) {
            Future<ShardSearchResult> future = executor.submit(() ->
                shard.search(request));
            futures.add(future);
        }
        // 2. Collect the per-shard results
        List<ShardSearchResult> shardResults = new ArrayList<>();
        for (Future<ShardSearchResult> future : futures) {
            shardResults.add(future.get());
        }
        // 3. Merge, sort, and return the global top-K
        return mergeAndSort(shardResults, request.getSize());
    }
}
```
2. Shards and replicas
Primary Shard vs. Replica Shard
```text
Sharding strategy:
Primary shard:
- handles write operations
- the authoritative copy of the data
- count is fixed at index creation and cannot be changed later
Replica shard:
- a full copy of a primary shard
- serves reads for load balancing
- provides high availability
- count can be changed at any time
```
Document routing and shard allocation
```java
public class ShardAllocation {
    // Document routing (ES actually hashes the routing value with murmur3;
    // hashCode() stands in here. floorMod avoids the negative-result bug
    // that Math.abs(hashCode()) has at Integer.MIN_VALUE.)
    public int getShardId(String documentId, int numberOfShards) {
        return Math.floorMod(documentId.hashCode(), numberOfShards);
    }

    // Shard allocation strategy
    public void allocateShards(Index index, List<Node> nodes) {
        int primaryShards = index.getNumberOfShards();
        int replicaCount = index.getNumberOfReplicas();
        // place the primary shards
        for (int shardId = 0; shardId < primaryShards; shardId++) {
            Node primaryNode = selectNodeForPrimary(nodes, shardId);
            primaryNode.allocatePrimaryShard(index, shardId);
            // place the replica shards on other nodes
            for (int replica = 0; replica < replicaCount; replica++) {
                Node replicaNode = selectNodeForReplica(nodes, primaryNode, shardId);
                replicaNode.allocateReplicaShard(index, shardId, replica);
            }
        }
    }
}
```
Read/write load balancing
```java
public class LoadBalancedOperations {
    // Writes must go through the primary shard
    public IndexResponse index(IndexRequest request) {
        String docId = request.getId();
        int shardId = getShardId(docId, numberOfShards);
        // locate the primary shard
        Shard primaryShard = findPrimaryShard(shardId);
        // apply the write on the primary
        IndexResponse response = primaryShard.index(request);
        // replicate to all replica shards (by default ES waits for the
        // in-sync replicas to acknowledge before responding)
        List<Shard> replicas = findReplicaShards(shardId);
        for (Shard replica : replicas) {
            replica.index(request);
        }
        return response;
    }

    // Reads can be served by the primary or any replica
    public GetResponse get(GetRequest request) {
        String docId = request.getId();
        int shardId = getShardId(docId, numberOfShards);
        // pick one shard copy (primary + replicas) for load balancing
        List<Shard> availableShards = findAvailableShards(shardId);
        Shard selectedShard = selectShardForRead(availableShards);
        return selectedShard.get(request);
    }
}
```
3. Node roles
ElasticSearch decouples responsibilities and tunes performance by splitting cluster duties across node roles.
Node types
```java
// Master node - cluster management
public class MasterNode extends Node {
    private ClusterState clusterState;
    private AllocationService allocationService;

    public void handleClusterStateChange() {
        // 1. handle index creation/deletion
        processIndexOperations();
        // 2. manage shard allocation
        rebalanceShards();
        // 3. handle nodes joining/leaving
        updateNodeMembership();
        // 4. broadcast the cluster state to all nodes
        broadcastClusterState();
    }

    @Override
    public boolean canHandleSearchRequests() {
        return false; // dedicated to cluster management; serves no searches
    }
}

// Data node - data storage
public class DataNode extends Node {
    private Map<ShardId, Shard> localShards;
    private LuceneService luceneService;

    public SearchResponse executeSearch(SearchRequest request) {
        List<ShardSearchResult> results = new ArrayList<>();
        for (ShardId shardId : request.getShardIds()) {
            if (localShards.containsKey(shardId)) {
                Shard shard = localShards.get(shardId);
                ShardSearchResult result = shard.search(request);
                results.add(result);
            }
        }
        return aggregateResults(results);
    }

    @Override
    public boolean canStorePrimaryShards() {
        return true; // may host primary shards
    }
}

// Coordinating node - request coordination
public class CoordinateNode extends Node {
    private LoadBalancer loadBalancer;
    private ResultAggregator aggregator;

    public SearchResponse coordinateSearch(SearchRequest request) {
        // 1. plan the shard routing for this query
        Map<Node, List<ShardId>> shardRouting = planShardRouting(request);
        // 2. send sub-requests to the data nodes in parallel
        Map<Node, Future<SearchResponse>> futures = new HashMap<>();
        for (Map.Entry<Node, List<ShardId>> entry : shardRouting.entrySet()) {
            Node dataNode = entry.getKey();
            SearchRequest shardRequest = buildShardRequest(request, entry.getValue());
            Future<SearchResponse> future = sendSearchRequest(dataNode, shardRequest);
            futures.put(dataNode, future);
        }
        // 3. collect and aggregate the responses
        List<SearchResponse> responses = collectResponses(futures);
        return aggregator.aggregate(responses, request);
    }

    @Override
    public boolean canStoreData() {
        return false; // stores no data; coordination only
    }
}
```
Role combinations
```yaml
# elasticsearch.yml examples (legacy node.* flags; ES 7.9+ uses node.roles)
# Dedicated master node
node.master: true
node.data: false
node.ingest: false
node.ml: false
# Dedicated data node
node.master: false
node.data: true
node.ingest: false
node.ml: false
# Coordinating-only node (every role disabled)
node.master: false
node.data: false
node.ingest: false
node.ml: false
# Mixed node (small clusters)
node.master: true
node.data: true
node.ingest: true
node.ml: false
```
4. Decentralized design
ElasticSearch is decentralized: cluster coordination is built in, with no dependency on an external system such as ZooKeeper. Since 7.x, master election and cluster-state publication use ES's own Raft-style consensus algorithm.
Raft-style consensus (sketch)
```java
public class RaftConsensus {
    private NodeRole currentRole = NodeRole.FOLLOWER;
    private int currentTerm = 0;
    private String votedFor = null;
    private List<LogEntry> log = new ArrayList<>();

    // Leader election
    public void startElection() {
        currentTerm++;
        currentRole = NodeRole.CANDIDATE;
        votedFor = this.nodeId;
        // request votes from every other node
        int votes = 1; // own vote
        for (Node node : clusterNodes) {
            VoteRequest request = new VoteRequest(currentTerm, nodeId,
                getLastLogIndex(), getLastLogTerm());
            VoteResponse response = node.requestVote(request);
            if (response.isVoteGranted()) {
                votes++;
            }
        }
        // a majority of votes makes this node the leader
        if (votes > clusterNodes.size() / 2) {
            becomeLeader();
        } else {
            becomeFollower();
        }
    }

    // Log replication
    public void replicateEntry(ClusterStateUpdate update) {
        if (currentRole != NodeRole.LEADER) {
            throw new IllegalStateException("Only leader can replicate entries");
        }
        LogEntry entry = new LogEntry(currentTerm, update);
        log.add(entry);
        // replicate to all followers
        int replicationCount = 1; // the leader itself
        for (Node follower : followers) {
            AppendEntriesRequest request = new AppendEntriesRequest(
                currentTerm, nodeId, getLastLogIndex() - 1,
                getLastLogTerm(), Arrays.asList(entry), commitIndex);
            AppendEntriesResponse response = follower.appendEntries(request);
            if (response.isSuccess()) {
                replicationCount++;
            }
        }
        // commit once a majority has acknowledged
        if (replicationCount > clusterNodes.size() / 2) {
            commitIndex = log.size() - 1;
            applyToStateMachine(update);
        }
    }
}
```
Cluster discovery
```java
public class ClusterDiscovery {
    private List<String> seedNodes;
    private Map<String, NodeInfo> discoveredNodes;

    // Node discovery: breadth-first, starting from the seed nodes and
    // asking each reachable node which other nodes it knows about
    public void discoverNodes() {
        Queue<String> toDiscover = new LinkedList<>(seedNodes);
        Set<String> discovered = new HashSet<>();
        while (!toDiscover.isEmpty()) {
            String nodeAddress = toDiscover.poll();
            if (discovered.contains(nodeAddress)) {
                continue;
            }
            try {
                // connect and fetch the node's view of cluster membership
                NodeInfo nodeInfo = connectAndGetInfo(nodeAddress);
                discoveredNodes.put(nodeAddress, nodeInfo);
                discovered.add(nodeAddress);
                // enqueue any newly learned nodes
                for (String knownNode : nodeInfo.getKnownNodes()) {
                    if (!discovered.contains(knownNode)) {
                        toDiscover.offer(knownNode);
                    }
                }
            } catch (Exception e) {
                log.warn("Failed to discover node: " + nodeAddress, e);
            }
        }
        // refresh the cluster membership view
        updateClusterMembership(discoveredNodes.values());
    }

    // Failure detection
    @Scheduled(fixedDelay = 1000)
    public void detectFailures() {
        for (Map.Entry<String, NodeInfo> entry : discoveredNodes.entrySet()) {
            String nodeAddress = entry.getKey();
            NodeInfo nodeInfo = entry.getValue();
            try {
                // heartbeat ping
                boolean isAlive = sendHeartbeat(nodeAddress);
                if (!isAlive) {
                    handleNodeFailure(nodeAddress, nodeInfo);
                }
            } catch (Exception e) {
                handleNodeFailure(nodeAddress, nodeInfo);
            }
        }
    }
}
```
🔄 Core Workflows
1. Indexing
```java
public class IndexingWorkflow {
    public IndexResponse index(IndexRequest request) {
        // 1. route to the right shard
        int shardId = calculateShardId(request.getId());
        Shard primaryShard = getPrimaryShard(shardId);
        // 2. index on the primary shard
        Document doc = parseDocument(request.getSource());
        // 2.1 analyze (tokenize) the fields
        Map<String, List<String>> analyzedFields = analyzeDocument(doc);
        // 2.2 update the inverted index
        updateInvertedIndex(analyzedFields, doc.getId());
        // 2.3 store the original document
        storeDocument(doc);
        // 2.4 update Doc Values
        updateDocValues(doc);
        // 3. replicate to the replica shards
        List<Shard> replicas = getReplicaShards(shardId);
        replicateToReplicas(replicas, request);
        // 4. respond
        return new IndexResponse(request.getId(), shardId, "created");
    }
}
```
2. Search
```java
public class SearchWorkflow {
    public SearchResponse search(SearchRequest request) {
        // Phase 1: query phase
        Map<ShardId, ShardSearchResult> queryResults = queryPhase(request);
        // Phase 2: fetch phase
        List<Document> documents = fetchPhase(queryResults, request);
        return buildSearchResponse(documents, queryResults);
    }

    private Map<ShardId, ShardSearchResult> queryPhase(SearchRequest request) {
        Map<ShardId, ShardSearchResult> results = new HashMap<>();
        // query all target shards in parallel
        List<Future<ShardSearchResult>> futures = new ArrayList<>();
        for (ShardId shardId : getTargetShards(request)) {
            Future<ShardSearchResult> future = executor.submit(() -> {
                Shard shard = getShard(shardId);
                // 1. parse the query
                Query luceneQuery = parseQuery(request.getQuery());
                // 2. search; return only doc IDs and scores
                TopDocs topDocs = shard.search(luceneQuery, request.getSize());
                // 3. wrap the result
                return new ShardSearchResult(shardId, topDocs);
            });
            futures.add(future);
        }
        // collect the per-shard results
        for (Future<ShardSearchResult> future : futures) {
            ShardSearchResult result = future.get();
            results.put(result.getShardId(), result);
        }
        return results;
    }

    private List<Document> fetchPhase(Map<ShardId, ShardSearchResult> queryResults,
                                      SearchRequest request) {
        // 1. merge and sort globally, keeping the top-K
        List<ScoreDoc> globalTopDocs = mergeAndSort(queryResults.values(),
            request.getSize());
        // 2. fetch the full documents by doc ID
        List<Document> documents = new ArrayList<>();
        for (ScoreDoc scoreDoc : globalTopDocs) {
            ShardId shardId = getShardId(scoreDoc);
            Shard shard = getShard(shardId);
            // read the full document from Stored Fields
            Document doc = shard.getStoredDocument(scoreDoc.doc);
            documents.add(doc);
        }
        return documents;
    }
}
```
📊 Performance Characteristics
1. Query performance
```text
Time complexity breakdown:
1. Term lookup: ~O(logN)
   - Term Index (FST): O(L), L = length of the looked-up term
   - Term Dictionary: O(1), direct offset access
2. Posting-list traversal: O(K)
   - K = number of documents containing the term
   - skip lists allow jumping over non-matching documents
3. Multi-term merge: O(K1 + K2 + ... + Kn)
   - two-pointer merge of sorted lists
   - implements boolean AND/OR/NOT
4. Result sorting: O(R*logR)
   - R = number of results returned
   - usually R << total doc count, so this stays cheap
Overall query cost: O(logN + K + R*logR)
```
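The two-pointer merge in step 3 can be sketched as an AND intersection of two sorted posting lists; the skip-list jumps Lucene layers on top are omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Both posting lists are sorted by doc ID, so their intersection is a
// single linear pass: O(K1 + K2).
public class PostingIntersection {
    public static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> result = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {          // doc contains both terms
                result.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {    // advance whichever pointer is behind
                i++;
            } else {
                j++;
            }
        }
        return result;
    }
}
```

An OR merge is the same loop emitting every doc ID; NOT emits IDs from `a` that the pass never matches in `b`.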
2. Storage efficiency
```text
Storage footprint (rough figures):
1. Inverted index:
   - Term Dictionary: ~10-30% of the raw text
   - Posting lists: ~20-50% of the raw text
   - Term Index (in memory): ~1-5% of the Term Dictionary
2. Stored Fields:
   - compression ratio: 50-80% (depends on data type and codec)
   - random access pays a decompression cost
3. Doc Values:
   - numeric: 70-90% of raw size (bit packing)
   - strings: 60-85% of raw size (ordinals + dictionary)
4. Overall:
   - raw data: 100%
   - index overhead: 50-100%
   - total: 150-200% of the raw data
```
3. Memory usage
```java
public class MemoryUsageAnalysis {
    // Rough per-component memory estimates
    public MemoryUsage calculateMemoryUsage(IndexStats stats) {
        long termIndexMemory = estimateTermIndexMemory(stats);
        long filterCacheMemory = estimateFilterCacheMemory(stats);
        long fieldDataMemory = estimateFieldDataMemory(stats);
        long segmentMemory = estimateSegmentMemory(stats);
        return new MemoryUsage(termIndexMemory, filterCacheMemory,
            fieldDataMemory, segmentMemory);
    }

    private long estimateTermIndexMemory(IndexStats stats) {
        // Term Index (FST) costs roughly 8-32 bytes per unique term,
        // depending on how well prefixes compress
        return stats.getUniqueTermCount() * 20; // ~20 bytes/term on average
    }

    private long estimateFilterCacheMemory(IndexStats stats) {
        // Filter cache: BitSets for frequently used filters,
        // 1 bit per document, byte-aligned
        return stats.getDocumentCount() / 8 * stats.getCachedFilterCount();
    }

    private long estimateFieldDataMemory(IndexStats stats) {
        // The portion of Doc Values loaded into memory:
        // numeric fields ~8 bytes/doc; strings are variable-length
        long numericMemory = stats.getNumericFieldCount() * stats.getDocumentCount() * 8;
        long stringMemory = stats.getStringFieldSize(); // actual string bytes
        return numericMemory + stringMemory;
    }
}
```
🎯 Summary
Through careful data-structure design and a distributed architecture, ElasticSearch evolves Lucene from a single-node search library into a distributed search engine:
Core data structures:
- Inverted index: ~O(logN) term lookup for efficient keyword search
- Term Index: FST prefix sharing cuts memory use and disk IO
- Stored Fields: row-oriented storage for fast document retrieval
- Doc Values: column-oriented storage for fast aggregation and sorting
Distributed architecture:
- Sharding: horizontal scaling and load balancing
- Replicas: high availability and read/write separation
- Node roles: Master, Data, and Coordinating duties decoupled
- Decentralization: built-in Raft-style consensus, no external dependencies
Performance profile:
- query latency: sub-second full-text search
- storage: 50-100% index overhead, an acceptable space cost
- scalability: near-linear scaling to PB-scale data
ElasticSearch turns the theory of information retrieval into a practical distributed search platform and plays a major role in the modern big-data ecosystem; its design is a valuable reference for distributed-system architecture in general.