🔥Elasticsearch from Beginner to Expert: Core Concepts and Hands-On Walkthrough (Complete Code Included)

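Hi everyone, I'm star-yp. Today let's talk about the heavyweight of the search world: Elasticsearch (ES for short). I spent a long time preparing this article, and I hope it explains the core concepts of ES in plain language, backed by hands-on code, so you can get started as soon as you finish reading!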


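📌 Contents

  1. What exactly is ES?
  2. Core concepts primer
  3. Environment setup
  4. Index operations in practice
  5. Document CRUD operations
  6. Query DSL in depth
  7. Java client in practice
  8. Performance tuning tips
  9. Summary and outlook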

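What exactly is ES?

Honestly, when I first ran into ES I was completely lost. Is this thing a database or a search engine?

At its core, ES is a distributed search and analytics engine built on Lucene. It wraps complex full-text search in a RESTful API, so ordinary developers like us can build a Baidu-style search box without much effort.

Why use ES?

A few examples:

  • Your e-commerce site needs product search
  • Your logging system needs fast retrieval over huge volumes of logs
  • Your analytics need near-real-time statistics

Traditional relational databases (MySQL, Oracle) really struggle with full-text search, and that is exactly what ES was built for!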


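Core concepts primer

1. Cluster

An ES cluster is a collection of one or more nodes that together hold all of your data. Put simply, it's a group of servers huddling together for warmth.

// Check cluster health
GET /_cluster/health

// Response
{
  "cluster_name": "my-application",
  "status": "green",  // green: all shards assigned; yellow: some replicas unassigned; red: some primaries unassigned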
  "timed_out": false,
  "number_of_nodes": 3,
  "number_of_data_nodes": 3,
  "active_primary_shards": 10,
  "active_shards": 20
}

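2. Node

Each server in the cluster is a node. Nodes come in several roles:

  • Master node: manages the cluster
  • Data node: stores data and serves searches
  • Coordinating node: receives client requests, forwards them, and merges the results

3. Index

An index is often compared to a "database" in a relational system; now that mapping types are gone, a table is the closer analogy. It is a collection of similar documents.

// Create an index
PUT /my_index
{
  "settings": {
    "number_of_shards": 3,      // number of primary shards
    "number_of_replicas": 1     // number of replicas per primary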
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "price": {
        "type": "double"
      },
      "created_at": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}

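Note: since ES 6.x an index can hold only one type; 7.x deprecates types (the default is _doc) and 8.x removes them. This is a big pitfall: many old tutorials still describe multiple types per index, so don't get caught out!

4. Document

A document is a single record in an index, represented as JSON.

// Add a document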
POST /my_index/_doc/1
{
  "title": "苹果手机iPhone 15",
  "price": 5999.00,
  "created_at": "2024-01-15 10:30:00"
}

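5. Shard and Replica

Shard: when the data grows large, ES splits an index into several shards so it can scale horizontally.
Replica: a copy of a shard, which improves availability and search throughput.

An analogy:

  • Shards = splitting one book into several volumes
  • Replicas = photocopies of each volume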
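
To make the shard idea concrete, here is a minimal Java sketch of how a document ID could be mapped to a primary shard, and how replicas multiply the number of shard copies. It is a simplification under stated assumptions: Elasticsearch hashes the routing value with Murmur3, while this sketch uses a simple stand-in hash, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Simplified sketch of document-to-shard routing (not the real ES implementation). */
final class ShardRoutingSketch {

    // Stand-in hash for illustration; Elasticsearch itself uses Murmur3 on the routing value.
    static int hash(String routing) {
        int h = 0;
        for (byte b : routing.getBytes(StandardCharsets.UTF_8)) {
            h = 31 * h + (b & 0xff);
        }
        return h;
    }

    // shard = hash(routing) mod number_of_primary_shards
    static int shardFor(String routing, int primaryShards) {
        return Math.floorMod(hash(routing), primaryShards);
    }

    // Each primary has `replicas` copies, so total copies = primaries * (1 + replicas).
    static int totalShardCopies(int primaryShards, int replicas) {
        return primaryShards * (1 + replicas);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"1", "2", "3"}) {
            System.out.println("doc " + id + " -> shard " + shardFor(id, 3));
        }
        System.out.println("total shard copies: " + totalShardCopies(3, 1));
    }
}
```

Because the shard number depends on the primary shard count, changing that count would send existing IDs to different shards, which is why it is fixed when the index is created.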

Environment setup

One-step install with Docker (recommended)

# Pull the image
docker pull docker.elastic.co/elasticsearch/elasticsearch:7.17.0

# Run the container
docker run -d \
  --name es \
  -p 9200:9200 \
  -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  docker.elastic.co/elasticsearch/elasticsearch:7.17.0

Tips

  • -Xms512m -Xmx512m sets the JVM heap; enough for a development environment
  • discovery.type=single-node runs a single node, so no cluster configuration is needed

Install Kibana (visualization tool)

docker run -d \
  --name kibana \
  --link es:elasticsearch \
  -p 5601:5601 \
  docker.elastic.co/kibana/kibana:7.17.0

Open http://localhost:5601 to see the Kibana UI, a great tool for development and debugging!


Index operations in practice

Create an index

# Simplest form
PUT /products

# With settings and mappings
PUT /products_v2
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ik_max_word",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "description": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "price": {
        "type": "double"
      },
      "stock": {
        "type": "integer"
      },
      "category": {
        "type": "keyword"
      },
      "tags": {
        "type": "keyword"
      },
      "is_active": {
        "type": "boolean"
      },
      "created_at": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}

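Code walkthrough

  • number_of_shards: number of primary shards; fixed once the index is created (changing it means reindexing or using the shrink/split APIs)
  • number_of_replicas: number of replicas; can be changed at any time
  • ik_max_word: a Chinese analyzer from the IK plugin, which must be installed separately
  • fields (multi-fields): index one field in several ways, so name supports full-text search while name.keyword supports exact matching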
View an index

# List all indices
GET /_cat/indices?v

# Show the index mapping
GET /products_v2/_mapping

# Show the index settings
GET /products_v2/_settings

Modify an index

# Change the replica count (can be changed dynamically)
PUT /products_v2/_settings
{
  "number_of_replicas": 2
}

# Add a field to the mapping
PUT /products_v2/_mapping
{
  "properties": {
    "seller": {
      "type": "keyword"
    }
  }
}

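Delete an index

# Delete a single index
DELETE /products

# Delete several indices
DELETE /products_v2,logs_2024

# Delete all indices (dangerous! rejected when action.destructive_requires_name is true, the default in 8.x)
DELETE /_all

⚠️ Warning: deleting an index is irreversible and all of its data is lost! Be extremely careful in production!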


Document CRUD operations

Create

# Option 1: specify the ID (replaces the whole document if the ID already exists)
PUT /products_v2/_doc/1
{
  "name": "MacBook Pro 16英寸",
  "description": "苹果笔记本电脑,性能强劲,适合开发",
  "price": 19999.00,
  "stock": 10,
  "category": "电脑",
  "tags": ["笔记本", "苹果", "高性能"],
  "is_active": true,
  "created_at": "2024-01-15 14:30:00"
}

# Option 2: let ES generate the ID
POST /products_v2/_doc
{
  "name": "iPhone 15 Pro",
  "description": "最新款苹果手机",
  "price": 7999.00,
  "stock": 50,
  "category": "手机",
  "tags": ["手机", "苹果", "5G"],
  "is_active": true,
  "created_at": "2024-01-16 09:15:00"
}

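Understanding the response

{
  "_index": "products_v2",        // index name
  "_type": "_doc",                // document type (always _doc in 7.x)
  "_id": "1",                     // document ID
  "_version": 1,                  // version, incremented on every write
  "result": "created",            // created / updated / deleted
  "_shards": {
    "total": 2,                   // shard copies the write targets (primary + replicas)
    "successful": 1,              // copies that acknowledged the write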
    "failed": 0
  },
  "_seq_no": 0,                   // sequence number, used with _primary_term for concurrency control
  "_primary_term": 1
}
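
The _seq_no and _primary_term values are what optimistic concurrency control builds on: a write can carry the if_seq_no and if_primary_term request parameters, and ES rejects it with a version conflict (HTTP 409) if the document has changed since those values were read. A minimal sketch that only builds the request path; the helper name is hypothetical and the ID is assumed to be URL-safe:

```java
/** Sketch: build a conditional index path for optimistic concurrency control. */
final class ConditionalWritePath {

    // seqNo and primaryTerm come from the last read or write response for this document.
    static String indexPath(String index, String id, long seqNo, long primaryTerm) {
        return "/" + index + "/_doc/" + id
                + "?if_seq_no=" + seqNo
                + "&if_primary_term=" + primaryTerm;
    }

    public static void main(String[] args) {
        System.out.println("PUT " + indexPath("products_v2", "1", 0, 1));
    }
}
```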

Read

# Get by ID
GET /products_v2/_doc/1

# Get several documents at once
GET /products_v2/_mget
{
  "ids": ["1", "2", "3"]
}

# Check whether a document exists
HEAD /products_v2/_doc/1

Update

# Option 1: full update (overwrites the entire document)
PUT /products_v2/_doc/1
{
  "name": "MacBook Pro 16英寸 M3",
  "description": "苹果笔记本电脑,性能强劲,适合开发",
  "price": 21999.00,
  "stock": 8,
  "category": "电脑",
  "tags": ["笔记本", "苹果", "高性能"],
  "is_active": true,
  "created_at": "2024-01-15 14:30:00"
}

# Option 2: partial update (recommended)
POST /products_v2/_update/1
{
  "doc": {
    "price": 18999.00,
    "stock": 15
  }
}

# Option 3: scripted update
POST /products_v2/_update/1
{
  "script": {
    "source": "ctx._source.stock += params.count",
    "params": {
      "count": 5
    }
  }
}

Delete

# Delete a single document
DELETE /products_v2/_doc/1

# Delete by query
POST /products_v2/_delete_by_query
{
  "query": {
    "match": {
      "category": "手机"
    }
  }
}

Bulk operations

POST /_bulk
{ "index": { "_index": "products_v2", "_id": "3" } }
{ "name": "华为Mate60", "price": 4999.00, "category": "手机" }
{ "update": { "_index": "products_v2", "_id": "1" } }
{ "doc": { "stock": 20 } }
{ "delete": { "_index": "products_v2", "_id": "2" } }

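Notes

  • Each action or document is a single line of JSON, with no line breaks inside it
  • The body must end with a newline
  • Bulk requests are not atomic: each action succeeds or fails on its own, so one failure does not roll back the others; check the per-item results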
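
The formatting rules above can be sketched in plain Java. This is an illustration rather than a client implementation: the class and method names are hypothetical, the JSON lines are assumed to be serialized already, and treating every status of 300 or above as a failure is a simplification.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: assemble a _bulk request body and count per-item failures. */
final class BulkBodySketch {

    // Joins single-line JSON strings into NDJSON; every line, including the last, ends with '\n'.
    static String toNdjson(List<String> jsonLines) {
        StringBuilder body = new StringBuilder();
        for (String line : jsonLines) {
            if (line.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("bulk lines must not contain line breaks");
            }
            body.append(line).append('\n');
        }
        return body.toString();
    }

    // Each bulk item reports its own HTTP-style status; count the ones that did not succeed.
    static int countFailures(int[] itemStatuses) {
        int failures = 0;
        for (int status : itemStatuses) {
            if (status >= 300) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        String body = toNdjson(Arrays.asList(
                "{\"update\":{\"_index\":\"products_v2\",\"_id\":\"1\"}}",
                "{\"doc\":{\"stock\":20}}",
                "{\"delete\":{\"_index\":\"products_v2\",\"_id\":\"2\"}}"));
        System.out.print(body);
        // Hypothetical per-item statuses: the update succeeded, the second item hit a conflict
        System.out.println("failed items: " + countFailures(new int[] {200, 409}));
    }
}
```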

Query DSL in depth

ES has very powerful query capabilities; I'll focus on the essentials here.

1. Basic queries

# Match all documents
GET /products_v2/_search
{
  "query": {
    "match_all": {}
  }
}

# Paginated query
GET /products_v2/_search
{
  "query": {
    "match_all": {}
  },
  "from": 0,      // offset of the first hit
  "size": 10,     // hits per page
  "sort": [
    { "price": { "order": "desc" } },
    { "created_at": { "order": "desc" } }
  ]
}

# Return selected fields only
GET /products_v2/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["name", "price", "category"],
  "size": 5
}

2. Full-text queries

# match query (the query text is analyzed)
GET /products_v2/_search
{
  "query": {
    "match": {
      "name": "苹果电脑"
    }
  }
}

# match_phrase query (phrase match: the terms must appear in the same order)
GET /products_v2/_search
{
  "query": {
    "match_phrase": {
      "name": "MacBook Pro"
    }
  }
}

# multi_match query (search several fields)
GET /products_v2/_search
{
  "query": {
    "multi_match": {
      "query": "苹果",
      "fields": ["name", "description", "tags"]
    }
  }
}

3. Exact-value queries

# term query (exact match; the query value is not analyzed)
GET /products_v2/_search
{
  "query": {
    "term": {
      "category": "电脑"
    }
  }
}

# terms query (match any of several values)
GET /products_v2/_search
{
  "query": {
    "terms": {
      "category": ["电脑", "手机"]
    }
  }
}

# range query
GET /products_v2/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 1000,
        "lte": 10000
      }
    }
  }
}

4. Compound queries

# bool query (combines clauses)
GET /products_v2/_search
{
  "query": {
    "bool": {
      "must": [                     // must match; contributes to the score
        { "match": { "name": "苹果" } }
      ],
      "filter": [                   // must match; no scoring
        { "range": { "price": { "gte": 5000 } } },
        { "term": { "is_active": true } }
      ],
      "should": [                   // optional; matching raises the score
        { "match": { "tags": "5G" } }
      ],
      "must_not": [                 // must not match
        { "term": { "category": "平板" } }
      ]
    }
  }
}

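must vs filter

  • must: contributes to the relevance score (_score) and so affects ranking
  • filter: skips scoring and can be cached, so it is faster

Recommendation: put exact matches and range conditions in filter, and full-text search in must.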
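
One way to keep that split consistent in application code is a small builder that always routes full-text clauses to must and exact or range clauses to filter. The sketch below builds the JSON by hand for illustration only; the class and method names are hypothetical, and it assumes field names and values contain no characters that need JSON escaping.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: route scoring clauses to must and non-scoring clauses to filter. */
final class BoolQuerySketch {
    private final List<String> must = new ArrayList<>();
    private final List<String> filter = new ArrayList<>();

    // Full-text condition: relevance matters, so it goes into must.
    BoolQuerySketch fullText(String field, String text) {
        must.add("{\"match\":{\"" + field + "\":\"" + text + "\"}}");
        return this;
    }

    // Exact-value condition: yes/no only, so it goes into filter.
    BoolQuerySketch exact(String field, String value) {
        filter.add("{\"term\":{\"" + field + "\":\"" + value + "\"}}");
        return this;
    }

    // Lower-bound range condition: also a filter.
    BoolQuerySketch atLeast(String field, long min) {
        filter.add("{\"range\":{\"" + field + "\":{\"gte\":" + min + "}}}");
        return this;
    }

    String toJson() {
        return "{\"query\":{\"bool\":{\"must\":[" + String.join(",", must)
                + "],\"filter\":[" + String.join(",", filter) + "]}}}";
    }

    public static void main(String[] args) {
        System.out.println(new BoolQuerySketch()
                .fullText("name", "apple")
                .exact("category", "laptop")
                .atLeast("price", 5000)
                .toJson());
    }
}
```

In a real project a JSON library or the client's own query builders would handle escaping; the point here is only where each kind of clause ends up.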

5. Aggregations

# Count products per category
GET /products_v2/_search
{
  "size": 0,    // return no documents, only the aggregation results
  "aggs": {
    "category_count": {
      "terms": {
        "field": "category",
        "size": 10
      }
    }
  }
}

# Bucket products by price range
GET /products_v2/_search
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 1000 },
          { "from": 1000, "to": 5000 },
          { "from": 5000 }
        ]
      }
    }
  }
}

# Nested aggregation: average price per category
GET /products_v2/_search
{
  "size": 0,
  "aggs": {
    "category_stats": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

6. Highlighting

GET /products_v2/_search
{
  "query": {
    "match": {
      "name": "苹果"
    }
  },
  "highlight": {
    "fields": {
      "name": {
        "pre_tags": ["<em class='highlight'>"],
        "post_tags": ["</em>"]
      }
    }
  }
}

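Java client in practice

In real projects we drive ES from code, so here is a walkthrough in Java. The examples use the High Level REST Client 7.17; Elastic deprecated it in 7.15 in favor of the Java API Client, but it still works against 7.x clusters.

1. Add the dependencies

<!-- Elasticsearch Java High Level REST Client -->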
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.17.0</version>
</dependency>

<!-- If your build (e.g. Spring Boot dependency management) pulls in a different version, pin the elasticsearch artifact as well -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.17.0</version>
</dependency>

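2. Create the client

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;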

@Configuration
public class ElasticsearchConfig {
    
    @Bean
    public RestHighLevelClient client() {
        // Cluster addresses
        HttpHost[] httpHosts = new HttpHost[]{
            new HttpHost("localhost", 9200, "http"),
            // For a multi-node cluster, list more nodes here
            // new HttpHost("localhost", 9201, "http")
        };
        
        // Build the client
        RestClientBuilder builder = RestClient.builder(httpHosts);
        
        // Timeouts
        builder.setRequestConfigCallback(requestConfigBuilder -> {
            requestConfigBuilder.setConnectTimeout(5000);  // connect timeout: 5 s
            requestConfigBuilder.setSocketTimeout(60000);   // socket (response) timeout: 60 s
            return requestConfigBuilder;
        });
        
        // Connection pool limits
        builder.setHttpClientConfigCallback(httpClientBuilder -> {
            httpClientBuilder.setMaxConnTotal(100);        // max total connections
            httpClientBuilder.setMaxConnPerRoute(30);      // max connections per route
            return httpClientBuilder;
        });
        
        return new RestHighLevelClient(builder);
    }
}

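3. Index operations

import java.io.IOException;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.xcontent.XContentBuilder;   // org.elasticsearch.common.xcontent before 7.16
import org.elasticsearch.xcontent.XContentFactory;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class IndexService {

    // SLF4J logger (assumed); the methods below call `log`, which the original snippet never declared
    private static final Logger log = LoggerFactory.getLogger(IndexService.class);
    
    @Autowired
    private RestHighLevelClient client;
    
    /**
     * Create an index
     */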
    public boolean createIndex(String indexName) {
        try {
            // Index settings
            Settings settings = Settings.builder()
                .put("index.number_of_shards", 3)
                .put("index.number_of_replicas", 1)
                .build();
            
            // Mappings
            XContentBuilder mapping = XContentFactory.jsonBuilder()
                .startObject()
                    .startObject("properties")
                        .startObject("name")
                            .field("type", "text")
                            .field("analyzer", "ik_max_word")
                            .startObject("fields")
                                .startObject("keyword")
                                    .field("type", "keyword")
                                .endObject()
                            .endObject()
                        .endObject()
                        .startObject("price")
                            .field("type", "double")
                        .endObject()
                        .startObject("category")
                            .field("type", "keyword")
                        .endObject()
                        .startObject("created_at")
                            .field("type", "date")
                            .field("format", "yyyy-MM-dd HH:mm:ss")
                        .endObject()
                    .endObject()
                .endObject();
            
            CreateIndexRequest request = new CreateIndexRequest(indexName)
                .settings(settings)
                .mapping(mapping);
            
            CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
            return response.isAcknowledged();
            
        } catch (IOException e) {
            log.error("创建索引失败: {}", e.getMessage(), e);
            return false;
        }
    }
    
    /**
     * Check whether an index exists
     */
    public boolean indexExists(String indexName) {
        try {
            GetIndexRequest request = new GetIndexRequest(indexName);
            return client.indices().exists(request, RequestOptions.DEFAULT);
        } catch (IOException e) {
            log.error("检查索引失败: {}", e.getMessage(), e);
            return false;
        }
    }
    
    /**
     * Delete an index
     */
    public boolean deleteIndex(String indexName) {
        try {
            DeleteIndexRequest request = new DeleteIndexRequest(indexName);
            AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
            return response.isAcknowledged();
        } catch (IOException e) {
            log.error("删除索引失败: {}", e.getMessage(), e);
            return false;
        }
    }
}

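4. Document operations

import com.alibaba.fastjson.JSON;   // fastjson 1.x, assumed from the JSON.* calls below
import java.io.IOException;
import java.util.List;
import java.util.Map;
import org.elasticsearch.action.DocWriteResponse;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.xcontent.XContentType;   // org.elasticsearch.common.xcontent before 7.16
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class DocumentService {

    // SLF4J logger (assumed); the original snippet used `log` without declaring it
    private static final Logger log = LoggerFactory.getLogger(DocumentService.class);
    
    @Autowired
    private RestHighLevelClient client;
    
    /**
     * Add a document
     */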
    public String addDocument(String index, String id, Object document) {
        try {
            // Serialize the object to JSON, using the date format from the mapping
            String json = JSON.toJSONStringWithDateFormat(document, "yyyy-MM-dd HH:mm:ss");
            
            IndexRequest request = new IndexRequest(index)
                .id(id)
                .source(json, XContentType.JSON);
            
            IndexResponse response = client.index(request, RequestOptions.DEFAULT);
            return response.getId();
            
        } catch (IOException e) {
            log.error("添加文档失败: {}", e.getMessage(), e);
            throw new RuntimeException("添加文档失败", e);
        }
    }
    
    /**
     * Add documents in bulk
     */
    public boolean bulkAddDocuments(String index, List<?> documents) {
        try {
            BulkRequest bulkRequest = new BulkRequest();
            
            for (Object doc : documents) {
                String json = JSON.toJSONStringWithDateFormat(doc, "yyyy-MM-dd HH:mm:ss");
                IndexRequest request = new IndexRequest(index)
                    .source(json, XContentType.JSON);
                bulkRequest.add(request);
            }
            
            BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
            
            // Bulk is not atomic: check whether any item failed
            if (response.hasFailures()) {
                log.error("批量添加文档失败: {}", response.buildFailureMessage());
                return false;
            }
            
            return true;
            
        } catch (IOException e) {
            log.error("批量添加文档失败: {}", e.getMessage(), e);
            return false;
        }
    }
    
    /**
     * Get a document by ID
     */
    public <T> T getDocumentById(String index, String id, Class<T> clazz) {
        try {
            GetRequest request = new GetRequest(index, id);
            GetResponse response = client.get(request, RequestOptions.DEFAULT);
            
            if (response.isExists()) {
                String json = response.getSourceAsString();
                return JSON.parseObject(json, clazz);
            }
            
            return null;
            
        } catch (IOException e) {
            log.error("查询文档失败: {}", e.getMessage(), e);
            throw new RuntimeException("查询文档失败", e);
        }
    }
    
    /**
     * Partially update a document (returns false when nothing changed)
     */
    public boolean updateDocument(String index, String id, Map<String, Object> updates) {
        try {
            UpdateRequest request = new UpdateRequest(index, id)
                .doc(updates);
            
            UpdateResponse response = client.update(request, RequestOptions.DEFAULT);
            return response.getResult() == DocWriteResponse.Result.UPDATED;
            
        } catch (IOException e) {
            log.error("更新文档失败: {}", e.getMessage(), e);
            return false;
        }
    }
    
    /**
     * Delete a document
     */
    public boolean deleteDocument(String index, String id) {
        try {
            DeleteRequest request = new DeleteRequest(index, id);
            DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);
            return response.getResult() == DocWriteResponse.Result.DELETED;
            
        } catch (IOException e) {
            log.error("删除文档失败: {}", e.getMessage(), e);
            return false;
        }
    }
}

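5. Search operations

import com.alibaba.fastjson.JSON;   // fastjson 1.x, assumed from the JSON.* calls below
import java.io.IOException;
import java.util.*;
import org.apache.commons.collections4.CollectionUtils;   // assumed source of isNotEmpty
import org.apache.commons.lang3.StringUtils;              // assumed source of isNotBlank
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.*;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.*;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.*;
import org.elasticsearch.search.sort.SortOrder;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

// SearchResult and ProductSearchParam are the project's own classes (not shown in this article)
@Service
public class SearchService {

    // SLF4J logger (assumed); the original snippet used `log` without declaring it
    private static final Logger log = LoggerFactory.getLogger(SearchService.class);
    
    @Autowired
    private RestHighLevelClient client;
    
    /**
     * Full-text search
     */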
    public <T> SearchResult<T> search(String index, String keyword, 
                                       int pageNum, int pageSize, 
                                       Class<T> clazz) {
        try {
            SearchRequest request = new SearchRequest(index);
            
            // Build the query
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            
            // Match across several fields
            MultiMatchQueryBuilder queryBuilder = QueryBuilders
                .multiMatchQuery(keyword, "name", "description", "tags")
                .type(MultiMatchQueryBuilder.Type.BEST_FIELDS);
            
            sourceBuilder.query(queryBuilder);
            
            // Pagination
            sourceBuilder.from((pageNum - 1) * pageSize);
            sourceBuilder.size(pageSize);
            
            // Sorting
            sourceBuilder.sort("price", SortOrder.DESC);
            
            // Highlighting
            HighlightBuilder highlightBuilder = new HighlightBuilder();
            highlightBuilder.field("name");
            highlightBuilder.preTags("<em class='highlight'>");
            highlightBuilder.postTags("</em>");
            sourceBuilder.highlighter(highlightBuilder);
            
            request.source(sourceBuilder);
            
            // Run the search
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            
            // Parse the hits
            SearchHits hits = response.getHits();
            List<T> data = new ArrayList<>();
            
            for (SearchHit hit : hits.getHits()) {
                String json = hit.getSourceAsString();
                T obj = JSON.parseObject(json, clazz);
                
                // Handle highlighting
                Map<String, HighlightField> highlightFields = hit.getHighlightFields();
                if (highlightFields.containsKey("name")) {
                    String highlightName = highlightFields.get("name").fragments()[0].string();
                    // highlightName is not applied here; copy it onto obj (e.g. via reflection) if the UI needs it
                }
                
                data.add(obj);
            }
            
            return new SearchResult<>(
                hits.getTotalHits().value,
                pageNum,
                pageSize,
                data
            );
            
        } catch (IOException e) {
            log.error("搜索失败: {}", e.getMessage(), e);
            throw new RuntimeException("搜索失败", e);
        }
    }
    
    /**
     * Aggregation: document count per category
     */
    public Map<String, Long> aggregateCategoryCount(String index) {
        try {
            SearchRequest request = new SearchRequest(index);
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            
            // Terms aggregation
            TermsAggregationBuilder aggregation = AggregationBuilders
                .terms("category_count")
                .field("category")
                .size(10);
            
            sourceBuilder.aggregation(aggregation);
            sourceBuilder.size(0);
            request.source(sourceBuilder);
            
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            
            // Parse the aggregation buckets
            Terms terms = response.getAggregations().get("category_count");
            Map<String, Long> result = new HashMap<>();
            
            for (Terms.Bucket bucket : terms.getBuckets()) {
                result.put(bucket.getKeyAsString(), bucket.getDocCount());
            }
            
            return result;
            
        } catch (IOException e) {
            log.error("聚合查询失败: {}", e.getMessage(), e);
            throw new RuntimeException("聚合查询失败", e);
        }
    }
    
    /**
     * A more complex query example
     */
    public <T> SearchResult<T> complexSearch(ProductSearchParam param, Class<T> clazz) {
        try {
            SearchRequest request = new SearchRequest(param.getIndex());
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            
            // Bool query
            BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
            
            // Keyword search (scored, so it goes into must)
            if (StringUtils.isNotBlank(param.getKeyword())) {
                boolQuery.must(QueryBuilders
                    .multiMatchQuery(param.getKeyword(), "name", "description")
                    .type(MultiMatchQueryBuilder.Type.BEST_FIELDS));
            }
            
            // Category filter
            if (StringUtils.isNotBlank(param.getCategory())) {
                boolQuery.filter(QueryBuilders.termQuery("category", param.getCategory()));
            }
            
            // Price range
            if (param.getMinPrice() != null) {
                boolQuery.filter(QueryBuilders.rangeQuery("price").gte(param.getMinPrice()));
            }
            if (param.getMaxPrice() != null) {
                boolQuery.filter(QueryBuilders.rangeQuery("price").lte(param.getMaxPrice()));
            }
            
            // Tag filter
            if (CollectionUtils.isNotEmpty(param.getTags())) {
                boolQuery.filter(QueryBuilders.termsQuery("tags", param.getTags()));
            }
            
            sourceBuilder.query(boolQuery);
            
            // Pagination
            sourceBuilder.from((param.getPageNum() - 1) * param.getPageSize());
            sourceBuilder.size(param.getPageSize());
            
            // Sorting
            if (StringUtils.isNotBlank(param.getSortField())) {
                SortOrder sortOrder = "desc".equalsIgnoreCase(param.getSortOrder()) 
                    ? SortOrder.DESC : SortOrder.ASC;
                sourceBuilder.sort(param.getSortField(), sortOrder);
            }
            
            request.source(sourceBuilder);
            
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            
            // Parse the hits
            SearchHits hits = response.getHits();
            List<T> data = new ArrayList<>();
            
            for (SearchHit hit : hits.getHits()) {
                data.add(JSON.parseObject(hit.getSourceAsString(), clazz));
            }
            
            return new SearchResult<>(
                hits.getTotalHits().value,
                param.getPageNum(),
                param.getPageSize(),
                data
            );
            
        } catch (IOException e) {
            log.error("复杂查询失败: {}", e.getMessage(), e);
            throw new RuntimeException("查询失败", e);
        }
    }
}

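6. A utility wrapper

// Imports as in SearchService and DocumentService, plus org.elasticsearch.index.query.QueryBuilder
// and org.elasticsearch.search.sort.SortBuilder.
// PageResult and DocOperation are the project's own classes (not shown in this article).
@Component
public class ElasticsearchUtil {
    
    @Autowired
    private RestHighLevelClient client;
    
    /**
     * Generic paginated query
     */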
    public <T> PageResult<T> pageQuery(String index, 
                                        QueryBuilder queryBuilder,
                                        int pageNum, 
                                        int pageSize,
                                        List<SortBuilder<?>> sorts,
                                        Class<T> clazz) {
        try {
            SearchRequest request = new SearchRequest(index);
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            
            sourceBuilder.query(queryBuilder);
            sourceBuilder.from((pageNum - 1) * pageSize);
            sourceBuilder.size(pageSize);
            
            // Sorting
            if (CollectionUtils.isNotEmpty(sorts)) {
                sorts.forEach(sourceBuilder::sort);
            }
            
            request.source(sourceBuilder);
            
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            SearchHits hits = response.getHits();
            
            List<T> data = new ArrayList<>();
            for (SearchHit hit : hits.getHits()) {
                data.add(JSON.parseObject(hit.getSourceAsString(), clazz));
            }
            
            return new PageResult<>(
                hits.getTotalHits().value,
                pageNum,
                pageSize,
                data
            );
            
        } catch (IOException e) {
            throw new RuntimeException("查询失败", e);
        }
    }
    
    /**
     * Mixed bulk operations
     */
    public boolean bulkOperation(String index, List<DocOperation> operations) {
        try {
            BulkRequest bulkRequest = new BulkRequest();
            
            for (DocOperation op : operations) {
                switch (op.getType()) {
                    case INDEX:
                        IndexRequest indexRequest = new IndexRequest(index)
                            .id(op.getId())
                            .source(op.getData(), XContentType.JSON);
                        bulkRequest.add(indexRequest);
                        break;
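                    case UPDATE:
                        // Pass the content type as for INDEX; doc(Object...) would reject a single JSON string
                        UpdateRequest updateRequest = new UpdateRequest(index, op.getId())
                            .doc(op.getData(), XContentType.JSON);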
                        bulkRequest.add(updateRequest);
                        break;
                    case DELETE:
                        DeleteRequest deleteRequest = new DeleteRequest(index, op.getId());
                        bulkRequest.add(deleteRequest);
                        break;
                }
            }
            
            BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
            return !response.hasFailures();
            
        } catch (IOException e) {
            throw new RuntimeException("批量操作失败", e);
        }
    }
}

Performance tuning tips

1. Index design

// Choose the shard count deliberately
PUT /my_index
{
  "settings": {
    "number_of_shards": 3,    // depends on data volume and node count
    "number_of_replicas": 1   // at least 1 replica for high availability
  }
}

// Turn off features you don't need (set at index creation; combine with the settings above in one request)
PUT /my_index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "index": true,          // indexed for search
        "store": false          // not stored separately (_source still holds the value)
      },
      "user_id": {
        "type": "keyword",
        "index": false          // not searchable; still returned in _source
      }
    }
  }
}

2. Query optimization

# Use filter instead of must (for exact-match conditions)
GET /products/_search
{
  "query": {
    "bool": {
      "filter": [                   // filter context: no scoring
        { "term": { "category": "电脑" } },
        { "range": { "price": { "gte": 5000 } } }
      ]
    }
  }
}

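# Avoid deep pagination; use search_after (or scroll for bulk exports)
GET /products/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    { "_id": "desc" }              // a unique tiebreaker is required; on large indices prefer a dedicated keyword field over _id
  ],
  "size": 10,
  "search_after": ["100"],         // sort values of the last hit on the previous page
  "track_total_hits": false        // skip the total count for better performance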
}
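
The reason deep pagination hurts is that from + size is capped by index.max_result_window, which defaults to 10000, and every page forces each shard to collect from + size hits. A small Java sketch of the offset arithmetic used throughout this article, with a check for when to switch to search_after (the class and method names are hypothetical):

```java
/** Sketch: page offset arithmetic and the from/size limit. */
final class PagingSketch {

    // Default value of the index.max_result_window setting.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // 1-based page number to 0-based offset, as in sourceBuilder.from(...).
    static int from(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        return (pageNum - 1) * pageSize;
    }

    // True while from + size stays within the default window; otherwise use search_after.
    static boolean fitsFromSize(int pageNum, int pageSize) {
        return (long) from(pageNum, pageSize) + pageSize <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println("page 3 starts at offset " + from(3, 10));
        System.out.println("page 1001 fits from/size: " + fitsFromSize(1001, 10));
    }
}
```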

3. Write optimization

# Bulk writes
POST /_bulk
{ "index": { "_index": "products", "_id": "1" } }
{ "name": "商品1", "price": 100 }
{ "index": { "_index": "products", "_id": "2" } }
{ "name": "商品2", "price": 200 }

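# Raise the refresh interval (during bulk imports)
PUT /products/_settings
{
  "refresh_interval": "30s"        // default is 1s; a larger value improves write throughput
}

# Restore it once the import is done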
PUT /products/_settings
{
  "refresh_interval": "1s"
}

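4. Memory tuning

# config/jvm.options (or a file under config/jvm.options.d/), not elasticsearch.yml
# Set the JVM heap: at most 50% of physical RAM, and below about 32 GB so compressed object pointers stay enabled
-Xms8g
-Xmx8g

# Use the G1 garbage collector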
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
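
The heap rule of thumb can be written down as a tiny calculation. This is a sketch under assumptions: the 31 GB ceiling is a commonly used safe value below the 32 GB limit mentioned above, and the class and method names are hypothetical.

```java
/** Sketch: derive -Xms/-Xmx from physical RAM using the 50% / below-32GB rule of thumb. */
final class HeapSizingSketch {

    // Assumed safe ceiling that stays below the 32 GB limit.
    static final int HEAP_CEILING_GB = 31;

    static int recommendedHeapGb(int physicalRamGb) {
        int half = physicalRamGb / 2;
        return Math.max(1, Math.min(half, HEAP_CEILING_GB));
    }

    // Min and max heap are set to the same value, as in the options above.
    static String[] jvmOptions(int physicalRamGb) {
        int heap = recommendedHeapGb(physicalRamGb);
        return new String[] {"-Xms" + heap + "g", "-Xmx" + heap + "g"};
    }

    public static void main(String[] args) {
        for (String option : jvmOptions(16)) {
            System.out.println(option);
        }
    }
}
```

The remaining memory is not wasted: the operating system uses it for the file system cache, which Lucene relies on heavily.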

5. Common monitoring commands

# Cluster health
GET /_cluster/health

# Node status
GET /_cat/nodes?v

# Index status
GET /_cat/indices?v

# Shard allocation
GET /_cat/shards?v

# Thread pools
GET /_cat/thread_pool?v

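Summary and outlook

That's it for today. To recap the key points:

Core concepts

  • Cluster: a group of one or more nodes
  • Index: a collection of documents, comparable to a database
  • Document: a single record
  • Shard: a slice of an index, enabling horizontal scaling
  • Replica: a copy of a shard, improving availability

Best practices

  1. Index design: plan the shard count up front and define mappings deliberately
  2. Query optimization: use filter wherever scoring isn't needed, and avoid deep pagination
  3. Bulk operations: batch writes and deletes for better throughput
  4. Monitoring and alerting: watch cluster health and node status

FAQ

Q1: How many shards should I configure?

  • Keep a single shard under roughly 50 GB
  • Shard count ≈ total data size / 50 GB, rounded up
  • Ideally the shard count equals the number of data nodes, or is a multiple of it, so shards spread evenly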
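
The sizing rules above can be combined into one small calculation. In this sketch, rounding the size-based count up to a multiple of the data node count is my own way of reconciling the rules, not an official formula, and the names are hypothetical.

```java
/** Sketch: estimate a primary shard count from data size and node count. */
final class ShardCountSketch {

    static int primaryShards(long totalDataGb, int maxShardGb, int dataNodes) {
        if (maxShardGb < 1 || dataNodes < 1) {
            throw new IllegalArgumentException("maxShardGb and dataNodes must be >= 1");
        }
        // Ceiling division: enough shards that none exceeds maxShardGb.
        int bySize = (int) Math.max(1, (totalDataGb + maxShardGb - 1) / maxShardGb);
        // Round up to a multiple of the data node count so shards spread evenly.
        int remainder = bySize % dataNodes;
        return remainder == 0 ? bySize : bySize + (dataNodes - remainder);
    }

    public static void main(String[] args) {
        System.out.println("400 GB on 3 data nodes -> " + primaryShards(400, 50, 3) + " primary shards");
    }
}
```

Treat the result as a starting point; expected growth and query load should also feed into the final number.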

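Q2: How do I keep ES in sync with MySQL?

  • Dual write: the application writes to both stores (near real-time, but it must handle partial failures itself)
  • Asynchronous: use an MQ or Canal (eventual consistency)
  • Scheduled job: sync periodically (not real-time)

Q3: What if data gets lost?

  • Keep the replica count >= 1
  • Take regular snapshot backups
  • Monitor cluster status and alert promptly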


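💬 Closing thoughts

ES really is a powerful tool, but using it well takes work. My suggestions:

  1. Get a demo running first and understand the core concepts
  2. Practice against real business scenarios
  3. Pay attention to performance tuning and monitoring
  4. Read the official documentation and community write-ups

If this article helped you, remember to like 👍, bookmark ⭐, and comment 💬! Leave a message if you have questions and I'll reply to each one!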

