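🔥 Elasticsearch from Beginner to Proficient: Core Concepts and Hands-On Walkthrough (with Complete Code)
Hi everyone, I'm star-yp. Today let's talk about the heavyweight of the search world: Elasticsearch (ES for short). I spent a long time preparing this article, and I hope to explain the core concepts of ES in plain language, backed by hands-on code, so that you can get started right after reading!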
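What exactly is ES?
To be honest, when I first met ES I was completely lost. Is this thing a database or a search engine?
In essence, ES is a distributed search and analytics engine built on Lucene. It wraps complex full-text search behind a RESTful API, so ordinary developers like us can build a "Baidu it" style search experience with little effort.
Why use ES?
A few examples 🌰:
- Your e-commerce site needs product search
- Your logging system needs to search huge volumes of logs quickly
- Your data analysis needs real-time statistics
Traditional relational databases (MySQL, Oracle) really struggle with full-text search, and that is exactly what ES was built for!
Core concepts primer
1. Cluster
An ES cluster is a set of one or more nodes that together hold all of the data. Put simply, it is a bunch of servers huddling together for warmth.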
json
// Check cluster health
GET /_cluster/health
// Response
{
"cluster_name": "my-application",
"status": "green", // green (all shards assigned), yellow (all primaries assigned, some replicas not), red (some primaries unassigned)
"timed_out": false,
"number_of_nodes": 3,
"number_of_data_nodes": 3,
"active_primary_shards": 10,
"active_shards": 20
}
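2. Node
Each server in the cluster is a node. Nodes come in several types:
- Master node: manages the cluster
- Data node: stores and searches data
- Coordinating node: receives client requests, forwards them to the right shards, and merges the results
3. Index
You can think of an index as the rough counterpart of a "database" in a relational system (and, now that an index holds a single type, in practice it behaves more like a table). It is a collection of similar documents.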
json
// Create an index
PUT /my_index
{
"settings": {
"number_of_shards": 3, // number of primary shards
"number_of_replicas": 1 // number of replicas per primary
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word"
},
"price": {
"type": "double"
},
"created_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
}
}
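Note: since ES 6.x an index can hold only one mapping type, and in 7.x types are deprecated (the typeless APIs use _doc); 8.x removes them. This is a big pitfall: many old tutorials still teach multiple types, so don't be misled!
4. Document
A document is one record in an index, represented as JSON.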
json
// Add a document
POST /my_index/_doc/1
{
"title": "苹果手机iPhone 15",
"price": 5999.00,
"created_at": "2024-01-15 10:30:00"
}
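5. Shards and Replicas
Shard: when the data grows large, ES splits an index into multiple shards for storage, which is how it scales horizontally.
Replica: a copy of a shard, which improves availability and query throughput.
An analogy:
- Shards = splitting one book into several volumes
- Replicas = a photocopy of each volume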
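To make the analogy concrete, here is a minimal Java sketch of how primaries and replicas add up for one index. The class and method names are made up for illustration and are not part of any Elasticsearch API.

```java
/** Illustrative helper (not an Elasticsearch API): counts shard copies for one index. */
public class ShardMath {

    /** Total shard copies = primaries * (1 + replicas per primary). */
    public static int totalShardCopies(int primaryShards, int replicasPerPrimary) {
        if (primaryShards < 1 || replicasPerPrimary < 0) {
            throw new IllegalArgumentException("need >= 1 primary and >= 0 replicas");
        }
        return primaryShards * (1 + replicasPerPrimary);
    }

    public static void main(String[] args) {
        // 3 primaries with 1 replica each, as in the my_index example
        System.out.println(totalShardCopies(3, 1)); // prints 6
    }
}
```

The same arithmetic explains the cluster health sample earlier: 10 active primary shards with one replica each give 20 active shards.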
Environment setup
One-command install with Docker (recommended)
bash
# Pull the image
docker pull docker.elastic.co/elasticsearch/elasticsearch:7.17.0
# Run the container
docker run -d \
--name es \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
-e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
docker.elastic.co/elasticsearch/elasticsearch:7.17.0
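Tips:
- -Xms512m -Xmx512m sets the JVM heap size, which is enough for a development environment
- discovery.type=single-node runs in single-node mode, so no cluster configuration is needed
Install Kibana (visualization tool)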
bash
docker run -d \
--name kibana \
--link es:elasticsearch \
-p 5601:5601 \
docker.elastic.co/kibana/kibana:7.17.0
Open http://localhost:5601 and you will see the Kibana UI, a great tool for development and debugging!
Index operations in practice
Create an index
bash
# Simplest form
PUT /products
# With explicit configuration
PUT /products_v2
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"analysis": {
"analyzer": {
"my_analyzer": {
"type": "custom",
"tokenizer": "ik_max_word",
"filter": ["lowercase"]
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "ik_max_word",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"description": {
"type": "text",
"analyzer": "ik_smart"
},
"price": {
"type": "double"
},
"stock": {
"type": "integer"
},
"category": {
"type": "keyword"
},
"tags": {
"type": "keyword"
},
"is_active": {
"type": "boolean"
},
"created_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
}
}
}
}
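Code walkthrough:
- number_of_shards: the number of primary shards. It cannot be changed in place once the index is created (changing it means reindexing, or using the split/shrink APIs)!
- number_of_replicas: the number of replicas per primary; can be adjusted dynamically
- ik_max_word: a Chinese analyzer from the IK plugin, which must be installed separately
- fields (multi-fields): index one field in several ways, so it supports both full-text search and exact matching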
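Why is the primary shard count fixed? Documents are routed to a shard by hashing a routing value (by default the document _id) and taking it modulo the number of primary shards. The sketch below is a simplified illustration under stated assumptions: real Elasticsearch uses a Murmur3 hash and an extra routing factor, while this example substitutes String.hashCode(), and the class name is hypothetical.

```java
/** Simplified routing sketch; real Elasticsearch hashes _routing with Murmur3, not String.hashCode(). */
public class RoutingSketch {

    /** shard = hash(routing) mod numberOfPrimaryShards (routing defaults to the document _id). */
    public static int shardFor(String routing, int numberOfPrimaryShards) {
        if (numberOfPrimaryShards < 1) {
            throw new IllegalArgumentException("need at least one primary shard");
        }
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        String id = "3";
        System.out.println("3 shards -> shard " + shardFor(id, 3)); // shard 0
        System.out.println("4 shards -> shard " + shardFor(id, 4)); // shard 3
    }
}
```

The same document lands on a different shard as soon as the divisor changes, so changing the shard count in place would make existing documents unfindable by ID.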
View indices
bash
# List all indices
GET /_cat/indices?v
# Show the index mapping
GET /products_v2/_mapping
# Show the index settings
GET /products_v2/_settings
Modify an index
bash
# Change the replica count (can be changed dynamically)
PUT /products_v2/_settings
{
"number_of_replicas": 2
}
# Add a field mapping
PUT /products_v2/_mapping
{
"properties": {
"seller": {
"type": "keyword"
}
}
}
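Delete an index
bash
# Delete a single index
DELETE /products
# Delete several indices
DELETE /products_v2,logs_2024
# Delete all indices (dangerous!)
DELETE /_all
⚠️ Warning: deleting an index is irreversible and all of its data is lost! Be very careful in production!
Document CRUD operations
Create a document (Create)
bash
# Option 1: specify the ID (if the ID already exists, the whole document is overwritten)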
PUT /products_v2/_doc/1
{
"name": "MacBook Pro 16英寸",
"description": "苹果笔记本电脑,性能强劲,适合开发",
"price": 19999.00,
"stock": 10,
"category": "电脑",
"tags": ["笔记本", "苹果", "高性能"],
"is_active": true,
"created_at": "2024-01-15 14:30:00"
}
# Option 2: let ES generate the ID
POST /products_v2/_doc
{
"name": "iPhone 15 Pro",
"description": "最新款苹果手机",
"price": 7999.00,
"stock": 50,
"category": "手机",
"tags": ["手机", "苹果", "5G"],
"is_active": true,
"created_at": "2024-01-16 09:15:00"
}
Response explained:
json
{
"_index": "products_v2", // index name
"_type": "_doc", // document type
"_id": "1", // document ID
"_version": 1, // version, incremented on every write
"result": "created", // created / updated / deleted
"_shards": {
"total": 2, // shard copies the write targets (primary + replicas)
"successful": 1, // copies that acknowledged the write
"failed": 0
},
"_seq_no": 0, // sequence number, used for concurrency control
"_primary_term": 1
}
Read a document (Read)
bash
# Get by ID
GET /products_v2/_doc/1
# Get several documents
GET /products_v2/_mget
{
"ids": ["1", "2", "3"]
}
# Check whether a document exists
HEAD /products_v2/_doc/1
Update a document (Update)
bash
# Option 1: full update (replaces the whole document)
PUT /products_v2/_doc/1
{
"name": "MacBook Pro 16英寸 M3",
"description": "苹果笔记本电脑,性能强劲,适合开发",
"price": 21999.00,
"stock": 8,
"category": "电脑",
"tags": ["笔记本", "苹果", "高性能"],
"is_active": true,
"created_at": "2024-01-15 14:30:00"
}
# Option 2: partial update (recommended)
POST /products_v2/_update/1
{
"doc": {
"price": 18999.00,
"stock": 15
}
}
# Option 3: scripted update
POST /products_v2/_update/1
{
"script": {
"source": "ctx._source.stock += params.count",
"params": {
"count": 5
}
}
}
Delete a document (Delete)
bash
# Delete a single document
DELETE /products_v2/_doc/1
# Delete by query
POST /products_v2/_delete_by_query
{
"query": {
"match": {
"category": "手机"
}
}
}
Bulk operations (Bulk)
bash
POST /_bulk
{ "index": { "_index": "products_v2", "_id": "3" } }
{ "name": "华为Mate60", "price": 4999.00, "category": "手机" }
{ "update": { "_index": "products_v2", "_id": "1" } }
{ "doc": { "stock": 20 } }
{ "delete": { "_index": "products_v2", "_id": "2" } }
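Notes:
- Each action or document is one JSON object on a single line; it must not contain line breaks
- The body must end with a newline
- A bulk request is not atomic: each action succeeds or fails on its own, so one failure does not affect the others. Check the errors flag and the per-item results in the response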
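The formatting rules above are easy to get wrong when a bulk body is assembled by hand. Here is a minimal Java sketch that joins pre-serialized JSON lines into a body, rejects embedded line breaks, and always ends with a newline. The class and method are hypothetical helpers, not part of the Elasticsearch client.

```java
import java.util.List;

/** Illustrative builder for a _bulk request body (newline-delimited JSON); names are hypothetical. */
public class BulkBodySketch {

    /** Joins action/source lines, one JSON object per line, and terminates the body with a newline. */
    public static String build(List<String> jsonLines) {
        StringBuilder body = new StringBuilder();
        for (String line : jsonLines) {
            if (line.contains("\n") || line.contains("\r")) {
                throw new IllegalArgumentException("each JSON object must fit on one line");
            }
            body.append(line).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String body = build(List.of(
                "{\"update\":{\"_index\":\"products_v2\",\"_id\":\"1\"}}",
                "{\"doc\":{\"stock\":20}}",
                "{\"delete\":{\"_index\":\"products_v2\",\"_id\":\"2\"}}"));
        System.out.print(body);
    }
}
```

When you use the Java client's BulkRequest (shown later in this article) the client takes care of this serialization; the sketch only matters if you send the raw HTTP body yourself.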
Query DSL in depth
ES queries are very powerful, so I will focus on the essentials here.
1. Basic queries
bash
# Match all documents
GET /products_v2/_search
{
"query": {
"match_all": {}
}
}
# Paginated query
GET /products_v2/_search
{
"query": {
"match_all": {}
},
"from": 0, // offset of the first hit
"size": 10, // hits per page
"sort": [
{ "price": { "order": "desc" } },
{ "created_at": { "order": "desc" } }
]
}
# Return only selected fields
GET /products_v2/_search
{
"query": {
"match_all": {}
},
"_source": ["name", "price", "category"],
"size": 5
}
2. Full-text queries
bash
# match query (the query text is analyzed into terms)
GET /products_v2/_search
{
"query": {
"match": {
"name": "苹果电脑"
}
}
}
# match_phrase query (phrase query; the terms must appear adjacent and in the same order)
GET /products_v2/_search
{
"query": {
"match_phrase": {
"name": "MacBook Pro"
}
}
}
# multi_match query (search several fields)
GET /products_v2/_search
{
"query": {
"multi_match": {
"query": "苹果",
"fields": ["name", "description", "tags"]
}
}
}
3. Exact-value queries
bash
# term query (exact match, the query value is not analyzed)
GET /products_v2/_search
{
"query": {
"term": {
"category": "电脑"
}
}
}
# terms query (match any of several values)
GET /products_v2/_search
{
"query": {
"terms": {
"category": ["电脑", "手机"]
}
}
}
# range query
GET /products_v2/_search
{
"query": {
"range": {
"price": {
"gte": 1000,
"lte": 10000
}
}
}
}
4. Compound queries
bash
# bool query (combines clauses)
GET /products_v2/_search
{
"query": {
"bool": {
"must": [ // must match
{ "match": { "name": "苹果" } }
],
"filter": [ // filter conditions (no scoring)
{ "range": { "price": { "gte": 5000 } } },
{ "term": { "is_active": true } }
],
"should": [ // optional; a match raises the score
{ "match": { "tags": "5G" } }
],
"must_not": [ // must not match
{ "term": { "category": "平板" } }
]
}
}
}
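must vs filter:
- must: contributes to the relevance score (_score) and therefore affects ranking
- filter: does not compute a score and can be cached, so it performs better
Recommendation: use filter for exact matches and range queries, and must for full-text search.
5. Aggregations
bash
# Count products per category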
GET /products_v2/_search
{
"size": 0, // return no documents, only aggregation results
"aggs": {
"category_count": {
"terms": {
"field": "category",
"size": 10
}
}
}
}
# Count products per price range
GET /products_v2/_search
{
"size": 0,
"aggs": {
"price_ranges": {
"range": {
"field": "price",
"ranges": [
{ "to": 1000 },
{ "from": 1000, "to": 5000 },
{ "from": 5000 }
]
}
}
}
}
# Nested aggregation: average price per category
GET /products_v2/_search
{
"size": 0,
"aggs": {
"category_stats": {
"terms": {
"field": "category"
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
6. Highlighting
bash
GET /products_v2/_search
{
"query": {
"match": {
"name": "苹果"
}
},
"highlight": {
"fields": {
"name": {
"pre_tags": ["<em class='highlight'>"],
"post_tags": ["</em>"]
}
}
}
}
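Java client in practice
Honestly, in real projects we operate ES from code, so here is a walkthrough in Java. The examples use the High Level REST Client at 7.17; note that Elastic has deprecated this client since 7.15 in favor of the newer Java API Client.
1. Add the dependencies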
xml
<!-- Elasticsearch Java High Level Client -->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.17.0</version>
</dependency>
<!-- If the transitively resolved version does not match, declare elasticsearch explicitly as well -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>7.17.0</version>
</dependency>
2. Create the client
java
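import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration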
public class ElasticsearchConfig {
@Bean
public RestHighLevelClient client() {
// Cluster addresses
HttpHost[] httpHosts = new HttpHost[]{
new HttpHost("localhost", 9200, "http"),
// For a cluster, list several nodes
// new HttpHost("localhost", 9201, "http")
};
// Build the client
RestClientBuilder builder = RestClient.builder(httpHosts);
// Set timeouts
builder.setRequestConfigCallback(requestConfigBuilder -> {
requestConfigBuilder.setConnectTimeout(5000); // connect timeout: 5 s
requestConfigBuilder.setSocketTimeout(60000); // socket (response) timeout: 60 s
return requestConfigBuilder;
});
// Set connection pool limits
builder.setHttpClientConfigCallback(httpClientBuilder -> {
httpClientBuilder.setMaxConnTotal(100); // max total connections
httpClientBuilder.setMaxConnPerRoute(30); // max connections per route
return httpClientBuilder;
});
return new RestHighLevelClient(builder);
}
}
3. Index operations
java
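import java.io.IOException;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.settings.Settings;
// XContent classes live in org.elasticsearch.xcontent as of 7.16; older 7.x releases use org.elasticsearch.common.xcontent
import org.elasticsearch.xcontent.XContentBuilder;
import org.elasticsearch.xcontent.XContentFactory;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
@Service
public class IndexService {
// SLF4J logger (assumed; the original snippet uses log without declaring it)
private static final Logger log = LoggerFactory.getLogger(IndexService.class);
@Autowired
private RestHighLevelClient client;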
/**
* Create an index
*/
public boolean createIndex(String indexName) {
try {
// Index settings
Settings settings = Settings.builder()
.put("index.number_of_shards", 3)
.put("index.number_of_replicas", 1)
.build();
// Mapping definition
XContentBuilder mapping = XContentFactory.jsonBuilder()
.startObject()
.startObject("properties")
.startObject("name")
.field("type", "text")
.field("analyzer", "ik_max_word")
.startObject("fields")
.startObject("keyword")
.field("type", "keyword")
.endObject()
.endObject()
.endObject()
.startObject("price")
.field("type", "double")
.endObject()
.startObject("category")
.field("type", "keyword")
.endObject()
.startObject("created_at")
.field("type", "date")
.field("format", "yyyy-MM-dd HH:mm:ss")
.endObject()
.endObject()
.endObject();
CreateIndexRequest request = new CreateIndexRequest(indexName)
.settings(settings)
.mapping(mapping);
CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
return response.isAcknowledged();
} catch (IOException e) {
log.error("Failed to create index: {}", e.getMessage(), e);
return false;
}
}
/**
* Check whether an index exists
*/
public boolean indexExists(String indexName) {
try {
GetIndexRequest request = new GetIndexRequest(indexName);
return client.indices().exists(request, RequestOptions.DEFAULT);
} catch (IOException e) {
log.error("Failed to check index: {}", e.getMessage(), e);
return false;
}
}
/**
* Delete an index
*/
public boolean deleteIndex(String indexName) {
try {
DeleteIndexRequest request = new DeleteIndexRequest(indexName);
AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
return response.isAcknowledged();
} catch (IOException e) {
log.error("Failed to delete index: {}", e.getMessage(), e);
return false;
}
}
}
4. Document operations
java
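import java.io.IOException;
import java.util.List;
import java.util.Map;
// fastjson 1.x (assumed from the JSON.toJSONStringWithDateFormat call)
import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.DocWriteResponse;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.xcontent.XContentType;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
@Service
public class DocumentService {
// SLF4J logger (assumed; the original snippet uses log without declaring it)
private static final Logger log = LoggerFactory.getLogger(DocumentService.class);
@Autowired
private RestHighLevelClient client;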
/**
* Add a document
*/
public String addDocument(String index, String id, Object document) {
try {
// Serialize the object to JSON
String json = JSON.toJSONStringWithDateFormat(document, "yyyy-MM-dd HH:mm:ss");
IndexRequest request = new IndexRequest(index)
.id(id)
.source(json, XContentType.JSON);
IndexResponse response = client.index(request, RequestOptions.DEFAULT);
return response.getId();
} catch (IOException e) {
log.error("Failed to add document: {}", e.getMessage(), e);
throw new RuntimeException("Failed to add document", e);
}
}
/**
* Add documents in bulk
*/
public boolean bulkAddDocuments(String index, List<?> documents) {
try {
BulkRequest bulkRequest = new BulkRequest();
for (Object doc : documents) {
String json = JSON.toJSONStringWithDateFormat(doc, "yyyy-MM-dd HH:mm:ss");
IndexRequest request = new IndexRequest(index)
.source(json, XContentType.JSON);
bulkRequest.add(request);
}
BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
// Check for per-item failures
if (response.hasFailures()) {
log.error("Bulk add had failures: {}", response.buildFailureMessage());
return false;
}
return true;
} catch (IOException e) {
log.error("Bulk add failed: {}", e.getMessage(), e);
return false;
}
}
/**
* Get a document by ID
*/
public <T> T getDocumentById(String index, String id, Class<T> clazz) {
try {
GetRequest request = new GetRequest(index, id);
GetResponse response = client.get(request, RequestOptions.DEFAULT);
if (response.isExists()) {
String json = response.getSourceAsString();
return JSON.parseObject(json, clazz);
}
return null;
} catch (IOException e) {
log.error("Failed to get document: {}", e.getMessage(), e);
throw new RuntimeException("Failed to get document", e);
}
}
/**
* Update a document
*/
public boolean updateDocument(String index, String id, Map<String, Object> updates) {
try {
UpdateRequest request = new UpdateRequest(index, id)
.doc(updates);
UpdateResponse response = client.update(request, RequestOptions.DEFAULT);
return response.getResult() == DocWriteResponse.Result.UPDATED;
} catch (IOException e) {
log.error("Failed to update document: {}", e.getMessage(), e);
return false;
}
}
/**
* Delete a document
*/
public boolean deleteDocument(String index, String id) {
try {
DeleteRequest request = new DeleteRequest(index, id);
DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);
return response.getResult() == DocWriteResponse.Result.DELETED;
} catch (IOException e) {
log.error("Failed to delete document: {}", e.getMessage(), e);
return false;
}
}
}
5. Search operations
java
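import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
// fastjson 1.x and Apache Commons helpers (assumed; the original does not name the libraries)
import com.alibaba.fastjson.JSON;
import org.apache.commons.collections4.CollectionUtils;
import org.apache.commons.lang3.StringUtils;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.MultiMatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
// SearchResult and ProductSearchParam are application classes that this article does not show
@Service
public class SearchService {
// SLF4J logger (assumed; the original snippet uses log without declaring it)
private static final Logger log = LoggerFactory.getLogger(SearchService.class);
@Autowired
private RestHighLevelClient client;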
/**
* Full-text search
*/
public <T> SearchResult<T> search(String index, String keyword,
int pageNum, int pageSize,
Class<T> clazz) {
try {
SearchRequest request = new SearchRequest(index);
// Build the query
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Match across multiple fields
MultiMatchQueryBuilder queryBuilder = QueryBuilders
.multiMatchQuery(keyword, "name", "description", "tags")
.type(MultiMatchQueryBuilder.Type.BEST_FIELDS);
sourceBuilder.query(queryBuilder);
// Pagination
sourceBuilder.from((pageNum - 1) * pageSize);
sourceBuilder.size(pageSize);
// Sorting
sourceBuilder.sort("price", SortOrder.DESC);
// Highlighting
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("name");
highlightBuilder.preTags("<em class='highlight'>");
highlightBuilder.postTags("</em>");
sourceBuilder.highlighter(highlightBuilder);
request.source(sourceBuilder);
// Execute the search
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// Parse the results
SearchHits hits = response.getHits();
List<T> data = new ArrayList<>();
for (SearchHit hit : hits.getHits()) {
String json = hit.getSourceAsString();
T obj = JSON.parseObject(json, clazz);
// Handle highlights
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
if (highlightFields.containsKey("name")) {
String highlightName = highlightFields.get("name").fragments()[0].string();
// The highlighted value could be set on the object here, e.g. via reflection
}
data.add(obj);
}
return new SearchResult<>(
hits.getTotalHits().value,
pageNum,
pageSize,
data
);
} catch (IOException e) {
log.error("Search failed: {}", e.getMessage(), e);
throw new RuntimeException("Search failed", e);
}
}
/**
* Aggregation query
*/
public Map<String, Long> aggregateCategoryCount(String index) {
try {
SearchRequest request = new SearchRequest(index);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Aggregation
TermsAggregationBuilder aggregation = AggregationBuilders
.terms("category_count")
.field("category")
.size(10);
sourceBuilder.aggregation(aggregation);
sourceBuilder.size(0);
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// Parse the aggregation result
Terms terms = response.getAggregations().get("category_count");
Map<String, Long> result = new HashMap<>();
for (Terms.Bucket bucket : terms.getBuckets()) {
result.put(bucket.getKeyAsString(), bucket.getDocCount());
}
return result;
} catch (IOException e) {
log.error("Aggregation query failed: {}", e.getMessage(), e);
throw new RuntimeException("Aggregation query failed", e);
}
}
/**
* Complex query example
*/
public <T> SearchResult<T> complexSearch(ProductSearchParam param, Class<T> clazz) {
try {
SearchRequest request = new SearchRequest(param.getIndex());
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Bool query
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
// Keyword search
if (StringUtils.isNotBlank(param.getKeyword())) {
boolQuery.must(QueryBuilders
.multiMatchQuery(param.getKeyword(), "name", "description")
.type(MultiMatchQueryBuilder.Type.BEST_FIELDS));
}
// Category filter
if (StringUtils.isNotBlank(param.getCategory())) {
boolQuery.filter(QueryBuilders.termQuery("category", param.getCategory()));
}
// Price range
if (param.getMinPrice() != null) {
boolQuery.filter(QueryBuilders.rangeQuery("price").gte(param.getMinPrice()));
}
if (param.getMaxPrice() != null) {
boolQuery.filter(QueryBuilders.rangeQuery("price").lte(param.getMaxPrice()));
}
// Tag filter
if (CollectionUtils.isNotEmpty(param.getTags())) {
boolQuery.filter(QueryBuilders.termsQuery("tags", param.getTags()));
}
sourceBuilder.query(boolQuery);
// Pagination
sourceBuilder.from((param.getPageNum() - 1) * param.getPageSize());
sourceBuilder.size(param.getPageSize());
// Sorting
if (StringUtils.isNotBlank(param.getSortField())) {
SortOrder sortOrder = "desc".equalsIgnoreCase(param.getSortOrder())
? SortOrder.DESC : SortOrder.ASC;
sourceBuilder.sort(param.getSortField(), sortOrder);
}
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// Parse the results
SearchHits hits = response.getHits();
List<T> data = new ArrayList<>();
for (SearchHit hit : hits.getHits()) {
data.add(JSON.parseObject(hit.getSourceAsString(), clazz));
}
return new SearchResult<>(
hits.getTotalHits().value,
param.getPageNum(),
param.getPageSize(),
data
);
} catch (IOException e) {
log.error("Complex query failed: {}", e.getMessage(), e);
throw new RuntimeException("Query failed", e);
}
}
}
6. Utility class
java
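import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
// fastjson 1.x and Apache Commons Collections (assumed; the original does not name the libraries)
import com.alibaba.fastjson.JSON;
import org.apache.commons.collections4.CollectionUtils;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortBuilder;
import org.elasticsearch.xcontent.XContentType;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
// PageResult and DocOperation are application classes that this article does not show
@Component
public class ElasticsearchUtil {
@Autowired
private RestHighLevelClient client;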
/**
* Generic paginated query
*/
public <T> PageResult<T> pageQuery(String index,
QueryBuilder queryBuilder,
int pageNum,
int pageSize,
List<SortBuilder<?>> sorts,
Class<T> clazz) {
try {
SearchRequest request = new SearchRequest(index);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(queryBuilder);
sourceBuilder.from((pageNum - 1) * pageSize);
sourceBuilder.size(pageSize);
// Sorting
if (CollectionUtils.isNotEmpty(sorts)) {
sorts.forEach(sourceBuilder::sort);
}
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
SearchHits hits = response.getHits();
List<T> data = new ArrayList<>();
for (SearchHit hit : hits.getHits()) {
data.add(JSON.parseObject(hit.getSourceAsString(), clazz));
}
return new PageResult<>(
hits.getTotalHits().value,
pageNum,
pageSize,
data
);
} catch (IOException e) {
throw new RuntimeException("Query failed", e);
}
}
/**
* Bulk operations
*/
public boolean bulkOperation(String index, List<DocOperation> operations) {
try {
BulkRequest bulkRequest = new BulkRequest();
for (DocOperation op : operations) {
switch (op.getType()) {
case INDEX:
IndexRequest indexRequest = new IndexRequest(index)
.id(op.getId())
.source(op.getData(), XContentType.JSON);
bulkRequest.add(indexRequest);
break;
case UPDATE:
UpdateRequest updateRequest = new UpdateRequest(index, op.getId())
.doc(op.getData(), XContentType.JSON); // pass the content type, matching the source(...) call used for INDEX
bulkRequest.add(updateRequest);
break;
case DELETE:
DeleteRequest deleteRequest = new DeleteRequest(index, op.getId());
bulkRequest.add(deleteRequest);
break;
}
}
BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
return !response.hasFailures();
} catch (IOException e) {
throw new RuntimeException("Bulk operation failed", e);
}
}
}
Performance tuning tips
1. Index design
json
// Choose a sensible shard count
PUT /my_index
{
"settings": {
"number_of_shards": 3, // decide based on data volume and node count
"number_of_replicas": 1 // at least 1 replica for high availability
}
}
// Turn off features you do not need
PUT /my_index
{
"mappings": {
"properties": {
"content": {
"type": "text",
"index": true, // searchable
"store": false // no separate stored copy needed
},
"user_id": {
"type": "keyword",
"index": false // never searched on, only stored
}
}
}
}
2. Query optimization
bash
# Use filter instead of must (for exact-match conditions)
GET /products/_search
{
"query": {
"bool": {
"filter": [ # filter context, no scoring
{ "term": { "category": "电脑" } },
{ "range": { "price": { "gte": 5000 } } }
]
}
}
}
# Avoid deep pagination; use scroll or search_after instead
GET /products/_search
{
"query": {
"match_all": {}
},
"sort": [
{ "_id": "desc" }
],
"size": 10,
"search_after": ["100"], # sort values of the last hit on the previous page
"track_total_hits": false # skip counting total hits for better performance
}
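Why does deep pagination hurt? With from/size, every shard has to collect from + size hits before the coordinating node can merge them, and by default ES rejects requests where from + size exceeds index.max_result_window (10,000). The Java sketch below checks that limit for a page request; the class and method names are illustrative only.

```java
/** Illustrative pagination guard; 10,000 is the default value of index.max_result_window. */
public class PagingSketch {

    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Offset for a 1-based page number, as passed to "from". */
    public static long from(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        return (long) (pageNum - 1) * pageSize;
    }

    /** True when from + size stays inside the window, so plain from/size paging is accepted. */
    public static boolean fitsFromSize(int pageNum, int pageSize) {
        return from(pageNum, pageSize) + pageSize <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(fitsFromSize(1000, 10)); // true:  from 9990 + size 10 = 10000
        System.out.println(fitsFromSize(1001, 10)); // false: from 10000 + size 10 > 10000
    }
}
```

A service could use a check like this to switch from from/size to search_after once a request goes past the window.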
3. Write optimization
bash
# Bulk writes
POST /_bulk
{ "index": { "_index": "products", "_id": "1" } }
{ "name": "商品1", "price": 100 }
{ "index": { "_index": "products", "_id": "2" } }
{ "name": "商品2", "price": 200 }
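# Raise the refresh interval (during bulk imports)
PUT /products/_settings
{
"refresh_interval": "30s" # default is 1s; a larger value improves write throughput
}
# Restore it once the import is done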
PUT /products/_settings
{
"refresh_interval": "1s"
}
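4. Memory tuning
bash
# Config file: config/jvm.options (or a file under config/jvm.options.d/), not elasticsearch.yml
# Set the JVM heap (at most 50% of physical memory, and below about 32 GB so compressed object pointers stay enabled)
-Xms8g
-Xmx8g
# Use the G1 garbage collector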
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
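The heap rule in the comments above can be written down as a tiny calculation. This is a rule-of-thumb sketch, not an official formula: the 31 GB cap is an assumed safety margin below the 32 GB limit, and the class name is hypothetical.

```java
/** Rule-of-thumb heap sizing; the 31 GB cap is an assumed margin below the 32 GB limit. */
public class HeapSizingSketch {

    static final int MAX_HEAP_GB = 31;

    /** Half of physical RAM, capped so the heap stays below about 32 GB. */
    public static int recommendedHeapGb(int physicalRamGb) {
        if (physicalRamGb < 2) {
            throw new IllegalArgumentException("need at least 2 GB of RAM");
        }
        return Math.min(physicalRamGb / 2, MAX_HEAP_GB);
    }

    public static void main(String[] args) {
        System.out.println(recommendedHeapGb(16));  // prints 8, matching -Xms8g / -Xmx8g
        System.out.println(recommendedHeapGb(128)); // prints 31
    }
}
```

The remaining memory is not wasted: the operating system uses it as file system cache for Lucene segments.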
5. Common monitoring commands
bash
# Cluster health
GET /_cluster/health
# Node status
GET /_cat/nodes?v
# Index status
GET /_cat/indices?v
# Shard allocation
GET /_cat/shards?v
# Thread pools
GET /_cat/thread_pool?v
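Summary and outlook
That's it for today. To recap the key points:
Core concepts
- Cluster: a group of one or more nodes
- Index: a collection of documents, similar to a database
- Document: one record
- Shard: enables horizontal scaling
- Replica: improves availability
Best practices
- Index design: plan the shard count up front and use mappings sensibly
- Query optimization: prefer filter over must where scoring is not needed, and avoid deep pagination
- Bulk operations: write and delete in batches for better performance
- Monitoring and alerting: monitor cluster health and node status
FAQ
Q1: How many shards should I configure?
- Keep a single shard under about 50 GB
- Shard count ≈ total data size / 50 GB, rounded up
- Ideally the shard count matches the number of data nodes (or is a multiple of it), so shards spread evenly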
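The sizing rule for Q1 is simple arithmetic, sketched here in Java. The helper is hypothetical and only encodes the "about 50 GB per shard" guideline from the list above; real sizing should also account for growth and node count.

```java
/** Illustrative shard-count estimate based on a per-shard size target; not an Elasticsearch API. */
public class ShardPlanner {

    /** Primary shards = ceil(total data size / max shard size), with a minimum of 1. */
    public static int primaryShards(double totalDataGb, double maxShardGb) {
        if (totalDataGb <= 0 || maxShardGb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.max(1, (int) Math.ceil(totalDataGb / maxShardGb));
    }

    public static void main(String[] args) {
        System.out.println(primaryShards(120, 50)); // prints 3
        System.out.println(primaryShards(30, 50));  // prints 1
    }
}
```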
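Q2: How do I keep ES in sync with MySQL?
- Dual writes: the application writes to both (more complex, but real-time)
- Asynchronous: use an MQ or Canal (eventual consistency)
- Scheduled jobs: sync periodically (not real-time)
Q3: What if data gets lost?
- Make sure the replica count is >= 1
- Take regular snapshot backups
- Monitor cluster status and alert promptly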
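💬 Closing thoughts
ES is a really powerful tool, but using it well is not easy. My suggestions:
- Get a demo running first and understand the core concepts
- Practice against real business scenarios
- Pay attention to performance tuning and monitoring
- Read the official documentation and community write-ups
If this article helped you, remember to like 👍, bookmark ⭐, and comment 💬! Leave a message any time you have questions, and I will reply to every one!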