Elasticsearch Deep Pagination's Fatal Flaw: Queries on Tens of Millions of Rows Slow to a Crawl
Last month I was handed an urgent bug: in our order-query system, response time for anything past page 500 had ballooned from roughly 200ms to 30 seconds, and at page 1000 requests simply timed out.
After some digging, the culprit turned out to be Elasticsearch's deep-pagination mechanism. I spent that evening studying how ES actually paginates, and finally understood why pages get slower the deeper you go, sometimes absurdly so.
This post dissects the fatal flaw in ES deep pagination and shows how to handle it gracefully when you have tens of millions of documents.
The Scene of the Crime: A Production Performance Disaster
First, the problem code, written the way pagination is most commonly written:
java
@Slf4j
@RestController
public class OrderController {
    @Autowired
    private ElasticsearchRestTemplate elasticsearchTemplate;

    // ✗ Problem code: deep pagination with from + size
    @GetMapping("/orders")
    public PageResult<Order> getOrders(@RequestParam int page,
                                       @RequestParam int size) {
        // PageRequest translates (page - 1, size) into from = (page - 1) * size internally
        NativeSearchQuery query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchAllQuery())
                .withPageable(PageRequest.of(page - 1, size))
                .build();

        long startTime = System.currentTimeMillis();
        // Execute the query
        SearchHits<Order> searchHits = elasticsearchTemplate.search(query, Order.class);
        long costTime = System.currentTimeMillis() - startTime;

        log.info("Queried page {}, took {}ms", page, costTime);
        return PageResult.of(searchHits.getSearchHits(), searchHits.getTotalHits());
    }
}
The performance test results were shocking:
text
Dataset: 10 million order records, 20 records per page

Page     Response time   ES cluster CPU
1        156ms           15%
10       198ms           18%
100      1.2s            45%
500      8.7s            78%
1000     28.3s           95%
5000     timeout         cluster unresponsive
This completely upended my mental model of pagination. Why does this happen? (One caveat: with the default max_result_window of 10,000, ES rejects any request where from + size exceeds 10,000, so the numbers past page 500 above assume that limit had been raised.)
Deep Pagination Internals: Why Slower the Deeper You Go?
How ES Executes a Distributed Query
To understand the performance problem, you first have to understand how ES runs a query across shards:
java
@Slf4j
public class ESPagingMechanism {
    /**
     * Simulates ES's internal two-phase paging flow
     */
    public void simulateESPaging() {
        // Suppose we request page 1000 with 20 records per page
        int page = 1000;
        int size = 20;
        int from = (page - 1) * size; // from = 19980
        log.info("User request: page {}, {} per page, from={}", page, size, from);

        // === Phase 1: Query Phase ===
        simulateQueryPhase(from, size);
        // === Phase 2: Fetch Phase ===
        simulateFetchPhase(from, size);
    }

    private void simulateQueryPhase(int from, int size) {
        log.info("=== Query phase ===");
        // The cluster has 5 shards
        int shardCount = 5;
        // Every shard must rank and return from + size doc IDs and scores
        int docsPerShard = from + size; // 19980 + 20 = 20000
        log.info("Coordinating node sends the query to {} shards", shardCount);
        for (int shard = 0; shard < shardCount; shard++) {
            log.info("Shard {}: must sort and return the top {} doc IDs and scores", shard, docsPerShard);
            // Simulate each shard's work
            simulateShardProcessing(shard, docsPerShard);
        }
        // The coordinating node gathers all shard results
        int totalDocs = docsPerShard * shardCount; // 20000 * 5 = 100000
        log.info("Coordinating node received {} doc IDs in total; a global sort is required", totalDocs);
        // Global sort to pick the final 20
        simulateGlobalSort(totalDocs, from, size);
    }

    private void simulateShardProcessing(int shardId, int docsNeeded) {
        // Each shard holds 2 million documents
        int docsInShard = 2_000_000;
        long startTime = System.currentTimeMillis();
        // Simulate the shard-local ranking
        log.info("  Shard {} starts: ranking {} docs to take the top {}",
                shardId, docsInShard, docsNeeded);
        // This is the bottleneck: a large top-k sort on every shard
        simulateSorting(docsInShard, docsNeeded);
        long costTime = System.currentTimeMillis() - startTime;
        log.info("  Shard {} done, took {}ms", shardId, costTime);
    }

    private void simulateSorting(int totalDocs, int topK) {
        // Sorting is roughly O(n log k), where n is the doc count and k the docs wanted.
        // In deep paging, k = from + size becomes huge and the cost explodes.
        try {
            // Simulated sort time grows with k
            int sortTime = (int) (Math.log(totalDocs) * topK / 1000);
            Thread.sleep(sortTime);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void simulateGlobalSort(int totalDocs, int from, int size) {
        log.info("Coordinating node global sort: selecting docs {}-{} out of {} candidates",
                from + 1, from + size, totalDocs);
        // The global merge is a bottleneck as well
        try {
            Thread.sleep(50); // simulated sort time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void simulateFetchPhase(int from, int size) {
        log.info("=== Fetch phase ===");
        // The coordinating node asks the relevant shards for the actual documents
        log.info("Coordinating node fetches the full content of {} docs", size);
        // This phase is comparatively cheap: only the final 20 docs are fetched
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        log.info("Fetch phase complete; returning the final result");
    }
}
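To make the simulation concrete: at the wire level, page 1000 is nothing more than a from/size search request. Here is a minimal sketch with the low-level REST client; the index name orders and the restClient variable are assumptions, not part of the system above.
java
// Sketch: the raw request behind "page 1000, 20 per page".
// Every shard must rank from + size = 20000 docs so the coordinator can return 20.
Request searchRequest = new Request("GET", "/orders/_search");
searchRequest.setJsonEntity(
        "{\n" +
        "  \"from\": 19980,\n" +
        "  \"size\": 20,\n" +
        "  \"query\": { \"match_all\": {} }\n" +
        "}");
Response response = restClient.performRequest(searchRequest); // org.elasticsearch.client.RestClient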
Pinpointing the Bottlenecks in Deep Pagination
java
public class PagingPerformanceAnalysis {
    /**
     * Compares the performance profile of different page depths
     */
    @Test
    public void analyzePerformance() {
        // Model the workload at various page depths
        int[] pages = {1, 10, 100, 500, 1000, 5000};
        int size = 20;
        System.out.println("Page\tfrom\tDocs per shard\tTotal docs\tEstimated time");
        System.out.println("==================================================");
        for (int page : pages) {
            int from = (page - 1) * size;
            int docsPerShard = from + size;
            int totalDocs = docsPerShard * 5; // assuming 5 shards
            // Simplified model: time is proportional to the docs processed
            double estimatedTime = totalDocs * 0.01; // ~10ms per 1000 docs
            System.out.printf("%d\t%d\t%d\t\t%d\t\t%.0fms%n",
                    page, from, docsPerShard, totalDocs, estimatedTime);
        }
        System.out.println("\nBottleneck analysis:");
        System.out.println("1. Every shard must rank from + size documents");
        System.out.println("2. Sorting cost grows linearly with from");
        System.out.println("3. Memory consumption also grows linearly with from");
        System.out.println("4. In deep paging, most of the compute is wasted ranking docs that get thrown away");
    }

    /**
     * Memory consumption analysis
     */
    public void analyzeMemoryConsumption() {
        int from = 10000; // page 500
        int size = 20;
        int shardCount = 5;
        // A doc ID plus score takes roughly 16 bytes
        int bytesPerDoc = 16;
        // Memory needed per shard
        int memoryPerShard = (from + size) * bytesPerDoc;
        // Total across shards
        int totalMemory = memoryPerShard * shardCount;
        System.out.println("Memory analysis (page 500):");
        System.out.println("Per-shard memory: " + (memoryPerShard / 1024) + " KB");
        System.out.println("Total memory: " + (totalMemory / 1024) + " KB");
        System.out.println("Data actually returned: " + (size * bytesPerDoc) + " bytes");
        System.out.println("Memory utilization: " + (size * bytesPerDoc * 100.0 / totalMemory) + "%");
        /*
        Sample output:
        Memory analysis (page 500):
        Per-shard memory: 156 KB
        Total memory: 780 KB
        Data actually returned: 320 bytes
        Memory utilization: 0.04%
        Conclusion: 99.96% of the memory and compute is wasted!
        */
    }
}
Solution 1: search_after — Cursor Pagination Done Right
search_after is the solution officially recommended by ES for deep pagination:
java
@Slf4j
@Service
public class SearchAfterPagingService {
    @Autowired
    private ElasticsearchRestTemplate elasticsearchTemplate;

    /**
     * Efficient pagination with search_after
     */
    public SearchAfterResult<Order> searchOrdersAfter(SearchAfterRequest request) {
        NativeSearchQueryBuilder queryBuilder = new NativeSearchQueryBuilder()
                .withQuery(buildQuery(request))
                .withSort(SortBuilders.fieldSort("createTime").order(SortOrder.DESC))
                .withSort(SortBuilders.fieldSort("_id").order(SortOrder.ASC)) // tie-breaker for uniqueness
                .withPageable(PageRequest.of(0, request.getSize())); // note: from is always 0

        // Pass the previous page's sort values as the search_after cursor
        if (request.getSearchAfter() != null && request.getSearchAfter().length > 0) {
            queryBuilder.withSearchAfter(Arrays.asList(request.getSearchAfter()));
        }
        NativeSearchQuery query = queryBuilder.build();

        long startTime = System.currentTimeMillis();
        SearchHits<Order> searchHits = elasticsearchTemplate.search(query, Order.class);
        long costTime = System.currentTimeMillis() - startTime;
        log.info("search_after query returned {} records in {}ms",
                searchHits.getSearchHits().size(), costTime);

        // Capture the last hit's sort values; they are the cursor for the next page
        Object[] lastSortValues = null;
        if (!searchHits.getSearchHits().isEmpty()) {
            SearchHit<Order> lastHit = searchHits.getSearchHits().get(searchHits.getSearchHits().size() - 1);
            lastSortValues = lastHit.getSortValues().toArray();
        }

        return SearchAfterResult.<Order>builder()
                .data(searchHits.getSearchHits())
                .searchAfter(lastSortValues)
                .hasMore(!searchHits.getSearchHits().isEmpty() && searchHits.getSearchHits().size() == request.getSize())
                .totalHits(searchHits.getTotalHits())
                .costTime(costTime)
                .build();
    }

    private QueryBuilder buildQuery(SearchAfterRequest request) {
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        // Apply the various filters
        if (StringUtils.hasText(request.getKeyword())) {
            boolQuery.must(QueryBuilders.multiMatchQuery(request.getKeyword(), "orderNo", "customerName"));
        }
        if (request.getStartTime() != null && request.getEndTime() != null) {
            boolQuery.filter(QueryBuilders.rangeQuery("createTime")
                    .gte(request.getStartTime())
                    .lte(request.getEndTime()));
        }
        if (request.getStatus() != null) {
            boolQuery.filter(QueryBuilders.termQuery("status", request.getStatus()));
        }
        return boolQuery;
    }
}
/**
 * search_after request parameters
 */
@Data
@Builder
public class SearchAfterRequest {
    private String keyword;
    private String status;
    private LocalDateTime startTime;
    private LocalDateTime endTime;
    @Builder.Default
    private int size = 20;
    private Object[] searchAfter; // sort values returned by the previous query
}
/**
 * search_after response
 */
@Data
@Builder
public class SearchAfterResult<T> {
    private List<SearchHit<T>> data;
    private Object[] searchAfter; // cursor for the next query
    private boolean hasMore;      // whether more data remains
    private long totalHits;
    private long costTime;
}
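Before wiring this into a UI, here is a minimal sketch of how a caller walks the cursor. It reuses the service and DTOs above; handle() is a placeholder for whatever the caller does with each order.
java
// Minimal cursor walk: each response's sort values become the next request's cursor.
SearchAfterRequest request = SearchAfterRequest.builder().size(20).build();
SearchAfterResult<Order> page;
Object[] cursor = null;
do {
    request.setSearchAfter(cursor);
    page = searchAfterService.searchOrdersAfter(request);
    for (SearchHit<Order> hit : page.getData()) {
        handle(hit.getContent()); // placeholder for real processing
    }
    cursor = page.getSearchAfter(); // cursor for the next page
} while (page.isHasMore());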
A Front-End Implementation for search_after
javascript
// Vue.js front-end example
export default {
data() {
return {
orders: [],
searchAfter: null,
loading: false,
hasMore: true,
searchParams: {
keyword: '',
status: '',
startTime: null,
endTime: null,
size: 20
}
}
},
  methods: {
    // Run a fresh search from the first page
    async searchFirstPage() {
      this.orders = [];
      this.searchAfter = null;
      this.hasMore = true;
      await this.loadMore();
    },
    // Load the next page
    async loadMore() {
      if (this.loading || !this.hasMore) return;
      this.loading = true;
      try {
        const params = {
          ...this.searchParams,
          searchAfter: this.searchAfter
        };
        const response = await this.searchOrders(params);
        // Append the new page
        this.orders.push(...response.data);
        // Update the cursor and paging state
        this.searchAfter = response.searchAfter;
        this.hasMore = response.hasMore;
        console.log(`Loaded ${response.data.length} records in ${response.costTime}ms`);
      } catch (error) {
        console.error('Failed to load data', error);
      } finally {
        this.loading = false;
      }
    },
    async searchOrders(params) {
      const response = await axios.post('/api/orders/search-after', params);
      return response.data;
    },
    // Scroll handler lives in the same methods object as the rest
    handleScroll() {
      if (window.innerHeight + window.scrollY >= document.body.offsetHeight - 1000) {
        this.loadMore();
      }
    }
  },
  mounted() {
    this.searchFirstPage();
    // Infinite scroll: load more as the user nears the bottom
    window.addEventListener('scroll', this.handleScroll);
  },
  beforeDestroy() {
    window.removeEventListener('scroll', this.handleScroll);
  }
}
Solution 2: The Scroll API — Built for Bulk Traversal
When you need to walk through a large volume of data, the Scroll API is the best fit:
java
@Slf4j
@Service
public class ScrollPagingService {
    @Autowired
    private ElasticsearchRestTemplate elasticsearchTemplate;

    // How long each scroll context is kept alive (5 minutes)
    private static final long SCROLL_TIMEOUT_MS = Duration.ofMinutes(5).toMillis();
    // Index holding the Order documents (adjust to your mapping)
    private static final IndexCoordinates ORDER_INDEX = IndexCoordinates.of("orders");

    /**
     * Start a scroll
     */
    public ScrollResult<Order> startScroll(ScrollRequest request) {
        NativeSearchQuery query = new NativeSearchQueryBuilder()
                .withQuery(buildScrollQuery(request))
                .withSort(SortBuilders.fieldSort("createTime").order(SortOrder.DESC))
                .withPageable(PageRequest.of(0, request.getSize()))
                .build();

        long startTime = System.currentTimeMillis();
        // Open the scroll context and fetch the first batch
        SearchScrollHits<Order> scrollHits = elasticsearchTemplate.searchScrollStart(
                SCROLL_TIMEOUT_MS, query, Order.class, ORDER_INDEX);
        long costTime = System.currentTimeMillis() - startTime;
        log.info("Scroll started: {} records, scrollId {}, took {}ms",
                scrollHits.getSearchHits().size(),
                scrollHits.getScrollId(),
                costTime);

        return ScrollResult.<Order>builder()
                .data(scrollHits.getSearchHits())
                .scrollId(scrollHits.getScrollId())
                .totalHits(scrollHits.getTotalHits())
                .hasMore(!scrollHits.getSearchHits().isEmpty())
                .costTime(costTime)
                .build();
    }

    /**
     * Continue an existing scroll
     */
    public ScrollResult<Order> continueScroll(String scrollId) {
        if (!StringUtils.hasText(scrollId)) {
            throw new IllegalArgumentException("scrollId must not be empty");
        }
        long startTime = System.currentTimeMillis();
        try {
            SearchScrollHits<Order> scrollHits = elasticsearchTemplate.searchScrollContinue(
                    scrollId, SCROLL_TIMEOUT_MS, Order.class, ORDER_INDEX);
            long costTime = System.currentTimeMillis() - startTime;
            log.info("Scroll continued: {} records, took {}ms",
                    scrollHits.getSearchHits().size(), costTime);

            return ScrollResult.<Order>builder()
                    .data(scrollHits.getSearchHits())
                    .scrollId(scrollHits.getScrollId())
                    .hasMore(!scrollHits.getSearchHits().isEmpty())
                    .costTime(costTime)
                    .build();
        } catch (Exception e) {
            log.error("Scroll continuation failed, scrollId: {}", scrollId, e);
            throw new RuntimeException("Scroll query failed", e);
        }
    }

    /**
     * Release the scroll context
     */
    public void clearScroll(String scrollId) {
        try {
            if (StringUtils.hasText(scrollId)) {
                elasticsearchTemplate.searchScrollClear(Collections.singletonList(scrollId));
                log.info("Scroll context cleared, scrollId: {}", scrollId);
            }
        } catch (Exception e) {
            log.warn("Failed to clear scroll context, scrollId: {}", scrollId, e);
        }
    }
    /**
     * Bulk export example
     */
    @Async
    public CompletableFuture<String> exportOrdersAsync(ExportRequest request) {
        String exportId = UUID.randomUUID().toString();
        log.info("Starting async order export, exportId: {}", exportId);
        try (FileWriter writer = new FileWriter("/tmp/orders_" + exportId + ".csv")) {
            // CSV header
            writer.write("orderNo,customerName,amount,status,createTime\n");
            ScrollRequest scrollRequest = ScrollRequest.builder()
                    .keyword(request.getKeyword())
                    .status(request.getStatus())
                    .startTime(request.getStartTime())
                    .endTime(request.getEndTime())
                    .size(1000) // batch size
                    .build();
            // Open the scroll
            ScrollResult<Order> result = startScroll(scrollRequest);
            String scrollId = result.getScrollId();
            int totalExported = 0;
            try {
                while (result.isHasMore() && !result.getData().isEmpty()) {
                    // Write out this batch
                    for (SearchHit<Order> hit : result.getData()) {
                        Order order = hit.getContent();
                        writer.write(String.format("%s,%s,%.2f,%s,%s\n",
                                order.getOrderNo(),
                                order.getCustomerName(),
                                order.getAmount(),
                                order.getStatus(),
                                order.getCreateTime()));
                        totalExported++;
                    }
                    writer.flush();
                    log.info("Exported {} orders so far", totalExported);
                    // Keep scrolling
                    result = continueScroll(scrollId);
                    scrollId = result.getScrollId();
                    // Back off a little to avoid hogging resources
                    Thread.sleep(100);
                }
            } finally {
                // Always release the scroll context
                clearScroll(scrollId);
            }
            log.info("Order export finished, exportId: {}, total: {}", exportId, totalExported);
        } catch (Exception e) {
            log.error("Order export failed, exportId: {}", exportId, e);
            throw new RuntimeException("Export failed", e);
        }
        return CompletableFuture.completedFuture(exportId);
    }
    private QueryBuilder buildScrollQuery(ScrollRequest request) {
        // Same filter-building logic as the search_after example
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        if (StringUtils.hasText(request.getKeyword())) {
            boolQuery.must(QueryBuilders.multiMatchQuery(request.getKeyword(), "orderNo", "customerName"));
        }
        if (request.getStartTime() != null && request.getEndTime() != null) {
            boolQuery.filter(QueryBuilders.rangeQuery("createTime")
                    .gte(request.getStartTime())
                    .lte(request.getEndTime()));
        }
        if (request.getStatus() != null) {
            boolQuery.filter(QueryBuilders.termQuery("status", request.getStatus()));
        }
        return boolQuery;
    }
}
@Data
@Builder
public class ScrollRequest {
    private String keyword;
    private String status;
    private LocalDateTime startTime;
    private LocalDateTime endTime;
    @Builder.Default
    private int size = 1000;
}
@Data
@Builder
public class ScrollResult<T> {
    private List<SearchHit<T>> data;
    private String scrollId;
    private boolean hasMore;
    private long totalHits;
    private long costTime;
}
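For orientation, the raw protocol under this wrapper is just three HTTP calls. A minimal sketch with the low-level REST client follows; the index name orders is an assumption, and the scroll_id placeholders stand in for values returned by the previous response.
java
// 1) Open a scroll context; the snapshot stays alive for 5 minutes
Request start = new Request("POST", "/orders/_search?scroll=5m");
start.setJsonEntity("{ \"size\": 1000, \"query\": { \"match_all\": {} } }");

// 2) Pull the next batch using the _scroll_id from the previous response
Request next = new Request("POST", "/_search/scroll");
next.setJsonEntity("{ \"scroll\": \"5m\", \"scroll_id\": \"<id from previous response>\" }");

// 3) Release the context as soon as you are done
Request clear = new Request("DELETE", "/_search/scroll");
clear.setJsonEntity("{ \"scroll_id\": [\"<id from previous response>\"] }");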
Solution 3: Sliced Scroll — Parallelism for the Largest Jobs
For very large datasets, Sliced Scroll splits the scroll into independent slices that can be processed in parallel:
java
@Slf4j
@Service
public class SlicedScrollService {
    @Autowired
    private ElasticsearchRestTemplate elasticsearchTemplate;

    private static final long SCROLL_TIMEOUT_MS = Duration.ofMinutes(5).toMillis();
    private static final IndexCoordinates ORDER_INDEX = IndexCoordinates.of("orders");

    /**
     * Process a large dataset in parallel
     */
    @Async
    public CompletableFuture<ProcessResult> processLargeDataset(ProcessRequest request) {
        int sliceCount = request.getSliceCount(); // slice count, typically a power of two
        List<CompletableFuture<SliceResult>> futures = new ArrayList<>();
        log.info("Starting parallel processing, slices: {}", sliceCount);

        // One async task per slice
        for (int sliceId = 0; sliceId < sliceCount; sliceId++) {
            CompletableFuture<SliceResult> future = processSlice(request, sliceId, sliceCount);
            futures.add(future);
        }

        // Wait for every slice to finish
        CompletableFuture<Void> allTasks = CompletableFuture.allOf(
                futures.toArray(new CompletableFuture[0]));

        return allTasks.thenApply(v -> {
            // Merge the per-slice results
            ProcessResult totalResult = new ProcessResult();
            for (CompletableFuture<SliceResult> future : futures) {
                try {
                    SliceResult sliceResult = future.get();
                    totalResult.merge(sliceResult);
                } catch (Exception e) {
                    log.error("Failed to collect a slice result", e);
                }
            }
            log.info("All slices done: {} records processed in {}ms",
                    totalResult.getTotalProcessed(), totalResult.getTotalCostTime());
            return totalResult;
        });
    }
    /**
     * Process a single slice
     */
    @Async
    public CompletableFuture<SliceResult> processSlice(ProcessRequest request, int sliceId, int sliceCount) {
        log.info("Processing slice {}/{}", sliceId, sliceCount);
        long startTime = System.currentTimeMillis();
        SliceResult sliceResult = new SliceResult(sliceId);
        try {
            // Build the query with the slice attached
            NativeSearchQuery query = new NativeSearchQueryBuilder()
                    .withQuery(buildQuery(request))
                    .withSort(SortBuilders.fieldSort("createTime").order(SortOrder.DESC))
                    .withPageable(PageRequest.of(0, request.getBatchSize()))
                    // Key step: restrict this scroll to one slice. Whether the builder
                    // exposes withSlice depends on your client version; the underlying
                    // mechanism is SearchSourceBuilder#slice(new SliceBuilder(id, max)).
                    .withSlice(sliceId, sliceCount)
                    .build();

            // Open the sliced scroll
            SearchScrollHits<Order> scrollHits = elasticsearchTemplate.searchScrollStart(
                    SCROLL_TIMEOUT_MS, query, Order.class, ORDER_INDEX);
            String scrollId = scrollHits.getScrollId();
            try {
                while (!scrollHits.getSearchHits().isEmpty()) {
                    // Process this batch
                    List<SearchHit<Order>> batch = scrollHits.getSearchHits();
                    processBatch(batch, sliceResult);
                    log.debug("Slice {} processed {} records in this batch, {} in total",
                            sliceId, batch.size(), sliceResult.getProcessedCount());
                    // Keep scrolling
                    scrollHits = elasticsearchTemplate.searchScrollContinue(
                            scrollId, SCROLL_TIMEOUT_MS, Order.class, ORDER_INDEX);
                    scrollId = scrollHits.getScrollId();
                }
            } finally {
                // Release the scroll context
                elasticsearchTemplate.searchScrollClear(Collections.singletonList(scrollId));
            }
        } catch (Exception e) {
            log.error("Slice {} failed", sliceId, e);
            sliceResult.setError(e);
        }
        long costTime = System.currentTimeMillis() - startTime;
        sliceResult.setCostTime(costTime);
        log.info("Slice {} done: {} records in {}ms",
                sliceId, sliceResult.getProcessedCount(), costTime);
        return CompletableFuture.completedFuture(sliceResult);
    }
    /**
     * Process one batch of hits
     */
    private void processBatch(List<SearchHit<Order>> batch, SliceResult sliceResult) {
        for (SearchHit<Order> hit : batch) {
            Order order = hit.getContent();
            try {
                // Business logic goes here
                processOrder(order);
                sliceResult.incrementProcessed();
            } catch (Exception e) {
                log.error("Failed to process order {}", order.getOrderNo(), e);
                sliceResult.incrementError();
            }
        }
        // Periodic progress report
        if (sliceResult.getProcessedCount() % 10000 == 0) {
            log.info("Slice {} progress: {} records processed",
                    sliceResult.getSliceId(), sliceResult.getProcessedCount());
        }
    }

    /**
     * Example business logic
     */
    private void processOrder(Order order) {
        // Example: compute order statistics
        // Simulate per-record processing time
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Real logic would typically be:
        // 1. Data cleansing
        // 2. Business calculations
        // 3. Data transformation
        // 4. Writing to the target system
    }
private QueryBuilder buildQuery(ProcessRequest request) {
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
if (request.getStartTime() != null && request.getEndTime() != null) {
boolQuery.filter(QueryBuilders.rangeQuery("createTime")
.gte(request.getStartTime())
.lte(request.getEndTime()));
}
if (request.getStatus() != null) {
boolQuery.filter(QueryBuilders.termQuery("status", request.getStatus()));
}
return boolQuery;
}
}
/**
 * Processing request parameters
 */
@Data
@Builder
public class ProcessRequest {
    private LocalDateTime startTime;
    private LocalDateTime endTime;
    private String status;
    @Builder.Default
    private int sliceCount = 8;   // number of slices
    @Builder.Default
    private int batchSize = 1000; // records per batch
}
/**
* Per-slice processing result
*/
@Data
public class SliceResult {
private final int sliceId;
private int processedCount = 0;
private int errorCount = 0;
private long costTime = 0;
private Exception error;
public SliceResult(int sliceId) {
this.sliceId = sliceId;
}
public void incrementProcessed() {
processedCount++;
}
public void incrementError() {
errorCount++;
}
}
/**
* Aggregated processing result
*/
@Data
public class ProcessResult {
private int totalProcessed = 0;
private int totalErrors = 0;
private long totalCostTime = 0;
private int completedSlices = 0;
public void merge(SliceResult sliceResult) {
this.totalProcessed += sliceResult.getProcessedCount();
this.totalErrors += sliceResult.getErrorCount();
this.totalCostTime += sliceResult.getCostTime();
this.completedSlices++;
}
}
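Under the hood, a sliced scroll is just a regular scroll request with a slice clause. For orientation, here is a sketch of the raw request for slice 0 of 8 with the low-level REST client; the index name orders is an assumption.
java
// Each worker issues the same request with a different slice id (0..max-1);
// ES partitions the documents so the slices never overlap.
Request sliced = new Request("POST", "/orders/_search?scroll=5m");
sliced.setJsonEntity(
        "{ \"slice\": { \"id\": 0, \"max\": 8 }, \"size\": 1000, \"query\": { \"match_all\": {} } }");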
Head-to-Head: How the Three Approaches Perform in Practice
java
@Slf4j
@Component
public class PagingPerformanceTest {
    @Autowired
    private SearchAfterPagingService searchAfterService;
    @Autowired
    private ScrollPagingService scrollService;
    @Autowired
    private SlicedScrollService slicedScrollService;

    /**
     * End-to-end performance comparison
     */
    @Test
    public void comprehensivePerformanceTest() {
        log.info("Starting pagination performance tests...");
        // Dataset: 1 million records
        int totalRecords = 1_000_000;
        int pageSize = 20;
        // Traditional from + size paging
        testTraditionalPaging(totalRecords, pageSize);
        // search_after paging
        testSearchAfterPaging();
        // Scroll paging
        testScrollPaging();
        // Sliced scroll (bulk processing)
        testSlicedScrollProcessing();
    }

    private void testTraditionalPaging(int totalRecords, int pageSize) {
        log.info("=== Traditional paging (from + size) ===");
        int[] testPages = {1, 10, 100, 500, 1000};
        for (int page : testPages) {
            long startTime = System.currentTimeMillis();
            try {
                // Simulate a traditional paged query
                int from = (page - 1) * pageSize;
                if (from > 10000) {
                    log.warn("Page {} exceeds the ES default limit (max_result_window=10000)", page);
                    continue;
                }
                // Execute the query (simplified to a simulation here)
                simulateTraditionalPaging(from, pageSize);
                long costTime = System.currentTimeMillis() - startTime;
                log.info("Traditional paging - page {}: {}ms", page, costTime);
            } catch (Exception e) {
                log.error("Traditional paged query failed, page: {}", page, e);
            }
        }
    }
    private void testSearchAfterPaging() {
        log.info("=== search_after paging ===");
        try {
            SearchAfterRequest request = SearchAfterRequest.builder()
                    .size(20)
                    .build();
            Object[] searchAfter = null;
            int pageCount = 0;
            long totalTime = 0;
            // Page deep into the result set
            for (int i = 0; i < 1000; i++) { // equivalent to page 1000
                request.setSearchAfter(searchAfter);
                long startTime = System.currentTimeMillis();
                SearchAfterResult<Order> result = searchAfterService.searchOrdersAfter(request);
                long costTime = System.currentTimeMillis() - startTime;
                searchAfter = result.getSearchAfter();
                pageCount++;
                totalTime += costTime;
                if (pageCount % 100 == 0) {
                    log.info("search_after - {} pages so far, average {}ms",
                            pageCount, totalTime / pageCount);
                }
                if (!result.isHasMore()) {
                    break;
                }
            }
            log.info("search_after test finished: {} pages, average {}ms",
                    pageCount, totalTime / pageCount);
        } catch (Exception e) {
            log.error("search_after test failed", e);
        }
    }
    private void testScrollPaging() {
        log.info("=== Scroll paging ===");
        try {
            ScrollRequest request = ScrollRequest.builder()
                    .size(1000)
                    .build();
            long startTime = System.currentTimeMillis();
            ScrollResult<Order> result = scrollService.startScroll(request);
            String scrollId = result.getScrollId();
            int totalRecords = 0;
            int scrollCount = 0;
            try {
                while (result.isHasMore()) {
                    totalRecords += result.getData().size();
                    scrollCount++;
                    if (scrollCount % 10 == 0) {
                        log.info("Scroll - {} batches, {} records so far", scrollCount, totalRecords);
                    }
                    // Keep scrolling
                    result = scrollService.continueScroll(scrollId);
                    scrollId = result.getScrollId();
                    // Simulated processing time
                    Thread.sleep(10);
                }
            } finally {
                scrollService.clearScroll(scrollId);
            }
            long totalTime = System.currentTimeMillis() - startTime;
            log.info("Scroll test finished: {} records in {}ms, average TPS: {}",
                    totalRecords, totalTime, totalRecords * 1000 / totalTime);
        } catch (Exception e) {
            log.error("Scroll test failed", e);
        }
    }
    private void testSlicedScrollProcessing() {
        log.info("=== Sliced scroll processing ===");
        try {
            ProcessRequest request = ProcessRequest.builder()
                    .sliceCount(8)
                    .batchSize(1000)
                    .startTime(LocalDateTime.now().minusDays(30))
                    .endTime(LocalDateTime.now())
                    .build();
            long startTime = System.currentTimeMillis();
            CompletableFuture<ProcessResult> future = slicedScrollService.processLargeDataset(request);
            ProcessResult result = future.get(30, TimeUnit.MINUTES); // wait at most 30 minutes
            long totalTime = System.currentTimeMillis() - startTime;
            log.info("Sliced scroll test finished:");
            log.info("  Records processed: {}", result.getTotalProcessed());
            log.info("  Errors: {}", result.getTotalErrors());
            log.info("  Slices completed: {}", result.getCompletedSlices());
            log.info("  Total time: {}ms", totalTime);
            log.info("  Average TPS: {}", result.getTotalProcessed() * 1000 / totalTime);
        } catch (Exception e) {
            log.error("Sliced scroll test failed", e);
        }
    }

    private void simulateTraditionalPaging(int from, int size) {
        // Model the cost profile of from + size paging:
        // latency grows linearly with from
        try {
            Thread.sleep(from / 100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
Best Practices for Production
java
@Slf4j
@Configuration
public class ESPagingBestPractices {
    /**
     * Pick the right paging strategy for the business scenario
     */
    public enum PagingStrategy {
        TRADITIONAL("Traditional paging", "Shallow pages (roughly the first 100)"),
        SEARCH_AFTER("search_after", "Deep pagination and live infinite scroll"),
        SCROLL("Scroll API", "Data export and offline processing"),
        SLICED_SCROLL("Sliced scroll", "Parallel processing of very large datasets");

        private final String name;
        private final String description;

        PagingStrategy(String name, String description) {
            this.name = name;
            this.description = description;
        }

        public static PagingStrategy chooseStrategy(PagingContext context) {
            // Shallow pages in an interactive UI
            if (context.getPage() <= 100 && context.isInteractive()) {
                return TRADITIONAL;
            }
            // Deep pages in an interactive UI
            if (context.getPage() > 100 && context.isInteractive()) {
                return SEARCH_AFTER;
            }
            // Export scenarios
            if (context.isExport() && context.getTotalRecords() < 100_000) {
                return SCROLL;
            }
            // Very large processing jobs
            if (context.getTotalRecords() > 100_000) {
                return SLICED_SCROLL;
            }
            return SEARCH_AFTER; // default strategy
        }
    }
    @Data
    public static class PagingContext {
        private int page = 1;
        private int size = 20;
        private boolean interactive = true;  // user-facing, interactive scenario?
        private boolean export = false;      // export scenario?
        private long totalRecords = 0;       // estimated total record count
        private String sortField = "createTime";
        private SortOrder sortOrder = SortOrder.DESC;
    }
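    /*
     * Usage sketch for the chooser above (values are illustrative):
     *
     *   PagingContext ctx = new PagingContext();
     *   ctx.setPage(350);            // a deep page...
     *   ctx.setInteractive(true);    // ...in a user-facing UI
     *   PagingStrategy s = PagingStrategy.chooseStrategy(ctx); // -> SEARCH_AFTER
     */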
    /**
     * Tuned Elasticsearch client configuration
     */
    @Bean
    public ElasticsearchRestTemplate optimizedElasticsearchTemplate() {
        HttpHeaders headers = new HttpHeaders();
        headers.add("Content-Type", "application/json");

        // Client configuration: timeouts tuned for long-running paged queries.
        // Connection-pool sizing can additionally be tuned via a client configurer.
        ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                .connectedTo("es-node1:9200", "es-node2:9200", "es-node3:9200")
                .withConnectTimeout(Duration.ofSeconds(30)) // connect timeout
                .withSocketTimeout(Duration.ofMinutes(5))   // read timeout
                .withDefaultHeaders(headers)
                .build();

        RestHighLevelClient client = RestClients.create(clientConfiguration).rest();

        // Wire up the converter explicitly
        ElasticsearchConverter converter = new MappingElasticsearchConverter(
                new SimpleElasticsearchMappingContext());
        return new ElasticsearchRestTemplate(client, converter);
    }
    /**
     * Index settings tuned for paging workloads
     */
    public Map<String, Object> getOptimizedIndexSettings() {
        Map<String, Object> settings = new HashMap<>();
        // Shards and replicas
        settings.put("number_of_shards", 5);   // size to your data volume and node count
        settings.put("number_of_replicas", 1); // at least one replica
        // Refresh interval (paging-heavy indices can tolerate slower refresh)
        settings.put("refresh_interval", "30s"); // default is 1s
        // Merge policy tuning
        settings.put("merge.policy.max_merge_at_once", 5);
        settings.put("merge.policy.segments_per_tier", 5);
        // Paging-related limits
        settings.put("max_result_window", 50000);  // raise the from + size cap
        settings.put("max_rescore_window", 50000); // rescore window size
        return settings;
    }
    /**
     * Paged-query monitoring
     */
    @Component
    public static class PagingMonitor {
        private final MeterRegistry meterRegistry;

        public PagingMonitor(MeterRegistry meterRegistry) {
            this.meterRegistry = meterRegistry;
        }

        public void recordPagingMetrics(PagingStrategy strategy, long costTime, int resultSize) {
            // Record query latency
            Timer.builder("es.paging.query.time")
                    .tag("strategy", strategy.name())
                    .register(meterRegistry)
                    .record(costTime, TimeUnit.MILLISECONDS);
            // Record result-set size
            DistributionSummary.builder("es.paging.result.size")
                    .tag("strategy", strategy.name())
                    .register(meterRegistry)
                    .record(resultSize);
            // Count slow queries
            if (costTime > 1000) { // queries over 1 second
                Counter.builder("es.paging.slow.query")
                        .tag("strategy", strategy.name())
                        .register(meterRegistry)
                        .increment();
            }
        }

        @EventListener
        public void handleSlowQuery(SlowQueryEvent event) {
            if (event.getCostTime() > 5000) { // over 5 seconds
                // Fire an alert
                log.warn("Slow paged query detected: strategy={}, time={}ms, query={}",
                        event.getStrategy(), event.getCostTime(), event.getQuery());
            }
        }
    }
    /**
     * Paging cache strategy
     */
    @Component
    public static class PagingCache {
        @Autowired
        private RedisTemplate<String, Object> redisTemplate;

        /**
         * Cache a search_after result
         */
        public void cacheSearchAfterResult(String cacheKey, SearchAfterResult<?> result) {
            try {
                // Cache with a 5-minute TTL
                redisTemplate.opsForValue().set(cacheKey, result, 5, TimeUnit.MINUTES);
            } catch (Exception e) {
                log.warn("Failed to cache search_after result", e);
            }
        }

        /**
         * Fetch a cached result
         */
        @SuppressWarnings("unchecked")
        public SearchAfterResult<?> getCachedResult(String cacheKey) {
            try {
                return (SearchAfterResult<?>) redisTemplate.opsForValue().get(cacheKey);
            } catch (Exception e) {
                log.warn("Failed to read cached result", e);
                return null;
            }
        }

        /**
         * Build the cache key
         */
        public String generateCacheKey(String index, String query, Object[] searchAfter, int size) {
            return String.format("es:page:%s:%s:%s:%d",
                    index,
                    DigestUtils.md5DigestAsHex(query.getBytes()),
                    Arrays.toString(searchAfter),
                    size);
        }
    }
}
/**
 * Slow-query event
*/
@Data
@AllArgsConstructor
public class SlowQueryEvent {
private String strategy;
private long costTime;
private String query;
}
Wrapping Up: Choosing the Right Paging Strategy
From this deep dive, here are my golden rules for Elasticsearch pagination:
🎯 Strategy Selection Guide
text
📊 Scenario mapping:
┌──────────────────────────┬──────────────────┬───────────────────────────────┐
│ Scenario                 │ Recommended      │ Performance profile           │
├──────────────────────────┼──────────────────┼───────────────────────────────┤
│ UI page navigation       │ search_after     │ Constant speed, any depth     │
│ Shallow pages (<100)     │ Traditional      │ Simple and fast               │
│ Data export              │ Scroll           │ High throughput, holds state  │
│ Bulk data processing     │ Sliced scroll    │ Parallel, fastest overall     │
└──────────────────────────┴──────────────────┴───────────────────────────────┘
⚡ Performance Comparison Summary
text
Benchmark results (10 million records)
Scenario       Traditional   search_after   Scroll   Sliced scroll
Page 1         100ms         120ms          150ms    N/A
Page 100       800ms         125ms          155ms    N/A
Page 1000      15s           130ms          160ms    N/A
Page 10000     timeout       135ms          165ms    N/A
Full export    not viable    not viable     45s      12s (8 slices)
💡 Production Recommendations
- Default to search_after: stable latency and the best user experience
- Tune index settings deliberately: max_result_window, shard count, and so on (see the sketch after this list)
- Set up monitoring and alerts: slow queries and resource utilization
- Cache hot data: the first few pages are worth caching for perceived speed
- Educate users: steer them toward filters rather than deep page jumps
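If you do raise max_result_window, note that it is a dynamic index setting, so it can be changed without a reindex or restart. A minimal sketch with the RestHighLevelClient; the index name orders is an assumption:
java
// Raise the from + size cap on an existing index (dynamic setting, applies immediately).
// Remember: this also raises the memory and CPU ceiling; prefer search_after where possible.
void raiseResultWindow(RestHighLevelClient client) throws IOException {
    UpdateSettingsRequest request = new UpdateSettingsRequest("orders")
            .settings(Settings.builder()
                    .put("index.max_result_window", 50_000) // default is 10,000
                    .build());
    client.indices().putSettings(request, RequestOptions.DEFAULT);
}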
One Last Word of Advice
Remember: there is no silver bullet, only the approach that fits your scenario.
That production incident taught me that technology choices can't run on intuition: you have to understand the underlying mechanics and weigh them against your actual workload. ES pagination looks trivial, but the details underneath reward careful study.
Have you run into ES pagination performance problems in your projects? How did you solve them? Share your experience in the comments!