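Analysis of missing data in exports
Shared request object modified concurrently
baseRequest is a single object shared by every task. Each task calls setPage on it and then queries with it, so another thread can overwrite page between those two steps. Some pages are then fetched more than once and others are never fetched, which shows up as missing rows in the export.
The original code: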
java
/**
 * Process the pages concurrently.
 */
private void processBatches(StudentPageRequest baseRequest,
                            int totalPages, ExcelWriter excelWriter,
                            WriteSheet writeSheet) {
    if (totalPages <= 1) {
        return;
    }
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    // One task per page
    for (int pageNum = 1; pageNum <= totalPages; pageNum++) {
        final int currentPage = pageNum;
        // Run the task on the executor
        CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
            try {
                // Set the current page (BUG: baseRequest is shared by all tasks)
                baseRequest.setPage(currentPage);
                PageResponse<List<StudentResponse>> pageData = this.queryPage(baseRequest);
                // Convert the page to row data in parallel
                List<List<String>> rowData = parallelConvertToRowData(pageData.getData());
                synchronized (excelWriter) {
                    excelWriter.write(rowData, writeSheet);
                }
                log.info("Finished page {}/{}, rows: {}", currentPage, totalPages, pageData.getData().size());
            } catch (Exception e) {
                throw new AppRuntimeException(ResponseCode.EXPORT_FILE_FAILED, e);
            }
        }, importTaskExecutor);
        futures.add(future);
    }
    // Wait for all tasks to finish, with a 30-minute timeout
    CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
            .orTimeout(30, TimeUnit.MINUTES)
            .join();
}
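To make the failure mode concrete, here is a minimal sketch that forces the worst-case interleaving. It is not the export code: PageRequest is a hypothetical stand-in for StudentPageRequest, and a CyclicBarrier makes every task set its page before any task reads it back. The field is volatile on purpose, to show that visibility is not the issue; the shared mutable state is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class SharedRequestRaceDemo {

    // Hypothetical stand-in for StudentPageRequest; only the page field matters here.
    static class PageRequest {
        private volatile int page;
        void setPage(int page) { this.page = page; }
        int getPage() { return page; }
    }

    // Returns the distinct pages the tasks end up "querying" when they share one request.
    static Set<Integer> pagesQueriedWithSharedRequest(int totalPages) {
        PageRequest shared = new PageRequest();
        Set<Integer> queried = ConcurrentHashMap.newKeySet();
        // Trips only after every task has called setPage on the shared object.
        CyclicBarrier allPagesSet = new CyclicBarrier(totalPages);
        // One thread per task so that all tasks can reach the barrier.
        ExecutorService pool = Executors.newFixedThreadPool(totalPages);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int pageNum = 1; pageNum <= totalPages; pageNum++) {
                final int currentPage = pageNum;
                futures.add(pool.submit(() -> {
                    shared.setPage(currentPage);   // same step as in the original code
                    allPagesSet.await();           // every other task overwrites page here
                    queried.add(shared.getPage()); // stands in for queryPage(baseRequest)
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(10, TimeUnit.SECONDS);
            }
            return queried;
        } catch (Exception e) {
            throw new IllegalStateException("demo task failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Eight tasks, but only one distinct page is ever queried.
        System.out.println(pagesQueriedWithSharedRequest(8));
    }
}
```

In the real export the interleaving is random rather than forced, so the damage is usually partial: a few pages duplicated and a few missing, varying from run to run.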
Fixes
Option 1: create an independent request object for each task (recommended)
java
/**
 * Process the pages concurrently.
 */
private void processBatches(StudentPageRequest baseRequest,
                            int totalPages, ExcelWriter excelWriter,
                            WriteSheet writeSheet) {
    if (totalPages <= 1) {
        return;
    }
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    // One task per page
    for (int pageNum = 1; pageNum <= totalPages; pageNum++) {
        final int currentPage = pageNum;
        // Create an independent request object for each task, copying the properties of baseRequest
        StudentPageRequest pageRequest = createStudentPageRequest(baseRequest);
        pageRequest.setPage(currentPage);
        // Run the task on the executor
        CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
            try {
                PageResponse<List<StudentResponse>> pageData = this.queryPage(pageRequest);
                // Convert the page to row data in parallel
                List<List<String>> rowData = parallelConvertToRowData(pageData.getData());
                synchronized (excelWriter) {
                    excelWriter.write(rowData, writeSheet);
                }
                log.info("Finished page {}/{}, rows: {}", currentPage, totalPages, pageData.getData().size());
            } catch (Exception e) {
                throw new AppRuntimeException(ResponseCode.EXPORT_FILE_FAILED, e);
            }
        }, importTaskExecutor);
        futures.add(future);
    }
    // Wait for all tasks to finish, with a 30-minute timeout
    CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
            .orTimeout(30, TimeUnit.MINUTES)
            .join();
}
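The body of createStudentPageRequest is not shown in this note. One possible shape is a plain field-by-field copy, sketched below under assumptions: the StudentPageRequest class here is a hypothetical stand-in, and pageSize and keyword are invented fields used only for illustration. The small driver checks that, with one copy per task, every page is queried exactly once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PerTaskRequestCopyDemo {

    // Hypothetical stand-in; the real StudentPageRequest has its own fields.
    static class StudentPageRequest {
        private int page;
        private int pageSize;
        private String keyword;
        int getPage() { return page; }
        void setPage(int page) { this.page = page; }
        int getPageSize() { return pageSize; }
        void setPageSize(int pageSize) { this.pageSize = pageSize; }
        String getKeyword() { return keyword; }
        void setKeyword(String keyword) { this.keyword = keyword; }
    }

    // Assumed shape of the copy helper: copy every field and leave the source untouched.
    static StudentPageRequest createStudentPageRequest(StudentPageRequest source) {
        StudentPageRequest copy = new StudentPageRequest();
        copy.setPage(source.getPage());
        copy.setPageSize(source.getPageSize());
        copy.setKeyword(source.getKeyword());
        return copy;
    }

    // Returns the distinct pages queried when each task owns its request.
    static Set<Integer> pagesQueriedWithCopies(int totalPages) {
        StudentPageRequest base = new StudentPageRequest();
        base.setPageSize(500);
        base.setKeyword("demo");
        Set<Integer> queried = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int pageNum = 1; pageNum <= totalPages; pageNum++) {
                // Copy and set the page on the submitting thread, before the task starts.
                StudentPageRequest pageRequest = createStudentPageRequest(base);
                pageRequest.setPage(pageNum);
                // Recording the page stands in for queryPage(pageRequest).
                futures.add(CompletableFuture.runAsync(
                        () -> queried.add(pageRequest.getPage()), pool));
            }
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return queried;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(pagesQueriedWithCopies(8).size() + " distinct pages queried");
    }
}
```

If the real request holds mutable collections (for example a list of filter IDs), a field-by-field copy like this shares those collections between the copies; they would need to be copied as well, or treated as read-only.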
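Option 2: create the copy inside the task, as a local variable
The copy is made inside the lambda, so it is an ordinary local variable confined to that task (no ThreadLocal is involved). This is safe as long as nothing modifies baseRequest while the tasks are running.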
java
/**
 * Process the pages concurrently.
 */
private void processBatches(StudentPageRequest baseRequest,
                            int totalPages, ExcelWriter excelWriter,
                            WriteSheet writeSheet) {
    if (totalPages <= 1) {
        return;
    }
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    // One task per page
    for (int pageNum = 1; pageNum <= totalPages; pageNum++) {
        final int currentPage = pageNum;
        // Run the task on the executor
        CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
            try {
                // Local to this task: copy the properties of baseRequest, then set the page
                StudentPageRequest pageRequest = createStudentPageRequest(baseRequest);
                pageRequest.setPage(currentPage);
                PageResponse<List<StudentResponse>> pageData = this.queryPage(pageRequest);
                // Convert the page to row data in parallel
                List<List<String>> rowData = parallelConvertToRowData(pageData.getData());
                synchronized (excelWriter) {
                    excelWriter.write(rowData, writeSheet);
                }
                log.info("Finished page {}/{}, rows: {}", currentPage, totalPages, pageData.getData().size());
            } catch (Exception e) {
                throw new AppRuntimeException(ResponseCode.EXPORT_FILE_FAILED, e);
            }
        }, importTaskExecutor);
        futures.add(future);
    }
    // Wait for all tasks to finish, with a 30-minute timeout
    CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
            .orTimeout(30, TimeUnit.MINUTES)
            .join();
}
Summary:
In multithreaded code, avoid mutating objects that are shared between tasks; give each task its own independent object instead.
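A related way to follow this advice, sketched here as an illustration rather than as part of the fix above, is to make the request immutable so that it cannot be modified by accident. PageQuery and withPage are hypothetical names; the real StudentPageRequest may not be changeable in this way.

```java
import java.util.ArrayList;
import java.util.List;

class ImmutableRequestDemo {

    // Hypothetical immutable request: all fields are final, so sharing it is safe.
    static final class PageQuery {
        private final int page;
        private final int pageSize;

        PageQuery(int page, int pageSize) {
            this.page = page;
            this.pageSize = pageSize;
        }

        int page() { return page; }
        int pageSize() { return pageSize; }

        // Returns a new object instead of mutating this one.
        PageQuery withPage(int newPage) {
            return new PageQuery(newPage, pageSize);
        }
    }

    // Builds one independent query per page from a shared, never-modified base.
    static List<PageQuery> perPageQueries(PageQuery base, int totalPages) {
        List<PageQuery> queries = new ArrayList<>();
        for (int pageNum = 1; pageNum <= totalPages; pageNum++) {
            queries.add(base.withPage(pageNum));
        }
        return queries;
    }

    public static void main(String[] args) {
        PageQuery base = new PageQuery(1, 500);
        List<PageQuery> queries = perPageQueries(base, 3);
        System.out.println(queries.size() + " queries built, base page is still " + base.page());
    }
}
```

With this design there is no setPage to call from a task, so the original bug cannot be reintroduced by a later edit.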