Introduction
In everyday enterprise operations and development, this situation comes up constantly: an application's log files need to be distributed to several servers for analysis, or a new model file must be synchronized quickly into shared directories on many machines. For a handful of occasional files, manual copying is manageable; but once files become numerous or very large, or distribution becomes frequent, manual work turns slow and error-prone. Worse, network jitter, pre-existing target files, and individual copy failures all complicate the process.

A configurable, multi-threaded, automated file copy service with failure retry and overwrite support is therefore well worth building. It saves significant manual effort and keeps distribution stable and consistent. Starting from the requirements, this article walks through a Spring Boot implementation that supports streaming/NIO copying of large files, failure retry, and overwrite.
Powered by Moshow 🚀🔥 | Show more 👉🌟 https://zhengkai.blog.csdn.net/
1. Background and Requirements
In distributed systems and multi-machine deployments, local files often need to be copied to multiple shared directories: log archiving, model file distribution, configuration synchronization, and so on. Common requirements include:
- Multi-threaded concurrent copying: improve throughput for large batches of files
- Configurable source/target directories: predefine the source directory and multiple target directories per task (e.g. tasks named gle and lis) in the configuration file
- Task queue and scheduling: API calls enqueue tasks; a scheduler checks and executes them every minute
- Batch and log records: record the batch ID, start/end time, and total duration, plus the copy duration and result of every file
- Queue merging: requests with the same task name are merged into a single execution to avoid duplicates
- Large file support: streaming chunked copy or NIO zero-copy transfer
- Failure retry: each file is automatically retried once on failure
- Overwrite: existing target files are forcibly overwritten
2. Configuration (application.yml)

```yaml
copy:
  executor:
    corePoolSize: 8
    maxPoolSize: 16
    queueCapacity: 200
  scheduler:
    cron: "0 * * * * *"     # run once per minute
  behavior:
    mode: "NIO"             # STREAM or NIO
    bufferBytes: 8388608    # 8 MB buffer
    retryTimes: 1           # retries per file on failure
  tasks:
    gle:
      source: "/data/gle/source"
      destinations:
        - "//hostA/share/gle"
        - "//hostB/share/gle"
    lis:
      source: "/data/lis/source"
      destinations:
        - "//hostA/share/lis"
        - "//hostC/share/lis"
```
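The YAML above can be bound to a properties class. The following is a minimal sketch of such a binding class: field names mirror the YAML and the getters used later in `CopyExecutorService` (`getTasks()`, `getBehavior()`, `getBufferBytes()`, `getRetryTimes()`); the `executor` and `scheduler` sub-sections are omitted for brevity, and the Spring annotation is shown as a comment to keep the snippet dependency-free.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the binding class for the copy.* properties above.
// In the Spring Boot app this would carry @ConfigurationProperties(prefix = "copy").
public class CopyProperties {

    public static class Behavior {
        private String mode = "NIO";               // STREAM or NIO
        private int bufferBytes = 8 * 1024 * 1024; // 8 MB default
        private int retryTimes = 1;
        public String getMode() { return mode; }
        public void setMode(String mode) { this.mode = mode; }
        public int getBufferBytes() { return bufferBytes; }
        public void setBufferBytes(int bufferBytes) { this.bufferBytes = bufferBytes; }
        public int getRetryTimes() { return retryTimes; }
        public void setRetryTimes(int retryTimes) { this.retryTimes = retryTimes; }
    }

    public static class TaskDef {
        private String source;
        private List<String> destinations = new ArrayList<>();
        public String getSource() { return source; }
        public void setSource(String source) { this.source = source; }
        public List<String> getDestinations() { return destinations; }
        public void setDestinations(List<String> destinations) { this.destinations = destinations; }
    }

    private Behavior behavior = new Behavior();
    private Map<String, TaskDef> tasks = new HashMap<>();
    public Behavior getBehavior() { return behavior; }
    public void setBehavior(Behavior behavior) { this.behavior = behavior; }
    public Map<String, TaskDef> getTasks() { return tasks; }
    public void setTasks(Map<String, TaskDef> tasks) { this.tasks = tasks; }
}
```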
3. Core Design
3.1 Queue and Merging
- A ConcurrentHashMap.newKeySet() holds the pending tasks, deduplicated by task name.
- API calls enqueue a task; if the name is already queued, the new request is merged into the existing one.
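The enqueue-or-merge behavior reduces to the set's `add` return value. A minimal sketch (the class name and status strings mirror the API section later, but the exact shape of the real queue component is an assumption):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Pending task names, deduplicated: a second enqueue of the same name is merged.
public class CopyTaskQueue {
    private final Set<String> pending = ConcurrentHashMap.newKeySet();

    /** Returns "ENQUEUED" for a new task name, "MERGED" if it was already queued. */
    public String enqueue(String taskName) {
        return pending.add(taskName) ? "ENQUEUED" : "MERGED";
    }

    /** Takes a snapshot for the scheduler to execute and removes it from the queue. */
    public Set<String> drain() {
        Set<String> snapshot = ConcurrentHashMap.newKeySet();
        snapshot.addAll(pending);
        pending.removeAll(snapshot);
        return snapshot;
    }
}
```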
3.2 Scheduled Execution
- @Scheduled scans the queue every minute and executes the queued tasks one by one.
- Each task run gets a batch ID; execution times and results are recorded.
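In plain JDK terms, the polling loop can be sketched with a ScheduledExecutorService; the real service relies on Spring's @Scheduled with the cron expression from application.yml, so this class and its signatures are illustrative only:

```java
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Plain-JDK stand-in for @Scheduled(cron = "0 * * * * *"): poll at a fixed
// rate, drain the pending set, and run every task name found in it.
public class CopyScheduler {
    private final ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();

    public void start(Supplier<Set<String>> drainQueue, Consumer<String> runTask, long periodMillis) {
        ses.scheduleAtFixedRate(
                () -> drainQueue.get().forEach(runTask),
                0, periodMillis, TimeUnit.MILLISECONDS);
    }

    public void stop() { ses.shutdownNow(); }
}
```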
3.3 Multi-threaded Copying
- A ThreadPoolTaskExecutor receives one copy task per (file × target directory) pair.
- Each file copy task runs independently without blocking the others.
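The fan-out pattern can be illustrated with plain java.util.concurrent primitives sized per the configuration (core 8, max 16, queue 200); the service itself uses Spring's ThreadPoolTaskExecutor with the same settings, so this demo class is an assumption-laden sketch with the actual copy replaced by a counter:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CopyPoolDemo {
    // Pool sized per the application.yml example: core 8, max 16, queue 200.
    static ExecutorService newPool() {
        return new ThreadPoolExecutor(8, 16, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(200),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // One task per (file, destination) pair, then wait for all of them —
    // the same CompletableFuture.allOf pattern the service uses.
    public static int copyAll(List<String> files, List<String> dests) {
        ExecutorService pool = newPool();
        AtomicInteger done = new AtomicInteger();
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (String f : files)
            for (String d : dests)
                futures.add(CompletableFuture.runAsync(done::incrementAndGet, pool));
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        pool.shutdown();
        return done.get();
    }
}
```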
3.4 Large File Copying
Two modes are provided:

Streaming chunked copy (STREAM)

```java
private void streamCopy(Path src, Path dst, int bufferBytes) throws IOException {
    try (InputStream in = Files.newInputStream(src);
         OutputStream out = Files.newOutputStream(dst,
                 StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
        byte[] buf = new byte[bufferBytes];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }
}
```
NIO zero-copy transfer

```java
private void nioCopy(Path src, Path dst) throws IOException {
    try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
         FileChannel out = FileChannel.open(dst,
                 StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
        long size = in.size();
        long pos = 0;
        while (pos < size) {
            long transferred = in.transferTo(pos, size - pos, out);
            if (transferred <= 0) throw new IOException("Zero progress in transfer");
            pos += transferred;
        }
    }
}
```
4. Failure Retry and Overwrite
When a file copy fails, it is automatically retried:

```java
private void copyWithRetry(Path file, Path sourceRoot, Path destDir, CopyBatchResult batch) {
    int maxRetries = props.getBehavior().getRetryTimes();
    int attempt = 0;
    while (true) {
        try {
            copyOne(file, sourceRoot, destDir, batch);
            return; // success
        } catch (IOException e) {
            if (attempt >= maxRetries) {
                recordFailure(file, sourceRoot, destDir, batch, e);
                return;
            }
            attempt++;
            log.warn("Retry {}/{} copying {} -> {} due to: {}",
                    attempt, maxRetries, file, destDir, e.getMessage());
        }
    }
}
```
- Overwrite: always open the target with StandardOpenOption.TRUNCATE_EXISTING (or, when using Files.copy, pass StandardCopyOption.REPLACE_EXISTING).
- Failure records: failures are written both to the log and to the batch result.
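Stripped of the file-copy specifics, the retry policy above reduces to a small generic helper. This is a sketch; RetrySupport and its signature are hypothetical, and logging is omitted:

```java
import java.util.concurrent.Callable;

public class RetrySupport {
    /**
     * Runs the action, retrying up to maxRetries times on failure.
     * Returns true on eventual success, false once all attempts have failed
     * (at which point the caller would record the failure).
     */
    public static boolean runWithRetry(Callable<Void> action, int maxRetries) {
        int attempt = 0;
        while (true) {
            try {
                action.call();
                return true; // success
            } catch (Exception e) {
                if (attempt >= maxRetries) return false; // exhausted
                attempt++;
            }
        }
    }
}
```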
5. REST API
- Enqueue a task: POST /api/copy/enqueue/{taskName} enqueues the task and returns its status (ENQUEUED or MERGED).
- Run immediately: POST /api/copy/run/{taskName} executes the task right away and returns the batch result.
- Query history: GET /api/copy/history returns the batch execution history.
6. Logging and Monitoring
- Batch log: batch ID, task name, start/end time, total duration, status.
- File log: source file, target directory, byte count, duration, result, error message.
- Monitoring suggestion: persist batch results to a database and expose Prometheus metrics (failure rate, throughput, average duration).
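As a starting point for such metrics, the failure rate can be derived directly from the per-file results a batch already collects. A minimal sketch; BatchMetrics is a hypothetical helper operating on the result strings used throughout this article:

```java
import java.util.List;

public class BatchMetrics {
    /** Fraction of per-file results marked "FAILED"; 0.0 for an empty batch. */
    public static double failureRate(List<String> results) {
        if (results.isEmpty()) return 0.0;
        long failed = results.stream().filter("FAILED"::equals).count();
        return (double) failed / results.size();
    }
}
```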
CopyExecutorService: Full Code

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.stereotype.Service;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;

@Service
@RequiredArgsConstructor
@Slf4j
class CopyExecutorService {

    private final ThreadPoolTaskExecutor executor;
    private final CopyProperties props;
    private final List<CopyBatchResult> history = Collections.synchronizedList(new ArrayList<>());

    public CopyBatchResult executeTask(String taskName, String batchLabel) {
        CopyProperties.TaskDef def = props.getTasks().get(taskName);
        if (def == null) throw new IllegalArgumentException("Unknown task: " + taskName);

        CopyBatchResult result = new CopyBatchResult();
        result.setBatchId(UUID.randomUUID().toString());
        result.setTaskName(taskName);
        result.setBatchLabel(batchLabel);
        result.setStartTime(Instant.now());

        Path sourceDir = Paths.get(def.getSource());
        List<Path> destinations = def.getDestinations().stream().map(Paths::get).toList();

        List<Path> files;
        try (Stream<Path> s = Files.walk(sourceDir)) {
            files = s.filter(Files::isRegularFile).toList();
        } catch (IOException e) {
            log.error("List source files failed: {}", e.getMessage(), e);
            result.setStatus("FAILED");
            result.setEndTime(Instant.now());
            result.setTotalMillis(Duration.between(result.getStartTime(), result.getEndTime()).toMillis());
            history.add(result);
            return result;
        }

        // One task per (file, destination) pair, fanned out across the pool.
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (Path file : files) {
            for (Path destDir : destinations) {
                futures.add(CompletableFuture.runAsync(
                        () -> copyWithRetry(file, sourceDir, destDir, result), executor));
            }
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

        long failures = result.getDetails().stream().filter(d -> "FAILED".equals(d.getResult())).count();
        long total = result.getDetails().size();
        result.setEndTime(Instant.now());
        result.setTotalMillis(Duration.between(result.getStartTime(), result.getEndTime()).toMillis());
        result.setStatus(failures == 0 ? "SUCCESS" : (failures < total ? "PARTIAL" : "FAILED"));
        history.add(result);
        log.info("Batch {} task={} status={} files={} total={}ms",
                result.getBatchId(), taskName, result.getStatus(), total, result.getTotalMillis());
        return result;
    }

    private void copyWithRetry(Path file, Path sourceRoot, Path destDir, CopyBatchResult batch) {
        int maxRetries = Math.max(0, props.getBehavior().getRetryTimes());
        int attempt = 0;
        while (true) {
            try {
                copyOne(file, sourceRoot, destDir, batch);
                return; // success
            } catch (IOException e) {
                if (attempt >= maxRetries) {
                    recordFailure(file, sourceRoot, destDir, batch, e);
                    return;
                }
                attempt++;
                log.warn("Retry {}/{} copying {} -> {} due to: {}",
                        attempt, maxRetries, file, destDir, e.getMessage());
                // Simple linear backoff; add random jitter if needed.
                try {
                    Thread.sleep(500L * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    recordFailure(file, sourceRoot, destDir, batch, e);
                    return;
                }
            }
        }
    }

    private void copyOne(Path file, Path sourceRoot, Path destDir, CopyBatchResult batch) throws IOException {
        long startNs = System.nanoTime();
        FileCopyDetail detail = new FileCopyDetail();
        detail.setSourceFile(file);
        detail.setDestinationDir(destDir);

        Path rel = sourceRoot.relativize(file);
        Path target = destDir.resolve(rel);
        Files.createDirectories(target.getParent());

        // Overwrite semantics: targets are always truncated before writing.
        // CopyMode is an enum with the values STREAM and NIO.
        CopyMode mode = CopyMode.valueOf(props.getBehavior().getMode().toUpperCase());
        switch (mode) {
            case STREAM -> streamCopy(file, target, props.getBehavior().getBufferBytes());
            case NIO -> nioCopy(file, target);
        }

        detail.setResult("COPIED");
        detail.setMessage("overwritten");
        detail.setBytes(Files.size(file));
        long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
        detail.setMillis(millis);
        synchronized (batch) { batch.getDetails().add(detail); }
        log.debug("File copied: {} -> {} in {} ms (mode={})", file, target, millis, mode);
    }

    private void recordFailure(Path file, Path sourceRoot, Path destDir,
                               CopyBatchResult batch, Exception e) {
        FileCopyDetail detail = new FileCopyDetail();
        detail.setSourceFile(file);
        detail.setDestinationDir(destDir);
        detail.setResult("FAILED");
        detail.setMessage(e.getMessage());
        try { detail.setBytes(Files.size(file)); } catch (IOException ignored) {}
        detail.setMillis(0L); // duration of the failed attempts is not tracked here
        synchronized (batch) { batch.getDetails().add(detail); }
        log.warn("Copy failed: {} -> {} ({})", file, destDir, e.getMessage());
    }

    // Streaming chunked copy (overwrites the target)
    private void streamCopy(Path src, Path dst, int bufferBytes) throws IOException {
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            byte[] buf = new byte[Math.max(1024 * 1024, bufferBytes)]; // at least 1 MB
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.flush();
        }
    }

    // NIO channel copy (overwrites the target)
    private void nioCopy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            long size = in.size();
            long pos = 0;
            while (pos < size) {
                long transferred = in.transferTo(pos, size - pos, out);
                if (transferred <= 0) {
                    // Some file systems may return 0; fall back to transferFrom
                    // (after seeking the source to pos) to keep making progress.
                    in.position(pos);
                    transferred = out.transferFrom(in, pos, size - pos);
                    if (transferred <= 0) throw new IOException("Zero progress in transfer");
                }
                pos += transferred;
            }
            out.force(true);
        }
    }

    public List<CopyBatchResult> listHistory() { return new ArrayList<>(history); }
}
```