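I. Preface
A recent project required downloading large files, but very large files tend to make the download fail on the server side, because a long transfer easily runs into a request timeout. Chunked download solves this problem.
The benefits of chunked download are as follows:
1. When a large file is downloaded in a single request, a network fluctuation or interruption forces the whole file to be downloaded again. With chunked download, only the failed chunks need to be retried, which avoids a full re-download.
2. Serving a large file in a single request can require the server to buffer a large amount of data in memory, which may crash the server. Chunked download streams the data in small pieces, so memory usage is lower and the load on the server is lighter.
3. A traditional single-request download of a large file is less efficient, whereas chunks can be downloaded in parallel, so the overall speed can be much higher than with a single thread.
4. A single-request download of a large file is prone to timeouts, while each chunk request is short, which helps avoid timeout interruptions on the server.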
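As a rough illustration of how a client might plan its chunk requests, the sketch below splits a file of known size into inclusive byte ranges and formats each one as a Range header value. The class name ChunkPlanner and the 4 MiB chunk size are assumptions made for this example; they do not come from the project code.

```java
/** Splits a file of known size into inclusive byte ranges for HTTP Range requests. */
final class ChunkPlanner {

    /**
     * Returns one {start, end} pair per chunk. Both bounds are inclusive,
     * matching the "bytes=start-end" form of the Range header.
     */
    static long[][] plan(long totalBytes, long chunkSize) {
        if (totalBytes < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and chunkSize > 0");
        }
        int count = (int) ((totalBytes + chunkSize - 1) / chunkSize); // ceiling division
        long[][] ranges = new long[count][];
        for (int i = 0; i < count; i++) {
            long start = i * chunkSize;
            long end = Math.min(start + chunkSize, totalBytes) - 1; // the last chunk may be shorter
            ranges[i] = new long[] {start, end};
        }
        return ranges;
    }

    /** Formats a range as the value of a Range request header. */
    static String toRangeHeader(long[] range) {
        return "bytes=" + range[0] + "-" + range[1];
    }

    public static void main(String[] args) {
        // Hypothetical sizes: a 10 MiB file requested in 4 MiB chunks
        for (long[] r : plan(10L * 1024 * 1024, 4L * 1024 * 1024)) {
            System.out.println(toRangeHeader(r));
        }
    }
}
```

Each range can then be requested independently, so a failed chunk is retried on its own and several chunks can be in flight at once.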
II. How to download a file in chunks
1. Add the Maven dependencies.
<dependency>
<groupId>io.minio</groupId>
<artifactId>minio</artifactId>
<version>8.4.3</version>
<exclusions>
<exclusion>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
<version>4.10.0</version>
</dependency>
2. Configure the MinIO connection
2.1 Define the MinIO properties
import com.smartcitysz.dp.upload.constants.PlatformEnum;
import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;
@Data
@Component
@ConfigurationProperties(prefix = "minio")
public class MinioProperties {
private String accessKey;
private String secretKey;
private String endpoint;
private String bucket;
/**
* Access domain
*/
private String domain = "";
/**
* Whether storage is enabled
*/
private Boolean enableStorage = true;
/**
* Storage platform
*/
private PlatformEnum platform = PlatformEnum.MINIO;
/**
* Base path
*/
private String basePath = "";
}
2.2 Configure the MinIO client. The code is as follows:
import com.smartcitysz.dp.minio.MinioService;
import com.smartcitysz.dp.minio.utils.MinioUtils;
import io.minio.MinioClient;
import org.springframework.boot.autoconfigure.condition.ConditionalOnBean;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
@EnableConfigurationProperties(MinioProperties.class)
public class MinioConfiguration {
@Bean
@ConditionalOnBean(MinioProperties.class)
public MinioClient minioClient(MinioProperties minioProperties) {
return MinioClient.builder()
.endpoint(minioProperties.getEndpoint())
.credentials(minioProperties.getAccessKey(), minioProperties.getSecretKey())
.build();
}
}
2.3 Add the MinIO settings to the YAML configuration
minio:
  endpoint: http://xxxxxx:1222
  accessKey: xxxxxxxxx
  secretKey: xxxxxxxxxxx
  bucket: my-bucketName
3. Write the Service that implements chunked file download.
import cn.hutool.core.util.StrUtil;
import com.smartcitysz.corpus.upload.dto.DatasetSourceTargetDTO;
import com.smartcitysz.corpus.upload.exceptions.FileStorageRuntimeException;
import com.smartcitysz.corpus.upload.util.FileUtils;
import io.minio.GetObjectArgs;
import io.minio.MinioClient;
import io.minio.StatObjectArgs;
import io.minio.StatObjectResponse;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Service;
import org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBody;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.InputStream;
import java.net.URLEncoder;
@Service
@Slf4j
public class MinioFileProcessorService {
@Autowired
MinioProperties minioProperties;
@Autowired
MinioClient minioClient;
public long getContentLength(String bucketName, String objectName) throws Exception {
StatObjectResponse resp = minioClient.statObject(StatObjectArgs.builder().bucket(bucketName).object(objectName).build());
return resp.size();
}
public InputStream getInputStream(String bucket, String originFilePath, String fileName, Long offset, Long length) throws Exception {
GetObjectArgs.Builder builder = GetObjectArgs.builder()
.bucket(bucket)
.object((originFilePath == null ? "" : originFilePath) + fileName);
if (offset != null) {
builder.offset(offset);
}
if (length != null) {
builder.length(length);
}
return minioClient.getObject(builder.build());
}
public ResponseEntity<StreamingResponseBody> download(DatasetSourceTargetDTO targetDTO, HttpServletResponse response, HttpServletRequest request) {
log.info("targetDTO...:{}", targetDTO);
String encodedFileName = targetDTO.getSourceFileName();
// Encode the file name as UTF-8
try {
encodedFileName = URLEncoder.encode(encodedFileName, "UTF-8").replace("+", "%20"); // '+' is replaced with '%20' so that spaces are preserved
} catch (Exception e) {
log.error("Failed to encode the file name", e);
}
response.setCharacterEncoding("UTF-8");
response.setHeader("Cache-Control", "no-cache");
// Set the Content-Type
String contentType = FileUtils.getContentType(encodedFileName);
return ResponseEntity.ok()
.contentType(MediaType.parseMediaType(contentType))
.header("Content-Disposition", "attachment; filename=" + encodedFileName)
.body(out -> {
// All bytes are written through the "out" stream supplied by Spring, so no separate handle on the servlet output stream is opened here
try {
// Process a single file
String targetObjectId = targetDTO.getTargetObjectId();
String sourceFileName = targetDTO.getSourceFileName();
if (StrUtil.isNotBlank(targetObjectId) && targetObjectId.startsWith("/")) {
targetObjectId = targetObjectId.substring(1);
}
log.info("targetObjectId...:{}", targetObjectId);
log.info("sourceFileName...:{}", sourceFileName);
long totalByte = getContentLength(minioProperties.getBucket(), targetObjectId);
long startByte = 0;
long endByte = totalByte - 1;
String range = request.getHeader("Range");
log.info("======range======={}", range);
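boolean partial = false;
if (StringUtils.isNotBlank(range) && range.contains("bytes=") && range.lastIndexOf('-') > range.lastIndexOf('=')) {
// Only a single range is handled here; a multi-range header such as "bytes=0-9,20-29" is not supported
range = range.substring(range.lastIndexOf("=") + 1).trim();
int dash = range.indexOf('-');
String first = range.substring(0, dash).trim();
String last = range.substring(dash + 1).trim();
if (first.isEmpty() && !last.isEmpty()) {
// Type 1: bytes=-2343 (the last 2343 bytes of the file)
startByte = Math.max(0, totalByte - Long.parseLong(last));
} else if (!first.isEmpty() && last.isEmpty()) {
// Type 2: bytes=2343- (from byte 2343 to the end of the file)
startByte = Long.parseLong(first);
if (FileUtils.isPdf(sourceFileName)) {
// For PDFs, answer an open-ended request with at most 1 MiB (1048576 bytes)
endByte = startByte + 1048575;
}
} else if (!first.isEmpty()) {
// Type 3: bytes=22-2343
startByte = Long.parseLong(first);
endByte = Long.parseLong(last);
}
// Never report or read past the last byte of the object
endByte = Math.min(endByte, totalByte - 1);
if (startByte > endByte) {
// The requested range lies outside the object: 416 Range Not Satisfiable
response.setStatus(416);
response.setHeader("Content-Range", "bytes */" + totalByte);
return;
}
partial = true;
response.setStatus(206);
} else {
response.setStatus(200);
}
// Placeholder value kept from the original; a real deployment should send the object's actual ETag (for example from StatObjectResponse#etag())
response.setHeader("ETag", "ETag");
// Tell the client that the server supports byte-range requests
response.setHeader("Accept-Ranges", "bytes");
if (partial) {
// e.g. Content-Range: bytes 0-65535/408244, the byte range returned by this response
response.setHeader("Content-Range", filterContentRangeByte("bytes " + startByte + "-" + endByte + "/" + totalByte));
}
// Expose these headers to cross-origin scripts; otherwise the front end cannot read Accept-Ranges, assumes the server does not support ranges, and downloads the whole file
response.setHeader("Access-Control-Expose-Headers", "Accept-Ranges,Content-Range");
long contentLength = endByte - startByte + 1;
response.setContentLengthLong(contentLength);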
try (InputStream fileInputStream = getInputStream(minioProperties.getBucket(), null,targetObjectId, startByte, contentLength)) {
org.apache.commons.io.IOUtils.copy(fileInputStream, out);
} catch (Exception e) {
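log.error("Failed to download file. objectId-->" + targetObjectId, e);
throw new RuntimeException("Failed to download file");
}
} catch (Exception e) {
// Log the failure and abort the response; the input stream is already closed by the try-with-resources above
log.error("Failed to download file....", e);
throw new RuntimeException("Failed to download file");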
}
});
}
private String filterContentRangeByte(String input) {
return input.replaceAll("[\r\n]", "");
}
}
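The Range handling is easier to unit-test when it lives in a small pure function. The following is a minimal sketch under stated assumptions, not the project's code: RangeParser and parse are hypothetical names, only a single range is supported, and the PDF-specific 1 MiB cap from the service above is left out.

```java
/** Resolves a single-range HTTP Range header against a known object size. */
final class RangeParser {

    /**
     * Returns {start, end} (both inclusive) for the requested range, or null when the
     * header is absent, malformed, multi-range, or cannot be satisfied.
     */
    static long[] parse(String header, long totalBytes) {
        if (header == null || !header.startsWith("bytes=") || totalBytes <= 0) {
            return null;
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.indexOf(',') >= 0) {
            return null; // no dash at all, or several ranges in one header
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            long start;
            long end = totalBytes - 1;
            if (first.isEmpty()) {
                if (last.isEmpty()) {
                    return null; // "bytes=-" carries no information
                }
                // Suffix form "bytes=-N": the final N bytes of the object
                long suffix = Long.parseLong(last);
                if (suffix <= 0) {
                    return null;
                }
                start = Math.max(0, totalBytes - suffix);
            } else {
                start = Long.parseLong(first);
                if (!last.isEmpty()) {
                    // Clamp the end so it never points past the last byte
                    end = Math.min(Long.parseLong(last), totalBytes - 1);
                }
            }
            return (start < 0 || start > end) ? null : new long[] {start, end};
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Because null covers both "no Range header" and "range cannot be satisfied", a caller would still need to decide between sending the whole file with status 200 and replying with status 416.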
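The front end can then call this endpoint with Range headers to download the file in chunks.
Summary:
1. Chunked download splits a large file into multiple small pieces (chunks), downloads them from the server concurrently or one after another, and finally merges them locally to restore the complete file, which improves download efficiency and reliability.
2. Chunked download addresses the low efficiency of traditional large-file downloads. Because the data is streamed, memory usage is lower and the server is under less pressure, which improves efficiency, stability, and the user experience. Chunked download is therefore recommended for large files.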
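The local merge step mentioned in the summary can be sketched as follows. This is only an illustration under assumptions: ChunkMerger and writeChunk are hypothetical names, and the chunk bytes are hard-coded here instead of coming from real HTTP responses.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Reassembles independently downloaded chunks into one local file. */
final class ChunkMerger {

    /** Writes one chunk at its byte offset, so chunks may arrive in any order. */
    static void writeChunk(Path target, long offset, byte[] data) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(target.toFile(), "rw")) {
            file.seek(offset);
            file.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("merged-", ".bin");
        // Simulate two chunks that finish out of order
        writeChunk(target, 6, "world".getBytes(StandardCharsets.US_ASCII));
        writeChunk(target, 0, "hello ".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.US_ASCII));
        Files.delete(target);
    }
}
```

Writing each chunk at the start offset of its range means no separate concatenation pass is needed once the last chunk has arrived.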