Chunked Upload for Large Files

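It has been a while since I last published an article. I recently had to implement chunked upload of large files at work, so I put together a simple demo to share.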

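First, why upload large files in chunks at all?

  1. Timeouts: requests sent with axios usually have a timeout, and a sufficiently large file uploaded in a single request will exceed it (you can of course set a very long timeout).
  2. Retransmission after a network failure: if the connection drops or the request times out, the whole file has to be uploaded again from the start.
  3. High memory usage: both the browser reading the file and the server receiving it need a lot of memory when the file is handled in one piece.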

[Figure: diagram of the approach]

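Frontend

  • Single-file chunked upload first. We need to define:

    • a unique identifier for the file: its MD5 hash
    • the array of file chunks, chunks
    • the size of each chunk
    • the upload status of each chunk
    • the total number of chunks (used to decide whether the file can be merged)
    vue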
    <!-- template -->
    <el-upload ref="uploader" class="flex-center gap-20" action="" multiple :http-request="httpRequsetSubmit"
    				:auto-upload="false" :show-file-list="false" :disabled="isUploading" :on-remove="uploaderOnRemove"
    				:accept="acceptFileTypeString" :on-change="onChange" :limit="limit" :on-success="uploadFileOnSuccess"
    				:on-error="uploadFileOnError" :on-exceed="uploadFileExceed">
    				<template #trigger>
    					<el-button class="select-files" :disabled="isUploading || enContinue">
    						Select files
    					</el-button>
    				</template>
    </el-upload>
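    typescript
    // script
    import { ref } from 'vue';
    import SparkMD5 from 'spark-md5';
    import { ElMessage } from 'element-plus';

    const fileSparkMD5 = ref([]); // MD5 of each file, used as its unique identifier
    const fileChuncks = ref([]); // list of file chunks
    const chunckSize = ref(1*1024*1024); // chunk size (1 MB)
    const promiseArr = []; // promises of the chunk uploads
    const isUploadChuncks = ref([]); // array such as [1,1,1,0,0,1]; each position holds the upload status of the chunk with that index: 1 = uploaded, 0 = not uploaded
    const uploadProgress = ref(0); // upload progress
    const uploadQuantity = ref(0); // total number of uploads
    // httpRequsetSubmit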
    async function httpRequsetSubmit({ // async, because the body uses await
    	file,
    	onProgress,
    	onSuccess,
    	onError
    }: {
    	file: File
    	onProgress: Function
    	onSuccess: Function
    	onError: Function
    	onException?: Function
    }) {
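       const data = await getFileMD5(file); // compute the file's MD5 with spark-md5
       fileSparkMD5.value.push({md5Value:data,fileKey:file.name});
       sliceFile(file);
       const isUploaded = await checkFile(data); // has this file been uploaded before? (`data` holds the MD5; `md5Value` is not defined in this scope)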
      if(isUploaded) {
            const hasEmptyChunk = isUploadChuncks.value.findIndex(item => item === 0); // look for a chunk that has not been uploaded yet
            if(hasEmptyChunk === -1) {
                ElMessage({message:'Upload succeeded',type:'success'});
                return;
            }else {
                // upload the missing chunks; the index k is the chunk's sequence number
                for(let k = 0; k < isUploadChuncks.value.length; k++) {
                    if(isUploadChuncks.value[k] !== 1) {
                        const {md5Value,fileKey} = fileSparkMD5.value[0]; // single-file case; with multiple files, iterate and match the right file
                        let data = new FormData();
                        data.append('totalNumber',String(fileChuncks.value.length)); // total number of chunks
                        data.append("chunkSize",String(chunckSize.value)); // chunk size
                        data.append("chunckNumber",String(k)); // chunk index
                        data.append('md5',md5Value); // unique identifier of the file
                        data.append('name',fileKey); // file name
                        data.append('file',new File([fileChuncks.value[k].fileChuncks],fileKey)) // the chunk itself
                        promiseArr.push(httpRequest(data,k,fileChuncks.value.length)); // httpRequest (not shown) returns a promise
                    }
                }
            }
        }else {
            // never uploaded: run the full upload
            fileChuncks.value.forEach((e, i)=>{
                const {md5Value,fileKey} = fileSparkMD5.value.find(item => item.fileKey === e.fileName);
                let data = new FormData();
                data.append('totalNumber',String(fileChuncks.value.length));
                data.append("chunkSize",String(chunckSize.value));
                data.append("chunckNumber",String(i));
                data.append('md5',md5Value); // unique identifier of the file
                data.append('name',fileKey);
                data.append('file',new File([e.fileChuncks],fileKey))
                promiseArr.push(httpRequest(data,i,fileChuncks.value.length)); // httpRequest returns a promise
            })
        }
        let uploadResult = uploadResultRef.value; // uploadResultRef: template ref to a result element, declared elsewhere
        Promise.all(promiseArr).then((e)=>{
            uploadResult.innerHTML = 'Upload succeeded';
            // Promise.all resolves once every chunk upload has finished; only then start merging the file
            mergeFile(fileSparkMD5.value,fileChuncks.value.length);
        }).catch(e=>{
            ElMessage({message:'The file was not uploaded completely, please continue the upload',type:'error'});
            uploadResult.innerHTML = 'Upload failed';
        })
    }
    
    // Compute the file's MD5 (some browsers limit the size of file that can be read this way)
    function getFileMD5 (file) {
        return new Promise((resolve, reject) => {
            const fileReader = new FileReader();
            fileReader.onload = (e) =>{
                const fileMd5 = SparkMD5.ArrayBuffer.hash(e.target.result)
                resolve(fileMd5)
            }
            fileReader.onerror = () =>{
                // reject takes a single argument, so wrap the failure in one Error
                reject(new Error('Failed to read file'))
            }
            fileReader.readAsArrayBuffer(file);
        })
    }
    
    // Slice the file into chunks
    const sliceFile = (file) => {
        // the chunks produced from the file
        const chuncks = [];
        let start = 0 ;
        let end;
        while(start < file.size) {
            end = Math.min(start + chunckSize.value,file.size);
            // slice() takes the bytes in [start, end) without reading them
            chuncks.push({fileChuncks:file.slice(start,end),fileName:file.name}); 
            start = end;
        }
        fileChuncks.value = [...chuncks];
    }
    // Check whether the file has been uploaded before
    const checkFile = async (md5) => {
        const data = await api.checkChuncks({ md5: md5 });
        if (data.length === 0) {
            return false;
        }
        const {file_hash:fileHash,chunck_total_number:chunckTotal} = data[0]; // file info (hash, total chunk count); identical on every chunk row
        if(fileHash === md5) {
            const allChunckStatusList = new Array(Number(chunckTotal)).fill(0); 
            const chunckNumberArr = data.map(item => item.chunck_number);  // indexes of the chunks recorded as uploaded in the database
            chunckNumberArr.forEach((item) => {
                allChunckStatusList[item] = 1
            });
            isUploadChuncks.value = [...allChunckStatusList];
            return true; // the file is known to the server; this is what enables instant upload and resumable upload
        }else {
            return false;
        }
    }
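
getFileMD5 above reads the whole file into memory before hashing it, which runs into the memory problem listed at the start. Below is a minimal sketch of hashing the file slice by slice instead. The names IncrementalHasher and hashInChunks are made up for this example; the interface is modeled on the append and end methods of spark-md5's SparkMD5.ArrayBuffer class, and the sketch assumes an environment where Blob.arrayBuffer() is available.

```typescript
// Hypothetical minimal interface; `new SparkMD5.ArrayBuffer()` from spark-md5 has this shape.
interface IncrementalHasher {
  append(data: ArrayBuffer): unknown;
  end(): string;
}

// Hash a Blob (a File is a Blob) one slice at a time, so that only about
// `chunkSize` bytes are read into memory per step.
async function hashInChunks(
  blob: Blob,
  hasher: IncrementalHasher,
  chunkSize = 1024 * 1024,
): Promise<string> {
  for (let start = 0; start < blob.size; start += chunkSize) {
    const end = Math.min(start + chunkSize, blob.size);
    hasher.append(await blob.slice(start, end).arrayBuffer());
  }
  return hasher.end();
}
```

With spark-md5 this could be called as hashInChunks(file, new SparkMD5.ArrayBuffer(), chunckSize.value).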
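  • Multi-file chunked upload: this is closer to real requirements, where uploads almost always involve several files

    • The overall logic is the same as for a single file
    • Each file needs its own status, chunk array, total chunk count, and file name
    vue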
    <!-- template -->
    <div class="file-upload-view">
    		<div class="upload-status-container">
    			<el-table :data="mergedItem" style="width: 100%" max-height="700px">
    				<el-table-column prop="name" label="File name" width="300" />
    				<el-table-column prop="size" label="File size" :formatter="sizeformatter" />
    				<el-table-column prop="status" label="Upload status" :formatter="statusFormatter" />
    				<el-table-column prop="percentage" label="Upload progress">
    					<template #default="scope">
    						<div v-if="fileUploadContexts[scope.row.uid]">
    							<el-progress :percentage="fileUploadContexts[scope.row.uid].progress"
    								:status="fileUploadContexts[scope.row.uid].isCompleted ? 'success' : ''" />
    						</div>
    						<span v-else class="text-primary text-15">Preparing...</span>
    					</template>
    				</el-table-column>
    				<el-table-column label="Actions" align="center" width="100px">
    					<template #default="scope">
    						<div v-if="scope.row.status != 'success'" class="flex-center cursor-pointer"
    							@click="onRemove(scope.row)">
    							<i class="i-material-symbols:scan-delete text-20"></i>
    						</div>
    					</template>
    				</el-table-column>
    			</el-table>
    		</div>
    		<div class="mt-20">
    			<!-- template -->
    			<el-upload ref="uploader" class="flex-center gap-20" action="" multiple :http-request="httpRequsetSubmit"
    				:auto-upload="false" :show-file-list="false" :disabled="isUploading" :on-remove="uploaderOnRemove"
    				:accept="acceptFileTypeString" :on-change="onChange" :limit="limit" :on-success="uploadFileOnSuccess"
    				:on-error="uploadFileOnError" :on-exceed="uploadFileExceed">
    				<template #trigger>
    					<el-button class="select-files" :disabled="isUploading || enContinue">
    						Add more
    					</el-button>
    				</template>
    				<el-button v-if="isShowUploadButton" class="bg-primary text-white" :disabled="isUploading"
    					@click="startUpload">
    					Start upload
    				</el-button>
    				<el-button v-else class="bg-primary text-white" :disabled="isUploading" @click="onComplete">
    					Done
    				</el-button>
    			</el-upload>
    		</div>
    </div>
    typescript
    // script
    // Upload context for a single file
    interface FileChuncks {
    	fileChuncks: Blob // File.slice() returns a Blob
    	fileName: string
    	index:number
    }
    interface FileUploadContext {
      file: File;
      uid: string;
      md5: string;
      chunks: FileChuncks[];
      chunkStatus: number[]; // per-chunk status (0 = not uploaded, 1 = uploaded)
      uploadedChunks: number; // number of chunks already uploaded
      totalChunks: number;
      chunkSize: number;
      isCompleted: boolean;
      progress: number;
    }
    const fileUploadContexts = ref<Record<string, FileUploadContext>>({});
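    async function httpRequsetSubmit({
    	file,
    	onProgress,
    	onSuccess,
    	onError
    }: {
    	file: File & { uid: number } // Element Plus attaches a numeric uid to the raw file
    	onProgress: Function
    	onSuccess: Function
    	onError: Function
    	onException?: Function
    }) {
    	// the file's uid, used as the key of its context
    	const uid = String(file.uid);
    	// initialize the context for this file
    	const context: FileUploadContext = {
    		file,
    		uid,
    		md5: '',
    		chunks: [],
    		chunkStatus: [],
    		uploadedChunks: 0,
    		totalChunks: 0,
    		chunkSize: 1 * 1024 * 1024, // 1MB
    		isCompleted: false,
    		progress: 0
    	};
    	// compute the file's MD5
    	try {
    		context.md5 = await getFileMD5(file);
    	} catch (err) {
    		onError(err);
    		return;
    	}
    	// slice the file
    	sliceFile(file, context);
    	// check what the server already has
    	try {
    		const hasUploaded = await checkFile(context);
    		if (hasUploaded) {
    			// count the chunks that are already uploaded
    			context.uploadedChunks = context.chunkStatus.filter(status => status === 1).length;
    
    			// every chunk is already on the server: finish right away
    			if (context.uploadedChunks === context.totalChunks) {
    				context.progress = 100;
    				fileUploadContexts.value[uid] = context;
    				return await mergeFile(fileUploadContexts.value[uid])
    			}
    		}
    		// Store the context, then keep working through the reactive proxy:
    		// changes made to the raw `context` object would not reach the template.
    		fileUploadContexts.value[uid] = context;
    		const reactiveContext = fileUploadContexts.value[uid];
    		// upload the missing chunks
    		await uploadMissingChunks(reactiveContext, onProgress);
    		// merge the file once every chunk has been uploaded
    		return await mergeFile(reactiveContext);
    	} catch (error) {
    		onError(error);
    	}
    }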
    //=========================== Slice the file
    function sliceFile (file: File, context: FileUploadContext) {
    	// the chunks produced from the file
    	const chunks: FileChuncks[] = []
    	let start = 0
    	let end
    	while (start < file.size) {
    		end = Math.min(start + context.chunkSize, file.size)
    		// slice() takes the bytes in [start, end)
    		chunks.push({ fileChuncks: file.slice(start, end), fileName: file.name, index: chunks.length })
    		start = end
    	}
    	context.chunks = chunks;
    	context.totalChunks = chunks.length;
    	context.chunkStatus = new Array(chunks.length).fill(0); // every chunk starts as 0 (not uploaded); with 32 chunks this is [0,0,0,...] with 32 entries
    }
    // ========================== Check whether the file has been uploaded before
    async function checkFile (context: FileUploadContext): Promise<boolean>  {
    	const response = await api.checkChuncks({ md5: context.md5 });
    	if (response.data.length === 0) {
    		return false;
    	}
    	const { file_hash: fileHash, chunck_total_number: chunckTotal } = response.data[0]; // file info (hash, total chunk count); identical on every chunk row
    	if (fileHash === context.md5) {
    		const allChunckStatusList = new Array(context.totalChunks).fill(0); // status of every chunk, initialized to 0 (0 = not uploaded, 1 = uploaded)
    		const chunckNumberArr = response.data.map(item => item.chunck_number); // indexes of the chunks already uploaded (chunck_number is the chunk's index within the file)
    		console.log(allChunckStatusList, 'allChunckStatusList')
    		console.log(chunckNumberArr, 'chunckNumberArr')
    		chunckNumberArr.forEach((item) => {  // mark each uploaded index as 1
    			if (item < context.totalChunks) {
    				allChunckStatusList[item] = 1;
    			}
    		});
    		context.chunkStatus = allChunckStatusList;
    		return true; // the file is known to the server; this is what enables instant upload and resumable upload
    	}
    	return false;
    }
    // ========================== Upload the missing chunks
    async function uploadMissingChunks(context: FileUploadContext, onProgress: Function) {
    	const uploadPromises: Promise<void>[] = [];
    	for (let i = 0; i < context.totalChunks; i++) {
    		if (context.chunkStatus[i] === 0) {
    			uploadPromises.push(uploadChunk(context, i, onProgress));
    		}
    	}
    	// upload all missing chunks in parallel
    	await Promise.all(uploadPromises);
    }
    // =========================== Upload a single chunk (returns a promise)
    async function uploadChunk(context: FileUploadContext, index: number, onProgress: Function) {
    	const chunk = context.chunks[index];
    	const data = {
    		totalNumber: context.totalChunks,
    		chunkSize: context.chunkSize,
    		chunckNumber: index,
    		md5: context.md5,
    		name: context.file.name,
    	};
    	try {
    		await apiData.request(
    			new File([chunk.fileChuncks], context.file.name),
    			{
    				...apiData.data,
    				...data,
    				onUploadProgress: ({ percent }) => {
    					// Progress = (finished chunks + progress of the current chunk) / total chunks.
    					// `percent` is expected to be a fraction between 0 and 1 here. With several chunks
    					// in flight, only the chunk that fired this callback is counted, so the value is approximate.
    					const completedChunks = context.uploadedChunks;
    					const currentChunkProgress = percent;
    					console.log(currentChunkProgress,'currentChunkProgress')
    					const totalProgress = (completedChunks + currentChunkProgress) / context.totalChunks * 100;
    					console.log(totalProgress,'totalProgress')
    					context.progress = Math.min(100, Math.round(totalProgress));
    					// call Element Plus's progress callback
    					onProgress({ percent: context.progress });
    				}
    			}
    		);
    
    		// mark the chunk as uploaded
    		context.chunkStatus[index] = 1;
    		context.uploadedChunks++;
    		// recompute the progress from the finished chunks so that it is exact once a chunk completes
    		context.progress = Math.round(context.uploadedChunks / context.totalChunks * 100);
    		onProgress({ percent: context.progress });
    	} catch (err) {
    		throw new Error(`Chunk ${index + 1}/${context.totalChunks} failed to upload: ${(err as Error).message}`);
    	}
    }
    // ====================== Merge the file
    async function mergeFile (context: FileUploadContext) {
    	const params = {
    		totalNumber: context.totalChunks,
    		md5: context.md5,
    		name: context.file.name
    	};
    	try {
    		const response = await api.merge(params);
    		context.isCompleted = true;
    		return response;
    	} catch (err) {
    		throw new Error(`File merge failed: ${(err as Error).message}`);
    	}
    }
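
uploadMissingChunks above starts every missing chunk at once through Promise.all. For a file with hundreds of chunks it may be preferable to cap the number of requests in flight. Here is a minimal sketch of such a cap; runWithLimit is a name made up for this example and is not part of the demo.

```typescript
// Run async tasks with at most `limit` of them in flight at any time.
// Results keep the order of `tasks`. After the first failure no new task is
// started and the returned promise rejects; tasks already running are not cancelled.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task to start
  let failed = false;

  // Each worker keeps pulling the next unstarted task until none are left.
  async function worker(): Promise<void> {
    while (!failed && next < tasks.length) {
      const index = next++;
      try {
        results[index] = await tasks[index]();
      } catch (err) {
        failed = true;
        throw err;
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In uploadMissingChunks this could replace the Promise.all call: collect `() => uploadChunk(context, i, onProgress)` for each missing index and pass the array to runWithLimit with a small limit such as 4 (the value is arbitrary).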

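Backend (Node.js)

  • Koa framework

  • Uses koa-body, MySQL, and koa-router

  • Table design

    • chunks table: id, file_hash, file_name, chunk_total_number, chunk_size, chunk_number (the code below spells the table chunck_list and the columns chunck_total_number, chunck_size, chunck_number)
    • files table: id, file_name, file_hash, file_path, file_size
  • checkChunks endpoint

    js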
    async function checkChuncks(ctx: Context) {
    	try {
    		const { md5 } = ctx.request.body;
    		 const queryStr = `SELECT 
                    (SELECT count(*)  FROM chunck_list WHERE file_hash = ?) as all_count, 
                    id as chunck_id,
                    file_hash,
                    chunck_number,
                    chunck_total_number 
                FROM chunck_list  
                WHERE file_hash = ? 
                GROUP BY id 
                ORDER BY chunck_number
                `
            const res = await files.checkChuncks(queryStr, [md5, md5])
    		ctx.body = {
    			result: 200,
    			msg: "Query succeeded",
    			data: res ?? []
    		}
    	}
    	catch (err: any) {
    		ctx.body = {
    			result: 500,
    			msg: "Query failed",
    			data: err.message
    		}
    	}
    }
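
The endpoint returns one row per chunk that has already been stored, and the frontend's checkFile turns those rows into a 0/1 status array. A minimal sketch of that conversion as a pure function follows. ChunkRow, buildChunkStatus, and missingChunks are names made up for this example; the sketch indexes by chunck_number, so it does not depend on the order in which rows arrive, and it ignores out-of-range or duplicate rows.

```typescript
// One row of the checkChuncks result; only the field used here is declared.
interface ChunkRow {
  chunck_number: number | string; // strings are tolerated in case the column is stored as text
}

// Build the status array: 1 at every index that has a row, 0 elsewhere.
function buildChunkStatus(rows: ChunkRow[], totalChunks: number): number[] {
  const status: number[] = new Array(totalChunks).fill(0);
  for (const row of rows) {
    const index = Number(row.chunck_number);
    if (Number.isInteger(index) && index >= 0 && index < totalChunks) {
      status[index] = 1;
    }
  }
  return status;
}

// Indexes that still have to be uploaded.
function missingChunks(status: number[]): number[] {
  const missing: number[] = [];
  status.forEach((value, index) => {
    if (value !== 1) missing.push(index);
  });
  return missing;
}
```

An empty result from missingChunks corresponds to the instant-upload case; a non-empty one lists the chunks to resume with.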
  • upload endpoint

    js
    import fs from 'fs';
    import path from 'path';
    import type { Context } from 'koa';

    // uploadPath (base upload directory) and files (database helper) are defined elsewhere
    async function handleFileUpload(ctx: Context) { // async, because the body uses await
    	try {
    		const { totalNumber, chunckNumber, chunkSize, md5, name } = ctx.request.body;
    		// directory that holds the chunks of this hash
    		const chunckPath = path.join(uploadPath, 'chunks', md5, '/');
    		console.log(chunckPath)
    		if (!fs.existsSync(chunckPath)) {
    			fs.mkdirSync(chunckPath, { recursive: true })
    		}
    		console.log(totalNumber, 'totalNumber')
    		// move the uploaded file into that directory;
    		// the key point is that the file comes from ctx.request.files
    		const fileField = ctx.request.files?.file;
    		if (!fileField) {
    			throw new Error('No file received');
    		}
    		const file = Array.isArray(fileField) ? fileField[0] : fileField;
    		// move the temporary file straight to its target location
    		const targetPath = path.join(chunckPath, `${md5}-${chunckNumber}`);
    		fs.renameSync(file.filepath, targetPath);
    
    		// record the chunk in the database
    		const sql = `
                INSERT INTO file_split.chunck_list 
                (file_hash,file_name, chunck_total_number, chunck_size ,chunck_number ) 
                VALUES (?, ?, ?, ?, ?)
            `;
    		const result = await files.insertFileChunks(sql, [md5, name, totalNumber, chunkSize, chunckNumber]);
    		console.log(result, 'row inserted')
    
    		ctx.body = {
    			result: 200,
    			msg: "Upload succeeded",
    			data: null
    		}
    	}
    	catch (err: any) {
    		console.error('File upload failed:', err);
    		ctx.body = {
    			result: 500,
    			msg: "Upload failed",
    			data: err.message
    		}
    	}
    }
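
The handler above builds a directory path from the md5 field of the request, and multipart form fields arrive as strings. A small validation step before touching the filesystem could look like the sketch below. parseChunkFields and ChunkFields are names made up for this example; the field names are the ones the frontend above sends.

```typescript
interface ChunkFields {
  md5: string;
  chunkNumber: number;
  totalNumber: number;
}

// Accept a number, or a string made only of digits; everything else becomes NaN.
function toInteger(value: unknown): number {
  if (typeof value === "number") return value;
  if (typeof value === "string" && /^\d+$/.test(value)) return Number(value);
  return NaN;
}

// Validate the fields of one chunk upload request and convert them to numbers.
function parseChunkFields(body: Record<string, unknown>): ChunkFields {
  const md5 = String(body.md5 ?? "");
  // An MD5 hex digest is exactly 32 hex characters; rejecting anything else
  // keeps values such as "../x" out of path.join.
  if (!/^[0-9a-f]{32}$/i.test(md5)) throw new Error("invalid md5");
  const totalNumber = toInteger(body.totalNumber);
  const chunkNumber = toInteger(body.chunckNumber); // spelled as the frontend sends it
  if (!Number.isInteger(totalNumber) || totalNumber <= 0) {
    throw new Error("invalid totalNumber");
  }
  if (!Number.isInteger(chunkNumber) || chunkNumber < 0 || chunkNumber >= totalNumber) {
    throw new Error("invalid chunckNumber");
  }
  return { md5, chunkNumber, totalNumber };
}
```

Calling it at the top of the try block means a malformed request ends up in the existing catch branch and gets the normal error response.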
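  • mergeFile endpoint (it has to check whether a file with the same MD5 but a different name already exists; if so, the new record reuses the same storage path)

    javascript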
    async function mergeFile(ctx: Context) {
    	const { totalNumber, md5, name } = ctx.request.body;
    	const ext = path.extname(name);  // file extension
    	try {
    		// 1. Build the physical path from the MD5
    		const filePath = `uploads/${md5}${ext}`;
    		const fullPath = path.join(process.cwd(), filePath);
    		// 2. Query the database for all records with the same MD5
    		const md5Records = await checkMergeStatusInDB(md5) as Array<{
    			file_name: string
    			stored_name: string
    			file_hash: string
    			file_path: string
    			file_size: number
    		}>;
    		console.log(md5Records,'md5Records',name)
    		// 3. Check whether a record with the same name already exists
    		const sameNameRecord = md5Records.find(record => record.file_name === name);
    		console.log(sameNameRecord,'sameNameRecord')
    		if (sameNameRecord) {
    			console.log(sameNameRecord,'sameNameRecord')
    			ctx.body = {
    				result: 200,
    				msg: "File already exists",
    				data: {
    					fileName: sameNameRecord.file_name,
    					filePath: sameNameRecord.file_path,
    					fileSize: sameNameRecord.file_size,
    					md5: sameNameRecord.file_hash
    				}
    			};
    			return;
    		}
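    		// 4. Check whether the physical file exists
    		const physicalFileExists = fs.existsSync(fullPath);
    		if (physicalFileExists && md5Records.length > 0) {
    			// Case 2: the physical file exists, but under a different name.
    			// Use the file size of the first record (all records have the same size)
    			const fileSize = md5Records[0].file_size;
    
    			// create a new record that points at the existing file
    			await createFileRecord(name, md5, filePath, fileSize)
    			ctx.body = {
    				result: 200,
    				msg: "File linked to existing content",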
    				data: {
    					fileName: name,
    					filePath: filePath,
    					fileSize: fileSize,
    					md5
    				}
    			};
    			return;
    		}
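    		// 5. The file does not exist yet: merge the chunks
    		const chunckPath = path.join(uploadPath, 'chunks', md5, '/');
    		// read the names of all chunk files in the folder for this hash
    		const chunckList = fs.existsSync(chunckPath) ? fs.readdirSync(chunckPath) : [];
    		// check that every chunk is present
    		console.log(chunckList.length, totalNumber, 'chunks on disk vs. expected total')
    		if (chunckList.length !== Number(totalNumber)) { // the upload is incomplete
    			ctx.body = {
    				result: 500,
    				msg: "Merge failed, missing file slices",
    				data: null
    			}
    			return; // send the error response; process.exit() here would shut down the whole server
    		}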
    		// create the writable stream for the merged file
    		const writeStream = fs.createWriteStream(fullPath);
    		for (let index = 0; index < Number(totalNumber); index++) {
    			const chunkFilePath = path.join(chunckPath, `${md5}-${index}`);
    			const readStream = fs.createReadStream(chunkFilePath);
    			// wait for this chunk to finish before starting the next one
    			await new Promise<void>((resolve, reject) => {
    				readStream.pipe(writeStream, { end: false }); // do not end the writable stream automatically
    				readStream.on('end', () => {
    					// delete the chunk file
    					fs.unlink(chunkFilePath, (err) => {
    						if (err) {
    							reject(err);
    						} else {
    							resolve();
    						}
    					});
    				});
    				readStream.on('error', reject);
    			});
    		}
    		// close the writable stream
    		writeStream.end();
    		// wait until everything has been flushed
    		await new Promise<void>((resolve) => writeStream.on('finish', resolve));
    		// remove the now empty chunk folder
    		await fs.promises.rmdir(chunckPath);
    		// 6. Get the size of the merged file
            const fileSize = fs.statSync(fullPath).size;
    		await createFileRecord(name, md5, filePath, fileSize)
    		ctx.body = {
    			result: 200,
    			msg: "File merged successfully",
    			data: {
    				fileName: name,
    				filePath: filePath,
    				fileSize: fileSize,
    				md5
    			}
    		}
    	} catch (error) {
    		console.error('File merge failed:', error);
    		ctx.body = {
    			result: 500,
    			msg: "Merge failed",
    			data: null
    		}
    	}
    }
    // Look up existing file records with this MD5
    async function checkMergeStatusInDB(md5: string): Promise<unknown[]> {
    	try {
    		const sql = `SELECT * FROM files WHERE file_hash = ?`;
    		const { results } = await files.checkMergeStatus(sql, [md5]);
    		return results;
    	} catch (error) {
    		console.error(`Failed to check merge status: ${error}`);
    		return [];
    	}
    }
    // Helper: create a file record
    async function createFileRecord(fileName: string, fileHash: string, filePath: string, fileSize: number) {
        const sql = `INSERT INTO files (file_name, file_hash, file_path, file_size) VALUES (?, ?, ?, ?)`;
    	await files.insertFile(sql, [fileName, fileHash, filePath, fileSize]);
    }
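
The merge loop above listens for errors on the read streams only and deletes each chunk as soon as it has been read. As an alternative, here is a minimal sketch that appends the chunks one by one with pipeline from Node's stream/promises module, which rejects when either side of the copy fails. mergeChunks is a name made up for this example; it leaves the chunk files in place so that they can be removed once the merge is known to have succeeded.

```typescript
import { createReadStream, createWriteStream } from "node:fs";
import { writeFile } from "node:fs/promises";
import { pipeline } from "node:stream/promises";

// Concatenate the files in `chunkPaths` (already in chunk order) into `targetPath`.
async function mergeChunks(chunkPaths: string[], targetPath: string): Promise<void> {
  await writeFile(targetPath, ""); // create the target, or truncate an old partial merge
  for (const chunkPath of chunkPaths) {
    // Each chunk is streamed and appended, so memory use stays flat regardless
    // of file size; awaiting each pipeline keeps the chunks in order.
    await pipeline(
      createReadStream(chunkPath),
      createWriteStream(targetPath, { flags: "a" }),
    );
  }
}
```

In the handler above, chunkPaths would be the `${md5}-${index}` paths for index 0 up to totalNumber - 1, and targetPath would be fullPath.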

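Summary:

One remaining optimization on the frontend is to move the slicing work into a Web Worker so that it runs off the main thread. I have only just started writing Node.js, so please point out anything that could be improved.

Source code: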

github: github.com/Yuhior/file...
