Large File Uploads from Simple to In-Depth: MinIO and Authentication

1. Basic file upload with server-side storage

// Frontend code
import { Upload } from 'antd';
import type { UploadProps } from 'antd';
import axios from 'axios';

export default function HomePage() {
    const props: UploadProps = {
        customRequest: async (options) => {
            const { file, onSuccess, onError } = options;
            const formData = new FormData();
            formData.append('file', file);

            try {
                const response = await axios.post('http://localhost:3000/uploads/single', formData);
                console.log(response);
                onSuccess?.(response.data); // tell the Upload component the request finished
            } catch (error) {
                onError?.(error as Error); // mark the file as failed in the upload list
            }
        },
    };
    return (
        <div>
            <h2>Basic file upload</h2>
            <Upload {...props}>
                Upload
            </Upload>
        </div>
    );
}

/* Node.js backend code: store the file directly on disk */
import {
    Controller,
    Post,
    UploadedFile,
    UseInterceptors,
} from '@nestjs/common';
import { FileInterceptor } from '@nestjs/platform-express';
import { Express } from 'express';
import * as fs from 'fs';
import * as path from 'path';

@Controller('uploads')
export class UploadController {
    @Post('single')
    @UseInterceptors(FileInterceptor('file'))
    async uploadFile(@UploadedFile() file: Express.Multer.File) {
        // Build the storage path. basename() drops any directory components
        // in the client-supplied name. @todo rename the file
        const filePath = path.join(__dirname, '../static', path.basename(file.originalname));
        // Write the in-memory buffer to disk and wait until the write has finished
        await fs.promises.writeFile(filePath, file.buffer);
        return 'ok';
    }
}

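Optimization 1: Frontend slicing and concurrent upload

Modern browsers typically allow about 6 concurrent HTTP/1.1 connections per host, so uploading slices in parallel speeds up the transfer considerably.

Steps:

(1) Split the file into slices.
(2) Send the slices to the server and show a progress bar.
(3) Send a merge request.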

const SLICE_SUFFIX = '-part' // agreed suffix for slice names
const CHUNK_SIZE = 1024 * 1024 * 6 // 6 MiB per slice
/**
 * Splits a file into chunks for uploading.
 *
 * @param file - The file to be split.
 * @param fileName - The name of the file.
 * @param skips - An array of part names to skip.
 * @returns An array of tasks, each containing a chunk of the file, the part name, and the upload URL.
 */
function splitFile(file: File, fileName: string, skips: string[]) {
    const fileSize = file.size
    const tasks: { chunk: Blob; filename: string; url: string }[] = [];
    const chunks = Math.ceil(fileSize / CHUNK_SIZE);
    for (let i = 0; i < chunks; i++) {
        const partName = fileName + SLICE_SUFFIX + (i + 1) // part numbers start at 1
        if (skips.includes(partName)) {
            continue
        }
        const start = i * CHUNK_SIZE;
        const end = Math.min(start + CHUNK_SIZE, fileSize); // the last slice may be shorter
        const chunk = file.slice(start, end);
        tasks.push({
            chunk,
            filename: partName,
            url: '', // upload URL
        });
    }
    return tasks
}
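
Steps (2) and (3) can be sketched with a small concurrency-limited runner. This is only an illustration under assumptions: the name runWithConcurrency, the worker callback and the progress callback are hypothetical and not part of the original code. It keeps at most `limit` uploads in flight and reports progress after each slice finishes.

```typescript
/**
 * Runs `worker` over every task with at most `limit` tasks in flight at once.
 * `onProgress` (optional) is called after each task finishes with (done, total).
 * Hypothetical helper for illustration; not taken from the original project.
 */
async function runWithConcurrency<T>(
    tasks: T[],
    limit: number,
    worker: (task: T) => Promise<void>,
    onProgress?: (done: number, total: number) => void,
): Promise<void> {
    let next = 0; // index of the next task to hand out
    let done = 0; // number of tasks that have finished

    // Each lane repeatedly claims the next unclaimed task until none are left.
    async function lane(): Promise<void> {
        while (next < tasks.length) {
            const task = tasks[next++];
            await worker(task);
            done++;
            if (onProgress) onProgress(done, tasks.length);
        }
    }

    // Start up to `limit` lanes and wait for all of them.
    const lanes: Promise<void>[] = [];
    for (let i = 0; i < Math.min(limit, tasks.length); i++) {
        lanes.push(lane());
    }
    await Promise.all(lanes);
}
```

With the tasks returned by splitFile, a call might look like `await runWithConcurrency(tasks, 6, (t) => axios.put(t.url, t.chunk).then(() => undefined), (done, total) => setPercent(Math.round((done / total) * 100)))`, followed by the merge request once it resolves. Here setPercent stands for whatever updates the progress bar. If one upload rejects, the returned promise rejects too, while uploads already in flight are left to finish.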

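Optimization 2: Resumable upload

The server looks up the slices that have already been uploaded and returns their names to the frontend as an array.

When slicing, the frontend uses skips to skip the parts that are already uploaded.

Note: the frontend and backend need to agree on using the MD5 hash of the file content as the file name, with the files stored in the server's static file system. This is the simplest approach.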

function splitFile(file: File, fileName: string, skips: string[]) {
    const fileSize = file.size
    const tasks: { chunk: Blob; filename: string; url: string }[] = [];
    const chunks = Math.ceil(fileSize / CHUNK_SIZE);
    for (let i = 0; i < chunks; i++) {
        const partName = fileName + SLICE_SUFFIX + (i + 1) // part numbers start at 1
        if (skips.includes(partName)) {
            continue // already on the server, so the upload resumes after it
        }
        const start = i * CHUNK_SIZE;
        const end = Math.min(start + CHUNK_SIZE, fileSize); // the last slice may be shorter
        const chunk = file.slice(start, end);
        tasks.push({
            chunk,
            filename: partName,
            url: '', // upload URL
        });
    }
    return tasks
}
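
To show how the server's list feeds back into skips, here is a minimal sketch. It assumes the "<hash>-part<N>" naming used by splitFile. The function names partNames and planResume, and the shape of the returned object, are hypothetical. Part names are generated by number rather than sorted as strings, because a plain string sort would place "-part10" before "-part2".

```typescript
const PART_SUFFIX = '-part'; // same convention as SLICE_SUFFIX in splitFile

/** Expected part names for a file, in merge order (1, 2, ..., totalChunks). */
function partNames(hash: string, totalChunks: number): string[] {
    const names: string[] = [];
    for (let i = 1; i <= totalChunks; i++) {
        names.push(hash + PART_SUFFIX + i);
    }
    return names;
}

/**
 * Compares the part names reported by the server with the expected ones.
 * Hypothetical helper: `uploaded` is whatever array the server returned.
 */
function planResume(hash: string, totalChunks: number, uploaded: string[]) {
    const expected = partNames(hash, totalChunks);
    // Parts already on the server: pass these to splitFile as `skips`.
    const skips = expected.filter((name) => uploaded.includes(name));
    // Parts that still have to be uploaded.
    const missing = expected.filter((name) => !uploaded.includes(name));
    const percent = totalChunks === 0 ? 100 : Math.floor((skips.length / totalChunks) * 100);
    return {
        skips,
        missing,
        percent,                             // initial progress bar value
        readyToMerge: missing.length === 0,  // safe to send the merge request
        mergeOrder: expected,                // order in which the server should concatenate
    };
}
```

The frontend would call planResume with the hash, `Math.ceil(file.size / CHUNK_SIZE)` and the server's array, pass `skips` to splitFile, and send the merge request right away when `readyToMerge` is already true.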

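2. Using MinIO instead of static file storage on the server

The following minio-js SDK functions are involved:

presignedPutObject: generate a presigned URL for uploading a slice
listObjects: list the slices that already exist
composeObject: merge the slices
statObject: check whether a file exists

For detailed API parameters, see the official documentation: min.io/docs/minio/...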
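
How these four calls fit together can be sketched against a small wrapper interface. Everything named here (SliceStore, prepareUpload, mergeParts) is hypothetical and is not the minio-js API itself. Each interface method only notes which SDK call it would presumably delegate to, so the flow can be read and tested without a running MinIO server.

```typescript
const PART_SUFFIX = '-part'; // same slice naming convention as on the frontend

/** Hypothetical wrapper, not part of minio-js. Comments name the SDK call each method would wrap. */
interface SliceStore {
    exists(objectName: string): Promise<boolean>;             // statObject
    listNames(prefix: string): Promise<string[]>;             // listObjects
    presignPut(objectName: string): Promise<string>;          // presignedPutObject
    compose(target: string, parts: string[]): Promise<void>;  // composeObject
}

type PrepareResult =
    | { status: 'exists' }
    | { status: 'upload'; uploaded: string[]; urls: Record<string, string> };

/** Decides what the client has to do for a file identified by its hash. */
async function prepareUpload(store: SliceStore, hash: string, totalChunks: number): Promise<PrepareResult> {
    // The merged object already exists: nothing to upload.
    if (await store.exists(hash)) {
        return { status: 'exists' };
    }
    // Otherwise report the existing slices and presign URLs for the missing ones.
    const uploaded = await store.listNames(hash + PART_SUFFIX);
    const urls: Record<string, string> = {};
    for (let i = 1; i <= totalChunks; i++) {
        const part = hash + PART_SUFFIX + i;
        if (!uploaded.includes(part)) {
            urls[part] = await store.presignPut(part);
        }
    }
    return { status: 'upload', uploaded, urls };
}

/** Merges all slices into the final object, refusing to merge if any slice is missing. */
async function mergeParts(store: SliceStore, hash: string, totalChunks: number): Promise<void> {
    const parts: string[] = [];
    for (let i = 1; i <= totalChunks; i++) {
        parts.push(hash + PART_SUFFIX + i);
    }
    const uploaded = await store.listNames(hash + PART_SUFFIX);
    const missing = parts.filter((p) => !uploaded.includes(p));
    if (missing.length > 0) {
        throw new Error('missing parts: ' + missing.join(', '));
    }
    await store.compose(hash, parts); // parts are passed in numeric order
}
```

One point worth checking in the MinIO documentation: S3-style composition generally requires every source part except the last to be at least 5 MiB, which the 6 MiB CHUNK_SIZE used earlier would satisfy.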

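3. Authentication API

The file name is the MD5 hash of the file content.

Scenario: when the same file is uploaded by different users, the later uploads complete instantly, because the object already exists on the server.

The large-file upload setup looks like this:

(1) The frontend business components provide the upload component.

(2) A nest-minio Docker service provides the API (without logged-in user authentication). There are two layers of services here: a Node.js service and the MinIO file storage service.

However, authentication differs from platform to platform: some use cookies, others use tokens.

So the nest-minio Docker service has to be handed to the backend, which integrates login authentication with nest-minio, for example by using docker-compose container orchestration to manage these services and their dependencies.

Once the backend has standardized an auth microservice, this whole set of services can be orchestrated as containers and deployed at almost no extra cost.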

Computing the MD5 hash of a large file is slow, so it is done in a Web Worker thread.
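
A rough sketch of the hashing loop that could run inside such a worker follows. It reads the file slice by slice so the whole file never has to sit in memory at once. The interfaces and the function hashInChunks are hypothetical. The hasher is injected, on the assumption that an incremental MD5 implementation such as spark-md5's ArrayBuffer hasher (with append and end) would fit the shape.

```typescript
/** Minimal shape of an incremental hasher (assumed to match e.g. spark-md5's ArrayBuffer hasher). */
interface IncrementalHasher {
    append(data: ArrayBuffer): unknown;
    end(): string;
}

/** Structural subset of File/Blob, so the sketch does not depend on DOM typings. */
interface SliceableBlob {
    size: number;
    slice(start: number, end: number): { arrayBuffer(): Promise<ArrayBuffer> };
}

/**
 * Feeds the file to the hasher one chunk at a time and returns the final digest.
 * `onProgress` (optional) receives a fraction between 0 and 1.
 */
async function hashInChunks(
    file: SliceableBlob,
    hasher: IncrementalHasher,
    chunkSize: number,
    onProgress?: (fraction: number) => void,
): Promise<string> {
    for (let start = 0; start < file.size; start += chunkSize) {
        const end = Math.min(start + chunkSize, file.size);
        hasher.append(await file.slice(start, end).arrayBuffer());
        if (onProgress) onProgress(end / file.size);
    }
    return hasher.end();
}

// Assumed wiring inside the worker script (not executed here):
// self.onmessage = async (e) => {
//     const hash = await hashInChunks(e.data.file, new SparkMD5.ArrayBuffer(), 6 * 1024 * 1024);
//     self.postMessage({ hash });
// };
```

The main thread would post the File to the worker and use the returned hash as the file name before slicing, so the UI stays responsive while the hash is computed.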

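4. Limitations

In which scenarios can the two approaches above not be used?

Scenarios with richer data persistence needs: for example, when you need to record the relationship between users and the files they uploaded, or store records beyond the file metadata, these approaches cannot be used as-is.

Possible solution: fork the code repository and add a MySQL database for persistence. Only that kind of architecture can support such requirements.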