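Optimizing PDF Rendering Performance with Vue.js in Practice

Introduction

PDF is a common document format, but displaying it in web applications still poses performance challenges, especially for large documents. Rendering a large PDF in the browser commonly leads to slow initial loads, high memory usage, and janky scrolling. This article walks through a PDF rendering optimization built on Vue 3 and shows how virtual scrolling, batched rendering, and caching improve the user experience.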

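Challenges

Rendering PDF documents on the web faces the following main challenges:

  1. Memory usage: fully rendering a large PDF consumes a large amount of memory
  2. Initial load time: waiting for every page to render leaves the user looking at a blank screen for a long time
  3. Scrolling performance: a large number of DOM elements makes scrolling janky
  4. Network transfer: large files take a long time to download, so users face long waits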
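
To make the memory concern concrete, here is a rough back-of-the-envelope sketch. It assumes a canvas backing store of about 4 bytes per pixel (RGBA) and an A4 page of 595 x 842 PDF points. `canvasBytes` is a hypothetical helper written for this illustration, not part of PDF.js, and real browsers may allocate more or less than this estimate.

```typescript
// Rough estimate of the pixel memory held by one rendered page canvas.
// Assumption: 4 bytes per pixel (RGBA); actual browser allocations may differ.
function canvasBytes(widthPt: number, heightPt: number, scale: number): number {
  const widthPx = Math.floor(widthPt * scale)
  const heightPx = Math.floor(heightPt * scale)
  return widthPx * heightPx * 4
}

// An A4 page (595 x 842 pt) rendered at scale 4, the renderScale used in the full code
const perPage = canvasBytes(595, 842, 4)
console.log(`${(perPage / 1024 ** 2).toFixed(1)} MiB per page`)
console.log(`${((perPage * 100) / 1024 ** 3).toFixed(1)} GiB for 100 pages`)
```

Under these assumptions a single page is on the order of 30 MiB, which is why keeping every page of a long document rendered at once is not practical.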

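Core Optimization Strategies

1. Virtual Scrolling and On-Demand Rendering

Render only the pages in the user's currently visible area plus a surrounding buffer, rather than the entire document:

vue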
<div ref="pdfContainer" class="flex-1 overflow-auto" @scroll="onScroll">
  <!-- The inner div sets the height of the entire PDF -->
  <div
    :style="{
      position: 'relative',
      height: cumulativeHeights[cumulativeHeights.length - 1] + 'px',
    }"
  >
    <!-- Render only the visible pages and the buffer pages around them -->
    <div
      v-for="pageNumber in visiblePageIndexes"
      :key="pageNumber"
      :style="{
        position: 'absolute',
        top: cumulativeHeights[pageNumber - 1] + 'px',
        left: '0',
        width: '100%',
        height: pageHeights[pageNumber - 1] + 'px',
      }"
    >
      <RenderPage :render-page="renderPage" :pageNumber="pageNumber" />
    </div>
  </div>
</div>
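
The template relies on two arrays: the height of each page and the cumulative offsets derived from them. The following is a minimal sketch of that layout arithmetic, detached from Vue so it can be run on its own. `buildCumulativeHeights` and `pageBox` are hypothetical helper names introduced here for illustration.

```typescript
// Position of one absolutely positioned page wrapper
interface PageBox {
  top: string
  height: string
}

// cumulative[i] is the top offset of page i (0-indexed); the last entry is the total height
function buildCumulativeHeights(pageHeights: number[]): number[] {
  const cumulative = [0]
  for (const h of pageHeights) {
    cumulative.push(cumulative[cumulative.length - 1] + h)
  }
  return cumulative
}

// Mirrors the :style binding in the template for a 1-indexed page number
function pageBox(pageNumber: number, pageHeights: number[], cumulative: number[]): PageBox {
  return {
    top: `${cumulative[pageNumber - 1]}px`,
    height: `${pageHeights[pageNumber - 1]}px`,
  }
}

const heights = [100, 200, 50]
const offsets = buildCumulativeHeights(heights)
console.log(offsets) // the last entry is the height given to the inner div
console.log(pageBox(2, heights, offsets))
```

Because every page is positioned absolutely from a precomputed offset, pages can be mounted and unmounted in any order without shifting the ones that stay on screen.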

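2. Computing the Visible Range

A binary search over the cumulative page heights quickly determines the range of currently visible pages:

typescript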
const updateVisiblePages = async () => {
  const scrollTop = pdfContainer.value.scrollTop
  const containerHeight = pdfContainer.value.clientHeight
  const totalPages = pageHeights.value.length
  
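  // lowerBound returns the index just after the page containing scrollTop (unless
  // scrollTop sits exactly on a page boundary), so step back one extra page
  let startIdx = lowerBound(cumulativeHeights.value, scrollTop)
  startIdx = Math.max(startIdx - 1 - buffer, 0)
  let endIdx = lowerBound(cumulativeHeights.value, scrollTop + containerHeight)
  endIdx = Math.min(endIdx - 1 + buffer, totalPages - 1)
  
  // Update the visible page numbers
  const pages = []
  for (let i = startIdx; i <= endIdx; i++) {
    pages.push(i + 1) // page numbers start at 1
  }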
  }
  visiblePageIndexes.value = pages
}

const lowerBound = (arr: number[], target: number): number => {
  let low = 0, high = arr.length
  while (low < high) {
    const mid = Math.floor((low + high) / 2)
    if (arr[mid] < target) {
      low = mid + 1
    } else {
      high = mid
    }
  }
  return low
}
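
A worked example may help here. The sketch below packages the same idea as a pure function so it can be checked with concrete numbers. `visiblePages` is a hypothetical name, and the sketch steps back one extra index at the top, because `lowerBound` returns the page after the one containing `scrollTop` whenever `scrollTop` falls inside a page.

```typescript
// First index whose value is >= target
function lowerBound(arr: number[], target: number): number {
  let low = 0
  let high = arr.length
  while (low < high) {
    const mid = Math.floor((low + high) / 2)
    if (arr[mid] < target) low = mid + 1
    else high = mid
  }
  return low
}

// Returns the 1-indexed page numbers to render for a given scroll position
function visiblePages(
  cumulative: number[],
  scrollTop: number,
  viewportHeight: number,
  buffer: number,
): number[] {
  const totalPages = cumulative.length - 1
  const start = Math.max(lowerBound(cumulative, scrollTop) - 1 - buffer, 0)
  const end = Math.min(
    lowerBound(cumulative, scrollTop + viewportHeight) - 1 + buffer,
    totalPages - 1,
  )
  const pages: number[] = []
  for (let i = start; i <= end; i++) pages.push(i + 1)
  return pages
}

// Ten pages of 100px each; the viewport covers 250px..550px, i.e. pages 3 to 6
const tenPages = Array.from({ length: 11 }, (_, i) => i * 100)
console.log(visiblePages(tenPages, 250, 300, 1)) // pages 2 to 7 with a buffer of 1
```

Each lookup costs O(log n) in the number of pages, so the scroll handler stays cheap even for documents with thousands of pages.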

3. Page Caching

Rendered pages are kept in a cache to avoid rendering them again:

typescript
// Render a single page and cache the canvas
const renderPage = async (pageNumber: number) => {
  // Return the cached page if it has already been rendered
  if (renderedPages.value.has(pageNumber)) return renderedPages.value.get(pageNumber)
  
  // Render the page and store it in the cache
  const page = await toRaw(pdfDoc.value).getPage(pageNumber)
  const canvas = document.createElement('canvas')
  // Rendering logic...
  renderedPages.value.set(pageNumber, canvas)
  
  return canvas
}
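
One detail worth noting: the snippet above stores a page only after rendering finishes, so two overlapping requests for the same page would each render it. A possible refinement, sketched here under the assumption that the renderer is any async function, is to cache the pending promise instead of the finished result. `createPageCache` and `fakeRender` are hypothetical names used only for this illustration.

```typescript
// Wraps an async render function so each page is rendered at most once,
// even when it is requested again before the first render has finished.
function createPageCache<T>(render: (pageNumber: number) => Promise<T>) {
  const cache = new Map<number, Promise<T>>()
  return (pageNumber: number): Promise<T> => {
    let pending = cache.get(pageNumber)
    if (!pending) {
      pending = render(pageNumber).catch((err) => {
        // Drop failed renders so a later call can retry
        cache.delete(pageNumber)
        throw err
      })
      cache.set(pageNumber, pending)
    }
    return pending
  }
}

// Stand-in renderer for illustration only
let renderCalls = 0
const fakeRender = async (pageNumber: number): Promise<string> => {
  renderCalls++
  return `page-${pageNumber}`
}

const getCachedPage = createPageCache(fakeRender)
const firstRequest = getCachedPage(1)
const secondRequest = getCachedPage(1)
console.log(firstRequest === secondRequest, renderCalls)
```

Both callers receive the same promise, so the expensive render runs once no matter how quickly the component is mounted and remounted during scrolling.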

4. Batched Asynchronous Layout Calculation

To avoid blocking the main thread for a long time, page heights are computed in batches:

typescript
const calculatePageHeightsChunked = async () => {
  if (!pdfDoc.value) return
  const totalPages = pdfDoc.value.numPages
  const cumHeights: number[] = [0]
  const batchSize = 10
  
  for (let i = 1; i <= totalPages; i++) {
    // Compute the height of the current page...
    
    // After each batch, yield control so the UI stays responsive
    if (i % batchSize === 0) {
      cumulativeHeights.value = [...cumHeights]
      updateVisiblePages()
      await new Promise((resolve) => setTimeout(resolve, 0))
    }
  }
  // Final update...
}
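
The batching pattern itself is independent of PDF.js. As a minimal sketch, it can be written as a generic helper that processes items one by one and hands control back to the event loop after every batch. `processInChunks` and its parameters are hypothetical names chosen for this example.

```typescript
// Runs `work` for every index in [0, count), yielding to the event loop after each batch
async function processInChunks(
  count: number,
  batchSize: number,
  work: (index: number) => void | Promise<void>,
  onBatchDone?: (processed: number) => void,
): Promise<void> {
  for (let i = 0; i < count; i++) {
    await work(i)
    const processed = i + 1
    if (processed % batchSize === 0 || processed === count) {
      onBatchDone?.(processed)
      if (processed < count) {
        // A zero-delay timer lets queued input and paint tasks run before the next batch
        await new Promise<void>((resolve) => setTimeout(resolve, 0))
      }
    }
  }
}

const progress: number[] = []
processInChunks(25, 10, () => {}, (done) => progress.push(done)).then(() => {
  console.log(progress)
})
```

The batch size is a trade-off: smaller batches keep the page more responsive, while larger batches finish the overall calculation sooner.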

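5. Progressive Loading

Show the document structure quickly using estimated heights, then update to the actual heights asynchronously:

typescript
// Start with a rough page-height estimate (for example, use the first page's height for every page) so the initial scrollbar behaves correctly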
const firstPage = await toRaw(pdfDoc.value).getPage(1)
const firstViewport = firstPage.getViewport({ scale: renderScale })
const firstWidth = firstViewport.width / renderScale
const firstHeight = firstViewport.height / renderScale
const pageScale = pdfContainer.value!.clientWidth / firstWidth
// Initialize the estimated data
pageHeights.value = new Array(pdfDoc.value.numPages).fill(firstHeight * pageScale)
// ...then compute the real heights asynchronously...
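
While real heights are still arriving, the layout has to combine measured pages with estimated ones. Otherwise the total scroll height changes abruptly. The sketch below shows one way to do that with plain arrays. `estimateHeights` and `mergeCumulative` are hypothetical helpers, and the numbers are made up for illustration.

```typescript
// Estimate every page height from the first page, scaled to the container width
function estimateHeights(
  firstWidth: number,
  firstHeight: number,
  containerWidth: number,
  numPages: number,
): number[] {
  const pageScale = containerWidth / firstWidth
  return new Array<number>(numPages).fill(firstHeight * pageScale)
}

// Build cumulative offsets from measured heights where available and estimates elsewhere
function mergeCumulative(measured: number[], estimated: number[]): number[] {
  const cumulative = [0]
  for (let i = 0; i < estimated.length; i++) {
    const height = i < measured.length ? measured[i] : estimated[i]
    cumulative.push(cumulative[i] + height)
  }
  return cumulative
}

// A 600 x 800 first page in a 900px-wide container: every page is estimated at 1200px
const estimatedHeights = estimateHeights(600, 800, 900, 4)
// The first two pages have been measured; the second turned out shorter than estimated
console.log(mergeCumulative([1200, 900], estimatedHeights))
```

With this approach the scrollbar covers the whole document from the start and only shifts by the difference between estimated and measured heights.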

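6. Streaming Data Transfer

The server supports HTTP range requests, so the client can load data on demand. In the configuration below, streaming and automatic prefetching are disabled so that PDF.js requests only the byte ranges it needs:

typescript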
const loadingTask = PDFJS.getDocument({
  url: url,
  rangeChunkSize: 1048575,
  disableAutoFetch: true,
  disableStream: true,
})

Example server code (Node.js):

javascript
const express = require('express');
const fs = require('fs');
const path = require('path');

const app = express();
const port = 5000;

// CORS middleware: allow cross-origin range requests from the client
app.use((req, res, next) => {
  // Node.js lowercases incoming header names, so only `origin` needs checking
  const allowOrigin = req.headers.origin || "*";
  res.header("Access-Control-Allow-Origin", allowOrigin);
  res.header(
    "Access-Control-Allow-Headers",
    "Range, traceparent, Content-Type, Authorization, X-Requested-With"
  );
  res.header("Access-Control-Allow-Methods", "PUT,POST,GET,DELETE,OPTIONS");
  // Credentials are only valid together with an explicit origin, never with "*"
  if (req.headers.origin) {
    res.header("Access-Control-Allow-Credentials", "true");
  }
  // Expose the range-related headers so the client can read them cross-origin
  res.header(
    "Access-Control-Expose-Headers",
    "Accept-Ranges,Content-Range,Content-Length"
  );
  res.header("X-Powered-By", "Express");
  // Accept-Ranges takes a unit name, not a byte count
  res.header("Accept-Ranges", "bytes");
  if (req.method === "OPTIONS") {
    res.sendStatus(200);
  } else {
    next();
  }
});

app.get('/pdf', (req, res) => {
  const filePath = path.resolve(__dirname, 'CSS世界-张鑫旭.pdf');
  const stat = fs.statSync(filePath);
  const fileSize = stat.size;
  const range = req.headers.range;

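  if (range) {
    // Parse a single "bytes=start-end" range; an omitted end means "to the end of the file".
    // Suffix ranges such as "bytes=-500" are not handled here and are rejected below.
    const parts = range.replace(/bytes=/, '').split('-');
    const start = parseInt(parts[0], 10);
    // Clamp the end so it never points past the last byte
    const end = parts[1] ? Math.min(parseInt(parts[1], 10), fileSize - 1) : fileSize - 1;

    if (Number.isNaN(start) || Number.isNaN(end) || start > end || start >= fileSize) {
      res.status(416).set('Content-Range', `bytes */${fileSize}`);
      res.send('Requested range not satisfiable\n' + range + ' for file size ' + fileSize);
      return;
    }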

    const chunksize = (end - start) + 1;
    const file = fs.createReadStream(filePath, { start, end });
    const head = {
      'Content-Range': `bytes ${start}-${end}/${fileSize}`,
      'Accept-Ranges': 'bytes',
      'Content-Length': chunksize,
      'Content-Type': 'application/pdf',
    };

    res.writeHead(206, head);
    file.pipe(res);
  } else {
    const head = {
      'Content-Length': fileSize,
      'Accept-Ranges': 'bytes',
      'Content-Type': 'application/pdf',
    };
    res.writeHead(200, head);
    fs.createReadStream(filePath).pipe(res);
  }
});

app.listen(port, () => {
  console.log(`Server is running on http://localhost:${port}`);
});
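
The range handling is the part of the server most likely to hit edge cases. As a sketch, the parsing can be isolated into a pure function that also covers the suffix form (`bytes=-N`) and clamps oversized ends. `parseRange` is a hypothetical helper, it handles only a single range, and how to respond when it returns null (for example with 416) is left to the caller.

```typescript
// Inclusive byte range within a file
interface ByteRange {
  start: number
  end: number
}

// Parses a single-range "Range" header value.
// Returns null when the header is malformed or cannot be satisfied.
function parseRange(header: string, fileSize: number): ByteRange | null {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim())
  if (!match || fileSize <= 0) return null
  const rawStart = match[1]
  const rawEnd = match[2]
  if (rawStart === '' && rawEnd === '') return null

  if (rawStart === '') {
    // Suffix form "bytes=-N": the last N bytes of the file
    const suffix = parseInt(rawEnd, 10)
    if (suffix === 0) return null
    return { start: Math.max(fileSize - suffix, 0), end: fileSize - 1 }
  }

  const start = parseInt(rawStart, 10)
  // An omitted or oversized end is clamped to the last byte
  const end = rawEnd === '' ? fileSize - 1 : Math.min(parseInt(rawEnd, 10), fileSize - 1)
  if (start >= fileSize || start > end) return null
  return { start, end }
}

console.log(parseRange('bytes=0-99', 1000))
console.log(parseRange('bytes=900-', 1000))
console.log(parseRange('bytes=-100', 1000))
console.log(parseRange('bytes=1000-', 1000))
```

The resulting `start` and `end` map directly onto the options passed to `fs.createReadStream` and onto the `Content-Range` response header.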

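Performance Results

With the optimizations above, this PDF rendering solution achieves the following:

  1. Much lower memory usage: from several hundred MB when rendering the whole document down to only a few tens of MB
  2. Fast first paint: the initially visible content appears right away, without waiting for the whole document to render
  3. Smooth scrolling: the number of DOM elements stays low at all times, which keeps scrolling fluid
  4. Responsive interaction: batched asynchronous processing keeps the UI thread from blocking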

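Extensions and Future Improvements

The current implementation solves the basic PDF rendering performance problems, but there is still room for optimization:

  1. Pre-rendering: predict which page the user is likely to view next and render it ahead of time
  2. Cache management: implement a smarter eviction policy so memory does not keep growing
  3. Dynamic render quality: lower the render quality while scrolling and raise it once scrolling stops
  4. Off-screen rendering: use a Web Worker to render pages on a background thread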
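
For the cache management item, one common option is a least-recently-used (LRU) policy. The following is a minimal sketch of such a cache, not part of the article's implementation. `LruCache` is a hypothetical class, and the capacity of 2 is chosen only to keep the example short.

```typescript
// Minimal LRU cache built on Map, which iterates keys in insertion order
class LruCache<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly capacity: number

  constructor(capacity: number) {
    this.capacity = capacity
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Re-insert the entry to mark it as most recently used
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used
      const oldest = this.entries.keys().next().value as K
      this.entries.delete(oldest)
    }
  }

  has(key: K): boolean {
    return this.entries.has(key)
  }
}

const pageCache = new LruCache<number, string>(2)
pageCache.set(1, 'page-1')
pageCache.set(2, 'page-2')
pageCache.get(1) // page 1 becomes the most recently used entry
pageCache.set(3, 'page-3') // evicts page 2
console.log(pageCache.has(1), pageCache.has(2), pageCache.has(3))
```

In a viewer, the capacity would typically be somewhat larger than the number of visible pages plus the buffer, so pages just outside the viewport survive short back-and-forth scrolling.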

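Conclusion

The strategies above (virtual scrolling with on-demand rendering, page caching, batched page-height calculation, and streaming data transfer) deliver very good performance when handling PDFs with many pages. Real-world applications may still need to deal with issues such as cache invalidation and view repainting, but these ideas offer a workable reference for rendering complex PDFs. Hopefully this article gives you some ideas for optimizing PDF rendering in your own projects.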

Full Code

PdfVIew.vue

vue
<script setup lang="ts">
import { computed, onMounted, ref, toRaw } from 'vue'
import * as PDFJS from 'pdfjs-dist'
import RenderPage from './RenderPage.vue'
import { throttle } from 'lodash-es'
import { ArrowBigUp, ArrowBigDown } from 'lucide-vue-next'

PDFJS.GlobalWorkerOptions.workerSrc = new URL(
  'pdfjs-dist/build/pdf.worker.min.js',
  import.meta.url,
).href

const url = 'http://localhost:5000/pdf'
const pdfContainer = ref<HTMLDivElement | null>(null)
const pdfDoc = ref<PDFJS.PDFDocumentProxy | null>(null)
const pageHeights = ref<number[]>([]) // height of each page
const cumulativeHeights = ref<number[]>([]) // cumulative heights: top offset of page i (0-indexed, cumulativeHeights[0] = 0)
const renderedPages = ref<Map<number, HTMLCanvasElement | HTMLImageElement>>(new Map()) // cache
const visiblePageIndexes = ref<number[]>([]) // currently visible page numbers (1-indexed)
const buffer = 3 // number of buffer pages above and below
const renderScale = 4 // render scale factor
const renderType = ref<'canvas' | 'image'>('image') // render type

// Binary search: find the first index whose value is greater than or equal to target
const lowerBound = (arr: number[], target: number): number => {
  let low = 0,
    high = arr.length
  while (low < high) {
    const mid = Math.floor((low + high) / 2)
    if (arr[mid] < target) {
      low = mid + 1
    } else {
      high = mid
    }
  }
  return low
}

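// Compute page heights and cumulative heights in batches, refreshing the layout and the
// visible pages after each batch to avoid a long blank screen
const calculatePageHeightsChunked = async () => {
  if (!pdfDoc.value) return
  const totalPages = pdfDoc.value.numPages
  const cumHeights: number[] = [0]
  // Batch size per iteration; tune it to match performance needs
  const batchSize = 10
  // Publish the measured offsets, extended with the estimated heights of pages that have
  // not been measured yet, so the total scroll height does not collapse mid-calculation
  const publish = () => {
    const merged = [...cumHeights]
    for (let j = merged.length; j <= totalPages; j++) {
      merged.push(merged[j - 1] + (pageHeights.value[j - 1] ?? 0))
    }
    cumulativeHeights.value = merged
    updateVisiblePages()
  }
  for (let i = 1; i <= totalPages; i++) {
    const page = await toRaw(pdfDoc.value).getPage(i)
    const viewport = page.getViewport({ scale: renderScale })
    const w = viewport.width / renderScale
    const h = viewport.height / renderScale
    const pageScale = pdfContainer.value!.clientWidth / w
    const pageHeight = h * pageScale
    pageHeights.value[i - 1] = pageHeight
    cumHeights.push(cumHeights[cumHeights.length - 1] + pageHeight)
    // After each batch, yield control so the UI stays responsive
    if (i % batchSize === 0) {
      publish()
      await new Promise((resolve) => setTimeout(resolve, 0))
    }
  }
  publish()
}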

// Render a single page and cache the resulting canvas or image
const renderPage = async (pageNumber: number) => {
  if (!pdfDoc.value || pageNumber > pdfDoc.value.numPages) return null
  if (renderedPages.value.has(pageNumber)) return renderedPages.value.get(pageNumber)

  const page = await toRaw(pdfDoc.value).getPage(pageNumber)

  const viewport = page.getViewport({ scale: renderScale })
  const width = viewport.width / renderScale
  const height = viewport.height / renderScale
  const pageScale = pdfContainer.value!.clientWidth / width

  const canvas = document.createElement('canvas')
  const context = canvas.getContext('2d')!
  canvas.width = viewport.width
  canvas.height = viewport.height
  canvas.dataset.pageNumber = pageNumber.toString()

  const renderContext = {
    canvasContext: context,
    viewport: viewport,
  }
  await page.render(renderContext).promise

  if (renderType.value === 'canvas') {
    renderedPages.value.set(pageNumber, canvas)
    return canvas
  } else {
    // Convert the canvas into an image element
    const img = document.createElement('img')
    img.width = width * pageScale
    img.height = height * pageScale
    img.dataset.pageNumber = pageNumber.toString()
    img.src = canvas.toDataURL()

    renderedPages.value.set(pageNumber, img)

    return img
  }
}

// Update the visible page numbers based on the scroll position
const updateVisiblePages = async () => {
  if (!pdfContainer.value || cumulativeHeights.value.length === 0) return

  const scrollTop = pdfContainer.value.scrollTop
  const containerHeight = pdfContainer.value.clientHeight
  const totalPages = pageHeights.value.length

  // lowerBound returns the index just after the page containing scrollTop (unless
  // scrollTop sits exactly on a page boundary), so step back one extra page
  let startIdx = lowerBound(cumulativeHeights.value, scrollTop)
  startIdx = Math.max(startIdx - 1 - buffer, 0)
  let endIdx = lowerBound(cumulativeHeights.value, scrollTop + containerHeight)
  endIdx = Math.min(endIdx - 1 + buffer, totalPages - 1)

  const pages: number[] = []
  for (let i = startIdx; i <= endIdx; i++) {
    pages.push(i + 1) // page numbers start at 1
  }
  visiblePageIndexes.value = pages
}

const onScroll = throttle(() => {
  updateVisiblePages()
}, 100)

onMounted(async () => {
  try {
    const loadingTask = PDFJS.getDocument({
      url: url,
      rangeChunkSize: 1048575,
      disableAutoFetch: true,
      disableStream: true,
    })
    pdfDoc.value = await loadingTask.promise
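    console.log(`PDF loaded successfully, total pages: ${pdfDoc.value.numPages}`)
    // Start with a rough page-height estimate (for example, use the first page's height for every page) so the initial scrollbar behaves correctly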
    const firstPage = await toRaw(pdfDoc.value).getPage(1)
    const firstViewport = firstPage.getViewport({ scale: renderScale })
    const firstWidth = firstViewport.width / renderScale
    const firstHeight = firstViewport.height / renderScale
    const pageScale = pdfContainer.value!.clientWidth / firstWidth
    // Initialize the estimated data
    pageHeights.value = new Array(pdfDoc.value.numPages).fill(firstHeight * pageScale)
    const estCumHeights = [0]
    for (let i = 1; i <= pdfDoc.value.numPages; i++) {
      estCumHeights.push(estCumHeights[i - 1] + firstHeight * pageScale)
    }
    cumulativeHeights.value = estCumHeights
    updateVisiblePages()
    // Compute the real page heights asynchronously, in batches
    calculatePageHeightsChunked()
  } catch (error) {
    console.error('Failed to load PDF:', error)
  }
})

const currentPage = computed(() => {
  if (!pdfContainer.value || visiblePageIndexes.value.length === 0) return 1

  const scrollTop = pdfContainer.value.scrollTop
  const containerHeight = pdfContainer.value.clientHeight
  const viewportMiddle = scrollTop + containerHeight / 2

  // Find the page that contains the middle of the viewport
  const pageIndex = lowerBound(cumulativeHeights.value, viewportMiddle) - 1
  return Math.max(1, Math.min(pageIndex + 1, totalPages.value))
})

const totalPages = computed(() => {
  return pdfDoc.value?.numPages || 0
})

const scrollToTop = () => {
  if (pdfContainer.value) {
    pdfContainer.value.scrollTop = 0
  }
}

const scrollToBottom = () => {
  if (pdfContainer.value) {
    pdfContainer.value.scrollTop = pdfContainer.value.scrollHeight - pdfContainer.value.clientHeight
  }
}
</script>

<template>
  <div class="h-full flex flex-col">
    <div class="fixed top-4 right-6 bg-zinc-200 px-3 py-1 rounded-md shadow z-10 text-zinc-800">
      {{ currentPage }} / {{ totalPages }}
    </div>
    <div class="fixed bottom-4 right-6 flex flex-col gap-4 z-10">
      <button @click="scrollToTop" class="p-2 bg-zinc-200 rounded" title="Scroll to top">
        <ArrowBigUp />
      </button>
      <button @click="scrollToBottom" class="p-2 bg-zinc-200 rounded" title="Scroll to bottom">
        <ArrowBigDown />
      </button>
    </div>
    <div ref="pdfContainer" class="flex-1 overflow-auto" @scroll="onScroll">
      <!-- The inner div sets the height of the entire PDF -->
      <div
        :style="{
          position: 'relative',
          height: cumulativeHeights[cumulativeHeights.length - 1] + 'px',
        }"
      >
        <!-- Render only the visible pages and the buffer pages around them -->
        <div
          v-for="pageNumber in visiblePageIndexes"
          :key="pageNumber"
          :style="{
            position: 'absolute',
            top: cumulativeHeights[pageNumber - 1] + 'px',
            left: '0',
            width: '100%',
            height: pageHeights[pageNumber - 1] + 'px',
          }"
        >
          <RenderPage :render-page="renderPage" :pageNumber="pageNumber" />
        </div>
      </div>
    </div>
  </div>
</template>

RenderPage.vue

vue
<script setup lang="ts">
import { onMounted, ref } from 'vue'
import { Loader } from 'lucide-vue-next'

const props = defineProps<{
  renderPage: (pageNumber: number) => Promise<HTMLCanvasElement | HTMLImageElement | null | undefined>
  pageNumber: number
}>()

const container = ref<HTMLDivElement | null>(null)
const isLoading = ref(true)

onMounted(async () => {
  const page = await props.renderPage(props.pageNumber)
  if (container.value && page) {
    container.value.innerHTML = ''
    container.value.appendChild(page)
    isLoading.value = false
  }
})
</script>

<template>
  <div ref="container" class="relative h-full flex justify-center">
    <div v-show="isLoading" class="absolute inset-0 flex items-center justify-center">
      <Loader class="animate-spin text-zinc-800" />
    </div>
  </div>
</template>

package.json

json
{
  "name": "client",
  "version": "0.0.0",
  "private": true,
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "run-p type-check \"build-only {@}\" --",
    "preview": "vite preview",
    "build-only": "vite build",
    "type-check": "vue-tsc --build",
    "lint": "eslint . --fix",
    "format": "prettier --write src/"
  },
  "dependencies": {
    "lodash-es": "^4.17.21",
    "lucide-vue-next": "^0.475.0",
    "pdfjs-dist": "^2.16.105",
    "pinia": "^2.3.1",
    "vue": "^3.5.13",
    "vue-router": "^4.5.0"
  },
  "devDependencies": {
    "@tsconfig/node22": "^22.0.0",
    "@types/lodash-es": "^4.17.12",
    "@types/node": "^22.10.7",
    "@vitejs/plugin-vue": "^5.2.1",
    "@vue/eslint-config-prettier": "^10.1.0",
    "@vue/eslint-config-typescript": "^14.3.0",
    "@vue/tsconfig": "^0.7.0",
    "autoprefixer": "^10.4.20",
    "eslint": "^9.18.0",
    "eslint-plugin-vue": "^9.32.0",
    "jiti": "^2.4.2",
    "npm-run-all2": "^7.0.2",
    "postcss": "^8.5.1",
    "prettier": "^3.4.2",
    "tailwindcss": "^3.4.17",
    "typescript": "~5.7.3",
    "vite": "^6.0.11",
    "vite-plugin-vue-devtools": "^7.7.0",
    "vue-tsc": "^2.2.0"
  }
}