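Introduction
PDF is one of the most common document formats, yet displaying it in a web application still poses performance problems, especially for large documents. Rendering a large PDF in the browser typically suffers from a slow initial load, high memory usage, and janky scrolling. This article walks through a Vue 3 based approach to optimizing PDF rendering and shows how virtual scrolling, batched rendering, and caching improve the user experience.
Challenges
Rendering PDF documents on the web faces four main challenges:
- Memory usage: fully rendering a large PDF consumes a large amount of memory
- Initial load time: waiting for every page to render leaves the user looking at a blank screen
- Scroll performance: a large number of DOM elements makes scrolling janky
- Network transfer: large files take a long time to download, and the wait hurts the experience
Core optimization strategies
1. Virtual scrolling and on-demand rendering
Render only the pages in the currently visible area plus a buffer of pages around it, rather than the whole document: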
```vue
<div ref="pdfContainer" class="flex-1 overflow-auto" @scroll="onScroll">
  <!-- The inner div acts as a spacer that gives the scroll area the full PDF height -->
  <div
    :style="{
      position: 'relative',
      height: cumulativeHeights[cumulativeHeights.length - 1] + 'px',
    }"
  >
    <!-- Only pages in the visible area and the buffer are mounted -->
    <div
      v-for="pageNumber in visiblePageIndexes"
      :key="pageNumber"
      :style="{
        position: 'absolute',
        top: cumulativeHeights[pageNumber - 1] + 'px',
        left: '0',
        width: '100%',
        height: pageHeights[pageNumber - 1] + 'px',
      }"
    >
      <RenderPage :render-page="renderPage" :pageNumber="pageNumber" />
    </div>
  </div>
</div>
```
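The template relies on `cumulativeHeights` being a prefix-sum array over `pageHeights`. The following is a minimal sketch of that relationship, not the article's exact code; `buildCumulativeHeights` and `pageBox` are hypothetical helper names introduced here for illustration.

```typescript
// Build prefix sums so that cumulative[i] is the top offset of page i (0-indexed)
// and the last entry is the total document height used by the spacer div.
function buildCumulativeHeights(pageHeights: number[]): number[] {
  const cumulative: number[] = [0]
  for (const h of pageHeights) {
    cumulative.push(cumulative[cumulative.length - 1] + h)
  }
  return cumulative
}

// Absolute position of a 1-indexed page inside the spacer element.
function pageBox(cumulative: number[], pageNumber: number): { top: number; height: number } {
  const top = cumulative[pageNumber - 1]
  return { top, height: cumulative[pageNumber] - top }
}

const cumulative = buildCumulativeHeights([100, 150, 120])
console.log(cumulative) // [0, 100, 250, 370]
console.log(pageBox(cumulative, 2)) // { top: 100, height: 150 }
```

Because every page is positioned absolutely from this array, mounting or unmounting a page never shifts its neighbours, and the scrollbar reflects the whole document even though only a few pages exist in the DOM.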
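2. Computing the visible range
A binary search over the cumulative heights quickly determines which pages are currently visible:
```typescript
const updateVisiblePages = () => {
  // Nothing to do until the container is mounted and the layout is known
  if (!pdfContainer.value || cumulativeHeights.value.length === 0) return
  const scrollTop = pdfContainer.value.scrollTop
  const containerHeight = pdfContainer.value.clientHeight
  const totalPages = pageHeights.value.length
  // First page whose top is at or below the top edge, widened by the buffer
  let startIdx = lowerBound(cumulativeHeights.value, scrollTop)
  startIdx = Math.max(startIdx - buffer, 0)
  // Page containing the bottom edge, widened by the buffer
  let endIdx = lowerBound(cumulativeHeights.value, scrollTop + containerHeight)
  endIdx = Math.min(endIdx - 1 + buffer, totalPages - 1)
  // Update the visible page numbers
  const pages: number[] = []
  for (let i = startIdx; i <= endIdx; i++) {
    pages.push(i + 1) // Page numbers start at 1
  }
  visiblePageIndexes.value = pages
}

// Binary search: returns the first index whose value is greater than or equal to target
const lowerBound = (arr: number[], target: number): number => {
  let low = 0,
    high = arr.length
  while (low < high) {
    const mid = Math.floor((low + high) / 2)
    if (arr[mid] < target) {
      low = mid + 1
    } else {
      high = mid
    }
  }
  return low
}
```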
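To make the index arithmetic concrete, here is a small self-contained sketch that runs the same calculation on plain numbers. It mirrors the logic above under the assumption of ten pages of 100 px each; `visiblePages` is a hypothetical wrapper written for this example.

```typescript
// First index whose value is >= target (same binary search as in the article).
function lowerBound(arr: number[], target: number): number {
  let low = 0
  let high = arr.length
  while (low < high) {
    const mid = Math.floor((low + high) / 2)
    if (arr[mid] < target) low = mid + 1
    else high = mid
  }
  return low
}

// Returns the 1-indexed page numbers to mount for a given scroll position.
function visiblePages(
  cumulative: number[],
  scrollTop: number,
  viewportHeight: number,
  buffer: number,
): number[] {
  const totalPages = cumulative.length - 1
  const start = Math.max(lowerBound(cumulative, scrollTop) - buffer, 0)
  const end = Math.min(
    lowerBound(cumulative, scrollTop + viewportHeight) - 1 + buffer,
    totalPages - 1,
  )
  const pages: number[] = []
  for (let i = start; i <= end; i++) pages.push(i + 1)
  return pages
}

// Ten pages, 100 px each: cumulative = [0, 100, ..., 1000]
const cumulative = Array.from({ length: 11 }, (_, i) => i * 100)
console.log(visiblePages(cumulative, 420, 250, 1)) // [5, 6, 7, 8]
```

One detail worth noticing: when `scrollTop` falls inside a page, `lowerBound` lands on the next page, so one page of the top buffer is spent on the partially visible page (page 5 in the example). With the article's buffer of 3 this is harmless, but a buffer of 0 would drop that page.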
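3. Page cache
Rendered pages are kept in a cache so that a page is never rendered twice:
```typescript
// Render a single page and cache the resulting canvas
const renderPage = async (pageNumber: number) => {
  // Return the cached canvas if this page has already been rendered
  if (renderedPages.value.has(pageNumber)) return renderedPages.value.get(pageNumber)
  // Otherwise render the page and store it in the cache.
  // toRaw hands PDF.js the original object rather than Vue's reactive proxy.
  const page = await toRaw(pdfDoc.value).getPage(pageNumber)
  const canvas = document.createElement('canvas')
  // Rendering logic...
  renderedPages.value.set(pageNumber, canvas)
  return canvas
}
```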
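The cache above is only filled once rendering finishes, so two requests for the same page that overlap in time would both render it. One possible refinement, sketched here as an assumption rather than as part of the original implementation, is to cache the pending promise instead of the finished result. `memoizeAsync` is a hypothetical helper, and a string stands in for the canvas so the example runs outside a browser.

```typescript
// Wrap an async producer so each key is computed at most once,
// even when several callers ask for it before the first call resolves.
function memoizeAsync<K, V>(produce: (key: K) => Promise<V>): (key: K) => Promise<V> {
  const inFlight = new Map<K, Promise<V>>()
  return (key: K) => {
    let pending = inFlight.get(key)
    if (!pending) {
      pending = produce(key).catch((err) => {
        inFlight.delete(key) // Forget failures so a later call can retry
        throw err
      })
      inFlight.set(key, pending)
    }
    return pending
  }
}

let renders = 0
const renderOnce = memoizeAsync(async (pageNumber: number) => {
  renders++
  return `canvas-for-page-${pageNumber}` // Stand-in for a real canvas element
})

Promise.all([renderOnce(7), renderOnce(7)]).then(([a, b]) => {
  console.log(a === b, renders) // true 1
})
```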
4. Computing the layout asynchronously in batches
To avoid blocking the main thread for a long time, page heights are computed in batches:
```typescript
const calculatePageHeightsChunked = async () => {
  if (!pdfDoc.value) return
  const totalPages = pdfDoc.value.numPages
  const cumHeights: number[] = [0]
  const batchSize = 10
  for (let i = 1; i <= totalPages; i++) {
    // Compute the height of the current page...
    // After each batch, publish the progress and yield so the UI stays responsive
    if (i % batchSize === 0) {
      cumulativeHeights.value = [...cumHeights]
      updateVisiblePages()
      await new Promise((resolve) => setTimeout(resolve, 0))
    }
  }
  // Final update...
}
```
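The batching pattern itself is independent of PDF.js. As a rough, generic sketch (the helper name `forEachInBatches` and its callback parameters are assumptions made for this example), it can be written as a reusable function that reports progress after every batch and then yields to the event loop:

```typescript
// Visit items one by one; after every `batchSize` items (and after the last one)
// report progress and yield so pending input and paint work can run.
async function forEachInBatches<T>(
  items: T[],
  batchSize: number,
  visit: (item: T, index: number) => void | Promise<void>,
  onBatchDone?: (processed: number) => void,
): Promise<void> {
  for (let i = 0; i < items.length; i++) {
    await visit(items[i], i)
    const processed = i + 1
    if (processed % batchSize === 0 || processed === items.length) {
      onBatchDone?.(processed)
      await new Promise<void>((resolve) => setTimeout(resolve, 0))
    }
  }
}

const checkpoints: number[] = []
const pageNumbers = Array.from({ length: 25 }, (_, i) => i + 1)
forEachInBatches(pageNumbers, 10, () => {}, (n) => checkpoints.push(n)).then(() => {
  console.log(checkpoints) // [10, 20, 25]
})
```

`setTimeout(resolve, 0)` queues a macrotask, which gives the browser a chance to handle scroll events and repaint between batches; a resolved promise alone would not, because microtasks run before the browser gets control back.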
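5. Progressive loading
Show the document structure quickly using estimated heights, then replace them with the real heights asynchronously:
```typescript
// Start with a rough estimate (the first page's height is assumed for every page)
// so that the scrollbar behaves sensibly from the first frame
const firstPage = await toRaw(pdfDoc.value).getPage(1)
const firstViewport = firstPage.getViewport({ scale: renderScale })
const firstWidth = firstViewport.width / renderScale
const firstHeight = firstViewport.height / renderScale
// Scale factor that fits the page to the container width
const pageScale = pdfContainer.value!.clientWidth / firstWidth
// Initialize the estimated layout
pageHeights.value = new Array(pdfDoc.value.numPages).fill(firstHeight * pageScale)
// ...real heights are computed asynchronously afterwards...
```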
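The estimate boils down to a fit-to-width calculation. The sketch below works through it on plain numbers; `PageSize`, `fitToWidth`, and `estimateHeights` are hypothetical names, and the 600×800 page in a 900 px wide container is an assumed example.

```typescript
interface PageSize {
  width: number
  height: number
}

// Scale a page so that it exactly fills the container width.
function fitToWidth(page: PageSize, containerWidth: number): PageSize {
  const scale = containerWidth / page.width
  return { width: containerWidth, height: page.height * scale }
}

// Assume every page has the first page's proportions until real sizes are known.
function estimateHeights(firstPage: PageSize, containerWidth: number, numPages: number): number[] {
  const { height } = fitToWidth(firstPage, containerWidth)
  return new Array<number>(numPages).fill(height)
}

console.log(fitToWidth({ width: 600, height: 800 }, 900)) // { width: 900, height: 1200 }
console.log(estimateHeights({ width: 600, height: 800 }, 900, 3)) // [1200, 1200, 1200]
```

For documents whose pages all share one size the estimate is already exact. For mixed page sizes, the later pass corrects the heights, which can shift content below the corrected pages.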
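6. Streaming data transfer
The server supports HTTP range requests, and the client fetches data only as it is needed:
```typescript
const loadingTask = PDFJS.getDocument({
  url: url,
  rangeChunkSize: 1048575, // Bytes per range request (just under 1 MiB)
  disableAutoFetch: true, // Do not prefetch the rest of the file in the background
  disableStream: true, // Streaming must also be off for disableAutoFetch to take effect
})
```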
Example server code (Node.js):
```javascript
const express = require('express');
const fs = require('fs');
const path = require('path');

const app = express();
const port = 5000;

// CORS: allow cross-origin range requests and expose the range headers to the client
app.use((req, res, next) => {
  // Node lowercases incoming header names, so only `origin` can ever be set
  const { origin } = req.headers;
  res.header("Access-Control-Allow-Origin", origin || "*");
  if (origin) {
    // Credentials are not permitted together with a wildcard origin
    res.header("Vary", "Origin");
    res.header("Access-Control-Allow-Credentials", "true");
  }
  res.header(
    "Access-Control-Allow-Headers",
    "traceparent, Content-Type, Authorization, X-Requested-With, Range"
  );
  res.header("Access-Control-Allow-Methods", "PUT,POST,GET,DELETE,OPTIONS");
  res.header("Access-Control-Expose-Headers", "Accept-Ranges,Content-Range");
  res.header("X-Powered-By", "Express");
  // Accept-Ranges names the range unit, not a chunk size
  res.header("Accept-Ranges", "bytes");
  if (req.method === "OPTIONS") {
    res.sendStatus(200);
  } else {
    next();
  }
});

app.get('/pdf', (req, res) => {
  const filePath = path.resolve(__dirname, 'CSS世界-张鑫旭.pdf');
  const fileSize = fs.statSync(filePath).size;
  const range = req.headers.range;
  if (range) {
    // Only the single-range forms "bytes=start-end" and "bytes=start-" are handled here
    const parts = range.replace(/bytes=/, '').split('-');
    const start = parseInt(parts[0], 10);
    // Clamp the end so a request past EOF still yields a valid Content-Range
    const end = parts[1] ? Math.min(parseInt(parts[1], 10), fileSize - 1) : fileSize - 1;
    if (Number.isNaN(start) || Number.isNaN(end) || start >= fileSize || start > end) {
      res
        .status(416)
        .set('Content-Range', `bytes */${fileSize}`)
        .send('Requested range not satisfiable\n' + start + ' >= ' + fileSize);
      return;
    }
    const chunksize = end - start + 1;
    const file = fs.createReadStream(filePath, { start, end });
    const head = {
      'Content-Range': `bytes ${start}-${end}/${fileSize}`,
      'Accept-Ranges': 'bytes',
      'Content-Length': chunksize,
      'Content-Type': 'application/pdf',
    };
    res.writeHead(206, head);
    file.pipe(res);
  } else {
    // No Range header: send the whole file
    const head = {
      'Content-Length': fileSize,
      'Accept-Ranges': 'bytes',
      'Content-Type': 'application/pdf',
    };
    res.writeHead(200, head);
    fs.createReadStream(filePath).pipe(res);
  }
});

app.listen(port, () => {
  console.log(`Server is running on http://localhost:${port}`);
});
```
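Range parsing is the part of the server that is easiest to get subtly wrong, so it can help to isolate it in a pure function that is testable without a running server. The following is a sketch under stated assumptions: `parseByteRange` and `RangeResult` are hypothetical names, only a single range is supported, and malformed or multi-range headers are simply ignored (the full file is served).

```typescript
type RangeResult =
  | { kind: 'full' }
  | { kind: 'partial'; start: number; end: number }
  | { kind: 'unsatisfiable' }

// Interpret a Range header for a resource of `size` bytes.
function parseByteRange(header: string | undefined, size: number): RangeResult {
  if (!header) return { kind: 'full' }
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim())
  // Malformed or multi-range headers are ignored in this sketch
  if (!match || (match[1] === '' && match[2] === '')) return { kind: 'full' }

  let start: number
  let end: number
  if (match[1] === '') {
    // Suffix form "bytes=-N": the last N bytes of the resource
    const suffix = parseInt(match[2], 10)
    if (suffix === 0) return { kind: 'unsatisfiable' }
    start = Math.max(size - suffix, 0)
    end = size - 1
  } else {
    start = parseInt(match[1], 10)
    if (match[2] !== '' && parseInt(match[2], 10) < start) return { kind: 'full' } // Invalid: end before start
    // Open-ended or oversized ranges are clamped to the last byte
    end = match[2] === '' ? size - 1 : Math.min(parseInt(match[2], 10), size - 1)
  }
  if (start >= size) return { kind: 'unsatisfiable' }
  return { kind: 'partial', start, end }
}

console.log(parseByteRange('bytes=0-99', 1000)) // { kind: 'partial', start: 0, end: 99 }
console.log(parseByteRange('bytes=-100', 1000)) // { kind: 'partial', start: 900, end: 999 }
console.log(parseByteRange('bytes=1000-', 1000)) // { kind: 'unsatisfiable' }
```

A route handler could then map `partial` to a 206 response with `Content-Range`, `unsatisfiable` to 416, and `full` to a plain 200.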
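Results
With the strategies above, this PDF rendering approach achieves the following:
- Much lower memory usage: from several hundred MB when rendering the whole document down to a few tens of MB
- Fast first paint: the initially visible content appears without waiting for the whole document to render
- Smooth scrolling: the number of DOM elements stays small at all times, which keeps scrolling fluid
- Responsive interaction: batched asynchronous processing keeps the UI thread from being blocked for long stretches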
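Extensions and possible improvements
The current implementation solves the basic performance problems of PDF rendering, but there is still room for improvement:
- Pre-rendering: predict which page the user is likely to view next and render it ahead of time
- Cache management: add a smarter eviction policy so that memory does not grow without bound
- Adaptive render quality: lower the render quality while scrolling and raise it again once scrolling stops
- Off-screen rendering: render pages on a background thread with a Web Worker (for example with OffscreenCanvas)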
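As an illustration of the cache-management point, here is a minimal least-recently-used cache that could replace the unbounded `Map` of rendered pages. It is a sketch of one possible eviction policy, not something the current implementation contains; the class name `LruCache` and the `onEvict` hook are assumptions.

```typescript
// A small LRU cache built on Map, which iterates keys in insertion order.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly capacity: number
  private readonly onEvict?: (key: K, value: V) => void

  constructor(capacity: number, onEvict?: (key: K, value: V) => void) {
    this.capacity = capacity
    this.onEvict = onEvict
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Re-insert so this key becomes the most recently used
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used
      const oldest = this.entries.keys().next().value as K
      const evicted = this.entries.get(oldest) as V
      this.entries.delete(oldest)
      this.onEvict?.(oldest, evicted)
    }
  }

  get size(): number {
    return this.entries.size
  }
}

const cache = new LruCache<number, string>(2, (page) => console.log(`evicted page ${page}`))
cache.set(1, 'page 1')
cache.set(2, 'page 2')
cache.get(1) // Page 1 is now the most recently used
cache.set(3, 'page 3') // prints "evicted page 2"
```

For canvases, the `onEvict` hook is a natural place to release memory, for example by setting the canvas width and height to 0.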
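Conclusion
Together, these strategies (virtual scrolling with on-demand rendering, page caching, batched page-height calculation, and streaming data transfer) deliver very good performance on PDFs with many pages. Real applications may still need to deal with issues such as cache invalidation and view repainting, but these ideas offer a workable reference for rendering complex PDFs. Hopefully this article gives you some ideas for optimizing your own PDF rendering.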
Complete code
PdfView.vue
```vue
<script setup lang="ts">
import { computed, onMounted, ref, toRaw } from 'vue'
import * as PDFJS from 'pdfjs-dist'
import RenderPage from './RenderPage.vue'
import { throttle } from 'lodash-es'
import { ArrowBigUp, ArrowBigDown } from 'lucide-vue-next'

PDFJS.GlobalWorkerOptions.workerSrc = new URL(
  'pdfjs-dist/build/pdf.worker.min.js',
  import.meta.url,
).href

const url = 'http://localhost:5000/pdf'
const pdfContainer = ref<HTMLDivElement | null>(null)
const pdfDoc = ref<PDFJS.PDFDocumentProxy | null>(null)
const pageHeights = ref<number[]>([]) // Height of each page
const cumulativeHeights = ref<number[]>([]) // Prefix sums: entry i is the top of page i (0-indexed, cumulativeHeights[0] = 0)
const renderedPages = ref<Map<number, HTMLCanvasElement | HTMLImageElement>>(new Map()) // Cache
const visiblePageIndexes = ref<number[]>([]) // Currently visible page numbers (1-indexed)
const buffer = 3 // Number of buffer pages above and below the viewport
const renderScale = 4 // Render scale factor
const renderType = ref<'canvas' | 'image'>('image') // Output type

// Binary search: returns the first index whose value is greater than or equal to target
const lowerBound = (arr: number[], target: number): number => {
  let low = 0,
    high = arr.length
  while (low < high) {
    const mid = Math.floor((low + high) / 2)
    if (arr[mid] < target) {
      low = mid + 1
    } else {
      high = mid
    }
  }
  return low
}

// Compute page heights and cumulative heights in batches. After each batch the
// layout and the visible pages are updated, which avoids a long blank screen.
const calculatePageHeightsChunked = async () => {
  if (!pdfDoc.value) return
  const totalPages = pdfDoc.value.numPages
  const cumHeights: number[] = [0]
  // Batch size; tune it to the performance of the target devices
  const batchSize = 10
  for (let i = 1; i <= totalPages; i++) {
    const page = await toRaw(pdfDoc.value).getPage(i)
    const viewport = page.getViewport({ scale: renderScale })
    const w = viewport.width / renderScale
    const h = viewport.height / renderScale
    const pageScale = pdfContainer.value!.clientWidth / w
    const pageHeight = h * pageScale
    pageHeights.value[i - 1] = pageHeight
    cumHeights.push(cumHeights[cumHeights.length - 1] + pageHeight)
    // After each batch, publish the progress and yield so the UI stays responsive
    if (i % batchSize === 0) {
      cumulativeHeights.value = [...cumHeights]
      updateVisiblePages()
      await new Promise((resolve) => setTimeout(resolve, 0))
    }
  }
  cumulativeHeights.value = [...cumHeights]
  updateVisiblePages()
}

// Render a single page and cache the resulting canvas or image
const renderPage = async (pageNumber: number) => {
  if (!pdfDoc.value || pageNumber > pdfDoc.value.numPages) return null
  if (renderedPages.value.has(pageNumber)) return renderedPages.value.get(pageNumber)
  const page = await toRaw(pdfDoc.value).getPage(pageNumber)
  const viewport = page.getViewport({ scale: renderScale })
  const width = viewport.width / renderScale
  const height = viewport.height / renderScale
  const pageScale = pdfContainer.value!.clientWidth / width
  const canvas = document.createElement('canvas')
  const context = canvas.getContext('2d')!
  canvas.width = viewport.width
  canvas.height = viewport.height
  canvas.dataset.pageNumber = pageNumber.toString()
  const renderContext = {
    canvasContext: context,
    viewport: viewport,
  }
  await page.render(renderContext).promise
  if (renderType.value === 'canvas') {
    renderedPages.value.set(pageNumber, canvas)
    return canvas
  } else {
    // Convert the canvas into an image element
    const img = document.createElement('img')
    img.width = width * pageScale
    img.height = height * pageScale
    img.dataset.pageNumber = pageNumber.toString()
    img.src = canvas.toDataURL()
    renderedPages.value.set(pageNumber, img)
    return img
  }
}

// Update the visible page numbers from the current scroll position
const updateVisiblePages = () => {
  if (!pdfContainer.value || cumulativeHeights.value.length === 0) return
  const scrollTop = pdfContainer.value.scrollTop
  const containerHeight = pdfContainer.value.clientHeight
  const totalPages = pageHeights.value.length
  let startIdx = lowerBound(cumulativeHeights.value, scrollTop)
  startIdx = Math.max(startIdx - buffer, 0)
  let endIdx = lowerBound(cumulativeHeights.value, scrollTop + containerHeight)
  endIdx = Math.min(endIdx - 1 + buffer, totalPages - 1)
  const pages: number[] = []
  for (let i = startIdx; i <= endIdx; i++) {
    pages.push(i + 1) // Page numbers start at 1
  }
  visiblePageIndexes.value = pages
}

const onScroll = throttle(() => {
  updateVisiblePages()
}, 100)

onMounted(async () => {
  try {
    const loadingTask = PDFJS.getDocument({
      url: url,
      rangeChunkSize: 1048575,
      disableAutoFetch: true,
      disableStream: true,
    })
    pdfDoc.value = await loadingTask.promise
    console.log(`PDF loaded, total pages: ${pdfDoc.value.numPages}`)
    // Start with a rough estimate (the first page's height is assumed for every page)
    // so that the scrollbar behaves sensibly from the first frame
    const firstPage = await toRaw(pdfDoc.value).getPage(1)
    const firstViewport = firstPage.getViewport({ scale: renderScale })
    const firstWidth = firstViewport.width / renderScale
    const firstHeight = firstViewport.height / renderScale
    const pageScale = pdfContainer.value!.clientWidth / firstWidth
    // Initialize the estimated layout
    pageHeights.value = new Array(pdfDoc.value.numPages).fill(firstHeight * pageScale)
    const estCumHeights = [0]
    for (let i = 1; i <= pdfDoc.value.numPages; i++) {
      estCumHeights.push(estCumHeights[i - 1] + firstHeight * pageScale)
    }
    cumulativeHeights.value = estCumHeights
    updateVisiblePages()
    // Compute the real page heights in batches (awaited so failures reach the catch below)
    await calculatePageHeightsChunked()
  } catch (error) {
    console.error('Failed to load PDF:', error)
  }
})

const totalPages = computed(() => {
  return pdfDoc.value?.numPages || 0
})

// scrollTop is not reactive; this recomputes because visiblePageIndexes
// is reassigned on every scroll update
const currentPage = computed(() => {
  if (!pdfContainer.value || visiblePageIndexes.value.length === 0) return 1
  const scrollTop = pdfContainer.value.scrollTop
  const containerHeight = pdfContainer.value.clientHeight
  const viewportMiddle = scrollTop + containerHeight / 2
  // Find the page that contains the middle of the viewport
  const pageIndex = lowerBound(cumulativeHeights.value, viewportMiddle) - 1
  return Math.max(1, Math.min(pageIndex + 1, totalPages.value))
})

const scrollToTop = () => {
  if (pdfContainer.value) {
    pdfContainer.value.scrollTop = 0
  }
}

const scrollToBottom = () => {
  if (pdfContainer.value) {
    pdfContainer.value.scrollTop = pdfContainer.value.scrollHeight - pdfContainer.value.clientHeight
  }
}
</script>

<template>
  <div class="h-full flex flex-col">
    <div class="fixed top-4 right-6 bg-zinc-200 px-3 py-1 rounded-md shadow z-10 text-zinc-800">
      {{ currentPage }} / {{ totalPages }}
    </div>
    <div class="fixed bottom-4 right-6 flex flex-col gap-4 z-10">
      <button @click="scrollToTop" class="p-2 bg-zinc-200 rounded" title="Scroll to top">
        <ArrowBigUp />
      </button>
      <button @click="scrollToBottom" class="p-2 bg-zinc-200 rounded" title="Scroll to bottom">
        <ArrowBigDown />
      </button>
    </div>
    <div ref="pdfContainer" class="flex-1 overflow-auto" @scroll="onScroll">
      <!-- The inner div acts as a spacer that gives the scroll area the full PDF height -->
      <div
        :style="{
          position: 'relative',
          height: cumulativeHeights[cumulativeHeights.length - 1] + 'px',
        }"
      >
        <!-- Only pages in the visible area and the buffer are mounted -->
        <div
          v-for="pageNumber in visiblePageIndexes"
          :key="pageNumber"
          :style="{
            position: 'absolute',
            top: cumulativeHeights[pageNumber - 1] + 'px',
            left: '0',
            width: '100%',
            height: pageHeights[pageNumber - 1] + 'px',
          }"
        >
          <RenderPage :render-page="renderPage" :pageNumber="pageNumber" />
        </div>
      </div>
    </div>
  </div>
</template>
```
RenderPage.vue
```vue
<script setup lang="ts">
import { onMounted, ref } from 'vue'
import { Loader } from 'lucide-vue-next'

const props = defineProps<{
  renderPage: (pageNumber: number) => Promise<HTMLCanvasElement | HTMLImageElement | null | undefined>
  pageNumber: number
}>()

const container = ref<HTMLDivElement | null>(null)
const isLoading = ref(true)

onMounted(async () => {
  const page = await props.renderPage(props.pageNumber)
  if (container.value && page) {
    container.value.innerHTML = ''
    container.value.appendChild(page)
    isLoading.value = false
  }
})
</script>

<template>
  <div ref="container" class="relative h-full flex justify-center">
    <div v-show="isLoading" class="absolute inset-0 flex items-center justify-center">
      <Loader class="animate-spin text-zinc-800" />
    </div>
  </div>
</template>
```
package.json
```json
{
  "name": "client",
  "version": "0.0.0",
  "private": true,
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "run-p type-check \"build-only {@}\" --",
    "preview": "vite preview",
    "build-only": "vite build",
    "type-check": "vue-tsc --build",
    "lint": "eslint . --fix",
    "format": "prettier --write src/"
  },
  "dependencies": {
    "lodash-es": "^4.17.21",
    "lucide-vue-next": "^0.475.0",
    "pdfjs-dist": "^2.16.105",
    "pinia": "^2.3.1",
    "vue": "^3.5.13",
    "vue-router": "^4.5.0"
  },
  "devDependencies": {
    "@tsconfig/node22": "^22.0.0",
    "@types/lodash-es": "^4.17.12",
    "@types/node": "^22.10.7",
    "@vitejs/plugin-vue": "^5.2.1",
    "@vue/eslint-config-prettier": "^10.1.0",
    "@vue/eslint-config-typescript": "^14.3.0",
    "@vue/tsconfig": "^0.7.0",
    "autoprefixer": "^10.4.20",
    "eslint": "^9.18.0",
    "eslint-plugin-vue": "^9.32.0",
    "jiti": "^2.4.2",
    "npm-run-all2": "^7.0.2",
    "postcss": "^8.5.1",
    "prettier": "^3.4.2",
    "tailwindcss": "^3.4.17",
    "typescript": "~5.7.3",
    "vite": "^6.0.11",
    "vite-plugin-vue-devtools": "^7.7.0",
    "vue-tsc": "^2.2.0"
  }
}
```