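Hi everyone, I'm Xiaowu (小悟).
The Fantastic Journey from Word to PDF
A Word document is like a programmer working from home in pajamas: comfortable, but a little casual. A PDF is the same person in a suit and tie, heading to a board meeting: professional and not a hair out of place!
The transformation goes something like this:
- Word document: "Ha! I can swap fonts whenever I like, nudge the margins at will, and drag pictures all over the place~"
- PDF: "Quiet! I'm in charge from now on, and every pixel stands at its post!"
Doing this conversion in Spring Boot is like hiring a document Transformer that drills a free-spirited Word file into a disciplined PDF soldier. Let me walk you through this "format taming ceremony"!
Preparation: Stock Your "Transformation Toolbox"
Step 1: Shopping for Maven Dependencies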
xml
<!-- Add these treasures to pom.xml -->
<dependencies>
    <!-- Standard Spring Boot gear -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Apache POI: the "mind reader" for Word documents -->
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>5.2.3</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>5.2.3</version>
    </dependency>
    <!-- OpenPDF: the "printer" for PDFs -->
    <dependency>
        <groupId>com.github.librepdf</groupId>
        <artifactId>openpdf</artifactId>
        <version>1.3.30</version>
    </dependency>
    <!-- File type detection: so an image is not mistaken for a Word file -->
    <dependency>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-core</artifactId>
        <version>2.7.0</version>
    </dependency>
</dependencies>
Step 2: The Properties File
yaml
# application.yml
word-to-pdf:
  upload-dir: "uploads/"      # temporary stop for Word documents
  output-dir: "pdf-output/"   # warehouse for finished PDFs
  max-file-size: 10MB         # don't try to test me with "War and Peace"
spring:
  servlet:
    multipart:
      max-file-size: 10MB
      max-request-size: 10MB
Core Code: Transform, Dear Word!
1. The File Upload Controller (the Receptionist)
java
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;
import javax.servlet.http.HttpServletResponse; // jakarta.servlet.http on Spring Boot 3+
import java.io.*;

@RestController
@RequestMapping("/api/doc-transform")
public class WordToPdfController {

    @PostMapping("/word-to-pdf")
    public void convertWordToPdf(
            @RequestParam("file") MultipartFile wordFile,
            HttpServletResponse response) throws IOException {
        // 1. Check the file: no passing off cat pictures as Word documents!
        if (!isWordDocument(wordFile)) {
            // Set the status and charset before writing the body
            response.setStatus(HttpServletResponse.SC_BAD_REQUEST);
            response.setContentType("text/plain;charset=UTF-8");
            response.getWriter().write("Hey! That's not a .docx file, don't try to fool me!");
            return;
        }
        // 2. Park the Word file temporarily (like the holding area before security).
        //    A file in the system temp directory has an absolute path and a unique name.
        File tempWordFile = File.createTempFile("word-", ".docx");
        byte[] pdfBytes;
        try {
            wordFile.transferTo(tempWordFile);
            // 3. Start the transformation!
            pdfBytes = WordToPdfConverter.convert(tempWordFile);
        } finally {
            // 4. Clean up the scene, even when the conversion fails
            tempWordFile.delete();
        }
        // 5. Hand the PDF to the user
        String pdfName = wordFile.getOriginalFilename().replaceAll("(?i)\\.docx$", ".pdf");
        response.setContentType("application/pdf");
        response.setHeader("Content-Disposition",
                "attachment; filename=\"" + pdfName + "\"");
        response.getOutputStream().write(pdfBytes);
        System.out.println("Conversion succeeded! Another Word file tamed into a PDF!");
    }

    private boolean isWordDocument(MultipartFile file) {
        String fileName = file.getOriginalFilename();
        // XWPFDocument reads only the OOXML (.docx) format, so legacy .doc is rejected here
        return fileName != null && fileName.toLowerCase().endsWith(".docx");
    }
}
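The controller above only looks at the file name, so a renamed cat picture still gets through. As a rough sketch of a stricter check (not what the article's code does, and far less thorough than Tika), one could compare the first bytes of the upload against well-known signatures: a .docx is a ZIP container, and a legacy .doc is an OLE2 compound file. The class and method names here are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper: sniffs the leading bytes of an upload instead of trusting its name. */
final class WordSignature {
    // .docx is a ZIP container, so it starts with the local file header magic "PK\003\004"
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // Legacy .doc is an OLE2 compound file with this fixed 8-byte header
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** True when the bytes could be a .docx (any ZIP-based file also matches). */
    static boolean looksLikeDocx(byte[] head) {
        return startsWith(head, ZIP_MAGIC);
    }

    /** True when the bytes could be a legacy .doc (any OLE2 file also matches). */
    static boolean looksLikeLegacyDoc(byte[] head) {
        return startsWith(head, OLE2_MAGIC);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

In the controller, the first eight bytes read from `wordFile.getInputStream()` would be passed to these methods. A matching signature only proves the container type (an .xlsx is a ZIP too), so this narrows things down rather than proving the upload is a Word file.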
2. The Converter Core (the Real Transformation Engine)
java
import org.apache.poi.xwpf.usermodel.*;
import org.springframework.stereotype.Component;
import com.lowagie.text.*;
import com.lowagie.text.Document; // explicit import: POI's xwpf.usermodel also declares a Document type
import com.lowagie.text.pdf.PdfWriter;
import java.io.*;

@Component
public class WordToPdfConverter {

    public static byte[] convert(File wordFile) throws IOException {
        ByteArrayOutputStream pdfOutputStream = new ByteArrayOutputStream();
        // 1. Open the Word document (like opening Pandora's box)
        try (FileInputStream fis = new FileInputStream(wordFile);
             XWPFDocument document = new XWPFDocument(fis)) {
            // 2. Create the PDF document (prepare the new home)
            Document pdfDocument = new Document();
            PdfWriter.getInstance(pdfDocument, pdfOutputStream);
            pdfDocument.open();
            // 3. Carry the content over paragraph by paragraph (like ants moving house)
            System.out.println("Moving paragraphs, " + document.getParagraphs().size() + " in total...");
            for (XWPFParagraph para : document.getParagraphs()) {
                if (para.getText().trim().isEmpty()) continue;
                // Map the paragraph style ID to a font. Helvetica is a Latin-only
                // built-in font; Chinese text needs an embedded CJK font instead.
                Font font = new Font(Font.HELVETICA, 12, Font.NORMAL);
                if (para.getStyle() != null) {
                    switch (para.getStyle()) {
                        case "Heading1":
                            font = new Font(Font.HELVETICA, 18, Font.BOLD);
                            break;
                        case "Heading2":
                            font = new Font(Font.HELVETICA, 16, Font.BOLD);
                            break;
                        default:
                            break; // keep the 12pt body font
                    }
                }
                Paragraph pdfPara = new Paragraph(para.getText(), font);
                pdfDocument.add(pdfPara);
                pdfDocument.add(Chunk.NEWLINE); // add a line break to catch our breath
            }
            // 4. Handle pictures (the hardest part of the move).
            //    They are appended after the text, so their original position is lost.
            System.out.println("Processing pictures, " + document.getAllPictures().size() + " in total...");
            for (XWPFPictureData picture : document.getAllPictures()) {
                try {
                    byte[] pictureData = picture.getData();
                    Image image = Image.getInstance(pictureData);
                    image.scaleToFit(500, 500); // keep the picture on a leash: not too big
                    image.setAlignment(Element.ALIGN_CENTER);
                    pdfDocument.add(image);
                    pdfDocument.add(Chunk.NEWLINE);
                } catch (Exception e) {
                    System.err.println("Picture " + picture.getFileName()
                            + " misbehaved and could not be converted: " + e.getMessage());
                }
            }
            // 5. Handle tables (Excel says: I want to join the party too)
            for (XWPFTable table : document.getTables()) {
                // Table(int) takes the number of columns, not rows: use the widest row
                int columns = 1;
                for (XWPFTableRow row : table.getRows()) {
                    columns = Math.max(columns, row.getTableCells().size());
                }
                com.lowagie.text.Table pdfTable = new com.lowagie.text.Table(columns);
                for (XWPFTableRow row : table.getRows()) {
                    for (XWPFTableCell cell : row.getTableCells()) {
                        pdfTable.addCell(cell.getText());
                    }
                    // Pad short rows so the next row starts in the first column
                    for (int i = row.getTableCells().size(); i < columns; i++) {
                        pdfTable.addCell("");
                    }
                }
                pdfDocument.add(pdfTable);
            }
            pdfDocument.close();
            System.out.println("Conversion finished! PDF size: " +
                    (pdfOutputStream.size() / 1024) + " KB");
        } catch (Exception e) {
            System.err.println("Something unexpected happened during conversion: " + e.getMessage());
            throw new IOException("Conversion failed, the Word document may be under a spell", e);
        }
        return pdfOutputStream.toByteArray();
    }
}
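OpenPDF's `Table(int)` constructor expects a column count, so the table step needs the widest row of the Word table rather than its number of rows. Here is a small sketch of that calculation on plain lists, with no POI or OpenPDF involved; `TableShape` and its methods are made-up names for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that works out a rectangular shape for a ragged table. */
final class TableShape {

    /** The column count is the size of the widest row (0 for an empty table). */
    static int columnCount(List<List<String>> rows) {
        int max = 0;
        for (List<String> row : rows) {
            max = Math.max(max, row.size());
        }
        return max;
    }

    /** Returns a copy in which every row is padded with "" up to the column count. */
    static List<List<String>> padRows(List<List<String>> rows) {
        int columns = columnCount(rows);
        List<List<String>> padded = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> copy = new ArrayList<>(row);
            while (copy.size() < columns) {
                copy.add("");
            }
            padded.add(copy);
        }
        return padded;
    }
}
```

Padding matters because cells are added to the PDF table one after another: without the empty cells, a short row would pull the first cells of the next row up into it.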
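3. Exception Handling (the Ambulance for Failed Transformations)
java
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;
import java.io.IOException;

@ControllerAdvice
public class DocumentConversionExceptionHandler {

    @ExceptionHandler(IOException.class)
    public ResponseEntity<String> handleIOException(IOException e) {
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body("Document conversion failed. Possible reasons:\n" +
                        "1. The Word document was encrypted by aliens\n" +
                        "2. The file is too big for the server to lift\n" +
                        "3. The network connection is dozing off\n" +
                        "Error details: " + e.getMessage());
    }

    @ExceptionHandler(InvalidFormatException.class)
    public ResponseEntity<String> handleInvalidFormat(Exception e) {
        return ResponseEntity.badRequest()
                .body("Hey! Is that really a Word document?\n" +
                        "My guess is that you uploaded:\n" +
                        "□ A cat picture \n" +
                        "□ An Excel spreadsheet \n" +
                        "□ A feel-good text file \n" +
                        "Please upload a proper .docx file!");
    }
}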
4. Progress Monitoring (Live Coverage of the Transformation)
java
import org.springframework.stereotype.Component;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@Component
public class ConversionProgressService {

    private final Map<String, Integer> progressMap = new ConcurrentHashMap<>();

    public void startConversion(String fileId) {
        progressMap.put(fileId, 0);
        System.out.println("Starting conversion of file: " + fileId);
    }

    public void updateProgress(String fileId, int percent) {
        progressMap.put(fileId, percent);
        // Print a progress bar (pretending to be fancy)
        StringBuilder progressBar = new StringBuilder("[");
        for (int i = 0; i < 20; i++) {
            progressBar.append(i * 5 < percent ? "█" : "░");
        }
        progressBar.append("] ").append(percent).append("%");
        System.out.println(fileId + " conversion progress: " + progressBar);
        // A few words of encouragement
        if (percent == 50) {
            System.out.println("Halfway there, hang on!");
        } else if (percent == 90) {
            System.out.println("Almost done, preparing to launch the PDF!");
        }
    }

    public void completeConversion(String fileId) {
        progressMap.remove(fileId);
        System.out.println(fileId + " conversion complete, quietly taking a bow~");
    }
}
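The bar drawing in `updateProgress` is easier to check when it is pulled out into a pure function. The sketch below keeps the same rule (20 segments, one per 5%) and adds clamping, which the original does not have, so an out-of-range value cannot produce a nonsense label. `ProgressBar` is a name invented for this example.

```java
/** Hypothetical extraction of the progress bar rendering into a pure function. */
final class ProgressBar {
    private static final int SEGMENTS = 20;
    private static final int STEP = 100 / SEGMENTS; // each segment stands for 5%

    /** Renders e.g. "[##########..........] 50%" using block characters. */
    static String render(int percent) {
        // Clamp to 0..100 (an addition to the original logic)
        int clamped = Math.max(0, Math.min(100, percent));
        StringBuilder bar = new StringBuilder("[");
        for (int i = 0; i < SEGMENTS; i++) {
            // Same test as the service: a segment is filled once progress passes its start
            bar.append(i * STEP < clamped ? '\u2588' : '\u2591');
        }
        return bar.append("] ").append(clamped).append('%').toString();
    }
}
```

With this in place, `updateProgress` would only need to print `fileId + " conversion progress: " + ProgressBar.render(percent)`.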
Frontend Example (the User Interface)
html
<!DOCTYPE html>
<html>
<head>
<title>Word-to-PDF Transformation Workshop</title>
<style>
body { font-family: 'Comic Sans MS', cursive; padding: 20px; }
.container { max-width: 600px; margin: 0 auto; }
.drop-zone {
border: 3px dashed #4CAF50;
border-radius: 10px;
padding: 40px;
text-align: center;
background: #f9f9f9;
cursor: pointer;
}
.drop-zone:hover { background: #e8f5e9; }
.convert-btn {
background: linear-gradient(45deg, #FF6B6B, #4ECDC4);
color: white;
border: none;
padding: 15px 30px;
border-radius: 25px;
font-size: 18px;
cursor: pointer;
margin-top: 20px;
}
.progress-bar {
width: 100%;
height: 20px;
background: #ddd;
border-radius: 10px;
margin-top: 20px;
overflow: hidden;
display: none;
}
.progress-fill {
height: 100%;
background: linear-gradient(90deg, #4CAF50, #8BC34A);
width: 0%;
transition: width 0.3s;
}
</style>
</head>
<body>
<div class="container">
<h1>Word-to-PDF Transformation Workshop</h1>
<p>Toss in your Word document and get a well-behaved PDF back!</p>
<div class="drop-zone" id="dropZone">
<h2>Drag a Word file here</h2>
<p>or <label style="color: #2196F3; cursor: pointer;">click to choose a file
<input type="file" id="fileInput" accept=".docx,.doc" hidden>
</label></p>
</div>
<button class="convert-btn" onclick="convertToPdf()">
Start the transformation!
</button>
<div class="progress-bar" id="progressBar">
<div class="progress-fill" id="progressFill"></div>
</div>
<div id="status" style="margin-top: 20px;"></div>
</div>
<script>
const dropZone = document.getElementById('dropZone');
const fileInput = document.getElementById('fileInput');
let selectedFile = null;
// Drag-and-drop support
dropZone.addEventListener('dragover', (e) => {
e.preventDefault();
dropZone.style.background = '#e8f5e9';
});
dropZone.addEventListener('drop', (e) => {
e.preventDefault();
dropZone.style.background = '#f9f9f9';
selectedFile = e.dataTransfer.files[0];
document.getElementById('status').innerHTML =
`Selected: <strong>${selectedFile.name}</strong>`;
});
fileInput.addEventListener('change', (e) => {
selectedFile = e.target.files[0];
document.getElementById('status').innerHTML =
`Selected: <strong>${selectedFile.name}</strong>`;
});
// Conversion function
async function convertToPdf() {
if (!selectedFile) {
alert('Please choose a Word file first!');
return;
}
const formData = new FormData();
formData.append('file', selectedFile);
// Show the progress bar
const progressBar = document.getElementById('progressBar');
const progressFill = document.getElementById('progressFill');
progressBar.style.display = 'block';
// Simulated progress (a real project could use WebSocket)
let progress = 0;
const interval = setInterval(() => {
progress += 10;
progressFill.style.width = `${progress}%`;
if (progress >= 90) clearInterval(interval);
}, 300);
try {
const response = await fetch('/api/doc-transform/word-to-pdf', {
method: 'POST',
body: formData
});
clearInterval(interval);
progressFill.style.width = '100%';
if (response.ok) {
// Download the PDF
const blob = await response.blob();
const url = window.URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = selectedFile.name.replace(/\.docx?$/i, '.pdf');
document.body.appendChild(a);
a.click();
a.remove();
document.getElementById('status').innerHTML =
'Conversion succeeded! The PDF download has started~';
// Reset after 3 seconds
setTimeout(() => {
progressBar.style.display = 'none';
progressFill.style.width = '0%';
document.getElementById('status').innerHTML = '';
}, 3000);
} else {
const errorText = await response.text();
// textContent keeps server-provided text from being parsed as HTML
document.getElementById('status').textContent =
`Conversion failed: ${errorText}`;
}
} catch (error) {
document.getElementById('status').textContent =
`Network error: ${error.message}`;
}
}
</script>
</body>
</html>
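Advanced Extensions
Batch Conversion (Group Transformation Mode)
java
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import org.springframework.web.multipart.MultipartFile;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

@Service
public class BatchConversionService {

    @Async // async processing so the UI doesn't freeze (requires @EnableAsync)
    public CompletableFuture<List<File>> convertMultiple(List<MultipartFile> files) {
        System.out.println("Starting batch conversion of " + files.size() + " files, let's go!");
        List<File> pdfFiles = new ArrayList<>();
        List<CompletableFuture<File>> futures = new ArrayList<>();
        for (int i = 0; i < files.size(); i++) {
            final int index = i;
            // Without an executor argument, supplyAsync runs on the common ForkJoinPool
            CompletableFuture<File> future = CompletableFuture.supplyAsync(() -> {
                File wordFile = null;
                try {
                    System.out.println("Converting file " + (index + 1) + "...");
                    wordFile = convertToFile(files.get(index));
                    byte[] pdfBytes = WordToPdfConverter.convert(wordFile);
                    // A unique temp name avoids collisions between concurrent batches
                    File pdfFile = File.createTempFile("converted_" + index + "_", ".pdf");
                    Files.write(pdfFile.toPath(), pdfBytes);
                    return pdfFile;
                } catch (Exception e) {
                    System.err.println("File " + (index + 1) + " failed to convert: " + e.getMessage());
                    return null;
                } finally {
                    if (wordFile != null) wordFile.delete();
                }
            });
            futures.add(future);
        }
        // Wait for all conversions to finish
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        for (CompletableFuture<File> future : futures) {
            File pdf = future.join(); // failed conversions were already mapped to null above
            if (pdf != null) pdfFiles.add(pdf);
        }
        System.out.println("Batch conversion finished! Succeeded: " + pdfFiles.size() +
                "/" + files.size() + " files");
        return CompletableFuture.completedFuture(pdfFiles);
    }

    // Copies the upload to a temp file so the converter can read it from disk
    private File convertToFile(MultipartFile file) throws IOException {
        File temp = File.createTempFile("word-", ".docx");
        file.transferTo(temp);
        return temp;
    }
}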
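The batch service fans work out with `CompletableFuture.supplyAsync`, which uses the shared common pool when no executor is passed. One possible variation, sketched here with plain JDK classes and a stand-in `convert` function instead of the real converter, is to run the batch on a dedicated fixed-size pool and turn each failure into a skipped item. `BatchRunner` and its parameters are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical batch helper: bounded parallelism, failed items are skipped. */
final class BatchRunner {

    /** Applies convert to every input on a pool of poolSize threads, keeping input order. */
    static <I, O> List<O> convertAll(List<I> inputs, Function<I, O> convert, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<CompletableFuture<O>> futures = new ArrayList<>();
            for (I input : inputs) {
                futures.add(CompletableFuture
                        .supplyAsync(() -> convert.apply(input), pool)
                        // A failure becomes null instead of aborting the whole batch
                        .exceptionally(ex -> null));
            }
            return futures.stream()
                    .map(CompletableFuture::join)   // wait for each result in input order
                    .filter(Objects::nonNull)       // drop the failed items
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown(); // all tasks have completed by now, so this returns promptly
        }
    }
}
```

The pool size puts a ceiling on how many documents are held in memory at once, which is the main reason to prefer it over an unbounded fan-out. In a Spring application the pool would more likely be a shared, injected executor than one created per call.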
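Conversion Records (the Transformation Archive)
java
// ConversionRecord.java
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import javax.persistence.*; // jakarta.persistence on Spring Boot 3+
import java.time.LocalDateTime;

@Entity
@Table(name = "conversion_records")
@Data
@NoArgsConstructor
@AllArgsConstructor
public class ConversionRecord {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String originalFileName;
    private String pdfFileName;
    private Long originalSize;
    private Long pdfSize;
    private LocalDateTime conversionTime;
    private String status; // SUCCESS, FAILED, PROCESSING
    private String errorMessage;

    @PrePersist
    protected void onCreate() {
        conversionTime = LocalDateTime.now();
    }
}

// ConversionRecordRepository.java
// (needs org.springframework.data.jpa.repository.JpaRepository,
//  org.springframework.stereotype.Repository and java.util.List)
@Repository
public interface ConversionRecordRepository extends JpaRepository<ConversionRecord, Long> {
    List<ConversionRecord> findByStatusOrderByConversionTimeDesc(String status);
}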
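Deployment and Tuning Tips
1. Performance Tuning
yaml
# Add to application.yml
server:
  tomcat:
    threads:
      max: 200        # more threads for concurrent conversions (server.tomcat.max-threads before Spring Boot 2.3)
      min-spare: 20
spring:
  task:
    execution:
      pool:
        core-size: 10 # thread pool for async tasks
        max-size: 50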
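2. Memory Management
java
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class MemoryWatcher {

    @Scheduled(fixedRate = 60000) // check once a minute (requires @EnableScheduling)
    public void monitorMemory() {
        long usedMemory = Runtime.getRuntime().totalMemory() -
                Runtime.getRuntime().freeMemory();
        long maxMemory = Runtime.getRuntime().maxMemory();
        double usagePercentage = (double) usedMemory / maxMemory * 100;
        if (usagePercentage > 80) {
            System.out.println("Memory warning: usage at " +
                    String.format("%.1f", usagePercentage) + "%");
            // Request a garbage collection (System.gc() is only a hint to the JVM)
            System.gc();
        }
    }
}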
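Since `System.gc()` is only a request, another way to use the same measurement is to hold back new conversions while the heap is nearly full. The following is a minimal sketch of that idea using the JDK's `MemoryMXBean`; `HeapGauge`, its method names and the threshold parameter are illustrative assumptions, not part of the article's code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical gauge that reports heap usage so callers can back off under pressure. */
final class HeapGauge {

    /** Returns used/max as a percentage, or -1 when no maximum is defined. */
    static double usagePercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1; // MemoryUsage.getMax() is -1 when the limit is undefined
        }
        return 100.0 * usedBytes / maxBytes;
    }

    /** Reads the current heap figures from the JVM. */
    static double currentHeapUsagePercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return usagePercent(heap.getUsed(), heap.getMax());
    }

    /** True when usage is known and at or above the given threshold (e.g. 80). */
    static boolean shouldDeferNewWork(double percent, double thresholdPercent) {
        return percent >= 0 && percent >= thresholdPercent;
    }
}
```

A controller could call `shouldDeferNewWork(HeapGauge.currentHeapUsagePercent(), 80)` before accepting an upload and answer with a "try again later" response instead of starting a conversion that may run out of memory.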
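Summary: The Last Stop on the Word-to-PDF Journey
After all this tinkering, we have built a Spring Boot brand document Transformer! To recap the adventure:
What we built:
- Format taming: turning a free-wheeling Word file into a well-behaved PDF
- Async processing: batch conversion runs in the background instead of freezing the UI
- Progress monitoring: a server-side progress service plus a simulated progress bar in the browser
- Error handling: graceful responses for all sorts of surprises
- Batch operations: taming a whole family of Word documents in one go
Caveats:
- Fonts: the built-in Helvetica font cannot render Chinese, and some special fonts are unknown to the PDF, so they need extra handling
- Complex formatting: advanced Word layout (such as text boxes and WordArt) may come out distorted
- Memory: watch out for out-of-memory errors when converting large documents
- Concurrency: converting too many documents at once can leave the server gasping for air
Converting Word to PDF is like fitting a document with "tamper-proof armor", and Spring Boot is the smart factory where we forge it. You will meet all kinds of "troublemaker documents" with bizarre formatting along the way, but with some patient debugging they can all be brought into line!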
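For the concurrency caveat, one simple guard is a semaphore that caps how many conversions run at the same time and turns away callers who wait too long. This is a sketch under that assumption, using only the JDK; `ConversionLimiter` and the wait parameter are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical limiter: at most maxConcurrent conversions run at once. */
final class ConversionLimiter {
    private final Semaphore permits;

    ConversionLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the task if a permit frees up within waitMillis, otherwise rejects it. */
    <T> T run(Supplier<T> task, long waitMillis) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("Interrupted while waiting for a conversion slot", e);
        }
        if (!acquired) {
            throw new IllegalStateException("Too many conversions in progress");
        }
        try {
            return task.get();
        } finally {
            permits.release(); // always hand the slot back, even when the task fails
        }
    }

    /** Number of free slots right now. */
    int available() {
        return permits.availablePermits();
    }
}
```

A controller could wrap the conversion call in `limiter.run(...)` and translate the rejection into a "busy, please retry" response.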
