使用docx4j转换word为pdf处理中文乱码问题

word转pdf

实现方法

java 复制代码
import org.docx4j.Docx4J;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import cn.hutool.core.io.FileUtil;
import java.io.*;

@Slf4j
@Service
public class FileService {
    /**
     * 获取pdf文件,通过word文件转换
     *
     * @param file word文件
     * @date 2023/08/26 23:13
     */
    public File getPdfByWordFile(File file) {
        Mapper fontMapper = new IdentityPlusMapper();
        fontMapper.put("隶书", PhysicalFonts.get("LiSu"));
        fontMapper.put("宋体", PhysicalFonts.get("SimSun"));
        fontMapper.put("微软雅黑", PhysicalFonts.get("Microsoft Yahei"));
        fontMapper.put("黑体", PhysicalFonts.get("SimHei"));
        fontMapper.put("楷体", PhysicalFonts.get("KaiTi"));
        fontMapper.put("新宋体", PhysicalFonts.get("NSimSun"));
        fontMapper.put("华文行楷", PhysicalFonts.get("STXingkai"));
        fontMapper.put("华文仿宋", PhysicalFonts.get("STFangsong"));
        fontMapper.put("仿宋", PhysicalFonts.get("FangSong"));
        fontMapper.put("幼圆", PhysicalFonts.get("YouYuan"));
        fontMapper.put("华文宋体", PhysicalFonts.get("STSong"));
        fontMapper.put("华文中宋", PhysicalFonts.get("STZhongsong"));
        fontMapper.put("等线", PhysicalFonts.get("SimSun"));
        fontMapper.put("等线 Light", PhysicalFonts.get("SimSun"));
        fontMapper.put("华文琥珀", PhysicalFonts.get("STHupo"));
        fontMapper.put("华文隶书", PhysicalFonts.get("STLiti"));
        fontMapper.put("华文新魏", PhysicalFonts.get("STXinwei"));
        fontMapper.put("华文彩云", PhysicalFonts.get("STCaiyun"));
        fontMapper.put("方正姚体", PhysicalFonts.get("FZYaoti"));
        fontMapper.put("方正舒体", PhysicalFonts.get("FZShuTi"));
        fontMapper.put("华文细黑", PhysicalFonts.get("STXihei"));
        fontMapper.put("宋体扩展", PhysicalFonts.get("simsun-extB"));
        fontMapper.put("仿宋_GB2312", PhysicalFonts.get("FangSong_GB2312"));
        fontMapper.put("新細明體", PhysicalFonts.get("SimSun"));
        //解决宋体(正文)和宋体(标题)的乱码问题
        PhysicalFonts.put("PMingLiU", PhysicalFonts.get("SimSun"));
        PhysicalFonts.put("新細明體", PhysicalFonts.get("SimSun"));

        //输出空文件
        File outFile = FileUtil.createTempFile(".pdf", true);
        WordprocessingMLPackage pkg = null;
        try (InputStream resources = FileUtil.getInputStream(file)) {
            pkg = Docx4J.load(resources);
            pkg.setFontMapper(fontMapper);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }

        try (FileOutputStream outputStream = new FileOutputStream(outFile)) {
            Docx4J.toPDF(pkg, outputStream);
        } catch (Exception e) {
            log.error("生成pdf文件异常");
        }
        return outFile;
    }
}

maven

docx4j版本自己酌情升级

可能存在漏洞

Dependency maven:org.apache.xmlgraphics:xmlgraphics-commons:2.3 is vulnerable

Upgrade to 2.9

CVE-2020-11988, Score: 8.2

Apache XmlGraphics Commons 2.4 and earlier is vulnerable to server-side request forgery, caused by improper input validation by the XMPParser. By using a specially-crafted argument, an attacker could exploit this vulnerability to cause the underlying server to make arbitrary GET requests. Users should upgrade to 2.6 or later.

xml 复制代码
        <!-- word转pdf -->
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j-JAXB-Internal</artifactId>
            <version>8.2.4</version>
            <exclusions>
                <exclusion>
                    <groupId>xerces</groupId>
                    <artifactId>xercesImpl</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j-export-fo</artifactId>
            <version>8.2.4</version>
        </dependency>
相关推荐
兔兔爱学习兔兔爱学习1 小时前
Spring Al学习7:ImageModel
java·学习·spring
lang201509282 小时前
Spring远程调用与Web服务全解析
java·前端·spring
m0_564264183 小时前
IDEA DEBUG调试时如何获取 MyBatis-Plus 动态拼接的 SQL?
java·数据库·spring boot·sql·mybatis·debug·mybatis-plus
崎岖Qiu3 小时前
【设计模式笔记06】:单一职责原则
java·笔记·设计模式·单一职责原则
Hello.Reader3 小时前
Flink ExecutionConfig 实战并行度、序列化、对象重用与全局参数
java·大数据·flink
熊小猿4 小时前
在 Spring Boot 项目中使用分页插件的两种常见方式
java·spring boot·后端
paopaokaka_luck4 小时前
基于SpringBoot+Vue的助农扶贫平台(AI问答、WebSocket实时聊天、快递物流API、协同过滤算法、Echarts图形化分析、分享链接到微博)
java·vue.js·spring boot·后端·websocket·spring
老华带你飞5 小时前
机器人信息|基于Springboot的机器人门户展示系统设计与实现(源码+数据库+文档)
java·数据库·spring boot·机器人·论文·毕设·机器人门户展示系统
notion20255 小时前
Adobe Lightroom Classic下载与安装教程(附安装包) 2025最新版详细图文安装教程
java·数据库·其他·adobe
rengang665 小时前
351-Spring AI Alibaba Dashscope 多模型示例
java·人工智能·spring·多模态·spring ai·ai应用编程