使用docx4j转换word为pdf处理中文乱码问题

word转pdf

实现方法

java 复制代码
import org.docx4j.Docx4J;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import cn.hutool.core.io.FileUtil;
import java.io.*;

@Slf4j
@Service
public class FileService {
    /**
     * 获取pdf文件,通过word文件转换
     *
     * @param file word文件
     * @date 2023/08/26 23:13
     */
    public File getPdfByWordFile(File file) {
        Mapper fontMapper = new IdentityPlusMapper();
        fontMapper.put("隶书", PhysicalFonts.get("LiSu"));
        fontMapper.put("宋体", PhysicalFonts.get("SimSun"));
        fontMapper.put("微软雅黑", PhysicalFonts.get("Microsoft Yahei"));
        fontMapper.put("黑体", PhysicalFonts.get("SimHei"));
        fontMapper.put("楷体", PhysicalFonts.get("KaiTi"));
        fontMapper.put("新宋体", PhysicalFonts.get("NSimSun"));
        fontMapper.put("华文行楷", PhysicalFonts.get("STXingkai"));
        fontMapper.put("华文仿宋", PhysicalFonts.get("STFangsong"));
        fontMapper.put("仿宋", PhysicalFonts.get("FangSong"));
        fontMapper.put("幼圆", PhysicalFonts.get("YouYuan"));
        fontMapper.put("华文宋体", PhysicalFonts.get("STSong"));
        fontMapper.put("华文中宋", PhysicalFonts.get("STZhongsong"));
        fontMapper.put("等线", PhysicalFonts.get("SimSun"));
        fontMapper.put("等线 Light", PhysicalFonts.get("SimSun"));
        fontMapper.put("华文琥珀", PhysicalFonts.get("STHupo"));
        fontMapper.put("华文隶书", PhysicalFonts.get("STLiti"));
        fontMapper.put("华文新魏", PhysicalFonts.get("STXinwei"));
        fontMapper.put("华文彩云", PhysicalFonts.get("STCaiyun"));
        fontMapper.put("方正姚体", PhysicalFonts.get("FZYaoti"));
        fontMapper.put("方正舒体", PhysicalFonts.get("FZShuTi"));
        fontMapper.put("华文细黑", PhysicalFonts.get("STXihei"));
        fontMapper.put("宋体扩展", PhysicalFonts.get("simsun-extB"));
        fontMapper.put("仿宋_GB2312", PhysicalFonts.get("FangSong_GB2312"));
        fontMapper.put("新細明體", PhysicalFonts.get("SimSun"));
        //解决宋体(正文)和宋体(标题)的乱码问题
        PhysicalFonts.put("PMingLiU", PhysicalFonts.get("SimSun"));
        PhysicalFonts.put("新細明體", PhysicalFonts.get("SimSun"));

        //输出空文件
        File outFile = FileUtil.createTempFile(".pdf", true);
        WordprocessingMLPackage pkg = null;
        try (InputStream resources = FileUtil.getInputStream(file)) {
            pkg = Docx4J.load(resources);
            pkg.setFontMapper(fontMapper);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }

        try (FileOutputStream outputStream = new FileOutputStream(outFile)) {
            Docx4J.toPDF(pkg, outputStream);
        } catch (Exception e) {
            log.error("生成pdf文件异常");
        }
        return outFile;
    }
}

maven

docx4j版本自己酌情升级

可能存在漏洞

Dependency maven:org.apache.xmlgraphics:xmlgraphics-commons:2.3 is vulnerable

Upgrade to 2.9

CVE-2020-11988, Score: 8.2

Apache XmlGraphics Commons 2.4 and earlier is vulnerable to server-side request forgery, caused by improper input validation by the XMPParser. By using a specially-crafted argument, an attacker could exploit this vulnerability to cause the underlying server to make arbitrary GET requests. Users should upgrade to 2.6 or later.

xml 复制代码
        <!-- word转pdf -->
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j-JAXB-Internal</artifactId>
            <version>8.2.4</version>
            <exclusions>
                <exclusion>
                    <groupId>xerces</groupId>
                    <artifactId>xercesImpl</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.docx4j</groupId>
            <artifactId>docx4j-export-fo</artifactId>
            <version>8.2.4</version>
        </dependency>
相关推荐
考虑考虑21 小时前
Jpa使用union all
java·spring boot·后端
用户37215742613521 小时前
Java 实现 Excel 与 TXT 文本高效互转
java
浮游本尊1 天前
Java学习第22天 - 云原生与容器化
java
渣哥1 天前
原来 Java 里线程安全集合有这么多种
java
间彧1 天前
Spring Boot集成Spring Security完整指南
java
间彧1 天前
Spring Secutiy基本原理及工作流程
java
Java水解1 天前
JAVA经典面试题附答案(持续更新版)
java·后端·面试
洛小豆1 天前
在Java中,Integer.parseInt和Integer.valueOf有什么区别
java·后端·面试
前端小张同学1 天前
服务器上如何搭建jenkins 服务CI/CD😎😎
java·后端
ytadpole1 天前
Spring Cloud Gateway:一次不规范 URL 引发的路由转发404问题排查
java·后端