PDFBox - PDDocument and Byte Arrays, PDF Encryption

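I. PDDocument and Byte Arrays

1. Creating a PDDocument from a byte array
(1) Overview
  • Call the static PDDocument.load method to create a PDDocument from a byte array
  • This is the PDFBox 2.x API; in PDFBox 3.x the static load methods were replaced by Loader.loadPDF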
public static PDDocument load(byte[] input) throws IOException {
    return load(input, "");
}
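Since load accepts any byte array, it can be handy to reject obviously wrong input before parsing. The following is only a minimal sketch using the standard library: the class name PdfHeaderCheck and the method looksLikePdf are made up for illustration and are not part of PDFBox. It checks that the data starts with the "%PDF-" marker that a PDF file header begins with. It is a strict check at offset 0 and does not replace real parsing, so load can still fail on bytes that pass it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox.
final class PdfHeaderCheck {
    // A PDF file header starts with the ASCII characters "%PDF-".
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the array starts with the "%PDF-" marker at offset 0. */
    static boolean looksLikePdf(byte[] input) {
        if (input == null || input.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (input[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller could run PdfHeaderCheck.looksLikePdf(pdfBytes) first and only then hand the array to PDDocument.load.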
(2) Demo
// PDFBox 2.x API. Needs java.io.* plus PDDocument, PDPage, PDPageContentStream
// (org.apache.pdfbox.pdmodel) and PDImageXObject (org.apache.pdfbox.pdmodel.graphics.image).
File pdfFile = new File("pdf/image_example.pdf");

// Read the PDF into memory; try-with-resources closes the stream on every path
ByteArrayOutputStream pdfBuffer = new ByteArrayOutputStream();
byte[] data = new byte[1024];
try (InputStream pdfInputStream = new FileInputStream(pdfFile)) {
    int nRead;
    while ((nRead = pdfInputStream.read(data, 0, data.length)) != -1) {
        pdfBuffer.write(data, 0, nRead);
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
    System.out.println("PDF file not found");
    return;
} catch (IOException e) {
    e.printStackTrace();
    System.out.println("Failed to read the PDF file");
    return;
}

byte[] pdfBytes = pdfBuffer.toByteArray();

try (PDDocument document = PDDocument.load(pdfBytes)) {
    PDPage page = new PDPage();
    document.addPage(page);

    File imgFile = new File("pdf/dzs.jpeg");

    // Read the image into memory; the stream is closed automatically
    ByteArrayOutputStream imgBuffer = new ByteArrayOutputStream();
    byte[] dataImg = new byte[1024];
    try (InputStream imgInputStream = new FileInputStream(imgFile)) {
        int nReadImg;
        while ((nReadImg = imgInputStream.read(dataImg, 0, dataImg.length)) != -1) {
            imgBuffer.write(dataImg, 0, nReadImg);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
        System.out.println("Image file not found");
        return;
    } catch (IOException e) {
        e.printStackTrace();
        System.out.println("Failed to read the image");
        return;
    }

    byte[] imgBytes = imgBuffer.toByteArray();

    PDImageXObject pdImage = PDImageXObject.createFromByteArray(document, imgBytes, "dzs.jpeg");

    try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
        contentStream.drawImage(pdImage, 100, 600, pdImage.getWidth() * 0.25f, pdImage.getHeight() * 0.25f);
    } catch (IOException e) {
        e.printStackTrace();
    }

    document.save("pdf/image_example.pdf");
} catch (IOException e) {
    e.printStackTrace();
}
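The demo reads both the PDF and the image with the same hand-written loop. As a rough sketch, that loop could be pulled into one reusable helper. The class StreamBytes and the method readFully are hypothetical names chosen for this example. The helper closes the stream it is given and rethrows IOException as UncheckedIOException, which is a design choice of this sketch rather than anything PDFBox requires.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of PDFBox.
final class StreamBytes {
    /** Drains the stream into a byte array and closes the stream afterwards. */
    static byte[] readFully(InputStream in) {
        try (InputStream source = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] copy = readFully(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(copy.length); // prints 3
    }
}
```

For plain files, the standard library already covers this case: java.nio.file.Files.readAllBytes(Path) is available since Java 7, and InputStream.readAllBytes() since Java 9.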
2. Getting a byte array from a PDDocument
(1) Overview
  • Call PDDocument's save method with a ByteArrayOutputStream to capture the output, then call toByteArray() on that stream to get the byte array
public void save(OutputStream output) throws IOException {
    if (this.document.isClosed()) {
        throw new IOException("Cannot save a document which has been closed");
    } else {
        Iterator var2 = this.fontsToSubset.iterator();

        while(var2.hasNext()) {
            PDFont font = (PDFont)var2.next();
            font.subset();
        }

        this.fontsToSubset.clear();
        COSWriter writer = new COSWriter(output);

        try {
            writer.write(this);
        } finally {
            writer.close();
        }

    }
}
(2) Demo
// PDFBox 2.x API. Needs java.io.* plus PDDocument, PDPage, PDPageContentStream
// (org.apache.pdfbox.pdmodel) and PDImageXObject (org.apache.pdfbox.pdmodel.graphics.image).
File pdfFile = new File("pdf/image_example.pdf");

// Read the PDF into memory; try-with-resources closes the stream on every path
ByteArrayOutputStream pdfBuffer = new ByteArrayOutputStream();
byte[] data = new byte[1024];
try (InputStream pdfInputStream = new FileInputStream(pdfFile)) {
    int nRead;
    while ((nRead = pdfInputStream.read(data, 0, data.length)) != -1) {
        pdfBuffer.write(data, 0, nRead);
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
    System.out.println("PDF file not found");
    return;
} catch (IOException e) {
    e.printStackTrace();
    System.out.println("Failed to read the PDF file");
    return;
}

byte[] pdfBytes = pdfBuffer.toByteArray();

try (PDDocument document = PDDocument.load(pdfBytes)) {
    PDPage page = new PDPage();
    document.addPage(page);

    File imgFile = new File("pdf/dzs.jpeg");

    // Read the image into memory; the stream is closed automatically
    ByteArrayOutputStream imgBuffer = new ByteArrayOutputStream();
    byte[] dataImg = new byte[1024];
    try (InputStream imgInputStream = new FileInputStream(imgFile)) {
        int nReadImg;
        while ((nReadImg = imgInputStream.read(dataImg, 0, dataImg.length)) != -1) {
            imgBuffer.write(dataImg, 0, nReadImg);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
        System.out.println("Image file not found");
        return;
    } catch (IOException e) {
        e.printStackTrace();
        System.out.println("Failed to read the image");
        return;
    }

    byte[] imgBytes = imgBuffer.toByteArray();

    PDImageXObject pdImage = PDImageXObject.createFromByteArray(document, imgBytes, "dzs.jpeg");

    try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
        contentStream.drawImage(pdImage, 100, 600, pdImage.getWidth() * 0.25f, pdImage.getHeight() * 0.25f);
    } catch (IOException e) {
        e.printStackTrace();
    }

    ByteArrayOutputStream output = new ByteArrayOutputStream();
    document.save(output);
    byte[] bytes = output.toByteArray();
    System.out.println("PDF byte count: " + bytes.length);
} catch (IOException e) {
    e.printStackTrace();
}
# Output

PDF byte count: 86736
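The "save into a ByteArrayOutputStream, then call toByteArray()" step works for any API that writes to an OutputStream. Below is a small sketch of that pattern in isolation, using only the standard library. ByteCapture, StreamWriter, and toBytes are names invented for this example, and wrapping IOException in UncheckedIOException is an assumption made to keep call sites short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of PDFBox.
final class ByteCapture {
    /** Anything that can write itself to an OutputStream. */
    @FunctionalInterface
    interface StreamWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Runs the writer against an in-memory stream and returns what it wrote. */
    static byte[] toBytes(StreamWriter writer) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            writer.writeTo(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With PDFBox on the classpath, the demo's last lines could then be written as byte[] bytes = ByteCapture.toBytes(out -> document.save(out));, which relies on the save(OutputStream) method shown above.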

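II. PDF Encryption

1. Overview
(1) User password
  1. The user password (User Password / Open Password), also called the open password or document password, is required to open a PDF file

  2. If a user password is set, anyone opening the PDF must enter it; otherwise the content cannot be viewed

(2) Owner password
  1. The owner password (Owner Password / Permissions Password) controls the document's permission restrictions

  2. Permissions restrict operations such as printing, copying, and editing; the owner password protects those settings

  3. Only someone who has the owner password can remove these restrictions; a user who can merely open the PDF in a conforming viewer cannot lift them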
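To make the idea of permissions more concrete: in the PDF standard security handler, the permissions are stored as bit flags in a single integer. The sketch below models four of those flags with plain Java. The class PdfPermissionBits and its methods are hypothetical and heavily simplified (a real permission value also contains reserved bits and further flags). PDFBox exposes the same settings through AccessPermission setters such as setCanPrint(boolean) and setCanExtractContent(boolean).

```java
// Hypothetical, simplified model of PDF permission flags; not part of PDFBox.
final class PdfPermissionBits {
    // Bit positions are 1-based in the PDF specification, so bit 3 is 1 << 2.
    static final int PRINT = 1 << 2;              // bit 3: print the document
    static final int MODIFY = 1 << 3;             // bit 4: modify the contents
    static final int EXTRACT = 1 << 4;            // bit 5: copy text and graphics
    static final int MODIFY_ANNOTATIONS = 1 << 5; // bit 6: add or modify annotations

    /** Returns true if the given flag is set in the permission value. */
    static boolean allows(int permissions, int flag) {
        return (permissions & flag) != 0;
    }

    /** Returns a copy of the permission value with the given flag cleared. */
    static int revoke(int permissions, int flag) {
        return permissions & ~flag;
    }

    public static void main(String[] args) {
        int all = PRINT | MODIFY | EXTRACT | MODIFY_ANNOTATIONS;
        int noPrint = revoke(all, PRINT);
        System.out.println(allows(noPrint, PRINT));   // prints false
        System.out.println(allows(noPrint, EXTRACT)); // prints true
    }
}
```

A viewer that honors these flags refuses the revoked operation unless the document was opened with the owner password.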

2. Creating an encrypted PDF
try (PDDocument document = new PDDocument()) {
    PDPage page = new PDPage();
    document.addPage(page);

    try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
        contentStream.beginText();
        contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);
        contentStream.newLineAtOffset(100, 700);
        contentStream.showText("Hello PDFBox");
        contentStream.endText();
    } catch (IOException e) {
        e.printStackTrace();
    }

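    // Set permissions. A new AccessPermission grants everything by default.
    // Note: setReadOnly() only locks this object against further changes; to
    // actually restrict operations, call setters such as setCanPrint(false) first.
    AccessPermission accessPermission = new AccessPermission();
    accessPermission.setReadOnly();

    // Set the protection policy; the arguments are (ownerPassword, userPassword, permissions).
    // Both passwords are identical here only to keep the demo short.
    StandardProtectionPolicy policy = new StandardProtectionPolicy("12345", "12345", accessPermission);
    document.protect(policy);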

    document.save("pdf/test.pdf");
} catch (IOException e) {
    e.printStackTrace();
}
3. Reading an encrypted PDF
  1. Reading with no password or with a wrong password
try (PDDocument document = PDDocument.load(new File("pdf/test.pdf"), "abc")) {
    PDFTextStripper stripper = new PDFTextStripper();
    String text = stripper.getText(document);
    System.out.println(text);
} catch (Exception e) {
    e.printStackTrace();
}
# Output

org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException: Cannot decrypt PDF, the password is incorrect
  2. Reading with the correct password
try (PDDocument document = PDDocument.load(new File("pdf/test.pdf"), "12345")) {
    PDFTextStripper stripper = new PDFTextStripper();
    String text = stripper.getText(document);
    System.out.println(text);
} catch (Exception e) {
    e.printStackTrace();
}
# Output

Hello PDFBox