HarmonyOS 鸿蒙应用开发基础：转换整个PDF文档为图片功能

在许多应用场景中，将PDF文档的每一页转换为单独的图片文件是非常有帮助的。这可以用于文档的分享、扫描文档的电子化存档、或者进行进一步的文字识别处理等。本文将介绍如何使用华为HarmonyOS提供的PDF处理服务将整个PDF文档转换为图片，并将这些图片存放在指定的文件夹中。以往想实现这个功能都需要一些收费的插件，现在鸿蒙直接支持。

场景介绍

假设我们有一个PDF文档，想要将其所有的页面转换为图片格式，并且希望每一页都生成一张单独的图片文件。所有生成的图片文件需要存储在一个指定的文件夹中，以便后续的处理和使用。HarmonyOS的PDF服务提供了将PDF文档转换为图片的功能，支持多种图片格式，具体可以参考ImageFormat。

接口说明

接口名	描述
convertToImage(path: string, format: ImageFormat, onProgress?: Callback<number>): boolean	转换PDF文档为图片。

接口名 : convertToImage
描述: 将PDF文档的每一页转换为图片，并存储在指定的目录中。
参数 :
- path: string：指定输出图片的文件夹路径。
- format: ImageFormat：指定图片的输出格式。
- onProgress?: Callback<number>：可选参数，用于监听转换进度的回调函数。

示例代码

下面是一个完整的示例代码，演示如何将PDF文档的所有页面转换为PNG格式的图片，并存储在应用的沙箱目录下的output文件夹中。

typescript 复制代码

import { fileIo as fs } from '@kit.CoreFileKit';
import { common } from '@kit.AbilityKit';
import { hilog } from '@kit.PerformanceAnalysisKit';
import { pdfService } from '@kit.PDFKit';

@Entry
@Component
struct PdfPage {
  private pdfDocument: pdfService.PdfDocument = new pdfService.PdfDocument();
  private context = getContext() as common.UIAbilityContext;
  private loadResult: pdfService.ParseResult = pdfService.ParseResult.PARSE_ERROR_FORMAT;

  aboutToAppear(): void {
    // 确保应用沙箱目录下有input.pdf文档
    let filePath = this.context.filesDir + '/input.pdf';
    this.loadResult = this.pdfDocument.loadDocument(filePath);
  }

  build() {
    Column() {
      // 转换PDF文档的按钮
      Button('convertToImage').onClick(async () => {
        if (this.loadResult === pdfService.ParseResult.PARSE_SUCCESS) {
          // 设置输出路径
          let outputPath = getContext().filesDir + '/output/';
          // 创建输出目录
          fs.mkdir(outputPath);
          
          // 将所有的页面转化为png图片，并存储在output文件夹里
          let res = this.pdfDocument.convertToImage(outputPath, pdfService.ImageFormat.PNG);
          
          // 记录转换结果日志
          hilog.info(0x0000, 'PdfPage', 'convertToImage %{public}s!', res ? 'success' : 'fail');
        }
      })
    }
  }
}

代码解析

导入必要的模块 ：首先，我们需要导入一些必要的模块，包括文件IO操作模块fileIo、上下文模块common、日志记录模块hilog，以及PDF处理模块pdfService。
加载PDF文档 ：在aboutToAppear生命周期方法中，我们通过loadDocument方法加载PDF文档。这里假设PDF文档的路径为应用沙箱目录下的input.pdf。
创建输出目录 ：在点击按钮触发的事件处理函数中，我们首先检查PDF文档是否成功加载。如果成功，我们将创建一个用于存放输出图片的文件夹，路径为应用沙箱目录下的output文件夹。
转换为图片 ：然后，调用convertToImage方法，将PDF文档的所有页面转换为PNG格式的图片，并存放在刚刚创建的output文件夹中。
日志记录 ：最后，我们使用hilog.info方法记录转换的结果，以便于调试和日志查看。

转换指定页面为图片

DF文档页面转换为图片，或将页面的指定区域转换为图片时使用。

接口说明

接口名	描述
getPagePixelMap(): image.PixelMap	获取当前页的图片。
getCustomPagePixelMap(matrix: PdfMatrix, isGray: boolean, drawAnnotations: boolean): image.PixelMap	获取指定PdfPage区域的图片内容。
getAreaPixelMap(matrix: PdfMatrix, bitmapwidth: number, bitmapHeight: number, isGray: boolean, drawAnnotations: boolean): image.PixelMap	获取指定PdfPage区域的图片内容，并指定图片的宽和高。

示例代码

调用loadDocument方法加载PDF文档。
调用getPage方法获取某个页面。
调用getPagePixelMap或getCustomPagePixelMap方法获取当前页面或者页面区域，这时获取的是image.PixelMap图像类型。
将image.PixelMap图像类型转化为二进制图片文件并保存，参考以下方法pixelMap2Buffer。

javascript 复制代码

import { pdfService } from '@kit.PDFKit';
    import { image } from '@kit.ImageKit';
    import { fileIo as fs } from '@kit.CoreFileKit';
    import { common } from '@kit.AbilityKit';
    import { BusinessError } from '@kit.BasicServicesKit';
    
    @Entry
    @Component
    struct PdfPage {
      private pdfDocument: pdfService.PdfDocument = new pdfService.PdfDocument();
      private context = getContext() as common.UIAbilityContext;
      private loadResult: pdfService.ParseResult = pdfService.ParseResult.PARSE_ERROR_FORMAT;
    
      aboutToAppear(): void {
        // 确保沙箱目录有input.pdf文档
        let filePath = this.context.filesDir + '/input.pdf';
        this.loadResult = this.pdfDocument.loadDocument(filePath);
      }
    
      // 将 pixelMap 转成图片格式
      pixelMap2Buffer(pixelMap: image.PixelMap): Promise<ArrayBuffer> {
        return new Promise((resolve, reject) => {
          /**
           设置打包参数
           format：图片打包格式，只支持 jpg 和 webp
           quality：JPEG 编码输出图片质量
           bufferSize：图片大小，默认 10M
           */
          let packOpts: image.PackingOption = { format: 'image/jpeg', quality: 98 }
          // 创建ImagePacker实例
          const imagePackerApi = image.createImagePacker()
          imagePackerApi.packToData(pixelMap, packOpts).then((buffer: ArrayBuffer) => {
            resolve(buffer)
          }).catch((err: BusinessError) => {
            reject()
          })
        })
      }
    
      build() {
        Column() {
          // 获取为图片并保存到应用沙箱
          Button('getPagePixelMap').onClick(async () => {
            if (this.loadResult === pdfService.ParseResult.PARSE_SUCCESS) {
              let page = this.pdfDocument.getPage(0)
              let pixmap: image.PixelMap = page.getPagePixelMap();
              if (!pixmap) {
                return
              }
              const imgBuffer = await this.pixelMap2Buffer(pixmap)
              const file =
                fs.openSync(this.context.filesDir + `/${Date.now()}.png`, fs.OpenMode.READ_WRITE | fs.OpenMode.CREATE);
              await fs.write(file.fd, imgBuffer)
              // 关闭文档
              await fs.close(file.fd)
            }
          })
          // 获取指定PdfPage区域的图片内容。
          Button('getCustomPagePixelMap').onClick(async () => {
            if (this.loadResult === pdfService.ParseResult.PARSE_SUCCESS) {
              let page = this.pdfDocument.getPage(0);
              let matrix = new pdfService.PdfMatrix();
              matrix.x = 100;
              matrix.y = 100;
              matrix.width = 500;
              matrix.height = 500;
              matrix.rotate = 0;
              let pixmap: image.PixelMap = page.getCustomPagePixelMap(matrix, false, false);
              if (!pixmap) {
                return;
              }
              const imgBuffer = await this.pixelMap2Buffer(pixmap);
              const file =
                fs.openSync(this.context.filesDir + `/${Date.now()}.jpeg`, fs.OpenMode.READ_WRITE | fs.OpenMode.CREATE);
              await fs.write(file.fd, imgBuffer);
              // 关闭文件
              await fs.close(file.fd);
            }
          })
          // 获取指定PdfPage区域的图片内容
          Button('getAreaPixelMap').onClick(async () => {
            if (this.loadResult === pdfService.ParseResult.PARSE_SUCCESS) {
              let page = this.pdfDocument.getPage(0);
              let matrix = new pdfService.PdfMatrix();
              matrix.x = 100;
              matrix.y = 100;
              matrix.width = 500;
              matrix.height = 500;
              matrix.rotate = 0;
              let pixmap: image.PixelMap = page.getAreaPixelMap(matrix, 400, 400, true, false);
              if (!pixmap) {
                return
              }
              const imgBuffer = await this.pixelMap2Buffer(pixmap)
              const file =
                fs.openSync(this.context.filesDir + `/${Date.now()}.bmp`, fs.OpenMode.READ_WRITE | fs.OpenMode.CREATE);
              await fs.write(file.fd, imgBuffer)
              // 关闭文件
              await fs.close(file.fd);
            }
          })
        }
      }
    }

总结

通过上述步骤和代码，我们可以轻松地实现将一个PDF文档的所有页面转换为单独的图片文件，并存放在指定的文件夹中。这种方法对于需要对PDF文档进行处理或分享的场景非常有用。请注意，实际开发中需要处理各种异常情况，确保程序的健壮性和用户体验。