HarmonyOS AI Development Productivity Tools: DevEco Code & DevEco CLI - Hands-On: An On-Device AI Text Recognition App

The Problem

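In on-device AI development on HarmonyOS, the easiest traps to fall into are not in the model itself but in task scheduling and cross-device deployment.

Many developers who integrate ML Kit's OCR capability drive UI refreshes directly with @State and overlook the lifecycle of the image-processing work running in Taskpool. The result: the page goes to the background, comes back, and the recognition task has silently died. And when DevEco CLI is configured for multi-target packaging, keeping AOT compilation and HAP distribution on matching versions becomes a source of repeated reinstalls.

This article takes the two problems separately: first the asynchronous handling of OCR recognition, then the configuration for cross-device deployment.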

The Root of the Problem

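Text recognition (OCR) through ML Kit is convenient on HarmonyOS, but in real projects the trouble comes from three places:

Unstable image source: the camera data stream can be interrupted by lifecycle changes
AI inference blocks the UI thread: without Taskpool the app is practically unusable
Intermittent model-loading failures: token validation and permission requests happen at the wrong time

On top of that, a model does not behave identically on every device. Configuring multiple products with DevEco CLI lets you tune each device type separately, but it also means the build configuration needs finer-grained management.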

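ML Kit compared with MindSpore Lite:

Option	Integration cost	Inference speed	Device compatibility	Maintenance
ML Kit	Low, fully wrapped API	Fast, hardware-accelerated	Good, officially maintained	Low effort
MindSpore Lite	High, requires model conversion	Controllable, tunable	Moderate, needs your own testing	High cost

For most applications ML Kit is enough. MindSpore Lite fits scenarios that need a custom model.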

Environment

DevEco Studio version: DevEco Studio 6.1.0 or later
HarmonyOS SDK version: HarmonyOS 6.1.0(23) or later
Target devices: phone, tablet

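Core Implementation

1. Initializing the OCR Model Engine

This code loads ML Kit's text recognition capability. Many developers write it as a global singleton but overlook that the model file can be reclaimed by the system while it is being loaded.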

// TextRecognitionManager.ets
// Confirm the import path and the createTextRecognition() options against your SDK version.
import { textRecognition } from '@kit.MindSporeLiteKit';
import { BusinessError } from '@kit.BasicServicesKit';

export class TextRecognitionManager {
    private static instance: TextRecognitionManager;
    private recognizer: textRecognition.TextRecognition | null = null;

    private constructor() {
        // Loading is deferred to loadRecognizer() so that construction holds no resources
    }

    static getInstance(): TextRecognitionManager {
        if (!TextRecognitionManager.instance) {
            TextRecognitionManager.instance = new TextRecognitionManager();
        }
        return TextRecognitionManager.instance;
    }

    async loadRecognizer(): Promise<void> {
        // Already loaded: nothing to do
        if (this.recognizer !== null) {
            return;
        }
        try {
            // Load the model; modelPath must point into the application sandbox
            this.recognizer = await textRecognition.createTextRecognition(
                'business/your_model.mindir',
                {
                    // Optional settings: device type, precision mode
                    deviceType: 0 // 0: CPU, 1: GPU
                }
            );
        } catch (error) {
            console.error('Model loading failed: ' + (error as BusinessError).message);
            throw error as BusinessError;
        }
    }

    getRecognizer(): textRecognition.TextRecognition | null {
        return this.recognizer;
    }
}

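One key detail: modelPath must be a path inside the application sandbox; a rawfile path cannot be passed directly. Writing getContext().resourceDir + '/model.mindir' fails to find the file on some devices. Package the model as a rawfile, read it with getContext().resourceManager.getRawFileContent(), and write it into the sandbox before loading.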
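The copy-to-sandbox step can be sketched in plain TypeScript. This is a minimal illustration, not platform code: `RawFileReader`, `SandboxStore`, `MemoryStore`, and `ensureModelInSandbox` are hypothetical names standing in for the resource manager and file APIs, and the example is synchronous for brevity while the real calls are typically asynchronous.

```typescript
// Hypothetical abstraction over "read a packaged rawfile as bytes".
interface RawFileReader {
  read(name: string): Uint8Array;
}

// Hypothetical abstraction over the application sandbox file system.
interface SandboxStore {
  exists(path: string): boolean;
  write(path: string, data: Uint8Array): void;
}

// Copies the model into the sandbox only if it is not there yet and
// returns the sandbox path that can be handed to the model loader.
function ensureModelInSandbox(
  reader: RawFileReader,
  store: SandboxStore,
  sandboxDir: string,
  fileName: string
): string {
  const target = `${sandboxDir}/${fileName}`;
  if (!store.exists(target)) {
    store.write(target, reader.read(fileName));
  }
  return target;
}

// In-memory stand-in so the sketch runs anywhere.
class MemoryStore implements SandboxStore {
  files = new Map<string, Uint8Array>();
  writes = 0;
  exists(path: string): boolean {
    return this.files.has(path);
  }
  write(path: string, data: Uint8Array): void {
    this.files.set(path, data);
    this.writes += 1;
  }
}

const demoReader: RawFileReader = { read: () => new Uint8Array([1, 2, 3]) };
const demoStore = new MemoryStore();
const modelPath = ensureModelInSandbox(demoReader, demoStore, '/data/files', 'model.mindir');
console.log(modelPath, demoStore.writes);
```

Checking for the file first keeps repeated launches from rewriting the model; a real implementation would probably also compare a size or version marker so that an updated model replaces a stale copy.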

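2. The Taskpool Image-Processing Task

OCR inference is CPU-intensive. Called directly on the main thread, it freezes the UI. Taskpool is the officially recommended approach, but many developers do not realize that a PixelMap object should not be passed straight into a Taskpool task.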

// TextRecognitionTask.ets
import { taskpool } from '@kit.ArkTS';
import { image } from '@kit.ImageKit';
import { textRecognition, textRecognitionResult } from '@kit.MindSporeLiteKit';
import { TextRecognitionManager } from './TextRecognitionManager';

@Concurrent
async function doOcrRecognition(buffer: ArrayBuffer): Promise<string> {
    // Pass an ArrayBuffer, not a PixelMap: a PixelMap holds native resources
    // that the sending thread may release while this task is still running.
    // Taskpool threads do not share module state with the main thread, so this
    // singleton is a per-thread instance. loadRecognizer() returns immediately
    // once the model has been loaded on the current worker thread.
    const manager = TextRecognitionManager.getInstance();
    await manager.loadRecognizer();
    const recognizer = manager.getRecognizer();
    if (!recognizer) {
        return 'Model not loaded';
    }
    // Rebuild the image from the ArrayBuffer inside the task
    const imageSource = image.createImageSource(buffer);
    const pixelMap = await imageSource.createPixelMap();
    try {
        // Run recognition
        const result: textRecognitionResult = await recognizer.detect(pixelMap);
        return result.text;
    } finally {
        // Release native resources even if recognition throws
        pixelMap.release();
        imageSource.release();
    }
}

export class OcrTask {
    static execute(buffer: ArrayBuffer): Promise<string> {
        return taskpool.execute(doOcrRecognition, buffer) as Promise<string>;
    }
}

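Why not pass the PixelMap? In ArkTS a PixelMap is a native object whose lifetime is managed on the C++ side. When it crosses into Taskpool, only a handle is transferred; if the main thread releases it early, the worker thread reads a dangling pointer. Converting to an ArrayBuffer first and rebuilding the image inside the task avoids that dependency.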
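The point about ownership can be shown with plain buffers. The sketch below is ordinary TypeScript rather than Taskpool code, and `snapshotForTask` is a hypothetical helper name: it copies the bytes so that whatever the sender does to its own buffer afterwards cannot affect the data the task works on.

```typescript
// Returns an independent copy of the bytes; the caller keeps ownership of `source`.
function snapshotForTask(source: ArrayBuffer): ArrayBuffer {
  return source.slice(0);
}

// The sender's buffer, for example bytes read from a captured image.
const senderBuffer = new ArrayBuffer(4);
new Uint8Array(senderBuffer).set([1, 2, 3, 4]);

// Copy first, then hand the copy to the background task.
const taskBuffer = snapshotForTask(senderBuffer);

// Simulate the sender reusing or clearing its buffer while the task runs.
new Uint8Array(senderBuffer).fill(0);

// The task's copy still holds the original bytes.
console.log(Array.from(new Uint8Array(taskBuffer)));
```

The copy costs memory proportional to the image size, which is one more reason to downscale before handing data to a task.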

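3. The UI Page: Camera Capture + Recognition Display

The main page handles a typical scenario: taking a photo, preprocessing the image, and showing the recognition result. The full code: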

// Index.ets
import { camera } from '@kit.CameraKit';
import { image } from '@kit.ImageKit';
import { BusinessError } from '@kit.BasicServicesKit';
import { OcrTask } from './TextRecognitionTask';
import { TextRecognitionManager } from './TextRecognitionManager';

@Entry
@Component
struct OCRDemoPage {
    @State recognizedText: string = 'Waiting for recognition...';
    @State isProcessing: boolean = false;
    private cameraInput: camera.CameraInput | null = null;
    private photoOutput: camera.PhotoOutput | null = null;
    // Keep one controller for the component's lifetime instead of creating one per build()
    private xController: XComponentController = new XComponentController();

    aboutToAppear() {
        // Load the model early so the user does not wait after taking a photo
        TextRecognitionManager.getInstance().loadRecognizer().catch(() => {
            this.recognizedText = 'Model loading failed';
        });
    }

    build() {
        Column() {
            // Camera preview area
            XComponent({
                id: 'cameraXComponent',
                type: 'surface',
                controller: this.xController
            })
            .width('100%')
            .aspectRatio(4 / 3)
            .onLoad(() => {
                this.initCamera();
            })

            // Recognition result
            Text(this.recognizedText)
                .width('90%')
                .height(80)
                .backgroundColor('#F0F0F0')
                .borderRadius(8)
                .padding(12)
                .margin(16)

            // Capture button
            Button('Capture and recognize')
                .width(200)
                .height(48)
                .onClick(async () => {
                    await this.takePhotoAndRecognize();
                })
                .enabled(!this.isProcessing)
        }
        .width('100%')
        .height('100%')
    }

    private async initCamera() {
        // Camera initialization (standard flow omitted).
        // The key point is to end up with a PhotoOutput in this.photoOutput and,
        // once it exists, to register the result callback:
        // this.photoOutput.on('photoAvailable', (err, photo) => this.onPhotoAvailable(err, photo));
    }

    private async takePhotoAndRecognize() {
        if (this.isProcessing) {
            return;
        }
        if (!this.photoOutput) {
            this.recognizedText = 'Camera not ready';
            return;
        }
        this.isProcessing = true;
        this.recognizedText = 'Recognizing...';
        try {
            // capture() only triggers the shot; the image arrives in the photoAvailable callback
            await this.photoOutput.capture();
        } catch (error) {
            this.recognizedText = 'Capture failed: ' + (error as BusinessError).message;
            this.isProcessing = false;
        }
    }

    private async onPhotoAvailable(err: BusinessError, photo: camera.Photo) {
        try {
            if (err || photo === undefined) {
                throw new Error('Capture failed');
            }
            // Read the JPEG bytes of the main image
            const component = await photo.main.getComponent(image.ComponentType.JPEG);
            // Copy the bytes so the task owns its buffer, then release the photo
            const arrayBuffer = component.byteBuffer.slice(0);
            await photo.release();
            // Run OCR in Taskpool
            const result = await OcrTask.execute(arrayBuffer);
            this.recognizedText = result || 'No text recognized';
        } catch (error) {
            this.recognizedText = 'Recognition failed: ' + (error as Error).message;
        } finally {
            this.isProcessing = false;
        }
    }
}

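Lifecycle issue: aboutToAppear runs once when the component instance is created, not each time the page returns to the foreground. If the system destroys and recreates the page, the model has to be loaded again. A good approach is to manage the load state in a singleton at the module entry and mirror it in an @State flag that records whether loading has finished.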
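A load-state singleton of this kind might look like the following sketch. It is plain TypeScript with hypothetical names (`ModelLoader`, `ensureLoaded`, `LoadState`), and the actual model-loading call is injected as `loadFn`, so nothing here depends on a specific kit. Concurrent callers share one in-flight load, and a failed load can be retried.

```typescript
type LoadState = 'idle' | 'loading' | 'ready' | 'failed';

class ModelLoader {
  private state: LoadState = 'idle';
  private inFlight: Promise<void> | null = null;

  // loadFn is whatever actually loads the model (an assumption of this sketch).
  constructor(private readonly loadFn: () => Promise<void>) {}

  getState(): LoadState {
    return this.state;
  }

  // Safe to call from aboutToAppear, onPageShow, or right before recognition.
  ensureLoaded(): Promise<void> {
    if (this.state === 'ready') {
      return Promise.resolve();
    }
    if (this.inFlight !== null) {
      return this.inFlight; // join the load that is already running
    }
    this.state = 'loading';
    this.inFlight = this.loadFn().then(
      () => {
        this.state = 'ready';
        this.inFlight = null;
      },
      (err: unknown) => {
        this.state = 'failed';
        this.inFlight = null; // allow a later retry
        throw err;
      }
    );
    return this.inFlight;
  }

  // Call when the underlying model has been released or invalidated.
  reset(): void {
    this.state = 'idle';
    this.inFlight = null;
  }
}
```

A page would call `ensureLoaded()` and copy `getState()` into its own @State field when the promise settles, which keeps the UI flag and the real load state from drifting apart.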

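4. AOT Compilation and Multi-Target Configuration

AOT compilation with DevEco CLI requires you to state explicitly which modules should be precompiled. Products are defined in the project-level build-profile.json5: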

{
  "app": {
    "products": [
      {
        "name": "phone",
        "versionCode": 1,
        "versionName": "1.0.0",
        "buildConfig": {
          "compileMode": "aot",
          "aotMode": "partial", // or "full"; full takes more space but runs faster
          "excludeFiles": ["**/*.d.ts"],
          "aotProfiles": ["business/profile.ap"] // pre-collected hot-spot profile
        },
        "supportDevices": [
          {
            "deviceType": "phone"
          }
        ]
      },
      {
        "name": "tablet",
        "versionCode": 1,
        "versionName": "1.0.0",
        "buildConfig": {
          "compileMode": "aot",
          "aotMode": "partial"
        },
        "supportDevices": [
          {
            "deviceType": "tablet"
          }
        ]
      }
    ]
  }
}

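Parameters:

compileMode: "aot": enables AOT
aotMode: "partial": compiles only hot code, for cases that need fast startup without a large package
aotProfiles: accepts hot-function profiles collected with DevEco Profiler so the compiler prioritizes those functions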
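Because each product carries its own versionCode and versionName, a small pre-build check can catch version drift between targets before it shows up as a reinstall. The sketch below assumes the products array has already been parsed into objects; `ProductConfig` and `findVersionMismatches` are hypothetical names, not part of any DevEco tool.

```typescript
interface ProductConfig {
  name: string;
  versionCode: number;
  versionName: string;
}

// Returns the names of products whose version differs from the first product.
function findVersionMismatches(products: ProductConfig[]): string[] {
  if (products.length === 0) {
    return [];
  }
  const base = products[0];
  return products
    .slice(1)
    .filter((p) => p.versionCode !== base.versionCode || p.versionName !== base.versionName)
    .map((p) => p.name);
}

const mismatched = findVersionMismatches([
  { name: 'phone', versionCode: 1, versionName: '1.0.0' },
  { name: 'tablet', versionCode: 2, versionName: '1.0.1' }
]);
console.log(mismatched); // names of products out of step with the first one
```

Running a check like this in CI before packaging is cheaper than discovering the mismatch on a device.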

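Common Problems

Problem 1: A Taskpool task never runs

Symptom: taskpool.execute() is called, but the task hangs forever and the callback never fires.

Cause: Taskpool runs tasks on a limited pool of worker threads. If previously submitted tasks never finish (an infinite loop, a Promise that never resolves), new tasks are blocked behind them. In the OCR case, a failed model load leaves recognizer null; if the task then throws and the error is never handled, the caller can be left waiting on a promise that stays pending.

Solution: wrap the body of the Taskpool function in try-catch so that every path either returns or rejects. The state of the task queue can also be checked periodically with taskpool.getTaskPoolInfo().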
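The "every path returns or rejects" rule can be enforced on the calling side as well. The following is a minimal sketch in plain TypeScript with hypothetical names (`runGuarded`, `TaskOutcome`): it converts exceptions and timeouts into an explicit result so the UI never waits on a promise that does not settle.

```typescript
type TaskOutcome<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// Runs `task` and always resolves: with the value, the error message, or a timeout.
function runGuarded<T>(task: () => Promise<T>, timeoutMs: number): Promise<TaskOutcome<T>> {
  return new Promise<TaskOutcome<T>>((resolve) => {
    const timer = setTimeout(() => {
      resolve({ ok: false, error: `timed out after ${timeoutMs} ms` });
    }, timeoutMs);

    const fail = (err: unknown): void => {
      clearTimeout(timer);
      resolve({ ok: false, error: err instanceof Error ? err.message : String(err) });
    };

    try {
      task().then((value) => {
        clearTimeout(timer);
        resolve({ ok: true, value });
      }, fail);
    } catch (err) {
      fail(err); // the task threw before returning a promise
    }
  });
}
```

In the page this could wrap the OCR call, for example `runGuarded(() => OcrTask.execute(buffer), 3000)`, where the 3000 ms budget is an arbitrary illustration. Note that a timeout only unblocks the caller; a task that is stuck still occupies its worker thread.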

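Problem 2: The app starts more slowly after enabling AOT

Symptom: with AOT enabled, the first launch takes longer than with AOT disabled.

Cause: AOT compilation turns bytecode into machine code that is stored on the device and memory-mapped at run time, and that process itself takes time. If aotMode: "full" is enabled without a profile, the compiler compiles all of the code, which makes the compilation work after first install too heavy.

Solution: switch to partial mode, use DevEco Profiler to collect the functions that are actually called most often, and point aotProfiles at the generated profile file. For an OCR app, model loading and inference are the hot spots and should be compiled first.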

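Best Practices

Prefer a singleton for the model instance: ML Kit's TextRecognition is not cheap to create, so initialize it at application startup rather than recreating it for every recognition. A singleton combined with aboutToAppear is the most stable combination.
Control the granularity of Taskpool tasks: a single OCR inference should ideally stay under 200 ms. If the image resolution is high (for example 4000*3000), compress it before passing it to Taskpool. Compressing inside Taskpool also works, but keep memory usage under control. In the author's tests, compressing a 1080p image at 70% quality left recognition accuracy almost unchanged.
Test the DevEco CLI AOT configuration on real devices: the emulator's compilation behavior differs from a real device's. partial mode may look fine on the emulator yet produce compilation errors on hardware because of CPU instruction-set differences. Run at least once on a real device for each target.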
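For the "compress before passing to Taskpool" advice, the first step is deciding the target dimensions. The helper below is a hedged sketch in plain TypeScript: `fitWithin` and the 1920-pixel limit are illustrative assumptions, and the actual resizing would be done with the platform's image APIs.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scales `src` down so its longer side is at most `maxLongSide`,
// preserving aspect ratio. Images already within the limit are left as-is.
function fitWithin(src: Size, maxLongSide: number): Size {
  const longSide = Math.max(src.width, src.height);
  if (longSide <= maxLongSide) {
    return { width: src.width, height: src.height };
  }
  const scale = maxLongSide / longSide;
  return {
    width: Math.round(src.width * scale),
    height: Math.round(src.height * scale)
  };
}

// A 4000*3000 capture reduced to a 1920-pixel long side.
console.log(fitWithin({ width: 4000, height: 3000 }, 1920)); // { width: 1920, height: 1440 }
```

Scaling by the longer side keeps portrait and landscape captures within the same pixel budget, which makes inference time more predictable.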

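FAQ

Q: Why does the package size change so much after AOT compilation?

A: AOT stores part of the bytecode in the package as native machine code, so a larger package is expected. full mode is 30%~50% larger than partial. If the package becomes too large, split it into HAP + HSP and lazy-load rarely used plugins.

Q: Why does OCR work on a real device but return empty results on the emulator?

A: Emulators usually have no real camera hardware, so a capture may return a simulated blank image. Use image.createPixelMap to construct test data by hand and verify with that. In addition, some emulators do not support ML Kit's hardware acceleration and fall back to CPU mode; the fallback logic is sometimes faulty, in which case deviceType: 0 has to be set explicitly.

Q: Why does using a PixelMap inside Taskpool sometimes crash?

A: As explained earlier, a PixelMap is a native object. Even if the Taskpool code only reads it, a release on the main thread leaves the worker thread accessing invalid memory. Always convert to an ArrayBuffer before passing it. If the ArrayBuffer is large (over 10 MB), process it in chunks or use the @Sendable decorator (but be careful about mutable shared state).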
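Chunking a large buffer can be sketched as follows. This is plain TypeScript with a hypothetical helper name (`splitBuffer`), and the chunk size is up to the caller. One caveat the sketch assumes: the chunks of an encoded image such as a JPEG cannot be decoded independently, so they have to be reassembled on the receiving side before decoding.

```typescript
// Splits `buffer` into consecutive copies of at most `chunkBytes` bytes each.
function splitBuffer(buffer: ArrayBuffer, chunkBytes: number): ArrayBuffer[] {
  if (chunkBytes <= 0) {
    throw new RangeError('chunkBytes must be positive');
  }
  const chunks: ArrayBuffer[] = [];
  for (let offset = 0; offset < buffer.byteLength; offset += chunkBytes) {
    // slice() clamps the end index, so the last chunk may be shorter
    chunks.push(buffer.slice(offset, offset + chunkBytes));
  }
  return chunks;
}

// Reassembles chunks produced by splitBuffer into one buffer.
function joinBuffers(chunks: ArrayBuffer[]): ArrayBuffer {
  const total = chunks.reduce((sum, c) => sum + c.byteLength, 0);
  const out = new Uint8Array(new ArrayBuffer(total));
  let offset = 0;
  for (const chunk of chunks) {
    out.set(new Uint8Array(chunk), offset);
    offset += chunk.byteLength;
  }
  return out.buffer;
}

const parts = splitBuffer(new ArrayBuffer(25), 10);
console.log(parts.map((p) => p.byteLength)); // [ 10, 10, 5 ]
```

Chunking mainly bounds the size of each individual transfer; peak memory is still at least the full image once the pieces are joined.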

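Summary

The core of on-device AI is not how powerful the model is but how reliably tasks are scheduled and resources are managed. HarmonyOS's Taskpool and AOT compilation solve most of the performance problems, but developers need to understand their limits: passing native objects across threads, matching the AOT profile to the build, and keeping versions consistent across targets. DevEco Code and DevEco CLI round out the toolchain, but the real engineering experience lies in knowing which configuration to use when, and when to work around a limitation.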