I need to run a model.tflite trained on a PC inside an Android app via TensorFlow Lite, while keeping the APK size increase to a minimum. I tried the following approaches:
- Configure the dependency in Gradle:
```kotlin
implementation("org.tensorflow:tensorflow-lite:2.16.1")
```
This pulls in the prebuilt TensorFlow Lite AAR, and the final APK grows by about 2.2 MB.
- Follow TensorFlow's official selective-build guide
https://www.tensorflow.org/lite/android/lite_build
to build a TensorFlow Lite AAR tailored to our model and integrate that into the APK; the size grows by about 1.5 MB.
Digging into the generated AAR shows that it is still, at its core, a JNI wrapper around libtensorflowlite.so, and libtensorflowlite.so contains almost all of the TensorFlow Lite core framework code, so it is bound to be large.
- We only use the basics of TensorFlow Lite (model initialization and interpreter inference) and none of its other functionality, so the smallest possible size increase should come from compiling the sources of the TensorFlow Lite classes we need directly into our own JNI library. That means taking the classes our JNI file depends on, such as
```cpp
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/model.h"
```
and pulling the corresponding .h and .cc files into our JNI build so everything is compiled together.
At first I imported the TensorFlow Lite sources into Android Studio and modified CMakeLists.txt, trying to build a self-contained JNI .so, but it kept failing.
What finally worked was moving the JNI file into the TensorFlow Lite source tree and building it with TensorFlow's own build tool, Bazel. Dropping the resulting milc_jni.so into the app's jniLibs directory then succeeded:
a. Under tensorflow/lite/, create a milc_jni/ directory containing BUILD, milc_jni.cc, custom_op_resolver.h, and custom_op_resolver.cc.
b. Based on the operators our model file model.tflite actually uses (mine needs only FULLY_CONNECTED, RELU, and LOGISTIC), define a trimmed-down resolver class:
custom_op_resolver.h
```cpp
#ifndef TENSORFLOW_LITE_CUSTOM_OP_RESOLVER_H_
#define TENSORFLOW_LITE_CUSTOM_OP_RESOLVER_H_

#include "tensorflow/lite/mutable_op_resolver.h"

namespace tflite {

// Resolver that registers only the operators our model needs.
class MinimalOpResolver : public MutableOpResolver {
 public:
  MinimalOpResolver();
};

}  // namespace tflite

#endif  // TENSORFLOW_LITE_CUSTOM_OP_RESOLVER_H_
```
custom_op_resolver.cc
```cpp
#include "tensorflow/lite/milc_jni/custom_op_resolver.h"

#include "tensorflow/lite/kernels/builtin_op_kernels.h"

namespace tflite {

MinimalOpResolver::MinimalOpResolver() {
  // Registration functions live in the tflite::ops::builtin namespace.
  AddBuiltin(BuiltinOperator_FULLY_CONNECTED,
             tflite::ops::builtin::Register_FULLY_CONNECTED());
  AddBuiltin(BuiltinOperator_RELU,
             tflite::ops::builtin::Register_RELU());
  AddBuiltin(BuiltinOperator_LOGISTIC,
             tflite::ops::builtin::Register_LOGISTIC());
}

}  // namespace tflite
```
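Aside: if you're not sure which builtin operators a model actually uses, you can list them from the flatbuffer before writing the resolver. Below is a rough host-side sketch of my own (not part of the build above); it assumes the TensorFlow Lite schema headers, including schema_utils.h, are available on the include path:
```cpp
#include <cstdio>

#include "tensorflow/lite/model.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/schema/schema_utils.h"

// Print the builtin operator names a .tflite file uses, so the
// MinimalOpResolver above can be kept in sync with the model.
int main(int argc, char** argv) {
  if (argc != 2) return 1;
  auto fb_model = tflite::FlatBufferModel::BuildFromFile(argv[1]);
  if (!fb_model) return 1;
  const auto* codes = fb_model->GetModel()->operator_codes();
  if (codes == nullptr) return 0;
  for (const tflite::OperatorCode* code : *codes) {
    std::printf("%s\n",
                tflite::EnumNameBuiltinOperator(tflite::GetBuiltinCode(code)));
  }
  return 0;
}
```
If the resolver misses an operator the model needs, InterpreterBuilder simply fails, and the JNI function below returns its error value, so a mismatch is easy to catch at runtime.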
c. Create the JNI file milc_jni.cc:
```cpp
#include <jni.h>
#include <android/log.h>

#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/model.h"
#include "tensorflow/lite/milc_jni/custom_op_resolver.h"

#define LOG_TAG "TensorFlowLiteJNI"
#define LOGI(...) __android_log_print(ANDROID_LOG_INFO, LOG_TAG, __VA_ARGS__)
#define LOGE(...) __android_log_print(ANDROID_LOG_ERROR, LOG_TAG, __VA_ARGS__)
// To strip all log output from the release build, use these instead:
// #define LOGI(...)
// #define LOGE(...)

extern "C" JNIEXPORT jfloat JNICALL
Java_com_xm_j_milc_predictJNI(JNIEnv* env, jobject /* this */,
                              jstring modelPath, jfloatArray inputArray) {
  const char* modelPathStr = env->GetStringUTFChars(modelPath, nullptr);

  // Fetch the input array.
  jfloat* inputElements = env->GetFloatArrayElements(inputArray, nullptr);
  jsize inputLength = env->GetArrayLength(inputArray);
  if (inputLength != 31) {
    LOGE("Input array length must be 31");
    env->ReleaseStringUTFChars(modelPath, modelPathStr);
    env->ReleaseFloatArrayElements(inputArray, inputElements, JNI_ABORT);
    return -1.0f;
  }

  // Load the TensorFlow Lite model.
  std::unique_ptr<tflite::FlatBufferModel> model =
      tflite::FlatBufferModel::BuildFromFile(modelPathStr);
  if (!model) {
    LOGE("Failed to load model from %s", modelPathStr);
    env->ReleaseStringUTFChars(modelPath, modelPathStr);
    env->ReleaseFloatArrayElements(inputArray, inputElements, JNI_ABORT);
    return -1.0f;
  }

  // Build the interpreter with the trimmed resolver instead of the
  // full tflite::ops::builtin::BuiltinOpResolver.
  tflite::MinimalOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  if (!interpreter) {
    LOGE("Failed to create interpreter");
    env->ReleaseStringUTFChars(modelPath, modelPathStr);
    env->ReleaseFloatArrayElements(inputArray, inputElements, JNI_ABORT);
    return -1.0f;
  }

  // Allocate tensors.
  if (interpreter->AllocateTensors() != kTfLiteOk) {
    LOGE("Failed to allocate tensors");
    env->ReleaseStringUTFChars(modelPath, modelPathStr);
    env->ReleaseFloatArrayElements(inputArray, inputElements, JNI_ABORT);
    return -1.0f;
  }

  // Copy the input into the input tensor.
  float* input = interpreter->typed_input_tensor<float>(0);
  for (int i = 0; i < inputLength; ++i) {
    input[i] = inputElements[i];
  }

  // Run inference.
  if (interpreter->Invoke() != kTfLiteOk) {
    LOGE("Failed to invoke interpreter");
    env->ReleaseStringUTFChars(modelPath, modelPathStr);
    env->ReleaseFloatArrayElements(inputArray, inputElements, JNI_ABORT);
    return -1.0f;
  }

  // Read the output tensor.
  float* outputTensor = interpreter->typed_output_tensor<float>(0);

  // Release JNI resources.
  env->ReleaseStringUTFChars(modelPath, modelPathStr);
  env->ReleaseFloatArrayElements(inputArray, inputElements, JNI_ABORT);
  return outputTensor[0];  // The model outputs a single scalar.
}
```
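One thing to note about the function above: it reloads the model and rebuilds the interpreter on every call, which was acceptable for my usage. If predictJNI sits in a hot path, caching the interpreter is the obvious next step; a minimal sketch (my own addition, not thread-safe as written):
```cpp
// Sketch: keep the model and interpreter alive across calls so we pay
// the file-load and AllocateTensors cost only once.
static std::unique_ptr<tflite::FlatBufferModel> g_model;
static std::unique_ptr<tflite::Interpreter> g_interpreter;

static tflite::Interpreter* GetCachedInterpreter(const char* model_path) {
  if (!g_interpreter) {
    g_model = tflite::FlatBufferModel::BuildFromFile(model_path);
    if (!g_model) return nullptr;
    tflite::MinimalOpResolver resolver;
    tflite::InterpreterBuilder(*g_model, resolver)(&g_interpreter);
    if (g_interpreter && g_interpreter->AllocateTensors() != kTfLiteOk) {
      g_interpreter.reset();
    }
  }
  return g_interpreter.get();
}
```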
d. Create the BUILD file:
```starlark
# Custom op resolver (contains only the required operators).
cc_library(
    name = "custom_op_resolver",
    srcs = ["custom_op_resolver.cc"],
    hdrs = ["custom_op_resolver.h"],
    deps = [
        "//tensorflow/lite/kernels:builtin_ops",
    ],
)

cc_binary(
    name = "milc_jni.so",
    srcs = ["milc_jni.cc"],
    linkshared = True,
    linkstatic = True,  # statically link all dependencies
    deps = [
        ":custom_op_resolver",
        "//tensorflow/lite:framework",
        "//tensorflow/lite/kernels:builtin_ops",
        "@flatbuffers//:flatbuffers",
    ],
    copts = [
        "-Oz",
        "-flto=thin",
        "-ffunction-sections",
        "-fdata-sections",
        "-fvisibility=hidden",
        "-fvisibility-inlines-hidden",
        "-DFLATBUFFERS_RELEASE",
        "-DTF_LITE_STRIP_ERROR_STRINGS=1",
        "-DNDEBUG",
        "-DFORCE_MINIMAL_LOGGING",
        "-fno-exceptions",
        "-fno-rtti",
        "-fno-unwind-tables",
        "-fno-asynchronous-unwind-tables",
        "-ffreestanding",
    ],
    linkopts = [
        "-flto=thin",
        "-Wl,--gc-sections",
        "-Wl,--exclude-libs,ALL",
        "-s",
        "-Wl,--as-needed",
        "-Wl,-z,norelro",
        "-Wl,--build-id=none",  # drop the build ID
        "-Wl,--strip-all",  # strip all symbols
        "-nostdlib",
        "-lc",
        "-Wl,--hash-style=gnu",  # smaller hash table
        "-Wl,--compress-debug-sections=zlib",  # compress debug sections
    ],
    features = [
        "-layering_check",
    ],
)
```
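A detail worth calling out: the BUILD file compiles everything with -fvisibility=hidden and strips symbols at link time, yet the Android runtime must still find Java_com_xm_j_milc_predictJNI via dlsym. That works because the NDK's jni.h gives JNIEXPORT default visibility, roughly (abridged, quoted from memory of the NDK header):
```cpp
// JNIEXPORT re-exposes just the JNI entry points: functions marked with
// it stay in the dynamic symbol table even under -fvisibility=hidden and
// -Wl,--strip-all, while every other symbol is hidden and becomes
// eligible for -Wl,--gc-sections removal.
#define JNIEXPORT __attribute__((visibility("default")))
#define JNICALL
```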
e. In the TensorFlow source tree, initialize the build environment (Android NDK and so on), then run:
```bash
bazel build -c opt --config=android_arm64 \
  --copt="-DFORCE_DISABLE_ALL_OPS" \
  --copt="-Os" \
  --copt="-fomit-frame-pointer" \
  --copt="-ffunction-sections" \
  --copt="-fdata-sections" \
  --copt="-fvisibility=hidden" \
  --copt="-g0" \
  --copt="-DFLATBUFFERS_RELEASE" \
  --linkopt="-Wl,--gc-sections" \
  --linkopt="-Wl,--exclude-libs,ALL" \
  --linkopt="-s" \
  --define=tflite_with_xnnpack=false \
  //tensorflow/lite/milc_jni:milc_jni.so
```
This produces a milc_jni.so of roughly 500 KB. It runs standalone, with no dependency on libtensorflowlite.so, so the APK size only grows by about 500 KB.
f. Compress and optimize the generated milc_jni.so further:
```bash
sudo apt-get install upx
upx --android-shlib --best --lzma milc_jni.so -o milc_jni_upx.so
```
The final milc_jni_upx.so is about 200 KB, so the APK size only grows by about 200 KB. (A UPX-packed .so decompresses itself at load time, so it's worth verifying that System.loadLibrary still works across your target devices and Android versions.)