OPI4A, object detection, mask detection, MNN, YOLOX

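Some time ago I used the Python YOLOX reimplementation by bubbling to train on my own mask dataset, which produced an h5 file. I converted that to a pb file, and then used MNNConvert, the converter that ships with Alibaba's MNN, to turn it into an mnn file.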
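
The last step of that pipeline (pb to mnn) can be sketched roughly as below. This is only a sketch: the file names `yolox_mask.pb` and `yolox_mask.mnn` and the helper names are hypothetical, and it assumes the `MNNConvert` binary is on your PATH. The flags are the ones the MNN documentation lists for TensorFlow frozen graphs.

```python
import subprocess


def build_mnnconvert_cmd(pb_path, mnn_path, biz_code="MNN"):
    """Build the argument list for converting a TensorFlow pb file to mnn.

    pb_path and mnn_path are placeholder paths chosen by the caller.
    """
    return [
        "MNNConvert",
        "-f", "TF",                    # source framework: TensorFlow frozen graph
        "--modelFile", str(pb_path),   # input pb file
        "--MNNModel", str(mnn_path),   # output mnn file
        "--bizCode", biz_code,         # free-form tag stored in the model
    ]


def convert(pb_path, mnn_path):
    """Run the conversion; raises CalledProcessError if MNNConvert fails."""
    subprocess.run(build_mnnconvert_cmd(pb_path, mnn_path), check=True)


# Example call (hypothetical file names, requires MNNConvert to be installed):
# convert("yolox_mask.pb", "yolox_mask.mnn")
```

Keeping the command in a list rather than a single string avoids shell quoting problems if the paths contain spaces.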

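In the end I got mask detection running on a Raspberry Pi 4B with MNN-accelerated inference, deciding whether a person walking past is wearing a mask. Because the results had to be rendered on screen, the frame rate stayed at around 20 to 25 FPS; with rendering turned off it reached 30 FPS.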
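
For reference, FPS figures like these can be measured with a small sliding-window counter. The sketch below is not the code I used back then; the class name `FpsMeter` and the window size are my own choices for illustration.

```python
import time
from collections import deque


class FpsMeter:
    """Average FPS over the most recent `window` frame timestamps."""

    def __init__(self, window=30):
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the current FPS estimate.

        `now` can be passed explicitly (in seconds) for testing;
        otherwise a monotonic clock is used.
        """
        self.stamps.append(time.perf_counter() if now is None else now)
        if len(self.stamps) < 2:
            return 0.0
        elapsed = self.stamps[-1] - self.stamps[0]
        # N timestamps span N - 1 frame intervals.
        return (len(self.stamps) - 1) / elapsed if elapsed > 0 else 0.0


# Typical use inside a capture loop (inference and rendering omitted):
# meter = FpsMeter()
# while True:
#     ...grab frame, run inference, optionally draw boxes...
#     fps = meter.tick()
```

Calling `tick()` once per processed frame, with and without the drawing step, gives the two numbers being compared.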

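On the Orange Pi OPI4A, I flashed the Ubuntu image from the official site, created a virtual environment with conda, and installed Alibaba's MNN. At the time I used mnn 1.1.3, which had to be compiled by hand: after changing a few configuration settings I got an mnn 1.1.3 package, installed it with pip, and ran the YOLOX code. With the same input and output, it reached as high as 40 FPS. (Nowadays a plain pip install MNN should work as well.)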

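(The user manual currently does not say which cameras camera1 and camera2 support, and my imx219 would not work either (a voltage issue?), so I used a cheap USB camera as the input.)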

The 8-core OPI4A beats the 4-core Raspberry Pi 4B hands down.

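The MNN documentation includes an official example. The key steps are creating an interpreter, creating a session, setting up the input and output, and running inference. All of this assumes you have already converted your model into an mnn model file. The official site explains how to do that too: it takes a single command, using a tool called MNNConvert.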


import MNN
import MNN.cv as cv
import MNN.numpy as np
import MNN.expr as expr

# Create the interpreter
interpreter = MNN.Interpreter("mobilenet_v1.mnn")
# Create the session
config = {}
config['precision'] = 'low'
config['backend'] = 'CPU'
config['thread'] = 4
session = interpreter.createSession(config)
# Get the session's input and output tensors
input_tensor = interpreter.getSessionInput(session)
output_tensor = interpreter.getSessionOutput(session)

# Read the image
image = cv.imread('cat.jpg')

dst_height = dst_width = 224
# Process the first image with ImageProcess: convert it to size=(224, 224), dtype=float32, and store it in input_data1
image_processer = MNN.CVImageProcess({'sourceFormat': MNN.CV_ImageFormat_BGR,
                                      'destFormat': MNN.CV_ImageFormat_BGR,
                                      'mean': (103.94, 116.78, 123.68, 0.0),
                                      'filterType': MNN.CV_Filter_BILINEAL,
                                      'normal': (0.017, 0.017, 0.017, 0.0)})
image_data = image.ptr
src_height, src_width, channel = image.shape
input_data1 = MNN.Tensor((1, dst_height, dst_width, channel), MNN.Halide_Type_Float, MNN.Tensor_DimensionType_Tensorflow)
# Set the image transformation matrix
matrix = MNN.CVMatrix()
x_scale = src_width / dst_width
y_scale = src_height / dst_height
matrix.setScale(x_scale, y_scale)
image_processer.setMatrix(matrix)
image_processer.convert(image_data, src_width, src_height, 0, input_data1)

# Process the second image with the cv module: convert it to size=(224, 224), dtype=float32, and store it in input_data2
image = cv.imread('TestMe.jpg')
image = cv.resize(image, (224, 224), mean=[103.94, 116.78, 123.68], norm=[0.017, 0.017, 0.017])
input_data2 = np.expand_dims(image, 0) # [224, 224, 3] -> [1, 224, 224, 3]

# Merge the two images into one batch and store it in input_data
input_data1 = expr.const(input_data1.getHost(), input_data1.getShape(), expr.NHWC) # Tensor -> Var
input_data = np.concatenate([input_data1, input_data2])  # [2, 224, 224, 3]
input_data = MNN.Tensor(input_data) # Var -> Tensor

# To demonstrate multi-image input, resize the input to [2, 3, 224, 224]
interpreter.resizeTensor(input_tensor, (2, 3, 224, 224))
# Recompute shapes and reallocate memory
interpreter.resizeSession(session)

# Copy the data into the input Tensor
input_tensor.copyFrom(input_data)

# Run inference on the session
interpreter.runSession(session)

# Copy the data out of the output Tensor
output_data = MNN.Tensor(output_tensor.getShape(), MNN.Halide_Type_Float, MNN.Tensor_DimensionType_Caffe)
output_tensor.copyToHostTensor(output_data)

# Print the classification result: 282 is cat, 385 is elephant
output_var = expr.const(output_data.getHost(), [2, 1001])
print("output belong to class: {}".format(np.argmax(output_var, 1)))
# output belong to class: array([282, 385], dtype=int32)
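
To make the 'mean' and 'normal' settings in the example less opaque: as far as I understand the MNN docs, each pixel is transformed as (pixel - mean) * normal, per channel. The plain NumPy sketch below reproduces that arithmetic and the batching step. It uses regular NumPy rather than MNN.numpy, skips the resize, and the function names are hypothetical.

```python
import numpy as np

# Per-channel values copied from the example above (BGR order).
MEAN = np.array([103.94, 116.78, 123.68], dtype=np.float32)
NORMAL = np.array([0.017, 0.017, 0.017], dtype=np.float32)


def normalize_bgr(image_u8):
    """Apply (pixel - mean) * normal to an HxWx3 uint8 BGR image."""
    return (image_u8.astype(np.float32) - MEAN) * NORMAL


def to_batch(images):
    """Stack already-resized HxWx3 images into one [N, H, W, 3] float32 batch."""
    return np.stack([normalize_bgr(img) for img in images], axis=0)


# Two dummy 224x224 grey images stand in for cat.jpg and TestMe.jpg.
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
batch = to_batch([dummy, dummy])
print(batch.shape)  # (2, 224, 224, 3)
```

This only mirrors the preprocessing arithmetic; on the board the actual work is still done by CVImageProcess or MNN.cv as shown above.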