Detecting loaded (full) and unloaded (empty) trucks with Huawei Atlas

This tutorial explores Huawei Atlas's ACL-based inference workflow. The end goal is to use a camera mounted above a coal-mine weighbridge to recognize whether a truck is loaded (full) or unloaded (empty). At its core this is a simple detection problem.

The overall exploration was rocky, though. Tianxiaomo's code can run inference with the original YOLOv4 model and can convert it to ONNX, but the training code seems buggy to me: the loss stays very large and no detection boxes are produced. Also, its output dimensions differ from those in the Atlas tutorial. I solved the first problem (broken training) by training with the original darknet framework, and the second (mismatched output dimensions) by re-implementing the Atlas post-processing code.

Braving wind and rain, the rainbow finally shows; only hard cultivation brings the harvest home.

Darknet dataset preparation and config file changes:

(1) The dataset was annotated in VOC format with the labelimg tool; 1,087 images were labeled in total.

(2) The dataset is laid out as follows,

Here VOC2025 is the name I gave my own dataset; you can pick another. Annotations holds the XML files; ImageSets/Main holds train.txt and val.txt, which list image names only, one per line; JPEGImages holds the images; labels holds the txt files generated from the XML. A sketch of the tree is shown below.
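Sketched as a directory tree (only VOC2025 is my naming; the rest follows the standard VOC layout):

VOCdevkit/
└── VOC2025/
    ├── Annotations/        # XML annotations from labelimg
    ├── ImageSets/
    │   └── Main/
    │       ├── train.txt   # image names only, one per line
    │       └── val.txt
    ├── JPEGImages/         # the images
    └── labels/             # txt labels generated from the XML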

(3) Edit voc_label.py under scripts, changing the dataset directory to your own:

# first few lines
sets=[('2025', 'train'), ('2025', 'val')]
classes = ["full", "empty"]
# last 2 lines
os.system("cat 2025_train.txt 2025_val.txt > train.txt")
os.system("cat 2025_train.txt 2025_val.txt > train.all.txt")

Then run

python3 scripts/voc_label.py

This generates the labels folder with one txt annotation per image, plus train.txt and train.all.txt.

Here, train.txt stores the path plus the image name, one per line:

/data/jxl/darknet/VOCdevkit/VOC2025/JPEGImages/3743_01467.jpg
/data/jxl/darknet/VOCdevkit/VOC2025/JPEGImages/3743_01468.jpg
/data/jxl/darknet/VOCdevkit/VOC2025/JPEGImages/3743_01469.jpg
/data/jxl/darknet/VOCdevkit/VOC2025/JPEGImages/3743_01559.jpg

The labels folder holds one txt file per image, each line storing the class id followed by the normalized box values (class x_center y_center width height):

0 0.6794407894736842 0.5394736842105263 0.5516447368421052 0.9195906432748537
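The normalization behind these numbers is essentially the following (a simplified sketch of voc_label.py's convert(); box is (xmin, xmax, ymin, ymax) read from the VOC XML):

def convert(size, box):
    # size = (img_width, img_height); box = (xmin, xmax, ymin, ymax) from the VOC XML
    dw, dh = 1.0 / size[0], 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 * dw  # normalized box-center x
    y = (box[2] + box[3]) / 2.0 * dh  # normalized box-center y
    w = (box[1] - box[0]) * dw        # normalized box width
    h = (box[3] - box[2]) * dh        # normalized box height
    return (x, y, w, h)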

(4) Edit cfg/fullempty.data

classes= 2
train  = ./VOCdevkit/VOC2025/ImageSets/Main/train.txt
valid  = ./VOCdevkit/VOC2025/ImageSets/Main/val.txt
names = ./data/fullempty.names
backup = ./pjreddie/backup/

classes is the number of classes to train

train points to the training list train.txt

valid points to the validation list val.txt

names points to fullempty.names, which lists your own target class names

backup is where the trained .weights files are saved

(5) Edit cfg/yolov4-fullempty.cfg

Set every classes= to 2.

Also change filters in the last convolutional layer before each [yolo] block; the num parameter reflects that YOLOv4 has three branches with three anchors each.

Here, filters = num_anchors_per_branch × (classes + coords + 1) = 3 × (2 + 4 + 1) = 21, since I have 2 classes. A sketch of one edited head follows below.
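As a sketch (not the full file), the tail of one of the three detection heads should end up looking like this; the anchors shown are the YOLOv4 defaults, and only filters= and classes= are the edits described above:

[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear

[yolo]
mask = 6,7,8
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=2
num=9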

(6) Edit data/fullempty.names

full
empty

Darknet model training:

./darknet detector train ./cfg/fullempty.data ./cfg/yolov4-fullempty.cfg  ./yolov4.weights -clear

Testing the darknet .weights model:

./darknet detect  ./cfg/yolov4-fullempty.cfg  ./pjreddie/backup/yolov4-fullempty_last.weights  ./VOCdevkit/VOC2025/JPEGImages/2793_00847.jpg

PyTorch code config changes:

#cfg.py

Cfg.use_darknet_cfg = True
Cfg.cfgfile = os.path.join(_BASE_DIR, 'cfg', 'yolov4-custom.cfg')

#cfg/yolov4-custom.cfg — edited the same way (classes and filters) as the darknet cfg above.

PyTorch code bug fixes:

#train.py, line 211,

# change
pred_ious = bboxes_iou(pred[b].view(-1, 4), truth_box, xyxy=False)
# to
pred_ious = bboxes_iou(pred[b].contiguous().view(-1, 4), truth_box, xyxy=False)

In dataset.py, the get_image_id function: since my images are named Id_id.jpg, I concatenate the two ids into the final id.

def get_image_id(filename):
    # images are named Id_id.jpg; join the two parts into a single integer id
    parts = filename.split('.')[0].split('_')
    id = int(parts[0] + parts[1])
    return id
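For example, with the image names shown earlier:

>>> get_image_id('3743_01467.jpg')
374301467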

Testing the .weights model with the PyTorch code:

python3 demo.py -cfgfile ./cfg/yolov4-custom.cfg -weightfile ./yolov4-fullempty_last.weights -imgfile ./full_empty_dataset/images/2793_00847.jpg -torch False

Converting the .weights model to ONNX:

python3 demo_darknet2onnx.py ./cfg/yolov4-custom.cfg ./data/full_empty.names ./yolov4-fullempty_last.weights ./full_empty_dataset/images/2793_00847.jpg 1

Converting the ONNX model to an om model (--framework=5 tells ATC the source model is ONNX; --soc_version must match the target chip, here Ascend 310P3):

atc --model=./yolov4_1_3_608_608_static.onnx --framework=5 --output=yolov4_bs1 --input_shape="input:1,3,608,608" --soc_version=Ascend310P3 --input_format=NCHW

Writing the Atlas inference code:

#yolov4.py

import sys
sys.path.append("./common/acllite")
import os
import numpy as np
import acl
import cv2
import time
from acllite_model import AclLiteModel
from acllite_resource import AclLiteResource

from utils import post_processing, plot_boxes_cv2

MODEL_PATH = "./model/yolov4_bs1.om"

#ACL resource initialization
acl_resource = AclLiteResource()
acl_resource.init()
#load model
model = AclLiteModel(MODEL_PATH)
 

class YOLOV4(object):
    def __init__(self):
        self.MODEL_PATH = MODEL_PATH
        self.MODEL_WIDTH = 608
        self.MODEL_HEIGHT = 608

        self.class_names= ['full', 'empty']

        self.model = model
 
    def preprocess(self, bgr_img):
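        # resize to the 608x608 model input, convert BGR->RGB, scale to [0, 1],
        # and reorder HWC->CHW so the array matches the om model's NCHW input layout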
        sized = cv2.resize(bgr_img.copy(), (self.MODEL_WIDTH, self.MODEL_HEIGHT))
        sized = cv2.cvtColor(sized, cv2.COLOR_BGR2RGB)
    
        new_image = sized.astype(np.float32)
        new_image = new_image / 255.0
        new_image = new_image.transpose(2, 0, 1).copy()
    
        return new_image


    def process(self, bgr_img):
        height, width = bgr_img.shape[:2]
        #preprocess
        data = self.preprocess(bgr_img)#(3, 608, 608)

        #Send into model inference
        result_list = self.model.execute([data,])    
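        # result_list[0] holds boxes [1, num, 1, 4]; result_list[1] holds confs [1, num, num_classes]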
        #Process inference results

        conf_thresh, nms_thresh = 0.4, 0.6
        boxes = post_processing(conf_thresh, nms_thresh, result_list, height, width)
        return boxes

    def draw(self, bgr_img, boxes):
        drawed_img = plot_boxes_cv2(bgr_img, boxes[0], class_names=self.class_names)
        return drawed_img

def test_image():
    yolov4 = YOLOV4()
    img_name = "./data/3553_00173.jpg"

    #read image
    bgr_img = cv2.imread(img_name)

    t1 = time.time()
    boxes = yolov4.process(bgr_img)
    t2 = time.time()
    drawed_img = yolov4.draw(bgr_img, boxes)
    t3 = time.time()
    print("result = ", len(boxes[0]), boxes, t2-t1, t3-t2)

    cv2.imwrite("out.jpg", drawed_img)

if __name__ == '__main__':
    test_image()

#utils.py

import sys
import os
import time
import math
import numpy as np

import itertools
import struct  # get_image_size
import imghdr  # get_image_size


def sigmoid(x):
    return 1.0 / (np.exp(-x) + 1.)


def softmax(x):
    x = np.exp(x - np.expand_dims(np.max(x, axis=1), axis=1))
    x = x / np.expand_dims(x.sum(axis=1), axis=1)
    return x


def bbox_iou(box1, box2, x1y1x2y2=True):
    if x1y1x2y2:
        mx = min(box1[0], box2[0])
        Mx = max(box1[2], box2[2])
        my = min(box1[1], box2[1])
        My = max(box1[3], box2[3])
        w1 = box1[2] - box1[0]
        h1 = box1[3] - box1[1]
        w2 = box2[2] - box2[0]
        h2 = box2[3] - box2[1]
    else:
        w1 = box1[2]
        h1 = box1[3]
        w2 = box2[2]
        h2 = box2[3]

        mx = min(box1[0], box2[0])
        Mx = max(box1[0] + w1, box2[0] + w2)
        my = min(box1[1], box2[1])
        My = max(box1[1] + h1, box2[1] + h2)
    uw = Mx - mx
    uh = My - my
    cw = w1 + w2 - uw
    ch = h1 + h2 - uh
    carea = 0
    if cw <= 0 or ch <= 0:
        return 0.0

    area1 = w1 * h1
    area2 = w2 * h2
    carea = cw * ch
    uarea = area1 + area2 - carea
    return carea / uarea


def nms_cpu(boxes, confs, nms_thresh=0.5, min_mode=False):
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    areas = (x2 - x1) * (y2 - y1)
    order = confs.argsort()[::-1]

    keep = []
    while order.size > 0:
        idx_self = order[0]
        idx_other = order[1:]

        keep.append(idx_self)

        xx1 = np.maximum(x1[idx_self], x1[idx_other])
        yy1 = np.maximum(y1[idx_self], y1[idx_other])
        xx2 = np.minimum(x2[idx_self], x2[idx_other])
        yy2 = np.minimum(y2[idx_self], y2[idx_other])

        w = np.maximum(0.0, xx2 - xx1)
        h = np.maximum(0.0, yy2 - yy1)
        inter = w * h

        if min_mode:
            over = inter / np.minimum(areas[order[0]], areas[order[1:]])
        else:
            over = inter / (areas[order[0]] + areas[order[1:]] - inter)

        inds = np.where(over <= nms_thresh)[0]
        order = order[inds + 1]
    
    return np.array(keep)



def plot_boxes_cv2(img, boxes, class_names=None, color=None):
    import cv2
    img = np.copy(img)
    colors = np.array([[1, 0, 1], [0, 0, 1], [0, 1, 1], [0, 1, 0], [1, 1, 0], [1, 0, 0]], dtype=np.float32)

    def get_color(c, x, max_val):
        ratio = float(x) / max_val * 5
        i = int(math.floor(ratio))
        j = int(math.ceil(ratio))
        ratio = ratio - i
        r = (1 - ratio) * colors[i][c] + ratio * colors[j][c]
        return int(r * 255)

    width = img.shape[1]
    height = img.shape[0]
    for i in range(len(boxes)):
        box = boxes[i]
        x1 = int(box[0])
        y1 = int(box[1])
        x2 = int(box[2])
        y2 = int(box[3])
 
        bbox_thick = int(0.6 * (height + width) / 600)
        if color:
            rgb = color
        else:
            rgb = (255, 0, 0)
        if len(box) >= 7 and class_names:
            cls_conf = box[5]
            cls_id = box[6]
            print('%s: %f' % (class_names[cls_id], cls_conf))
            classes = len(class_names)
            offset = cls_id * 123457 % classes
            red = get_color(2, offset, classes)
            green = get_color(1, offset, classes)
            blue = get_color(0, offset, classes)
            if color is None:
                rgb = (red, green, blue)
            msg = str(class_names[cls_id])+" "+str(round(cls_conf,3))
            t_size = cv2.getTextSize(msg, 0, 0.7, thickness=bbox_thick // 2)[0]
            c1, c2 = (x1,y1), (x2, y2)
            c3 = (c1[0] + t_size[0], c1[1] - t_size[1] - 3)

            cv2.rectangle(img, (x1,y1), (np.int32(c3[0]), np.int32(c3[1])), rgb, -1)
            img = cv2.putText(img, msg, (c1[0], np.int32(c1[1] - 2)), cv2.FONT_HERSHEY_SIMPLEX,0.7, (0,0,0), bbox_thick//2,lineType=cv2.LINE_AA)
            #cv2.rectangle(img, (x1,y1), (np.float32(c3[0]), np.float32(c3[1])), rgb, -1)
            #img = cv2.putText(img, msg, (c1[0], np.float32(c1[1] - 2)), cv2.FONT_HERSHEY_SIMPLEX,0.7, (0,0,0), bbox_thick//2,lineType=cv2.LINE_AA)
        
        img = cv2.rectangle(img, (x1, y1), (x2, y2), rgb, bbox_thick)
    return img


def read_truths(lab_path):
    if not os.path.exists(lab_path):
        return np.array([])
    if os.path.getsize(lab_path):
        truths = np.loadtxt(lab_path)
        truths = truths.reshape(truths.size // 5, 5)  # to avoid single truth problem (// for Python 3 integer division)
        return truths
    else:
        return np.array([])


def load_class_names(namesfile):
    class_names = []
    with open(namesfile, 'r') as fp:
        lines = fp.readlines()
    for line in lines:
        line = line.rstrip()
        class_names.append(line)
    return class_names



def post_processing(conf_thresh, nms_thresh, output, height, width):

    # anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]
    # num_anchors = 9
    # anchor_masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    # strides = [8, 16, 32]
    # anchor_step = len(anchors) // num_anchors

    # [batch, num, 1, 4]
    box_array = output[0]
    # [batch, num, num_classes]
    confs = output[1]

    t1 = time.time()

    if type(box_array).__name__ != 'ndarray':
        box_array = box_array.cpu().detach().numpy()
        confs = confs.cpu().detach().numpy()

    num_classes = confs.shape[2]

    # [batch, num, 4]
    box_array = box_array[:, :, 0]

    # [batch, num, num_classes] --> [batch, num]
    max_conf = np.max(confs, axis=2)
    max_id = np.argmax(confs, axis=2)

    t2 = time.time()

    bboxes_batch = []
    for i in range(box_array.shape[0]):
       
        argwhere = max_conf[i] > conf_thresh
        l_box_array = box_array[i, argwhere, :]
        l_max_conf = max_conf[i, argwhere]
        l_max_id = max_id[i, argwhere]

        bboxes = []
        # nms for each class
        for j in range(num_classes):

            cls_argwhere = l_max_id == j
            ll_box_array = l_box_array[cls_argwhere, :]
            ll_max_conf = l_max_conf[cls_argwhere]
            ll_max_id = l_max_id[cls_argwhere]

            keep = nms_cpu(ll_box_array, ll_max_conf, nms_thresh)
            
            if (keep.size > 0):
                ll_box_array = ll_box_array[keep, :]
                ll_max_conf = ll_max_conf[keep]
                ll_max_id = ll_max_id[keep]

                for k in range(ll_box_array.shape[0]):
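                    # each box: [x1, y1, x2, y2, conf, conf, cls_id] scaled back to
                    # original image pixels; conf is stored twice so plot_boxes_cv2
                    # can read cls_conf from box[5] and cls_id from box[6]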
                    bboxes.append([ll_box_array[k, 0]*width, ll_box_array[k, 1]*height, ll_box_array[k, 2]*width, ll_box_array[k, 3]*height, ll_max_conf[k], ll_max_conf[k], ll_max_id[k]])
        
        bboxes_batch.append(bboxes)

    t3 = time.time()
    return bboxes_batch

Testing the Atlas inference code:

python3 yolov4.py

Video test:
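This section was left empty in my notes. As a minimal sketch (the video path and codec below are placeholders, not from the original project), the single-image pipeline can be reused frame by frame:

#test_video.py
import cv2
from yolov4 import YOLOV4

def test_video(video_path="./data/test.mp4", out_path="out.avi"):
    yolov4 = YOLOV4()
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*'XVID'), fps, (width, height))
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        boxes = yolov4.process(frame)  # same preprocess/inference/post_processing as single images
        writer.write(yolov4.draw(frame, boxes))
    cap.release()
    writer.release()

if __name__ == '__main__':
    test_video()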

Reference links:

https://github.com/AlexeyAB/darknet

https://github.com/Tianxiaomo/pytorch-YOLOv4

CANN samples: https://gitee.com/ascend/samples
