Recognizing loaded (full) and empty (unloaded) trucks with Huawei Atlas

This tutorial is mainly an attempt to work out Huawei Atlas's ACL-based inference mode. The end goal is to use the camera above a coal-mine weighbridge to recognize whether a truck is loaded (full) or empty (unloaded). At heart this is a simple detection problem.

The exploration was fairly bumpy, though. Tianxiaomo's code can run inference with the original YOLOv4 model and can export it to ONNX, but its training code seems buggy: the loss stays very large and no detection boxes are produced. On top of that, the output dimensions differ from those in the Atlas tutorial. I solved the first problem (training not working) by training with the original darknet instead, and the second (output dimensions differing from Atlas) by re-implementing the Atlas post-processing code.

Through wind and rain, the rainbow finally shows; only hard ploughing brings in the harvest.

Darknet dataset preparation and configuration file changes:

(1) The dataset was annotated in VOC format with the labelimg tool; 1,087 images were labeled in total.

(2) The dataset is laid out as follows (sketched below).

Here VOC2025 is the name I gave my own dataset; you can pick a different one. Annotations holds the XML files; ImageSets/Main holds train.txt and val.txt, which list image names only, one per line; JPEGImages holds the images; labels holds the txt files generated from the XML.
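Reconstructed from the description above, the tree looks roughly like this (only the directories named above come from the original; the nesting follows the standard VOC convention):

VOCdevkit/
  VOC2025/
    Annotations/        # XML annotations from labelimg
    ImageSets/
      Main/
        train.txt       # image names only, one per line
        val.txt
    JPEGImages/         # the images
    labels/             # txt labels generated from the XML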

(3) Modify scripts/voc_label.py, changing the dataset directories to your own:

# first few lines
sets=[('2025', 'train'), ('2025', 'val')]
classes = ["full", "empty"]
# last 2 lines
os.system("cat 2025_train.txt 2025_val.txt > train.txt")
os.system("cat 2025_train.txt 2025_val.txt > train.all.txt")

Then run

python3 scripts/voc_label.py

This generates the labels folder and the txt annotations inside it, along with train.txt and train.all.txt.

train.txt stores the path plus the image name, one per line:

/data/jxl/darknet/VOCdevkit/VOC2025/JPEGImages/3743_01467.jpg
/data/jxl/darknet/VOCdevkit/VOC2025/JPEGImages/3743_01468.jpg
/data/jxl/darknet/VOCdevkit/VOC2025/JPEGImages/3743_01469.jpg
/data/jxl/darknet/VOCdevkit/VOC2025/JPEGImages/3743_01559.jpg

The labels folder holds one txt file per image, each line storing the class id followed by the normalized box coordinates (class x_center y_center width height; the helper that produces these values is shown after the example):

0 0.6794407894736842 0.5394736842105263 0.5516447368421052 0.9195906432748537
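For reference, these numbers come from the convert() helper in the stock scripts/voc_label.py, which maps a VOC pixel box (xmin, xmax, ymin, ymax) to the normalized (x_center, y_center, w, h) that darknet expects:

def convert(size, box):
    # size = (image_width, image_height); box = (xmin, xmax, ymin, ymax) in pixels
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0   # box center, in pixels
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]           # box width/height, in pixels
    h = box[3] - box[2]
    return (x * dw, y * dh, w * dw, h * dh)  # normalized to [0, 1]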

(4) Modify cfg/fullempty.data

classes= 2
train  = ./VOCdevkit/VOC2025/ImageSets/Main/train.txt
valid  = ./VOCdevkit/VOC2025/ImageSets/Main/val.txt
names = ./data/fullempty.names
backup = ./pjreddie/backup/

classes is the number of classes to train

train points to the training list train.txt

valid points to the validation list val.txt

names points to fullempty.names, which lists the names of your own targets

backup is the directory where the trained weights are saved

(5) Modify cfg/yolov4-fullempty.cfg

Set classes=2 in every [yolo] layer.

Also change filters in the last [convolutional] layer before each [yolo] layer; the factor of 3 comes from the num/mask setup, since yolov4 has 3 detection branches with 3 anchors per branch (see the snippet below).

Here filters = 3 × (classes + coords + 1) = 3 × (2 + 4 + 1) = 21, since I have 2 classes.
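For reference, one of the three heads would look roughly like this after the edit (the anchors shown are the stock yolov4 values, the same ones commented out in utils.py below; all other keys stay as in the stock cfg):

[convolutional]
size=1
stride=1
pad=1
filters=21        # 3 * (classes + coords + 1) = 3 * (2 + 4 + 1)
activation=linear

[yolo]
mask = 0,1,2
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=2
num=9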

(6) Modify data/fullempty.names

full
empty

Darknet model training:

./darknet detector train ./cfg/fullempty.data ./cfg/yolov4-fullempty.cfg  ./yolov4.weights -clear

Testing the darknet .weights model:

./darknet detect  ./cfg/yolov4-fullempty.cfg  ./pjreddie/backup/yolov4-fullempty_last.weights  ./VOCdevkit/VOC2025/JPEGImages/2793_00847.jpg

(Note: ./darknet detect is a shorthand that reads class names from cfg/coco.data, so the printed labels can be wrong on a custom dataset; ./darknet detector test ./cfg/fullempty.data ./cfg/yolov4-fullempty.cfg <weights> <image> uses the names from fullempty.data.)

PyTorch code configuration changes:

#cfg.py

Cfg.use_darknet_cfg = True
Cfg.cfgfile = os.path.join(_BASE_DIR, 'cfg', 'yolov4-custom.cfg')

#cfg/yolov4-custom.cfg — apply the same classes/filters changes here as in the darknet cfg above

PyTorch code bug fixes:

# train.py, line 211,

pred_ious = bboxes_iou(pred[b].view(-1, 4), truth_box, xyxy=False)

change to

pred_ious = bboxes_iou(pred[b].contiguous().view(-1, 4), truth_box, xyxy=False)

(.view() requires contiguous memory, so the original line raises a runtime error when pred[b] is not contiguous.)

In dataset.py, the get_image_id function: since my images are named Id_id.jpg, I concatenate the two ids into the final id:

def get_image_id(filename):
    # e.g. '3743_01467.jpg' -> 374301467
    parts = filename.split('.')[0].split('_')
    id = int(parts[0] + parts[1])
    return id

Testing the .weights model with the PyTorch code:

python3 demo.py -cfgfile ./cfg/yolov4-custom.cfg -weightfile ./yolov4-fullempty_last.weights -imgfile ./full_empty_dataset/images/2793_00847.jpg -torch False

Converting the .weights model to an ONNX model:

python3 demo_darknet2onnx.py ./cfg/yolov4-custom.cfg ./data/full_empty.names ./yolov4-fullempty_last.weights ./full_empty_dataset/images/2793_00847.jpg 1
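Optionally, the exported ONNX file can be sanity-checked with onnxruntime before conversion. This step is an addition of mine, not part of the original flow; the file name and the input name "input" are taken from the atc command in the next step, and the expected output shapes match the comments in post_processing below:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("./yolov4_1_3_608_608_static.onnx")
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)        # expected: input [1, 3, 608, 608]

dummy = np.random.rand(1, 3, 608, 608).astype(np.float32)
boxes, confs = sess.run(None, {inp.name: dummy})
print(boxes.shape, confs.shape)   # expected: (1, num, 1, 4) and (1, num, 2)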

Converting the ONNX model to an om model:

atc --model=./yolov4_1_3_608_608_static.onnx --framework=5 --output=yolov4_bs1 --input_shape="input:1,3,608,608" --soc_version=Ascend310P3 --input_format=NCHW

Writing the Atlas inference code:

#yolov4.py

import sys
sys.path.append("./common/acllite")
import os
import numpy as np
import acl
import cv2
import time
from acllite_model import AclLiteModel
from acllite_resource import AclLiteResource

from utils import post_processing, plot_boxes_cv2

MODEL_PATH = "./model/yolov4_bs1.om"

#ACL resource initialization
acl_resource = AclLiteResource()
acl_resource.init()
#load model
model = AclLiteModel(MODEL_PATH)
 

class YOLOV4(object):
    def __init__(self):
        self.MODEL_PATH = MODEL_PATH
        self.MODEL_WIDTH = 608
        self.MODEL_HEIGHT = 608

        self.class_names= ['full', 'empty']

        self.model = model
 
    def preprocess(self, bgr_img):
        # resize to the model input size (608 x 608)
        sized = cv2.resize(bgr_img.copy(), (self.MODEL_WIDTH, self.MODEL_HEIGHT))
        # BGR -> RGB
        sized = cv2.cvtColor(sized, cv2.COLOR_BGR2RGB)

        # scale to [0, 1] and reorder HWC -> CHW
        new_image = sized.astype(np.float32)
        new_image = new_image / 255.0
        new_image = new_image.transpose(2, 0, 1).copy()

        return new_image


    def process(self, bgr_img):
        height, width = bgr_img.shape[:2]
        #preprocess
        data = self.preprocess(bgr_img)#(3, 608, 608)

        #Send into model inference
        result_list = self.model.execute([data,])    
        #Process inference results

        conf_thresh, nms_thresh = 0.4, 0.6
        boxes = post_processing(conf_thresh, nms_thresh, result_list, height, width)
        return boxes

    def draw(self, bgr_img, boxes):
        drawed_img = plot_boxes_cv2(bgr_img, boxes[0], class_names=self.class_names)
        return drawed_img

def test_image():
    yolov4 = YOLOV4()
    img_name = "./data/3553_00173.jpg"

    #read image
    bgr_img = cv2.imread(img_name)

    t1 = time.time()
    boxes = yolov4.process(bgr_img)
    t2 = time.time()
    drawed_img = yolov4.draw(bgr_img, boxes)
    t3 = time.time()
    print("result = ", len(boxes[0]), boxes, t2-t1, t3-t2)

    cv2.imwrite("out.jpg", drawed_img)

if __name__ == '__main__':
    test_image()

#utils.py

import sys
import os
import time
import math
import numpy as np

import itertools
import struct  # get_image_size
import imghdr  # get_image_size


def sigmoid(x):
    return 1.0 / (np.exp(-x) + 1.)


def softmax(x):
    x = np.exp(x - np.expand_dims(np.max(x, axis=1), axis=1))
    x = x / np.expand_dims(x.sum(axis=1), axis=1)
    return x


def bbox_iou(box1, box2, x1y1x2y2=True):
    if x1y1x2y2:
        mx = min(box1[0], box2[0])
        Mx = max(box1[2], box2[2])
        my = min(box1[1], box2[1])
        My = max(box1[3], box2[3])
        w1 = box1[2] - box1[0]
        h1 = box1[3] - box1[1]
        w2 = box2[2] - box2[0]
        h2 = box2[3] - box2[1]
    else:
        w1 = box1[2]
        h1 = box1[3]
        w2 = box2[2]
        h2 = box2[3]

        mx = min(box1[0], box2[0])
        Mx = max(box1[0] + w1, box2[0] + w2)
        my = min(box1[1], box2[1])
        My = max(box1[1] + h1, box2[1] + h2)
    uw = Mx - mx
    uh = My - my
    cw = w1 + w2 - uw
    ch = h1 + h2 - uh
    carea = 0
    if cw <= 0 or ch <= 0:
        return 0.0

    area1 = w1 * h1
    area2 = w2 * h2
    carea = cw * ch
    uarea = area1 + area2 - carea
    return carea / uarea


def nms_cpu(boxes, confs, nms_thresh=0.5, min_mode=False):
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    areas = (x2 - x1) * (y2 - y1)
    order = confs.argsort()[::-1]

    keep = []
    while order.size > 0:
        idx_self = order[0]
        idx_other = order[1:]

        keep.append(idx_self)

        xx1 = np.maximum(x1[idx_self], x1[idx_other])
        yy1 = np.maximum(y1[idx_self], y1[idx_other])
        xx2 = np.minimum(x2[idx_self], x2[idx_other])
        yy2 = np.minimum(y2[idx_self], y2[idx_other])

        w = np.maximum(0.0, xx2 - xx1)
        h = np.maximum(0.0, yy2 - yy1)
        inter = w * h

        if min_mode:
            over = inter / np.minimum(areas[order[0]], areas[order[1:]])
        else:
            over = inter / (areas[order[0]] + areas[order[1:]] - inter)

        inds = np.where(over <= nms_thresh)[0]
        order = order[inds + 1]
    
    return np.array(keep)



def plot_boxes_cv2(img, boxes, class_names=None, color=None):
    import cv2
    img = np.copy(img)
    colors = np.array([[1, 0, 1], [0, 0, 1], [0, 1, 1], [0, 1, 0], [1, 1, 0], [1, 0, 0]], dtype=np.float32)

    def get_color(c, x, max_val):
        ratio = float(x) / max_val * 5
        i = int(math.floor(ratio))
        j = int(math.ceil(ratio))
        ratio = ratio - i
        r = (1 - ratio) * colors[i][c] + ratio * colors[j][c]
        return int(r * 255)

    width = img.shape[1]
    height = img.shape[0]
    for i in range(len(boxes)):
        box = boxes[i]
        x1 = int(box[0])
        y1 = int(box[1])
        x2 = int(box[2])
        y2 = int(box[3])
 
        bbox_thick = int(0.6 * (height + width) / 600)
        if color:
            rgb = color
        else:
            rgb = (255, 0, 0)
        if len(box) >= 7 and class_names:
            cls_conf = box[5]
            cls_id = box[6]
            print('%s: %f' % (class_names[cls_id], cls_conf))
            classes = len(class_names)
            offset = cls_id * 123457 % classes
            red = get_color(2, offset, classes)
            green = get_color(1, offset, classes)
            blue = get_color(0, offset, classes)
            if color is None:
                rgb = (red, green, blue)
            msg = str(class_names[cls_id])+" "+str(round(cls_conf,3))
            t_size = cv2.getTextSize(msg, 0, 0.7, thickness=bbox_thick // 2)[0]
            c1, c2 = (x1,y1), (x2, y2)
            c3 = (c1[0] + t_size[0], c1[1] - t_size[1] - 3)

            cv2.rectangle(img, (x1,y1), (np.int32(c3[0]), np.int32(c3[1])), rgb, -1)
            img = cv2.putText(img, msg, (c1[0], np.int32(c1[1] - 2)), cv2.FONT_HERSHEY_SIMPLEX,0.7, (0,0,0), bbox_thick//2,lineType=cv2.LINE_AA)
            #cv2.rectangle(img, (x1,y1), (np.float32(c3[0]), np.float32(c3[1])), rgb, -1)
            #img = cv2.putText(img, msg, (c1[0], np.float32(c1[1] - 2)), cv2.FONT_HERSHEY_SIMPLEX,0.7, (0,0,0), bbox_thick//2,lineType=cv2.LINE_AA)
        
        img = cv2.rectangle(img, (x1, y1), (x2, y2), rgb, bbox_thick)
    return img


def read_truths(lab_path):
    if not os.path.exists(lab_path):
        return np.array([])
    if os.path.getsize(lab_path):
        truths = np.loadtxt(lab_path)
        truths = truths.reshape(truths.size // 5, 5)  # to avoid single truth problem; integer division for Python 3
        return truths
    else:
        return np.array([])


def load_class_names(namesfile):
    class_names = []
    with open(namesfile, 'r') as fp:
        lines = fp.readlines()
    for line in lines:
        line = line.rstrip()
        class_names.append(line)
    return class_names



def post_processing(conf_thresh, nms_thresh, output, height, width):

    # anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]
    # num_anchors = 9
    # anchor_masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    # strides = [8, 16, 32]
    # anchor_step = len(anchors) // num_anchors

    # [batch, num, 1, 4]
    box_array = output[0]
    # [batch, num, num_classes]
    confs = output[1]

    t1 = time.time()

    if type(box_array).__name__ != 'ndarray':
        box_array = box_array.cpu().detach().numpy()
        confs = confs.cpu().detach().numpy()

    num_classes = confs.shape[2]

    # [batch, num, 4]
    box_array = box_array[:, :, 0]

    # [batch, num, num_classes] --> [batch, num]
    max_conf = np.max(confs, axis=2)
    max_id = np.argmax(confs, axis=2)

    t2 = time.time()

    bboxes_batch = []
    for i in range(box_array.shape[0]):
       
        argwhere = max_conf[i] > conf_thresh
        l_box_array = box_array[i, argwhere, :]
        l_max_conf = max_conf[i, argwhere]
        l_max_id = max_id[i, argwhere]

        bboxes = []
        # nms for each class
        for j in range(num_classes):

            cls_argwhere = l_max_id == j
            ll_box_array = l_box_array[cls_argwhere, :]
            ll_max_conf = l_max_conf[cls_argwhere]
            ll_max_id = l_max_id[cls_argwhere]

            keep = nms_cpu(ll_box_array, ll_max_conf, nms_thresh)
            
            if (keep.size > 0):
                ll_box_array = ll_box_array[keep, :]
                ll_max_conf = ll_max_conf[keep]
                ll_max_id = ll_max_id[keep]

                for k in range(ll_box_array.shape[0]):
                    bboxes.append([ll_box_array[k, 0]*width, ll_box_array[k, 1]*height, ll_box_array[k, 2]*width, ll_box_array[k, 3]*height, ll_max_conf[k], ll_max_conf[k], ll_max_id[k]])
        
        bboxes_batch.append(bboxes)

    t3 = time.time()
    return bboxes_batch

Testing the Atlas inference code:

python3 yolov4.py

Video test:
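Below is a minimal sketch of how the YOLOV4 class above could be run over a video; the video path, output codec, and frame rate (25 fps) are assumptions:

import cv2
from yolov4 import YOLOV4

def test_video(video_path="./data/test.mp4"):  # path is an assumption
    yolov4 = YOLOV4()
    cap = cv2.VideoCapture(video_path)
    writer = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        boxes = yolov4.process(frame)
        drawed_img = yolov4.draw(frame, boxes)
        if writer is None:
            # create the writer lazily, once the output frame size is known
            h, w = drawed_img.shape[:2]
            writer = cv2.VideoWriter("out.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 25, (w, h))
        writer.write(drawed_img)
    cap.release()
    if writer is not None:
        writer.release()

if __name__ == '__main__':
    test_video()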

References:

https://github.com/AlexeyAB/darknet

https://github.com/Tianxiaomo/pytorch-YOLOv4

samples: https://gitee.com/ascend/samples (CANN samples)
