How to Add an Attention Mechanism to YOLOv8 (Two Approaches)

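Steps for adding an attention mechanism

#Approach 1: add the attention mechanism as an extra layer

First, find an implementation of an attention mechanism, for example ParNetAttention:
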
import torch
from torch import nn


class ParNetAttention(nn.Module):

    def __init__(self, channel=512):
        super().__init__()
        self.sse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channel, channel, kernel_size=1),
            nn.Sigmoid()
        )

        self.conv1x1 = nn.Sequential(
            nn.Conv2d(channel, channel, kernel_size=1),
            nn.BatchNorm2d(channel)
        )
        self.conv3x3 = nn.Sequential(
            nn.Conv2d(channel, channel, kernel_size=3, padding=1),
            nn.BatchNorm2d(channel)
        )
        self.silu = nn.SiLU()

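    def forward(self, x):
        x1 = self.conv1x1(x)  # 1x1 convolution branch
        x2 = self.conv3x3(x)  # 3x3 convolution branch
        x3 = self.sse(x) * x  # per-channel gate in (0, 1) applied to the input
        y = self.silu(x1 + x2 + x3)  # fuse the three branches; output shape equals input shape
        return y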

Create a new file attention.py under ultralytics\nn\modules\ and put the attention code in it.

Open ultralytics\nn\tasks.py and first import the attention class you just added:

from ultralytics.nn.modules.attention import ParNetAttention

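If the attention module takes the input channel count as a constructor argument, add the following branch to the parse_model function:

        # attention modules that take a channel count go here
        elif m in {ParNetAttention}:  # add any other channel-taking attention classes to this set
            c2 = ch[f]
            args = [c2, *args]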

If the attention module does not take a channel count, this branch can be skipped.
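To see what this branch does, here is a minimal stand-alone sketch in plain Python. `DummyAttention`, `DummyPool`, `CHANNEL_MODULES` and `resolve_args` are hypothetical names invented for illustration and are not part of ultralytics; only the `c2 = ch[f]` and `args = [c2, *args]` logic mirrors the branch. It also shows why the condition has to be a membership test: comparing a class to a tuple with `is` is never true.

```python
class DummyAttention:
    """Hypothetical stand-in for an attention class that takes a channel count."""

    def __init__(self, channel=512):
        self.channel = channel


class DummyPool:
    """Hypothetical stand-in for a module that takes no channel argument."""


# Modules whose first constructor argument is the input channel count.
CHANNEL_MODULES = (DummyAttention,)


def resolve_args(m, f, ch, args):
    """Mimic the parse_model branch: prepend the input channels when needed."""
    c2 = ch[f]  # the module keeps the channel count of the layer it reads from
    if m in CHANNEL_MODULES:  # membership test; an `is` test against the tuple is always False
        args = [c2, *args]
    return c2, args


ch = [64, 128, 256, 512, 1024]  # output channels of the layers built so far (made-up values)
c2, args = resolve_args(DummyAttention, -1, ch, [])
layer = DummyAttention(*args)
print(c2, args, layer.channel)

# Identity against the whole tuple never matches a single class.
identity_check = DummyAttention is CHANNEL_MODULES
print(identity_check)
```

With an empty args list in the yaml entry, the module still receives the channel count of the preceding layer, which is why the yaml line for ParNetAttention can leave its arguments empty.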

Finally, edit the model yaml file and insert the attention module wherever you want it.

For example, I add it right after the SPPF layer:

backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 1, ParNetAttention, []] # 10

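After this change the attention layer is layer 10, so every absolute layer index >= 10 referenced in the head must be increased by 1.

For example, the detection line changes from

  - [[15, 18, 21], 1, WorldDetect, [nc, 512, True]] # Detect(P3, P4, P5)

to

  - [[16, 19, 22], 1, WorldDetect, [nc, 512, True]] # Detect(P3, P4, P5)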
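The renumbering can be done by hand, but a small helper makes the rule explicit. The sketch below is an assumption-laden illustration: `shift_from_indices` is a hypothetical helper, not an ultralytics function, and the first two head entries are sample rows added only to show that relative indices (negative) and indices below the insertion point stay unchanged.

```python
def shift_from_indices(layers, inserted_at):
    """Return a copy of `layers` with absolute `from` indices >= inserted_at increased by 1."""

    def shift(idx):
        # Negative indices are relative to the current layer and are left alone.
        return idx + 1 if idx >= inserted_at else idx

    shifted = []
    for from_, repeats, module, args in layers:
        if isinstance(from_, list):
            new_from = [shift(i) for i in from_]
        else:
            new_from = shift(from_)
        shifted.append([new_from, repeats, module, list(args)])
    return shifted


# Module names are kept as strings so the sketch runs without ultralytics.
head = [
    [-1, 1, "nn.Upsample", [None, 2, "nearest"]],
    [[-1, 6], 1, "Concat", [1]],
    [[15, 18, 21], 1, "WorldDetect", ["nc", 512, True]],
]

new_head = shift_from_indices(head, inserted_at=10)
for row in new_head:
    print(row)
```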

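#Approach 2: add the attention mechanism inside an existing module

For example, to put the attention mechanism inside C2f, open ultralytics\nn\modules\block.py.

First import the attention code into block.py, then make a copy of the C2f class:
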
class C2f(nn.Module):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
        expansion.
        """
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

    def forward(self, x):
        """Forward pass through C2f layer."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))

Rename the copy to C2f_attention and add the attention module:

class C2f_attention(nn.Module):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
        expansion.
        """
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))
        self.attention_AdditiveBlock = AdditiveBlock(c2)  # attention applied to the c2-channel output of cv2

    def forward(self, x):
        """Forward pass through C2f layer, followed by attention."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.attention_AdditiveBlock(self.cv2(torch.cat(y, 1)))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk(), followed by attention."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.attention_AdditiveBlock(self.cv2(torch.cat(y, 1)))

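Next, declare the attention module in __init__, for example: self.attention_AdditiveBlock = AdditiveBlock(c2). If the attention module needs a channel argument, pass the output channel count of the preceding layer, since that is the input channel count of the attention layer. Here I want the attention applied after cv2, so the argument is the output channel count of cv2, which is c2.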
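The channel bookkeeping inside C2f can be checked with a few lines of arithmetic. This is a sketch that only reuses the formulas visible in the C2f constructor; `c2f_channels` is a hypothetical helper name, and the example values (c1=128, c2=256, n=6) are chosen for illustration.

```python
def c2f_channels(c1, c2, n=1, e=0.5):
    """Return the channel counts seen by each part of C2f(c1, c2, n, e=e)."""
    c = int(c2 * e)  # hidden channels, same formula as self.c
    return {
        "cv1_in": c1,
        "cv1_out": 2 * c,       # later split into two chunks of c channels
        "cv2_in": (2 + n) * c,  # two chunks plus one output per Bottleneck
        "cv2_out": c2,          # what an attention layer placed after cv2 receives
    }


info = c2f_channels(c1=128, c2=256, n=6)
print(info)

# An attention module placed after cv2 should be built with this many channels.
attention_channels = info["cv2_out"]
print(attention_channels)
```

If the attention were instead placed on one of the Bottleneck outputs, the argument would be the hidden channel count self.c rather than c2.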

In forward, apply the attention module to the output of whichever layer you want it to follow, for example:

return self.attention_AdditiveBlock(self.cv2(torch.cat(y, 1)))

The attention mechanism is now in place. The last step is to register the new C2f_attention class.

The steps are as follows:

First, add C2f_attention to __all__ in block.py:

__all__ = (
    "DFL",   "HGBlock",    "HGStem",    "SPP",
    "SPPF",    "C1",    "C2",    "C3",
    "C2f",    "C2fAttn",    "ImagePoolingAttn",
    "ContrastiveHead",    "BNContrastiveHead",
    "C3x",    "C3TR",    "C3Ghost",
    "GhostBottleneck",    "Bottleneck",
    "BottleneckCSP",    "Proto",    "RepC3",    "ResNetLayer",
    "RepNCSPELAN4",    "ELAN1",    "ADown",    "AConv",
    "SPPELAN",    "CBFuse",    "CBLinear",    "RepVGGDW",
    "CIB",    "C2fCIB",    "Attention",    "PSA",    "SCDown",
    # -------------------------------- attention module added here
    "C2f_attention",
)

Then add it to the import list in __init__.py of the same folder:

from .block import (
    C1,    C2,    C3,    C3TR,    CIB,    DFL,
    ELAN1,    PSA,    SPP,    SPPELAN,    SPPF,
    AConv,    ADown,    Attention,    BNContrastiveHead,
    Bottleneck,    BottleneckCSP,    C2f,
    C2fAttn,    C2fCIB,    C3Ghost,    C3x,
    CBFuse,    CBLinear,    ContrastiveHead,
    GhostBottleneck,    HGBlock,
    HGStem,    ImagePoolingAttn,
    Proto,    RepC3,    RepNCSPELAN4,
    RepVGGDW,    ResNetLayer,    SCDown,
    # --------------------- attention module added here -------------
    C2f_attention,
)

"C2f_attention" must also be added to the __all__ tuple in the same __init__.py.
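One Python detail is worth keeping in mind when editing `__all__`: the name has to be added as an element of the existing tuple, and a single name in parentheses without a trailing comma is a plain string, not a tuple. A minimal illustration follows; `existing_all` is a shortened, made-up list used only for this sketch.

```python
as_written = ("C2f_attention")  # parentheses alone do not make a tuple
as_tuple = ("C2f_attention",)   # the trailing comma does

print(type(as_written).__name__)
print(type(as_tuple).__name__)

# Appending the new name to an existing tuple of exported names.
existing_all = ("Conv", "C2f", "SPPF")  # shortened, illustrative list
new_all = existing_all + ("C2f_attention",)
print(new_all)
```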

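Finally, reference C2f_attention in tasks.py.

It goes in three places:

(1) The import statement: from ultralytics.nn.modules import (C2f_attention)

(2) The set of modules that take input and output channel arguments in parse_model:
if m in {Classify, Conv, ConvTranspose, GhostConv, ..., C2f_attention}:

(3) The set of modules that receive the repeat count:
if m in {BottleneckCSP, C1, C2, C2f, C2fAttn, C3, C3TR, C3Ghost, C3x, RepC3, C2fCIB, C2f_attention}:
    args.insert(2, n)  # number of repeats
    n = 1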
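To make the role of places (2) and (3) concrete, here is a simplified sketch of how the constructor arguments are assembled from a yaml entry. It is not the ultralytics implementation: `build_module_args` and `REPEAT_MODULES` are hypothetical names, module classes are replaced by strings, and width/depth scaling is deliberately left out.

```python
REPEAT_MODULES = {"C2f", "C2f_attention"}  # strings stand in for the real classes


def build_module_args(name, f, n, args, ch):
    """Simplified sketch of assembling constructor arguments from a yaml entry."""
    c1, c2 = ch[f], args[0]  # input channels from layer f, output channels from the yaml
    args = [c1, c2, *args[1:]]
    if name in REPEAT_MODULES:
        args.insert(2, n)  # the module builds its own n inner blocks
        n = 1              # so the module itself is instantiated only once
    return args, n


ch = [64, 128]  # output channels of layers 0 and 1 (unscaled, for illustration)

# yaml entry: [-1, 3, C2f_attention, [128, True]]
args, n = build_module_args("C2f_attention", -1, 3, [128, True], ch)
print(args, n)

# The same entry for a module that is missing from the set.
other_args, other_n = build_module_args("NotRegistered", -1, 3, [128, True], ch)
print(other_args, other_n)
```

In this sketch, a module left out of the set keeps n=3 and receives True in the position where C2f_attention expects the repeat count, which is why the class has to be listed in the third place as well.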

Finally, replace the C2f layers in the yaml file with C2f_attention:

backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f_attention, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f_attention, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f_attention, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f_attention, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9