Deep Learning Network Notes III (Lightweight Networks)

Overview of Lightweight Networks

Common goals:

**Industry standard:** accurate, cheap, fast, stable

**Accuracy:** residual connections have largely solved the accuracy (ACC) problem

**Cheap:** small model, fast inference, and low memory footprint [lightweight]

Ways to build a lightweight model:

1. Fewer layers.

2. Prefer 1x1 and 3x3 convolutions, spatially separable convolution, group convolution, and depthwise separable convolution + 1x1 convolution (we use DW, depthwise separable convolution); the sketch below compares their parameter counts.

3. Residual connections.

4. Trade some accuracy for speed.

In one sentence: keep accuracy while cutting computation and parameter count, so the model fits mobile / embedded devices.
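
As a rough sanity check of these savings, here is a minimal sketch (not from the original notes; the 64→128-channel, 3x3 layer is a hypothetical example) comparing the parameter counts of a standard convolution, a group convolution, and a depthwise separable convolution:

python
import torch.nn as nn

def count_params(m):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in m.parameters())

# Hypothetical layer: 64 -> 128 channels, 3x3 kernel
std = nn.Conv2d(64, 128, 3, padding=1)            # standard convolution
grp = nn.Conv2d(64, 128, 3, padding=1, groups=4)  # group convolution, 4 groups
dw = nn.Conv2d(64, 64, 3, padding=1, groups=64)   # depthwise convolution
pw = nn.Conv2d(64, 128, 1)                        # pointwise 1x1 convolution

print(count_params(std))                    # 73856
print(count_params(grp))                    # 18560, roughly 1/4 of standard
print(count_params(dw) + count_params(pw))  # 8960, roughly 1/8 of standard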


MobileNet

MobileNet is Google's mobile-oriented lightweight network; across its three versions, the core theme is steadily improving accuracy and speed on top of the lightweight design.

1. MobileNet V1:

Core innovation: depthwise separable convolution

Standard convolution: performs channel fusion and spatial feature extraction at the same time, which is computationally expensive. On an $H \times W$ feature map its cost is

$$K \times K \times C_{in} \times C_{out} \times H \times W$$

where $K$ is the kernel size and $C_{in}$/$C_{out}$ are the input/output channel counts.

Depthwise separable convolution: MobileNet V1 splits this into two convolutions: a depthwise convolution (one $K \times K$ kernel per input channel, spatial features only) followed by a pointwise $1 \times 1$ convolution (channel fusion), with cost

$$K \times K \times C_{in} \times H \times W + C_{in} \times C_{out} \times H \times W$$

i.e., a fraction $\frac{1}{C_{out}} + \frac{1}{K^2}$ of the standard cost, roughly an 8-9x saving for $K = 3$ and large $C_{out}$.

Code implementation:

python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseSeparableConv(nn.Module):
    """深度可分离卷积模块(MobileNet V1核心)"""
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super().__init__()
        # 1. Depthwise conv: groups=in_channels, so each channel is convolved independently
        self.depthwise = nn.Conv2d(
            in_channels=in_channels,
            out_channels=in_channels,
            kernel_size=kernel_size,
            stride=stride,
            padding=padding,
            groups=in_channels  # group convolution taken to the extreme: one channel per group
        )
        # 2. Pointwise conv: a 1x1 convolution that fuses channels
        self.pointwise = nn.Conv2d(
            in_channels=in_channels,
            out_channels=out_channels,
            kernel_size=1,
            stride=1,
            padding=0
        )
        # the standard BN + ReLU pairing
        self.bn_dw = nn.BatchNorm2d(in_channels)
        self.bn_pw = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # DW -> BN -> ReLU -> PW -> BN -> ReLU
        x = self.relu(self.bn_dw(self.depthwise(x)))
        x = self.relu(self.bn_pw(self.pointwise(x)))
        return x

# Test the block
if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)  # batch_size=2, channels=64, feature map=32x32
    conv_dw_pw = DepthwiseSeparableConv(64, 128)
    out = conv_dw_pw(x)
    print(f"输入shape: {x.shape}, 输出shape: {out.shape}")  # 输出应为 [2, 128, 32, 32]

Limitations

  • Depthwise convolution has few kernels and weaker feature extraction, so accuracy is slightly below a standard convolutional network
  • No residual connections, which limits how deep the network can go

2. MobileNet V2⭐:

Core innovation: inverted residual + linear bottleneck

  1. Inverted residual
    • Traditional residual (ResNet): reduce dims → convolve → expand dims (channels shrink, then grow)
    • Inverted residual: expand dims → convolve → reduce dims (channels grow, then shrink)
    • Rationale: lightweight networks have few channels, so expanding first increases feature diversity and the expressive power of the convolution
  2. Linear bottleneck
    • Convolutions are followed by ReLU6 (a ReLU variant that caps the output at 6, which suits mobile quantization; see the quick check below)
    • But ReLU destroys features (loses information) in low-dimensional spaces, so after the pointwise dimension reduction a linear activation replaces ReLU
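
As a quick check of the clamping (a minimal sketch, not part of the original notes):

python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, 0.5, 3.0, 8.0])
print(F.relu6(x))  # tensor([0.0000, 0.5000, 3.0000, 6.0000]): negatives zeroed, positives capped at 6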

Code implementation:

python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@Project :Pytorch
@File :MobileNet V2.py
@IDE :PyCharm
@Author :wjj
@Date :2025/12/28 13:49
@Description:MobileNet V2
"""
import torch
import torch.nn as nn
import torch.nn.functional as F

class InvertedResidual(nn.Module):
    """MobileNet V2 逆残差块"""
    def __init__(self, in_channels, out_channels, stride,expend_ratio=6):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        self.expend_ratio = expend_ratio
        #使用 assert 确保 stride 只能是 1 或 2,保证特征图不变/下采样
        #避免更大步幅导致特征信息丢失过多
        assert stride in [1, 2], "stride must be 1 or 2"
        #升维后的隐藏通道数
        hidden_dim = in_channels * expend_ratio
        """
        残差连接生效条件(两个条件必须同时满足)
        1.stride == 1:特征图的宽、高尺寸不变(步幅 2 会导致尺寸减半,无法直接逐元素相加)。
        2.in_channels == out_channels:输入与输出的通道数一致(通道数不同无法直接逐元素相加).
        """
        self.use_res_connect = self.stride == 1 and in_channels == out_channels  # 是否用残差连接

        # expansion: 1x1 pointwise convolution
        self.pw1 = nn.Conv2d(in_channels, hidden_dim, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn1 = nn.BatchNorm2d(hidden_dim)
        self.relu6 = nn.ReLU6(inplace=True)

        # depthwise convolution (uses the caller's stride and padding=1 so the residual shapes line up)
        self.dw = nn.Conv2d(hidden_dim, hidden_dim, kernel_size=3, stride=stride, padding=1, groups=hidden_dim, bias=False)
        self.bn2 = nn.BatchNorm2d(hidden_dim)

        # pointwise convolution: dimension reduction + linear bottleneck (no ReLU afterwards)
        self.pw2 = nn.Conv2d(hidden_dim, out_channels, 1, 1, 0, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        out = self.pw1(x)
        out = self.bn1(out)
        out = self.relu6(out)
        out = self.dw(out)
        out = self.bn2(out)
        out = self.relu6(out)
        out = self.pw2(out)
        out = self.bn3(out)

        if self.use_res_connect:
            out = out + x
        return out

if __name__ == "__main__":
    x = torch.randn(2, 32, 64, 64)
    block = InvertedResidual(32, 16, stride=1, expend_ratio=6)
    out = block(x)
    print(f"输入shape: {x.shape}, 输出shape: {out.shape}")  # stride=1时shape不变

3. MobileNet V3:

Core innovations: NAS search + SE attention + h-swish

  1. NAS search: reinforcement learning searches for the best network structure and parameters (e.g., expand_ratio, stride)
  2. Embedded SE attention: a lightweight channel attention (SE) module inside the inverted residual block improves accuracy
  3. Improved activation: h-swish (defined below) replaces ReLU6, giving higher accuracy while staying quantization-friendly
  4. Tail optimization: redundant convolution layers are removed; global average pooling (GAP) feeds the fully connected layer directly
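
For reference, the hard activations used here (these definitions match PyTorch's Hardsigmoid/Hardswish):

$$\text{h-sigmoid}(x) = \frac{\text{ReLU6}(x + 3)}{6}, \qquad \text{h-swish}(x) = x \cdot \frac{\text{ReLU6}(x + 3)}{6}$$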

Code example:

python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@Project :Pytorch
@File :MobileNet V3.py
@IDE :PyCharm
@Author :wjj
@Date :2025/12/28 14:12
@Description:SE module + h-swish
"""
import torch
import torch.nn as nn
import torch.nn.functional as F

class SEModule(nn.Module):
    """轻量级通道注意力:SE模块(适配MobileNet V3)"""
    def __init__(self, channel, reduction=4):
        super(SEModule, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Hardsigmoid(inplace=True)  # quantization-friendly replacement for sigmoid
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        return x * y

class h_swish(nn.Module):
    """h-swish激活函数"""
    def forward(self, x):
        # F.hardsigmoid is called without inplace=True for version compatibility
        return x * F.hardsigmoid(x)

class InvertedResidualV3(nn.Module):
    def __init__(self, in_channels, out_channels, stride, expand_ratio, use_se=True):
        super(InvertedResidualV3, self).__init__()

        # inverted residual: expand, convolve, then reduce
        hidden_dim = in_channels * expand_ratio
        self.use_res_connect = stride == 1 and in_channels == out_channels
        self.use_se = use_se  # keep the boolean flag; it is never overwritten

        # 1x1 pointwise conv (PW1, expansion): stride fixed at 1, only widens channels
        self.pw1 = nn.Conv2d(in_channels, hidden_dim, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn1 = nn.BatchNorm2d(hidden_dim)
        self.act = h_swish()

        # 3x3 depthwise conv (DW, spatial feature extraction): uses the caller's stride for downsampling
        self.dw = nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False)
        self.bn2 = nn.BatchNorm2d(hidden_dim)

        # instantiate the SE module as self.se (self.use_se stays a boolean flag)
        if self.use_se:
            self.se = SEModule(hidden_dim)  # add SE attention

        # 1x1 pointwise conv (PW2, reduction / linear bottleneck): stride fixed at 1, only narrows channels
        self.pw2 = nn.Conv2d(hidden_dim, out_channels, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        out = self.act(self.bn1(self.pw1(x)))
        out = self.act(self.bn2(self.dw(out)))

        # apply the instantiated self.se module
        if self.use_se:
            out = self.se(out)

        out = self.bn3(self.pw2(out))
        if self.use_res_connect:
            out = out + x

        return out

if __name__ == "__main__":
    x = torch.randn(2, 32, 64, 64)
    block = InvertedResidualV3(32, 16, stride=1, expand_ratio=6, use_se=True)
    out = block(x)
    print(f"输入shape: {x.shape}, 输出shape: {out.shape}")

4. V1 vs V2 vs V3

| Version | Core innovation | Compute | Accuracy | Typical scenario |
| --- | --- | --- | --- | --- |
| V1 | Depthwise separable convolution | Lowest | Lowest | Extremely lightweight targets (e.g., microcontrollers) |
| V2 | Inverted residual + linear bottleneck | Medium | Medium | General mobile use (e.g., phone apps) |
| V3 | NAS + SE attention + h-swish | Slightly above V1 | Highest | Mobile scenarios that need accuracy (e.g., real-time detection) |

ShuffleNet

ShuffleNet is a lightweight network from Megvii; its core idea is group convolution + channel shuffle, which fixes the channel isolation caused by group convolution.

1. ShuffleNet V1:

Core innovation: group convolution + channel shuffle

Background:

Group convolution (Group Conv) reduces computation, but it isolates information between channels (channels in different groups never interact), which lowers accuracy.

Core improvements:

  1. Grouped pointwise convolution: the 1x1 convolutions are grouped as well, cutting computation further
  2. Channel shuffle: reorder the grouped channels so that information flows across groups

How channel shuffle works

  • Input: g groups of c channels each → gc channels in total
  • Step 1: reshape the channel dimension to (g, c)
  • Step 2: transpose to (c, g)
  • Step 3: flatten back to gc channels → shuffle complete

Code implementation:

python
import torch
import torch.nn as nn

def channel_shuffle(x, groups):
    """Channel shuffle: reshape to (b, g, c//g, h, w), swap the group and channel dims, flatten back."""
    b, c, h, w = x.size()
    assert c % groups == 0, "channel count must be divisible by the group count"
    return x.view(b, groups, c // groups, h, w).permute(0, 2, 1, 3, 4).reshape(b, c, h, w)

class ShuffleBlockV1(nn.Module):
    """ShuffleNet V1 unit: grouped 1x1 convs + channel shuffle.
    stride=1: residual add; stride=2: concatenate with a 3x3 average-pool shortcut."""
    def __init__(self, in_channels, out_channels, stride, groups=3):
        super().__init__()
        self.stride = stride
        self.groups = groups
        assert stride in [1, 2]
        mid_channels = out_channels // 4  # bottleneck channel count
        # with stride=2 the shortcut is concatenated, so the main branch only
        # needs to produce out_channels - in_channels channels
        branch_out = out_channels - in_channels if stride == 2 else out_channels

        if stride == 2:
            # stride 2: downsample the shortcut with average pooling, then concatenate
            self.shortcut = nn.AvgPool2d(3, stride=2, padding=1)
        else:
            assert in_channels == out_channels
            self.shortcut = nn.Identity()  # stride 1: identity shortcut for the residual add

        # main branch: grouped 1x1 (reduce) -> shuffle -> 3x3 DW -> grouped 1x1 (expand)
        self.gconv1 = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, 1, 1, 0, groups=groups, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True)
        )
        self.dwconv = nn.Sequential(
            nn.Conv2d(mid_channels, mid_channels, 3, stride, 1, groups=mid_channels, bias=False),
            nn.BatchNorm2d(mid_channels)
        )
        self.gconv2 = nn.Sequential(
            nn.Conv2d(mid_channels, branch_out, 1, 1, 0, groups=groups, bias=False),
            nn.BatchNorm2d(branch_out)
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.gconv1(x)
        out = channel_shuffle(out, self.groups)  # cross-group information exchange
        out = self.dwconv(out)
        out = self.gconv2(out)
        if self.stride == 2:
            out = torch.cat([self.shortcut(x), out], dim=1)  # channels add up to out_channels
        else:
            out = out + self.shortcut(x)  # residual add
        return self.relu(out)

# Test
if __name__ == "__main__":
    x = torch.randn(2, 24, 64, 64)
    block = ShuffleBlockV1(24, 48, stride=2, groups=3)
    out = block(x)
    print(f"input shape: {x.shape}, output shape: {out.shape}")  # expect [2, 48, 32, 32]

2. ShuffleNet V2⭐:

Core innovation: 4 efficient design guidelines

  1. Equal channel widths: a 1x1 convolution has the lowest memory access cost (MAC) when input and output channel counts are equal (see the bound below)
  2. Moderate group counts: too many groups raise memory access overhead
  3. Less fragmentation: avoid many small parallel branches (as in GoogLeNet's Inception), which reduce parallelism
  4. Fewer element-wise operations: cut down on ReLU, Add, and other per-element ops
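
For guideline 1, the bound from the ShuffleNet V2 paper is worth writing out: a 1x1 convolution with $c_1$ input and $c_2$ output channels on an $h \times w$ feature map costs $B = hwc_1c_2$ FLOPs, and its memory access cost satisfies

$$\text{MAC} = hw(c_1 + c_2) + c_1 c_2 \ge 2\sqrt{hwB} + \frac{B}{hw}$$

with equality (the minimum) exactly when $c_1 = c_2$, by the AM-GM inequality.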

Code in practice:

python
class ShuffleBlockV2(nn.Module):
    """ShuffleNet V2 残差块:通道拆分代替分组卷积"""
    def __init__(self, in_channels, out_channels, stride):
        super().__init__()
        self.stride = stride
        assert stride in [1, 2]
        mid_channels = out_channels // 2

        if stride == 2:
            # stride 2: both branches convolve, and their outputs are concatenated
            self.branch1 = nn.Sequential(
                nn.Conv2d(in_channels, in_channels, 3, stride, 1, groups=in_channels, bias=False),
                nn.BatchNorm2d(in_channels),
                nn.Conv2d(in_channels, mid_channels, 1, 1, 0, bias=False),
                nn.BatchNorm2d(mid_channels),
                nn.ReLU(inplace=True)
            )
            self.branch2 = nn.Sequential(
                nn.Conv2d(in_channels, mid_channels, 1, 1, 0, bias=False),
                nn.BatchNorm2d(mid_channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(mid_channels, mid_channels, 3, stride, 1, groups=mid_channels, bias=False),
                nn.BatchNorm2d(mid_channels),
                nn.Conv2d(mid_channels, mid_channels, 1, 1, 0, bias=False),
                nn.BatchNorm2d(mid_channels),
                nn.ReLU(inplace=True)
            )
        else:
            # stride 1: split the channels into two halves, one half passes through unchanged
            assert in_channels == out_channels
            self.branch1 = nn.Identity()
            self.branch2 = nn.Sequential(
                nn.Conv2d(mid_channels, mid_channels, 1, 1, 0, bias=False),
                nn.BatchNorm2d(mid_channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(mid_channels, mid_channels, 3, stride, 1, groups=mid_channels, bias=False),
                nn.BatchNorm2d(mid_channels),
                nn.Conv2d(mid_channels, mid_channels, 1, 1, 0, bias=False),
                nn.BatchNorm2d(mid_channels),
                nn.ReLU(inplace=True)
            )

    def forward(self, x):
        if self.stride == 1:
            x1, x2 = x.chunk(2, dim=1)  # channel split
            out = torch.cat([x1, self.branch2(x2)], dim=1)
        else:
            out = torch.cat([self.branch1(x), self.branch2(x)], dim=1)
        out = channel_shuffle(out, 2)  # group count fixed at 2
        return out
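
The V2 block above ships without a test; a minimal check (hypothetical shapes) mirroring the earlier test blocks:

python
if __name__ == "__main__":
    x = torch.randn(2, 48, 32, 32)
    block_s1 = ShuffleBlockV2(48, 48, stride=1)  # stride 1 requires in_channels == out_channels
    block_s2 = ShuffleBlockV2(48, 96, stride=2)  # stride 2 halves the feature map
    print(block_s1(x).shape)  # torch.Size([2, 48, 32, 32])
    print(block_s2(x).shape)  # torch.Size([2, 96, 16, 16])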
| Version | Core innovation | Compute efficiency | Accuracy | Notes |
| --- | --- | --- | --- | --- |
| V1 | Group conv + channel shuffle | Medium | Medium | Adjustable group count, high flexibility |
| V2 | 4 efficiency guidelines + channel split | Higher | Higher | Fast real-world inference, first choice on mobile |