Week J3: DenseNet-121 Explained in Detail

Preface

  • Environment versions

    python 3.9.23
    pytorch 2.5.1
    pytorch-cuda 11.8
    pytorch-mutex 1.0
    torchaudio 2.5.1
    torchinfo 1.8.0
    torchvision 0.20.1
    Visual Studio Code 1.104.2 (user setup)

  • What is DenseNet-121?

    • The core idea of DenseNet-121 is feature reuse. Unlike networks that only pass along the result of the previous step, every feature map a layer emits stays directly visible to, and usable by, all subsequent layers.

      • Stem (initial layers):

        • Flow: input → convolution (Conv) → batch normalization (BN) → activation (ReLU) → max pooling (MaxPool).

        • Purpose: extract initial features from the raw image and compress its spatial size.

      • Dense block (DenseBlock):

        • Internal logic: a pre-activation structure of BN → ReLU → 1×1 Conv → BN → ReLU → 3×3 Conv.

        • Key step: each layer's output is stacked onto its input via concatenation (Concat).

        • Growth rate: limits each layer to producing only a small number of feature maps, so the parameter count stays manageable (the resulting channel arithmetic is verified in the sketch after this list).

      • Transition layer:

        • Flow: BN → ReLU → 1×1 Conv → average pooling (AvgPool).

        • Purpose: repeated concatenation would otherwise blow up the channel dimension, so this layer compresses the channel count (dimensionality reduction) and further shrinks the spatial size.

      • Output layer:

        • Flow: global average pooling → fully connected layer (with Softmax).

        • Purpose: aggregate the features accumulated across the whole network and perform the final classification.
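Before reading the full layer table below, it is worth verifying the channel bookkeeping by hand. The short sketch below (our own illustration; the variable names are not from the original code) walks a channel counter through DenseNet-121's configuration and reproduces the widths that appear in the summary.

python
# Channel arithmetic for DenseNet-121 (growth rate k = 32, compression θ = 0.5)
growth_rate = 32
block_config = (6, 12, 24, 16)  # layers per dense block in DenseNet-121

channels = 64  # after the 7×7 stem convolution
for i, num_layers in enumerate(block_config):
    channels += num_layers * growth_rate  # dense block: each layer concatenates k new maps
    print(f"after dense block {i + 1}: {channels} channels")
    if i != len(block_config) - 1:
        channels //= 2  # transition layer: the 1×1 conv halves the channels
        print(f"after transition {i + 1}:  {channels} channels")

# Prints 256 → 128 → 512 → 256 → 1024 → 512 → 1024, matching the
# 256-, 512- and 1024-channel DenseBlock rows in the summary below.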

python
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
         MaxPool2d-3           [-1, 64, 56, 56]               0
       BatchNorm2d-4           [-1, 64, 56, 56]             128
            Conv2d-5          [-1, 128, 56, 56]           8,192
       BatchNorm2d-6          [-1, 128, 56, 56]             256
            Conv2d-7           [-1, 32, 56, 56]          36,864
        Bottleneck-8           [-1, 96, 56, 56]               0
       BatchNorm2d-9           [-1, 96, 56, 56]             192
           Conv2d-10          [-1, 128, 56, 56]          12,288
      BatchNorm2d-11          [-1, 128, 56, 56]             256
           Conv2d-12           [-1, 32, 56, 56]          36,864
       Bottleneck-13          [-1, 128, 56, 56]               0
      BatchNorm2d-14          [-1, 128, 56, 56]             256
           Conv2d-15          [-1, 128, 56, 56]          16,384
      BatchNorm2d-16          [-1, 128, 56, 56]             256
           Conv2d-17           [-1, 32, 56, 56]          36,864
       Bottleneck-18          [-1, 160, 56, 56]               0
      BatchNorm2d-19          [-1, 160, 56, 56]             320
           Conv2d-20          [-1, 128, 56, 56]          20,480
      BatchNorm2d-21          [-1, 128, 56, 56]             256
           Conv2d-22           [-1, 32, 56, 56]          36,864
       Bottleneck-23          [-1, 192, 56, 56]               0
      BatchNorm2d-24          [-1, 192, 56, 56]             384
           Conv2d-25          [-1, 128, 56, 56]          24,576
      BatchNorm2d-26          [-1, 128, 56, 56]             256
           Conv2d-27           [-1, 32, 56, 56]          36,864
       Bottleneck-28          [-1, 224, 56, 56]               0
      BatchNorm2d-29          [-1, 224, 56, 56]             448
           Conv2d-30          [-1, 128, 56, 56]          28,672
      BatchNorm2d-31          [-1, 128, 56, 56]             256
           Conv2d-32           [-1, 32, 56, 56]          36,864
       Bottleneck-33          [-1, 256, 56, 56]               0
       DenseBlock-34          [-1, 256, 56, 56]               0
      BatchNorm2d-35          [-1, 256, 56, 56]             512
           Conv2d-36          [-1, 128, 56, 56]          32,768
        AvgPool2d-37          [-1, 128, 28, 28]               0
       Transition-38          [-1, 128, 28, 28]               0
      BatchNorm2d-39          [-1, 128, 28, 28]             256
           Conv2d-40          [-1, 128, 28, 28]          16,384
      BatchNorm2d-41          [-1, 128, 28, 28]             256
           Conv2d-42           [-1, 32, 28, 28]          36,864
       Bottleneck-43          [-1, 160, 28, 28]               0
      BatchNorm2d-44          [-1, 160, 28, 28]             320
           Conv2d-45          [-1, 128, 28, 28]          20,480
      BatchNorm2d-46          [-1, 128, 28, 28]             256
           Conv2d-47           [-1, 32, 28, 28]          36,864
       Bottleneck-48          [-1, 192, 28, 28]               0
      BatchNorm2d-49          [-1, 192, 28, 28]             384
           Conv2d-50          [-1, 128, 28, 28]          24,576
      BatchNorm2d-51          [-1, 128, 28, 28]             256
           Conv2d-52           [-1, 32, 28, 28]          36,864
       Bottleneck-53          [-1, 224, 28, 28]               0
      BatchNorm2d-54          [-1, 224, 28, 28]             448
           Conv2d-55          [-1, 128, 28, 28]          28,672
      BatchNorm2d-56          [-1, 128, 28, 28]             256
           Conv2d-57           [-1, 32, 28, 28]          36,864
       Bottleneck-58          [-1, 256, 28, 28]               0
      BatchNorm2d-59          [-1, 256, 28, 28]             512
           Conv2d-60          [-1, 128, 28, 28]          32,768
      BatchNorm2d-61          [-1, 128, 28, 28]             256
           Conv2d-62           [-1, 32, 28, 28]          36,864
       Bottleneck-63          [-1, 288, 28, 28]               0
      BatchNorm2d-64          [-1, 288, 28, 28]             576
           Conv2d-65          [-1, 128, 28, 28]          36,864
      BatchNorm2d-66          [-1, 128, 28, 28]             256
           Conv2d-67           [-1, 32, 28, 28]          36,864
       Bottleneck-68          [-1, 320, 28, 28]               0
      BatchNorm2d-69          [-1, 320, 28, 28]             640
           Conv2d-70          [-1, 128, 28, 28]          40,960
      BatchNorm2d-71          [-1, 128, 28, 28]             256
           Conv2d-72           [-1, 32, 28, 28]          36,864
       Bottleneck-73          [-1, 352, 28, 28]               0
      BatchNorm2d-74          [-1, 352, 28, 28]             704
           Conv2d-75          [-1, 128, 28, 28]          45,056
      BatchNorm2d-76          [-1, 128, 28, 28]             256
           Conv2d-77           [-1, 32, 28, 28]          36,864
       Bottleneck-78          [-1, 384, 28, 28]               0
      BatchNorm2d-79          [-1, 384, 28, 28]             768
           Conv2d-80          [-1, 128, 28, 28]          49,152
      BatchNorm2d-81          [-1, 128, 28, 28]             256
           Conv2d-82           [-1, 32, 28, 28]          36,864
       Bottleneck-83          [-1, 416, 28, 28]               0
      BatchNorm2d-84          [-1, 416, 28, 28]             832
           Conv2d-85          [-1, 128, 28, 28]          53,248
      BatchNorm2d-86          [-1, 128, 28, 28]             256
           Conv2d-87           [-1, 32, 28, 28]          36,864
       Bottleneck-88          [-1, 448, 28, 28]               0
      BatchNorm2d-89          [-1, 448, 28, 28]             896
           Conv2d-90          [-1, 128, 28, 28]          57,344
      BatchNorm2d-91          [-1, 128, 28, 28]             256
           Conv2d-92           [-1, 32, 28, 28]          36,864
       Bottleneck-93          [-1, 480, 28, 28]               0
      BatchNorm2d-94          [-1, 480, 28, 28]             960
           Conv2d-95          [-1, 128, 28, 28]          61,440
      BatchNorm2d-96          [-1, 128, 28, 28]             256
           Conv2d-97           [-1, 32, 28, 28]          36,864
       Bottleneck-98          [-1, 512, 28, 28]               0
       DenseBlock-99          [-1, 512, 28, 28]               0
     BatchNorm2d-100          [-1, 512, 28, 28]           1,024
          Conv2d-101          [-1, 256, 28, 28]         131,072
       AvgPool2d-102          [-1, 256, 14, 14]               0
      Transition-103          [-1, 256, 14, 14]               0
     BatchNorm2d-104          [-1, 256, 14, 14]             512
          Conv2d-105          [-1, 128, 14, 14]          32,768
     BatchNorm2d-106          [-1, 128, 14, 14]             256
          Conv2d-107           [-1, 32, 14, 14]          36,864
      Bottleneck-108          [-1, 288, 14, 14]               0
     BatchNorm2d-109          [-1, 288, 14, 14]             576
          Conv2d-110          [-1, 128, 14, 14]          36,864
     BatchNorm2d-111          [-1, 128, 14, 14]             256
          Conv2d-112           [-1, 32, 14, 14]          36,864
      Bottleneck-113          [-1, 320, 14, 14]               0
     BatchNorm2d-114          [-1, 320, 14, 14]             640
          Conv2d-115          [-1, 128, 14, 14]          40,960
     BatchNorm2d-116          [-1, 128, 14, 14]             256
          Conv2d-117           [-1, 32, 14, 14]          36,864
      Bottleneck-118          [-1, 352, 14, 14]               0
     BatchNorm2d-119          [-1, 352, 14, 14]             704
          Conv2d-120          [-1, 128, 14, 14]          45,056
     BatchNorm2d-121          [-1, 128, 14, 14]             256
          Conv2d-122           [-1, 32, 14, 14]          36,864
      Bottleneck-123          [-1, 384, 14, 14]               0
     BatchNorm2d-124          [-1, 384, 14, 14]             768
          Conv2d-125          [-1, 128, 14, 14]          49,152
     BatchNorm2d-126          [-1, 128, 14, 14]             256
          Conv2d-127           [-1, 32, 14, 14]          36,864
      Bottleneck-128          [-1, 416, 14, 14]               0
     BatchNorm2d-129          [-1, 416, 14, 14]             832
          Conv2d-130          [-1, 128, 14, 14]          53,248
     BatchNorm2d-131          [-1, 128, 14, 14]             256
          Conv2d-132           [-1, 32, 14, 14]          36,864
      Bottleneck-133          [-1, 448, 14, 14]               0
     BatchNorm2d-134          [-1, 448, 14, 14]             896
          Conv2d-135          [-1, 128, 14, 14]          57,344
     BatchNorm2d-136          [-1, 128, 14, 14]             256
          Conv2d-137           [-1, 32, 14, 14]          36,864
      Bottleneck-138          [-1, 480, 14, 14]               0
     BatchNorm2d-139          [-1, 480, 14, 14]             960
          Conv2d-140          [-1, 128, 14, 14]          61,440
     BatchNorm2d-141          [-1, 128, 14, 14]             256
          Conv2d-142           [-1, 32, 14, 14]          36,864
      Bottleneck-143          [-1, 512, 14, 14]               0
     BatchNorm2d-144          [-1, 512, 14, 14]           1,024
          Conv2d-145          [-1, 128, 14, 14]          65,536
     BatchNorm2d-146          [-1, 128, 14, 14]             256
          Conv2d-147           [-1, 32, 14, 14]          36,864
      Bottleneck-148          [-1, 544, 14, 14]               0
     BatchNorm2d-149          [-1, 544, 14, 14]           1,088
          Conv2d-150          [-1, 128, 14, 14]          69,632
     BatchNorm2d-151          [-1, 128, 14, 14]             256
          Conv2d-152           [-1, 32, 14, 14]          36,864
      Bottleneck-153          [-1, 576, 14, 14]               0
     BatchNorm2d-154          [-1, 576, 14, 14]           1,152
          Conv2d-155          [-1, 128, 14, 14]          73,728
     BatchNorm2d-156          [-1, 128, 14, 14]             256
          Conv2d-157           [-1, 32, 14, 14]          36,864
      Bottleneck-158          [-1, 608, 14, 14]               0
     BatchNorm2d-159          [-1, 608, 14, 14]           1,216
          Conv2d-160          [-1, 128, 14, 14]          77,824
     BatchNorm2d-161          [-1, 128, 14, 14]             256
          Conv2d-162           [-1, 32, 14, 14]          36,864
      Bottleneck-163          [-1, 640, 14, 14]               0
     BatchNorm2d-164          [-1, 640, 14, 14]           1,280
          Conv2d-165          [-1, 128, 14, 14]          81,920
     BatchNorm2d-166          [-1, 128, 14, 14]             256
          Conv2d-167           [-1, 32, 14, 14]          36,864
      Bottleneck-168          [-1, 672, 14, 14]               0
     BatchNorm2d-169          [-1, 672, 14, 14]           1,344
          Conv2d-170          [-1, 128, 14, 14]          86,016
     BatchNorm2d-171          [-1, 128, 14, 14]             256
          Conv2d-172           [-1, 32, 14, 14]          36,864
      Bottleneck-173          [-1, 704, 14, 14]               0
     BatchNorm2d-174          [-1, 704, 14, 14]           1,408
          Conv2d-175          [-1, 128, 14, 14]          90,112
     BatchNorm2d-176          [-1, 128, 14, 14]             256
          Conv2d-177           [-1, 32, 14, 14]          36,864
      Bottleneck-178          [-1, 736, 14, 14]               0
     BatchNorm2d-179          [-1, 736, 14, 14]           1,472
          Conv2d-180          [-1, 128, 14, 14]          94,208
     BatchNorm2d-181          [-1, 128, 14, 14]             256
          Conv2d-182           [-1, 32, 14, 14]          36,864
      Bottleneck-183          [-1, 768, 14, 14]               0
     BatchNorm2d-184          [-1, 768, 14, 14]           1,536
          Conv2d-185          [-1, 128, 14, 14]          98,304
     BatchNorm2d-186          [-1, 128, 14, 14]             256
          Conv2d-187           [-1, 32, 14, 14]          36,864
      Bottleneck-188          [-1, 800, 14, 14]               0
     BatchNorm2d-189          [-1, 800, 14, 14]           1,600
          Conv2d-190          [-1, 128, 14, 14]         102,400
     BatchNorm2d-191          [-1, 128, 14, 14]             256
          Conv2d-192           [-1, 32, 14, 14]          36,864
      Bottleneck-193          [-1, 832, 14, 14]               0
     BatchNorm2d-194          [-1, 832, 14, 14]           1,664
          Conv2d-195          [-1, 128, 14, 14]         106,496
     BatchNorm2d-196          [-1, 128, 14, 14]             256
          Conv2d-197           [-1, 32, 14, 14]          36,864
      Bottleneck-198          [-1, 864, 14, 14]               0
     BatchNorm2d-199          [-1, 864, 14, 14]           1,728
          Conv2d-200          [-1, 128, 14, 14]         110,592
     BatchNorm2d-201          [-1, 128, 14, 14]             256
          Conv2d-202           [-1, 32, 14, 14]          36,864
      Bottleneck-203          [-1, 896, 14, 14]               0
     BatchNorm2d-204          [-1, 896, 14, 14]           1,792
          Conv2d-205          [-1, 128, 14, 14]         114,688
     BatchNorm2d-206          [-1, 128, 14, 14]             256
          Conv2d-207           [-1, 32, 14, 14]          36,864
      Bottleneck-208          [-1, 928, 14, 14]               0
     BatchNorm2d-209          [-1, 928, 14, 14]           1,856
          Conv2d-210          [-1, 128, 14, 14]         118,784
     BatchNorm2d-211          [-1, 128, 14, 14]             256
          Conv2d-212           [-1, 32, 14, 14]          36,864
      Bottleneck-213          [-1, 960, 14, 14]               0
     BatchNorm2d-214          [-1, 960, 14, 14]           1,920
          Conv2d-215          [-1, 128, 14, 14]         122,880
     BatchNorm2d-216          [-1, 128, 14, 14]             256
          Conv2d-217           [-1, 32, 14, 14]          36,864
      Bottleneck-218          [-1, 992, 14, 14]               0
     BatchNorm2d-219          [-1, 992, 14, 14]           1,984
          Conv2d-220          [-1, 128, 14, 14]         126,976
     BatchNorm2d-221          [-1, 128, 14, 14]             256
          Conv2d-222           [-1, 32, 14, 14]          36,864
      Bottleneck-223         [-1, 1024, 14, 14]               0
      DenseBlock-224         [-1, 1024, 14, 14]               0
     BatchNorm2d-225         [-1, 1024, 14, 14]           2,048
          Conv2d-226          [-1, 512, 14, 14]         524,288
       AvgPool2d-227            [-1, 512, 7, 7]               0
      Transition-228            [-1, 512, 7, 7]               0
     BatchNorm2d-229            [-1, 512, 7, 7]           1,024
          Conv2d-230            [-1, 128, 7, 7]          65,536
     BatchNorm2d-231            [-1, 128, 7, 7]             256
          Conv2d-232             [-1, 32, 7, 7]          36,864
      Bottleneck-233            [-1, 544, 7, 7]               0
     BatchNorm2d-234            [-1, 544, 7, 7]           1,088
          Conv2d-235            [-1, 128, 7, 7]          69,632
     BatchNorm2d-236            [-1, 128, 7, 7]             256
          Conv2d-237             [-1, 32, 7, 7]          36,864
      Bottleneck-238            [-1, 576, 7, 7]               0
     BatchNorm2d-239            [-1, 576, 7, 7]           1,152
          Conv2d-240            [-1, 128, 7, 7]          73,728
     BatchNorm2d-241            [-1, 128, 7, 7]             256
          Conv2d-242             [-1, 32, 7, 7]          36,864
      Bottleneck-243            [-1, 608, 7, 7]               0
     BatchNorm2d-244            [-1, 608, 7, 7]           1,216
          Conv2d-245            [-1, 128, 7, 7]          77,824
     BatchNorm2d-246            [-1, 128, 7, 7]             256
          Conv2d-247             [-1, 32, 7, 7]          36,864
      Bottleneck-248            [-1, 640, 7, 7]               0
     BatchNorm2d-249            [-1, 640, 7, 7]           1,280
          Conv2d-250            [-1, 128, 7, 7]          81,920
     BatchNorm2d-251            [-1, 128, 7, 7]             256
          Conv2d-252             [-1, 32, 7, 7]          36,864
      Bottleneck-253            [-1, 672, 7, 7]               0
     BatchNorm2d-254            [-1, 672, 7, 7]           1,344
          Conv2d-255            [-1, 128, 7, 7]          86,016
     BatchNorm2d-256            [-1, 128, 7, 7]             256
          Conv2d-257             [-1, 32, 7, 7]          36,864
      Bottleneck-258            [-1, 704, 7, 7]               0
     BatchNorm2d-259            [-1, 704, 7, 7]           1,408
          Conv2d-260            [-1, 128, 7, 7]          90,112
     BatchNorm2d-261            [-1, 128, 7, 7]             256
          Conv2d-262             [-1, 32, 7, 7]          36,864
      Bottleneck-263            [-1, 736, 7, 7]               0
     BatchNorm2d-264            [-1, 736, 7, 7]           1,472
          Conv2d-265            [-1, 128, 7, 7]          94,208
     BatchNorm2d-266            [-1, 128, 7, 7]             256
          Conv2d-267             [-1, 32, 7, 7]          36,864
      Bottleneck-268            [-1, 768, 7, 7]               0
     BatchNorm2d-269            [-1, 768, 7, 7]           1,536
          Conv2d-270            [-1, 128, 7, 7]          98,304
     BatchNorm2d-271            [-1, 128, 7, 7]             256
          Conv2d-272             [-1, 32, 7, 7]          36,864
      Bottleneck-273            [-1, 800, 7, 7]               0
     BatchNorm2d-274            [-1, 800, 7, 7]           1,600
          Conv2d-275            [-1, 128, 7, 7]         102,400
     BatchNorm2d-276            [-1, 128, 7, 7]             256
          Conv2d-277             [-1, 32, 7, 7]          36,864
      Bottleneck-278            [-1, 832, 7, 7]               0
     BatchNorm2d-279            [-1, 832, 7, 7]           1,664
          Conv2d-280            [-1, 128, 7, 7]         106,496
     BatchNorm2d-281            [-1, 128, 7, 7]             256
          Conv2d-282             [-1, 32, 7, 7]          36,864
      Bottleneck-283            [-1, 864, 7, 7]               0
     BatchNorm2d-284            [-1, 864, 7, 7]           1,728
          Conv2d-285            [-1, 128, 7, 7]         110,592
     BatchNorm2d-286            [-1, 128, 7, 7]             256
          Conv2d-287             [-1, 32, 7, 7]          36,864
      Bottleneck-288            [-1, 896, 7, 7]               0
     BatchNorm2d-289            [-1, 896, 7, 7]           1,792
          Conv2d-290            [-1, 128, 7, 7]         114,688
     BatchNorm2d-291            [-1, 128, 7, 7]             256
          Conv2d-292             [-1, 32, 7, 7]          36,864
      Bottleneck-293            [-1, 928, 7, 7]               0
     BatchNorm2d-294            [-1, 928, 7, 7]           1,856
          Conv2d-295            [-1, 128, 7, 7]         118,784
     BatchNorm2d-296            [-1, 128, 7, 7]             256
          Conv2d-297             [-1, 32, 7, 7]          36,864
      Bottleneck-298            [-1, 960, 7, 7]               0
     BatchNorm2d-299            [-1, 960, 7, 7]           1,920
          Conv2d-300            [-1, 128, 7, 7]         122,880
     BatchNorm2d-301            [-1, 128, 7, 7]             256
          Conv2d-302             [-1, 32, 7, 7]          36,864
      Bottleneck-303            [-1, 992, 7, 7]               0
     BatchNorm2d-304            [-1, 992, 7, 7]           1,984
          Conv2d-305            [-1, 128, 7, 7]         126,976
     BatchNorm2d-306            [-1, 128, 7, 7]             256
          Conv2d-307             [-1, 32, 7, 7]          36,864
      Bottleneck-308           [-1, 1024, 7, 7]               0
      DenseBlock-309           [-1, 1024, 7, 7]               0
     BatchNorm2d-310           [-1, 1024, 7, 7]           2,048
AdaptiveAvgPool2d-311           [-1, 1024, 1, 1]               0
          Linear-312                 [-1, 1000]       1,025,000
================================================================
Total params: 7,978,856
Trainable params: 7,978,856
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 265.11
Params size (MB): 30.44
Estimated Total Size (MB): 296.12
----------------------------------------------------------------

Code Implementation

Set up the GPU

python
import torch
import torch.nn as nn
from torchvision import transforms, datasets
from PIL import Image
import matplotlib.pyplot as plt
import os,PIL,pathlib,warnings
warnings.filterwarnings("ignore")    # suppress warning messages

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

Data Import

python
# Data import
data_dir = "../../datasets/bones/"

train_transforms = transforms.Compose([
    transforms.Resize([224, 224]), # resize input images to a uniform size
    transforms.ToTensor(), # convert the image to a tensor, scaled to [0, 1]
    transforms.Normalize( # standardize to a normal (Gaussian) distribution using the ImageNet channel statistics
        mean=[0.485, 0.456, 0.406], 
        std =[0.229, 0.224, 0.225])
])

total_data = datasets.ImageFolder(data_dir, transform = train_transforms)
total_data
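Note that ImageFolder derives the labels from subdirectory names, so data_dir is expected to follow a layout like the sketch below (the class names here are placeholders, not the actual bone-dataset classes):

../../datasets/bones/
├── class_a/  img001.jpg, img002.jpg, ...
└── class_b/  img101.jpg, img102.jpg, ...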

Label Mapping

python
# class_to_idx maps each class name to its integer label, used to encode targets during training
total_data.class_to_idx

Dataset Split

python
# Split into training and test sets (8:2)
total_size = len(total_data)
train_size = int(0.8 * total_size)
test_size = total_size - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
train_dataset, test_dataset
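One caveat: random_split draws a fresh permutation on every run, so the train/test membership changes between executions. If reproducibility matters, a seeded generator can be passed in (a sketch, not part of the original code; the seed value is arbitrary):

python
# Reproducible 8:2 split: fixing the RNG puts the same samples in each subset every run
g = torch.Generator().manual_seed(42)
train_dataset, test_dataset = torch.utils.data.random_split(
    total_data, [train_size, test_size], generator=g)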
python
# Training loader: draw batch_size=4 samples per step and shuffle the order (shuffle=True)
train_loader = torch.utils.data.DataLoader(train_dataset,
                                           batch_size=4,
                                           shuffle=True)
# Test loader: batch_size=4 samples per step, order not shuffled (shuffle defaults to False)
test_loader = torch.utils.data.DataLoader(test_dataset,
                                          batch_size=4)

Verify Data Shapes

python
# Print the shapes of one batch as a sanity check
for X, y in test_loader:
    print(f"Input tensor shape [Batch, Channel, Height, Width]: {X.shape}")
    print(f"Label shape: {y.shape}, dtype: {y.dtype}")
    break

Build DenseNet-121

python
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F


class Bottleneck(nn.Module):
    """
    Bottleneck layer used inside DenseNet
    - Structure: BN → ReLU → 1×1 Conv (compress channels) → BN → ReLU → 3×3 Conv (generate new features)
    - Output = input x concatenated with the newly generated features along the channel dimension (dense connectivity)
    """
    def __init__(self, in_channels, growth_rate):
        super(Bottleneck, self).__init__()
        # First 1×1 convolution reduces dimensionality (the bottleneck), compressing the input to 4 * growth_rate channels
        self.bn1 = nn.BatchNorm2d(in_channels) # batch-normalize the input
        self.conv1 = nn.Conv2d(in_channels, 4 * growth_rate, kernel_size=1, bias=False)  # 1×1 conv to cut computation

        # Second 3×3 convolution produces growth_rate new feature maps
        self.bn2 = nn.BatchNorm2d(4 * growth_rate) # batch-normalize the compressed features
        self.conv2 = nn.Conv2d(4 * growth_rate, growth_rate, kernel_size=3, padding=1, bias=False) # 3×3 conv; padding=1 keeps the spatial size

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x))) # BN → ReLU → 1×1 Conv (channel compression)
        out = self.conv2(F.relu(self.bn2(out))) # BN → ReLU → 3×3 Conv (feature generation)
        out = torch.cat([x, out], dim=1) # concatenate the input x with the new features along the channel dim (dense connection)
        return out


class DenseBlock(nn.Module):
    """
    Dense block: a stack of Bottleneck layers
    - The input to each layer = the original input concatenated with the outputs of all preceding layers
    - This realizes dense connectivity: every layer receives the features of all earlier layers as input
    """
    def __init__(self, in_channels, num_layers, growth_rate):
        super(DenseBlock, self).__init__()
        # Build the Bottleneck layers dynamically with a ModuleList
        # The i-th layer sees in_channels + i * growth_rate input channels (each layer adds growth_rate channels)
        self.layers = nn.ModuleList([
            Bottleneck(in_channels + i * growth_rate, growth_rate)
            for i in range(num_layers)
        ])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x) # pass through each Bottleneck in order; new features are concatenated onto x every time
        return x


class Transition(nn.Module):
    """
    Transition layer: connects two DenseBlocks and reins in complexity
    - Role 1: a 1×1 convolution compresses the channel count (typically by half, i.e. compression factor θ = 0.5)
    - Role 2: 2×2 average pooling downsamples (spatial size halved)
    - Structure: BN → ReLU → 1×1 Conv → AvgPool
    """
    def __init__(self, in_channels, out_channels):
        super(Transition, self).__init__()
        self.bn = nn.BatchNorm2d(in_channels) # batch normalization
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)  # 1×1 conv to compress channels
        self.avg_pool = nn.AvgPool2d(kernel_size=2, stride=2) # 2×2 average pooling with stride 2 for downsampling

    def forward(self, x):
        out = self.conv(F.relu(self.bn(x)))
        out = self.avg_pool(out)
        return out


class DenseNet121(nn.Module):
    """
    DenseNet backbone (the DenseNet-121/169/201 family)
    - Hallmarks: dense connectivity, bottleneck layers, transition layers
    - block_config sets the number of layers in each DenseBlock ((6, 12, 24, 16) gives DenseNet-121)
    """
    def __init__(self, growth_rate=32, block_config=(6, 12, 24, 16), num_classes=1000):
        super(DenseNet121, self).__init__()

        # Initial feature-extraction layers
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)  # 7×7 conv with stride 2 for downsampling
        self.bn1 = nn.BatchNorm2d(64)                                                   # batch normalization
        self.max_pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)               # 3×3 max pooling; total downsampling so far is 4×

        num_features = 64  # current number of feature channels (64 initially)
        self.blocks = nn.ModuleList([])  # holds all DenseBlock and Transition layers

        # Build the alternating DenseBlock + Transition layers
        for i, num_layers in enumerate(block_config):
            # Create one DenseBlock
            block = DenseBlock(num_features, num_layers, growth_rate)
            self.blocks.append(block)
            num_features += num_layers * growth_rate  # each DenseBlock adds num_layers * growth_rate channels

            # Add a Transition layer (compression + downsampling) after every DenseBlock except the last
            if i != len(block_config) - 1:
                trans = Transition(num_features, num_features // 2)
                self.blocks.append(trans)
                num_features = num_features // 2

        # Classification head
        self.bn_final = nn.BatchNorm2d(num_features) # batch-normalize the output of the last block
        self.avg_pool = nn.AdaptiveAvgPool2d((1, 1)) # global average pooling
        self.fc = nn.Linear(num_features, num_classes) # fully connected layer producing class logits

    def forward(self, x):
        # Initial convolution + pooling (downsample to 1/4)
        out = self.conv1(x)
        out = self.bn1(out)
        out = F.relu(out)
        out = self.max_pool(out)

        # Pass through all DenseBlock and Transition layers in order
        for block in self.blocks:
            out = block(out) # each module is either a DenseBlock (feature concat) or a Transition (compress + downsample)

        # Final classification stage
        out = self.bn_final(out)
        out = F.relu(out)
        out = self.avg_pool(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out


# Pick the device automatically (GPU first)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Instantiate the DenseNet model (num_classes defaults to 1000; for the bone dataset
# you would typically pass num_classes=len(total_data.class_to_idx))
model = DenseNet121().to(device)

# Inspect the model structure
from torchsummary import summary
summary(model, (3, 224, 224))
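As a quick sanity check on the building blocks (our own addition, not part of the original walkthrough), random tensors can be pushed through a single Bottleneck and Transition to confirm the channel arithmetic described earlier:

python
# Bottleneck(64, 32): 64 input channels + 32 new feature maps → 96 output channels
blk = Bottleneck(64, 32)
print(blk(torch.randn(1, 64, 56, 56)).shape)    # torch.Size([1, 96, 56, 56])

# Transition(96, 48): halves both the channel count and the spatial resolution
trans = Transition(96, 48)
print(trans(torch.randn(1, 96, 56, 56)).shape)  # torch.Size([1, 48, 28, 28])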

Training and Test Functions

python
# Training loop
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset) # size of the training set
    num_batches = len(dataloader) # number of batches (ceil(size / batch_size))
    
    train_loss, train_acc = 0, 0 # initialize training loss and accuracy
    
    for X, y in dataloader: # fetch images and their labels
        X, y = X.to(device), y.to(device)
        
        # Compute the prediction error
        pred = model(X) # network output
        loss = loss_fn(pred, y) # gap between the network output and the ground truth
        
        # Backpropagation
        optimizer.zero_grad() # zero out the gradients
        loss.backward() # backpropagate
        optimizer.step() # update the parameters
        
        # Accumulate accuracy and loss
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()
    
    train_acc /= size # overall training accuracy
    train_loss /= num_batches # average training loss
    
    return train_acc, train_loss
python
# Test function
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset) # size of the test set
    num_batches = len(dataloader)  # number of batches (ceil(size / batch_size))
    test_loss, test_acc = 0, 0
 
    # No training happens here, so disable gradient tracking to save compute and memory
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)
            
            # Compute the loss
            target_pred = model(imgs)
            loss = loss_fn(target_pred, target)
 
            test_loss += loss.item()
            test_acc += (target_pred.argmax(1) == target).type(torch.float).sum().item()
 
    test_acc /= size
    test_loss /= num_batches
 
    return test_acc, test_loss

Training

Constant learning rate

python
# Training
import copy
import torch
import torch.nn as nn
import torch.optim as optim

# Set up the optimizer and loss function
optimizer = optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()  # loss function

epochs = 50  # number of training epochs

# Lists for recording metrics
train_loss = []
train_acc = []
test_loss = []
test_acc = []

best_acc = 0  # best accuracy so far, used to decide when to save the best model

for epoch in range(epochs):
    # Training phase
    model.train()  # switch to training mode
    epoch_train_acc, epoch_train_loss = train(train_loader, model, loss_fn, optimizer)
    
    # Test phase
    model.eval()  # switch to evaluation mode
    epoch_test_acc, epoch_test_loss = test(test_loader, model, loss_fn)
    
    # Keep the best model
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        best_model = copy.deepcopy(model) # deep-copy the current best model
    
    # Record train/test metrics
    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)
    
    # Read the current learning rate
    lr = optimizer.state_dict()['param_groups'][0]['lr']
    
    # Print this epoch's metrics
    template = ('Epoch {:2d}, Train acc: {:.1f}%, Train loss: {:.3f}, Test acc: {:.1f}%, Test loss: {:.3f}, LR: {:.2E}')
    print(template.format(epoch + 1,
                          epoch_train_acc * 100,
                          epoch_train_loss,
                          epoch_test_acc * 100,
                          epoch_test_loss,
                          lr))

# Save the best model to disk
PATH = './best_model.pth'  # file name for the saved parameters
torch.save(best_model.state_dict(), PATH)  # save the model's parameter state dict

print('Done')

Learning rate dropped to 1e-5 at epoch 30

python
# Training
import copy
import torch
import torch.nn as nn
import torch.optim as optim

# Set up the optimizer and loss function
optimizer = optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()  # loss function

epochs = 50  # number of training epochs

# Lists for recording metrics
train_loss = []
train_acc = []
test_loss = []
test_acc = []

best_acc = 0  # best accuracy so far, used to decide when to save the best model

for epoch in range(epochs):

    if epoch == 30:  # lower the learning rate after 30 epochs
        for param_group in optimizer.param_groups:
            param_group['lr'] = 1e-5  # 1e-4 → 1e-5
    # Training phase
    model.train()  # switch to training mode
    epoch_train_acc, epoch_train_loss = train(train_loader, model, loss_fn, optimizer)
    
    # Test phase
    model.eval()  # switch to evaluation mode
    epoch_test_acc, epoch_test_loss = test(test_loader, model, loss_fn)
    
    # Keep the best model
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        best_model = copy.deepcopy(model)  # deep-copy the current best model
    
    # Record train/test metrics
    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)
    
    # Read the current learning rate
    lr = optimizer.state_dict()['param_groups'][0]['lr']
    
    # Print this epoch's metrics
    template = ('Epoch {:2d}, Train acc: {:.1f}%, Train loss: {:.3f}, Test acc: {:.1f}%, Test loss: {:.3f}, LR: {:.2E}')
    print(template.format(epoch + 1,
                          epoch_train_acc * 100,
                          epoch_train_loss,
                          epoch_test_acc * 100,
                          epoch_test_loss,
                          lr))

# Save the best model to disk
PATH = './best_model.pth'  # file name for the saved parameters
torch.save(best_model.state_dict(), PATH)  # save the model's parameter state dict

print('Done')
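The manual if epoch == 30 branch works, but PyTorch ships a scheduler that expresses the same step decay more idiomatically. A sketch of the equivalent setup (assuming the same optimizer and epoch loop as above):

python
from torch.optim.lr_scheduler import MultiStepLR

# Multiply the LR by 0.1 once 30 epochs have elapsed: 1e-4 → 1e-5,
# matching the manual branch above.
scheduler = MultiStepLR(optimizer, milestones=[30], gamma=0.1)

# Inside the epoch loop, advance the schedule once per epoch after training:
#     epoch_train_acc, epoch_train_loss = train(train_loader, model, loss_fn, optimizer)
#     scheduler.step()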

Visualize Training Results

python
# Visualize the results
import matplotlib.pyplot as plt
# Hide warnings
import warnings
warnings.filterwarnings("ignore")  # suppress warning messages

# Configure Matplotlib display
plt.rcParams['font.sans-serif'] = ['SimHei'] # use the SimHei font (needed when labels contain CJK text)
plt.rcParams['axes.unicode_minus'] = False # render the minus sign correctly
plt.rcParams['figure.dpi'] = 100 # set the figure resolution to 100

from datetime import datetime
current_time = datetime.now() # current timestamp

epochs_range = range(epochs)

# Create the canvas and draw the subplots
plt.figure(figsize=(12, 3))

# Subplot 1: accuracy curves
plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='Training accuracy')
plt.plot(epochs_range, test_acc, label='Test accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.xlabel(f'Epochs (generated at {current_time.strftime("%Y-%m-%d %H:%M:%S")})')  # timestamp on the x-axis

# Subplot 2: loss curves
plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training loss')
plt.plot(epochs_range, test_loss, label='Test loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')

plt.show()

Model Evaluation

python
# Load the saved parameters into the model
best_model.load_state_dict(torch.load(PATH, map_location=device))
# Evaluate the test set with the best model
epoch_test_acc, epoch_test_loss = test(test_loader, best_model, loss_fn)
# Report the best model's test metrics
epoch_test_acc, epoch_test_loss
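With the label mapping from total_data.class_to_idx, the restored model can also classify a single image. A minimal sketch (the image path below is a placeholder, not a file from the original post):

python
# Single-image inference with the restored best model
idx_to_class = {v: k for k, v in total_data.class_to_idx.items()}  # invert the label mapping

img = Image.open("../../datasets/bones/some_class/example.jpg").convert("RGB")  # hypothetical path
x = train_transforms(img).unsqueeze(0).to(device)                               # shape [1, 3, 224, 224]

best_model.eval()
with torch.no_grad():
    pred = best_model(x).argmax(1).item()
print("Predicted class:", idx_to_class[pred])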

Comparative Analysis

With a constant learning rate

Training showed pronounced overfitting and severe oscillation: training accuracy kept climbing, but test accuracy fluctuated wildly, culminating in a collapse at epoch 46 (test accuracy dropped from 97.3% to 88.3%). The cause was the fixed learning rate (1e-4) throughout: late in training the model got stuck near a local optimum and the optimization path became unstable.

With the learning rate divided by 10 after epoch 30

Compared with the fixed-rate run (1e-4 throughout), whose test accuracy oscillated violently and even collapsed, dropping the learning rate to 1e-5 after epoch 30 markedly improved training stability: late-stage test accuracy stayed steady with minimal fluctuation, and no loss spikes or sudden performance drops recurred.

Root-Cause Analysis

Because every DenseNet layer concatenates the outputs of all preceding layers into its own input, features are reused very effectively, but the network's parameters also become tightly coupled. Once the learning rate is set too high (e.g. 1e-4 throughout), even a small late-stage parameter update ripples through the whole network, so the model can suddenly collapse or its test performance can swing violently. On top of that, DenseNet's channel count grows quickly, so the model is fairly large and prone to memorizing the training data while behaving erratically at test time; the sharp drop in test accuracy at epoch 46 under the fixed learning rate is exactly this effect. Shrinking the learning rate to 1e-5 after epoch 30 switches the model from taking big strides to making careful fine adjustments: it stops overwriting features it has already learned well, both training and test curves settle down, and no sudden collapse recurs. In short, DenseNet's architecture is sensitive to the learning rate: it can learn fast early on, but it must slow down late in training to stay stable.

Some Thoughts and a Summary

  • At first, kernel sizes, strides, and all the other hyperparameters felt bewildering; I had no idea where they came from or why a network would only run with exactly those values. Working from PyTorch basics up to reimplementing ResNet and DenseNet, I gradually learned to reason through dimension alignment and to break complex networks into individual modules. But real understanding lies in grasping the design principles (the mindset) behind them. As the study guide puts it, mindset matters more than method, and method more than knowledge: knowledge is the concrete technical detail, such as kernel size, stride, and padding; method is the implementation strategy, such as working out dimension alignment and decomposing a complex network into functional modules; and mindset is the key to understanding why a model is designed the way it is, e.g. DenseNet is designed to reuse previously extracted features. Mindset is, at its core, a way of looking at problems.

  • ResNet says: a layer may learn nothing new, but at least it must not lose what was already learned. DenseNet goes further: everyone's previous work stays visible, so nothing has to be reinvented; it is a bit like innovating while standing on everyone's shoulders.
