Xiaotudui PyTorch: The Basic Skeleton of a Neural Network (Using nn.Module) & Convolution Operations

Xiaotudui PyTorch: The Basic Skeleton of a Neural Network (Using nn.Module)

Explanation of nn.Module in the official documentation

The PyTorch website shows the following:

Explanation of the forward function above:

Example code

import torch
from torch import nn


class Test(nn.Module):  # inherit from nn.Module, the base class (skeleton) for neural networks
    def __init__(self):  # override __init__
        super().__init__()

    def forward(self, input):
        output = input + 1
        return output

test = Test()  # create the network
x = torch.tensor(1.0)
output = test(x)
print(output)

The output is:

tensor(2.)
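
To make the role of forward more concrete, here is a minimal pure-Python sketch of why calling test(x) ends up running forward. TinyModule and AddOne are hypothetical names used only for illustration, and the sketch mimics just the dispatch: the real nn.Module.__call__ does additional work (such as running registered hooks) around the call to forward.

```python
class TinyModule:
    """Hypothetical stand-in for nn.Module; not PyTorch's real implementation."""

    def __call__(self, *args, **kwargs):
        # Calling the object passes the arguments on to forward().
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError("subclasses must define forward()")


class AddOne(TinyModule):
    def forward(self, x):
        return x + 1


add_one = AddOne()
result = add_one(1.0)  # in this sketch, the same as add_one.forward(1.0)
print(result)  # 2.0
```

This is why the example above calls test(x) rather than test.forward(x): the subclass only defines forward, and the call on the object is what triggers it.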

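Convolution operations and the use of neural network layers

Explanation of convolution in the official documentation

The first two entries, nn.Conv1d and nn.Conv2d, are the most commonly used. They perform one-dimensional and two-dimensional convolution respectively, and we use two-dimensional convolution as the example here.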

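Basic concepts of convolution

Convolution is the core operation of convolutional neural networks (CNNs) in deep learning. For images, a convolution kernel slides over the input image; at each position the overlapping elements are multiplied pairwise and the products are summed, which extracts features from the image. In this example the input image is 5×5, the kernel is 3×3, and the stride is 1. The stride is the number of pixels the kernel moves at each step.

Calculation steps

Compute the top-left element (the first element of the output)
Compute the neighboring element in the same row (slide the kernel to the right)
Keep sliding right to compute the rest of the row
Compute the elements of the next row
Continue in the same pattern
- Repeat the process of sliding right along a row and then moving down to the next row until the kernel has covered the entire input image; the result is the complete output of the convolution.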
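
The sliding procedure described above can be sketched in plain Python, without PyTorch, to show the arithmetic. The function name conv2d_naive is an assumption made for this illustration; the image and kernel values are the ones used in this post's code. Like F.conv2d, the sketch does not flip the kernel.

```python
def conv2d_naive(image, kernel, stride=1):
    """Slide `kernel` over `image` (both lists of lists), without padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):          # move down one output row at a time
        row = []
        for j in range(out_w):      # move right along the current row
            top, left = i * stride, j * stride
            total = 0
            for di in range(k_h):   # multiply overlapping elements and sum
                for dj in range(k_w):
                    total += image[top + di][left + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


image = [[1, 2, 0, 3, 1],
         [0, 1, 2, 3, 1],
         [1, 2, 1, 0, 0],
         [5, 2, 3, 1, 1],
         [2, 1, 0, 1, 1]]

kernel = [[1, 2, 1],
          [0, 1, 0],
          [2, 1, 0]]

result = conv2d_naive(image, kernel)
print(result)  # [[10, 12, 12], [18, 16, 16], [13, 9, 3]]
```

The top-left value, for instance, is 1*1 + 2*2 + 0*1 + 0*0 + 1*1 + 2*0 + 1*2 + 2*1 + 1*0 = 10.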

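Convolution extracts features from the input image in this way, and different kernels extract different kinds of features, such as edges or textures.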

Related code and explanation

Implementing the convolution above in code

import torch
input = torch.tensor([[1,2,0,3,1],
                      [0,1,2,3,1],
                      [1,2,1,0,0],
                      [5,2,3,1,1],
                      [2,1,0,1,1]])

kernel = torch.tensor([[1,2,1],
                       [0,1,0],
                       [2,1,0]])

print(input.shape)
print(kernel.shape)

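However, these shapes do not meet PyTorch's requirements: torch.nn.functional.conv2d expects the input to have shape (minibatch, in_channels, iH, iW) and the kernel (weight) to have shape (out_channels, in_channels/groups, kH, kW), while both tensors here are two-dimensional.

So we use reshape to add the missing dimensions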

import torch
input = torch.tensor([[1,2,0,3,1],
                      [0,1,2,3,1],
                      [1,2,1,0,0],
                      [5,2,3,1,1],
                      [2,1,0,1,1]])

kernel = torch.tensor([[1,2,1],
                       [0,1,0],
                       [2,1,0]])

input = torch.reshape(input, (1,1,5,5))
kernel = torch.reshape(kernel, (1,1,3,3))

print(input.shape)
print(kernel.shape)

The result is:

torch.Size([1, 1, 5, 5])
torch.Size([1, 1, 3, 3])
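
As an aside, reshape is not the only way to add the batch and channel dimensions. The sketch below is an alternative and not the approach used in this post: it uses Tensor.unsqueeze and indexing with None, both of which insert dimensions of size 1. The names input_4d and kernel_4d are made up for this example.

```python
import torch

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])

kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])

# unsqueeze(0) inserts a new dimension of size 1 at position 0.
input_4d = input.unsqueeze(0).unsqueeze(0)   # (5, 5) -> (1, 1, 5, 5)
# Indexing with None inserts size-1 dimensions in the same way.
kernel_4d = kernel[None, None]               # (3, 3) -> (1, 1, 3, 3)

print(input_4d.shape)
print(kernel_4d.shape)
```

Either form avoids spelling out the height and width again, which can help when the spatial size is not known in advance.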

Once the input and weight (kernel) have suitable shapes, we can perform the convolution

import torch
import torch.nn.functional as F
input = torch.tensor([[1,2,0,3,1],
                      [0,1,2,3,1],
                      [1,2,1,0,0],
                      [5,2,3,1,1],
                      [2,1,0,1,1]])

kernel = torch.tensor([[1,2,1],
                       [0,1,0],
                       [2,1,0]])

input = torch.reshape(input, (1,1,5,5))
kernel = torch.reshape(kernel, (1,1,3,3))

print(input.shape)
print(kernel.shape)

output = F.conv2d(input, kernel,stride = 1)
print(output)

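stride

Meaning: stride is the distance the kernel moves at each step as it slides over the input tensor. It can be a single integer, in which case the kernel moves the same distance horizontally and vertically, or a tuple (sH, sW) that sets the step along the height (H) and width (W) separately.
Effects:
- Controls the output size: a larger stride makes the kernel skip more elements of the input, so the output feature map is smaller. For example, when the input image is large, a larger stride downsamples it quickly and reduces the amount of computation.
- Affects the receptive field: the stride does not change the area covered by a single kernel application (that is set by the kernel size), but it spaces neighboring kernel positions further apart. In a stack of layers, each element of a later layer therefore corresponds to a larger region of the original input, so the network captures more global information.
Example: with a stride of 1 the kernel moves 1 pixel at a time; with a stride of 2 it moves 2 pixels at a time.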
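
The effect of stride on the output size can be checked with a small helper. This is a sketch assuming a dilation of 1; the function name conv_output_size is made up for this example, and the padding argument is included only so the same formula covers padded inputs as well.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one axis, assuming a dilation of 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# 5x5 input and 3x3 kernel, as in this post, for several strides.
sizes = {s: conv_output_size(5, 3, stride=s) for s in (1, 2, 3)}
print(sizes)  # {1: 3, 2: 2, 3: 1}
```

So for the 5×5 input and 3×3 kernel, stride 1 gives a 3×3 output and stride 2 gives a 2×2 output.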

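padding

Meaning: padding adds values (usually 0) around the edges of the input tensor. It can be a single integer, which adds the same amount of padding on every edge, or a tuple (padH, padW) that sets the padding along the height and width separately.
Effects:
- Preserves the output size: without padding, the output feature map is usually smaller than the input. Choosing padding appropriately makes the output the same size as the input or meets some other size requirement.
- Handles border information: padding lets the kernel cover positions at the edges of the input, so border information is not underrepresented and edge features are extracted better.
Example: with a padding of 1, one layer of padding values is added on the top, bottom, left, and right of the input tensor; with a padding of (2, 1), 2 layers are added on both the top and bottom and 1 layer on both the left and right.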
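
To see what the two padding forms in the example do to the data, the sketch below pads the post's 5×5 input by hand in plain Python. The helper name zero_pad is hypothetical; F.conv2d applies zero padding internally when padding is set.

```python
def zero_pad(image, pad_h, pad_w):
    """Add pad_h rows of zeros above and below, pad_w columns left and right."""
    width = len(image[0]) + 2 * pad_w
    padded = [[0] * width for _ in range(pad_h)]                # top rows
    for row in image:
        padded.append([0] * pad_w + list(row) + [0] * pad_w)    # left/right
    padded.extend([0] * width for _ in range(pad_h))            # bottom rows
    return padded


image = [[1, 2, 0, 3, 1],
         [0, 1, 2, 3, 1],
         [1, 2, 1, 0, 0],
         [5, 2, 3, 1, 1],
         [2, 1, 0, 1, 1]]

padded_1 = zero_pad(image, 1, 1)    # like padding=1
padded_21 = zero_pad(image, 2, 1)   # like padding=(2, 1)
print(len(padded_1), len(padded_1[0]))      # 7 7
print(len(padded_21), len(padded_21[0]))    # 9 7
```

With padding=1 the 5×5 input becomes 7×7, so a 3×3 kernel with stride 1 produces a 5×5 output, the same size as the original input.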

import torch
import torch.nn.functional as F
input = torch.tensor([[1,2,0,3,1],
                      [0,1,2,3,1],
                      [1,2,1,0,0],
                      [5,2,3,1,1],
                      [2,1,0,1,1]])

kernel = torch.tensor([[1,2,1],
                       [0,1,0],
                       [2,1,0]])

input = torch.reshape(input, (1,1,5,5))
kernel = torch.reshape(kernel, (1,1,3,3))

print(input.shape)
print(kernel.shape)

output = F.conv2d(input, kernel,stride = 1)
print(output)

output2 = F.conv2d(input, kernel, stride = 2)
print(output2)

output3 = F.conv2d(input,kernel,stride = 1, padding=1)
print(output3)

The output is:

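Explanation of stride=2

With stride=2 the kernel moves two elements at a time, so on the 5×5 input a 3×3 kernel fits at only two positions along each axis and the output is 2×2.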
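
The stride=2 case can be illustrated with a short plain-Python sketch that lists where the kernel lands. The variable names are chosen for this example, and the stride-1 values were worked out by hand from the post's input and kernel.

```python
input_size, kernel_size, stride = 5, 3, 2

# Top-left corners where the 3x3 kernel fits when it moves 2 elements at a time.
starts = list(range(0, input_size - kernel_size + 1, stride))
corners = [(top, left) for top in starts for left in starts]
print(corners)  # [(0, 0), (0, 2), (2, 0), (2, 2)]

# Stride-1 result for the post's input and kernel, worked out by hand.
stride1 = [[10, 12, 12],
           [18, 16, 16],
           [13, 9, 3]]

# Without padding, the stride-2 result keeps every second row and column.
stride2 = [row[::stride] for row in stride1[::stride]]
print(stride2)  # [[10, 12], [13, 3]]
```

In other words, the stride-2 output consists of the four corner values of the stride-1 output, because the kernel skips the positions in between.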