神经网络的基本骨架 nn.Module的使用
神经网络的主要工具都在 torch.nn里
Neural Network
Containers:
Containers 包含6个模块:
- Module
- Sequential
- ModuleList
- ModuleDict
- ParameterList
- ParameterDict
其中最常用的是 Module 模块(为所有神经网络提供基本骨架)
Module------Base class for all neural network modules.搭建的Model都必须继承该类
torch.nn.Module介绍
python
import torch.nn as nn
import torch.nn.functional as F
class Model(nn.Module):
def __init__(self):
super(Model, self).__init__()
#继承 nn.Module 搭建网络时,必须在 __init__() 中先调用父类的初始化方法 super().__init__(),之后再定义自己的层
self.conv1 = nn.Conv2d(1, 20, 5) #这是第一层卷积层,定义了输入为单通道(1),输出通道数为 20,卷积核的大小为 5x5
self.conv2 = nn.Conv2d(20, 20, 5) #第二层卷积层,输入和输出通道都是 20,卷积核的大小仍然是 5x5。
# forward() 前向传播函数
def forward(self, x):
x = F.relu(self.conv1(x)) #输入(x) -> 卷积 -> 非线性
return F.relu(self.conv2(x)) #卷积 -> 非线性 -> 输出
pycharm 重写类方法的快捷键 ctrl + O
python
import torch
from torch import nn
class Tudui(nn.Module):
def __init__(self) -> None:
super().__init__()
def forward(self, input):
output = input + 1
return output
tudui = Tudui()
x = torch.tensor(1.0)
output = tudui(x)
print(output)
输出:
python
tensor(2.)
调试的时候(断点此时在 tudui = Tudui()),主要使用 Step Into My Code,看下一次运行到我们代码中的哪一步。使用后跳转到自己重写的 def __init__(self) -> None 中的 super().__init__() 一行,也就是说首先进行了父类初始化
卷积操作
torch.nn.functional.conv2d介绍
convolution卷积
kernel核心->卷积核
对由多个输入平面组成的输入图像进行二维卷积。
首先介绍一下torch.nn和torch.nn.functional的区别
- torch.nn是实例化的使用
- torch.nn.functional是方法的使用
torch.nn是对于torch.nn.functional的封装,二者类似于包含的关系,如果说torch.nn.functional是汽车齿轮的运转,那么torch.nn就是方向盘
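下面用一个简单的对比示例说明二者的关系(示意性质,输入是随机张量):同一个卷积既可以用 nn.Conv2d(类,需要实例化,内部自己维护权重),也可以用 F.conv2d(函数,需要自己把权重传进去)来完成。
python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 1, 5, 5)  # (N, C, H, W)

# 类的用法:实例化后内部自己维护 weight / bias
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, bias=False)
y1 = conv(x)

# 函数的用法:需要显式传入权重
y2 = F.conv2d(x, conv.weight, bias=None, stride=1, padding=0)

print(torch.equal(y1, y2))  # True,两种写法结果一致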
在由多个输入平面组成的输入图像上应用 2D 卷积
参数
- input 形状为 (minibatch, in_channels, iH, iW) 的输入张量
- weight 形状为 (out_channels, in_channels/groups, kH, kW) 的卷积核(过滤器)
- bias 形状为 (out_channels) 的可选偏置张量。默认:None
- stride 卷积核的步幅。可以是单个数字或元组 (sH, sW)。默认值:1
- padding 输入两侧的隐式填充。可以是字符串 {'valid', 'same'}、单个数字或元组 (padH, padW)。默认值:0。padding='valid' 与无填充相同;padding='same' 填充输入,使输出与输入形状相同,但此模式不支持 1 以外的步幅值
nn.functional.conv2d实际操作
python
torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
python
import torch
input = torch.tensor([[1, 2, 0, 3, 1],
[0, 1, 2, 3, 1],
[1, 2, 1, 0, 0],
[5, 2, 3, 1, 1],
[2, 1, 0, 1, 1]])
kernel = torch.tensor([[1, 2, 1],
[0, 1, 0],
[2, 1, 0]])
print(input.shape) # torch.Size([5, 5])
print(kernel.shape)
input = torch.reshape(input, (1, 1, 5, 5)) # torch.Size([1, 1, 5, 5])
kernel = torch.reshape(kernel, (1, 1, 3, 3))
print(input.shape)
print(kernel.shape)
这里 (N, C, H, W) 四个维度分别是:N 是 batch size(输入图片的数量),C 是 channel(图像通道数,这里只是一个二维张量,所以通道为 1),H 是高,W 是宽,所以 reshape 成 (1, 1, 5, 5)
灰度图用 2 维矩阵表示,通道数 channel 为 1;彩色图用 3 维矩阵表示,通道数为 3
python
import torch
import torch.nn.functional as F
input = torch.tensor([[1, 2, 0, 3, 1],
[0, 1, 2, 3, 1],
[1, 2, 1, 0, 0],
[5, 2, 3, 1, 1],
[2, 1, 0, 1, 1]])
kernel = torch.tensor([[1, 2, 1],
[0, 1, 0],
[2, 1, 0]])
print(input.shape) # torch.Size([5, 5])
print(kernel.shape)
#reshape函数:尺寸变换,要对应conv2d input的要求(minibatch=1,channel=1,h,w)
input = torch.reshape(input, (1, 1, 5, 5))
#如果不进行转换直接输入会出现报错
# RuntimeError: weight should have at least three dimensions
kernel = torch.reshape(kernel, (1, 1, 3, 3))
print(input.shape) # torch.Size([1, 1, 5, 5])
print(kernel.shape)
#stride:每次(在哪个方向)移动几步
output = F.conv2d(input, kernel, stride=1)
print(output)
输出:
python
tensor([[[[10, 12, 12],
[18, 16, 16],
[13, 9, 3]]]])
因为 reshape 后就是四维张量了,输入是四维的,输出自然也是四维的
使用padding:
为什么要进行填充?------为了多利用几次5*5的边缘数据,padding非0时可以避免边缘的数据只计算一次
填充是为了更好的保留边缘特征,简单来说中间块会被卷积利用多次,而边缘块使用的少,会造成不均
python
output = F.conv2d(input, kernel, stride=1)
print(output)
output2 = F.conv2d(input, kernel, stride=2)
print(output2)
output3 = F.conv2d(input, kernel, stride=1, padding=1)
print(output3)
python
tensor([[[[10, 12, 12],
[18, 16, 16],
[13, 9, 3]]]])
tensor([[[[10, 12],
[13, 3]]]])
tensor([[[[ 1, 3, 4, 10, 8],
[ 5, 10, 12, 12, 6],
[ 7, 18, 16, 16, 8],
[11, 13, 9, 3, 4],
[14, 13, 9, 7, 4]]]])
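上面三个输出的尺寸也可以用 PyTorch 文档中的卷积输出尺寸公式验证:H_out = floor((H_in + 2*padding - dilation*(kernel_size-1) - 1) / stride + 1)。下面是一个补充的小函数,粗略核对一下:
python
def conv_out_size(h_in, kernel_size, stride=1, padding=0, dilation=1):
    # PyTorch 文档中 Conv2d 输出尺寸的通用公式
    return (h_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

print(conv_out_size(5, 3, stride=1, padding=0))  # 3 -> 对应 output  的 3x3
print(conv_out_size(5, 3, stride=2, padding=0))  # 2 -> 对应 output2 的 2x2
print(conv_out_size(5, 3, stride=1, padding=1))  # 5 -> 对应 output3 的 5x5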
神经网络-卷积层
Convolution Layers
由于图像是二维矩阵,因此使用conv2d
nn.conv2d实际操作
对神经网络进行了封装,在面对更复杂的卷积网络场景的时候更适合
python
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
参数
- in_channels (int) -- Number of channels in the input image 输入图像的通道数
- out_channels (int) -- Number of channels produced by the convolution 卷积后输出的通道数,也代表着卷积核的个数
- kernel_size (int or tuple) -- Size of the convolving kernel 卷积核的大小
- stride (int or tuple, optional) -- Stride of the convolution. Default: 1 步径大小
- padding (int, tuple or str, optional) -- Padding added to all four sides of the input. Default: 0 填充
- padding_mode (string, optional) -- 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros' 以什么方式填充
- dilation (int or tuple, optional) -- Spacing between kernel elements. Default: 1 卷积核之间的距离,空洞卷积
- groups (int, optional) -- Number of blocked connections from input channels to output channels. Default: 1
- bias (bool, optional) -- If True, adds a learnable bias to the output. Default: True 偏置
使用学习到的方法构建了一个简单的卷积网络,并对输入输出图像进行展示对比
不用设置卷积核的具体数字的原因是卷积核本身就是需要学习的参数
因为输入不同,上节课的输入是一层(无色)图像矩阵,channel=1,这里是三原色的,channel=3
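可以直接打印卷积层的 weight,验证"卷积核本身就是需要学习的参数"这一点(下面是补充的小示例,权重只是随机初始化):
python
import torch
from torch import nn

conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3)
# 权重形状为 (out_channels, in_channels, kH, kW),是会参与反向传播的参数
print(conv.weight.shape)          # torch.Size([6, 3, 3, 3])
print(conv.bias.shape)            # torch.Size([6])
print(conv.weight.requires_grad)  # True,说明是需要学习的参数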
python
import torch
import torchvision
from torch.utils.data import DataLoader
from torch import nn
from torch.nn import Conv2d
dataset = torchvision.datasets.CIFAR10("./data", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=64)
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)
def forward(self, x):
x = self.conv1(x)
return x
tudui = Tudui()
print(tudui)
python
Tudui(
(conv1): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
)
python
for data in dataloader:
imgs, targets = data
output = tudui(imgs)
print(imgs.shape) # torch.Size([64, 3, 32, 32]) batch_size=64, in_channels=3
print(output.shape) # torch.Size([64, 6, 30, 30]) out_channels=6
python
import torch
import torchvision
from torch.utils.data import DataLoader
from torch import nn
from torch.nn import Conv2d
from torch.utils.tensorboard import SummaryWriter
dataset = torchvision.datasets.CIFAR10("./data", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=64)
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)
def forward(self, x):
x = self.conv1(x)
return x
tudui = Tudui()
print(tudui)
writer = SummaryWriter("./logs")
step = 0
for data in dataloader:
imgs, targets = data
output = tudui(imgs)
print(imgs.shape) # torch.Size([64, 3, 32, 32]) batch_size=64, in_channels=3
print(output.shape) # torch.Size([64, 6, 30, 30]) out_channels=6
# torch.Size([64, 3, 32, 32])
writer.add_images("input", imgs, step)
# torch.Size([64, 6, 30, 30]) -> [xxx, 3, 30, 30]
output = torch.reshape(output, (-1, 3, 30, 30))
writer.add_images("output", output, step)
step = step + 1
-1 是一个占位符,表示让 PyTorch 自动计算该维度的大小。
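关于 -1 的自动推断,可以用一个小例子验证(示意性质):
python
import torch

x = torch.ones(64, 6, 30, 30)          # 共 64*6*30*30 = 345600 个元素
y = torch.reshape(x, (-1, 3, 30, 30))  # -1 由 345600 / (3*30*30) 自动算出
print(y.shape)                         # torch.Size([128, 3, 30, 30])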
神经网络-最大池化层
最大池化MaxPool2d有时也被称为 下采样
MaxUnpool2d上采样
在一个范围内(选中的kernel_size中)选择有代表的一个数来代表整个kernel_size,减少数据量。
python
torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
在由多个输入平面组成的输入信号上应用 2D 最大池化
参数
- kernel_size -- 最大的窗口大小
- stride------窗口的步幅。默认值为kernel_size
- padding -- 要在两边添加隐式零填充
- dilation -- 控制窗口中元素步幅的参数(空洞卷积)
- return_indices - 如果True,将返回最大索引以及输出。torch.nn.MaxUnpool2d以后有用
- ceil_mode -- 默认是False。当为 True 时,将使用ceil而不是floor来计算输出形状。简单点来说,ceil模式会把不足 kernel_size 的边保留下来,单独另算;而 floor 模式则直接把不足 kernel_size 的边舍弃了
最大池化的目的:保留输入特征,同时又可以把数据量减小,参数变少,训练变快
python
import torch
import torchvision
from torch import nn
from torch.nn import MaxPool2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
# dtype:将数据类型更改为浮点数
# 输入:5x5矩阵
input = torch.tensor([[1, 2, 0, 3, 1],
[0, 1, 2, 3, 1],
[1, 2, 1, 0, 0],
[5, 2, 3, 1, 1],
[2, 1, 0, 1, 1]], dtype=torch.float32)
input = torch.reshape(input, (-1, 1, 5, 5))
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)
def forward(self, input):
output = self.maxpool1(input)
return output
tudui = Tudui()
output = tudui(input)
print(output)
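接着上面的代码,可以再对比一下 ceil_mode=True / False 的区别(补充示例,注释里的结果是按定义手算的):
python
maxpool_ceil = MaxPool2d(kernel_size=3, ceil_mode=True)
maxpool_floor = MaxPool2d(kernel_size=3, ceil_mode=False)

print(maxpool_ceil(input))   # tensor([[[[2., 3.], [5., 1.]]]]) 不足 3x3 的边也保留下来
print(maxpool_floor(input))  # tensor([[[[2.]]]])               不足 3x3 的边被舍弃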
python
import torch
import torchvision
from torch import nn
from torch.nn import MaxPool2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
dataset=torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader=DataLoader(dataset=dataset, batch_size=64, shuffle=True, drop_last=False)
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)
def forward(self, input):
output = self.maxpool1(input)
return output
tudui = Tudui()
# 图像输出
writer = SummaryWriter("logs_maxpool")
step = 0
for data in dataloader:
imgs, targets = data
writer.add_images("input", imgs, step)
output = tudui(imgs)
# 注意,最大池化不会改变channel(input是3通道,output也是3通道),只缩小每个通道的高和宽
# 因此,下面不需要reshape了
writer.add_images("output", output, step)
step = step + 1
writer.close()
神经网络-非线性激活(ReLU函数)
python
torch.nn.ReLU(inplace=False)
参数:
- inplace 是否原地操作。为 True 时直接把计算结果写回输入张量;默认 False,保留原输入,结果放到新的输出张量
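inplace 的区别可以用一个小例子体会(示意性质):
python
import torch
from torch import nn

x1 = torch.tensor([-1.0, 2.0])
out1 = nn.ReLU(inplace=False)(x1)  # 默认:原张量不变,结果放到新张量
print(x1, out1)                    # tensor([-1., 2.]) tensor([0., 2.])

x2 = torch.tensor([-1.0, 2.0])
out2 = nn.ReLU(inplace=True)(x2)   # 原地操作:x2 本身被改成 ReLU 之后的值
print(x2, out2)                    # tensor([0., 2.]) tensor([0., 2.])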
python
import torch
from torch import nn
from torch.nn import ReLU
input = torch.tensor(([1, -0.5],
[-1, 3]))
input = torch.reshape(input, (-1, 1, 2, 2))
print(input.shape)
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.relu1 = ReLU() # inplace = False, save the original value, put the new value to another variable
def forward(self, input):
output = self.relu1(input)
return output
tudui = Tudui()
output = tudui(input)
print(output)
python
torch.Size([1, 1, 2, 2])
tensor([[[[1., 0.],
[0., 3.]]]])
图像经过 ToTensor 后三个通道的取值都在 0~1 之间(非负),用 ReLU 等于没有变化,用 Sigmoid 才能看到明显的非线性映射效果!
使用sigmoid
python
import torch
import torchvision
from torch import nn
from torch.nn import ReLU, Sigmoid
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
input = torch.tensor(([1, -0.5],
[-1, 3]))
input = torch.reshape(input, (-1, 1, 2, 2))
print(input.shape)
dataset = torchvision.datasets.CIFAR10(root="dataset", train=False, download=True,
transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.relu1 = ReLU() # inplace = False, save the original value, put the new value to another variable
self.sigmoid1 = Sigmoid()
def forward(self, input):
output = self.sigmoid1(input)
return output
tudui = Tudui()
writer = SummaryWriter("P20")
step = 0
for data in dataloader:
imgs, targets = data
writer.add_images("input", imgs, global_step=step)
output = tudui(imgs)
writer.add_images("output", output, step)
step += 1
writer.close()
神经网络-线性层及其他层介绍
归一化层Normalization
采用归一化可以提高神经网络的训练速度
Dropout层
防止过拟合
线性层Linear
python
torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
参数:
- in_features -- 每个输入样本的大小
- out_features -- 每个输出样本的大小
- bias------即线性层里面的b(kx+b)。如果设置为False,该层将不会学习附加偏差。默认:True
在神经网络的术语中,全连接层(Fully Connected Layer)和线性层(Linear Layer)实质上是指同一个概念,尽管在不同的上下文和框架中可能会使用不同的术语。
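线性层做的事情本质上就是 y = x @ W^T + b,可以用一个小例子核对(示意性质,权重是随机初始化的):
python
import torch
from torch import nn

linear = nn.Linear(in_features=3, out_features=2)
x = torch.randn(4, 3)                   # 4 个样本,每个样本 3 维

y1 = linear(x)
y2 = x @ linear.weight.T + linear.bias  # 手动计算 x W^T + b

print(torch.allclose(y1, y2))           # True
print(linear.weight.shape, linear.bias.shape)  # torch.Size([2, 3]) torch.Size([2])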
将图片内容转化为一维,展开后,inputsize为3072
python
for data in dataloader:
imgs,t=data
print(imgs.shape)
output=torch.reshape(imgs,(64,1,1,-1))
print(output.shape)
output:
#torch.Size([64, 3, 32, 32])
#torch.Size([64, 1, 1, 3072])
实际演练
python
import torch
import torchvision
from torch import nn
from torch.nn import Linear
from torch.utils.data import DataLoader
dataset=torchvision.datasets.CIFAR10("dataset",train=False,transform=torchvision.transforms.ToTensor(),download=True)
#这里不 drop_last 后面会报错:最后一批不足 64 张图片,reshape 成 (64, 1, 1, -1) 后最后一维不再是 3072,和 linear1 定义的输入特征数 3072 对不上
dataloader=DataLoader(dataset,batch_size=64,drop_last=True)
class test(nn.Module):
def __init__(self):
super(test, self).__init__()
self.linear1=Linear(3072,10)
def forward(self,input):
output=self.linear1(input)
return output
test1=test()
for data in dataloader:
imgs,t=data
print(imgs.shape)
#将图片线性化
output=torch.reshape(imgs,(64,1,1,-1))
# 也可以用 flatten,例如 output = torch.flatten(imgs, start_dim=1),保留 batch 维,形状为 (64, 3072)
print(output.shape)
output=test1(output)
print(output.shape)
output:
#torch.Size([64, 3, 32, 32])
#torch.Size([64, 1, 1, 3072])
#torch.Size([64, 1, 1, 10])
flatten 和 reshape 都能展平:torch.flatten 默认把所有维度展成一维(可以用 start_dim 指定从哪一维开始展平),而 reshape 可以变成任意指定的形状
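二者的差别可以这样对比(补充示例):
python
import torch

imgs = torch.ones(64, 3, 32, 32)

print(torch.reshape(imgs, (64, 1, 1, -1)).shape)  # torch.Size([64, 1, 1, 3072])
print(torch.flatten(imgs).shape)                  # torch.Size([196608]),整个 batch 被展成一维
print(torch.flatten(imgs, start_dim=1).shape)     # torch.Size([64, 3072]),保留 batch 维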
神经网络-Sequential以及搭建网络小实战
CIFAR10 model structure:
第1次 卷积Convolution:
如果输入图像有3个通道(如RGB图像),每个卷积核也会有3个通道。卷积核的维度为 (输出通道数, 输入通道数, 卷积核高度, 卷积核宽度)
每个卷积核输出一个通道,有几个卷积核输出就是几个通道。一个卷积核作用完 RGB 三个通道后,会把得到的三个矩阵的对应值相加(即合并),得到一个单独的特征图,所以一个卷积核产生一个通道。
用32个卷积核,每个卷积核的通道数是3(图中没有标出)
每个卷积核都提取出一个特征图。有32个卷积核,相当于提取了32种特征
在卷积操作中,如果不使用 padding,卷积核会逐步减少图像的尺寸(每经过一次卷积操作,图像尺寸会减小)。加了padding所以图像大小不变还是32*32
不使用nn.Sequential
计算padding的解题思路,当然也可以口算
对于卷积层中的每一个卷积核,其通道数必须与输入图像的通道数相匹配。这是因为卷积操作是逐通道进行的,然后(可选地)通过某种方式(如求和或加权和)将各通道的结果合并
不需要公式,奇数卷积核把中心格子对准图片第一个格子,卷积核在格子外有两层就padding=2
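如果想套公式,也可以用 Conv2d 文档中的输出尺寸公式(dilation=1 时):H_out = (H_in + 2*padding - kernel_size)/stride + 1。令 H_in = H_out = 32、kernel_size=5、stride=1,解出 padding=2。下面是一个补充的小验算:
python
# 令 32 = (32 + 2*padding - 5)/1 + 1,反解 padding
h_in, kernel_size, stride, h_out = 32, 5, 1, 32
padding = ((h_out - 1) * stride - h_in + kernel_size) // 2
print(padding)  # 2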
python
import torch
from torch import nn
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
#因为size_in和size_out都是32,经过计算得出padding=2,stride=1
self.conv1=nn.Conv2d(3,32,5,padding=2,stride=1)
self.pool1=nn.MaxPool2d(2)
#尺寸不变,和上面一样
self.conv2=nn.Conv2d(32,32,5,stride=1,padding=2)
self.pool2=nn.MaxPool2d(2)
# 尺寸不变,和上面一样
self.conv3=nn.Conv2d(32,64,5,stride=1,padding=2)
self.pool3 = nn.MaxPool2d(2)
self.flatten=nn.Flatten()
#in_feature:64*4*4,out_feature:64
self.linear1=nn.Linear(1024,64)
self.linear2=nn.Linear(64,10)
def forward(self,x):
x = self.conv1(x)
x = self.pool1(x)
x = self.conv2(x)
x = self.pool2(x)
x = self.conv3(x)
x = self.pool3(x)
x = self.flatten(x)
x = self.linear1(x)
x = self.linear2(x)
return x
tudui = Tudui()
#对网络结构进行检验
input=torch.ones((64,3,32,32))
output=tudui(input)
print(output.shape)
output:
#torch.Size([64, 10])
MaxPool2d中stride默认值为 kernel_size,不要写1
不使用nn.Sequential,且不确定线性层的输入时
python
import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Linear
from torch.nn.modules.flatten import Flatten
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.conv1 = Conv2d(3, 32, 5, padding=2)
self.maxpool1 = MaxPool2d(2)
self.conv2 = Conv2d(32, 32, 5, padding=2) # input size is as the same as output size, so padding is 2
self.maxpool2 = MaxPool2d(2)
self.conv3 = Conv2d(32, 64, 5, padding=2)
self.maxpool3 = MaxPool2d(2)
self.flatten = Flatten()
self.linear1 = Linear(1024, 64)
self.linear2 = Linear(64, 10)
def forward(self, x):
x = self.conv1(x)
x = self.maxpool1(x)
x = self.conv2(x)
x = self.maxpool2(x)
x = self.conv3(x)
x = self.maxpool3(x)
x = self.flatten(x)
return x
tudui = Tudui()
print(tudui)
# test if the net is correct
input = torch.ones((64, 3, 32, 32))
output = tudui(input)
print(output.shape)
使用nn.Sequential,直接输出
Sequential 的作用就是简化代码:把上一段逐层定义的写法(代码块1)装进 Sequential 容器,就得到下面这段代码(代码块2),原来的逐层定义就可以删掉了
python
import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Linear, Sequential
from torch.nn.modules.flatten import Flatten
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.model1 = Sequential(
Conv2d(3, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 64, 5, padding=2),
MaxPool2d(2),
Flatten(),
Linear(1024, 64),
Linear(64, 10)
)
def forward(self, x):
x = self.model1(x)
return x
tudui = Tudui()
print(tudui)
# test if the net is correct
input = torch.ones((64, 3, 32, 32))
output = tudui(input)
print(output.shape)
使用nn.Sequential,SummaryWriter进行可视化
python
import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Linear, Sequential
from torch.nn.modules.flatten import Flatten
from torch.utils.tensorboard import SummaryWriter
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.model1 = Sequential(
Conv2d(3, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 64, 5, padding=2),
MaxPool2d(2),
Flatten(),
Linear(1024, 64),
Linear(64, 10)
)
def forward(self, x):
x = self.model1(x)
return x
tudui = Tudui()
print(tudui)
# test if the net is correct
input = torch.ones((64, 3, 32, 32))
output = tudui(input)
print(output.shape)
writer = SummaryWriter("P21")
writer.add_graph(tudui, input)
writer.close()
tensorboard中打开可以看到每一层的关系和参数
损失函数与反向传播
损失函数
损失函数的作用:
- 计算实际输出与目标输出之间的差距
- 为我们更新网络参数提供一定依据(反向传播得到的梯度 grad)
1.L1Loss
参数设置
python
torch.nn.L1Loss(size_average=None, reduce=None, reduction='mean')
前两个参数 size_average 和 reduce 都已弃用,只需要关注最后一个 reduction
当使用的参数为 mean(在pytorch1.7.1中elementwise_mean已经弃用)会对N个样本的loss进行平均之后返回
当使用的参数为 sum会对N个样本的loss求和
reduction='none' 表示直接返回 N 个样本各自的 loss
实际运行
python
import torch
from torch.nn import L1Loss
inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)
inputs = torch.reshape(inputs, (1, 1, 1, 3))
targets = torch.reshape(targets, (1, 1, 1, 3))
# L1Loss = (0 + 0 + 2) / 3 = 0.6667
loss = L1Loss()
result = loss(inputs, targets)
print(result)
tensor在产生时会自带维度的,reshape是为了保证其维度是我们想要的,符合某个你想使用的函数的特定要求
2.均方误差MSELoss
参数设置
python
torch.nn.MSELoss(size_average=None, reduce=None, reduction='mean')
实际运行
python
import torch
from torch.nn import L1Loss
from torch import nn
inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)
inputs = torch.reshape(inputs, (1, 1, 1, 3))
targets = torch.reshape(targets, (1, 1, 1, 3))
# L1Loss = (0 + 0 + 2) / 3 = 0.6667
# reduction = 'sum', L1Loss = 0 + 0 + 2 = 2
loss = L1Loss(reduction='sum')
result = loss(inputs, targets)
# MSELoss = (0 + 0 + 2^2) / 3 = 1.3333
loss_mse = nn.MSELoss()
result_mse = loss_mse(inputs, targets)
print(result)
print(result_mse)
# output
# tensor(2.)
# tensor(1.3333)
3.交叉熵CrossEntropyLoss
神经网络两大类主题,回归和分类,一般mse用于回归,crossentropy用于分类
注意这里的交叉熵损失公式是组合了softmax的结果
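交叉熵的计算可以用一个小例子手动核对(补充示例):对单个样本,loss = -x[class] + log(Σ exp(x[j]))。
python
import torch
from torch import nn

# 假设一个 3 分类的输出(类别编号只是举例),真实类别是 1
x = torch.tensor([[0.1, 0.2, 0.3]])
y = torch.tensor([1])

loss_cross = nn.CrossEntropyLoss()
print(loss_cross(x, y))  # tensor(1.1019)

# 手动按公式算:-x[class] + log(sum(exp(x)))
manual = -x[0, 1] + torch.log(torch.exp(x[0]).sum())
print(manual)            # tensor(1.1019)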
参数设置
python
torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=- 100, reduce=None, reduction='mean', label_smoothing=0.0)
运行代码
python
import torchvision
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Linear
from torch.nn.modules.flatten import Flatten
from torch.utils.data import DataLoader
dataset = torchvision.datasets.CIFAR10(root="dataset", train=False, transform=torchvision.transforms.ToTensor(),
download=True)
dataloader = DataLoader(dataset, batch_size=1)
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.model1 = Sequential(
Conv2d(3, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 64, 5, padding=2),
MaxPool2d(2),
Flatten(),
Linear(1024, 64),
Linear(64, 10)
)
def forward(self, x):
x = self.model1(x)
return x
loss = nn.CrossEntropyLoss()
tudui = Tudui()
for data in dataloader:
imgs, targets = data
outputs = tudui(imgs)
result_loss = loss(outputs, targets)
print(result_loss)
前面讲过这个数据集是10个分类组成的所以这里可以用分类的交叉熵
这地方backward只是求了梯度而已 参数的优化在优化器里面
梯度下降
以交叉熵损失函数为例,backward()方法为反向传播算法
python
import torchvision
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Linear
from torch.nn.modules.flatten import Flatten
from torch.utils.data import DataLoader
dataset = torchvision.datasets.CIFAR10(root="dataset", train=False, transform=torchvision.transforms.ToTensor(),
download=True)
dataloader = DataLoader(dataset, batch_size=1)
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.model1 = Sequential(
Conv2d(3, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 64, 5, padding=2),
MaxPool2d(2),
Flatten(),
Linear(1024, 64),
Linear(64, 10)
)
def forward(self, x):
x = self.model1(x)
return x
loss = nn.CrossEntropyLoss()
tudui = Tudui()
for data in dataloader:
imgs, targets = data
outputs = tudui(imgs)
result_loss = loss(outputs, targets)
result_loss.backward()
print(result_loss)
优化器torch.optim
使用随机梯度下降法(SGD)作为优化器的优化依据
优化器的作用:将模型的中的参数根据要求进行实时调整更新,使得模型变得更加优良。
构建与使用
python
# 优化器通常设置模型的参数,学习率等
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = optim.Adam([var1, var2], lr=0.0001)
for input, target in dataset:
#把上一次计算的梯度清零
optimizer.zero_grad()
output = model(input)
loss = loss_fn(output, target)
# 损失反向传播,计算梯度
loss.backward()
# 使用梯度进行学习,即参数的优化
optimizer.step()
在debug中查看grad
python
test1=test()
#lossFunction模型
loss=nn.CrossEntropyLoss()
#优化器模型
optim=torch.optim.SGD(test1.parameters(),0.01)
#只对每张图片进行一轮学习
for data in dataloader:
imgs,t=data
output=test1(imgs)
result_loss=loss(output,t)
#将每个梯度清为0(初始化)
optim.zero_grad()
#反向传播,得到每个可调节参数对应的梯度(grad不再是none)
result_loss.backward()
#对每个参数进行改变,weight-data被改变
optim.step()
print(result_loss)
optim.zero_grad() 之前,weight 的 grad 不为 none(非首次循环)
optim.zero_grad() 运行完毕,grad 被清零,data 不变
result_loss.backward() 运行完毕,grad 被重新计算,data 不变
optim.step() 运行完毕,grad 不变,data 更新(data 就是模型中的参数)
注意到loss并没有减小,因为注意dataloader,现在这个网络模型在数据上相当于都只看了一遍,现在看到的数据对下一次的数据影响其实不是很大。因此一般需要对数据进行好几轮的学习
for data in dataloader:这一个循环只是一轮的训练
使用优化器进行优化
python
import torch
import torchvision
from torch import nn
from torch import optim
from torch.nn import Conv2d,MaxPool2d,Flatten,Linear,Sequential
from torch.utils.data import DataLoader
#导入数据
dataset2 = torchvision.datasets.CIFAR10("dataset2",transform=torchvision.transforms.ToTensor(),download=True)
dataloader = DataLoader(dataset2,batch_size=1)
#搭建模型
class qiqi(nn.Module):
def __init__(self):
super(qiqi, self).__init__()
self.model1=Sequential(
Conv2d(3, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 32, 5, padding=2),
MaxPool2d(2),
Conv2d(32, 64, 5, padding=2),
MaxPool2d(2),
Flatten(),
Linear(1024, 64),
Linear(64, 10)
)
def forward(self,x):
x=self.model1(x)
return x
#引入交叉熵损失函数
loss=nn.CrossEntropyLoss()
qq=qiqi()
#引入优化器
optim=torch.optim.SGD(qq.parameters(),lr=0.01)
for epoch in range(20): #最外面这一层是学习的次数
running_loss=0.0
for data in dataloader: #针对dataloader里面一batch_size的数据学习一次
imgs,targets=data #如果dataloader里面只有一个数据,则针对这个数据计算参数
outputs=qq(imgs)
result_loss=loss(outputs,targets) #计算输出和目标之间的差,计入result_loss
optim.zero_grad() #优化器中每一个梯度的参数清零
result_loss.backward() #反向传播,求出每一个节点的梯度
optim.step() #对每一个参数进行调优
running_loss=running_loss+result_loss #记录叠加的损失值
print(running_loss)
output:
#总loss在逐渐变小
# tensor(18712.0938, grad_fn= < AddBackward0 >)
# tensor(16126.7949, grad_fn= < AddBackward0 >)
# tensor(15382.0703, grad_fn= < AddBackward0 >)
running_loss就是 每一轮 所有数据上的 损失之和
现有网络模型的使用及修改------以vgg16模型为例
注意 现在的文档里 vgg16 没有参数 pretrained,取而代之的是 weights。如果不加载预训练参数,传 weights=None;如果仍想加载在 ImageNet 上预训练好的参数,可以传 weights='DEFAULT'(或对应的 VGG16_Weights 枚举值)
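以较新版本的 torchvision(大约 0.13 之后)为例,写法大致如下,具体以当前版本的官方文档为准:
python
import torchvision

# 不加载预训练参数
vgg16_false = torchvision.models.vgg16(weights=None)
# 加载在 ImageNet 上预训练好的参数
vgg16_true = torchvision.models.vgg16(weights=torchvision.models.VGG16_Weights.DEFAULT)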
简单查看一下vgg16模型
python
torchvision.models.vgg16(pretrained: bool = False, progress: bool = True, **kwargs: Any) → torchvision.models.vgg.VGG
参数
- pretrained (bool) -- If True, returns a model pre-trained on ImageNet
- progress (bool) -- If True, displays a progress bar of the download to stderr显示下载进度条
深入了解
python
vgg16_t=torchvision.models.vgg16(pretrained=True)
vgg16_f=torchvision.models.vgg16(pretrained=False)
vgg16_t的weight:
vgg16_f的weight:
vgg16_t的网络结构:
vgg16_f的网络结构:
可以看到vgg16最后输出1000个类,想要将这1000个类改成10个类。对现有网络模型的修改可分为两种方式,一种为添加,另一种为修改。
python
import torch
from torch import nn
import torchvision
vgg_true = torchvision.models.vgg16(pretrained=True)
vgg_false = torchvision.models.vgg16(pretrained=False)
# 添加
vgg_true.add_module("add_model", nn.Linear(in_features=1000, out_features=10))
print(vgg_true)
# 修改
vgg_false.classifier[6] = nn.Linear(in_features=4096, out_features=10)
print(vgg_false)
pretrained=False 改为 weights=None;pretrained=True 改为 weights='DEFAULT'
修改下载路径的方法,在代码开头加入import os 换行os.environ['TORCH_HOME']='G:/pycharm/pycharm download'即可
网络模型的保存与读取
保存
python
import torchvision
import torch
vgg16 = torchvision.models.vgg16(pretrained=False)
# 保存方式1:既保存模型结构,也保存了参数,.pth不是必须的
torch.save(vgg16, "vgg16_model1.pth")
# 保存方式2 : 把参数保存成字典,不保存结构 (官方推荐)
torch.save(vgg16.state_dict(), "vgg16_model2.pth")
print("end")
读取
python
import torch
import torchvision
# 加载方式1 - 保存方式1
model = torch.load("vgg16_model1.pth")
print(model)
# 加载方式2
vgg16 = torchvision.models.vgg16(pretrained=False)
vgg16.load_state_dict(torch.load("vgg16_model2.pth"))
print(vgg16)
以方式一形式读取自定义模型时要先将该模型复制或引用到读取文件中,否则会报错
python
import torch
import torchvision
from torch import nn
# trap with saving method 1
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.conv1 = nn.Conv2d(3, 64, kernel_size=3)
def forward(self, x):
x = self.conv1(x)
return x
tudui = Tudui()
torch.save(tudui, "tudui_method1.pth")
from data_save import * #把定义模型的那个文件 import 进来,保证加载时能找到模型类的定义;不需要再次实例化模型
model = torch.load("tudui_method1.pth") # must visit the definition of the model
print(model)
完整的训练套路
数据集进行下载读取,并进行分组
python
# 读取数据集
trainset = torchvision.datasets.CIFAR10("dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
testset = torchvision.datasets.CIFAR10("dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# 数据分组
train_loader = DataLoader(trainset, 64)
test_loader = DataLoader(testset, 64)
搭建神经网络:
文件model专门用于存放模型,使用时进行调用更符合实际应用场景。
最大池化的stride参数不写的话,就默认与kernel_size参数大小一样,即为2
python
import torch
from torch import nn
class Test(nn.Module):
def __init__(self):
super(Test, self).__init__()
self.model = nn.Sequential(
nn.Conv2d(in_channels=3, out_channels=32, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Conv2d(in_channels=32, out_channels=32, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Flatten(),
nn.Linear(in_features=1024, out_features=64),
nn.Linear(in_features=64, out_features=10)
)
def forward(self, input):
output = self.model(input)
return output
# to test if the neural network is right
if __name__ == '__main__':
input = torch.ones((64, 3, 32, 32))
test = Test()
output = test(input)
print(output.shape)
train.py文件
python
# 创建网络模型
test = Test()
# 定义损失函数
loss_fn = torch.nn.CrossEntropyLoss()
# 定义优化器
learning_rate = 0.001
optimizer = torch.optim.SGD(params=test.parameters(), lr=learning_rate)
# 记录训练次数
train_step = 0
# 记录测试次数
test_step = 0
writer = SummaryWriter("logs")
对模型进行训练
python
# 训练
epoch = 20
for i in range(epoch):
# 训练步骤
print("-------第 {} 轮训练-------".format(i+1))
test.train()
for train_data in train_loader:
imgs, target = train_data
output = test(imgs)
loss = loss_fn(output, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
train_step += 1
if train_step % 100 == 0:
print("第 {} 次训练完成 训练损失:{}".format(train_step, loss.item()))
writer.add_scalar("train_loss", loss.item(), train_step)
打草稿的地方:
python
import torch
a = torch.tensor(5)
print(a) # tensor(5)
print(a.item()) # 5
在测试集上进行测试
训练集 验证集 测试集。验证集与测试集不一样的,验证集是在训练中用的,防止模型过拟合,测试集是在模型完全训练好后使用的
数据集分成3部分,训练集训练神经网络,验证集用来查看网络效果,修改参数等,测试集是最后一步,用来看网络怎么样
在测试的过程中就不需要调优了,只是利用现有的模型进行测试。因此使用 with torch.no_grad(),保证不会计算梯度;否则即使不调用 step(),前向过程也会构建计算图、保存梯度所需的信息,占显存且更慢
python
# 测试步骤
test.eval()
test_loss_sum = 0.0
total_accuracy = 0
with torch.no_grad():
for test_data in test_loader:
imgs, target = test_data
output = test(imgs)
loss = loss_fn(output, target)
accuracy = (output.argmax(1) == target).sum()
test_loss_sum += loss.item()
total_accuracy += accuracy
writer.add_scalar("test_loss", test_loss_sum, test_step)
print("在测试集上的Loss:{}, 正确率:{}".format(test_loss_sum, total_accuracy/len(testset)))
test_step += 1
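关于正确率的计算,(output.argmax(1) == target).sum() 做的事情可以用一个小例子演示(补充示例):
python
import torch

outputs = torch.tensor([[0.1, 0.2],
                        [0.3, 0.4]])
targets = torch.tensor([0, 1])

preds = outputs.argmax(1)        # 每一行取最大值的下标 -> tensor([1, 1])
print(preds)
print((preds == targets).sum())  # tensor(1),两个样本中预测对了 1 个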
全流程源代码
python
import torch
from torch import nn
# 搭建神经网络
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.model = nn.Sequential(
nn.Conv2d(3, 32, 5, 1, 2),
nn.MaxPool2d(2),
nn.Conv2d(32, 32, 5, 1, 2),
nn.MaxPool2d(2),
nn.Conv2d(32, 64, 5, 1, 2),
nn.MaxPool2d(2),
nn.Flatten(),
nn.Linear(64*4*4, 64),
nn.Linear(64, 10)
)
def forward(self, x):
x = self.model(x)
return x
if __name__ == '__main__':
tudui = Tudui()
input = torch.ones((64, 3, 32, 32))
print(input.shape)
output = tudui(input)
print(output.shape)
print(output)
python
torch.Size([64, 3, 32, 32])
torch.Size([64, 10]) # 64张图片,每张图片最终变成长度10
if __name__ == '__main__': 下的代码只有在文件作为脚本直接执行时才会被执行;当该 py 脚本被 import 到其他脚本中时,if __name__ == '__main__': 之后的代码不会被执行。
python
import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
# 网络模型
from model import *
# 准备数据集
train_data = torchvision.datasets.CIFAR10(root="dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10(root="dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# 数据集长度
train_data_size = len(train_data)
test_data_size = len(test_data)
print("训练数据集的长度为:{}".format(train_data_size))
print("测试数据集的长度为:{}".format(test_data_size))
# 利用DataLoader来加载数据集
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
# 创建网络模型
tudui = Tudui()
# 损失函数
loss_fn = nn.CrossEntropyLoss()
# 优化器
# learning_rate = 0.01
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(), lr=learning_rate)
# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 训练测试的次数
total_test_step = 0
# 训练的轮数
epoch = 10
# 添加tensorboard
writer = SummaryWriter("logs_train")
for i in range(epoch):
print("-----------第 {} 轮训练开始-----------".format(i+1))
# 训练步骤开始
tudui.train()
for data in train_dataloader:
imgs, targets = data
outputs = tudui(imgs)
loss = loss_fn(outputs, targets)
# 优化器优化模型
optimizer.zero_grad()
loss.backward() # 注意是对loss
optimizer.step()
total_train_step += 1
if total_train_step % 100 == 0:
print("训练次数:{},Loss:{}".format(total_train_step, loss.item()))
writer.add_scalar("train_loss", loss.item(), total_train_step)
# 测试步骤开始
tudui.eval()
total_test_loss = 0
total_accuracy = 0
with torch.no_grad():
for data in test_dataloader:
imgs, targets = data
outputs = tudui(imgs)
loss = loss_fn(outputs, targets)
total_test_loss += loss
accuracy = (outputs.argmax(1) == targets).sum()
total_accuracy += accuracy
print("整体测试集上的Loss:{}".format(total_test_loss))
print("整体测试集上的正确率:{}".format(total_accuracy/test_data_size))
writer.add_scalar("test_loss", total_test_loss, total_test_step)
writer.add_scalar("test_accuracy", total_accuracy/test_data_size, total_test_step)
total_test_step += 1
torch.save(tudui, "tudui_{}.pth".format(i))
# torch.save(tudui.state_dict(), "tudui_{}.pth".format(i))
print("模型已保存")
writer.close()
python
import torch
import torchvision.datasets
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from MyModel import *
# 准备数据集
train_set = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_set = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# 查看数据集大小
print(len(train_set))
print(len(test_set))
# 打包数据集
train_loader = DataLoader(train_set, batch_size=64, shuffle=True, drop_last=True)
test_loader = DataLoader(test_set, batch_size=64, shuffle=True, drop_last=True)
# 模型
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MyNN().to(DEVICE)
# 损失、优化器以及一些超参数
criteria = nn.CrossEntropyLoss().to(DEVICE)
LR = 0.01
EPOCHS = 20
optimizer = torch.optim.SGD(model.parameters(), lr=LR)
writer = SummaryWriter("train-logs") # tensorboard可视化
batch_num = 0 # 无论处于哪个epoch,当前处在第几批
# 训练
for epoch in range(EPOCHS):
print("-----开始第{}轮epoch训练-----".format(epoch + 1))
current_epoch_batch_num = 0 # 当前epoch中的第几批
model.train() # 训练模式,和eval一样,目的在于是否激活dropout等。
for data in train_loader:
imgs, labels = data
imgs, labels = imgs.to(DEVICE), labels.to(DEVICE)
output = model(imgs)
optimizer.zero_grad()
loss = criteria(output, labels)
loss.backward()
optimizer.step()
current_epoch_batch_num += 1 # 累计当前epoch中的批数
batch_num += 1 # 累计总批数
# 在当前epoch中,每扫过100批才打印出那批的损失
if current_epoch_batch_num % 100 == 0:
print("第{}个epoch中第{}批的损失为:{}".format(epoch + 1, current_epoch_batch_num, loss.item()))
writer.add_scalar("train-loss", scalar_value=loss.item(), global_step=batch_num) # tensorboard可视化
# 测试:固定梯度(准确说是验证,用来确定超参数)
test_loss = 0
test_acc = 0
model.eval() # 测试模式
with torch.no_grad():
for data in test_loader:
imgs, labels = data
imgs, labels = imgs.to(DEVICE), labels.to(DEVICE)
output = model(imgs)
loss = criteria(output, labels)
test_loss += loss.item()
pred = output.argmax(1) # 按轴取得每个样本预测的最大值的索引(索引对应label编号!)
pred_right_sum = (pred == labels).sum() # 比对每个样本预测和真实标记,结果为true/false组成的数组,求和时会自动转换为数字
test_acc += pred_right_sum # 将当前批预测正确的数量累加进总计数器
print("第{}个epoch上的测试集损失为:{}".format(epoch + 1, test_loss))
print("第{}个epoch上整体测试集上的准确率:{}".format(epoch + 1, test_acc / len(test_set)))
writer.add_scalar("test-loss", scalar_value=test_loss, global_step=epoch + 1) # tensorboard可视化
writer.add_scalar("test-acc", scalar_value=test_acc / len(test_set), global_step=epoch + 1)
# 保存每个epoch训练出的模型
torch.save(model, "NO.{}_model.pth".format(epoch + 1)) # 保存模型结构和参数
print("当前epoch的模型已保存")
# torch.save(model.state_dict(), "NO.{}_model_param_dic.pth".format(epoch + 1)) # 仅保存参数字典
writer.flush()
writer.close()
利用GPU训练
方法一:数据(输入、标注) 损失函数 模型 .cuda()
python
import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import torch
from torch import nn
import time
# 准备数据集
train_data = torchvision.datasets.CIFAR10(root="dataset2", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10(root="dataset2", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# 数据集长度
train_data_size = len(train_data)
test_data_size = len(test_data)
print("训练数据集的长度为:{}".format(train_data_size))
print("测试数据集的长度为:{}".format(test_data_size))
# 利用DataLoader来加载数据集
train_dataloader = DataLoader(train_data, batch_size=64, shuffle=True, drop_last=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True, drop_last=True)
# 搭建神经网络
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.model = nn.Sequential(
nn.Conv2d(in_channels=3, out_channels=32, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Conv2d(in_channels=32, out_channels=32, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Flatten(),
nn.Linear(in_features=64*4*4, out_features=64),
nn.Linear(in_features=64, out_features=10)
)
def forward(self, x):
x = self.model(x)
return x
# 创建网络模型
tudui = Tudui()
if torch.cuda.is_available():
tudui = tudui.cuda()
# 损失函数
loss_fn = nn.CrossEntropyLoss()
if torch.cuda.is_available():
loss_fn = loss_fn.cuda()
# 优化器
# learning_rate = 0.01
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(), lr=learning_rate)
# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 训练测试的次数
total_test_step = 0
# 训练的轮数
epoch = 10
# 添加tensorboard
writer = SummaryWriter("logs_train")
start_time = time.time()
for i in range(epoch):
print("-----------第 {} 轮训练开始-----------".format(i+1))
# 训练步骤开始
tudui.train()
for data in train_dataloader:
imgs, targets = data
if torch.cuda.is_available():
imgs = imgs.cuda()
targets = targets.cuda()
outputs = tudui(imgs)
loss = loss_fn(outputs, targets)
# 优化器优化模型
optimizer.zero_grad()
loss.backward() # 注意是对loss
optimizer.step()
total_train_step += 1
if total_train_step % 100 == 0:
end_time = time.time()
print("训练次数:{},总时间:{}".format(total_train_step, end_time - start_time))
print("训练次数:{},Loss:{}".format(total_train_step, loss.item()))
writer.add_scalar("train_loss", loss.item(), total_train_step)
# 测试步骤开始
tudui.eval()
total_test_loss = 0
total_accuracy = 0
with torch.no_grad():
for data in test_dataloader:
imgs, targets = data
if torch.cuda.is_available():
imgs = imgs.cuda()
targets = targets.cuda()
outputs = tudui(imgs)
loss = loss_fn(outputs, targets)
total_test_loss += loss
accuracy = (outputs.argmax(1) == targets).sum()
total_accuracy += accuracy
print("整体测试集上的Loss:{}".format(total_test_loss))
print("整体测试集上的正确率:{}".format(total_accuracy/test_data_size))
writer.add_scalar("test_loss", total_test_loss, total_test_step)
writer.add_scalar("test_accuracy", total_accuracy/test_data_size, total_test_step)
total_test_step += 1
torch.save(tudui, "tudui_{}.pth".format(i))
# torch.save(tudui.state_dict(), "tudui_{}.pth".format(i))
print("模型已保存")
writer.close()
方法二:.to(device)
Device = torch.device("cpu") 或 torch.device("cuda")
torch.device("cuda:0") 指定第一张显卡
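更常见的写法是让代码在有没有 GPU 的机器上都能跑(示意):
python
import torch

# 有 GPU 就用 cuda,没有就退回 cpu
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)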
python
import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import torch
from torch import nn
import time
# 定义训练的设备
device = torch.device("cuda")
# device = torch.device("cuda:0")
# 准备数据集
train_data = torchvision.datasets.CIFAR10(root="dataset2", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10(root="dataset2", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# 数据集长度
train_data_size = len(train_data)
test_data_size = len(test_data)
print("训练数据集的长度为:{}".format(train_data_size))
print("测试数据集的长度为:{}".format(test_data_size))
# 利用DataLoader来加载数据集
train_dataloader = DataLoader(train_data, batch_size=64, shuffle=True, drop_last=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True, drop_last=True)
# 搭建神经网络
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.model = nn.Sequential(
nn.Conv2d(in_channels=3, out_channels=32, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Conv2d(in_channels=32, out_channels=32, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Flatten(),
nn.Linear(in_features=64*4*4, out_features=64),
nn.Linear(in_features=64, out_features=10)
)
def forward(self, x):
x = self.model(x)
return x
# 创建网络模型
tudui = Tudui()
tudui = tudui.to(device)
# 损失函数
loss_fn = nn.CrossEntropyLoss()
loss_fn = loss_fn.to(device)
# 优化器
# learning_rate = 0.01
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(), lr=learning_rate)
# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 训练测试的次数
total_test_step = 0
# 训练的轮数
epoch = 10
# 添加tensorboard
writer = SummaryWriter("logs_train")
start_time = time.time()
for i in range(epoch):
print("-----------第 {} 轮训练开始-----------".format(i+1))
# 训练步骤开始
tudui.train()
for data in train_dataloader:
imgs, targets = data
imgs = imgs.to(device)
targets = targets.to(device)
outputs = tudui(imgs)
loss = loss_fn(outputs, targets)
# 优化器优化模型
optimizer.zero_grad()
loss.backward() # 注意是对loss
optimizer.step()
total_train_step += 1
if total_train_step % 100 == 0:
end_time = time.time()
print("训练次数:{},总时间:{}".format(total_train_step, end_time - start_time))
print("训练次数:{},Loss:{}".format(total_train_step, loss.item()))
writer.add_scalar("train_loss", loss.item(), total_train_step)
# 测试步骤开始
tudui.eval()
total_test_loss = 0
total_accuracy = 0
with torch.no_grad():
for data in test_dataloader:
imgs, targets = data
imgs = imgs.to(device)
targets = targets.to(device)
outputs = tudui(imgs)
loss = loss_fn(outputs, targets)
total_test_loss += loss
accuracy = (outputs.argmax(1) == targets).sum()
total_accuracy += accuracy
print("整体测试集上的Loss:{}".format(total_test_loss))
print("整体测试集上的正确率:{}".format(total_accuracy/test_data_size))
writer.add_scalar("test_loss", total_test_loss, total_test_step)
writer.add_scalar("test_accuracy", total_accuracy/test_data_size, total_test_step)
total_test_step += 1
torch.save(tudui, "tudui_{}.pth".format(i))
# torch.save(tudui.state_dict(), "tudui_{}.pth".format(i))
print("模型已保存")
writer.close()
python
# 定义训练的设备
device = torch.device("cuda:0")
# 创建网络模型
test = Test()
test = test.to(device)
# 定义损失函数
loss_fn = torch.nn.CrossEntropyLoss()
loss_fn = loss_fn.to(device)
for train_data in train_loader:
imgs, target = train_data
imgs = imgs.to(device)
target = target.to(device)
for test_data in test_loader:
imgs, target = test_data
imgs = imgs.to(device)
target = target.to(device)
第二种方法的完整代码
python
import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import torch
from torch import nn
import time
# 定义训练的设备
device = torch.device("cuda")
# device = torch.device("cuda:0")
# 准备数据集
train_data = torchvision.datasets.CIFAR10(root="dataset2", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10(root="dataset2", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# 数据集长度
train_data_size = len(train_data)
test_data_size = len(test_data)
print("训练数据集的长度为:{}".format(train_data_size))
print("测试数据集的长度为:{}".format(test_data_size))
# 利用DataLoader来加载数据集
train_dataloader = DataLoader(train_data, batch_size=64, shuffle=True, drop_last=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True, drop_last=True)
# 搭建神经网络
class Tudui(nn.Module):
def __init__(self):
super(Tudui, self).__init__()
self.model = nn.Sequential(
nn.Conv2d(in_channels=3, out_channels=32, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Conv2d(in_channels=32, out_channels=32, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5, padding=2),
nn.MaxPool2d(kernel_size=2),
nn.Flatten(),
nn.Linear(in_features=64*4*4, out_features=64),
nn.Linear(in_features=64, out_features=10)
)
def forward(self, x):
x = self.model(x)
return x
# 创建网络模型
tudui = Tudui()
tudui = tudui.to(device)
# 损失函数
loss_fn = nn.CrossEntropyLoss()
loss_fn = loss_fn.to(device)
# 优化器
# learning_rate = 0.01
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(), lr=learning_rate)
# 设置训练网络的一些参数
# 记录训练的次数
total_train_step = 0
# 训练测试的次数
total_test_step = 0
# 训练的轮数
epoch = 10
# 添加tensorboard
writer = SummaryWriter("logs_train")
start_time = time.time()
for i in range(epoch):
print("-----------第 {} 轮训练开始-----------".format(i+1))
# 训练步骤开始
tudui.train()
for data in train_dataloader:
imgs, targets = data
imgs = imgs.to(device)
targets = targets.to(device)
outputs = tudui(imgs)
loss = loss_fn(outputs, targets)
# 优化器优化模型
optimizer.zero_grad()
loss.backward() # 注意是对loss
optimizer.step()
total_train_step += 1
if total_train_step % 100 == 0:
end_time = time.time()
print("训练次数:{},总时间:{}".format(total_train_step, end_time - start_time))
print("训练次数:{},Loss:{}".format(total_train_step, loss.item()))
writer.add_scalar("train_loss", loss.item(), total_train_step)
# 测试步骤开始
tudui.eval()
total_test_loss = 0
total_accuracy = 0
with torch.no_grad():
for data in test_dataloader:
imgs, targets = data
imgs = imgs.to(device)
targets = targets.to(device)
outputs = tudui(imgs)
loss = loss_fn(outputs, targets)
total_test_loss += loss
accuracy = (outputs.argmax(1) == targets).sum()
total_accuracy += accuracy
print("整体测试集上的Loss:{}".format(total_test_loss))
print("整体测试集上的正确率:{}".format(total_accuracy/test_data_size))
writer.add_scalar("test_loss", total_test_loss, total_test_step)
writer.add_scalar("test_accuracy", total_accuracy/test_data_size, total_test_step)
total_test_step += 1
torch.save(tudui, "tudui_{}.pth".format(i))
# torch.save(tudui.state_dict(), "tudui_{}.pth".format(i))
print("模型已保存")
writer.close()
完整的模型验证(测试,demo)套路
本节内容:利用已经训练好的模型,然后给它提供输入
和测试集那里是差不多的
验证代码如下:
拿小狗图像作为输入
PNG格式的图像可以具有3个通道(RGB)或4个通道(RGBA),具体取决于图像中是否包含透明度信息
python
import torch
import torchvision.transforms
from PIL import Image
# 网上下载一张狗的图片
img = Image.open("../images/dog.png")
img = img.convert('RGB')
#因为png格式是四个通道,除了RGB三通道之外,还有一个透明通道,所以调用此行只保留颜色通道
#加上这句话之后可以进一步适应不同格式、不同截图软件得到的图片
pipeline = torchvision.transforms.Compose([
torchvision.transforms.Resize((32, 32)),
torchvision.transforms.ToTensor()
])
img = pipeline(img)
print(img.shape)
img = torch.reshape(img, (1, 3, 32, 32)) # reshape一下,增加维度,满足输入格式
print(img.shape)
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load("tudui_9.pth") # 加载模型
model.eval()
with torch.no_grad():
img = img.cuda() # 等价写法:img = img.to(DEVICE),因为模型是在cuda上训练保存的,现在要保持输入数据一致
output = model(img)
print(output.argmax(1)) # tensor([5], device='cuda:0') 索引 5 在 CIFAR10 中对应 dog 类,预测正确
加载模型和加载模型的参数要分开考虑,之前的两种保存方式,一种保存了模型的结构和参数,一种只保存了模型结构。第一种方式保存的话就得引入模型
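以方式二(只保存 state_dict)为例,加载自定义模型时大致是下面这样:先实例化网络结构,再把参数字典加载进去(文件名和模块名只是示意):
python
import torch
from model import Tudui  # 假设模型类定义在 model.py 中(文件名为示意)

model = Tudui()
model.load_state_dict(torch.load("tudui_method2.pth"))  # 文件名为示意
model.eval()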