PyTorch Study Notes | From the XOR Problem to Deep Neural Networks

Solving the XOR Problem

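Neural networks lay dormant for decades largely because prominent researchers argued that they (single-layer networks, to be precise) could not solve the XOR problem. Given how popular neural networks are today, that problem has clearly been solved, so today we look at this classic underdog comeback story.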

Before getting to XOR, let's look at the other logical operations and how to implement them in PyTorch.

AND

x0	x1	x2	z
1	0	0	0
1	1	0	0
1	0	1	0
1	1	1	1


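import torch

# Each row is [x0, x1, x2]; x0 is always 1 and serves as the bias input.
X = torch.tensor([[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]], dtype=torch.float32)
w = torch.tensor([-0.2, 0.15, 0.15])  # [bias, weight for x1, weight for x2]
z = torch.tensor([0., 0, 0, 1])       # true AND labels

def Linear(X, w):
    # Matrix-vector product: one weighted sum per sample.
    zhat = torch.mv(X, w)
    return zhat

zhat = Linear(X, w)
sigma = torch.sigmoid(zhat)
# Threshold the sigmoid output at 0.5 to get a 0/1 prediction.
andhat = (sigma >= 0.5).float()
print(andhat)

# tensor([0., 0., 0., 1.])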

OR

x0	x1	x2	z
1	0	0	0
1	1	0	1
1	0	1	1
1	1	1	1

NAND

x0	x1	x2	z
1	0	0	1
1	1	0	1
1	0	1	1
1	1	1	0

Try implementing the logical operations above yourself. Finally, let's look at XOR.

XOR

x0	x1	x2	z
1	0	0	0
1	1	0	1
1	0	1	1
1	1	1	0

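Viewed as a plot, a single-layer network does nothing more than find one straight line that splits the data into two groups, and no straight line can separate the XOR outputs.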
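The claim above can be checked empirically. The following is only a sketch under an assumption of mine: it brute-forces a small integer grid of weights (the range -10 to 10 is an arbitrary choice, not from the original) with a `>= 0` step threshold, and counts how many weight vectors reproduce each truth table. A grid search is an illustration rather than a proof, but it is consistent with the linear-separability argument.

```python
import torch

# Same inputs as above: x0 = 1 is the bias column.
X = torch.tensor([[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]], dtype=torch.float32)
targets = {
    "AND": torch.tensor([0., 0., 0., 1.]),
    "XOR": torch.tensor([0., 1., 1., 0.]),
}

# Hypothetical search grid: every integer weight triple in [-10, 10]^3.
grid = torch.arange(-10, 11, dtype=torch.float32)
W = torch.cartesian_prod(grid, grid, grid)  # shape (9261, 3)

# Column j holds the step-function predictions for weight vector W[j].
preds = (X @ W.T >= 0).float()              # shape (4, 9261)

counts = {}
for name, z in targets.items():
    # A weight vector matches if it is right on all four samples.
    matches = (preds == z.view(4, 1)).all(dim=0)
    counts[name] = int(matches.sum())

print(counts)
```

Many weight vectors on the grid reproduce AND, while none reproduces XOR.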

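So let's add one more layer and see whether that solves it. The idea is that XOR can be built from the gates we already have: XOR(x1, x2) = AND(NAND(x1, x2), OR(x1, x2)).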


import torch

X = torch.tensor([[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]], dtype=torch.float32)

def AND(X):
    w = torch.tensor([-0.2, 0.15, 0.15], dtype=torch.float32)
    zhat = torch.mv(X, w)
    # Step function as the activation: 1 if zhat >= 0, else 0.
    andhat = (zhat >= 0).float()
    return andhat

def OR(X):
    w = torch.tensor([-0.08, 0.15, 0.15], dtype=torch.float32)
    zhat = torch.mv(X, w)
    yhat = (zhat >= 0).float()
    return yhat

def NAND(X):
    w = torch.tensor([0.23, -0.15, -0.15], dtype=torch.float32)  # weights differ from both the AND and OR gates
    zhat = torch.mv(X, w)
    yhat = (zhat >= 0).float()
    return yhat

# First layer: NAND and OR applied to the raw inputs.
sigma_nand = NAND(X)
sigma_or = OR(X)
# Second layer input: a bias column of ones plus the two first-layer outputs.
x0 = torch.tensor([1, 1, 1, 1], dtype=torch.float32)
input_2 = torch.cat((x0.view(4, 1), sigma_nand.view(4, 1), sigma_or.view(4, 1)), dim=1)
print(AND(input_2))

# tensor([0., 1., 1., 0.])

Activation Functions Revisited

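Does this mean a network gets better simply by stacking more layers? No. Each layer on its own is only a linear transformation, and a stack of linear transformations is still a single linear transformation, so an activation function must be added. Activation functions are generally nonlinear, and they are applied at every neuron of every layer except the input layer. Earlier posts already covered sigmoid, sign, ReLU, tanh, and softmax. In my view, the combination of multiple layers and activation functions is the secret behind the success of neural networks.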
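To make the point about linear layers concrete, here is a small sketch. The layer sizes (4, 3, 2), the seed, and the variable names are my own illustrative choices. It stacks two `nn.Linear` layers with no activation in between and checks that the result equals one linear map with weight `W2 @ W1` and bias `W2 @ b1 + b2`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, for reproducibility only

x = torch.rand(5, 4)        # 5 samples, 4 features (illustrative sizes)
linear1 = nn.Linear(4, 3)
linear2 = nn.Linear(3, 2)

with torch.no_grad():
    # Two linear layers applied back to back, no activation in between.
    stacked = linear2(linear1(x))

    # The same computation folded into a single linear map.
    W = linear2.weight @ linear1.weight               # shape (2, 4)
    b = linear2.weight @ linear1.bias + linear2.bias  # shape (2,)
    collapsed = x @ W.T + b

    # With a nonlinearity in between, this folding is no longer valid in general.
    with_relu = linear2(torch.relu(linear1(x)))

same_without_activation = torch.allclose(stacked, collapsed, atol=1e-6)
print(same_without_activation)
```

Without an activation, the two-layer stack has no more expressive power than a single layer, which is why depth alone does not help.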

Going Further

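Next, we use PyTorch to implement the forward pass of a multi-layer neural network. Suppose we have 500 samples with 20 features and labels from 3 classes. We want a three-layer network with the following architecture: the first layer has 13 neurons, the second has 8, and the third is the output layer. The first layer uses ReLU as its activation function and the second uses sigmoid.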


import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(520)  # set the random seed
x = torch.rand((500, 20), dtype=torch.float32)
y = torch.randint(low=0, high=3, size=(500, 1), dtype=torch.float32)

# Subclass nn.Module to define the network architecture
class Model(nn.Module):
    def __init__(self, in_features=10, out_features=2):
        super().__init__()
        self.linear1 = nn.Linear(in_features, 13)
        self.linear2 = nn.Linear(13, 8)
        self.output = nn.Linear(8, out_features)

    def forward(self, x):
        z1 = self.linear1(x)
        sigma1 = torch.relu(z1)        # layer 1 activation: ReLU
        z2 = self.linear2(sigma1)
        sigma2 = torch.sigmoid(z2)     # layer 2 activation: sigmoid
        z3 = self.output(sigma2)
        sigma3 = F.softmax(z3, dim=1)  # output layer: softmax over the classes
        return sigma3

input_ = x.shape[1]        # number of features
output_ = len(y.unique())  # number of classes

net = Model(in_features=input_, out_features=output_)
print(net(x).shape)

# torch.Size([500, 3])
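As a cross-check on the architecture, the same forward pass can be sketched with `nn.Sequential`. This is an alternative formulation of mine, not the code used in these notes, and the helper names (`probs`, `row_sums`, `pred`, `n_params`) are illustrative. It confirms that each softmax row sums to 1 and counts the trainable parameters.

```python
import torch
import torch.nn as nn

torch.manual_seed(520)
x = torch.rand((500, 20), dtype=torch.float32)

# Same layer sizes and activations as the Model class, written as a pipeline.
net = nn.Sequential(
    nn.Linear(20, 13), nn.ReLU(),
    nn.Linear(13, 8), nn.Sigmoid(),
    nn.Linear(8, 3), nn.Softmax(dim=1),
)

with torch.no_grad():  # forward pass only, no gradients needed
    probs = net(x)

row_sums = probs.sum(dim=1)  # each row is a probability distribution
pred = probs.argmax(dim=1)   # predicted class index per sample

# Weights plus biases: (20*13 + 13) + (13*8 + 8) + (8*3 + 3) = 412
n_params = sum(p.numel() for p in net.parameters())
print(probs.shape, n_params)

# torch.Size([500, 3]) 412
```

The weights are randomly initialized and untrained, so the predicted classes carry no meaning yet. The sketch only verifies shapes and the softmax normalization.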