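1. The Logistic Function
The logistic function is the foundation for learning about feedforward neural networks, so before introducing feedforward neural networks we first look at the logistic function.
The logistic function is defined as:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$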
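The logistic function can be viewed as a "squashing" function that squashes any real-valued input into the interval (0,1). When the input is near 0, the sigmoid function is approximately linear; when the input is far out on either side, it is suppressed. The smaller the input, the closer the output is to 0; the larger the input, the closer the output is to 1.
This behavior resembles a biological neuron, which is excited by some inputs (output 1) and inhibited by others (output 0). Compared with the step activation function used by the perceptron, the logistic function is continuous and differentiable, so it has better mathematical properties.
Because of these properties, a neuron equipped with the logistic activation function has the following two characteristics:
(1) Its output can be interpreted directly as a probability, which lets neural networks combine more naturally with statistical learning models;
(2) It can be viewed as a soft gate that controls how much information from other neurons is passed on.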
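The derivative of the logistic function is $\sigma'(x) = \sigma(x)\bigl(1 - \sigma(x)\bigr)$. It is derived as follows:
$$\sigma'(x) = \frac{d}{dx}\left(1 + e^{-x}\right)^{-1} = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}} = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = \sigma(x)\bigl(1 - \sigma(x)\bigr)$$
The graph of the logistic function is shown below: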
data:image/s3,"s3://crabby-images/f192f/f192fc30d7444e7d64679505125878778b72fa54" alt=""
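The squashing behavior and the derivative identity $\sigma'(x) = \sigma(x)(1 - \sigma(x))$ can be checked numerically. The following is a minimal sketch; the function names `logistic` and `logistic_grad` and the sample grid are illustrative choices, not part of the original post.

```python
import numpy as np

def logistic(x):
    """Logistic (sigmoid) function: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def logistic_grad(x):
    """Analytic derivative: sigma(x) * (1 - sigma(x))."""
    s = logistic(x)
    return s * (1.0 - s)

# Sample grid (an arbitrary choice for illustration)
x = np.linspace(-6.0, 6.0, 13)

# Central finite difference as an independent check of the derivative
eps = 1e-6
numeric = (logistic(x + eps) - logistic(x - eps)) / (2 * eps)
max_gap = np.max(np.abs(numeric - logistic_grad(x)))

print(logistic(0.0))            # 0.5, the midpoint of (0, 1)
print(logistic(x).min(), logistic(x).max())
print(max_gap)                  # tiny: the analytic and numeric slopes agree
```

The outputs stay strictly inside (0, 1), and the slope is largest at 0, where the function is close to linear.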
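2. Feedforward Neural Networks (FNN)
A feedforward neural network is in fact built from multiple layers of logistic regression models (continuous nonlinear functions), not from multiple layers of perceptrons (discontinuous nonlinear functions).
In a feedforward neural network, each neuron belongs to a particular layer. The neurons in each layer receive signals from the neurons in the previous layer and send their output to the next layer. Layer 0 is called the input layer, the last layer is called the output layer, and the layers in between are called hidden layers. The network contains no feedback: signals travel in one direction only, from the input layer to the output layer, so the network can be represented as a directed acyclic graph.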
data:image/s3,"s3://crabby-images/4fe18/4fe185f031e18b81058132d903571f640183a3d7" alt=""
Next, we use the neural network below as an example to derive the mathematical model of a feedforward neural network.
data:image/s3,"s3://crabby-images/c0974/c09745c8c3bf761091c5d8daa68141bc19b3285f" alt=""
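In the figure, $a_i^{(j)}$ denotes the activation of neuron $i$ in layer $j$, and $\Theta^{(j)}$ denotes the weight matrix that controls the mapping from layer $j$ to layer $j+1$.
The activation function used here is the logistic function, which we denote by g(x).
Therefore, we have: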
data:image/s3,"s3://crabby-images/8bd1c/8bd1c83f0c6150571c6036586b5cca34f8a006ea" alt=""
data:image/s3,"s3://crabby-images/42708/42708d48b27a1f44d7fdd6c988fded373b4ed922" alt=""
data:image/s3,"s3://crabby-images/66dc7/66dc7a0ca11d05b3636812608174c3366bbc044a" alt=""
data:image/s3,"s3://crabby-images/6a50a/6a50a74da72863ab654643257c830c219ab473a8" alt=""
data:image/s3,"s3://crabby-images/b7cd5/b7cd52849b6d18332899e455bb622ca358bfc8f1" alt=""
data:image/s3,"s3://crabby-images/bc9c1/bc9c1ac5b28876eef321aef584f189b0f15a3854" alt=""
data:image/s3,"s3://crabby-images/d96c1/d96c16dbdf503de8e2c2077787eaa88d884101a4" alt=""
data:image/s3,"s3://crabby-images/5b4d4/5b4d4250b81abce1b42baafa7e07d27270243ffa" alt=""
data:image/s3,"s3://crabby-images/c2038/c2038c262f6e45667ef4205753589b419b1312f7" alt=""
data:image/s3,"s3://crabby-images/ec0ad/ec0ada842e8c0caf78a7f1b74ce5d48f91e67d90" alt=""
data:image/s3,"s3://crabby-images/ef820/ef820f53fb3f9e8eb3fcbf2318bc385c6b8c735c" alt=""
data:image/s3,"s3://crabby-images/8abba/8abbab3eb31e86bae1bca66aadbab67602f8ccc4" alt=""
data:image/s3,"s3://crabby-images/698c3/698c367fb0defc6941488f2f529444d16e787624" alt=""
data:image/s3,"s3://crabby-images/c9e65/c9e65294c43077212ad4645f515229f0d6b2bdb2" alt=""
data:image/s3,"s3://crabby-images/9b706/9b706ab78ed99bdb1d10c83644cb85cad98f8ae5" alt=""
data:image/s3,"s3://crabby-images/39438/39438bc8ff4e0a38f98deb1874e80265aba56c14" alt=""
data:image/s3,"s3://crabby-images/2105a/2105a148c7e8a4a21ed5186a3d5781b37b771eb3" alt=""
We can also write the formulas above in vector form:
data:image/s3,"s3://crabby-images/59d44/59d44c5627819f2a52344422f4c95ee2d8553b53" alt=""
data:image/s3,"s3://crabby-images/3d408/3d4089a22127842634987081b1cec79daa40de63" alt=""
data:image/s3,"s3://crabby-images/ab366/ab366f8309b62fe1fdcc9bde0db84c951a39186b" alt=""
data:image/s3,"s3://crabby-images/b0b8d/b0b8d2a9e1e6a1a03481bcf3e8c793483eca2eca" alt=""
data:image/s3,"s3://crabby-images/58531/5853165cb8d269d53a2630c414cd904d4522450e" alt=""
data:image/s3,"s3://crabby-images/a9797/a979739ff83a1e60b7294ff6e4a7262fd2661f6d" alt=""
data:image/s3,"s3://crabby-images/5d4ba/5d4baf0ddbfbf0dbdcc5b1600d8a582e54d782ca" alt=""
data:image/s3,"s3://crabby-images/5868b/5868bec630c7d6f166829631522e779d8bd85847" alt=""
data:image/s3,"s3://crabby-images/b298d/b298d7d667af800aac42c82e5293a68b2478b308" alt=""
Therefore, the final output of this feedforward neural network is:
data:image/s3,"s3://crabby-images/17725/1772580a53764d0683c55e9af39503ec1d2a5b2e" alt=""
data:image/s3,"s3://crabby-images/2f180/2f180ea4017ba1227931745900fa38b8e718dcde" alt=""
data:image/s3,"s3://crabby-images/4d87e/4d87ece2bdc53dffa98749f5d495a1fad238b156" alt=""
data:image/s3,"s3://crabby-images/27886/27886d098eb7fef7f8f45e1912772f132ce4fceb" alt=""
data:image/s3,"s3://crabby-images/2de64/2de647eba1e828df7cc0d62a3959817a2a67f623" alt=""
data:image/s3,"s3://crabby-images/998e6/998e6ff5a4eb5591ecde173800a85785a157d44c" alt=""
data:image/s3,"s3://crabby-images/1a5d0/1a5d02957589bd382ce98ad1f19a73fbcf8bc6d8" alt=""
As we can see, this is a composite function.
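The composite structure can be sketched in code: each layer prepends a bias unit, multiplies by its weight matrix, and applies g. This is only an illustration under assumptions; the layer sizes `[3, 3, 2, 1]`, the random weights, and the helper name `forward` are hypothetical and do not come from the figure.

```python
import numpy as np

def g(z):
    """Logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, thetas):
    """Return the activations of every layer for a single input vector x.

    thetas[k] maps one layer (with a bias unit prepended) to the next.
    """
    a = np.asarray(x, dtype=float)
    activations = [a]
    for theta in thetas:
        a_bias = np.concatenate(([1.0], a))   # prepend the bias unit
        a = g(theta @ a_bias)                 # next layer's activations
        activations.append(a)
    return activations

# Hypothetical layer sizes and random weights, for illustration only
rng = np.random.default_rng(0)
layer_sizes = [3, 3, 2, 1]
thetas = [rng.normal(size=(n_out, n_in + 1))
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

x = np.array([0.5, -1.0, 2.0])
acts = forward(x, thetas)
h = acts[-1]                                  # network output
print([a.shape for a in acts])
print(h)
```

Because every layer is just g applied to a linear map of the previous layer, the output is g nested inside g, which is what makes the chain rule the natural tool for computing gradients.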
3. The Backpropagation Algorithm (BP)
Here we again use the neural network model from above:
data:image/s3,"s3://crabby-images/c0974/c09745c8c3bf761091c5d8daa68141bc19b3285f" alt=""
Here, $\delta_j^{(l)}$ denotes the error of neuron $j$ in layer $l$.
The loss function of this neural network is:
data:image/s3,"s3://crabby-images/77362/773621606dfd039f30e9cab5181ede71d98de9e5" alt=""
Here we let , and we have
, so that, ignoring the regularization term, we obtain:
data:image/s3,"s3://crabby-images/ff598/ff598253dab38190a6635fe850373812128e8671" alt=""
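As a small numerical illustration of an unregularized loss of this kind, the sketch below assumes the cross-entropy form that the cost computation in Section 4 of this post uses. The function name `cross_entropy_cost` and the toy predictions are hypothetical.

```python
import numpy as np

def cross_entropy_cost(H, Y):
    """Unregularized cross-entropy, averaged over the m rows of H.

    H: predicted probabilities, shape (m, K)
    Y: 0/1 targets, shape (m, K)
    """
    m = H.shape[0]
    return -np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H)) / m

# Toy targets for two samples and two output units
Y = np.array([[1, 0],
              [0, 1]])

# Mostly correct predictions give a small cost ...
H_good = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
# ... while confidently wrong predictions give a large one
H_bad = np.array([[0.1, 0.9],
                  [0.8, 0.2]])

cost_good = cross_entropy_cost(H_good, Y)
cost_bad = cross_entropy_cost(H_bad, Y)
print(cost_good, cost_bad)
```

For `H_good` the cost reduces to -ln(0.9 * 0.8) = -ln(0.72), which makes the example easy to verify by hand.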
The backpropagation algorithm is then derived as follows.
First, let
data:image/s3,"s3://crabby-images/47eae/47eaed9aabe83169fa33d461e3fa0a67fd04067e" alt=""
(= predicted value - true value)
By the chain rule, we have:
data:image/s3,"s3://crabby-images/b79c5/b79c5b692ae780e28528e9582a7608b79d371eab" alt=""
Since
data:image/s3,"s3://crabby-images/75cfe/75cfee70b2344ac090207db38969240213607408" alt=""
it follows that
data:image/s3,"s3://crabby-images/8d341/8d3411104c91cabb97c4b11000e49192fe8cf889" alt=""
Since
data:image/s3,"s3://crabby-images/1e8d8/1e8d8bf87733078f42883a4267f8887335d5ad78" alt=""
it follows that
data:image/s3,"s3://crabby-images/57573/575737c59a872664b93b3d995ea985d5883e8fa4" alt=""
data:image/s3,"s3://crabby-images/fcafa/fcafa63282c08779fff880833f5480a40b2dd927" alt=""
data:image/s3,"s3://crabby-images/dd0c5/dd0c5f1ba71a2abad33b16bd5759caf2bb1d34f6" alt=""
data:image/s3,"s3://crabby-images/184a6/184a64ec22aaad7e1703b13d433fc76b8f8a40ee" alt=""
data:image/s3,"s3://crabby-images/ca6b7/ca6b7b24e8beda4460116b28ff8fc39480bed7eb" alt=""
data:image/s3,"s3://crabby-images/19b84/19b84f3d02c9ab62c4b16356cf4a1ef0561edeb0" alt=""
data:image/s3,"s3://crabby-images/489d4/489d47ff1732e41902383d43708bd8a782656595" alt=""
Therefore,
data:image/s3,"s3://crabby-images/7cab8/7cab895977e511a52c0547f39118b9a252f2e935" alt=""
Next, let us first derive:
First,
By the chain rule, we have:
data:image/s3,"s3://crabby-images/98260/9826046c42af339f2a5f6abfd6c884ec97d7e93b" alt=""
Given ,
and since
data:image/s3,"s3://crabby-images/f1b41/f1b418ac56b20df100d7121a6050415ac2c5c70f" alt=""
we have:
data:image/s3,"s3://crabby-images/8f586/8f586ab088d7977ece0ad3fcded840c1cb90581e" alt=""
Therefore, we have:
data:image/s3,"s3://crabby-images/619e6/619e6a93224907ada271500a4d66751eaa9b2fb7" alt=""
Next, we derive:
data:image/s3,"s3://crabby-images/c4b5c/c4b5c857d65b0cdb52c0dc56ebe92704a78f3169" alt=""
Given ,
,
and since
data:image/s3,"s3://crabby-images/d0765/d0765a2cb8ca1e1e67865c3c7fe1427b20b851f7" alt=""
we have:
data:image/s3,"s3://crabby-images/787c2/787c2d0c2505d30b5b3bc9c547159ed9b7381fd6" alt=""
Therefore, we have:
data:image/s3,"s3://crabby-images/d8a42/d8a42a58241d43af7cab5d2141078e2927568de1" alt=""
Next, we continue by deriving :
By the chain rule, we have:
data:image/s3,"s3://crabby-images/2d209/2d209b03f865b054f1489b2933f0cbc12296a6b0" alt=""
Given ,
,
and since
data:image/s3,"s3://crabby-images/ec0ad/ec0ada842e8c0caf78a7f1b74ce5d48f91e67d90" alt=""
data:image/s3,"s3://crabby-images/ef820/ef820f53fb3f9e8eb3fcbf2318bc385c6b8c735c" alt=""
we have:
data:image/s3,"s3://crabby-images/e71f7/e71f73f7182927106b6dc3b2ad7bb60f54734433" alt=""
Therefore,
data:image/s3,"s3://crabby-images/bac93/bac933c2a7c00e48d5533b0af787683745e4513e" alt=""
Next, we continue the derivation:
By the chain rule, we have:
data:image/s3,"s3://crabby-images/d4c56/d4c566d7daa632a10e3c5dbec1ae48c24f8dffd2" alt=""
Given ,
,
,
and since
data:image/s3,"s3://crabby-images/6a50a/6a50a74da72863ab654643257c830c219ab473a8" alt=""
data:image/s3,"s3://crabby-images/b7cd5/b7cd52849b6d18332899e455bb622ca358bfc8f1" alt=""
we have:
data:image/s3,"s3://crabby-images/5b7f4/5b7f4a907fc88c79b4ec667c1069220f2fc25da3" alt=""
Therefore,
data:image/s3,"s3://crabby-images/6c5f9/6c5f9150775fe8023c50fc00789df58ceab54b5c" alt=""
data:image/s3,"s3://crabby-images/d1622/d1622871bafb37cea348a946074d71f39a498f3f" alt=""
data:image/s3,"s3://crabby-images/51ea4/51ea44159b615b80d4eccd5cd906b3ba93543c53" alt=""
In summary, we have:
data:image/s3,"s3://crabby-images/7cab8/7cab895977e511a52c0547f39118b9a252f2e935" alt=""
data:image/s3,"s3://crabby-images/1baea/1baeaa94c8b714757cb30730a2be3286f6cd6add" alt=""
data:image/s3,"s3://crabby-images/48a7a/48a7ad5b7f3cc1df53488d124f9a41a56ca215ab" alt=""
Therefore, we have:
data:image/s3,"s3://crabby-images/b3753/b3753592ba4c3b0bc97feaffa79b9803f33e7419" alt=""
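The chain-rule pattern above (an output error equal to prediction minus truth, propagated backwards layer by layer) can be sketched and checked against finite differences. This is a sketch under assumptions: it uses a single sample, the unregularized cross-entropy cost, logistic activations in every layer, and the standard recursion in which the previous layer's error is the transposed weights times the current error, scaled by a(1 - a). The layer sizes, random weights, and the names `backprop` and `cost` are hypothetical.

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(thetas, x, y):
    """Unregularized cross-entropy for a single sample."""
    a = x
    for theta in thetas:
        a = g(theta @ np.concatenate(([1.0], a)))
    return -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a))

def backprop(thetas, x, y):
    """Gradients of the cost with respect to every weight matrix."""
    acts = []                                  # bias-augmented inputs to each layer
    a = x
    for theta in thetas:
        a_bias = np.concatenate(([1.0], a))
        acts.append(a_bias)
        a = g(theta @ a_bias)
    delta = a - y                              # output error: prediction - truth
    grads = [None] * len(thetas)
    for l in range(len(thetas) - 1, -1, -1):
        grads[l] = np.outer(delta, acts[l])    # dJ/dTheta = delta * a^T
        if l > 0:
            a_prev = acts[l][1:]               # drop the bias unit
            # Propagate the error back through the weights and g'
            delta = (thetas[l][:, 1:].T @ delta) * a_prev * (1 - a_prev)
    return grads

# Hypothetical small network, for illustration only
rng = np.random.default_rng(1)
sizes = [3, 4, 3, 2]
thetas = [rng.normal(scale=0.5, size=(n_out, n_in + 1))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]
x = rng.normal(size=3)
y = np.array([1.0, 0.0])
grads = backprop(thetas, x, y)

# Gradient check with central finite differences
eps = 1e-6
max_err = 0.0
for l, theta in enumerate(thetas):
    for idx in np.ndindex(theta.shape):
        orig = theta[idx]
        theta[idx] = orig + eps
        plus = cost(thetas, x, y)
        theta[idx] = orig - eps
        minus = cost(thetas, x, y)
        theta[idx] = orig                      # restore the weight
        numeric = (plus - minus) / (2 * eps)
        max_err = max(max_err, abs(numeric - grads[l][idx]))
print(max_err)
```

A very small `max_err` indicates that the backpropagated gradients agree with the numerical ones, which is a common way to validate a hand-derived backward pass.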
4. Handwritten Digit Recognition with a Feedforward Neural Network
First, inspect the handwritten-digit dataset:
```python
from scipy.io import loadmat

# Load the dataset from the .mat file
data = loadmat("C:\\Users\\LEGION\\Documents\\Tencent Files\\215503595\\FileRecv\\hw11data.mat")
X = data['X']
y = data['y']
print('X type:', type(X))
print('X shape:', X.shape)
print('y type:', type(y))
print('y shape:', y.shape)
```
```
X type: <class 'numpy.ndarray'>
X shape: (5000, 400)
y type: <class 'numpy.ndarray'>
y shape: (5000, 1)
```
Next, randomly select 100 rows from the dataset and render them as images:
```python
from random import sample

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Randomly select 100 distinct row indices
R = sample(range(5000), 100)
X_choose = np.zeros((100, 400))
for i in range(100):
    X_choose[i, :] = X[R[i], :]

# Reshape each selected row into a 20x20 matrix
X_matrix = [X_choose[i].reshape([20, 20]).T for i in range(100)]

# Draw the 100 images on a 10x10 grid
fig = plt.figure()
for i in range(100):
    ax = fig.add_subplot(10, 10, i + 1)
    ax.imshow(X_matrix[i], interpolation='nearest')
plt.show()
```
data:image/s3,"s3://crabby-images/be7b4/be7b455e1908cb008af0eee61e77f24f40753151" alt=""
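The row selection and reshaping can also be written without explicit loops. The sketch below uses a random array `X_demo` as a hypothetical stand-in for the real `X`, so it runs without the data file; only the shapes match the dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.random((5000, 400))      # hypothetical stand-in for X

# Draw 100 distinct row indices, then select the rows with fancy indexing
idx = rng.choice(X_demo.shape[0], size=100, replace=False)
X_choose = X_demo[idx]                # shape (100, 400)

# Reshape every row to 20x20 and transpose each image in one step
images = X_choose.reshape(100, 20, 20).transpose(0, 2, 1)
print(X_choose.shape, images.shape)
```

Each `images[i]` equals `X_choose[i].reshape(20, 20).T`, so the result can be passed to `imshow` in the same way as the list built in the loop version.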
Inspect the pre-trained weights:
```python
from scipy.io import loadmat

# Load the pre-trained weight matrices
weights = loadmat("C:\\Users\\LEGION\\Documents\\Tencent Files\\215503595\\FileRecv\\hw11weights.mat")
theta1 = weights['Theta1']
theta2 = weights['Theta2']
print('theta1 type:', type(theta1))
print('theta1 shape:', theta1.shape)
print('theta2 type:', type(theta2))
print('theta2 shape:', theta2.shape)
```
```
theta1 type: <class 'numpy.ndarray'>
theta1 shape: (25, 401)
theta2 type: <class 'numpy.ndarray'>
theta2 shape: (10, 26)
```
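Compute the accuracy of the feedforward neural network on handwritten digit recognition:
```python
import numpy as np

# Uses X, y, theta1 and theta2 loaded in the previous cells

# Prepend the bias element 1 to every sample
X0 = X.tolist()
for i in range(5000):
    X0[i].insert(0, 1)
X1 = np.array(X0)

# First-layer computation
Z1 = []  # second-layer inputs for all 5000 samples
for i in range(5000):
    a = np.dot(theta1, X1[i].T)
    z1 = (a.T).tolist()
    Z1.append(z1)

# Apply the logistic function
Y1 = []
for i in range(5000):
    y0 = []
    for j in range(25):
        b = 1 / (1 + np.exp(-Z1[i][j]))
        y0.append(b)
    Y1.append(y0)

# Prepend the bias element 1 to the hidden-layer activations
for i in range(5000):
    Y1[i].insert(0, 1)
Y2 = np.array(Y1)

# Second-layer computation
Z2 = []  # third-layer inputs for all 5000 samples
for i in range(5000):
    a = np.dot(theta2, Y2[i].T)
    z2 = (a.T).tolist()
    Z2.append(z2)

# Apply the logistic function
Y2 = []
for i in range(5000):
    y0 = []
    for j in range(10):
        c = 1 / (1 + np.exp(-Z2[i][j]))
        y0.append(c)
    Y2.append(y0)

# Convert the outputs to predicted labels (1 to 10)
Y = []
for i in range(5000):
    s = Y2[i].index(max(Y2[i]))
    Y.append(s + 1)

# Compute the prediction accuracy
n = 0
for i in range(5000):
    if y[i] == Y[i]:
        n += 1
pre_ratio = n / 5000
print("Neural network prediction accuracy: {}".format(pre_ratio))
```
```
Neural network prediction accuracy: 0.9752
```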
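The same forward pass can be sketched in vectorized form, processing all samples with two matrix products instead of nested loops. This is an alternative formulation, not the code used to produce the result above. The arrays `X_demo`, `theta1_demo`, `theta2_demo`, and `y_demo` are random, hypothetical stand-ins that only share the shapes (400 inputs, 25 hidden units, 10 outputs) of the real data and weights, so the accuracy they give is meaningless.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, theta1, theta2):
    """Forward pass for all rows of X; returns (probabilities, labels 1..K)."""
    m = X.shape[0]
    a1 = np.hstack([np.ones((m, 1)), X])                         # add bias column
    a2 = np.hstack([np.ones((m, 1)), sigmoid(a1 @ theta1.T)])    # hidden layer
    a3 = sigmoid(a2 @ theta2.T)                                  # output layer
    return a3, np.argmax(a3, axis=1) + 1                         # labels start at 1

# Hypothetical random stand-ins with the same shapes as the real arrays
rng = np.random.default_rng(0)
X_demo = rng.random((50, 400))
theta1_demo = rng.normal(scale=0.1, size=(25, 401))
theta2_demo = rng.normal(scale=0.1, size=(10, 26))
y_demo = rng.integers(1, 11, size=(50, 1))

probs, labels = predict(X_demo, theta1_demo, theta2_demo)
accuracy = np.mean(labels == y_demo.ravel())
print(probs.shape, labels.shape, accuracy)
```

With the real arrays, the call would be `predict(X, theta1, theta2)`; `np.argmax` returns the first maximum in each row, which matches `list.index(max(...))` in the loop version.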
Compute the value of the loss function:
```python
from scipy.io import loadmat
import numpy as np

# Load the data and the pre-trained weights
data = loadmat("C:\\Users\\LEGION\\Documents\\Tencent Files\\215503595\\FileRecv\\hw11data.mat")
X = data['X']
y = data['y']
weights = loadmat("C:\\Users\\LEGION\\Documents\\Tencent Files\\215503595\\FileRecv\\hw11weights.mat")
theta1 = weights['Theta1']
theta2 = weights['Theta2']

# Run the forward pass of the network
# Prepend the bias element 1 to every sample
X0 = X.tolist()
for i in range(5000):
    X0[i].insert(0, 1)
X1 = np.array(X0)

# First-layer computation
Z1 = []  # second-layer inputs for all 5000 samples
for i in range(5000):
    a = np.dot(theta1, X1[i].T)
    z1 = (a.T).tolist()
    Z1.append(z1)

# Apply the logistic function
Y1 = []
for i in range(5000):
    y0 = []
    for j in range(25):
        b = 1 / (1 + np.exp(-Z1[i][j]))
        y0.append(b)
    Y1.append(y0)

# Prepend the bias element 1 to the hidden-layer activations
for i in range(5000):
    Y1[i].insert(0, 1)
Y2 = np.array(Y1)

# Second-layer computation
Z2 = []  # third-layer inputs for all 5000 samples
for i in range(5000):
    a = np.dot(theta2, Y2[i].T)
    z2 = (a.T).tolist()
    Z2.append(z2)

# Apply the logistic function
Y2 = []
for i in range(5000):
    y0 = []
    for j in range(10):
        c = 1 / (1 + np.exp(-Z2[i][j]))
        y0.append(c)
    Y2.append(y0)

# Convert the outputs to predicted labels (1 to 10)
Y = []
for i in range(5000):
    s = Y2[i].index(max(Y2[i]))
    Y.append(s + 1)

# Compute the cross-entropy loss (no regularization)
cost = 0
for i in range(5000):
    cost0 = 0
    d = [0 for _ in range(10)]   # one-hot encoding of the true label
    d[y[i][0] - 1] = 1
    for j in range(10):
        p = d[j] * np.log(Y2[i][j]) + (1 - d[j]) * np.log(1 - Y2[i][j])
        cost0 = cost0 + p
    cost = cost + cost0
cost = cost * (-1 / 5000)
print("Loss function value: {}".format(cost))
```
```
Loss function value: 0.2876291651613188
```
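The one-hot encoding and the double loop over samples and classes can be condensed into array operations. The sketch below is an illustrative alternative rather than the code that produced the value above; `one_hot`, `cost_vectorized`, `H_demo`, and `y_demo` are hypothetical names, and the demo values are random, so only the agreement with the loop form is meaningful.

```python
import numpy as np

def one_hot(y, num_labels):
    """Row i has a 1 in column y[i] - 1 (labels run from 1 to num_labels)."""
    y = np.asarray(y).ravel().astype(int)
    Y = np.zeros((y.size, num_labels))
    Y[np.arange(y.size), y - 1] = 1
    return Y

def cost_vectorized(H, y, num_labels=10):
    """Unregularized cross-entropy averaged over the rows of H."""
    Y = one_hot(y, num_labels)
    m = H.shape[0]
    return -np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H)) / m

# Hypothetical demo outputs and labels (same layout as Y2 and y)
rng = np.random.default_rng(0)
H_demo = rng.uniform(0.05, 0.95, size=(6, 10))
y_demo = np.array([[1], [10], [3], [3], [7], [2]])

J = cost_vectorized(H_demo, y_demo)

# Loop form, mirroring the per-sample, per-class accumulation
J_loop = 0.0
for i in range(6):
    d = [0] * 10
    d[y_demo[i][0] - 1] = 1
    for j in range(10):
        J_loop += d[j] * np.log(H_demo[i][j]) + (1 - d[j]) * np.log(1 - H_demo[i][j])
J_loop *= -1 / 6
print(J, J_loop)
```

The two values agree up to floating-point rounding; the summation order differs, so the last printed digits may not match exactly.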