Four-Layer Neural Network Example (with Backpropagation)

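📘 Network Structure

| Layer | Name | Neurons |
|---|---|---|
| L₁ | Input layer | 2 |
| L₂ | Hidden layer 1 | 2 |
| L₃ | Hidden layer 2 | 2 |
| L₄ | Output layer | 1 |

Activation function: Sigmoid, $\sigma(z) = \frac{1}{1+e^{-z}}$, applied at both hidden layers and at the output layer.

Loss function: mean squared error (MSE)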
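
The backward pass relies on the derivative of the sigmoid, so it may help to spell that out once. The following short derivation is standard calculus and is not specific to this example:

$$
\begin{aligned}
\sigma(z) &= \left(1+e^{-z}\right)^{-1} \\
\sigma'(z) &= \frac{e^{-z}}{\left(1+e^{-z}\right)^{2}}
            = \frac{1}{1+e^{-z}} \cdot \frac{e^{-z}}{1+e^{-z}}
            = \sigma(z)\,\bigl(1-\sigma(z)\bigr)
\end{aligned}
$$

In other words, once an activation $a = \sigma(z)$ is known from the forward pass, its derivative is simply $a(1-a)$, with no need to revisit $z$.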


⚙️ I. Forward Propagation

Input:
$$x_1 = 1, \quad x_2 = 2$$

Target output:
$$y_{true} = 1$$

1️⃣ Weights and Biases

| Connection | Weight matrix | Bias |
|---|---|---|
| L1→L2 | $W_1 = \begin{bmatrix}0.1 & 0.2\\0.3 & 0.4\end{bmatrix}$ | $b_1 = [0.1, 0.2]$ |
| L2→L3 | $W_2 = \begin{bmatrix}0.5 & 0.6\\0.7 & 0.8\end{bmatrix}$ | $b_2 = [0.1, 0.2]$ |
| L3→L4 | $W_3 = \begin{bmatrix}0.9\\1.0\end{bmatrix}$ | $b_3 = [0.3]$ |

2️⃣ Layer-by-Layer Computation

Hidden layer 1 (L₂)

$$z^{(2)} = XW_1 + b_1$$
$$z^{(2)} = [1, 2]\begin{bmatrix}0.1 & 0.2\\0.3 & 0.4\end{bmatrix} + [0.1, 0.2] = [0.8, 1.2]$$

Activation:
$$a^{(2)} = \sigma(z^{(2)}) = [0.68997, 0.76852]$$
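
The hidden layer 1 computation can be reproduced in a few lines of plain Python. This is a minimal sketch, not the article's own code: the helper names `sigmoid` and `dense` are made up for illustration, and the input is treated as a row vector so that `z = x W + b` matches the formula above.

```python
import math

def sigmoid(z):
    """Logistic sigmoid for a scalar."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, W, b):
    """Affine map in the row-vector convention: z_j = sum_i x_i * W[i][j] + b_j."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

x = [1.0, 2.0]
W1 = [[0.1, 0.2], [0.3, 0.4]]
b1 = [0.1, 0.2]

z2 = dense(x, W1, b1)
a2 = [sigmoid(z) for z in z2]

print([round(z, 5) for z in z2])  # [0.8, 1.2]
print([round(a, 5) for a in a2])  # [0.68997, 0.76852]
```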


Hidden layer 2 (L₃)

$$z^{(3)} = a^{(2)}W_2 + b_2$$
$$z^{(3)} = [0.68997, 0.76852]\begin{bmatrix}0.5 & 0.6\\0.7 & 0.8\end{bmatrix} + [0.1, 0.2] = [0.98295, 1.22880]$$

Activation:
$$a^{(3)} = \sigma(z^{(3)}) = [0.7277, 0.7736]$$


Output layer (L₄)

$$z^{(4)} = a^{(3)}W_3 + b_3$$
$$z^{(4)} = [0.7277, 0.7736]\begin{bmatrix}0.9\\1.0\end{bmatrix} + 0.3 = 1.7285$$

Activation:
$$\hat{y} = \sigma(1.7285) = 0.8492$$


Forward propagation results

| Activation | Output |
|---|---|
| $a^{(2)}$ | [0.68997, 0.76852] |
| $a^{(3)}$ | [0.7277, 0.7736] |
| $a^{(4)} = \hat{y}$ | 0.8492 |

Loss:
$$L = \frac{1}{2}(y_{true}-\hat{y})^2 = \frac{1}{2}(1 - 0.8492)^2 = 0.0114$$
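
To check the whole forward pass end to end, the three layers can be chained in a loop. The sketch below assumes the weights and biases from the table in section 1️⃣ and the row-vector convention `z = a W + b`; the `forward` helper and the `params` layout are illustrative choices, not part of the original article. Run with full floating-point precision, it gives ŷ ≈ 0.8492 and L ≈ 0.0114.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, W, b):
    """z_j = sum_i x_i * W[i][j] + b_j (row-vector convention)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

def forward(x, params):
    """Return the activations of every layer, starting with the input itself."""
    activations = [x]
    for W, b in params:
        z = dense(activations[-1], W, b)
        activations.append([sigmoid(v) for v in z])
    return activations

params = [
    ([[0.1, 0.2], [0.3, 0.4]], [0.1, 0.2]),  # W1, b1
    ([[0.5, 0.6], [0.7, 0.8]], [0.1, 0.2]),  # W2, b2
    ([[0.9], [1.0]], [0.3]),                 # W3, b3
]

activations = forward([1.0, 2.0], params)
y_hat = activations[-1][0]
loss = 0.5 * (1.0 - y_hat) ** 2  # half squared error for y_true = 1

for layer, a in enumerate(activations[1:], start=2):
    print(f"a({layer}) =", [round(v, 4) for v in a])
print("loss =", round(loss, 4))
```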


🔁 II. Backpropagation

1️⃣ Output Layer Gradients

$$\frac{dL}{d\hat{y}} = \hat{y} - y_{true} = -0.1508$$
$$\frac{d\hat{y}}{dz^{(4)}} = \hat{y}(1-\hat{y}) = 0.1280$$
$$\frac{dL}{dz^{(4)}} = \frac{dL}{d\hat{y}} \cdot \frac{d\hat{y}}{dz^{(4)}} = -0.0193$$

Weight and bias gradients:
$$dW_3 = a^{(3)T}\frac{dL}{dz^{(4)}} = \begin{bmatrix}0.7277\\0.7736\end{bmatrix}(-0.0193) = \begin{bmatrix}-0.0140\\-0.0149\end{bmatrix}$$
$$db_3 = \frac{dL}{dz^{(4)}} = -0.0193$$
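
The output-layer step is just the chain rule applied to three scalars, which the following sketch makes explicit. It assumes `a3` and `y_hat` have already been obtained from a forward pass with the weights listed earlier (the literals here are those values recomputed and rounded to six decimals); the variable names are illustrative.

```python
# Forward-pass results, recomputed from the weights above and rounded.
a3 = [0.727694, 0.773609]
y_hat = 0.849225
y_true = 1.0

dL_dyhat = y_hat - y_true            # derivative of 0.5 * (y_true - y_hat)^2
dyhat_dz4 = y_hat * (1.0 - y_hat)    # sigmoid'(z4), written via the output
delta4 = dL_dyhat * dyhat_dz4        # dL/dz4 by the chain rule

dW3 = [a * delta4 for a in a3]       # one entry per incoming activation
db3 = delta4                         # the bias enters z4 with coefficient 1

print(round(delta4, 4), [round(g, 4) for g in dW3])
```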


2️⃣ Backpropagating to Hidden Layer 2 (L₃)

Here $\odot$ denotes the element-wise product.

$$\frac{dL}{da^{(3)}} = \frac{dL}{dz^{(4)}}W_3^T = [-0.0174, -0.0193]$$
$$\frac{da^{(3)}}{dz^{(3)}} = a^{(3)} \odot (1-a^{(3)}) = [0.1982, 0.1751]$$
$$\frac{dL}{dz^{(3)}} = \frac{dL}{da^{(3)}} \odot \frac{da^{(3)}}{dz^{(3)}} = [-0.00344, -0.00338]$$

Gradients:
$$dW_2 = a^{(2)T}\frac{dL}{dz^{(3)}} = \begin{bmatrix}-0.00238 & -0.00233\\ -0.00265 & -0.00260\end{bmatrix}$$
$$db_2 = [-0.00344, -0.00338]$$


3️⃣ Backpropagating to Hidden Layer 1 (L₂)

$$\frac{dL}{da^{(2)}} = \frac{dL}{dz^{(3)}}W_2^T = [-0.00375, -0.00511]$$
$$\frac{da^{(2)}}{dz^{(2)}} = a^{(2)} \odot (1-a^{(2)}) = [0.2139, 0.1779]$$
$$\frac{dL}{dz^{(2)}} = \frac{dL}{da^{(2)}} \odot \frac{da^{(2)}}{dz^{(2)}} = [-0.00080, -0.00091]$$

Gradients:
$$dW_1 = X^T\frac{dL}{dz^{(2)}} = \begin{bmatrix}-0.00080 & -0.00091\\ -0.00160 & -0.00182\end{bmatrix}$$
$$db_1 = [-0.00080, -0.00091]$$
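
The three backward steps follow the same pattern (multiply by the sigmoid derivative, form an outer product with the incoming activation, then pass the error through the transposed weights), so they can be written as one loop. The following is a sketch under the same assumptions as the formulas above: row-vector inputs, sigmoid everywhere, and the half squared error loss. The function names and the `params` layout are illustrative, not taken from the article.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    """Return [x, a2, a3, a4] for the row-vector convention z = a W + b."""
    acts = [x]
    for W, b in params:
        prev = acts[-1]
        z = [sum(prev[i] * W[i][j] for i in range(len(prev))) + b[j]
             for j in range(len(b))]
        acts.append([sigmoid(v) for v in z])
    return acts

def backward(acts, params, y_true):
    """Return [(dW, db), ...] in the same order as params."""
    # dL/da at the output for L = 0.5 * (y_true - y_hat)^2
    dA = [a - t for a, t in zip(acts[-1], y_true)]
    grads = []
    for layer in range(len(params) - 1, -1, -1):
        W, _ = params[layer]
        a_in, a_out = acts[layer], acts[layer + 1]
        # delta = dL/dz = dL/da * sigmoid'(z), with sigmoid'(z) = a * (1 - a)
        delta = [d * a * (1.0 - a) for d, a in zip(dA, a_out)]
        dW = [[a_i * d_j for d_j in delta] for a_i in a_in]  # outer product
        grads.append((dW, delta))                            # db equals delta
        # Error passed to the previous layer: dL/da_in = delta W^T
        dA = [sum(W[i][j] * delta[j] for j in range(len(delta)))
              for i in range(len(a_in))]
    return grads[::-1]

params = [
    ([[0.1, 0.2], [0.3, 0.4]], [0.1, 0.2]),  # W1, b1
    ([[0.5, 0.6], [0.7, 0.8]], [0.1, 0.2]),  # W2, b2
    ([[0.9], [1.0]], [0.3]),                 # W3, b3
]

acts = forward([1.0, 2.0], params)
grads = backward(acts, params, [1.0])
for k, (dW, db) in enumerate(grads, start=1):
    print(f"dW{k} =", [[round(g, 5) for g in row] for row in dW])
    print(f"db{k} =", [round(g, 5) for g in db])
```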


🧩 III. Backpropagation Computation Graph

        (x1,x2)
           │
           ▼
       [Hidden1] ----------→  dL/dz2 → dW1, db1
           │
           ▼
       [Hidden2] ----------→  dL/dz3 → dW2, db2
           │
           ▼
       [Output]  ----------→  dL/dz4 → dW3, db3

Or, as a complete arrow diagram:

Forward:  X → Z2 → A2 → Z3 → A3 → Z4 → A4(ŷ)
Backward:        ← dZ2 ← dA2 ← dZ3 ← dA3 ← dZ4 ← dA4
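
A common way to sanity-check the backward path in the diagram is to compare it against finite differences: nudge one weight, rerun the forward path, and see how the loss moves. The sketch below does this with central differences for one weight per layer. The helper `numeric_grad_W` and the step size `eps` are assumptions made for illustration; the estimates should agree with the analytic gradients of the same network to several decimal places.

```python
import copy
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(params, x, y_true):
    """Forward pass followed by the half squared error."""
    a = x
    for W, b in params:
        a = [sigmoid(sum(a[i] * W[i][j] for i in range(len(a))) + b[j])
             for j in range(len(b))]
    return 0.5 * (y_true - a[0]) ** 2

def numeric_grad_W(params, layer, i, j, x, y_true, eps=1e-6):
    """Central-difference estimate of dL/dW[i][j] for the given layer index."""
    plus, minus = copy.deepcopy(params), copy.deepcopy(params)
    plus[layer][0][i][j] += eps
    minus[layer][0][i][j] -= eps
    return (loss(plus, x, y_true) - loss(minus, x, y_true)) / (2 * eps)

params = [
    ([[0.1, 0.2], [0.3, 0.4]], [0.1, 0.2]),  # W1, b1
    ([[0.5, 0.6], [0.7, 0.8]], [0.1, 0.2]),  # W2, b2
    ([[0.9], [1.0]], [0.3]),                 # W3, b3
]
x, y_true = [1.0, 2.0], 1.0

g1 = numeric_grad_W(params, 0, 0, 0, x, y_true)  # estimate of dW1[0][0]
g2 = numeric_grad_W(params, 1, 1, 0, x, y_true)  # estimate of dW2[1][0]
g3 = numeric_grad_W(params, 2, 0, 0, x, y_true)  # estimate of dW3[0][0]
print(round(g1, 5), round(g2, 5), round(g3, 5))
```

If an analytic gradient and its finite-difference estimate disagree by much more than `eps`, the usual suspects are a missing sigmoid derivative or a transposed weight matrix in the backward step.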

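✅ IV. Summary

| Step | Content | Description |
|---|---|---|
| Forward propagation | X → Ŷ | Compute the predicted output |
| Loss computation | L(Ŷ, Y) | Measure the error |
| Backpropagation | dL/dW, dL/db | Compute the gradients |
| Parameter update | W ← W - η·dW, b ← b - η·db | Update the parameters using the learning rate η |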
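
The four steps in the table can be combined into a single training step and repeated. The sketch below is one possible arrangement, not the article's implementation: the learning rate `lr = 0.5` and the 100 iterations are arbitrary values chosen for illustration, and the parameters are updated in place. On this single training sample the loss should shrink from its initial value of about 0.0114.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    acts = [x]
    for W, b in params:
        prev = acts[-1]
        acts.append([sigmoid(sum(prev[i] * W[i][j] for i in range(len(prev))) + b[j])
                     for j in range(len(b))])
    return acts

def train_step(x, y_true, params, lr):
    """One forward pass, one backward pass, one in-place gradient-descent update."""
    acts = forward(x, params)
    loss = 0.5 * (y_true - acts[-1][0]) ** 2
    dA = [acts[-1][0] - y_true]                      # dL/d(y_hat)
    for layer in range(len(params) - 1, -1, -1):
        W, b = params[layer]
        a_in, a_out = acts[layer], acts[layer + 1]
        delta = [d * a * (1.0 - a) for d, a in zip(dA, a_out)]
        # Compute the error for the previous layer before W is modified.
        dA = [sum(W[i][j] * delta[j] for j in range(len(delta)))
              for i in range(len(a_in))]
        for i in range(len(a_in)):
            for j in range(len(delta)):
                W[i][j] -= lr * a_in[i] * delta[j]   # W <- W - lr * dW
        for j in range(len(delta)):
            b[j] -= lr * delta[j]                    # b <- b - lr * db
    return loss

params = [
    ([[0.1, 0.2], [0.3, 0.4]], [0.1, 0.2]),  # W1, b1
    ([[0.5, 0.6], [0.7, 0.8]], [0.1, 0.2]),  # W2, b2
    ([[0.9], [1.0]], [0.3]),                 # W3, b3
]

lr = 0.5  # hypothetical learning rate, not specified in the article
losses = [train_step([1.0, 2.0], 1.0, params, lr) for _ in range(100)]
print("first loss:", round(losses[0], 5))
print("last loss: ", round(losses[-1], 5))
```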