- 🍨 This post is a learning-record blog from the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp)
- 🍖 Original author: K同学啊
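Preface
- LSTM is a classic model and also a fairly complex one; it is usually best studied after RNN and GRU. Explanations of the GRU and LSTM models will be published over the next couple of days. In particular:
- This post: a fire-prediction study based on an LSTM model. It covers how to build the time-series data, how to build the model, the LSTM API in PyTorch, and dynamic learning-rate adjustment, and finally evaluates the model with RMSE and R2.
- Feel free to bookmark and follow; I will keep updating.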
1. Importing and Exploring the Data
1. Importing libraries
python
import torch
import torch.nn as nn
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# Set the figure resolution
plt.rcParams['savefig.dpi'] = 500  # resolution of saved figures
plt.rcParams['figure.dpi'] = 500  # resolution of displayed figures
device = "cpu"
device
'cpu'
2. Loading the data
python
data_df = pd.read_csv('./woodpine2.csv')
data_df.head()
| | Time | Tem1 | CO 1 | Soot 1 |
|---|---|---|---|---|
| 0 | 0.000 | 25.0 | 0.0 | 0.0 |
| 1 | 0.228 | 25.0 | 0.0 | 0.0 |
| 2 | 0.456 | 25.0 | 0.0 | 0.0 |
| 3 | 0.685 | 25.0 | 0.0 | 0.0 |
| 4 | 0.913 | 25.0 | 0.0 | 0.0 |
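The data come from an experiment and were sampled at regular intervals:
- Time: starts at 0.000 and increases in steps of roughly 0.228.
- Tem1: temperature; the unit is probably degrees Celsius (°C).
- CO 1: carbon monoxide (CO) concentration.
- Soot 1: soot (carbon black) concentration.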
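The claim that Time advances in near-constant steps can be checked directly. The following is a minimal sketch that uses only the five Time values printed above rather than the full CSV; the tolerance of 0.002 is an assumption chosen to absorb rounding in the printed values.

```python
import numpy as np

# Time values copied from the data_df.head() output above
times = np.array([0.000, 0.228, 0.456, 0.685, 0.913])

# Differences between consecutive timestamps
steps = np.diff(times)

# Tolerance is an illustrative assumption to absorb rounding
is_regular = bool(np.all(np.abs(steps - 0.228) < 0.002))
print(steps.round(3), is_regular)
```

On the real data, the same check could be run on `data_df['Time'].values` before deciding to drop the column.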
python
# Inspect column types and non-null counts
data_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5948 entries, 0 to 5947
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Time 5948 non-null float64
1 Tem1 5948 non-null float64
2 CO 1 5948 non-null float64
3 Soot 1 5948 non-null float64
dtypes: float64(4)
memory usage: 186.0 KB
python
# Count missing values per column
data_df.isnull().sum()
Time 0
Tem1 0
CO 1 0
Soot 1 0
dtype: int64
3. Data visualization
Time is sampled at fixed intervals, so the useful features are temperature, CO, and Soot.
python
_, ax = plt.subplots(1, 3, constrained_layout=True, figsize=(14, 3))  # constrained_layout=True adjusts subplot spacing automatically
sns.lineplot(data=data_df['Tem1'], ax=ax[0])
sns.lineplot(data=data_df['CO 1'], ax=ax[1])
sns.lineplot(data=data_df['Soot 1'], ax=ax[2])
plt.show()
4. Correlation analysis (heatmap)
python
columns = ['Tem1', 'CO 1', 'Soot 1']
plt.figure(figsize=(8, 6))
sns.heatmap(data=data_df[columns].corr(), annot=True, fmt=".2f")
plt.show()
python
# Summary statistics
data_df.describe()
| | Time | Tem1 | CO 1 | Soot 1 |
|---|---|---|---|---|
| count | 5948.000000 | 5948.000000 | 5948.000000 | 5948.000000 |
| mean | 226.133238 | 152.534919 | 0.000035 | 0.000222 |
| std | 96.601445 | 77.026019 | 0.000022 | 0.000144 |
| min | 0.000000 | 25.000000 | 0.000000 | 0.000000 |
| 25% | 151.000000 | 89.000000 | 0.000015 | 0.000093 |
| 50% | 241.000000 | 145.000000 | 0.000034 | 0.000220 |
| 75% | 310.000000 | 220.000000 | 0.000054 | 0.000348 |
| max | 367.000000 | 307.000000 | 0.000080 | 0.000512 |
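When I first saw correlations of 1 (as displayed with two decimals), I was stunned. The summary statistics alone did not explain it, but the plots above are convincing: as the temperature rises, the CO and Soot concentrations rise with it. This matches a fire scenario, so the data look fine.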
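A correlation shown as 1.00 does not have to be exactly 1: the heatmap uses `fmt=".2f"`, so any coefficient of 0.995 or above is displayed that way. The sketch below illustrates this with synthetic stand-in series (an assumption, not the real data): a temperature ramp over the observed range and a CO-like series that follows it with a small wobble.

```python
import numpy as np

# Synthetic stand-ins (assumptions, not the real data): temperature over the
# observed range and a CO-like series that follows it with a small wobble
temp = np.linspace(25.0, 307.0, 500)
co = (temp - 25.0) * (8e-5 / 282.0) + 1e-6 * np.sin(temp)

# Pearson correlation coefficient between the two series
r = np.corrcoef(temp, co)[0, 1]

# Same two-decimal formatting as fmt=".2f" in the heatmap
label = f"{r:.2f}"
print(r, label)
```

Printing `data_df[columns].corr()` without rounding would show how close to 1 the real coefficients are.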
5. Feature extraction
python
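# The time interval is constant, so the Time column is dropped
data = data_df.iloc[:, 1:].copy()  # copy so later column assignments do not write into a slice of data_df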
data.head(3)
| | Tem1 | CO 1 | Soot 1 |
|---|---|---|---|
| 0 | 25.0 | 0.0 | 0.0 |
| 1 | 25.0 | 0.0 | 0.0 |
| 2 | 25.0 | 0.0 | 0.0 |
python
data.tail(3)
| | Tem1 | CO 1 | Soot 1 |
|---|---|---|---|
| 5945 | 292.0 | 0.000077 | 0.000491 |
| 5946 | 291.0 | 0.000076 | 0.000489 |
| 5947 | 290.0 | 0.000076 | 0.000487 |
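The features differ greatly in scale, so each one is rescaled to [0, 1] with min-max normalization.
2. Building the Time-Series Data
1. Data normalization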
python
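from sklearn.preprocessing import MinMaxScaler
# `sc` is fitted on Tem1 only, so it can later invert the temperature predictions.
# Refitting one scaler on every column in a loop would leave it fitted on 'Soot 1'.
sc = MinMaxScaler()
data['Tem1'] = sc.fit_transform(data['Tem1'].values.reshape(-1, 1))
for col in ['CO 1', 'Soot 1']:
    data[col] = MinMaxScaler().fit_transform(data[col].values.reshape(-1, 1))
# Check the shape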
data.shape
(5948, 3)
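Min-max normalization maps each value x to (x - min) / (max - min), so undoing it requires the min and max of that same column. The sketch below writes the two mappings out in NumPy; the helper names `minmax_fit`, `minmax_transform`, and `minmax_inverse` are hypothetical, and the sample values are illustrative, chosen to span the ranges reported by `describe()`.

```python
import numpy as np

def minmax_fit(x):
    """Return the (min, max) pair that defines the scaling of one column."""
    return float(np.min(x)), float(np.max(x))

def minmax_transform(x, lo, hi):
    """Map values from [lo, hi] to [0, 1]."""
    return (x - lo) / (hi - lo)

def minmax_inverse(z, lo, hi):
    """Map scaled values back to the original range [lo, hi]."""
    return z * (hi - lo) + lo

# Illustrative values spanning the ranges in describe() (Tem1: 25..307, Soot 1: 0..0.000512)
tem = np.array([25.0, 145.0, 307.0])
soot = np.array([0.0, 0.000220, 0.000512])

tem_range = minmax_fit(tem)
soot_range = minmax_fit(soot)

tem_scaled = minmax_transform(tem, *tem_range)
restored = minmax_inverse(tem_scaled, *tem_range)     # correct: Tem1's own range
mismatched = minmax_inverse(tem_scaled, *soot_range)  # wrong: Soot 1's range
print(restored, mismatched)
```

Inverting with another column's range returns numbers on that column's scale, which is why the scaler used for `inverse_transform` on temperature predictions has to be the one fitted on `Tem1`.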
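2. Building the windowed dataset
An LSTM expects input of shape (samples, time steps, features). For this data:
- Samples: 5948 rows in the raw data (5939 windows after the sliding-window construction below)
- Time steps: set to 8 in this post
- That is, every 8 consecutive rows of features (Tem1, CO 1, Soot 1) form one window, and the Tem1 value at the 9th time step is y (temperature). Fire prediction here essentially means predicting temperature.
- Features: 3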
python
width_x = 8
width_y = 1
# Build the windowed X and y (explained above)
X, y = [], []
# Slide the window start over the rows
for start_position in range(len(data)):
    in_end = start_position + width_x
    out_end = in_end + width_y
    # `<` is kept from the original; it skips the very last possible window
    if out_end < len(data):
        # Collect one window and its target
        X_ = np.array(data.iloc[start_position:in_end, :])
        y_ = np.array(data.iloc[in_end:out_end, 0])
        X.append(X_)
        y.append(y_)
# Convert to an array
X = np.array(X)
# Reshape y to (samples, 1, 1) so it matches the model output
y = np.array(y).reshape(-1, 1, 1)
X.shape, y.shape
((5939, 8, 3), (5939, 1, 1))
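The same windows can be built without a Python loop. The sketch below is an alternative to the loop above, not the author's method; it uses NumPy's `sliding_window_view` (available from NumPy 1.20), and the helper name `make_windows` and the toy array are assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(values, width_x=8):
    """Build (X, y): X holds width_x consecutive rows, y the next row's first column."""
    values = np.asarray(values, dtype=np.float32)
    # Shape (rows - width_x + 1, features, width_x): the window axis is appended last
    windows = sliding_window_view(values, window_shape=width_x, axis=0)
    # Drop the final window (no row follows it) and reorder to (samples, width_x, features)
    X = windows[:-1].transpose(0, 2, 1)
    # Target: column 0 of the row right after each window
    y = values[width_x:, 0].reshape(-1, 1, 1)
    return X, y

# Toy stand-in for the scaled DataFrame: 12 rows, 3 features
toy = np.arange(36, dtype=np.float32).reshape(12, 3)
X_toy, y_toy = make_windows(toy, width_x=8)
print(X_toy.shape, y_toy.shape)
```

Applied to 5948 rows, this version would yield 5940 windows, one more than the loop, because the loop's `out_end < len(data)` condition skips the last valid window. The returned `X` is a read-only view, so it should be copied (for example by `torch.tensor(X)`) before being modified.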
3. Splitting and loading the dataset
1. Splitting the data
python
# The first 5000 samples form the training set; the rest form the test set
X_train = torch.tensor(np.array(X[:5000, ]), dtype=torch.float32)
X_test = torch.tensor(np.array(X[5000:, ]), dtype=torch.float32)
y_train = torch.tensor(np.array(y[:5000, ]), dtype=torch.float32)
y_test = torch.tensor(np.array(y[5000:, ]), dtype=torch.float32)
X_train.shape, y_train.shape
(torch.Size([5000, 8, 3]), torch.Size([5000, 1, 1]))
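Building the datasets:
- TensorDataset is a PyTorch class that combines two or more tensors into one dataset. Each sample consists of an input tensor and a target tensor (in the dataset built here, each input corresponds to one output).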
python
from torch.utils.data import TensorDataset, DataLoader
batch_size = 64
train_dl = DataLoader(TensorDataset(X_train, y_train),
batch_size=batch_size,
shuffle=True)
test_dl = DataLoader(TensorDataset(X_test, y_test),
batch_size=batch_size,
shuffle=False)
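To see what the loaders yield, the sketch below wraps random stand-in tensors (an assumption, with the same per-sample shapes as `X_train` and `y_train`) in a `TensorDataset` and iterates a `DataLoader` over them.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Random stand-ins with the same per-sample shapes as X_train / y_train
X_demo = torch.rand(100, 8, 3)
y_demo = torch.rand(100, 1, 1)

demo_ds = TensorDataset(X_demo, y_demo)
demo_dl = DataLoader(demo_ds, batch_size=64, shuffle=False)

# Indexing the dataset returns one (input, target) pair
x0, y0 = demo_ds[0]

# Iterating the loader returns batches; the last one holds the remainder
batch_shapes = [(xb.shape, yb.shape) for xb, yb in demo_dl]
print(len(demo_ds), len(demo_dl), batch_shapes)
```

With 100 samples and a batch size of 64, the loader produces one full batch and one batch of 36, so code that consumes batches should not assume a fixed batch dimension.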
3. Building the Model
The `nn.LSTM` API
Constructor
python
torch.nn.LSTM(input_size, hidden_size, num_layers=1, bias=True, batch_first=False, dropout=0, bidirectional=False, proj_size=0)
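- `input_size` (`int`): number of input features at each time step.
- `hidden_size` (`int`): number of features in the hidden state (h) of the LSTM layer. This is also the number of output features, unless `proj_size` is specified.
- `num_layers` (`int`, optional): number of stacked LSTM layers. Default: 1.
- `bias` (`bool`, optional): if `True`, bias terms are used; otherwise they are not. Default: `True`.
- `batch_first` (`bool`, optional): if `True`, the input and output tensors have shape `(batch, seq, feature)`; otherwise `(seq, batch, feature)`. Default: `False`.
- `dropout` (`float`, optional): dropout probability applied after every LSTM layer except the last. If `num_layers = 1`, no dropout is applied. Default: 0.
- `bidirectional` (`bool`, optional): if `True`, the LSTM becomes bidirectional. Default: `False`.
- `proj_size` (`int`, optional): if greater than 0, the LSTM projects the hidden state into a space of a different dimension. This reduces the number of model parameters and can speed up training. Default: 0, meaning no projection.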
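Inputs
- `input` (`tensor`): shape `(seq_len, batch, input_size)`, or `(batch, seq_len, input_size)` if `batch_first=True`.
- `(h_0, c_0)` (`tuple`, optional): two tensors `(h_0, c_0)` holding the initial hidden state and cell state. Both have shape `(num_layers * num_directions, batch, hidden_size)`. If they are not provided, all states are initialized to zero.
Where:
- Unidirectional LSTM (bidirectional=False): num_directions=1. The LSTM processes the data only in time order, from the first time step to the last.
- Bidirectional LSTM (bidirectional=True): num_directions=2. A bidirectional LSTM contains two independent LSTM layers: one processes the data in normal time order, the other processes it in reverse. This lets the model capture both past and future context, which is especially useful for some tasks (such as semantic understanding in natural language processing).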
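The state shapes for the bidirectional case can be checked on a small layer. This is a minimal sketch with arbitrary sizes (batch 4, sequence length 8, hidden size 16 are assumptions for illustration); it passes explicit zero states and compares the result with the default call.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, seq_len, input_size, hidden_size = 4, 8, 3, 16

lstm_bi = nn.LSTM(input_size, hidden_size, num_layers=1,
                  batch_first=True, bidirectional=True)

x = torch.rand(batch, seq_len, input_size)
# Explicit zero initial states: (num_layers * num_directions, batch, hidden_size)
h_0 = torch.zeros(1 * 2, batch, hidden_size)
c_0 = torch.zeros(1 * 2, batch, hidden_size)

with torch.no_grad():
    out_explicit, _ = lstm_bi(x, (h_0, c_0))
    out_default, (h_n, c_n) = lstm_bi(x)  # states default to zeros

print(out_default.shape, h_n.shape)
```

Note that `batch_first=True` changes only the layout of `input` and `output`; the state tensors keep the layer-and-direction axis first. In the output, the forward direction occupies the first `hidden_size` features and the backward direction the rest, so the backward direction's final state sits at time step 0.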
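Outputs (two)
- `output` (`tensor`): the output features (`h_t`) of the last LSTM layer for every time step. If `batch_first=True`, the shape is `(batch, seq_len, num_directions * hidden_size)`; otherwise `(seq_len, batch, num_directions * hidden_size)`. Note that if `proj_size > 0`, the last dimension of the output is `num_directions * proj_size`.
- `(h_n, c_n)` (`tuple`): two tensors `(h_n, c_n)` holding the final hidden state and cell state after the last time step. Both have shape `(num_layers * num_directions, batch, hidden_size)`. Likewise, if `proj_size > 0`, the last dimension of `h_n` is `proj_size`.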
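The relationship between `output` and `h_n` is easiest to see on a small example. The sketch below uses arbitrary sizes (assumptions for illustration) to show that, for a unidirectional single-layer LSTM, the last time step of `output` equals `h_n`, and that `proj_size` shrinks the hidden state but not the cell state.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(4, 8, 3)  # (batch, seq_len, input_size)

lstm = nn.LSTM(input_size=3, hidden_size=16, batch_first=True)
with torch.no_grad():
    output, (h_n, c_n) = lstm(x)

# output keeps every time step; h_n keeps only the last one
last_step_matches = torch.allclose(output[:, -1, :], h_n[-1])

# With proj_size > 0 the hidden state is projected, the cell state is not
lstm_proj = nn.LSTM(input_size=3, hidden_size=16, proj_size=5, batch_first=True)
with torch.no_grad():
    output_p, (h_p, c_p) = lstm_proj(x)

print(output.shape, h_n.shape, last_step_matches)
print(output_p.shape, h_p.shape, c_p.shape)
```

This is why a model that predicts one value per sequence can take either `output[:, -1, :]` or `h_n[-1]` as its summary of the window.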
python
'''
The model uses two LSTM layers:
3->320: lstm
->320: lstm (extracts further temporal features)
->1: linear
'''
class model_lstm(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm1 = nn.LSTM(input_size=3, hidden_size=320, num_layers=1, batch_first=True)
        self.lstm2 = nn.LSTM(input_size=320, hidden_size=320, num_layers=1, batch_first=True)
        self.fc = nn.Linear(320, 1)

    def forward(self, x):
        out, hidden = self.lstm1(x)
        out, _ = self.lstm2(out)
        out = self.fc(out)  # output shape: (batch_size, sequence_length, output_size), here (64, 8, 1)
        # Take the last time step. The integer index -1 drops the time dimension, giving (64, 1),
        # so the result is reshaped to (64, 1, 1) to match y
        return out[:, -1, :].view(-1, 1, 1)

model = model_lstm().to(device)
model
model_lstm(
  (lstm1): LSTM(3, 320, batch_first=True)
  (lstm2): LSTM(320, 320, batch_first=True)
  (fc): Linear(in_features=320, out_features=1, bias=True)
)
python
# Quick sanity check with a random input
model(torch.rand(30, 8, 3)).shape
torch.Size([30, 1, 1])
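It can help to know how large this model is. A single unidirectional LSTM layer with bias and no projection has 4 × (input_size × hidden_size + hidden_size² + 2 × hidden_size) parameters, one set per gate. The sketch below applies that formula to the layer sizes used above and compares it with PyTorch's own count; the helper name `lstm_param_count` is hypothetical.

```python
import torch.nn as nn

def lstm_param_count(input_size, hidden_size):
    """Parameters of one unidirectional LSTM layer with bias and no projection."""
    # 4 gates, each with input weights, recurrent weights, and two bias vectors
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size)

# Two LSTM layers plus the final linear layer (320 weights + 1 bias)
expected = lstm_param_count(3, 320) + lstm_param_count(320, 320) + (320 * 1 + 1)

# Same layer sizes as model_lstm above
layers = nn.ModuleList([
    nn.LSTM(input_size=3, hidden_size=320, num_layers=1, batch_first=True),
    nn.LSTM(input_size=320, hidden_size=320, num_layers=1, batch_first=True),
    nn.Linear(320, 1),
])
actual = sum(p.numel() for p in layers.parameters())
print(expected, actual)
```

At roughly 1.24 million parameters against 5000 training windows, the network is large for the dataset, which is worth keeping in mind when reading the test loss.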
4. Training the Model
1. Training function
python
def train(train_dl, model, loss_fn, optimizer, lr_scheduler=None):
    num_batches = len(train_dl)
    train_loss = 0
    for X, y in train_dl:
        X, y = X.to(device), y.to(device)
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backpropagate and update the weights
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
    # Step the scheduler once per epoch (the logged learning rate changes once per epoch)
    if lr_scheduler is not None:
        lr_scheduler.step()
        print("learning rate = {:.5f}".format(optimizer.param_groups[0]['lr']), end=" ")
    train_loss /= num_batches  # average loss per batch
    return train_loss
2. Test function
python
def test(test_dl, model, loss_fn):
    num_batches = len(test_dl)
    test_loss = 0
    # No gradients are needed for evaluation
    with torch.no_grad():
        for X, y in test_dl:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            loss = loss_fn(pred, y)
            test_loss += loss.item()
    test_loss /= num_batches  # average loss per batch
    return test_loss
3. Running the training
python
# Hyperparameters
loss_fn = nn.MSELoss()
lr = 1e-1
opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=1e-4)  # weight_decay applies L2 regularization (also known as weight decay)
epochs = 50
# Adjust the learning rate dynamically with cosine annealing
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(opt, epochs, last_epoch=-1)
train_loss = []
test_loss = []
for epoch in range(epochs):
    model.train()
    epoch_train_loss = train(train_dl, model, loss_fn, opt, lr_scheduler)
    model.eval()
    epoch_test_loss = test(test_dl, model, loss_fn)
    train_loss.append(epoch_train_loss)
    test_loss.append(epoch_test_loss)
    template = ('Epoch:{:2d}, Train_loss:{:.5f}, Test_loss:{:.5f}')
    print(template.format(epoch+1, epoch_train_loss, epoch_test_loss))
learning rate = 0.09990 Epoch: 1, Train_loss:0.00320, Test_loss:0.00285
learning rate = 0.09961 Epoch: 2, Train_loss:0.00022, Test_loss:0.00084
learning rate = 0.09911 Epoch: 3, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09843 Epoch: 4, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.09755 Epoch: 5, Train_loss:0.00015, Test_loss:0.00072
learning rate = 0.09649 Epoch: 6, Train_loss:0.00015, Test_loss:0.00059
learning rate = 0.09524 Epoch: 7, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09382 Epoch: 8, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09222 Epoch: 9, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.09045 Epoch:10, Train_loss:0.00015, Test_loss:0.00066
learning rate = 0.08853 Epoch:11, Train_loss:0.00015, Test_loss:0.00077
learning rate = 0.08645 Epoch:12, Train_loss:0.00015, Test_loss:0.00071
learning rate = 0.08423 Epoch:13, Train_loss:0.00015, Test_loss:0.00071
learning rate = 0.08187 Epoch:14, Train_loss:0.00015, Test_loss:0.00061
learning rate = 0.07939 Epoch:15, Train_loss:0.00015, Test_loss:0.00056
learning rate = 0.07679 Epoch:16, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.07409 Epoch:17, Train_loss:0.00015, Test_loss:0.00056
learning rate = 0.07129 Epoch:18, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.06841 Epoch:19, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.06545 Epoch:20, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.06243 Epoch:21, Train_loss:0.00015, Test_loss:0.00069
learning rate = 0.05937 Epoch:22, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.05627 Epoch:23, Train_loss:0.00015, Test_loss:0.00064
learning rate = 0.05314 Epoch:24, Train_loss:0.00015, Test_loss:0.00072
learning rate = 0.05000 Epoch:25, Train_loss:0.00015, Test_loss:0.00061
learning rate = 0.04686 Epoch:26, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.04373 Epoch:27, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.04063 Epoch:28, Train_loss:0.00015, Test_loss:0.00059
learning rate = 0.03757 Epoch:29, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.03455 Epoch:30, Train_loss:0.00015, Test_loss:0.00060
learning rate = 0.03159 Epoch:31, Train_loss:0.00015, Test_loss:0.00067
learning rate = 0.02871 Epoch:32, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.02591 Epoch:33, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.02321 Epoch:34, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.02061 Epoch:35, Train_loss:0.00015, Test_loss:0.00067
learning rate = 0.01813 Epoch:36, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.01577 Epoch:37, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.01355 Epoch:38, Train_loss:0.00015, Test_loss:0.00064
learning rate = 0.01147 Epoch:39, Train_loss:0.00014, Test_loss:0.00063
learning rate = 0.00955 Epoch:40, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00778 Epoch:41, Train_loss:0.00015, Test_loss:0.00060
learning rate = 0.00618 Epoch:42, Train_loss:0.00014, Test_loss:0.00063
learning rate = 0.00476 Epoch:43, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00351 Epoch:44, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00245 Epoch:45, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.00157 Epoch:46, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.00089 Epoch:47, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00039 Epoch:48, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00010 Epoch:49, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00000 Epoch:50, Train_loss:0.00015, Test_loss:0.00063
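The learning rates in the log follow the cosine annealing schedule, lr_t = eta_min + 0.5 × (lr_0 − eta_min) × (1 + cos(π · t / T_max)). The sketch below evaluates that closed form in plain Python with the settings used above (lr_0 = 0.1, T_max = 50, and eta_min = 0, the scheduler's default); the function name `cosine_annealing_lr` is hypothetical.

```python
import math

def cosine_annealing_lr(step, base_lr=0.1, t_max=50, eta_min=0.0):
    """Learning rate after `step` scheduler steps under cosine annealing."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max))

# Compare a few steps with the "learning rate = ..." values in the log above
for step in (1, 2, 10, 25, 50):
    print(step, "{:.5f}".format(cosine_annealing_lr(step)))
```

The values for steps 1, 2, 10, 25, and 50 agree with the logged 0.09990, 0.09961, 0.09045, 0.05000, and 0.00000. Since the rate reaches zero at the final epoch, the last few epochs barely change the weights.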
5. Results
1. Loss curves
python
import matplotlib.pyplot as plt
from datetime import datetime
current_time = datetime.now()  # get the current time
plt.figure(figsize=(5, 3), dpi=120)
plt.plot(train_loss, label='LSTM Training Loss')
plt.plot(test_loss, label='LSTM Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel(current_time)  # check-in posts must include a timestamp, otherwise the code screenshot is invalid
plt.legend()
plt.show()
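The result looks good and the loss has converged: the training loss levels off at about 0.00015 from epoch 3 onward, while the test loss fluctuates between 0.00056 and 0.00077.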
2. Prediction plot
python
predicted_y_lstm = sc.inverse_transform(model(X_test).detach().numpy().reshape(-1, 1))  # run the test set through the model and undo the scaling
y_test_1 = sc.inverse_transform(y_test.numpy().reshape(-1, 1))
y_test_one = [i[0] for i in y_test_1]
predicted_y_lstm_one = [i[0] for i in predicted_y_lstm]
plt.figure(figsize=(5, 3), dpi=120)  # plot the real data against the predictions
plt.plot(y_test_one[:2000], color='red', label='real_temp')
plt.plot(predicted_y_lstm_one[:2000], color='blue', label='prediction')
plt.title('Title')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.show()
3. RMSE and R2 evaluation
python
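from sklearn import metrics
"""
RMSE: root mean squared error -----> the square root of the mean squared error
R2: coefficient of determination, a statistic that reflects the model's goodness of fit
"""
# sklearn metrics take (y_true, y_pred); R2 is not symmetric, so the order matters.
# The output shown below comes from the original run, which passed the predictions first.
RMSE_lstm = metrics.mean_squared_error(y_test_1, predicted_y_lstm_one)**0.5
R2_lstm = metrics.r2_score(y_test_1, predicted_y_lstm_one)
print('RMSE: %.5f' % RMSE_lstm)
print('R2: %.5f' % R2_lstm)
RMSE: 0.00001
R2: 0.82422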
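Both RMSE and R2 look decent, but the fit can still be improved. Two caveats apply to the numbers above. First, in the original run one MinMaxScaler was refit on each column in turn, so `sc` ended up fitted on `Soot 1`; the RMSE of 0.00001 is therefore on the Soot scale rather than in °C. Rescaling the final scaled test MSE of 0.00063 by the Tem1 range (307 − 25 = 282) suggests an RMSE of roughly 7 °C. Second, `r2_score` received the predictions as its first argument, whereas scikit-learn expects `(y_true, y_pred)`, so the R2 value would differ with the correct order.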
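Both metrics are short enough to write out by hand, which also shows why argument order matters for R2 but not for RMSE. The following is a minimal NumPy sketch on made-up numbers (assumptions for illustration, not the model's predictions); the function names `rmse` and `r2` are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared difference."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot, with SS_tot taken around y_true's mean."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up values for illustration
y_true_demo = [1.0, 2.0, 3.0, 4.0]
y_pred_demo = [1.5, 2.0, 3.0, 3.5]

print(rmse(y_true_demo, y_pred_demo), rmse(y_pred_demo, y_true_demo))
print(r2(y_true_demo, y_pred_demo), r2(y_pred_demo, y_true_demo))
```

Here RMSE is the same in both orders, while R2 is 0.9 with the true values first and 0.8 with the arguments swapped, because the total sum of squares is computed around the mean of whichever array is passed as `y_true`.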