Deep Learning Project: Fire Prediction with an LSTM (PyTorch Implementation)

🍨 This post is a study log from the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp)
🍖 Original author: K同学啊

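Preface

LSTM is a classic model, and a fairly involved one. It is usually easier to study RNN and GRU first. My write-ups on GRU and LSTM will be published over the next couple of days. Related posts:
- Deep Learning Basics: Understanding RNN in One Article
- Deep Learning Basics: GRU Study Notes (based on Li Mu's "Dive into Deep Learning")
This post applies an LSTM to fire prediction. It covers how to build the time-series dataset, how to build the model, the LSTM API in PyTorch, and dynamic learning-rate adjustment, and it finishes with an evaluation using RMSE and R2.
Bookmarks and follows are welcome; I will keep updating.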

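Contents

1. Importing and Exploring the Data
   1. Import libraries
   2. Load the data
   3. Data visualization
   4. Correlation analysis (heatmap)
   5. Feature extraction
2. Building the Time-Series Data
   1. Data normalization
   2. Build the windowed dataset
   3. Split and load the dataset
      1. Data split
3. Building the Model
4. Training the Model
   1. Training function
   2. Test function
   3. Run training
5. Results
   1. Loss curves
   2. Predictions
   3. R2 evaluation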

1. Importing and Exploring the Data

1. Import libraries

import torch
import torch.nn as nn
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Figure resolution
plt.rcParams['savefig.dpi'] = 500  # resolution of saved figures
plt.rcParams['figure.dpi'] = 500   # resolution of displayed figures

device = "cpu"

device

Output:

'cpu'

2. Load the data

data_df = pd.read_csv('./woodpine2.csv')

data_df.head()

| | Time | Tem1 | CO 1 | Soot 1 |
| 0 | 0.000 | 25.0 | 0.0 | 0.0 |
| 1 | 0.228 | 25.0 | 0.0 | 0.0 |
| 2 | 0.456 | 25.0 | 0.0 | 0.0 |
| 3 | 0.685 | 25.0 | 0.0 | 0.0 |
| 4 | 0.913 | 25.0 | 0.0 | 0.0 |

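The data comes from an experiment and was sampled at regular intervals:

Time: starts at 0.000 and increases in steps of roughly 0.228 in the rows shown above.
Tem1: temperature, presumably in degrees Celsius (°C).
CO 1: carbon monoxide (CO) concentration.
Soot 1: soot (carbon black) concentration.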

# Inspect column types and non-null counts
data_df.info()

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5948 entries, 0 to 5947
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Time    5948 non-null   float64
 1   Tem1    5948 non-null   float64
 2   CO 1    5948 non-null   float64
 3   Soot 1  5948 non-null   float64
dtypes: float64(4)
memory usage: 186.0 KB

# Count missing values per column
data_df.isnull().sum()

Output:

Time      0
Tem1      0
CO 1      0
Soot 1    0
dtype: int64

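3. Data visualization

Time is treated as a regularly sampled index rather than a feature, so the useful features are temperature, CO, and Soot. (Note that describe() further down reports a maximum Time of 367 across 5948 rows, so the step cannot be 0.228 throughout.)

_, ax = plt.subplots(1, 3, constrained_layout=True, figsize=(14, 3))  # constrained_layout=True adjusts subplot spacing automatically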

sns.lineplot(data=data_df['Tem1'], ax=ax[0])
sns.lineplot(data=data_df['CO 1'], ax=ax[1])
sns.lineplot(data=data_df['Soot 1'], ax=ax[2])
plt.show()

4. Correlation analysis (heatmap)

columns = ['Tem1', 'CO 1', 'Soot 1']

plt.figure(figsize=(8, 6))
sns.heatmap(data=data_df[columns].corr(), annot=True, fmt=".2f")
plt.show()

# Summary statistics
data_df.describe()

| | Time | Tem1 | CO 1 | Soot 1 |
| count | 5948.000000 | 5948.000000 | 5948.000000 | 5948.000000 |
| mean | 226.133238 | 152.534919 | 0.000035 | 0.000222 |
| std | 96.601445 | 77.026019 | 0.000022 | 0.000144 |
| min | 0.000000 | 25.000000 | 0.000000 | 0.000000 |
| 25% | 151.000000 | 89.000000 | 0.000015 | 0.000093 |
| 50% | 241.000000 | 145.000000 | 0.000034 | 0.000220 |
| 75% | 310.000000 | 220.000000 | 0.000054 | 0.000348 |
| max | 367.000000 | 307.000000 | 0.000080 | 0.000512 |

When I first saw correlation coefficients of 1 in the heatmap, I was stunned (the heatmap rounds to two decimals, so anything from 0.995 up displays as 1.00). The summary statistics alone did not explain it, but the line plots above convinced me: as the temperature rises, the CO and Soot concentrations rise with it. That matches a fire scenario, so the data looks fine.
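
To see how a heatmap can end up showing 1.00 everywhere, here is a minimal sketch on synthetic data (the curve and the scale factors are made up, not taken from woodpine2.csv). When two columns are exact linear functions of a third, the Pearson correlation is 1 no matter how different their magnitudes are:

```python
import numpy as np

# Synthetic stand-ins (NOT the real data): a rising temperature curve and two
# concentrations that are exact linear functions of it.
t = np.linspace(0.0, 1.0, 500)
tem = 25.0 + 282.0 * t**2
co = 2.8e-7 * (tem - 25.0)
soot = 1.8e-6 * (tem - 25.0)

# Rows are variables, so corr is a 3 x 3 Pearson correlation matrix.
corr = np.corrcoef(np.vstack([tem, co, soot]))
print(np.round(corr, 2))
```

The real columns are probably not perfectly linear in each other, but anything close to this will still round to 1.00 in the heatmap.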

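5. Feature extraction

# Time is dropped because the sampling interval is treated as constant.
# .copy() avoids pandas' SettingWithCopyWarning when the columns are rescaled later.
data = data_df.iloc[:, 1:].copy()

data.head(3)

| | Tem1 | CO 1 | Soot 1 |
| 0 | 25.0 | 0.0 | 0.0 |
| 1 | 25.0 | 0.0 | 0.0 |
| 2 | 25.0 | 0.0 | 0.0 |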

data.tail(3)

| | Tem1 | CO 1 | Soot 1 |
| 5945 | 292.0 | 0.000077 | 0.000491 |
| 5946 | 291.0 | 0.000076 | 0.000489 |
| 5947 | 290.0 | 0.000076 | 0.000487 |

The features differ in scale by several orders of magnitude, so they need to be rescaled.

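2. Building the Time-Series Data

1. Data normalization

from sklearn.preprocessing import MinMaxScaler

sc = MinMaxScaler()

# Scale each column to [0, 1].
# Caution: sc is refit on every pass, so after the loop it holds the min/max of
# 'Soot 1' (the last column). Any later sc.inverse_transform(...) therefore maps
# values back to the Soot 1 range, not to temperature.
for col in ['Tem1', 'CO 1', 'Soot 1']:
    data[col] = sc.fit_transform(data[col].values.reshape(-1, 1))

# Check the shape
data.shape

Output: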

(5948, 3)
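
Since the loop above reuses a single scaler, one possible alternative is to keep one scaler per column, so that predictions can later be mapped back with the temperature scaler. This is a minimal sketch on a toy frame. The `scalers` dict and the toy values are my own illustration, not part of the original code:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy frame with the article's column names (the values are made up).
toy = pd.DataFrame({
    "Tem1": [25.0, 100.0, 307.0],
    "CO 1": [0.0, 0.00004, 0.00008],
    "Soot 1": [0.0, 0.00025, 0.000512],
})

# One fitted scaler per column, kept in a dict (hypothetical name `scalers`).
scalers = {}
scaled = toy.copy()
for col in toy.columns:
    scalers[col] = MinMaxScaler()
    scaled[col] = scalers[col].fit_transform(toy[[col]].to_numpy()).ravel()

# Map normalised temperatures back using the temperature scaler only.
restored = scalers["Tem1"].inverse_transform(scaled[["Tem1"]].to_numpy())
print(restored.ravel())
```

With this layout, the inverse transform of a temperature prediction lands back in the 25 to 307 range instead of the Soot 1 range.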

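2. Build the windowed dataset

With batch_first=True, the LSTM expects input of shape (samples, time steps, features). For this dataset:

Samples: 5948 rows before windowing (the sliding window below yields 5939 samples)
Time steps: set to 8 in this post
- That is, every 8 consecutive rows of (Tem1, CO 1, Soot 1) form one input window, and the Tem1 value of the 9th row is the target y (temperature). Fire prediction here essentially means predicting temperature.
Features: 3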

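width_x = 8   # input window length (time steps)
width_y = 1   # number of future steps to predict

# Windowed inputs X and targets y (see the explanation above)
X, y = [], []

# Slide the window forward one row at a time
for start_position in range(len(data)):
    in_end = start_position + width_x
    out_end = in_end + width_y

    # Note: the strict "<" skips the last complete window;
    # "<=" would keep it (5940 samples instead of 5939).
    if out_end < len(data):
        X_ = np.array(data.iloc[start_position : in_end, :])  # 8 rows x 3 features
        y_ = np.array(data.iloc[in_end : out_end, 0])          # Tem1 of the next row

        X.append(X_)
        y.append(y_)

# Convert to arrays
X = np.array(X)
# Give y three dimensions so it matches the model output
y = np.array(y).reshape(-1, 1, 1)

X.shape, y.shape

Output: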

((5939, 8, 3), (5939, 1, 1))
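
The same windows can also be built without an explicit loop. The sketch below is an alternative under my own assumptions: it uses NumPy's sliding_window_view (NumPy 1.20 or later) on a toy array rather than the real data. Note that it keeps every complete window, so for N rows it yields N - 8 samples, one more than the loop above:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Toy series: 12 rows, 3 features; column 0 plays the role of Tem1.
rows, width_x = 12, 8
values = np.arange(rows * 3, dtype=np.float32).reshape(rows, 3)

# All windows of length width_x + 1: 8 input rows plus the row to predict.
# sliding_window_view appends the window axis last, so move it to the middle.
windows = sliding_window_view(values, window_shape=width_x + 1, axis=0)
windows = windows.transpose(0, 2, 1)               # (n_windows, width_x + 1, 3)

X_toy = windows[:, :width_x, :]                    # first 8 rows of each window
y_toy = windows[:, width_x, 0].reshape(-1, 1, 1)   # column 0 of the 9th row

print(X_toy.shape, y_toy.shape)
```

On a few thousand rows the loop is fast enough, so this is mostly a matter of taste.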

3. Split and load the dataset

1. Data split

# Use the first 5000 samples for training and the rest for testing
X_train = torch.tensor(np.array(X[:5000, ]), dtype=torch.float32)
X_test = torch.tensor(np.array(X[5000:, ]), dtype=torch.float32)

y_train = torch.tensor(np.array(y[:5000, ]), dtype=torch.float32)
y_test = torch.tensor(np.array(y[5000:, ]), dtype=torch.float32)

X_train.shape, y_train.shape

Output:

(torch.Size([5000, 8, 3]), torch.Size([5000, 1, 1]))

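Building the dataset:

TensorDataset is a PyTorch class that wraps one or more tensors sharing the same first dimension into a dataset. Indexing it returns one slice per tensor, here an (input, target) pair, so every input in the dataset is matched with exactly one output.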

from torch.utils.data import TensorDataset, DataLoader

batch_size = 64

train_dl = DataLoader(TensorDataset(X_train, y_train),
                      batch_size=batch_size,
                      shuffle=True)

test_dl = DataLoader(TensorDataset(X_test, y_test),
                      batch_size=batch_size,
                      shuffle=False)
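
To check what the loaders hand to the model, here is a small sketch with random stand-in tensors (not the real data). It shows how TensorDataset pairs inputs with targets and how DataLoader batches them. The last batch is smaller when the dataset size is not a multiple of batch_size:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Random stand-in tensors with the article's per-sample shapes.
X_demo = torch.rand(100, 8, 3)
y_demo = torch.rand(100, 1, 1)

demo_ds = TensorDataset(X_demo, y_demo)
x0, y0 = demo_ds[0]                  # indexing returns one (input, target) pair
demo_dl = DataLoader(demo_ds, batch_size=64, shuffle=False)

# Collect the shapes of every batch: 64 samples, then the remaining 36.
batch_shapes = [(xb.shape, yb.shape) for xb, yb in demo_dl]
print(x0.shape, y0.shape)
print(batch_shapes)
```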

3. Building the Model

The nn.LSTM API

Constructor

torch.nn.LSTM(input_size, hidden_size, num_layers=1, bias=True, batch_first=False, dropout=0, bidirectional=False, proj_size=0)

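input_size (int): number of input features at each time step.
hidden_size (int): number of features in the hidden state h. This is also the size of the LSTM output features, unless proj_size is set.
num_layers (int, optional): number of stacked LSTM layers. Default: 1.
bias (bool, optional): if True, the layer uses bias terms. Default: True.
batch_first (bool, optional): if True, input and output tensors have shape (batch, seq, feature); otherwise (seq, batch, feature). This does not affect the hidden and cell states. Default: False.
dropout (float, optional): dropout probability applied to the outputs of every LSTM layer except the last. With num_layers = 1, no dropout is applied. Default: 0.
bidirectional (bool, optional): if True, the LSTM is bidirectional. Default: False.
proj_size (int, optional): if greater than 0, the hidden state is projected to proj_size dimensions, which reduces the parameter count and can speed up training. Default: 0 (no projection).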

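Inputs

input (tensor): shape (seq_len, batch, input_size), or (batch, seq_len, input_size) if batch_first=True.
(h_0, c_0) (tuple, optional): the initial hidden state and cell state. Both have shape (num_layers * num_directions, batch, hidden_size); with proj_size > 0, the last dimension of h_0 is proj_size instead. If omitted, both default to zeros.

where:

Unidirectional LSTM (bidirectional=False): num_directions=1. The LSTM processes the sequence in time order only, from the first time step to the last.
Bidirectional LSTM (bidirectional=True): num_directions=2. A bidirectional LSTM contains two independent LSTMs, one reading the sequence forward in time and the other reading it backward. This lets the model use both past and future context, which is especially useful for some tasks, such as semantic understanding in natural language processing.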
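
As a quick illustration of these shapes, the sketch below builds a small two-layer bidirectional LSTM (the sizes are arbitrary choices of mine) and passes explicit zero initial states. Note that h_0 and c_0 keep the (num_layers * num_directions, batch, hidden_size) layout even when batch_first=True:

```python
import torch
import torch.nn as nn

batch, seq_len, n_features, hidden = 4, 8, 3, 16
num_layers, num_directions = 2, 2

bi_lstm = nn.LSTM(input_size=n_features, hidden_size=hidden,
                  num_layers=num_layers, batch_first=True, bidirectional=True)

x = torch.rand(batch, seq_len, n_features)

# Explicit zero initial states; batch is the second dimension here.
h_0 = torch.zeros(num_layers * num_directions, batch, hidden)
c_0 = torch.zeros(num_layers * num_directions, batch, hidden)

out_explicit, (h_n, c_n) = bi_lstm(x, (h_0, c_0))
out_default, _ = bi_lstm(x)   # omitted states default to zeros

print(out_explicit.shape)     # last dimension is num_directions * hidden
print(h_n.shape)
```

Passing zeros explicitly and omitting the states give the same result, so the explicit form is only needed when you want a non-zero starting state.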

Outputs (two)

output (tensor): the output features h_t of the last LSTM layer for every time step (not only the final one). With batch_first=True the shape is (batch, seq_len, num_directions * hidden_size); otherwise (seq_len, batch, num_directions * hidden_size). If proj_size > 0, the last dimension is num_directions * proj_size.
(h_n, c_n) (tuple): the final hidden state and cell state of each layer after the last time step. Both have shape (num_layers * num_directions, batch, hidden_size). Likewise, if proj_size > 0, the last dimension of h_n is proj_size.
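
The relationship between output and h_n is easy to verify. In this sketch (arbitrary small sizes, unidirectional LSTM), the last time step of output matches the final hidden state of the top layer:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=3, hidden_size=16, num_layers=2, batch_first=True)
x = torch.rand(5, 8, 3)   # (batch, seq_len, input_size)

output, (h_n, c_n) = lstm(x)

print(output.shape)   # top layer, every time step
print(h_n.shape)      # final time step, every layer

# For a unidirectional LSTM, h_n[-1] is the top layer's last hidden state.
last_step_matches = torch.allclose(output[:, -1, :], h_n[-1])
print(last_step_matches)
```

This is why taking output[:, -1, :] is a common way to summarise the whole window for a prediction head.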

'''
The model stacks two LSTM layers:
    3 -> 320: lstm
    -> 320: lstm (extracts further temporal features)
    -> 1: linear
'''

class model_lstm(nn.Module):
    def __init__(self):
        super().__init__()

        self.lstm1 = nn.LSTM(input_size=3, hidden_size=320, num_layers=1, batch_first=True)
        self.lstm2 = nn.LSTM(input_size=320, hidden_size=320, num_layers=1, batch_first=True)
        self.fc = nn.Linear(320, 1)

    def forward(self, x):
        out, _ = self.lstm1(x)     # (batch, 8, 320); the final states are not needed
        out, _ = self.lstm2(out)   # (batch, 8, 320)
        out = self.fc(out)         # (batch_size, sequence_length, output_size), e.g. (64, 8, 1)
        # Keep only the last time step. Integer indexing drops the time dimension,
        # giving (batch, 1), so view() restores the (batch, 1, 1) shape of y.
        return out[:, -1, :].view(-1, 1, 1)

model = model_lstm().to(device)
model

Output:

model_lstm(
  (lstm1): LSTM(3, 320, batch_first=True)
  (lstm2): LSTM(320, 320, batch_first=True)
  (fc): Linear(in_features=320, out_features=1, bias=True)
)

# Quick shape check with a random batch
model(torch.rand(30, 8, 3)).shape

Output:

torch.Size([30, 1, 1])
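
As a sanity check on model size, the parameter count can be worked out by hand: each LSTM layer has four gates, each with an input weight matrix, a recurrent weight matrix, and two bias vectors. The helper `lstm_param_count` below is a hypothetical name of mine, and the layers mirror the ones in model_lstm:

```python
import torch.nn as nn

def lstm_param_count(input_size, hidden_size):
    # 4 gates x (input weights + recurrent weights + two bias vectors)
    return 4 * (input_size * hidden_size
                + hidden_size * hidden_size
                + 2 * hidden_size)

# lstm1 (3 -> 320) + lstm2 (320 -> 320) + linear head (320 weights + 1 bias)
expected = lstm_param_count(3, 320) + lstm_param_count(320, 320) + (320 + 1)

# Same layer configuration as the model above, counted by PyTorch.
layers = nn.ModuleList([
    nn.LSTM(3, 320, batch_first=True),
    nn.LSTM(320, 320, batch_first=True),
    nn.Linear(320, 1),
])
actual = sum(p.numel() for p in layers.parameters())

print(expected, actual)
```

About 1.24 million parameters is a lot for three input features and 5000 training windows, which is worth keeping in mind when reading the loss curves.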

4. Training the Model

1. Training function

def train(train_dl, model, loss_fn, optimizer, lr_scheduler=None):
    num_batchs = len(train_dl)   # number of batches per epoch

    train_loss = 0

    for X, y in train_dl:
        X, y = X.to(device), y.to(device)

        # Forward pass and loss
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backward pass and parameter update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        train_loss += loss.item()

    # Step the scheduler once per epoch and report the new learning rate
    if lr_scheduler is not None:
        lr_scheduler.step()
        print("learning rate = {:.5f}".format(optimizer.param_groups[0]['lr']), end="  ")

    # Average loss over all batches
    train_loss /= num_batchs

    return train_loss

2. Test function

def test(test_dl, model, loss_fn):
    num_batchs = len(test_dl)   # number of batches in the test set

    test_loss = 0

    # No gradients are needed for evaluation
    with torch.no_grad():
        for X, y in test_dl:
            X, y = X.to(device), y.to(device)

            pred = model(X)
            loss = loss_fn(pred, y)

            test_loss += loss.item()

    # Average loss over all batches
    test_loss /= num_batchs

    return test_loss

3. Run training

# Hyperparameters
loss_fn = nn.MSELoss()
lr = 1e-1
opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=1e-4)  # weight_decay applies L2 regularization (weight decay)

epochs = 50

# Learning-rate schedule: cosine annealing over `epochs` steps
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(opt, epochs, last_epoch=-1)


train_loss = []
test_loss = []

for epoch in range(epochs):
    model.train()
    epoch_train_loss = train(train_dl, model, loss_fn, opt, lr_scheduler)

    model.eval()
    epoch_test_loss = test(test_dl, model, loss_fn)

    train_loss.append(epoch_train_loss)
    test_loss.append(epoch_test_loss)

    template = ('Epoch:{:2d}, Train_loss:{:.5f}, Test_loss:{:.5f}')
    print(template.format(epoch+1, epoch_train_loss,  epoch_test_loss))

Output:

learning rate = 0.09990  Epoch: 1, Train_loss:0.00320, Test_loss:0.00285
learning rate = 0.09961  Epoch: 2, Train_loss:0.00022, Test_loss:0.00084
learning rate = 0.09911  Epoch: 3, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09843  Epoch: 4, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.09755  Epoch: 5, Train_loss:0.00015, Test_loss:0.00072
learning rate = 0.09649  Epoch: 6, Train_loss:0.00015, Test_loss:0.00059
learning rate = 0.09524  Epoch: 7, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09382  Epoch: 8, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09222  Epoch: 9, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.09045  Epoch:10, Train_loss:0.00015, Test_loss:0.00066
learning rate = 0.08853  Epoch:11, Train_loss:0.00015, Test_loss:0.00077
learning rate = 0.08645  Epoch:12, Train_loss:0.00015, Test_loss:0.00071
learning rate = 0.08423  Epoch:13, Train_loss:0.00015, Test_loss:0.00071
learning rate = 0.08187  Epoch:14, Train_loss:0.00015, Test_loss:0.00061
learning rate = 0.07939  Epoch:15, Train_loss:0.00015, Test_loss:0.00056
learning rate = 0.07679  Epoch:16, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.07409  Epoch:17, Train_loss:0.00015, Test_loss:0.00056
learning rate = 0.07129  Epoch:18, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.06841  Epoch:19, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.06545  Epoch:20, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.06243  Epoch:21, Train_loss:0.00015, Test_loss:0.00069
learning rate = 0.05937  Epoch:22, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.05627  Epoch:23, Train_loss:0.00015, Test_loss:0.00064
learning rate = 0.05314  Epoch:24, Train_loss:0.00015, Test_loss:0.00072
learning rate = 0.05000  Epoch:25, Train_loss:0.00015, Test_loss:0.00061
learning rate = 0.04686  Epoch:26, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.04373  Epoch:27, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.04063  Epoch:28, Train_loss:0.00015, Test_loss:0.00059
learning rate = 0.03757  Epoch:29, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.03455  Epoch:30, Train_loss:0.00015, Test_loss:0.00060
learning rate = 0.03159  Epoch:31, Train_loss:0.00015, Test_loss:0.00067
learning rate = 0.02871  Epoch:32, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.02591  Epoch:33, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.02321  Epoch:34, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.02061  Epoch:35, Train_loss:0.00015, Test_loss:0.00067
learning rate = 0.01813  Epoch:36, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.01577  Epoch:37, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.01355  Epoch:38, Train_loss:0.00015, Test_loss:0.00064
learning rate = 0.01147  Epoch:39, Train_loss:0.00014, Test_loss:0.00063
learning rate = 0.00955  Epoch:40, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00778  Epoch:41, Train_loss:0.00015, Test_loss:0.00060
learning rate = 0.00618  Epoch:42, Train_loss:0.00014, Test_loss:0.00063
learning rate = 0.00476  Epoch:43, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00351  Epoch:44, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00245  Epoch:45, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.00157  Epoch:46, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.00089  Epoch:47, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00039  Epoch:48, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00010  Epoch:49, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00000  Epoch:50, Train_loss:0.00015, Test_loss:0.00063
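
The learning rates in the log follow the closed-form cosine annealing schedule, lr_t = eta_min + 0.5 * (lr_0 - eta_min) * (1 + cos(pi * t / T_max)), with eta_min = 0. A minimal sketch in plain Python (the function name `cosine_lr` is mine) reproduces them:

```python
import math

def cosine_lr(step, base_lr=0.1, t_max=50, eta_min=0.0):
    """Closed-form cosine annealing value after `step` scheduler steps."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max))

# Compare a few steps with the "learning rate = ..." values in the log above.
for step in (1, 2, 10, 25, 50):
    print(step, f"{cosine_lr(step):.5f}")
```

The schedule starts near the base rate, passes exactly half of it at the midpoint (epoch 25), and reaches zero at the last epoch, which is why the final epochs barely change the weights.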

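5. Results

1. Loss curves

import matplotlib.pyplot as plt
from datetime import datetime
current_time = datetime.now()  # current timestamp

plt.figure(figsize=(5, 3),dpi=120)
plt.plot(train_loss, label='LSTM Training Loss')
plt.plot(test_loss, label='LSTM Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel(current_time)  # the training camp requires a timestamp on check-in screenshots
plt.legend()
plt.show()

The result looks good and training has converged: the training loss settles at about 0.00015 from epoch 3 onward, while the test loss fluctuates between roughly 0.00056 and 0.00077.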

2. Predictions

# Run the test set through the model and map the values back with sc
predicted_y_lstm = sc.inverse_transform(model(X_test).detach().numpy().reshape(-1,1))
y_test_1         = sc.inverse_transform(y_test.numpy().reshape(-1,1))
y_test_one       = [i[0] for i in y_test_1]
predicted_y_lstm_one = [i[0] for i in predicted_y_lstm]
plt.figure(figsize=(5, 3),dpi=120)  # plot real vs. predicted values
plt.plot(y_test_one[:2000], color='red', label='real_temp')
plt.plot(predicted_y_lstm_one[:2000], color='blue', label='prediction')
plt.title('Real vs. predicted (test set)')
plt.xlabel('Sample index')
plt.ylabel('Value after inverse_transform')
plt.legend()
plt.show()

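3. R2 evaluation

from sklearn import metrics
"""
RMSE: root mean squared error, i.e. the square root of the MSE
R2:   coefficient of determination, a measure of goodness of fit
"""
# scikit-learn metrics take (y_true, y_pred); R2 is not symmetric in its arguments
RMSE_lstm  = metrics.mean_squared_error(y_test_one, predicted_y_lstm_one)**0.5
R2_lstm    = metrics.r2_score(y_test_one, predicted_y_lstm_one)
print('RMSE: %.5f' % RMSE_lstm)
print('R2: %.5f' % R2_lstm)

Output:

RMSE: 0.00001
R2: 0.82422

Two caveats on these numbers. The printed R2 was obtained with the predictions passed as the first argument, so the call above, which uses the documented (y_true, y_pred) order, may give a somewhat different value. The RMSE looks tiny because sc was last fitted on Soot 1 (range 0 to 0.000512), so the inverse-transformed values are on the Soot 1 scale rather than in °C. Taking the final test MSE of about 0.00063 on normalized data and the Tem1 range of 25 to 307, the error in temperature is roughly sqrt(0.00063) x 282, or about 7 °C.

RMSE and R2 are both decent, but the fit still has room for improvement.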
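
For reference, both metrics are short enough to write out by hand. The sketch below uses made-up numbers and shows that RMSE does not care about argument order while R2 does, which is why the (y_true, y_pred) order matters:

```python
import numpy as np

def rmse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # variance around the TRUE mean
    return float(1.0 - ss_res / ss_tot)

# Made-up numbers for illustration only.
y_true = [100.0, 150.0, 200.0, 250.0]
y_pred = [110.0, 150.0, 190.0, 240.0]

print(rmse(y_true, y_pred), rmse(y_pred, y_true))   # same value both ways
print(r2(y_true, y_pred), r2(y_pred, y_true))       # different values
```

The asymmetry comes from ss_tot, which is computed around the mean of whichever array is treated as the ground truth.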