CNN-BiLSTM Time Series Forecasting with PyTorch: From Sliding Windows to Model Deployment

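Time series forecasting is a core technique in financial risk control, industrial IoT, weather forecasting, and similar fields. A CNN on its own is good at extracting local patterns but struggles to capture long-range dependencies in sequential data; an LSTM models sequence relationships but is less sensitive to local high-frequency features. This article walks through building a hybrid CNN-BiLSTM model in PyTorch that extracts local features from the series while capturing long-range dependencies in both directions, covering the full workflow from data preprocessing to model deployment.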

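1. Time Series Forecasting Challenges and the CNN-BiLSTM Solution

1.1 Core challenges of time series data

Time series data has three defining traits: temporal dependence, local fluctuation, and long-term trend. In practice a model has to deal with:

Local features: sudden changes in a sensor signal or short-term swings in a stock price (a CNN is good at capturing these)
Long-range dependencies: seasonal cycles in weather data or long-term trends in commodity prices (a BiLSTM is good at capturing these)
Noise: real-world series usually contain a lot of noise, so the model needs to be robust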

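1.2 Advantages of CNN-BiLSTM

The CNN extracts local features from the series (a convolution over the features inside each time window), the BiLSTM models the resulting feature sequence in both directions, and a fully connected layer produces the prediction. The overall pipeline is:

Time series data → sliding-window split → standardization → CNN layer (local feature extraction) → BiLSTM layer (bidirectional sequence modeling) → fully connected layer → prediction output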
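
To make the data flow concrete, the sketch below pushes a random batch through standalone layers and records the tensor shape after each stage. The sizes (batch of 4, 24 time steps, 3 features, 16 convolution channels, 32 hidden units) are assumptions chosen only for illustration; this is not a full model definition.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only to illustrate the shapes.
batch_size, seq_len, n_features = 4, 24, 3
x = torch.randn(batch_size, seq_len, n_features)

conv = nn.Conv1d(in_channels=n_features, out_channels=16, kernel_size=3, padding=1)
bilstm = nn.LSTM(input_size=16, hidden_size=32, bidirectional=True, batch_first=True)
head = nn.Linear(2 * 32, 1)

# Conv1d expects (batch, channels, length), so features move to dim 1.
conv_out = torch.relu(conv(x.permute(0, 2, 1)))
# An LSTM with batch_first=True expects (batch, length, features).
lstm_out, _ = bilstm(conv_out.permute(0, 2, 1))
# One scalar prediction per sample, read from the last time step.
pred = head(lstm_out[:, -1, :]).squeeze(-1)

print("conv output:", tuple(conv_out.shape))   # (batch, channels, length)
print("lstm output:", tuple(lstm_out.shape))   # (batch, length, 2 * hidden)
print("prediction: ", tuple(pred.shape))       # (batch,)
```

The two permute calls are the only glue needed between the stages: the convolution treats features as channels, while the LSTM treats them as the last dimension.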

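2. Environment Setup and Data Preprocessing

2.1 Environment setup

Install the required dependencies first (onnx and onnxruntime are only needed for the ONNX export in section 4.2):

pip install torch numpy pandas scikit-learn matplotlib onnx onnxruntime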

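2.2 Building samples with a sliding window

The core of time series forecasting is turning the raw series into (input sequence, target value) pairs, and a sliding window is the most common way to do it. For a series of length N, window size W, and a forecast step of 1, this yields N-W samples (N-W-pred_step+1 in general):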

import numpy as np

def sliding_window(data, window_size, pred_step=1):
    """
    Build supervised samples from a time series with a sliding window.
    :param data: raw series, shape=(timesteps, features)
    :param window_size: length of each input sequence
    :param pred_step: forecast horizon (how many steps ahead to predict)
    :return: X (sample inputs), y (sample targets)
    """
    X, y = [], []
    for i in range(len(data) - window_size - pred_step + 1):
        # Input: features of window_size consecutive time steps
        X.append(data[i:i+window_size, :])
        # Target: value at time step i+window_size+pred_step-1
        # (the first column is used as the prediction target)
        y.append(data[i+window_size+pred_step-1, 0])
    return np.array(X), np.array(y)

# Quick check of the sliding-window function
if __name__ == "__main__":
    # Mock series: 100 time steps, 2 features
    mock_data = np.random.randn(100, 2)
    X, y = sliding_window(mock_data, window_size=10, pred_step=1)
    print(f"Input shape: {X.shape} (samples, timesteps, features)")
    print(f"Target shape: {y.shape} (samples,)")

Output:

Input shape: (90, 10, 2) (samples, timesteps, features)
Target shape: (90,) (samples,)
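
For long series the Python loop can become slow. As an alternative, here is a minimal vectorized sketch built on NumPy's `sliding_window_view` (available since NumPy 1.20). The function name `sliding_window_vectorized` is hypothetical; it is meant to return the same arrays as the loop version, with the first column as the target.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sliding_window_vectorized(data, window_size, pred_step=1):
    """Vectorized variant of the loop-based sliding window (illustrative sketch)."""
    n_samples = len(data) - window_size - pred_step + 1
    if n_samples <= 0:
        raise ValueError("series is too short for this window_size/pred_step")
    # Windowing along axis 0 yields shape (n_windows, features, window_size).
    windows = sliding_window_view(data, window_size, axis=0)
    # Keep only windows that have a target, then reorder to (samples, timesteps, features).
    X = np.ascontiguousarray(windows[:n_samples].transpose(0, 2, 1))
    # Target for sample i is row i + window_size + pred_step - 1, first column.
    y = data[window_size + pred_step - 1:, 0].copy()
    return X, y

demo = np.arange(40, dtype=float).reshape(20, 2)
X_demo, y_demo = sliding_window_vectorized(demo, window_size=5, pred_step=2)
print(X_demo.shape, y_demo.shape)
```

The copies at the end are deliberate: `sliding_window_view` returns a read-only view that shares memory with the input, which is easy to misuse once the arrays are scaled in place.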

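2.3 Standardization and train/test split

Differences in scale between features can seriously hurt convergence, so standardize the data first. Note: the test set must be standardized with the mean and standard deviation computed on the training set, to avoid data leakage: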

import torch
from sklearn.preprocessing import StandardScaler

# 1. Load or generate data (mock data for demonstration)
data = np.random.randn(1000, 3)  # 1000 time steps, 3 features (first column is the target)

# 2. Sliding-window split
window_size = 24  # input sequence length (e.g. 24 hours)
X, y = sliding_window(data, window_size=window_size)

# 3. Train/test split (time series must be split chronologically, never at random!)
split_idx = int(len(X) * 0.8)
X_train, X_test = X[:split_idx], X[split_idx:]
y_train, y_test = y[:split_idx], y[split_idx:]

# 4. Standardization, applied per feature dimension
scaler_X = StandardScaler()
# Reshape to (samples*timesteps, features) so each feature is scaled independently
X_train_reshaped = X_train.reshape(-1, X_train.shape[-1])
scaler_X.fit(X_train_reshaped)  # statistics come from the training set only
# Transform, then restore the original shape
X_train = scaler_X.transform(X_train_reshaped).reshape(X_train.shape)
X_test = scaler_X.transform(X_test.reshape(-1, X_test.shape[-1])).reshape(X_test.shape)

# Standardize the targets
scaler_y = StandardScaler()
y_train = scaler_y.fit_transform(y_train.reshape(-1, 1)).flatten()
y_test = scaler_y.transform(y_test.reshape(-1, 1)).flatten()

# Convert to PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32)

print(f"Training input shape: {X_train.shape}")
print(f"Test input shape: {X_test.shape}")
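
The reshape step above can look opaque, so the following sketch spells out what it computes: one mean and one standard deviation per feature, taken over all training samples and time steps, then reused unchanged on the test set. The helper `standardize_windows` and its `eps` guard are assumptions for illustration, not part of scikit-learn.

```python
import numpy as np

def standardize_windows(X_train, X_test, eps=1e-8):
    """Per-feature standardization of windowed data using training statistics only."""
    # Average over the sample and time axes, leaving one statistic per feature.
    mean = X_train.mean(axis=(0, 1), keepdims=True)   # shape (1, 1, features)
    std = X_train.std(axis=(0, 1), keepdims=True)     # population std (ddof=0)
    # Guard against constant features to avoid division by zero.
    std = np.where(std < eps, 1.0, std)
    X_train_scaled = (X_train - mean) / std
    X_test_scaled = (X_test - mean) / std             # same statistics, no refitting
    return X_train_scaled, X_test_scaled, mean.ravel(), std.ravel()

rng = np.random.default_rng(0)
X_tr = rng.normal(loc=5.0, scale=3.0, size=(80, 24, 3))
X_te = rng.normal(loc=5.0, scale=3.0, size=(20, 24, 3))
X_tr_s, X_te_s, feat_mean, feat_std = standardize_windows(X_tr, X_te)
print("per-feature mean used for both splits:", feat_mean)
```

Because the test set is scaled with training statistics, its mean and standard deviation after scaling will generally be close to, but not exactly, 0 and 1. That is expected and is what prevents leakage.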

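3. Building the CNN-BiLSTM Model

The CNN layer uses a 1D convolution to extract local temporal features, the BiLSTM layer models dependencies in both directions, and a fully connected layer outputs the prediction: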

import torch.nn as nn

class CNNBiLSTM(nn.Module):
    def __init__(self, input_dim, cnn_out_channels, cnn_kernel_size,
                 lstm_hidden_dim, lstm_num_layers, dropout_rate=0.2):
        super().__init__()

        # 1. CNN layer: extract local temporal features
        self.cnn = nn.Conv1d(
            in_channels=input_dim,  # input channels = number of features
            out_channels=cnn_out_channels,  # output channels
            kernel_size=cnn_kernel_size,  # kernel size (span in time steps)
            padding='same'  # keep the sequence length unchanged (PyTorch 1.9+)
        )
        self.relu = nn.ReLU()
        self.dropout_cnn = nn.Dropout(dropout_rate)

        # 2. BiLSTM layer: bidirectional sequence modeling
        self.bilstm = nn.LSTM(
            input_size=cnn_out_channels,  # input size = CNN output channels
            hidden_size=lstm_hidden_dim,  # hidden state size
            num_layers=lstm_num_layers,  # number of stacked LSTM layers
            bidirectional=True,  # bidirectional LSTM
            batch_first=True,  # input shape = (batch_size, seq_len, features)
            dropout=dropout_rate if lstm_num_layers > 1 else 0
        )
        self.dropout_lstm = nn.Dropout(dropout_rate)

        # 3. Fully connected layer: output the prediction
        self.fc = nn.Linear(lstm_hidden_dim * 2, 1)  # BiLSTM output size = 2*hidden_dim

    def forward(self, x):
        # x shape: (batch_size, seq_len, input_dim)

        # Conv1d needs features on dim 1: (batch_size, input_dim, seq_len)
        x = x.permute(0, 2, 1)
        x = self.cnn(x)
        x = self.relu(x)
        x = self.dropout_cnn(x)

        # Restore the layout: (batch_size, seq_len, cnn_out_channels)
        x = x.permute(0, 2, 1)

        # BiLSTM layer
        lstm_out, _ = self.bilstm(x)  # lstm_out shape: (batch_size, seq_len, 2*lstm_hidden_dim)
        # Use the output at the last time step as the summary feature.
        # Note: at this position the backward direction has only seen the final
        # step; concatenating both directions' final hidden states is a common alternative.
        lstm_out = lstm_out[:, -1, :]
        lstm_out = self.dropout_lstm(lstm_out)

        # Fully connected output
        out = self.fc(lstm_out)  # out shape: (batch_size, 1)
        # Squeeze only the last dim so a batch of size 1 keeps shape (1,)
        return out.squeeze(-1)  # (batch_size,)

# Instantiate the model
model = CNNBiLSTM(
    input_dim=3,  # number of features
    cnn_out_channels=16,
    cnn_kernel_size=3,
    lstm_hidden_dim=32,
    lstm_num_layers=2,
    dropout_rate=0.2
)
print(model)
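
The model is exported later on, which presupposes it has been trained. A minimal training loop might look like the sketch below. The function name `train_model`, the choice of MSE loss with Adam, and the default hyperparameters are all assumptions for illustration; the function works with any `nn.Module` that maps a batch of windows to one value per sample.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_model(model, X_train, y_train, X_val, y_val,
                epochs=20, batch_size=64, lr=1e-3):
    """Train a regression model and return the per-epoch validation loss."""
    # Shuffling whole windows is fine: the temporal order lives inside each window.
    loader = DataLoader(TensorDataset(X_train, y_train),
                        batch_size=batch_size, shuffle=True)
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    history = []
    for epoch in range(epochs):
        model.train()  # enable dropout
        for xb, yb in loader:
            optimizer.zero_grad()
            # reshape(-1) keeps predictions and targets aligned as 1-D vectors
            loss = criterion(model(xb).reshape(-1), yb.reshape(-1))
            loss.backward()
            optimizer.step()
        model.eval()  # disable dropout for evaluation
        with torch.no_grad():
            val_loss = criterion(model(X_val).reshape(-1), y_val.reshape(-1)).item()
        history.append(val_loss)
    return history
```

With the tensors from section 2.3 this could be called as `history = train_model(model, X_train, y_train, X_test, y_test)`. Passing the test set as validation data is only for brevity here; holding out a separate, chronologically later slice of the training data would be the cleaner choice for model selection.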

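4. Model Deployment

A trained model needs to be deployed to production. Two mainstream export paths from PyTorch are TorchScript and ONNX.

4.1 TorchScript export (PyTorch-native deployment)

TorchScript is a serialized, statically analyzable representation of a PyTorch model that can run without a Python interpreter (for example from C++ via LibTorch):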

# 1. Switch the model to evaluation mode
model.eval()

# 2. Create an example input (must match the model's input shape)
example_input = torch.randn(1, window_size, 3)  # (batch_size, seq_len, input_dim)

# 3. Export the TorchScript model by tracing
traced_model = torch.jit.trace(model, example_input)
traced_model.save('cnn_bilstm_ts.pt')

# 4. Load the exported model and run a test prediction
loaded_model = torch.jit.load('cnn_bilstm_ts.pt')
with torch.no_grad():
    test_pred = loaded_model(example_input)
print(f"TorchScript prediction: {test_pred.item():.4f}")
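
Tracing records the operations executed for the example input, so it is worth confirming that the traced module reproduces the eager model on other inputs before shipping it. The helper below is a small sketch of such a check; `traced_matches_eager` and its tolerance are hypothetical, and the stand-in model in the demo is deliberately tiny.

```python
import torch
import torch.nn as nn

def traced_matches_eager(model, example_input, check_inputs, atol=1e-5):
    """Trace a model and compare it with the eager model on extra inputs."""
    model.eval()  # dropout must be off, otherwise outputs are not comparable
    traced = torch.jit.trace(model, example_input)
    with torch.no_grad():
        return all(
            torch.allclose(model(x), traced(x), atol=atol)
            for x in check_inputs
        )

# Stand-in model used only for this demonstration.
demo_model = nn.Sequential(nn.Flatten(), nn.Linear(24 * 3, 1))
ok = traced_matches_eager(
    demo_model,
    torch.randn(1, 24, 3),
    [torch.randn(1, 24, 3), torch.randn(8, 24, 3)],
)
print("traced output matches eager output:", ok)
```

Including the batch sizes and window lengths expected in production among `check_inputs` makes the check more meaningful, since a trace is only guaranteed to be faithful for inputs like the one it was recorded with.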

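4.2 ONNX export (cross-platform deployment)

ONNX is an open model format that can be served by runtimes such as ONNX Runtime, TensorRT, and OpenVINO, and loaded into frameworks such as TensorFlow through converters: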

import onnx
import onnxruntime as ort

# 1. Export the ONNX model (example_input has batch size 1)
torch.onnx.export(
    model,
    example_input,
    'cnn_bilstm.onnx',
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={
        'input': {0: 'batch_size'},  # dynamic batch dimension
        'output': {0: 'batch_size'}
    },
    opset_version=12
)

# 2. Validate the ONNX model
# Check that the model is well-formed
onnx_model = onnx.load('cnn_bilstm.onnx')
onnx.checker.check_model(onnx_model)

# Run the ONNX model on CPU
ort_session = ort.InferenceSession('cnn_bilstm.onnx', providers=['CPUExecutionProvider'])
ort_inputs = {ort_session.get_inputs()[0].name: example_input.numpy()}
ort_outputs = ort_session.run(None, ort_inputs)
# Flatten before indexing so this works regardless of the output rank
print(f"ONNX prediction: {ort_outputs[0].reshape(-1)[0]:.4f}")

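4.3 Deployment notes

Saving the scalers: scaler_X and scaler_y must be saved (here as pickle files) so that deployment can apply the same preprocessing and convert predictions back to the original scale:
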
import pickle
with open('scaler_X.pkl', 'wb') as f:
    pickle.dump(scaler_X, f)
with open('scaler_y.pkl', 'wb') as f:
    pickle.dump(scaler_y, f)

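Input shape check: at inference time, make sure the input has shape (batch_size, window_size, input_dim).
Inference optimization: torch.compile() (PyTorch 2.0+) can speed up inference, or the model can be converted to a TensorRT engine.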
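
Putting these notes together, an end-to-end prediction for a single raw window could be sketched as follows. The function `predict_original_scale` is hypothetical; it assumes the scalers expose scikit-learn style `transform` and `inverse_transform` methods and that the model returns one standardized value per sample.

```python
import numpy as np
import torch

def predict_original_scale(model, scaler_X, scaler_y, window):
    """Scale one raw window, run the model, and undo the target scaling."""
    window = np.asarray(window, dtype=np.float32)
    if window.ndim != 2:
        raise ValueError("window must have shape (window_size, input_dim)")
    # Apply the training-time feature scaling: (window_size, input_dim)
    scaled = scaler_X.transform(window)
    # Add the batch dimension expected by the model: (1, window_size, input_dim)
    x = torch.as_tensor(np.asarray(scaled), dtype=torch.float32).unsqueeze(0)
    model.eval()
    with torch.no_grad():
        pred_scaled = model(x).reshape(-1, 1).cpu().numpy()
    # Map the standardized prediction back to the original units.
    return float(scaler_y.inverse_transform(pred_scaled)[0, 0])
```

In a service, the scalers would typically be loaded once at startup with `pickle.load` and the function called per request, for example `predict_original_scale(loaded_model, scaler_X, scaler_y, latest_window)`, where `latest_window` is assumed to hold the most recent `window_size` rows of raw features.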

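5. Advanced Optimization Tips

Hyperparameter tuning: use Optuna to tune the window size, number of CNN channels, LSTM hidden size, and other hyperparameters.
Attention: add an attention layer after the BiLSTM to focus on the features of the most informative time steps.
Multi-step forecasting: widen the output layer to predict several steps at once, or forecast recursively.
Data augmentation: apply Gaussian noise, time warping, and similar strategies to improve robustness.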
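
As one illustration of the attention idea, the sketch below replaces "take the last time step" with a learned weighted average over all time steps. `AttentionPooling` is a hypothetical helper using a simple linear scoring layer; it is one of several possible attention designs, not the only way to do it.

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Learned-score attention pooling over time steps (illustrative sketch)."""

    def __init__(self, feature_dim):
        super().__init__()
        # One scalar relevance score per time step.
        self.score = nn.Linear(feature_dim, 1)

    def forward(self, seq_out):
        # seq_out: (batch, seq_len, feature_dim), e.g. the BiLSTM output
        weights = torch.softmax(self.score(seq_out), dim=1)  # (batch, seq_len, 1)
        context = (weights * seq_out).sum(dim=1)              # (batch, feature_dim)
        return context, weights.squeeze(-1)

pool = AttentionPooling(feature_dim=64)   # 64 = 2 * lstm_hidden_dim for hidden size 32
context, weights = pool(torch.randn(4, 24, 64))
print(tuple(context.shape), tuple(weights.shape))
```

In the model from section 3, this would mean creating the layer with `feature_dim = 2 * lstm_hidden_dim` and feeding `context` to the fully connected layer in place of `lstm_out[:, -1, :]`. The returned weights can also be plotted to see which time steps the model relies on.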

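Summary

CNN-BiLSTM uses a CNN to extract local temporal features and a BiLSTM to model long-range dependencies in both directions, making it an effective hybrid architecture for time series forecasting.
The core of preprocessing is building sample pairs with a sliding window, and the test set must be standardized with statistics from the training set.
For deployment, choose TorchScript (PyTorch-native) or ONNX (cross-platform), and save the data scalers alongside the model so predictions can be made end to end.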