Regional Power Load Forecasting for a Grid Company (LSTM Algorithm)

Business pain points: a regional grid company (covering 10 prefecture-level cities, with a peak load of 5,000 MW) faces three major problems:

  • Low forecast accuracy: the legacy ARIMA model has large errors (MAPE = 18%), causing either redundant reserve capacity (about RMB 20 million of wasted electricity cost per year) or supply shortfalls (annual load-shedding losses above RMB 15 million)
  • Under-used features: dynamic features such as weather (temperature/humidity), holidays, and renewable output (wind/solar) are not fully fused, so forecast deviation exceeds 30% under extreme weather
  • Poor responsiveness: load patterns shift quickly with seasons and industrial changes (e.g., the summer air-conditioning surge), while the model iteration cycle is long (quarterly), so the model cannot adapt in time

Algorithm team: time-series data cleaning (Spark), feature engineering (sliding window / multi-source feature fusion), LSTM model building (PyTorch), model training and evaluation (MAE/RMSE/MAPE), model storage (MinIO)

Deliverables:

  • Model files: model/lstm_best.pth (PyTorch model), model/scaler.pkl (feature scaler)
  • Feature data: feature_path (sliding-window feature matrices, .npy), label_path (label sequences, .npy)
  • Tools: data cleaning script (load_cleaning.py), feature generation script (feature_generator.py)
  • Documents: LSTM Hyperparameter Tuning Report, Feature Engineering Handbook, Evaluation Metrics Guide (MAE/RMSE/MAPE definitions)

Algorithm performance:

  • Test-set MAPE ≤ 8%, RMSE ≤ 50 MW, prediction response time ≤ 5 minutes (per forecast)
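
For reference, the three metrics follow their standard definitions (matching evaluate_model.py later in this article), with y_i the actual load, ŷ_i the forecast, and n the number of evaluated points:

latex
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right|, \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left( y_i - \hat{y}_i \right)^2}, \qquad
\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left| \frac{y_i - \hat{y}_i}{y_i} \right|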

Business team: API gateway, load forecasting service (Python/FastAPI), dispatch system integration (Java/Spring Boot), monitoring and alerting

Deliverables:

  • Systems: load forecasting API (Python/FastAPI), dispatch system (Java/Spring Boot), dispatch terminal (Vue frontend)
  • API documentation: Load Forecasting API Manual (with a request example: the latest 72 hours = 288 time points × 8 features; see the client sketch after this list)
  • Operations documentation: K8s Deployment Guide, Model Hot-Update Procedure
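
A minimal client-side sketch of the /predict call described in the manual; the endpoint host and the region ID are assumptions, and the payload fields mirror load_forecasting_service/main.py shown later:

python
# Illustrative API call; host/port and region ID are placeholders.
import numpy as np
import requests

recent_features = np.random.rand(288, 8).tolist()  # stand-in for the real 72h x 8-feature matrix

payload = {
    "region_id": "R01",                   # hypothetical region ID
    "timestamp": "2024-07-01T12:00:00",   # forecast start time
    "recent_features": recent_features,   # 288 time points x 8 features
}
resp = requests.post("http://api-gateway.example.com/predict", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["prediction"])          # 16 forecast points (the next 4 hours)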

Business value: reserve capacity optimization saves about RMB 20 million per year, load-shedding losses fall by about RMB 15 million per year, and the renewable absorption rate improves by 10%

Grid operations team: views forecast curves on the dispatch terminal; the emergency command system builds load-shedding strategies on top of the forecasts

Data Preparation and Feature Transformation

(1) Raw data structure (multi-source time series from grid systems)

The raw data covers historical load, weather, calendar, and renewable output, stored in the data lake (MinIO), and suffers from missing values (sensor failures), outliers (meter errors), and time-series misalignment

① Historical load data (historical_load, Parquet)

② Weather data (weather_data, CSV)

③ Calendar features (date_features, generated via SQL)

④ Renewable output (renewable_output, Parquet)
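
The exact schemas are not reproduced here; as a working assumption, the column names below are taken from the cleaning and feature code later in this article, and the values are placeholders only:

python
# Assumed minimal schemas, inferred from the code in this article; real tables may carry more columns.
import pandas as pd

historical_load = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=4, freq="15min"),
    "region_id": ["R01"] * 4,
    "load_mw": [3120.5, 3098.2, None, 3155.0],   # missing value from a sensor outage
})
weather_data = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=4, freq="15min"),
    "region_id": ["R01"] * 4,
    "temp_c": [28.1, 28.3, 28.6, 29.0],
    "humidity": [0.62, 0.63, 0.61, 0.60],
})
date_features = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=2, freq="D"),
    "is_holiday": [1, 0],
    "day_of_week": [0, 1],
    "season": [4, 4],
})
renewable_output = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=4, freq="15min"),
    "region_id": ["R01"] * 4,
    "wind_mw": [210.0, 205.5, 198.3, 202.1],
    "solar_mw": [0.0, 0.0, 0.0, 0.0],
})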

(2) Data cleaning (full code, owned by the algorithm team)

Goals: handle missing values (linear interpolation), outliers (3σ rule), and time-series alignment (unified timestamp granularity)

Code file: data_processing/load_cleaning.py (Python)

python
import pandas as pd
import numpy as np
from scipy.interpolate import interp1d
import logging

logging.basicConfig(level=logging.INFO)  
logger = logging.getLogger(__name__)  

def clean_load_data(raw_load_path: str, output_path: str) -> pd.DataFrame:
    """Clean historical load data: interpolate missing values, fix outliers, align the time series."""
    # 1. Read raw data
    df = pd.read_parquet(raw_load_path)
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df.set_index("timestamp", inplace=True)
    logger.info(f"Raw load records: {len(df)}, missing ratio: {df['load_mw'].isnull().mean():.2%}")

    # 2. Missing values (linear interpolation first, then same-period historical backfill)
    # Process each region separately
    cleaned_dfs = []
    for region_id in df["region_id"].unique():
        region_df = df[df["region_id"] == region_id].copy()
        # Linear interpolation for short gaps (<= 4 time points)
        region_df["load_mw"] = region_df["load_mw"].interpolate(method="linear", limit=4)
        # Long gaps (> 4 time points): backfill with the same period of the previous week
        mask = region_df["load_mw"].isnull()
        if mask.any():
            region_df.loc[mask, "load_mw"] = region_df["load_mw"].shift(672)[mask]  # 672 = 7 days x 96 fifteen-minute points
            logger.warning(f"Region {region_id}: {mask.sum()} gaps remained, filled with last week's values")
        cleaned_dfs.append(region_df)

    cleaned_df = pd.concat(cleaned_dfs).reset_index()

    # 3. Outliers (3σ rule: values more than 3 standard deviations from the mean are treated as outliers)
    mean_load = cleaned_df.groupby("region_id")["load_mw"].mean()
    std_load = cleaned_df.groupby("region_id")["load_mw"].std()
    cleaned_df["is_outlier"] = False
    for region_id in cleaned_df["region_id"].unique():
        region_mask = cleaned_df["region_id"] == region_id
        upper_bound = mean_load[region_id] + 3 * std_load[region_id]
        lower_bound = mean_load[region_id] - 3 * std_load[region_id]
        outlier_mask = region_mask & ((cleaned_df["load_mw"] > upper_bound) | (cleaned_df["load_mw"] < lower_bound))

        cleaned_df.loc[outlier_mask, "load_mw"] = np.nan  # mark as missing first
        cleaned_df.loc[outlier_mask, "is_outlier"] = True

        # Fill the removed outliers by interpolation
        cleaned_df.loc[region_mask, "load_mw"] = cleaned_df.loc[region_mask, "load_mw"].interpolate()

    # 4. Save the cleaned data
    cleaned_df.to_parquet(output_path, index=False)
    logger.info(f"Cleaned load records: {len(cleaned_df)}, saved to {output_path}")
    return cleaned_df

if __name__ == "__main__":
    raw_load_path = "s3://grid-data-lake/raw/historical_load.parquet"
    output_path = "s3://grid-data-lake/cleaned/load_cleaned.parquet"
    clean_load_data(raw_load_path, output_path)
	

(3) Feature engineering and feature data generation (defining feature_path/label_path)

Convert the cleaned data into LSTM-ready time-series feature matrices (feature_path) and label sequences (label_path); the core step is building sliding-window samples (historical features in → future load out)

  • Time-series features: historical load (past 24 hours = 96 points, i.e., t-96 to t-1), load change rate (first difference)
  • External features: weather (current temperature/humidity), calendar (holiday flag, day of week, season), renewable output (current wind/solar output)
  • Sliding-window construction: use the past 72 hours (288 points) of features to predict the next 4 hours (16 points); window size = 288 (input steps), prediction horizon = 16 steps
  • Feature scaling: standardize all numeric features (load, temperature, output)
  • Feature data storage: feature_path points to the sliding-window feature matrices (.npy), label_path points to the label sequences (the next 4 hours of load, used for evaluation)

Code files: feature_engineering/feature_generator.py (feature generation), feature_engineering/generate_feature_data.py (defines feature_path/label_path)

① Sliding-window feature construction (feature_generator.py)

python
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
import logging

logging.basicConfig(level=logging.INFO)  
logger = logging.getLogger(__name__)  

def create_sliding_window(df: pd.DataFrame, input_steps: int = 288, output_steps: int = 16, target_col: str = "load_mw") -> tuple:
    """
    Build sliding-window samples: input (input_steps of features) -> label (output_steps of targets).

    :param df: cleaned, merged data (timestamp, region_id, load_mw, temp_c, is_holiday, ...)
    :param input_steps: input window length (past 72 hours = 288 fifteen-minute points)
    :param output_steps: prediction horizon (next 4 hours = 16 fifteen-minute points)
    :param target_col: target column (load_mw)
    :return: X (feature tensor, shape=(num_samples, input_steps, num_features)), y (labels, shape=(num_samples, output_steps))
    """
    features = []
    labels = []
    regions = df["region_id"].unique()

    for region_id in regions:
        region_df = df[df["region_id"] == region_id].sort_values("timestamp").reset_index(drop=True)
        feature_df = region_df.drop(columns=["timestamp", "region_id"])
        values = feature_df.values                            # feature matrix (rows: time points, columns: features)
        target_idx = feature_df.columns.get_loc(target_col)   # target index within the feature columns

        # Iterate over all valid windows (start at the input_steps-th point, stop output_steps before the end)
        for i in range(input_steps, len(values) - output_steps):
            # Input: all features for the past input_steps time points (load, temperature, holiday flag, ...)
            X_window = values[i - input_steps:i, :]            # shape=(input_steps, num_features)
            # Label: the target load for the next output_steps time points
            y_window = values[i:i + output_steps, target_idx]  # shape=(output_steps,)
            features.append(X_window)
            labels.append(y_window)

    # Convert to numpy arrays
    X = np.array(features, dtype=np.float32)  # shape=(num_samples, input_steps, num_features)
    y = np.array(labels, dtype=np.float32)    # shape=(num_samples, output_steps)
    logger.info(f"Sliding windows built: samples={len(X)}, X shape={X.shape}, y shape={y.shape}")
    return X, y

def scale_features(X_train: np.ndarray, X_val: np.ndarray, X_test: np.ndarray) -> tuple:
    """Standardize features using the training-set mean/std."""
    scaler = StandardScaler()
    # Flatten the training set to 2D: (num_samples * input_steps, num_features)
    train_features_flat = X_train.reshape(-1, X_train.shape[-1])
    scaler.fit(train_features_flat)

    # Scale train/val/test sets with the same scaler
    X_train_scaled = scaler.transform(train_features_flat).reshape(X_train.shape)
    X_val_scaled = scaler.transform(X_val.reshape(-1, X_val.shape[-1])).reshape(X_val.shape)
    X_test_scaled = scaler.transform(X_test.reshape(-1, X_test.shape[-1])).reshape(X_test.shape)

    logger.info("Feature standardization finished (train-set mean/std)")
    return X_train_scaled, X_val_scaled, X_test_scaled, scaler
	

② Feature data generation (generate_feature_data.py, defines feature_path/label_path)

python
import pandas as pd  
import numpy as np  
from feature_generator import create_sliding_window, scale_features  
import logging  

logging.basicConfig(level=logging.INFO)  
logger = logging.getLogger(__name__)  

def generate_feature_data(cleaned_load_path: str, weather_path: str, date_path: str, renewable_path: str) -> tuple:
    """Generate LSTM feature data (feature_path) and labels (label_path)."""
    # 1. Load and merge the multi-source data
    load_df = pd.read_parquet(cleaned_load_path)
    load_df["timestamp"] = pd.to_datetime(load_df["timestamp"])
    load_df = load_df.drop(columns=["is_outlier"], errors="ignore")  # keep only model features

    weather_df = pd.read_csv(weather_path, parse_dates=["timestamp"])
    date_df = pd.read_csv(date_path, parse_dates=["date"])
    renewable_df = pd.read_parquet(renewable_path)

    # Merge all features (load/weather/renewables aligned on timestamp and region_id, calendar features on the date)
    merged_df = load_df.merge(weather_df, on=["timestamp", "region_id"], how="left")
    merged_df["date"] = merged_df["timestamp"].dt.normalize()
    merged_df = merged_df.merge(date_df, on="date", how="left").drop(columns=["date"])
    merged_df = merged_df.merge(renewable_df, on=["timestamp", "region_id"], how="left")

    # 2. Build sliding-window samples (288 input steps -> 16 prediction steps)
    X, y = create_sliding_window(merged_df, input_steps=288, output_steps=16, target_col="load_mw")

    # 3. Split the dataset (train:val:test = 7:2:1, chronological split to avoid leaking future data)
    split_idx1 = int(len(X) * 0.7)
    split_idx2 = int(len(X) * 0.9)
    X_train, X_val, X_test = X[:split_idx1], X[split_idx1:split_idx2], X[split_idx2:]
    y_train, y_val, y_test = y[:split_idx1], y[split_idx1:split_idx2], y[split_idx2:]

    # 4. Standardize features
    X_train_scaled, X_val_scaled, X_test_scaled, scaler = scale_features(X_train, X_val, X_test)

    # 5. Define storage paths (the algorithm team's data lake locations)
    feature_path = {
        "train": "s3://grid-data-lake/processed/lstm_features_train.npy",
        "val": "s3://grid-data-lake/processed/lstm_features_val.npy",
        "test": "s3://grid-data-lake/processed/lstm_features_test.npy"
    }
    label_path = {
        "train": "s3://grid-data-lake/processed/lstm_labels_train.npy",
        "val": "s3://grid-data-lake/processed/lstm_labels_val.npy",
        "test": "s3://grid-data-lake/processed/lstm_labels_test.npy"
    }

    # 6. Save the arrays
    np.save(feature_path["train"], X_train_scaled)
    np.save(feature_path["val"], X_val_scaled)
    np.save(feature_path["test"], X_test_scaled)
    np.save(label_path["train"], y_train)
    np.save(label_path["val"], y_val)
    np.save(label_path["test"], y_test)

    logger.info(f"""
    [Algorithm team feature data storage notes]
    - feature_path: {feature_path}
      Contents: sliding-window feature matrices (.npy), shape=(num_samples, input_steps=288, num_features=8), features:
        load_mw (historical load), temp_c (temperature), humidity, is_holiday (holiday flag), day_of_week, season, wind_mw (wind output), solar_mw (solar output)
      Example (first training sample):\n{X_train_scaled[0, :5, :3]} (first 5 time points, first 3 features)

    - label_path: {label_path}
      Contents: label sequences (next 16 load steps, .npy), shape=(num_samples, output_steps=16)
      Example (first training label):\n{y_train[0]} (the next 4 hours = 16 fifteen-minute points of load)
    """)

    return feature_path, label_path, X_train_scaled, y_train, X_val_scaled, y_val, X_test_scaled, y_test
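
# Illustrative usage (not in the original write-up): the cleaned-load path matches load_cleaning.py above;
# the weather/date/renewable paths are assumed names in the same data lake.
if __name__ == "__main__":
    feature_path, label_path, *_ = generate_feature_data(
        cleaned_load_path="s3://grid-data-lake/cleaned/load_cleaned.parquet",
        weather_path="s3://grid-data-lake/cleaned/weather_cleaned.csv",
        date_path="s3://grid-data-lake/cleaned/date_features.csv",
        renewable_path="s3://grid-data-lake/cleaned/renewable_output.parquet",
    )
    print(feature_path["train"], label_path["train"])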

Code Structure

Algorithm team repository (algorithm-load-forecasting, Python)

text
algorithm-load-forecasting/  
├── data_processing/                # Data cleaning (missing values / outliers)  
│   ├── load_cleaning.py             # Load data cleaning (with detailed comments)  
│   ├── weather_cleaning.py          # Weather data cleaning  
│   └── requirements.txt             # Dependencies: pandas, numpy, scipy  
├── feature_engineering/            # Feature engineering (sliding window / scaling)  
│   ├── feature_generator.py         # Sliding-window construction + feature scaling  
│   ├── generate_feature_data.py     # Feature data generation (documents feature_path/label_path)  
│   └── requirements.txt             # Dependencies: scikit-learn  
├── model_training/                 # LSTM model training (core)  
│   ├── lstm_model.py                 # LSTM model definition (PyTorch, with gating comments)  
│   ├── train_lstm.py                 # Training entry point (loss / optimizer / early stopping)  
│   ├── evaluate_model.py             # Evaluation (MAE/RMSE/MAPE)  
│   └── lstm_params.yaml               # Tuning record (hidden_size=64, lr=0.001)  
├── model_storage/                  # Model storage (MinIO)  
│   ├── save_model.py                 # Save the model (.pth)  
│   └── load_model.py                 # Load the model (for inference)  
└── mlflow_tracking/                # MLflow experiment tracking  
    └── run_lstm_experiment.py         # Logs hyperparameters / metrics / models

(1) Algorithm team: LSTM model definition (model_training/lstm_model.py, with explanatory comments)

LSTM layer structure (input/forget/output gates), multi-feature input handling, multi-step forecasting via a Seq2Seq-style encoder-decoder (the gate equations are recalled below)
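
For reference, the gating mechanism follows the standard LSTM cell updates (the form implemented by PyTorch's nn.LSTM, with per-gate bias terms folded into a single b); σ is the sigmoid function and ⊙ denotes element-wise multiplication:

latex
i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i), \quad
f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f), \quad
o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o)
g_t = \tanh(W_{xg} x_t + W_{hg} h_{t-1} + b_g), \qquad
c_t = f_t \odot c_{t-1} + i_t \odot g_t, \qquad
h_t = o_t \odot \tanh(c_t)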

python
import torch  
import torch.nn as nn  
import logging  

logging.basicConfig(level=logging.INFO)  
logger = logging.getLogger(__name__)  

class LSTMForecaster(nn.Module):  
    """LSTM load forecasting model (Seq2Seq structure: encoder-decoder)."""  
    def __init__(self, input_size: int, hidden_size: int, output_steps: int, num_layers: int = 2):  
        """  
        :param input_size: number of input features (8: load, temperature, humidity, ...)  
        :param hidden_size: LSTM hidden units (64)  
        :param output_steps: prediction horizon (16: the next 4 hours as 16 points)  
        :param num_layers: number of stacked LSTM layers (2)  
        """  
        super(LSTMForecaster, self).__init__()  
        self.hidden_size = hidden_size  
        self.num_layers = num_layers  
        self.output_steps = output_steps  

        # LSTM encoder (consumes the historical feature sequence)  
        self.lstm_encoder = nn.LSTM(  
            input_size=input_size,  
            hidden_size=hidden_size,  
            num_layers=num_layers,  
            batch_first=True,  # input shape (batch_size, seq_len, input_size)  
            dropout=0.2        # inter-layer dropout to reduce overfitting  
        )  

        # LSTM decoder (produces the future load sequence)  
        self.lstm_decoder = nn.LSTM(  
            input_size=hidden_size,  # the decoder consumes the encoder's hidden representation  
            hidden_size=hidden_size,  
            num_layers=num_layers,  
            batch_first=True,  
            dropout=0.2  
        )  

        # Fully connected layer (maps decoder outputs to the predicted load)  
        self.fc = nn.Linear(hidden_size, 1)  # one value per step (load)  

    def forward(self, x: torch.Tensor) -> torch.Tensor:  
        """Forward pass: historical features in -> the next 16 load steps out."""  
        # x.shape = (batch_size, input_steps=288, input_size=8)  

        # 1. Encoder: process the input sequence; keep the final hidden and cell states  
        encoder_out, (hidden, cell) = self.lstm_encoder(x)  # encoder_out.shape=(batch_size, 288, hidden_size)  

        # 2. Decoder: initialize from the encoder states and predict the future steps  
        # Decoder input: the encoder's last-step output, repeated output_steps times  
        decoder_input = encoder_out[:, -1:, :].repeat(1, self.output_steps, 1)  # shape=(batch_size, 16, hidden_size)  
        decoder_out, _ = self.lstm_decoder(decoder_input, (hidden, cell))  # decoder_out.shape=(batch_size, 16, hidden_size)  

        # 3. Fully connected layer: output the 16-step load forecast  
        output = self.fc(decoder_out)  # shape=(batch_size, 16, 1)  
        return output.squeeze(-1)  # drop the last dimension, shape=(batch_size, 16)  

# Example: model initialization  
def init_lstm_model(input_size=8, hidden_size=64, output_steps=16, num_layers=2):  
    model = LSTMForecaster(input_size, hidden_size, output_steps, num_layers)  
    logger.info(f"LSTM model initialized: input_size={input_size}, hidden_size={hidden_size}, output_steps={output_steps}")  
    return model
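
# Quick shape sanity check (illustrative addition, not part of the delivered file):
# an untrained LSTMForecaster maps a (batch, 288, 8) input to a (batch, 16) forecast.
if __name__ == "__main__":
    model = init_lstm_model(input_size=8, hidden_size=64, output_steps=16, num_layers=2)
    dummy_batch = torch.randn(4, 288, 8)  # (batch_size, input_steps, input_size)
    with torch.no_grad():
        forecast = model(dummy_batch)
    logger.info(f"Forecast shape: {forecast.shape}")  # expected: torch.Size([4, 16])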

(2) Algorithm team: model training and evaluation (model_training/train_lstm.py + evaluate_model.py)

  • Epoch: one full pass over the training set (e.g., 100 epochs means the model has seen every sample 100 times)
  • Batch Size: number of samples per gradient step (e.g., 32 means each parameter update uses 32 samples)
  • Loss Function: MSE (mean squared error), the squared gap between predictions and ground truth, the standard choice for regression
  • Optimizer: Adam, an adaptive-learning-rate optimizer that speeds up convergence
  • Early Stopping: stop training once the validation loss stops improving, to prevent overfitting
python
# model_training/train_lstm.py (training loop)
import torch  
import torch.nn as nn  
from torch.utils.data import DataLoader, TensorDataset  
import numpy as np  
from lstm_model import init_lstm_model  
import mlflow  
import logging  

logging.basicConfig(level=logging.INFO)  
logger = logging.getLogger(__name__)  

def train_lstm(  
    X_train: np.ndarray, y_train: np.ndarray,  
    X_val: np.ndarray, y_val: np.ndarray,  
    hidden_size: int = 64, lr: float = 0.001,  
    batch_size: int = 32, epochs: int = 100, patience: int = 10  
) -> nn.Module:  
    """训练LSTM模型"""  
    # 1. 设备配置(GPU优先)  
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  
    logger.info(f"使用设备:{device}")  

    # 2. 转换为PyTorch张量  
    X_train_tensor = torch.tensor(X_train, dtype=torch.float32).to(device)  # shape=(样本数, 288, 8)  
    y_train_tensor = torch.tensor(y_train, dtype=torch.float32).to(device)  # shape=(样本数, 16)  
    X_val_tensor = torch.tensor(X_val, dtype=torch.float32).to(device)  
    y_val_tensor = torch.tensor(y_val, dtype=torch.float32).to(device)  

    # 3. 创建数据集和数据加载器  
    train_dataset = TensorDataset(X_train_tensor, y_train_tensor)  
    val_dataset = TensorDataset(X_val_tensor, y_val_tensor)  
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)  
    val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)  

    # 4. 初始化模型、损失函数、优化器  
    model = init_lstm_model(input_size=X_train.shape[-1], hidden_size=hidden_size, output_steps=y_train.shape[-1])  
    model.to(device)  
    criterion = nn.MSELoss()  # 均方误差损失(回归任务)  
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # Adam优化器  
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.5, patience=5)  # 学习率衰减  

    # 5. 训练循环(含早停)  
    best_val_loss = float("inf")  
    early_stop_counter = 0  

    for epoch in range(epochs):  
        # 训练模式  
        model.train()  
        train_loss = 0.0  
        for batch_X, batch_y in train_loader:  
            optimizer.zero_grad()  # 清零梯度  
            outputs = model(batch_X)  # 前向传播  
            loss = criterion(outputs, batch_y)  # 计算损失  
            loss.backward()  # 反向传播  
            optimizer.step()  # 更新参数  
            train_loss += loss.item() * batch_X.size(0)  
        train_loss /= len(train_dataset)  

        # 验证模式  
        model.eval()  
        val_loss = 0.0  
        with torch.no_grad():  
            for batch_X, batch_y in val_loader:  
                outputs = model(batch_X)  
                loss = criterion(outputs, batch_y)  
                val_loss += loss.item() * batch_X.size(0)  
        val_loss /= len(val_dataset)  

        # 记录MLflow实验  
        with mlflow.start_run(run_name=f"lstm_epoch_{epoch}"):  
            mlflow.log_metric("train_loss", train_loss, step=epoch)  
            mlflow.log_metric("val_loss", val_loss, step=epoch)  
            mlflow.log_param("hidden_size", hidden_size)  
            mlflow.log_param("lr", lr)  

        # 早停判断  
        if val_loss < best_val_loss:  
            best_val_loss = val_loss  
            torch.save(model.state_dict(), "model/lstm_best.pth")  # 保存最佳模型  
            early_stop_counter = 0  
            logger.info(f"Epoch {epoch+1}: 验证损失下降至{best_val_loss:.4f},保存模型")  
        else:  
            early_stop_counter += 1  
            if early_stop_counter >= patience:  
                logger.info(f"早停触发({patience}轮验证损失未下降),停止训练")  
                break  

        logger.info(f"Epoch {epoch+1}/{epochs}, Train Loss: {train_loss:.4f}, Val Loss: {val_loss:.4f}")  
        scheduler.step(val_loss)  # 更新学习率  

    logger.info(f"训练完成,最佳验证损失:{best_val_loss:.4f}")  
    model.load_state_dict(torch.load("model/lstm_best.pth"))  # 加载最佳模型  
    return model

Evaluation code (model_training/evaluate_model.py, computing MAE/RMSE/MAPE):

python
import numpy as np  
import torch  
from sklearn.metrics import mean_absolute_error, mean_squared_error  
import logging  

logging.basicConfig(level=logging.INFO)  
logger = logging.getLogger(__name__)  

def evaluate_model(model: torch.nn.Module, X_test: np.ndarray, y_test: np.ndarray, device: torch.device) -> dict:  
    """Evaluate model performance (MAE/RMSE/MAPE)."""  
    model.eval()  
    X_test_tensor = torch.tensor(X_test, dtype=torch.float32).to(device)  
    y_test_tensor = torch.tensor(y_test, dtype=torch.float32).to(device)  

    with torch.no_grad():  
        y_pred = model(X_test_tensor).cpu().numpy()  # predictions (shape=(num_samples, 16))  
        y_true = y_test_tensor.cpu().numpy()         # ground truth  

    # Compute metrics (averaged over all samples and all forecast steps)  
    mae = mean_absolute_error(y_true.flatten(), y_pred.flatten())  
    rmse = np.sqrt(mean_squared_error(y_true.flatten(), y_pred.flatten()))  
    mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100  # percentage  

    metrics = {"MAE": mae, "RMSE": rmse, "MAPE": mape}  
    logger.info(f"Evaluation results: MAE={mae:.2f} MW, RMSE={rmse:.2f} MW, MAPE={mape:.2f}%")  
    return metrics
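
mlflow_tracking/run_lstm_experiment.py is listed in the repository tree but not shown; a minimal sketch tying the pieces together, assuming the .npy arrays saved by generate_feature_data.py have been synced locally, that model_training/ is on PYTHONPATH, and that the MLflow experiment name is free to choose:

python
# mlflow_tracking/run_lstm_experiment.py -- minimal sketch under the assumptions above
import numpy as np
import torch
import mlflow

from train_lstm import train_lstm          # assumes model_training/ is on PYTHONPATH
from evaluate_model import evaluate_model

if __name__ == "__main__":
    # Local copies of the arrays written by generate_feature_data.py
    X_train, y_train = np.load("lstm_features_train.npy"), np.load("lstm_labels_train.npy")
    X_val, y_val = np.load("lstm_features_val.npy"), np.load("lstm_labels_val.npy")
    X_test, y_test = np.load("lstm_features_test.npy"), np.load("lstm_labels_test.npy")

    mlflow.set_experiment("lstm-load-forecasting")      # assumed experiment name
    with mlflow.start_run(run_name="lstm_seq2seq"):     # single run; train_lstm logs per-epoch metrics into it
        model = train_lstm(X_train, y_train, X_val, y_val,
                           hidden_size=64, lr=0.001, batch_size=32, epochs=100, patience=10)
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        metrics = evaluate_model(model, X_test, y_test, device)  # MAE/RMSE/MAPE on the test set
        mlflow.log_metrics(metrics)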

Business team repository (business-load-forecasting, Java + Python)

text
business-load-forecasting/  
├── api_gateway/                    # API gateway (Kong configuration)  
├── load_forecasting_service/      # Load forecasting service (Python/FastAPI)  
│   ├── main.py                      # FastAPI service (calls the LSTM model)  
│   ├── model_loader.py              # Loads the MinIO model (.pth → PyTorch model)  
│   └── Dockerfile                  # Containerization config  
├── scheduling_system/              # Dispatch system (Java/Spring Boot)  
│   ├── backend/                    # Generation planning / reserve capacity calculation  
│   ├── frontend/                   # Vue frontend (dispatch terminal visualization)  
│   └── sql/                        # PostgreSQL schemas (forecast results / historical load)  
├── monitoring/                     # Monitoring and alerting (Prometheus + Grafana)  
└── deployment/                     # K8s deployment configs (services / Ingress)

load_forecasting_service/main.py (Python API service)

python
from fastapi import FastAPI, HTTPException  
import numpy as np  
import torch  
from lstm_model import init_lstm_model  
from model_loader import load_model_from_minio  
import logging  

logging.basicConfig(level=logging.INFO)  
logger = logging.getLogger(__name__)  

app = FastAPI(title="Grid Load Forecasting API")  

# Globals: model and scaler loaded at startup  
model = None  
scaler = None  
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  

@app.on_event("startup")  
def load_resources():  
    """Load the model and the feature scaler at startup."""  
    global model, scaler  
    model = load_model_from_minio("s3://grid-models/lstm_best.pth")  # load the model from MinIO  
    model.to(device)  
    model.eval()  
    # Load the scaler fitted at training time (in practice, joblib.load of the saved scaler.pkl)  
    logger.info("Model and scaler loaded")  

@app.post("/predict")  
def predict_load(request: dict):  
    """Prediction endpoint: historical features in -> the next 4 hours of load out."""  
    try:  
        # 1. Parse the request (region_id plus the latest 72 hours of features)  
        region_id = request["region_id"]  
        recent_features = np.array(request["recent_features"], dtype=np.float32)  # shape=(288, 8)  

        # 2. Standardize features with the scaler fitted at training time  
        scaled_features = scaler.transform(recent_features).reshape(1, 288, 8)  

        # 3. Model inference  
        with torch.no_grad():  
            input_tensor = torch.tensor(scaled_features, dtype=torch.float32).to(device)  
            prediction = model(input_tensor).cpu().numpy()[0]  # shape=(16,)  

        # 4. Return the forecast (the next 16 fifteen-minute points)  
        return {  
            "region_id": region_id,  
            "prediction": prediction.tolist(),  
            "timestamp": request["timestamp"]  # forecast start time  
        }  
    except Exception as e:  
        logger.error(f"Prediction failed: {str(e)}")  
        raise HTTPException(status_code=500, detail=str(e))  

if __name__ == "__main__":  
    import uvicorn  
    uvicorn.run(app, host="0.0.0.0", port=8080)
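
model_loader.py is referenced above but not shown; a minimal sketch of load_model_from_minio, assuming MinIO is reached through its S3-compatible API with boto3, that credentials come from environment variables, and that the model hyperparameters match lstm_model.py:

python
# load_forecasting_service/model_loader.py -- minimal sketch under the assumptions above
import os
import tempfile

import boto3
import torch

from lstm_model import init_lstm_model


def load_model_from_minio(s3_uri: str) -> torch.nn.Module:
    """Download the trained weights from MinIO (S3 API) and restore the LSTM model."""
    bucket, key = s3_uri.replace("s3://", "").split("/", 1)
    s3 = boto3.client(
        "s3",
        endpoint_url=os.getenv("MINIO_ENDPOINT", "http://minio:9000"),  # assumed env vars
        aws_access_key_id=os.getenv("MINIO_ACCESS_KEY"),
        aws_secret_access_key=os.getenv("MINIO_SECRET_KEY"),
    )
    with tempfile.NamedTemporaryFile(suffix=".pth") as tmp:
        s3.download_file(bucket, key, tmp.name)
        model = init_lstm_model(input_size=8, hidden_size=64, output_steps=16)
        model.load_state_dict(torch.load(tmp.name, map_location="cpu"))
    model.eval()
    return model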

Post-Deployment Application Flow

Step 1: Real-time data collection and feature construction

  • The SCADA system samples load every 15 minutes, and the weather/renewable systems sync their features in real time; data flows through Kafka into the data lake, where the feature matrix for the latest 72 hours (288 time points, 8 features each) is assembled

Step 2: Call the forecasting API to generate future load

  • Every 15 minutes the dispatch system calls the forecasting API (POST /predict) with the region ID and the latest 72-hour features; the API returns the load forecast for the next 4 hours (16 points)

Step 3: Dispatch decisions and reserve capacity optimization

  • The dispatch terminal visualizes the forecast curve (ECharts) against actual load; the dispatch system adjusts generation plans based on the forecast (e.g., starting reserve units ahead of a predicted load surge) and trims redundant reserve capacity

Step 4: Monitoring and iteration

  • The monitoring system tracks MAPE; if MAPE stays above 8% for 3 consecutive days, an Airflow job triggers the algorithm team's retraining pipeline (using the latest 7 days of data), keeping the model on a weekly update cadence (see the sketch below)
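
A hedged sketch of the retraining trigger described above (the 8% threshold and the 3-day rule come from the text; how daily MAPE values are pulled from the monitoring store and how the Airflow DAG is invoked are left as assumptions):

python
# Illustrative retrain-trigger check; daily_mape is assumed to come from the monitoring store.
from typing import Sequence

MAPE_THRESHOLD = 8.0      # percent, from the algorithm performance target
CONSECUTIVE_DAYS = 3      # rule: 3 consecutive days above threshold


def should_retrain(daily_mape: Sequence[float]) -> bool:
    """Return True if the last CONSECUTIVE_DAYS of MAPE all exceed the threshold."""
    recent = list(daily_mape)[-CONSECUTIVE_DAYS:]
    return len(recent) == CONSECUTIVE_DAYS and all(m > MAPE_THRESHOLD for m in recent)


if __name__ == "__main__":
    print(should_retrain([6.5, 8.4, 9.1, 8.7]))  # True -> trigger the Airflow retraining DAG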