[Deep Learning Networks: From Beginner to the Grave] Autoregression (AR)

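Personal links

Zhihu: https://www.zhihu.com/people/byzh_rc

CSDN: https://blog.csdn.net/qq_54636039

Note: this article only gives a framework-level overview of the topic; for details, consult other related material or the source code.

Reference articles: various sources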

Contents

[Deep Learning Networks: From Beginner to the Grave] Autoregression (AR)
Personal links
References
Background
Architecture (formulas)
    - 1) Definition of AR(p)
    - 2) Vector form (closer to "linear regression")
    - 3) Stationarity (a very important prerequisite)
    - 4) How to choose the order p (common rules of thumb)
Innovations
Code implementation - numpy
Code implementation - statsmodels

References

Box & Jenkins: the classic framework for time series analysis

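Background

Many sequential data sets follow a simple pattern: the current value is strongly correlated with the values at the previous few time steps.

Examples: temperature, sensor signals, economic indicators, and certain rhythmic segments of an ECG.

The AR model expresses this "inertia/correlation" in the simplest linear way:

It predicts the current value with a linear combination of the past $p$ values.
It serves as the classic statistical baseline that predates deep learning.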

Architecture (formulas)

1) Definition of AR(p)

Let the series be $\{x_t\}$. AR(p) is defined as:
$$x_t = c + \sum_{i=1}^{p}\phi_i x_{t-i} + \varepsilon_t$$

$c$: constant term (related to the mean of the series)
$\phi_i$: autoregressive coefficients
$p$: order (the number of lag terms used)
$\varepsilon_t$: white noise (usually assumed $\varepsilon_t \sim \mathcal{N}(0,\sigma^2)$)

Intuition: current value = constant + weighted sum of the past $p$ steps + noise
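
The link between $c$ and the mean can be made precise with a short derivation. It assumes the process is stationary, so that $\mathbb{E}[x_t] = \mu$ for every $t$ (the symbol $\mu$ is introduced here for illustration and does not appear in the original text). Taking expectations of both sides of the AR(p) equation:

```latex
\begin{aligned}
\mathbb{E}[x_t] &= c + \sum_{i=1}^{p}\phi_i\,\mathbb{E}[x_{t-i}] + \mathbb{E}[\varepsilon_t] \\
\mu &= c + \mu\sum_{i=1}^{p}\phi_i \\
\mu &= \frac{c}{1-\sum_{i=1}^{p}\phi_i}
\end{aligned}
```

So $c$ is generally not the mean itself. For example, with $c = 0.2$ and $\phi = (0.75, -0.35)$ the mean is $0.2 / (1 - 0.4) \approx 0.333$.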

2) Vector form (closer to "linear regression")

Write the feature vector (the previous $p$ values) as:
$$\mathbf{x}_{t-1:t-p} = [x_{t-1}, x_{t-2}, \ldots, x_{t-p}]^\top$$

Parameter vector:
$$\boldsymbol{\phi} = [\phi_1,\ldots,\phi_p]^\top$$

Then:
$$x_t = c + \boldsymbol{\phi}^\top \mathbf{x}_{t-1:t-p} + \varepsilon_t$$

So AR is essentially a linear regression that uses the lagged values as features.
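
As a minimal sketch of this "lags as features" view, the snippet below builds the design matrix with NumPy's `sliding_window_view` and solves the regression with ordinary least squares. The helper name `lagged_design` and the toy coefficients are assumptions made for illustration; they are not part of the original article.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def lagged_design(x, p):
    """Return (X, y): each row of X is [1, x_{t-1}, ..., x_{t-p}], the target is x_t."""
    x = np.asarray(x, dtype=float)
    windows = sliding_window_view(x, p + 1)  # each row: [x_{t-p}, ..., x_{t-1}, x_t]
    y = windows[:, -1]
    lags = windows[:, -2::-1]                # reversed first p entries: [x_{t-1}, ..., x_{t-p}]
    X = np.column_stack([np.ones(len(y)), lags])
    return X, y


# Toy AR(2) series with illustrative (assumed) parameters.
rng = np.random.default_rng(0)
c_true, phi_true = 0.2, np.array([0.75, -0.35])
x = np.zeros(2000)
for t in range(2, len(x)):
    x[t] = c_true + phi_true @ x[t - 2:t][::-1] + rng.normal()

X, y = lagged_design(x, p=2)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # beta = [c, phi_1, phi_2]
print("estimated c, phi_1, phi_2:", np.round(beta, 3))
```

With enough data the estimates should land close to the simulated parameters, which is all that "fitting an AR model" means in this least-squares view.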

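3) Stationarity (a very important prerequisite)

AR usually requires the series (or the differenced series) to be stationary; otherwise the parameters are unstable and the forecasts drift.

AR(1): $x_t = c + \phi x_{t-1} + \varepsilon_t$
Stationarity condition: $|\phi| < 1$
AR(p): more generally, all roots of the characteristic polynomial $1 - \phi_1 z - \cdots - \phi_p z^p$ must lie outside the unit circle (just remember "it must not blow up")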
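
The root condition can be checked numerically. The sketch below is one possible way to do it with `np.roots`; the helper name `is_stationary` is hypothetical and only the coefficient vector $\phi$ is needed (the constant $c$ does not affect stationarity).

```python
import numpy as np


def is_stationary(phi):
    """True if all roots of 1 - phi_1 z - ... - phi_p z^p lie strictly outside the unit circle."""
    phi = np.asarray(phi, dtype=float)
    # np.roots expects coefficients ordered from the highest power down to the constant term.
    coeffs = np.concatenate([-phi[::-1], [1.0]])
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))


print(is_stationary([0.5]))          # AR(1) with |phi| < 1
print(is_stationary([1.2]))          # AR(1) with |phi| > 1
print(is_stationary([0.75, -0.35]))  # AR(2) with complex roots
print(is_stationary([0.5, 0.6]))     # AR(2) whose coefficients sum to more than 1
```

For AR(1) the single root is $z = 1/\phi$, so the check reduces to the familiar $|\phi| < 1$.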

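4) How to choose the order $p$ (common rules of thumb)

Look at the PACF: for an AR(p) process, the PACF tends to "cut off" after lag $p$
Or use an information criterion: AIC/BIC (smaller is better)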
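
To illustrate the PACF cut-off, here is a small NumPy-only sketch. It relies on the fact that the lag-$k$ partial autocorrelation can be estimated as the last coefficient of an OLS AR($k$) fit. The helper name `pacf_ols`, the toy series, and the approximate 95% band $\pm 1.96/\sqrt{n}$ are choices made for this example, not part of the original article.

```python
import numpy as np


def pacf_ols(x, max_lag):
    """Estimate the lag-k partial autocorrelation as the last coefficient of an OLS AR(k) fit."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # centering takes the place of an intercept
    n = len(x)
    values = []
    for k in range(1, max_lag + 1):
        # Columns are x_{t-1}, ..., x_{t-k}; the target is x_t.
        X = np.column_stack([x[k - i - 1:n - i - 1] for i in range(k)])
        beta, *_ = np.linalg.lstsq(X, x[k:], rcond=None)
        values.append(beta[-1])
    return np.array(values)


# Toy AR(2) series (assumed coefficients), so only lags 1 and 2 should stand out.
rng = np.random.default_rng(1)
x = np.zeros(3000)
for t in range(2, len(x)):
    x[t] = 0.75 * x[t - 1] - 0.35 * x[t - 2] + rng.normal()

pacf = pacf_ols(x, max_lag=6)
band = 1.96 / np.sqrt(len(x))  # approximate 95% band around zero
for lag, value in enumerate(pacf, start=1):
    flag = "significant" if abs(value) > band else "within band"
    print(f"lag {lag}: {value:+.3f} ({flag})")
```

In practice the cut-off is rarely perfectly clean, which is why an information criterion is often used alongside the PACF plot.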

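Innovations

A minimal yet powerful sequence-modeling assumption: autocorrelation alone is enough to make forecasts
Interpretable: each $\phi_i$ represents "the contribution of the $i$-th lag to the present"
Fast to train, low data requirements: compared with deep models, AR is well suited to small-data settings
The cornerstone of a larger family: AR → ARMA/ARIMA (then add seasonal terms, exogenous variables, etc.)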

Code implementation - numpy

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["font.sans-serif"] = ["SimHei"]  # CJK-capable font; only needed when labels contain Chinese text
plt.rcParams["axes.unicode_minus"] = False  # render the minus sign correctly with that font


# =========================
# 1) Fitting AR(p): OLS
# =========================
def _build_lag_matrix(x: np.ndarray, p: int, fit_intercept: bool):
    """
    Build the regression matrix for AR(p):
      y_t = c + sum_{i=1..p} phi_i x_{t-i} + eps_t
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    if n <= p:
        raise ValueError("series length must be > p")

    y = x[p:]  # [x_p, ..., x_{n-1}]
    X_lags = np.column_stack([x[p - i - 1: n - i - 1] for i in range(p)])  # [n-p, p]

    if fit_intercept:
        X = np.column_stack([np.ones((n - p, 1)), X_lags])  # [n-p, 1+p]
    else:
        X = X_lags
    return X, y


def fit_ar_ols(x: np.ndarray, p: int, fit_intercept: bool = True):
    """
    Fit AR(p) by least squares; returns (c, phi, sigma2, bic)
    """
    X, y = _build_lag_matrix(x, p, fit_intercept)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    y_hat = X @ beta
    e = y - y_hat
    n_eff = y.shape[0]

    sigma2 = float(np.mean(e ** 2))
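    k = X.shape[1]  # number of parameters

    # Gaussian BIC up to an additive constant. Note: n_eff = n - p depends on p,
    # so candidate orders are compared on slightly different samples.
    bic = n_eff * np.log(sigma2 + 1e-12) + k * np.log(n_eff)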

    if fit_intercept:
        c = float(beta[0])
        phi = beta[1:]
    else:
        c = 0.0
        phi = beta

    return c, phi.astype(float), sigma2, float(bic)


def select_ar_order_bic(x: np.ndarray, p_max: int = 20, fit_intercept: bool = True):
    """
    Select the order by BIC over p=1..p_max
    """
    best = None
    for p in range(1, p_max + 1):
        c, phi, sigma2, bic = fit_ar_ols(x, p=p, fit_intercept=fit_intercept)
        if best is None or bic < best["bic"]:
            best = {"p": p, "c": c, "phi": phi, "sigma2": sigma2, "bic": bic}
    return best


# =========================
# 2) Forecasting: one-step rolling and multi-step recursive
# =========================
def predict_one_step_rolling(train: np.ndarray, test: np.ndarray, c: float, phi: np.ndarray):
    """
    One-step rolling forecast: at each step the history is updated with the true value
    """
    p = len(phi)
    history = list(np.asarray(train, dtype=float))
    preds = []

    for xt in test:
        last = np.array(history[-p:][::-1])  # [x_{t-1},...,x_{t-p}]
        yhat = c + float(phi @ last)
        preds.append(yhat)
        history.append(float(xt))  # update with the true value

    return np.array(preds, dtype=float)


def forecast_multi_step(history: np.ndarray, c: float, phi: np.ndarray, steps: int):
    """
    Multi-step recursive forecast: later steps update the history with predicted values (errors accumulate)
    """
    p = len(phi)
    buf = list(np.asarray(history, dtype=float))
    preds = []

    for _ in range(steps):
        last = np.array(buf[-p:][::-1])
        yhat = c + float(phi @ last)
        preds.append(yhat)
        buf.append(yhat)

    return np.array(preds, dtype=float)


# =========================
# 3) Simulate a stationary AR(2) series (with burn-in)
# =========================
def simulate_ar2(n: int = 400, phi=(0.7, -0.25), c=0.0, noise_std=1.0, burn_in=200, seed=0):
    """
    x_t = c + phi1*x_{t-1} + phi2*x_{t-2} + eps_t
    """
    rng = np.random.default_rng(seed)
    phi1, phi2 = phi
    eps = rng.normal(0.0, noise_std, size=n + burn_in)

    x = np.zeros(n + burn_in, dtype=float)
    x[0] = rng.normal(0.0, noise_std)
    x[1] = rng.normal(0.0, noise_std)

    for t in range(2, n + burn_in):
        x[t] = c + phi1 * x[t - 1] + phi2 * x[t - 2] + eps[t]

    return x[burn_in:]


# =========================
# 4) Demo: fitting + subplot visualization
# =========================
def demo():
    x = simulate_ar2(
        n=500,
        phi=(0.75, -0.35),
        c=0.2,
        noise_std=0.8,
        burn_in=300,
        seed=42
    )

    n_train = 400
    train = x[:n_train]
    test = x[n_train:]

    best = select_ar_order_bic(train, p_max=20, fit_intercept=True)
    p = best["p"]
    c_hat = best["c"]
    phi_hat = best["phi"]

    print(f"[BIC order selection] best p={p}, c={c_hat:.4f}, phi={np.round(phi_hat, 4)}  BIC={best['bic']:.2f}")

    pred_roll = predict_one_step_rolling(train, test, c_hat, phi_hat)
    pred_multi = forecast_multi_step(train, c_hat, phi_hat, steps=len(test))

    mse_roll = float(np.mean((pred_roll - test) ** 2))
    mse_multi = float(np.mean((pred_multi - test) ** 2))
    print(f"[Test] one-step rolling MSE={mse_roll:.4f} | multi-step recursive MSE={mse_multi:.4f}")

    t_train = np.arange(len(train))
    t_test = np.arange(len(train), len(train) + len(test))

    # =========================
    # Subplots: top = one-step, bottom = multi-step
    # =========================
    fig, axes = plt.subplots(2, 1, figsize=(10, 8), sharex=True)

    # ---- top: one-step rolling ----
    ax = axes[0]
    ax.plot(t_train, train, label="train")
    ax.plot(t_test, test, label="test (true)")
    ax.plot(t_test, pred_roll, label="AR one-step rolling forecast")
    ax.axvline(len(train) - 1)
    ax.set_title(f"AR baseline (selected p={p}) - one-step rolling forecast | MSE={mse_roll:.3f}")
    ax.set_ylabel("$x_t$")
    ax.legend()

    # ---- bottom: multi-step recursive ----
    ax = axes[1]
    ax.plot(t_train, train, label="train")
    ax.plot(t_test, test, label="test (true)")
    ax.plot(t_test, pred_multi, label="AR multi-step recursive forecast")
    ax.axvline(len(train) - 1)
    ax.set_title(f"AR baseline (selected p={p}) - multi-step recursive forecast | MSE={mse_multi:.3f}")
    ax.set_xlabel("time t")
    ax.set_ylabel("$x_t$")
    ax.legend()

    plt.tight_layout()
    plt.show()


if __name__ == "__main__":
    demo()

Code implementation - statsmodels

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["font.sans-serif"] = ["SimHei"]
plt.rcParams["axes.unicode_minus"] = False

from statsmodels.tsa.ar_model import AutoReg


# =========================
# 1) Simulate a stationary AR(2) series (with burn-in)
# =========================
def simulate_ar2(n: int = 400, phi=(0.7, -0.25), c=0.0, noise_std=1.0, burn_in=200, seed=0):
    """
    x_t = c + phi1*x_{t-1} + phi2*x_{t-2} + eps_t
    """
    rng = np.random.default_rng(seed)
    phi1, phi2 = phi
    eps = rng.normal(0.0, noise_std, size=n + burn_in)

    x = np.zeros(n + burn_in, dtype=float)
    x[0] = rng.normal(0.0, noise_std)
    x[1] = rng.normal(0.0, noise_std)

    for t in range(2, n + burn_in):
        x[t] = c + phi1 * x[t - 1] + phi2 * x[t - 2] + eps[t]

    return x[burn_in:]


# =========================
# 2) Select the AR order by BIC + fit
# =========================
def fit_ar_by_bic(train: np.ndarray, p_max: int = 20, trend: str = "c"):
    """
    trend:
      "c"  : with intercept
      "n"  : without intercept
    Returns:
      best_res: statsmodels fit result (AutoRegResults)
      best_p  : best order
      bic_list: BIC for each order (handy for plotting/printing)
    """
    train = np.asarray(train, dtype=float)

    best_res = None
    best_p = None
    bic_list = []

    for p in range(1, p_max + 1):
        res = AutoReg(train, lags=p, trend=trend, old_names=False).fit()
        bic_list.append(res.bic)
        if best_res is None or res.bic < best_res.bic:
            best_res = res
            best_p = p

    return best_res, best_p, np.array(bic_list, dtype=float)


# =========================
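# 3) Forecasting: one-step rolling vs multi-step recursive
# =========================
def one_step_rolling_forecast(res, train: np.ndarray, test: np.ndarray):
    """
    One-step rolling forecast:
      at every step predict x_t (the history is updated with the true value)
    statsmodels approach: refit on the history at every step (stricter, but slower)
    Fine for a small demo; for large real-world data, use a more efficient method or another API.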
    """
    train = np.asarray(train, float)
    test = np.asarray(test, float)

    p = res.model._maxlag
    trend = res.model.trend

    history = list(train)
    preds = []

    for xt in test:
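        # Refit on the current history (so the next step is predicted from the true history)
        tmp_res = AutoReg(np.array(history), lags=p, trend=trend, old_names=False).fit()
        # Predict the next step (the time point right after the last observation)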
        yhat = float(tmp_res.predict(start=len(history), end=len(history))[0])
        preds.append(yhat)
        history.append(float(xt))

    return np.array(preds, dtype=float)


def multi_step_recursive_forecast(res, train: np.ndarray, steps: int):
    """
    Multi-step recursive forecast:
      fit once, then iterate `steps` steps forward from the end of the training set (errors accumulate)
    """
    start = len(train)
    end = len(train) + steps - 1
    pred = res.predict(start=start, end=end)
    return np.asarray(pred, dtype=float)


# =========================
# 4) Demo: fitting + subplot visualization
# =========================
def demo():
    # Generate an AR(2) series that looks more like a "real" one (slight oscillation + noise)
    x = simulate_ar2(
        n=500,
        phi=(0.75, -0.35),
        c=0.2,
        noise_std=0.8,
        burn_in=300,
        seed=42
    )

    # train/test
    n_train = 400
    train = x[:n_train]
    test = x[n_train:]

    # Select the order by BIC + fit
    res, p, bic_list = fit_ar_by_bic(train, p_max=20, trend="c")
    print(f"[BIC order selection] best p={p}, BIC={res.bic:.2f}")
    print("[params] ", res.params)  # intercept followed by the lag coefficients

    # One-step rolling forecast (strict version: refit at every step)
    pred_roll = one_step_rolling_forecast(res, train, test)

    # Multi-step recursive forecast (fit once, predict several steps directly)
    pred_multi = multi_step_recursive_forecast(res, train, steps=len(test))

    mse_roll = float(np.mean((pred_roll - test) ** 2))
    mse_multi = float(np.mean((pred_multi - test) ** 2))
    print(f"[Test] one-step rolling MSE={mse_roll:.4f} | multi-step recursive MSE={mse_multi:.4f}")

    t_train = np.arange(len(train))
    t_test = np.arange(len(train), len(train) + len(test))

    # Subplots: top = one-step, bottom = multi-step
    fig, axes = plt.subplots(2, 1, figsize=(10, 8), sharex=True)

    ax = axes[0]
    ax.plot(t_train, train, label="train")
    ax.plot(t_test, test, label="test (true)")
    ax.plot(t_test, pred_roll, label="AR one-step rolling forecast (statsmodels)")
    ax.axvline(len(train) - 1)
    ax.set_title(f"AR baseline (selected p={p}) - one-step rolling forecast | MSE={mse_roll:.3f}")
    ax.set_ylabel("$x_t$")
    ax.legend()

    ax = axes[1]
    ax.plot(t_train, train, label="train")
    ax.plot(t_test, test, label="test (true)")
    ax.plot(t_test, pred_multi, label="AR multi-step recursive forecast (statsmodels)")
    ax.axvline(len(train) - 1)
    ax.set_title(f"AR baseline (selected p={p}) - multi-step recursive forecast | MSE={mse_multi:.3f}")
    ax.set_xlabel("time t")
    ax.set_ylabel("$x_t$")
    ax.legend()

    plt.tight_layout()
    plt.show()


if __name__ == "__main__":
    demo()