Step 100+9 ChatGPT Paper Reproduction: Forecasting Pertussis with ARIMA

Demonstrated on 64-bit Windows 10

I. Preface

Let's move on to another paper to learn from:

This time the case study uses simulated data based on a 2022 article in BMC Public Health titled "ARIMA and ARIMA-ERNN models for prediction of pertussis incidence in mainland China from 2004 to 2021".

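The paper uses a plain ARIMA model and a combined ARIMA-ERNN model to predict pertussis incidence in mainland China.

The plain ARIMA model serves as the baseline, and the ARIMA-ERNN model is the improved version built on top of it. In this installment, we start with the ARIMA model.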

Note that this is not the original data: I used GPT-4 to extract approximate values from the paper's scatter plot, and only managed to get 2004-2017.

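II. Chat and Reproduction:

(1) Basic data description

Not much to say here; it is just the most basic descriptive statistics.

If you have the original data, you can simply ask GPT-4 to run the analysis.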
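For readers who would rather compute the descriptive statistics themselves, here is a minimal sketch with pandas. The values are made-up toy numbers, not the paper's data or my extracted data; the column names `time` and `incidence` follow the CSV used later in this post.

```python
import pandas as pd

# Hypothetical toy series (NOT real pertussis data): 24 monthly values
idx = pd.date_range("2004-01-01", periods=24, freq="MS")
values = [0.10 + 0.01 * (i % 12) for i in range(24)]
toy = pd.DataFrame({"time": idx, "incidence": values})

# Overall summary: count, mean, std, min, quartiles, max
overall = toy["incidence"].describe()

# Per-year summary, the usual way incidence is tabulated
by_year = toy.groupby(toy["time"].dt.year)["incidence"].agg(["mean", "min", "max"])

print(overall)
print(by_year)
```

With real data, replace the toy DataFrame with the result of `pd.read_csv(...)` and parse the `time` column first.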

(2) Seasonal decomposition

This was covered in an earlier post; it can also be done in SPSS. The paper uses R:

① First, generate it with GPT-4:

Prompt:

The output, directly:

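Note that my data is simulated and only runs through December 2017, so at first glance the result deviates slightly from the paper's.

The code GPT produced (Python), for reference: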


import pandas as pd
from statsmodels.tsa.seasonal import STL
import matplotlib.pyplot as plt

# Load the data from the CSV file
data = pd.read_csv('/mnt/data/数据.csv')

# Convert the 'time' column to datetime to facilitate resampling by year
data['time'] = pd.to_datetime(data['time'], format='%b-%y')

# Set the time column as index and set frequency to month start
data.set_index('time', inplace=True)
data.index.freq = 'MS'

# STL needs the seasonal period (12 for monthly data); `seasonal` is the
# length of the seasonal LOESS smoother (an odd integer), not the period
stl = STL(data['incidence'], period=12, seasonal=13)
result = stl.fit()

# Plotting the results of the STL decomposition
plt.figure(figsize=(12, 8))

# Original data plot
plt.subplot(4, 1, 1)
plt.plot(result.observed)
plt.title('Original Data')

# Seasonal component plot
plt.subplot(4, 1, 2)
plt.plot(result.seasonal)
plt.title('Seasonal Component')

# Trend component plot
plt.subplot(4, 1, 3)
plt.plot(result.trend)
plt.title('Trend Component')

# Residual plot
plt.subplot(4, 1, 4)
plt.plot(result.resid)
plt.title('Residual')

# Adjust layout and save the figure to a file
plt.tight_layout()
plt.savefig('/mnt/data/STL_Decomposition_Corrected.png')

plt.show()