【已更新代码图表】2023数学建模国赛E题python代码--黄河水沙监测数据分析

E 题黄河水沙监测数据分析

黄河是中华民族的母亲河。研究黄河水沙通量的变化规律对沿黄流域的环境治理、气候变

化和人民生活的影响，以及对优化黄河流域水资源分配、协调人地关系、调水调沙、防洪减灾

等方面都具有重要的理论指导意义。

附件 1 给出了位于小浪底水库下游黄河某水文站近 6 年的水位、水流量与含沙量的实际监

测数据，附件 2 给出了该水文站近 6 年黄河断面的测量数据，附件 3 给出了该水文站部分监测

点的相关数据。请建立数学模型研究以下问题：

问题 1 研究该水文站黄河水的含沙量与时间、水位、水流量的关系，并估算近 6 年该水

文站的年总水流量和年总排沙量。

bash 复制代码

#完整代码：https://mbd.pub/o/bread/mbd-ZJ2cl59p

#思路

有6年每天多个时刻下的水位、流量，然后含沙量是只有两千条数据，

其他是空着的，他就问研究该水文站黄河水的含沙量与时间、水位、水流量的关系

先预测剩下的含沙量

最后在计算一下近 6 年该水文站的年总水流量和年总排沙量。

考虑到数据的时序性通过线性回归的代码进行拟合

通过两千条数据拿来训练，然后预测剩下的一万四千条数据

bash 复制代码

#千千数模 q群：790539996
#代码购买链接：https://www.jdmm.cc/file/2709544/
#倒卖欢迎举报 举报有奖


# In[2]:


table = pd.read_excel(r"./data/附件1.xlsx")
for i in range(2017, 2017+5):
#     移除table最后一条数据（重复了）
#     print(table.iloc[len(table)-1])
    table.drop((len(table)-1),inplace=True)
    i = str(i)
    temp = pd.read_excel(r"./data/附件1.xlsx",sheet_name = i)
    table = pd.concat([table, temp])
    table = table.reset_index(drop=True)
table


# In[3]:


# 补齐时间
table['年'].fillna(method='ffill', inplace=True)
table['月'].fillna(method='ffill', inplace=True)
table['日'].fillna(method='ffill', inplace=True)
table


# In[13]:


# 数据预处理
time_list = []
for i in range(len(table)):
    m, d, h = str(int(table.iloc[i,1])), str(int(table.iloc[i,2])),str(table.iloc[i,3])
    if(int(table.iloc[i,1])<10):
        m = "0" + str(int(table.iloc[i,1]))
    if(int(table.iloc[i,2])<10):
        d = "0" + str(int(table.iloc[i,2])) 
#     print(m,d)
    time = str(int(table.iloc[i,0]))+"-"+ m+"-"+ d +" "+ h
#     print(time)
    time_list.append(time)

temp = pd.DataFrame(time_list, columns=["时刻"])
temp["时刻"]= pd.to_datetime(temp["时刻"])
# temp.to_csv('example3.csv', index=False)
# temp
table1 = pd.concat([table, temp],axis=1)
# table

df =table1.iloc[:, [7,4,5,6]]
df.to_csv('example2.csv', index=False)

# 将索引转换为日期时间
# df.set_index("时刻", inplace=True)
df


# In[5]:


df["时刻"]= pd.to_datetime(df["时刻"])
# 将时间序列转换为数值型特征
df['时刻'] = df['时刻'].apply(lambda x: x.timestamp())
df


# In[6]:


# 提取时间、水位、水流量和含沙量的数据
data = df[pd.notna(df["含沙量(kg/m3) "])]
X = data[['时刻', '水位(m)', '流量(m3/s)']]
y = data['含沙量(kg/m3) ']
y


# In[7]:


# 建立线性回归模型
#LSTM 
model = LinearRegression()
model.fit(X, y)


# In[8]:


new_df = df[pd.isna(df.loc[:,"含沙量(kg/m3) "])]
new_X = new_df.loc[:,['时刻', '水位(m)', '流量(m3/s)']]
new_df.loc[:,"含沙量(kg/m3) "] = model.predict(new_X)
new_df


# In[9]:


# 使用 fillna 方法填充空白部分
table['含沙量(kg/m3) '].fillna(new_df['含沙量(kg/m3) '], inplace = True)
# table.to_csv('example.csv', index=False)
table


# In[10]: