Saving GPU memory with an accumulate step (gradient accumulation)

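Prerequisite:

A single GPU on which the model plus data at batch=1 fits and runs.

Using an accumulate step means that each forward pass runs on a smaller batch, e.g. batch=4, and the parameters are updated only once every 4 steps. The training result is then equivalent to training with batch=16.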
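
The equivalence described above can be checked numerically. The following is a minimal sketch in plain Python, not the code of `NLinear_exp_full.py`: it assumes a toy 1-D linear model, a mean-squared-error loss, and plain SGD, and the names `mse_grads` and `train` are hypothetical. The parameters `batch` and `accu_step` only mirror the command-line flags used in the runs below.

```python
# Gradient-accumulation sketch for a toy model y = w*x + b (illustrative only).

def mse_grads(w, b, xs, ys):
    """Return (dL/dw, dL/db) of the mean squared error over one mini-batch."""
    n = len(xs)
    gw = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    gb = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return gw, gb


def train(xs, ys, batch, accu_step, lr=0.01, epochs=3):
    """Plain SGD that updates the parameters once every `accu_step` mini-batches."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        acc_w, acc_b, seen = 0.0, 0.0, 0
        for start in range(0, len(xs), batch):
            gw, gb = mse_grads(w, b, xs[start:start + batch], ys[start:start + batch])
            # Scale by 1/accu_step so the accumulated sum equals the mean
            # gradient over the effective batch of batch * accu_step samples.
            acc_w += gw / accu_step
            acc_b += gb / accu_step
            seen += 1
            if seen == accu_step:
                # One parameter update per accu_step mini-batches.
                w -= lr * acc_w
                b -= lr * acc_b
                acc_w, acc_b, seen = 0.0, 0.0, 0
    return w, b


# Toy data: 32 samples of y = 3x + 1 (32 is a multiple of the effective batch 16).
xs = [i / 10.0 for i in range(32)]
ys = [3.0 * x + 1.0 for x in xs]

full = train(xs, ys, batch=16, accu_step=1)
accumulated = train(xs, ys, batch=4, accu_step=4)
print(full)
print(accumulated)
```

Both calls should print the same parameters up to floating-point rounding, because the four scaled mini-batch gradients sum to the mean gradient over the same 16 samples. This holds for a loss averaged over the batch and a model without batch-dependent layers such as batch normalization; only the small batch has to be held in memory at any one time.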

First, run the original model once:

python NLinear_exp_full.py --accu_step 1 --batch 16 
epoch: 0
time comsuming: 1.8598144054412842
training epoch:0:0.0%
time comsuming: 2.137087106704712
training epoch:0:80.64516129032258%
time comsuming: 2.2242424488067627
time comsuming: 2.294013500213623
test epoch:0:0.0%
episode 0 mae 23.900234 rmse 66.41403 smape 0.934281
epoch: 1
time comsuming: 3.2021634578704834
training epoch:1:0.0%
time comsuming: 3.477159261703491
training epoch:1:80.64516129032258%
time comsuming: 3.560976505279541
time comsuming: 3.624363422393799
test epoch:1:0.0%
episode 1 mae 22.137833 rmse 64.748055 smape 0.79881644
epoch: 2
time comsuming: 3.982663869857788
training epoch:2:0.0%
time comsuming: 4.26115345954895
training epoch:2:80.64516129032258%
time comsuming: 4.350359678268433
time comsuming: 4.427008628845215
test epoch:2:0.0%
episode 2 mae 21.542023 rmse 64.10915 smape 0.68798375
epoch: 3
time comsuming: 4.786099910736084
training epoch:3:0.0%
time comsuming: 5.036171913146973
training epoch:3:80.64516129032258%
time comsuming: 5.121201038360596
time comsuming: 5.197283744812012
test epoch:3:0.0%
episode 3 mae 21.322206 rmse 64.079384 smape 0.6753313
epoch: 4
time comsuming: 5.5672008991241455
training epoch:4:0.0%
time comsuming: 5.830775260925293
training epoch:4:80.64516129032258%
time comsuming: 5.919378757476807
time comsuming: 5.9778666496276855

Then run it again with batch set to 4 and the accumulate step set to 4:

python NLinear_exp_full.py --accu_step 4 --batch 4 
time comsuming: 1.9860742092132568
training epoch:0:0.0%
time comsuming: 2.221600294113159
training epoch:0:20.161290322580644%
time comsuming: 2.453077554702759
training epoch:0:40.32258064516129%
time comsuming: 2.675966262817383
training epoch:0:60.483870967741936%
time comsuming: 2.832383394241333
training epoch:0:80.64516129032258%
time comsuming: 3.0732641220092773
time comsuming: 3.1844491958618164
test epoch:0:0.0%
time comsuming: 3.4134249687194824
test epoch:0:72.99270072992701%
episode 0 mae 23.900234 rmse 66.41403 smape 0.934281
epoch: 1
time comsuming: 4.225269079208374
training epoch:1:0.0%
time comsuming: 4.442946434020996
training epoch:1:20.161290322580644%
time comsuming: 4.611685752868652
training epoch:1:40.32258064516129%
time comsuming: 4.845811367034912
training epoch:1:60.483870967741936%
time comsuming: 5.074229001998901
training epoch:1:80.64516129032258%
time comsuming: 5.326176166534424
time comsuming: 5.397624492645264
test epoch:1:0.0%
time comsuming: 5.633365869522095
test epoch:1:72.99270072992701%
episode 1 mae 22.137833 rmse 64.748055 smape 0.79881644
epoch: 2
time comsuming: 5.991377592086792
training epoch:2:0.0%
time comsuming: 6.217101097106934
training epoch:2:20.161290322580644%
time comsuming: 6.363693714141846
training epoch:2:40.32258064516129%
time comsuming: 6.590087175369263
training epoch:2:60.483870967741936%
time comsuming: 6.823684215545654
training epoch:2:80.64516129032258%
time comsuming: 7.081570625305176
time comsuming: 7.148298978805542
test epoch:2:0.0%
time comsuming: 7.377046823501587
test epoch:2:72.99270072992701%
episode 2 mae 21.542023 rmse 64.10915 smape 0.68798375
epoch: 3
time comsuming: 7.766062021255493
training epoch:3:0.0%
time comsuming: 7.996231317520142
training epoch:3:20.161290322580644%
time comsuming: 8.161593675613403
training epoch:3:40.32258064516129%
time comsuming: 8.388957738876343
training epoch:3:60.483870967741936%
time comsuming: 8.618509769439697
training epoch:3:80.64516129032258%
time comsuming: 8.876739978790283
time comsuming: 8.95041275024414
test epoch:3:0.0%
time comsuming: 9.18027663230896

GPU memory usage for the two runs: 514 MB vs. 494 MB
