qwen2vl - qwen2vl technology, learning, and experience articles

西西弗Sisyphus

2 years ago

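The role of gradient accumulation steps (gradient_accumulation_steps) in model training

flyfish

When training large models, TrainingArguments has a parameter for the number of gradient accumulation steps (gradient_accumulation_steps).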
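As a rough illustration of what this parameter does, the sketch below uses plain Python rather than the Trainer itself. It assumes a toy one-weight model `y = w * x` with a mean squared error loss; the helper names `grad_mse` and `accumulated_grad` and the sample data are made up for the example. The idea it shows is that summing the gradients of several small micro-batches, each scaled by `1 / gradient_accumulation_steps`, gives the same gradient as one large batch, so the effective batch size is the micro-batch size multiplied by the number of accumulation steps.

```python
# Pure-Python sketch of gradient accumulation (not the Trainer's actual code).
# Assumed toy setup: a single weight w, model y = w * x, mean squared error loss.

def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size, gradient_accumulation_steps):
    """Accumulate micro-batch gradients, each scaled by 1 / accumulation steps."""
    total = 0.0
    for step in range(gradient_accumulation_steps):
        start = step * micro_batch_size
        micro_x = xs[start:start + micro_batch_size]
        micro_y = ys[start:start + micro_batch_size]
        # Scaling each micro-batch gradient keeps the sum equal to the
        # gradient of the mean loss over the whole effective batch.
        total += grad_mse(w, micro_x, micro_y) / gradient_accumulation_steps
    return total


# Hypothetical data: 8 samples of y = 2 * x, with the weight starting at 0.5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
w = 0.5

micro_batch_size = 2
gradient_accumulation_steps = 4
effective_batch_size = micro_batch_size * gradient_accumulation_steps

full_batch_grad = grad_mse(w, xs, ys)
accum_grad = accumulated_grad(
    w, xs, ys, micro_batch_size, gradient_accumulation_steps
)

print("effective batch size:", effective_batch_size)
print("full-batch gradient: ", full_batch_grad)
print("accumulated gradient:", accum_grad)
```

In a real training run the optimizer would take one update step only after every `gradient_accumulation_steps` micro-batches, which is why the parameter lets a memory-limited device emulate a larger batch.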