【Hugging Face】解决BART模型调用时KeyError: ‘new_zeros‘的问题

错误代码:

python 复制代码
tokenizer = AutoTokenizer.from_pretrained("philschmid/bart-large-cnn-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("philschmid/bart-large-cnn-samsum")

model.eval()
model.to("cuda")
loss = 0
for i in range(len(self.dataset)):
    batch = tokenizer([self.dataset[i]["source"]], return_tensors="pt", padding=True).to("cuda")
    labels = tokenizer([self.dataset[i]["target"]], return_tensors="pt", padding=True).to("cuda")
    print(batch)
    outputs = model(**batch, labels=labels)
    print(outputs.loss.item())

报错内容:

python 复制代码
Traceback (most recent call last):
  File "D:\anaconda\envs\supTextDebug\lib\site-packages\transformers\tokenization_utils_base.py", line 266, in __getattr__
    return self.data[item]
KeyError: 'new_zeros'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\supTextDebug\supTextDebugCode\textDebugger.py", line 360, in <module>
    debugger.run_baselines()
  File "E:\supTextDebug\supTextDebugCode\textDebugger.py", line 299, in run_baselines
    loss.get_loss()
  File "E:\supTextDebug\supTextDebugCode\lossbased.py", line 26, in get_loss
    outputs = model(**batch, labels=labels)
  File "D:\anaconda\envs\supTextDebug\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\anaconda\envs\supTextDebug\lib\site-packages\transformers\models\bart\modeling_bart.py", line 1724, in forward
    decoder_input_ids = shift_tokens_right(
  File "D:\anaconda\envs\supTextDebug\lib\site-packages\transformers\models\bart\modeling_bart.py", line 104, in shift_tokens_right
    shifted_input_ids = input_ids.new_zeros(input_ids.shape)
  File "D:\anaconda\envs\supTextDebug\lib\site-packages\transformers\tokenization_utils_base.py", line 268, in __getattr__
    raise AttributeError
AttributeError

解决方案:

错误行:outputs = model(**batch, labels=labels)

直接使用模型的forward方法,而不是将所有参数传递给 model:

python 复制代码
tokenizer = AutoTokenizer.from_pretrained("philschmid/bart-large-cnn-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("philschmid/bart-large-cnn-samsum")

model.eval()
model.to("cuda")
loss = 0
for i in range(len(self.dataset)):
    batch = tokenizer([self.dataset[i]["source"]], return_tensors="pt", padding=True).to("cuda")
    labels = tokenizer([self.dataset[i]["target"]], return_tensors="pt", padding=True).to("cuda")
    print(batch)
    outputs = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"], labels=labels["input_ids"])
    print(outputs.loss.item())
相关推荐
IT_陈寒2 小时前
Python搞不定字符串编码?这破玩意坑我两小时!
前端·人工智能·后端
大模型真好玩3 小时前
什么是Loop Engineering?最通俗易懂的Loop Engineering核心概念
人工智能·agent·deepseek
叁两3 小时前
前端转型AI Agent该如何学习?(前置篇)
前端·人工智能·node.js
LaiYoung_4 小时前
🎁 送你一套超好用超实用的 FE AI-Coding Skills
前端·人工智能·开源
ZzT6 小时前
怎么做才不会被 AI 替代?
人工智能·程序员
道友可好6 小时前
从今天开始:你的第一个 Harness Engineering 实践
前端·人工智能·后端
小姜前线技术7 小时前
AI回答代码块高亮加一键复制
人工智能
洛阳泰山7 小时前
从 0 到 1.6K Star:一个 Java 开源项目的增长复盘
人工智能·后端·开源
米小虾8 小时前
Agent Skill 设计模式完全指南
人工智能·agent
饼干哥哥9 小时前
保姆级教程:用Image2 + Seedance2.0 做长视频,以品牌广告为例
人工智能