PyTorch: pretrained weights fail to load

Problem

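I swapped an open-source backbone into my model and removed some of its layers, after which the pretrained weights would not load. With strict=True, loading raised an error; with strict=False, nothing was loaded at all, and it was not obvious what had gone wrong.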

Solution

With strict=False, load_state_dict does not raise on key mismatches. Instead it returns a named tuple with the following two fields:

missing_keys: keys that exist in the current model but not in the pretrained weights.
unexpected_keys: keys that exist in the pretrained weights but not in the current model.
        # model_weight is the state dict read from the pretrained checkpoint
        result = self.backbone.load_state_dict(model_weight, strict=False)
        print("Missing keys:", result.missing_keys)
        print("Unexpected keys:", result.unexpected_keys)

This prints:

Missing keys: ['model.patch_embed.conv1.weight', 'model.patch_embed.conv1.bias', 'model.patch_embed.norm1.1.weight', 'model.patch_embed.norm1.1.bias', 'model.patch_embed.conv2.weight', 'model.patch_embed.conv2.bias', 'model.patch_embed.norm2.1.weight', 'model.patch_embed.norm2.1.bias', 'model.levels.0.blocks.0.norm1.0.weight', 'model.levels.0.blocks.0.norm1.0.bias', 'model.levels.0.blocks.0.dcn.offset_mask.weight', 'model.levels.0.blocks.0.dcn.offset_mask.bias', 'model.levels.0.blocks.0.dcn.value_proj.weight', 'model.levels.0.blocks.0.dcn.value_proj.bias', 'model.levels.0.blocks.0.dcn.output_proj.weight', 'model.levels.0.blocks.0.norm2.0.weight', 'model.levels.0.blocks.0.norm2.0.bias', 'model.levels.0.blocks.0.mlp.fc1.weight', 'model.levels.0.blocks.0.mlp.fc1.bias', 'model.levels.0.blocks.0.mlp.fc2.weight', 'model.levels.0.blocks.1.norm1.0.weight', 'model.levels.0.blocks.1.norm1.0.bias', 'model.levels.0.blocks.1.dcn.offset_mask.weight', 'model.levels.0.blocks.1.dcn.offset_mask.bias', 'model.levels.0.blocks.1.dcn.value_proj.weight', 'model.levels.0.blocks.1.dcn.value_proj.bias', 'model.levels.0.blocks.1.dcn.output_proj.weight', 'model.levels.0.blocks.1.norm2.0.weight', 'model.levels.0.blocks.1.norm2.0.bias', 'model.levels.0.blocks.1.mlp.fc1.weight', 'model.levels.0.blocks.1.mlp.fc1.bias', 'model.levels.0.blocks.1.mlp.fc2.weight', 'model.levels.0.blocks.2.norm1.0.weight', 'model.levels.0.blocks.2.norm1.0.bias', 'model.levels.0.blocks.2.dcn.offset_mask.weight', 'model.levels.0.blocks.2.dcn.offset_mask.bias', 'model.levels.0.blocks.2.dcn.value_proj.weight', 'model.levels.0.blocks.2.dcn.value_proj.bias', 'model.levels.0.blocks.2.dcn.output_proj.weight', 'model.levels.0.blocks.2.norm2.0.weight', 'model.levels.0.blocks.2.norm2.0.bias', 'model.levels.0.blocks.2.mlp.fc1.weight', 'model.levels.0.blocks.2.mlp.fc1.bias', 'model.levels.0.blocks.2.mlp.fc2.weight', 'model.levels.0.blocks.3.norm1.0.weight', 'model.levels.0.blocks.3.norm1.0.bias', 
'model.levels.0.blocks.3.dcn.offset_mask.weight', 'model.levels.0.blocks.3.dcn.offset_mask.bias', 'model.levels.0.blocks.3.dcn.value_proj.weight', 'model.levels.0.blocks.3.dcn.value_proj.bias', 'model.levels.0.blocks.3.dcn.output_proj.weight', 'model.levels.0.blocks.3.norm2.0.weight', 'model.levels.0.blocks.3.norm2.0.bias', 'model.levels.0.blocks.3.mlp.fc1.weight', 'model.levels.0.blocks.3.mlp.fc1.bias', 'model.levels.0.blocks.3.mlp.fc2.weight', 'model.levels.0.norm.0.weight', 'model.levels.0.norm.0.bias', 'model.levels.0.downsample.conv.weight', 'model.levels.0.downsample.norm.1.weight', 'model.levels.0.downsample.norm.1.bias', 'model.levels.1.blocks.0.norm1.0.weight', 'model.levels.1.blocks.0.norm1.0.bias', 'model.levels.1.blocks.0.dcn.offset_mask.weight', 'model.levels.1.blocks.0.dcn.offset_mask.bias', 'model.levels.1.blocks.0.dcn.value_proj.weight', 'model.levels.1.blocks.0.dcn.value_proj.bias', 'model.levels.1.blocks.0.dcn.output_proj.weight', 'model.levels.1.blocks.0.norm2.0.weight', 'model.levels.1.blocks.0.norm2.0.bias', 'model.levels.1.blocks.0.mlp.fc1.weight', 'model.levels.1.blocks.0.mlp.fc1.bias', 'model.levels.1.blocks.0.mlp.fc2.weight', 'model.levels.1.blocks.1.norm1.0.weight', 'model.levels.1.blocks.1.norm1.0.bias', 'model.levels.1.blocks.1.dcn.offset_mask.weight', 'model.levels.1.blocks.1.dcn.offset_mask.bias', 'model.levels.1.blocks.1.dcn.value_proj.weight', 'model.levels.1.blocks.1.dcn.value_proj.bias', 'model.levels.1.blocks.1.dcn.output_proj.weight', 'model.levels.1.blocks.1.norm2.0.weight', 'model.levels.1.blocks.1.norm2.0.bias', 'model.levels.1.blocks.1.mlp.fc1.weight', 'model.levels.1.blocks.1.mlp.fc1.bias', 'model.levels.1.blocks.1.mlp.fc2.weight', 'model.levels.1.blocks.2.norm1.0.weight', 'model.levels.1.blocks.2.norm1.0.bias', 'model.levels.1.blocks.2.dcn.offset_mask.weight', 'model.levels.1.blocks.2.dcn.offset_mask.bias', 'model.levels.1.blocks.2.dcn.value_proj.weight', 'model.levels.1.blocks.2.dcn.value_proj.bias', 
'model.levels.1.blocks.2.dcn.output_proj.weight', 'model.levels.1.blocks.2.norm2.0.weight', 'model.levels.1.blocks.2.norm2.0.bias', 'model.levels.1.blocks.2.mlp.fc1.weight', 'model.levels.1.blocks.2.mlp.fc1.bias', 'model.levels.1.blocks.2.mlp.fc2.weight', 'model.levels.1.blocks.3.norm1.0.weight', 'model.levels.1.blocks.3.norm1.0.bias', 'model.levels.1.blocks.3.dcn.offset_mask.weight', 'model.levels.1.blocks.3.dcn.offset_mask.bias', 'model.levels.1.blocks.3.dcn.value_proj.weight', 'model.levels.1.blocks.3.dcn.value_proj.bias', 'model.levels.1.blocks.3.dcn.output_proj.weight', 'model.levels.1.blocks.3.norm2.0.weight', 'model.levels.1.blocks.3.norm2.0.bias', 'model.levels.1.blocks.3.mlp.fc1.weight', 'model.levels.1.blocks.3.mlp.fc1.bias', 'model.levels.1.blocks.3.mlp.fc2.weight', 'model.levels.1.norm.0.weight', 'model.levels.1.norm.0.bias', 'model.levels.1.downsample.conv.weight', 'model.levels.1.downsample.norm.1.weight', 'model.levels.1.downsample.norm.1.bias', 'model.levels.2.blocks.0.norm1.0.weight', 'model.levels.2.blocks.0.norm1.0.bias', 'model.levels.2.blocks.0.dcn.offset_mask.weight', 'model.levels.2.blocks.0.dcn.offset_mask.bias', 'model.levels.2.blocks.0.dcn.value_proj.weight', 'model.levels.2.blocks.0.dcn.value_proj.bias', 'model.levels.2.blocks.0.dcn.output_proj.weight', 'model.levels.2.blocks.0.norm2.0.weight', 'model.levels.2.blocks.0.norm2.0.bias', 'model.levels.2.blocks.0.mlp.fc1.weight', 'model.levels.2.blocks.0.mlp.fc1.bias', 'model.levels.2.blocks.0.mlp.fc2.weight', 'model.levels.2.blocks.1.norm1.0.weight', 'model.levels.2.blocks.1.norm1.0.bias', 'model.levels.2.blocks.1.dcn.offset_mask.weight', 'model.levels.2.blocks.1.dcn.offset_mask.bias', 'model.levels.2.blocks.1.dcn.value_proj.weight', 'model.levels.2.blocks.1.dcn.value_proj.bias', 'model.levels.2.blocks.1.dcn.output_proj.weight', 'model.levels.2.blocks.1.norm2.0.weight', 'model.levels.2.blocks.1.norm2.0.bias', 'model.levels.2.blocks.1.mlp.fc1.weight', 'model.levels.2.blocks.1.mlp.fc1.bias', 
'model.levels.2.blocks.1.mlp.fc2.weight', 'model.levels.2.blocks.2.norm1.0.weight', 'model.levels.2.blocks.2.norm1.0.bias', 'model.levels.2.blocks.2.dcn.offset_mask.weight', 'model.levels.2.blocks.2.dcn.offset_mask.bias', 'model.levels.2.blocks.2.dcn.value_proj.weight', 'model.levels.2.blocks.2.dcn.value_proj.bias', 'model.levels.2.blocks.2.dcn.output_proj.weight', 'model.levels.2.blocks.2.norm2.0.weight', 'model.levels.2.blocks.2.norm2.0.bias', 'model.levels.2.blocks.2.mlp.fc1.weight', 'model.levels.2.blocks.2.mlp.fc1.bias', 'model.levels.2.blocks.2.mlp.fc2.weight', 'model.levels.2.blocks.3.norm1.0.weight', 'model.levels.2.blocks.3.norm1.0.bias', 'model.levels.2.blocks.3.dcn.offset_mask.weight', 'model.levels.2.blocks.3.dcn.offset_mask.bias', 'model.levels.2.blocks.3.dcn.value_proj.weight', 'model.levels.2.blocks.3.dcn.value_proj.bias', 'model.levels.2.blocks.3.dcn.output_proj.weight', 'model.levels.2.blocks.3.norm2.0.weight', 'model.levels.2.blocks.3.norm2.0.bias', 'model.levels.2.blocks.3.mlp.fc1.weight', 'model.levels.2.blocks.3.mlp.fc1.bias', 'model.levels.2.blocks.3.mlp.fc2.weight', 'model.levels.2.blocks.4.norm1.0.weight', 'model.levels.2.blocks.4.norm1.0.bias', 'model.levels.2.blocks.4.dcn.offset_mask.weight', 'model.levels.2.blocks.4.dcn.offset_mask.bias', 'model.levels.2.blocks.4.dcn.value_proj.weight', 'model.levels.2.blocks.4.dcn.value_proj.bias', 'model.levels.2.blocks.4.dcn.output_proj.weight', 'model.levels.2.blocks.4.norm2.0.weight', 'model.levels.2.blocks.4.norm2.0.bias', 'model.levels.2.blocks.4.mlp.fc1.weight', 'model.levels.2.blocks.4.mlp.fc1.bias', 'model.levels.2.blocks.4.mlp.fc2.weight', 'model.levels.2.blocks.5.norm1.0.weight', 'model.levels.2.blocks.5.norm1.0.bias', 'model.levels.2.blocks.5.dcn.offset_mask.weight', 'model.levels.2.blocks.5.dcn.offset_mask.bias', 'model.levels.2.blocks.5.dcn.value_proj.weight', 'model.levels.2.blocks.5.dcn.value_proj.bias', 'model.levels.2.blocks.5.dcn.output_proj.weight', 
'model.levels.2.blocks.5.norm2.0.weight', 'model.levels.2.blocks.5.norm2.0.bias', 'model.levels.2.blocks.5.mlp.fc1.weight', 'model.levels.2.blocks.5.mlp.fc1.bias', 'model.levels.2.blocks.5.mlp.fc2.weight', 'model.levels.2.blocks.6.norm1.0.weight', 'model.levels.2.blocks.6.norm1.0.bias', 'model.levels.2.blocks.6.dcn.offset_mask.weight', 'model.levels.2.blocks.6.dcn.offset_mask.bias', 'model.levels.2.blocks.6.dcn.value_proj.weight', 'model.levels.2.blocks.6.dcn.value_proj.bias', 'model.levels.2.blocks.6.dcn.output_proj.weight', 'model.levels.2.blocks.6.norm2.0.weight', 'model.levels.2.blocks.6.norm2.0.bias', 'model.levels.2.blocks.6.mlp.fc1.weight', 'model.levels.2.blocks.6.mlp.fc1.bias', 'model.levels.2.blocks.6.mlp.fc2.weight', 'model.levels.2.blocks.7.norm1.0.weight', 'model.levels.2.blocks.7.norm1.0.bias', 'model.levels.2.blocks.7.dcn.offset_mask.weight', 'model.levels.2.blocks.7.dcn.offset_mask.bias', 'model.levels.2.blocks.7.dcn.value_proj.weight', 'model.levels.2.blocks.7.dcn.value_proj.bias', 'model.levels.2.blocks.7.dcn.output_proj.weight', 'model.levels.2.blocks.7.norm2.0.weight', 'model.levels.2.blocks.7.norm2.0.bias', 'model.levels.2.blocks.7.mlp.fc1.weight', 'model.levels.2.blocks.7.mlp.fc1.bias', 'model.levels.2.blocks.7.mlp.fc2.weight', 'model.levels.2.blocks.8.norm1.0.weight', 'model.levels.2.blocks.8.norm1.0.bias', 'model.levels.2.blocks.8.dcn.offset_mask.weight', 'model.levels.2.blocks.8.dcn.offset_mask.bias', 'model.levels.2.blocks.8.dcn.value_proj.weight', 'model.levels.2.blocks.8.dcn.value_proj.bias', 'model.levels.2.blocks.8.dcn.output_proj.weight', 'model.levels.2.blocks.8.norm2.0.weight', 'model.levels.2.blocks.8.norm2.0.bias', 'model.levels.2.blocks.8.mlp.fc1.weight', 'model.levels.2.blocks.8.mlp.fc1.bias', 'model.levels.2.blocks.8.mlp.fc2.weight', 'model.levels.2.blocks.9.norm1.0.weight', 'model.levels.2.blocks.9.norm1.0.bias', 'model.levels.2.blocks.9.dcn.offset_mask.weight', 'model.levels.2.blocks.9.dcn.offset_mask.bias', 
'model.levels.2.blocks.9.dcn.value_proj.weight', 'model.levels.2.blocks.9.dcn.value_proj.bias', 'model.levels.2.blocks.9.dcn.output_proj.weight', 'model.levels.2.blocks.9.norm2.0.weight', 'model.levels.2.blocks.9.norm2.0.bias', 'model.levels.2.blocks.9.mlp.fc1.weight', 'model.levels.2.blocks.9.mlp.fc1.bias', 'model.levels.2.blocks.9.mlp.fc2.weight', 'model.levels.2.blocks.10.norm1.0.weight', 'model.levels.2.blocks.10.norm1.0.bias', 'model.levels.2.blocks.10.dcn.offset_mask.weight', 'model.levels.2.blocks.10.dcn.offset_mask.bias', 'model.levels.2.blocks.10.dcn.value_proj.weight', 'model.levels.2.blocks.10.dcn.value_proj.bias', 'model.levels.2.blocks.10.dcn.output_proj.weight', 'model.levels.2.blocks.10.norm2.0.weight', 'model.levels.2.blocks.10.norm2.0.bias', 'model.levels.2.blocks.10.mlp.fc1.weight', 'model.levels.2.blocks.10.mlp.fc1.bias', 'model.levels.2.blocks.10.mlp.fc2.weight', 'model.levels.2.blocks.11.norm1.0.weight', 'model.levels.2.blocks.11.norm1.0.bias', 'model.levels.2.blocks.11.dcn.offset_mask.weight', 'model.levels.2.blocks.11.dcn.offset_mask.bias', 'model.levels.2.blocks.11.dcn.value_proj.weight', 'model.levels.2.blocks.11.dcn.value_proj.bias', 'model.levels.2.blocks.11.dcn.output_proj.weight', 'model.levels.2.blocks.11.norm2.0.weight', 'model.levels.2.blocks.11.norm2.0.bias', 'model.levels.2.blocks.11.mlp.fc1.weight', 'model.levels.2.blocks.11.mlp.fc1.bias', 'model.levels.2.blocks.11.mlp.fc2.weight', 'model.levels.2.blocks.12.norm1.0.weight', 'model.levels.2.blocks.12.norm1.0.bias', 'model.levels.2.blocks.12.dcn.offset_mask.weight', 'model.levels.2.blocks.12.dcn.offset_mask.bias', 'model.levels.2.blocks.12.dcn.value_proj.weight', 'model.levels.2.blocks.12.dcn.value_proj.bias', 'model.levels.2.blocks.12.dcn.output_proj.weight', 'model.levels.2.blocks.12.norm2.0.weight', 'model.levels.2.blocks.12.norm2.0.bias', 'model.levels.2.blocks.12.mlp.fc1.weight', 'model.levels.2.blocks.12.mlp.fc1.bias', 'model.levels.2.blocks.12.mlp.fc2.weight', 
'model.levels.2.blocks.13.norm1.0.weight', 'model.levels.2.blocks.13.norm1.0.bias', 'model.levels.2.blocks.13.dcn.offset_mask.weight', 'model.levels.2.blocks.13.dcn.offset_mask.bias', 'model.levels.2.blocks.13.dcn.value_proj.weight', 'model.levels.2.blocks.13.dcn.value_proj.bias', 'model.levels.2.blocks.13.dcn.output_proj.weight', 'model.levels.2.blocks.13.norm2.0.weight', 'model.levels.2.blocks.13.norm2.0.bias', 'model.levels.2.blocks.13.mlp.fc1.weight', 'model.levels.2.blocks.13.mlp.fc1.bias', 'model.levels.2.blocks.13.mlp.fc2.weight', 'model.levels.2.blocks.14.norm1.0.weight', 'model.levels.2.blocks.14.norm1.0.bias', 'model.levels.2.blocks.14.dcn.offset_mask.weight', 'model.levels.2.blocks.14.dcn.offset_mask.bias', 'model.levels.2.blocks.14.dcn.value_proj.weight', 'model.levels.2.blocks.14.dcn.value_proj.bias', 'model.levels.2.blocks.14.dcn.output_proj.weight', 'model.levels.2.blocks.14.norm2.0.weight', 'model.levels.2.blocks.14.norm2.0.bias', 'model.levels.2.blocks.14.mlp.fc1.weight', 'model.levels.2.blocks.14.mlp.fc1.bias', 'model.levels.2.blocks.14.mlp.fc2.weight', 'model.levels.2.blocks.15.norm1.0.weight', 'model.levels.2.blocks.15.norm1.0.bias', 'model.levels.2.blocks.15.dcn.offset_mask.weight', 'model.levels.2.blocks.15.dcn.offset_mask.bias', 'model.levels.2.blocks.15.dcn.value_proj.weight', 'model.levels.2.blocks.15.dcn.value_proj.bias', 'model.levels.2.blocks.15.dcn.output_proj.weight', 'model.levels.2.blocks.15.norm2.0.weight', 'model.levels.2.blocks.15.norm2.0.bias', 'model.levels.2.blocks.15.mlp.fc1.weight', 'model.levels.2.blocks.15.mlp.fc1.bias', 'model.levels.2.blocks.15.mlp.fc2.weight', 'model.levels.2.blocks.16.norm1.0.weight', 'model.levels.2.blocks.16.norm1.0.bias', 'model.levels.2.blocks.16.dcn.offset_mask.weight', 'model.levels.2.blocks.16.dcn.offset_mask.bias', 'model.levels.2.blocks.16.dcn.value_proj.weight', 'model.levels.2.blocks.16.dcn.value_proj.bias', 'model.levels.2.blocks.16.dcn.output_proj.weight', 
'model.levels.2.blocks.16.norm2.0.weight', 'model.levels.2.blocks.16.norm2.0.bias', 'model.levels.2.blocks.16.mlp.fc1.weight', 'model.levels.2.blocks.16.mlp.fc1.bias', 'model.levels.2.blocks.16.mlp.fc2.weight', 'model.levels.2.blocks.17.norm1.0.weight', 'model.levels.2.blocks.17.norm1.0.bias', 'model.levels.2.blocks.17.dcn.offset_mask.weight', 'model.levels.2.blocks.17.dcn.offset_mask.bias', 'model.levels.2.blocks.17.dcn.value_proj.weight', 'model.levels.2.blocks.17.dcn.value_proj.bias', 'model.levels.2.blocks.17.dcn.output_proj.weight', 'model.levels.2.blocks.17.norm2.0.weight', 'model.levels.2.blocks.17.norm2.0.bias', 'model.levels.2.blocks.17.mlp.fc1.weight', 'model.levels.2.blocks.17.mlp.fc1.bias', 'model.levels.2.blocks.17.mlp.fc2.weight', 'model.levels.2.norm.0.weight', 'model.levels.2.norm.0.bias', 'model.levels.2.downsample.conv.weight', 'model.levels.2.downsample.norm.1.weight', 'model.levels.2.downsample.norm.1.bias', 'model.levels.3.blocks.0.norm1.0.weight', 'model.levels.3.blocks.0.norm1.0.bias', 'model.levels.3.blocks.0.dcn.offset_mask.weight', 'model.levels.3.blocks.0.dcn.offset_mask.bias', 'model.levels.3.blocks.0.dcn.value_proj.weight', 'model.levels.3.blocks.0.dcn.value_proj.bias', 'model.levels.3.blocks.0.dcn.output_proj.weight', 'model.levels.3.blocks.0.norm2.0.weight', 'model.levels.3.blocks.0.norm2.0.bias', 'model.levels.3.blocks.0.mlp.fc1.weight', 'model.levels.3.blocks.0.mlp.fc1.bias', 'model.levels.3.blocks.0.mlp.fc2.weight', 'model.levels.3.blocks.1.norm1.0.weight', 'model.levels.3.blocks.1.norm1.0.bias', 'model.levels.3.blocks.1.dcn.offset_mask.weight', 'model.levels.3.blocks.1.dcn.offset_mask.bias', 'model.levels.3.blocks.1.dcn.value_proj.weight', 'model.levels.3.blocks.1.dcn.value_proj.bias', 'model.levels.3.blocks.1.dcn.output_proj.weight', 'model.levels.3.blocks.1.norm2.0.weight', 'model.levels.3.blocks.1.norm2.0.bias', 'model.levels.3.blocks.1.mlp.fc1.weight', 'model.levels.3.blocks.1.mlp.fc1.bias', 
'model.levels.3.blocks.1.mlp.fc2.weight', 'model.levels.3.blocks.2.norm1.0.weight', 'model.levels.3.blocks.2.norm1.0.bias', 'model.levels.3.blocks.2.dcn.offset_mask.weight', 'model.levels.3.blocks.2.dcn.offset_mask.bias', 'model.levels.3.blocks.2.dcn.value_proj.weight', 'model.levels.3.blocks.2.dcn.value_proj.bias', 'model.levels.3.blocks.2.dcn.output_proj.weight', 'model.levels.3.blocks.2.norm2.0.weight', 'model.levels.3.blocks.2.norm2.0.bias', 'model.levels.3.blocks.2.mlp.fc1.weight', 'model.levels.3.blocks.2.mlp.fc1.bias', 'model.levels.3.blocks.2.mlp.fc2.weight', 'model.levels.3.blocks.3.norm1.0.weight', 'model.levels.3.blocks.3.norm1.0.bias', 'model.levels.3.blocks.3.dcn.offset_mask.weight', 'model.levels.3.blocks.3.dcn.offset_mask.bias', 'model.levels.3.blocks.3.dcn.value_proj.weight', 'model.levels.3.blocks.3.dcn.value_proj.bias', 'model.levels.3.blocks.3.dcn.output_proj.weight', 'model.levels.3.blocks.3.norm2.0.weight', 'model.levels.3.blocks.3.norm2.0.bias', 'model.levels.3.blocks.3.mlp.fc1.weight', 'model.levels.3.blocks.3.mlp.fc1.bias', 'model.levels.3.blocks.3.mlp.fc2.weight', 'model.levels.3.norm.0.weight', 'model.levels.3.norm.0.bias', 'model.conv_head.0.weight', 'model.conv_head.1.0.weight', 'model.conv_head.1.0.bias', 'model.conv_head.1.0.running_mean', 'model.conv_head.1.0.running_var']
Unexpected keys: ['patch_embed.conv1.weight', 'patch_embed.conv1.bias', 'patch_embed.norm1.1.weight', 'patch_embed.norm1.1.bias', 'patch_embed.conv2.weight', 'patch_embed.conv2.bias', 'patch_embed.norm2.1.weight', 'patch_embed.norm2.1.bias', 'levels.0.blocks.0.norm1.0.weight', 'levels.0.blocks.0.norm1.0.bias', 'levels.0.blocks.0.dcn.offset_mask.weight', 'levels.0.blocks.0.dcn.offset_mask.bias', 'levels.0.blocks.0.dcn.value_proj.weight', 'levels.0.blocks.0.dcn.value_proj.bias', 'levels.0.blocks.0.dcn.output_proj.weight', 'levels.0.blocks.0.norm2.0.weight', 'levels.0.blocks.0.norm2.0.bias', 'levels.0.blocks.0.mlp.fc1.weight', 'levels.0.blocks.0.mlp.fc1.bias', 'levels.0.blocks.0.mlp.fc2.weight', 'levels.0.blocks.1.norm1.0.weight', 'levels.0.blocks.1.norm1.0.bias', 'levels.0.blocks.1.dcn.offset_mask.weight', 'levels.0.blocks.1.dcn.offset_mask.bias', 'levels.0.blocks.1.dcn.value_proj.weight', 'levels.0.blocks.1.dcn.value_proj.bias', 'levels.0.blocks.1.dcn.output_proj.weight', 'levels.0.blocks.1.norm2.0.weight', 'levels.0.blocks.1.norm2.0.bias', 'levels.0.blocks.1.mlp.fc1.weight', 'levels.0.blocks.1.mlp.fc1.bias', 'levels.0.blocks.1.mlp.fc2.weight', 'levels.0.blocks.2.norm1.0.weight', 'levels.0.blocks.2.norm1.0.bias', 'levels.0.blocks.2.dcn.offset_mask.weight', 'levels.0.blocks.2.dcn.offset_mask.bias', 'levels.0.blocks.2.dcn.value_proj.weight', 'levels.0.blocks.2.dcn.value_proj.bias', 'levels.0.blocks.2.dcn.output_proj.weight', 'levels.0.blocks.2.norm2.0.weight', 'levels.0.blocks.2.norm2.0.bias', 'levels.0.blocks.2.mlp.fc1.weight', 'levels.0.blocks.2.mlp.fc1.bias', 'levels.0.blocks.2.mlp.fc2.weight', 'levels.0.blocks.3.norm1.0.weight', 'levels.0.blocks.3.norm1.0.bias', 'levels.0.blocks.3.dcn.offset_mask.weight', 'levels.0.blocks.3.dcn.offset_mask.bias', 'levels.0.blocks.3.dcn.value_proj.weight', 'levels.0.blocks.3.dcn.value_proj.bias', 'levels.0.blocks.3.dcn.output_proj.weight', 'levels.0.blocks.3.norm2.0.weight', 'levels.0.blocks.3.norm2.0.bias', 
'levels.0.blocks.3.mlp.fc1.weight', 'levels.0.blocks.3.mlp.fc1.bias', 'levels.0.blocks.3.mlp.fc2.weight', 'levels.0.norm.0.weight', 'levels.0.norm.0.bias', 'levels.0.downsample.conv.weight', 'levels.0.downsample.norm.1.weight', 'levels.0.downsample.norm.1.bias', 'levels.1.blocks.0.norm1.0.weight', 'levels.1.blocks.0.norm1.0.bias', 'levels.1.blocks.0.dcn.offset_mask.weight', 'levels.1.blocks.0.dcn.offset_mask.bias', 'levels.1.blocks.0.dcn.value_proj.weight', 'levels.1.blocks.0.dcn.value_proj.bias', 'levels.1.blocks.0.dcn.output_proj.weight', 'levels.1.blocks.0.norm2.0.weight', 'levels.1.blocks.0.norm2.0.bias', 'levels.1.blocks.0.mlp.fc1.weight', 'levels.1.blocks.0.mlp.fc1.bias', 'levels.1.blocks.0.mlp.fc2.weight', 'levels.1.blocks.1.norm1.0.weight', 'levels.1.blocks.1.norm1.0.bias', 'levels.1.blocks.1.dcn.offset_mask.weight', 'levels.1.blocks.1.dcn.offset_mask.bias', 'levels.1.blocks.1.dcn.value_proj.weight', 'levels.1.blocks.1.dcn.value_proj.bias', 'levels.1.blocks.1.dcn.output_proj.weight', 'levels.1.blocks.1.norm2.0.weight', 'levels.1.blocks.1.norm2.0.bias', 'levels.1.blocks.1.mlp.fc1.weight', 'levels.1.blocks.1.mlp.fc1.bias', 'levels.1.blocks.1.mlp.fc2.weight', 'levels.1.blocks.2.norm1.0.weight', 'levels.1.blocks.2.norm1.0.bias', 'levels.1.blocks.2.dcn.offset_mask.weight', 'levels.1.blocks.2.dcn.offset_mask.bias', 'levels.1.blocks.2.dcn.value_proj.weight', 'levels.1.blocks.2.dcn.value_proj.bias', 'levels.1.blocks.2.dcn.output_proj.weight', 'levels.1.blocks.2.norm2.0.weight', 'levels.1.blocks.2.norm2.0.bias', 'levels.1.blocks.2.mlp.fc1.weight', 'levels.1.blocks.2.mlp.fc1.bias', 'levels.1.blocks.2.mlp.fc2.weight', 'levels.1.blocks.3.norm1.0.weight', 'levels.1.blocks.3.norm1.0.bias', 'levels.1.blocks.3.dcn.offset_mask.weight', 'levels.1.blocks.3.dcn.offset_mask.bias', 'levels.1.blocks.3.dcn.value_proj.weight', 'levels.1.blocks.3.dcn.value_proj.bias', 'levels.1.blocks.3.dcn.output_proj.weight', 'levels.1.blocks.3.norm2.0.weight', 'levels.1.blocks.3.norm2.0.bias', 
'levels.1.blocks.3.mlp.fc1.weight', 'levels.1.blocks.3.mlp.fc1.bias', 'levels.1.blocks.3.mlp.fc2.weight', 'levels.1.norm.0.weight', 'levels.1.norm.0.bias', 'levels.1.downsample.conv.weight', 'levels.1.downsample.norm.1.weight', 'levels.1.downsample.norm.1.bias', 'levels.2.blocks.0.norm1.0.weight', 'levels.2.blocks.0.norm1.0.bias', 'levels.2.blocks.0.dcn.offset_mask.weight', 'levels.2.blocks.0.dcn.offset_mask.bias', 'levels.2.blocks.0.dcn.value_proj.weight', 'levels.2.blocks.0.dcn.value_proj.bias', 'levels.2.blocks.0.dcn.output_proj.weight', 'levels.2.blocks.0.norm2.0.weight', 'levels.2.blocks.0.norm2.0.bias', 'levels.2.blocks.0.mlp.fc1.weight', 'levels.2.blocks.0.mlp.fc1.bias', 'levels.2.blocks.0.mlp.fc2.weight', 'levels.2.blocks.1.norm1.0.weight', 'levels.2.blocks.1.norm1.0.bias', 'levels.2.blocks.1.dcn.offset_mask.weight', 'levels.2.blocks.1.dcn.offset_mask.bias', 'levels.2.blocks.1.dcn.value_proj.weight', 'levels.2.blocks.1.dcn.value_proj.bias', 'levels.2.blocks.1.dcn.output_proj.weight', 'levels.2.blocks.1.norm2.0.weight', 'levels.2.blocks.1.norm2.0.bias', 'levels.2.blocks.1.mlp.fc1.weight', 'levels.2.blocks.1.mlp.fc1.bias', 'levels.2.blocks.1.mlp.fc2.weight', 'levels.2.blocks.2.norm1.0.weight', 'levels.2.blocks.2.norm1.0.bias', 'levels.2.blocks.2.dcn.offset_mask.weight', 'levels.2.blocks.2.dcn.offset_mask.bias', 'levels.2.blocks.2.dcn.value_proj.weight', 'levels.2.blocks.2.dcn.value_proj.bias', 'levels.2.blocks.2.dcn.output_proj.weight', 'levels.2.blocks.2.norm2.0.weight', 'levels.2.blocks.2.norm2.0.bias', 'levels.2.blocks.2.mlp.fc1.weight', 'levels.2.blocks.2.mlp.fc1.bias', 'levels.2.blocks.2.mlp.fc2.weight', 'levels.2.blocks.3.norm1.0.weight', 'levels.2.blocks.3.norm1.0.bias', 'levels.2.blocks.3.dcn.offset_mask.weight', 'levels.2.blocks.3.dcn.offset_mask.bias', 'levels.2.blocks.3.dcn.value_proj.weight', 'levels.2.blocks.3.dcn.value_proj.bias', 'levels.2.blocks.3.dcn.output_proj.weight', 'levels.2.blocks.3.norm2.0.weight', 'levels.2.blocks.3.norm2.0.bias', 
'levels.2.blocks.3.mlp.fc1.weight', 'levels.2.blocks.3.mlp.fc1.bias', 'levels.2.blocks.3.mlp.fc2.weight', 'levels.2.blocks.4.norm1.0.weight', 'levels.2.blocks.4.norm1.0.bias', 'levels.2.blocks.4.dcn.offset_mask.weight', 'levels.2.blocks.4.dcn.offset_mask.bias', 'levels.2.blocks.4.dcn.value_proj.weight', 'levels.2.blocks.4.dcn.value_proj.bias', 'levels.2.blocks.4.dcn.output_proj.weight', 'levels.2.blocks.4.norm2.0.weight', 'levels.2.blocks.4.norm2.0.bias', 'levels.2.blocks.4.mlp.fc1.weight', 'levels.2.blocks.4.mlp.fc1.bias', 'levels.2.blocks.4.mlp.fc2.weight', 'levels.2.blocks.5.norm1.0.weight', 'levels.2.blocks.5.norm1.0.bias', 'levels.2.blocks.5.dcn.offset_mask.weight', 'levels.2.blocks.5.dcn.offset_mask.bias', 'levels.2.blocks.5.dcn.value_proj.weight', 'levels.2.blocks.5.dcn.value_proj.bias', 'levels.2.blocks.5.dcn.output_proj.weight', 'levels.2.blocks.5.norm2.0.weight', 'levels.2.blocks.5.norm2.0.bias', 'levels.2.blocks.5.mlp.fc1.weight', 'levels.2.blocks.5.mlp.fc1.bias', 'levels.2.blocks.5.mlp.fc2.weight', 'levels.2.blocks.6.norm1.0.weight', 'levels.2.blocks.6.norm1.0.bias', 'levels.2.blocks.6.dcn.offset_mask.weight', 'levels.2.blocks.6.dcn.offset_mask.bias', 'levels.2.blocks.6.dcn.value_proj.weight', 'levels.2.blocks.6.dcn.value_proj.bias', 'levels.2.blocks.6.dcn.output_proj.weight', 'levels.2.blocks.6.norm2.0.weight', 'levels.2.blocks.6.norm2.0.bias', 'levels.2.blocks.6.mlp.fc1.weight', 'levels.2.blocks.6.mlp.fc1.bias', 'levels.2.blocks.6.mlp.fc2.weight', 'levels.2.blocks.7.norm1.0.weight', 'levels.2.blocks.7.norm1.0.bias', 'levels.2.blocks.7.dcn.offset_mask.weight', 'levels.2.blocks.7.dcn.offset_mask.bias', 'levels.2.blocks.7.dcn.value_proj.weight', 'levels.2.blocks.7.dcn.value_proj.bias', 'levels.2.blocks.7.dcn.output_proj.weight', 'levels.2.blocks.7.norm2.0.weight', 'levels.2.blocks.7.norm2.0.bias', 'levels.2.blocks.7.mlp.fc1.weight', 'levels.2.blocks.7.mlp.fc1.bias', 'levels.2.blocks.7.mlp.fc2.weight', 'levels.2.blocks.8.norm1.0.weight', 
'levels.2.blocks.8.norm1.0.bias', 'levels.2.blocks.8.dcn.offset_mask.weight', 'levels.2.blocks.8.dcn.offset_mask.bias', 'levels.2.blocks.8.dcn.value_proj.weight', 'levels.2.blocks.8.dcn.value_proj.bias', 'levels.2.blocks.8.dcn.output_proj.weight', 'levels.2.blocks.8.norm2.0.weight', 'levels.2.blocks.8.norm2.0.bias', 'levels.2.blocks.8.mlp.fc1.weight', 'levels.2.blocks.8.mlp.fc1.bias', 'levels.2.blocks.8.mlp.fc2.weight', 'levels.2.blocks.9.norm1.0.weight', 'levels.2.blocks.9.norm1.0.bias', 'levels.2.blocks.9.dcn.offset_mask.weight', 'levels.2.blocks.9.dcn.offset_mask.bias', 'levels.2.blocks.9.dcn.value_proj.weight', 'levels.2.blocks.9.dcn.value_proj.bias', 'levels.2.blocks.9.dcn.output_proj.weight', 'levels.2.blocks.9.norm2.0.weight', 'levels.2.blocks.9.norm2.0.bias', 'levels.2.blocks.9.mlp.fc1.weight', 'levels.2.blocks.9.mlp.fc1.bias', 'levels.2.blocks.9.mlp.fc2.weight', 'levels.2.blocks.10.norm1.0.weight', 'levels.2.blocks.10.norm1.0.bias', 'levels.2.blocks.10.dcn.offset_mask.weight', 'levels.2.blocks.10.dcn.offset_mask.bias', 'levels.2.blocks.10.dcn.value_proj.weight', 'levels.2.blocks.10.dcn.value_proj.bias', 'levels.2.blocks.10.dcn.output_proj.weight', 'levels.2.blocks.10.norm2.0.weight', 'levels.2.blocks.10.norm2.0.bias', 'levels.2.blocks.10.mlp.fc1.weight', 'levels.2.blocks.10.mlp.fc1.bias', 'levels.2.blocks.10.mlp.fc2.weight', 'levels.2.blocks.11.norm1.0.weight', 'levels.2.blocks.11.norm1.0.bias', 'levels.2.blocks.11.dcn.offset_mask.weight', 'levels.2.blocks.11.dcn.offset_mask.bias', 'levels.2.blocks.11.dcn.value_proj.weight', 'levels.2.blocks.11.dcn.value_proj.bias', 'levels.2.blocks.11.dcn.output_proj.weight', 'levels.2.blocks.11.norm2.0.weight', 'levels.2.blocks.11.norm2.0.bias', 'levels.2.blocks.11.mlp.fc1.weight', 'levels.2.blocks.11.mlp.fc1.bias', 'levels.2.blocks.11.mlp.fc2.weight', 'levels.2.blocks.12.norm1.0.weight', 'levels.2.blocks.12.norm1.0.bias', 'levels.2.blocks.12.dcn.offset_mask.weight', 'levels.2.blocks.12.dcn.offset_mask.bias', 
'levels.2.blocks.12.dcn.value_proj.weight', 'levels.2.blocks.12.dcn.value_proj.bias', 'levels.2.blocks.12.dcn.output_proj.weight', 'levels.2.blocks.12.norm2.0.weight', 'levels.2.blocks.12.norm2.0.bias', 'levels.2.blocks.12.mlp.fc1.weight', 'levels.2.blocks.12.mlp.fc1.bias', 'levels.2.blocks.12.mlp.fc2.weight', 'levels.2.blocks.13.norm1.0.weight', 'levels.2.blocks.13.norm1.0.bias', 'levels.2.blocks.13.dcn.offset_mask.weight', 'levels.2.blocks.13.dcn.offset_mask.bias', 'levels.2.blocks.13.dcn.value_proj.weight', 'levels.2.blocks.13.dcn.value_proj.bias', 'levels.2.blocks.13.dcn.output_proj.weight', 'levels.2.blocks.13.norm2.0.weight', 'levels.2.blocks.13.norm2.0.bias', 'levels.2.blocks.13.mlp.fc1.weight', 'levels.2.blocks.13.mlp.fc1.bias', 'levels.2.blocks.13.mlp.fc2.weight', 'levels.2.blocks.14.norm1.0.weight', 'levels.2.blocks.14.norm1.0.bias', 'levels.2.blocks.14.dcn.offset_mask.weight', 'levels.2.blocks.14.dcn.offset_mask.bias', 'levels.2.blocks.14.dcn.value_proj.weight', 'levels.2.blocks.14.dcn.value_proj.bias', 'levels.2.blocks.14.dcn.output_proj.weight', 'levels.2.blocks.14.norm2.0.weight', 'levels.2.blocks.14.norm2.0.bias', 'levels.2.blocks.14.mlp.fc1.weight', 'levels.2.blocks.14.mlp.fc1.bias', 'levels.2.blocks.14.mlp.fc2.weight', 'levels.2.blocks.15.norm1.0.weight', 'levels.2.blocks.15.norm1.0.bias', 'levels.2.blocks.15.dcn.offset_mask.weight', 'levels.2.blocks.15.dcn.offset_mask.bias', 'levels.2.blocks.15.dcn.value_proj.weight', 'levels.2.blocks.15.dcn.value_proj.bias', 'levels.2.blocks.15.dcn.output_proj.weight', 'levels.2.blocks.15.norm2.0.weight', 'levels.2.blocks.15.norm2.0.bias', 'levels.2.blocks.15.mlp.fc1.weight', 'levels.2.blocks.15.mlp.fc1.bias', 'levels.2.blocks.15.mlp.fc2.weight', 'levels.2.blocks.16.norm1.0.weight', 'levels.2.blocks.16.norm1.0.bias', 'levels.2.blocks.16.dcn.offset_mask.weight', 'levels.2.blocks.16.dcn.offset_mask.bias', 'levels.2.blocks.16.dcn.value_proj.weight', 'levels.2.blocks.16.dcn.value_proj.bias', 
'levels.2.blocks.16.dcn.output_proj.weight', 'levels.2.blocks.16.norm2.0.weight', 'levels.2.blocks.16.norm2.0.bias', 'levels.2.blocks.16.mlp.fc1.weight', 'levels.2.blocks.16.mlp.fc1.bias', 'levels.2.blocks.16.mlp.fc2.weight', 'levels.2.blocks.17.norm1.0.weight', 'levels.2.blocks.17.norm1.0.bias', 'levels.2.blocks.17.dcn.offset_mask.weight', 'levels.2.blocks.17.dcn.offset_mask.bias', 'levels.2.blocks.17.dcn.value_proj.weight', 'levels.2.blocks.17.dcn.value_proj.bias', 'levels.2.blocks.17.dcn.output_proj.weight', 'levels.2.blocks.17.norm2.0.weight', 'levels.2.blocks.17.norm2.0.bias', 'levels.2.blocks.17.mlp.fc1.weight', 'levels.2.blocks.17.mlp.fc1.bias', 'levels.2.blocks.17.mlp.fc2.weight', 'levels.2.norm.0.weight', 'levels.2.norm.0.bias', 'levels.2.downsample.conv.weight', 'levels.2.downsample.norm.1.weight', 'levels.2.downsample.norm.1.bias', 'levels.3.blocks.0.norm1.0.weight', 'levels.3.blocks.0.norm1.0.bias', 'levels.3.blocks.0.dcn.offset_mask.weight', 'levels.3.blocks.0.dcn.offset_mask.bias', 'levels.3.blocks.0.dcn.value_proj.weight', 'levels.3.blocks.0.dcn.value_proj.bias', 'levels.3.blocks.0.dcn.output_proj.weight', 'levels.3.blocks.0.norm2.0.weight', 'levels.3.blocks.0.norm2.0.bias', 'levels.3.blocks.0.mlp.fc1.weight', 'levels.3.blocks.0.mlp.fc1.bias', 'levels.3.blocks.0.mlp.fc2.weight', 'levels.3.blocks.1.norm1.0.weight', 'levels.3.blocks.1.norm1.0.bias', 'levels.3.blocks.1.dcn.offset_mask.weight', 'levels.3.blocks.1.dcn.offset_mask.bias', 'levels.3.blocks.1.dcn.value_proj.weight', 'levels.3.blocks.1.dcn.value_proj.bias', 'levels.3.blocks.1.dcn.output_proj.weight', 'levels.3.blocks.1.norm2.0.weight', 'levels.3.blocks.1.norm2.0.bias', 'levels.3.blocks.1.mlp.fc1.weight', 'levels.3.blocks.1.mlp.fc1.bias', 'levels.3.blocks.1.mlp.fc2.weight', 'levels.3.blocks.2.norm1.0.weight', 'levels.3.blocks.2.norm1.0.bias', 'levels.3.blocks.2.dcn.offset_mask.weight', 'levels.3.blocks.2.dcn.offset_mask.bias', 'levels.3.blocks.2.dcn.value_proj.weight', 
'levels.3.blocks.2.dcn.value_proj.bias', 'levels.3.blocks.2.dcn.output_proj.weight', 'levels.3.blocks.2.norm2.0.weight', 'levels.3.blocks.2.norm2.0.bias', 'levels.3.blocks.2.mlp.fc1.weight', 'levels.3.blocks.2.mlp.fc1.bias', 'levels.3.blocks.2.mlp.fc2.weight', 'levels.3.blocks.3.norm1.0.weight', 'levels.3.blocks.3.norm1.0.bias', 'levels.3.blocks.3.dcn.offset_mask.weight', 'levels.3.blocks.3.dcn.offset_mask.bias', 'levels.3.blocks.3.dcn.value_proj.weight', 'levels.3.blocks.3.dcn.value_proj.bias', 'levels.3.blocks.3.dcn.output_proj.weight', 'levels.3.blocks.3.norm2.0.weight', 'levels.3.blocks.3.norm2.0.bias', 'levels.3.blocks.3.mlp.fc1.weight', 'levels.3.blocks.3.mlp.fc1.bias', 'levels.3.blocks.3.mlp.fc2.weight', 'levels.3.norm.0.weight', 'levels.3.norm.0.bias', 'conv_head.0.weight', 'conv_head.1.0.weight', 'conv_head.1.0.bias', 'conv_head.1.0.running_mean', 'conv_head.1.0.running_var', 'conv_head.1.0.num_batches_tracked', 'head.weight', 'head.bias']

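Every parameter name in my model carries an extra 'model.' prefix compared with the corresponding key in the pretrained weights. Since load_state_dict matches entries by exact key name, not a single key matched, and no weights were loaded.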
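The mismatch can be reproduced on a toy model. The sketch below assumes the prefix comes from wrapping the backbone under an attribute named `model`, which is one common cause but is not confirmed by the original code. `Backbone` and `Wrapper` are hypothetical names used only for illustration.

```python
import torch.nn as nn

# Hypothetical stand-in for the open-source backbone.
class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.head = nn.Linear(8, 10)

# Hypothetical wrapper: storing the backbone under the attribute `model`
# prefixes every parameter name with "model.".
class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = Backbone()

pretrained = Backbone().state_dict()  # keys: conv.weight, conv.bias, head.weight, head.bias
wrapped = Wrapper()

result = wrapped.load_state_dict(pretrained, strict=False)
print("Missing keys:", result.missing_keys)
print("Unexpected keys:", result.unexpected_keys)

# The two key sets do not overlap, so nothing was copied into the model.
shared = set(wrapped.state_dict()) & set(pretrained)
print("Shared keys:", len(shared))
```

When both lists contain the same names apart from a fixed prefix, as here, the cause is almost certainly a naming mismatch, not a structural difference between the networks.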

The fix is to prepend 'model.' to every key in the pretrained weights before loading them.

        # Prepend 'model.' so the checkpoint keys match the model's parameter names
        model_weight = {'model.' + key: value for key, value in model_weight.items()}
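As an aside, the same renaming can be wrapped in small helpers that work in both directions. This is only a sketch: `add_prefix` and `strip_prefix` are hypothetical helper names, not part of PyTorch. The reverse direction is useful when the checkpoint has the extra prefix, for example the 'module.' prefix that appears when a model was saved while wrapped in nn.DataParallel.

```python
from collections import OrderedDict

def add_prefix(state_dict, prefix):
    """Return a copy of state_dict with `prefix` prepended to every key."""
    return OrderedDict((prefix + k, v) for k, v in state_dict.items())

def strip_prefix(state_dict, prefix):
    """Return a copy with `prefix` removed from the keys that start with it."""
    return OrderedDict(
        (k[len(prefix):] if k.startswith(prefix) else k, v)
        for k, v in state_dict.items()
    )

# Plain integers stand in for tensors; only the keys matter here.
weights = {"patch_embed.conv1.weight": 0, "head.bias": 1}

renamed = add_prefix(weights, "model.")
print(list(renamed))

restored = strip_prefix(renamed, "model.")
print(list(restored))
```

Both helpers return a new mapping and leave the original checkpoint dictionary untouched, so it can still be inspected if the load fails again.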

Loading again with the renamed keys prints:

Missing keys: []
Unexpected keys: ['model.head.weight', 'model.head.bias']

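Missing keys is now empty, so every parameter the model needs has been loaded. The two remaining unexpected keys belong to head, presumably one of the layers removed from the model, so they can safely be ignored.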
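To make this check enforceable, one option is to drop the checkpoint entries the model does not use and then load with strict=True, so that any missing parameter raises an error. The sketch below uses a hypothetical helper, `filter_to_model`, and a toy pair of modules in which the checkpoint has a `head` that the model lacks. It is an illustration under those assumptions, not the author's code.

```python
import torch.nn as nn

def filter_to_model(model, state_dict):
    """Keep only entries whose key and tensor shape both match the model."""
    own = model.state_dict()
    kept = {k: v for k, v in state_dict.items()
            if k in own and v.shape == own[k].shape}
    dropped = [k for k in state_dict if k not in kept]
    return kept, dropped

# Hypothetical toy setup: the checkpoint has a `head` the model lacks.
source = nn.ModuleDict({"conv": nn.Conv2d(3, 8, 3), "head": nn.Linear(8, 10)})
target = nn.ModuleDict({"conv": nn.Conv2d(3, 8, 3)})

kept, dropped = filter_to_model(target, source.state_dict())
print("Dropped:", dropped)

# strict=True raises if any model parameter has no matching checkpoint entry.
target.load_state_dict(kept, strict=True)
```

Printing the dropped keys before loading keeps the filtering visible: a long list there would again point to a naming mismatch like the one above.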
