pytorch加载预训练权重失败

问题

给当前模型换了个开源的主干网络,并且删除了某些层后,但是发现预训练权重一直加载不上。strict为True时加载报错,strict为False时又什么都加载不上,然后不知道哪里出问题了。

解决

当strict为False时,load_state_dict函数会返回一个字典,该字典含有以下两个键:

missing_keys:在当前模型中存在,但在预训练权重中不存在的键。
unexpected_keys:在当前模型不存在,但在预训练权重中存在的键。
python 复制代码
        result=self.backbone.load_state_dict(model_weight,strict=False)
        print("Missing keys:", result.missing_keys)
        print("Unexpected keys:", result.unexpected_keys)

得到输出:

Missing keys: ['model.patch_embed.conv1.weight', 'model.patch_embed.conv1.bias', 'model.patch_embed.norm1.1.weight', 'model.patch_embed.norm1.1.bias', 'model.patch_embed.conv2.weight', 'model.patch_embed.conv2.bias', 'model.patch_embed.norm2.1.weight', 'model.patch_embed.norm2.1.bias', 'model.levels.0.blocks.0.norm1.0.weight', 'model.levels.0.blocks.0.norm1.0.bias', 'model.levels.0.blocks.0.dcn.offset_mask.weight', 'model.levels.0.blocks.0.dcn.offset_mask.bias', 'model.levels.0.blocks.0.dcn.value_proj.weight', 'model.levels.0.blocks.0.dcn.value_proj.bias', 'model.levels.0.blocks.0.dcn.output_proj.weight', 'model.levels.0.blocks.0.norm2.0.weight', 'model.levels.0.blocks.0.norm2.0.bias', 'model.levels.0.blocks.0.mlp.fc1.weight', 'model.levels.0.blocks.0.mlp.fc1.bias', 'model.levels.0.blocks.0.mlp.fc2.weight', 'model.levels.0.blocks.1.norm1.0.weight', 'model.levels.0.blocks.1.norm1.0.bias', 'model.levels.0.blocks.1.dcn.offset_mask.weight', 'model.levels.0.blocks.1.dcn.offset_mask.bias', 'model.levels.0.blocks.1.dcn.value_proj.weight', 'model.levels.0.blocks.1.dcn.value_proj.bias', 'model.levels.0.blocks.1.dcn.output_proj.weight', 'model.levels.0.blocks.1.norm2.0.weight', 'model.levels.0.blocks.1.norm2.0.bias', 'model.levels.0.blocks.1.mlp.fc1.weight', 'model.levels.0.blocks.1.mlp.fc1.bias', 'model.levels.0.blocks.1.mlp.fc2.weight', 'model.levels.0.blocks.2.norm1.0.weight', 'model.levels.0.blocks.2.norm1.0.bias', 'model.levels.0.blocks.2.dcn.offset_mask.weight', 'model.levels.0.blocks.2.dcn.offset_mask.bias', 'model.levels.0.blocks.2.dcn.value_proj.weight', 'model.levels.0.blocks.2.dcn.value_proj.bias', 'model.levels.0.blocks.2.dcn.output_proj.weight', 'model.levels.0.blocks.2.norm2.0.weight', 'model.levels.0.blocks.2.norm2.0.bias', 'model.levels.0.blocks.2.mlp.fc1.weight', 'model.levels.0.blocks.2.mlp.fc1.bias', 'model.levels.0.blocks.2.mlp.fc2.weight', 'model.levels.0.blocks.3.norm1.0.weight', 'model.levels.0.blocks.3.norm1.0.bias', 'model.levels.0.blocks.3.dcn.offset_mask.weight', 'model.levels.0.blocks.3.dcn.offset_mask.bias', 'model.levels.0.blocks.3.dcn.value_proj.weight', 'model.levels.0.blocks.3.dcn.value_proj.bias', 'model.levels.0.blocks.3.dcn.output_proj.weight', 'model.levels.0.blocks.3.norm2.0.weight', 'model.levels.0.blocks.3.norm2.0.bias', 'model.levels.0.blocks.3.mlp.fc1.weight', 'model.levels.0.blocks.3.mlp.fc1.bias', 'model.levels.0.blocks.3.mlp.fc2.weight', 'model.levels.0.norm.0.weight', 'model.levels.0.norm.0.bias', 'model.levels.0.downsample.conv.weight', 'model.levels.0.downsample.norm.1.weight', 'model.levels.0.downsample.norm.1.bias', 'model.levels.1.blocks.0.norm1.0.weight', 'model.levels.1.blocks.0.norm1.0.bias', 'model.levels.1.blocks.0.dcn.offset_mask.weight', 'model.levels.1.blocks.0.dcn.offset_mask.bias', 'model.levels.1.blocks.0.dcn.value_proj.weight', 'model.levels.1.blocks.0.dcn.value_proj.bias', 'model.levels.1.blocks.0.dcn.output_proj.weight', 'model.levels.1.blocks.0.norm2.0.weight', 'model.levels.1.blocks.0.norm2.0.bias', 'model.levels.1.blocks.0.mlp.fc1.weight', 'model.levels.1.blocks.0.mlp.fc1.bias', 'model.levels.1.blocks.0.mlp.fc2.weight', 'model.levels.1.blocks.1.norm1.0.weight', 'model.levels.1.blocks.1.norm1.0.bias', 'model.levels.1.blocks.1.dcn.offset_mask.weight', 'model.levels.1.blocks.1.dcn.offset_mask.bias', 'model.levels.1.blocks.1.dcn.value_proj.weight', 'model.levels.1.blocks.1.dcn.value_proj.bias', 'model.levels.1.blocks.1.dcn.output_proj.weight', 'model.levels.1.blocks.1.norm2.0.weight', 'model.levels.1.blocks.1.norm2.0.bias', 'model.levels.1.blocks.1.mlp.fc1.weight', 'model.levels.1.blocks.1.mlp.fc1.bias', 'model.levels.1.blocks.1.mlp.fc2.weight', 'model.levels.1.blocks.2.norm1.0.weight', 'model.levels.1.blocks.2.norm1.0.bias', 'model.levels.1.blocks.2.dcn.offset_mask.weight', 'model.levels.1.blocks.2.dcn.offset_mask.bias', 'model.levels.1.blocks.2.dcn.value_proj.weight', 'model.levels.1.blocks.2.dcn.value_proj.bias', 'model.levels.1.blocks.2.dcn.output_proj.weight', 'model.levels.1.blocks.2.norm2.0.weight', 'model.levels.1.blocks.2.norm2.0.bias', 'model.levels.1.blocks.2.mlp.fc1.weight', 'model.levels.1.blocks.2.mlp.fc1.bias', 'model.levels.1.blocks.2.mlp.fc2.weight', 'model.levels.1.blocks.3.norm1.0.weight', 'model.levels.1.blocks.3.norm1.0.bias', 'model.levels.1.blocks.3.dcn.offset_mask.weight', 'model.levels.1.blocks.3.dcn.offset_mask.bias', 'model.levels.1.blocks.3.dcn.value_proj.weight', 'model.levels.1.blocks.3.dcn.value_proj.bias', 'model.levels.1.blocks.3.dcn.output_proj.weight', 'model.levels.1.blocks.3.norm2.0.weight', 'model.levels.1.blocks.3.norm2.0.bias', 'model.levels.1.blocks.3.mlp.fc1.weight', 'model.levels.1.blocks.3.mlp.fc1.bias', 'model.levels.1.blocks.3.mlp.fc2.weight', 'model.levels.1.norm.0.weight', 'model.levels.1.norm.0.bias', 'model.levels.1.downsample.conv.weight', 'model.levels.1.downsample.norm.1.weight', 'model.levels.1.downsample.norm.1.bias', 'model.levels.2.blocks.0.norm1.0.weight', 'model.levels.2.blocks.0.norm1.0.bias', 'model.levels.2.blocks.0.dcn.offset_mask.weight', 'model.levels.2.blocks.0.dcn.offset_mask.bias', 'model.levels.2.blocks.0.dcn.value_proj.weight', 'model.levels.2.blocks.0.dcn.value_proj.bias', 'model.levels.2.blocks.0.dcn.output_proj.weight', 'model.levels.2.blocks.0.norm2.0.weight', 'model.levels.2.blocks.0.norm2.0.bias', 'model.levels.2.blocks.0.mlp.fc1.weight', 'model.levels.2.blocks.0.mlp.fc1.bias', 'model.levels.2.blocks.0.mlp.fc2.weight', 'model.levels.2.blocks.1.norm1.0.weight', 'model.levels.2.blocks.1.norm1.0.bias', 'model.levels.2.blocks.1.dcn.offset_mask.weight', 'model.levels.2.blocks.1.dcn.offset_mask.bias', 'model.levels.2.blocks.1.dcn.value_proj.weight', 'model.levels.2.blocks.1.dcn.value_proj.bias', 'model.levels.2.blocks.1.dcn.output_proj.weight', 'model.levels.2.blocks.1.norm2.0.weight', 'model.levels.2.blocks.1.norm2.0.bias', 'model.levels.2.blocks.1.mlp.fc1.weight', 'model.levels.2.blocks.1.mlp.fc1.bias', 'model.levels.2.blocks.1.mlp.fc2.weight', 'model.levels.2.blocks.2.norm1.0.weight', 'model.levels.2.blocks.2.norm1.0.bias', 'model.levels.2.blocks.2.dcn.offset_mask.weight', 'model.levels.2.blocks.2.dcn.offset_mask.bias', 'model.levels.2.blocks.2.dcn.value_proj.weight', 'model.levels.2.blocks.2.dcn.value_proj.bias', 'model.levels.2.blocks.2.dcn.output_proj.weight', 'model.levels.2.blocks.2.norm2.0.weight', 'model.levels.2.blocks.2.norm2.0.bias', 'model.levels.2.blocks.2.mlp.fc1.weight', 'model.levels.2.blocks.2.mlp.fc1.bias', 'model.levels.2.blocks.2.mlp.fc2.weight', 'model.levels.2.blocks.3.norm1.0.weight', 'model.levels.2.blocks.3.norm1.0.bias', 'model.levels.2.blocks.3.dcn.offset_mask.weight', 'model.levels.2.blocks.3.dcn.offset_mask.bias', 'model.levels.2.blocks.3.dcn.value_proj.weight', 'model.levels.2.blocks.3.dcn.value_proj.bias', 'model.levels.2.blocks.3.dcn.output_proj.weight', 'model.levels.2.blocks.3.norm2.0.weight', 'model.levels.2.blocks.3.norm2.0.bias', 'model.levels.2.blocks.3.mlp.fc1.weight', 'model.levels.2.blocks.3.mlp.fc1.bias', 'model.levels.2.blocks.3.mlp.fc2.weight', 'model.levels.2.blocks.4.norm1.0.weight', 'model.levels.2.blocks.4.norm1.0.bias', 'model.levels.2.blocks.4.dcn.offset_mask.weight', 'model.levels.2.blocks.4.dcn.offset_mask.bias', 'model.levels.2.blocks.4.dcn.value_proj.weight', 'model.levels.2.blocks.4.dcn.value_proj.bias', 'model.levels.2.blocks.4.dcn.output_proj.weight', 'model.levels.2.blocks.4.norm2.0.weight', 'model.levels.2.blocks.4.norm2.0.bias', 'model.levels.2.blocks.4.mlp.fc1.weight', 'model.levels.2.blocks.4.mlp.fc1.bias', 'model.levels.2.blocks.4.mlp.fc2.weight', 'model.levels.2.blocks.5.norm1.0.weight', 'model.levels.2.blocks.5.norm1.0.bias', 'model.levels.2.blocks.5.dcn.offset_mask.weight', 'model.levels.2.blocks.5.dcn.offset_mask.bias', 'model.levels.2.blocks.5.dcn.value_proj.weight', 'model.levels.2.blocks.5.dcn.value_proj.bias', 'model.levels.2.blocks.5.dcn.output_proj.weight', 'model.levels.2.blocks.5.norm2.0.weight', 'model.levels.2.blocks.5.norm2.0.bias', 'model.levels.2.blocks.5.mlp.fc1.weight', 'model.levels.2.blocks.5.mlp.fc1.bias', 'model.levels.2.blocks.5.mlp.fc2.weight', 'model.levels.2.blocks.6.norm1.0.weight', 'model.levels.2.blocks.6.norm1.0.bias', 'model.levels.2.blocks.6.dcn.offset_mask.weight', 'model.levels.2.blocks.6.dcn.offset_mask.bias', 'model.levels.2.blocks.6.dcn.value_proj.weight', 'model.levels.2.blocks.6.dcn.value_proj.bias', 'model.levels.2.blocks.6.dcn.output_proj.weight', 'model.levels.2.blocks.6.norm2.0.weight', 'model.levels.2.blocks.6.norm2.0.bias', 'model.levels.2.blocks.6.mlp.fc1.weight', 'model.levels.2.blocks.6.mlp.fc1.bias', 'model.levels.2.blocks.6.mlp.fc2.weight', 'model.levels.2.blocks.7.norm1.0.weight', 'model.levels.2.blocks.7.norm1.0.bias', 'model.levels.2.blocks.7.dcn.offset_mask.weight', 'model.levels.2.blocks.7.dcn.offset_mask.bias', 'model.levels.2.blocks.7.dcn.value_proj.weight', 'model.levels.2.blocks.7.dcn.value_proj.bias', 'model.levels.2.blocks.7.dcn.output_proj.weight', 'model.levels.2.blocks.7.norm2.0.weight', 'model.levels.2.blocks.7.norm2.0.bias', 'model.levels.2.blocks.7.mlp.fc1.weight', 'model.levels.2.blocks.7.mlp.fc1.bias', 'model.levels.2.blocks.7.mlp.fc2.weight', 'model.levels.2.blocks.8.norm1.0.weight', 'model.levels.2.blocks.8.norm1.0.bias', 'model.levels.2.blocks.8.dcn.offset_mask.weight', 'model.levels.2.blocks.8.dcn.offset_mask.bias', 'model.levels.2.blocks.8.dcn.value_proj.weight', 'model.levels.2.blocks.8.dcn.value_proj.bias', 'model.levels.2.blocks.8.dcn.output_proj.weight', 'model.levels.2.blocks.8.norm2.0.weight', 'model.levels.2.blocks.8.norm2.0.bias', 'model.levels.2.blocks.8.mlp.fc1.weight', 'model.levels.2.blocks.8.mlp.fc1.bias', 'model.levels.2.blocks.8.mlp.fc2.weight', 'model.levels.2.blocks.9.norm1.0.weight', 'model.levels.2.blocks.9.norm1.0.bias', 'model.levels.2.blocks.9.dcn.offset_mask.weight', 'model.levels.2.blocks.9.dcn.offset_mask.bias', 'model.levels.2.blocks.9.dcn.value_proj.weight', 'model.levels.2.blocks.9.dcn.value_proj.bias', 'model.levels.2.blocks.9.dcn.output_proj.weight', 'model.levels.2.blocks.9.norm2.0.weight', 'model.levels.2.blocks.9.norm2.0.bias', 'model.levels.2.blocks.9.mlp.fc1.weight', 'model.levels.2.blocks.9.mlp.fc1.bias', 'model.levels.2.blocks.9.mlp.fc2.weight', 'model.levels.2.blocks.10.norm1.0.weight', 'model.levels.2.blocks.10.norm1.0.bias', 'model.levels.2.blocks.10.dcn.offset_mask.weight', 'model.levels.2.blocks.10.dcn.offset_mask.bias', 'model.levels.2.blocks.10.dcn.value_proj.weight', 'model.levels.2.blocks.10.dcn.value_proj.bias', 'model.levels.2.blocks.10.dcn.output_proj.weight', 'model.levels.2.blocks.10.norm2.0.weight', 'model.levels.2.blocks.10.norm2.0.bias', 'model.levels.2.blocks.10.mlp.fc1.weight', 'model.levels.2.blocks.10.mlp.fc1.bias', 'model.levels.2.blocks.10.mlp.fc2.weight', 'model.levels.2.blocks.11.norm1.0.weight', 'model.levels.2.blocks.11.norm1.0.bias', 'model.levels.2.blocks.11.dcn.offset_mask.weight', 'model.levels.2.blocks.11.dcn.offset_mask.bias', 'model.levels.2.blocks.11.dcn.value_proj.weight', 'model.levels.2.blocks.11.dcn.value_proj.bias', 'model.levels.2.blocks.11.dcn.output_proj.weight', 'model.levels.2.blocks.11.norm2.0.weight', 'model.levels.2.blocks.11.norm2.0.bias', 'model.levels.2.blocks.11.mlp.fc1.weight', 'model.levels.2.blocks.11.mlp.fc1.bias', 'model.levels.2.blocks.11.mlp.fc2.weight', 'model.levels.2.blocks.12.norm1.0.weight', 'model.levels.2.blocks.12.norm1.0.bias', 'model.levels.2.blocks.12.dcn.offset_mask.weight', 'model.levels.2.blocks.12.dcn.offset_mask.bias', 'model.levels.2.blocks.12.dcn.value_proj.weight', 'model.levels.2.blocks.12.dcn.value_proj.bias', 'model.levels.2.blocks.12.dcn.output_proj.weight', 'model.levels.2.blocks.12.norm2.0.weight', 'model.levels.2.blocks.12.norm2.0.bias', 'model.levels.2.blocks.12.mlp.fc1.weight', 'model.levels.2.blocks.12.mlp.fc1.bias', 'model.levels.2.blocks.12.mlp.fc2.weight', 'model.levels.2.blocks.13.norm1.0.weight', 'model.levels.2.blocks.13.norm1.0.bias', 'model.levels.2.blocks.13.dcn.offset_mask.weight', 'model.levels.2.blocks.13.dcn.offset_mask.bias', 'model.levels.2.blocks.13.dcn.value_proj.weight', 'model.levels.2.blocks.13.dcn.value_proj.bias', 'model.levels.2.blocks.13.dcn.output_proj.weight', 'model.levels.2.blocks.13.norm2.0.weight', 'model.levels.2.blocks.13.norm2.0.bias', 'model.levels.2.blocks.13.mlp.fc1.weight', 'model.levels.2.blocks.13.mlp.fc1.bias', 'model.levels.2.blocks.13.mlp.fc2.weight', 'model.levels.2.blocks.14.norm1.0.weight', 'model.levels.2.blocks.14.norm1.0.bias', 'model.levels.2.blocks.14.dcn.offset_mask.weight', 'model.levels.2.blocks.14.dcn.offset_mask.bias', 'model.levels.2.blocks.14.dcn.value_proj.weight', 'model.levels.2.blocks.14.dcn.value_proj.bias', 'model.levels.2.blocks.14.dcn.output_proj.weight', 'model.levels.2.blocks.14.norm2.0.weight', 'model.levels.2.blocks.14.norm2.0.bias', 'model.levels.2.blocks.14.mlp.fc1.weight', 'model.levels.2.blocks.14.mlp.fc1.bias', 'model.levels.2.blocks.14.mlp.fc2.weight', 'model.levels.2.blocks.15.norm1.0.weight', 'model.levels.2.blocks.15.norm1.0.bias', 'model.levels.2.blocks.15.dcn.offset_mask.weight', 'model.levels.2.blocks.15.dcn.offset_mask.bias', 'model.levels.2.blocks.15.dcn.value_proj.weight', 'model.levels.2.blocks.15.dcn.value_proj.bias', 'model.levels.2.blocks.15.dcn.output_proj.weight', 'model.levels.2.blocks.15.norm2.0.weight', 'model.levels.2.blocks.15.norm2.0.bias', 'model.levels.2.blocks.15.mlp.fc1.weight', 'model.levels.2.blocks.15.mlp.fc1.bias', 'model.levels.2.blocks.15.mlp.fc2.weight', 'model.levels.2.blocks.16.norm1.0.weight', 'model.levels.2.blocks.16.norm1.0.bias', 'model.levels.2.blocks.16.dcn.offset_mask.weight', 'model.levels.2.blocks.16.dcn.offset_mask.bias', 'model.levels.2.blocks.16.dcn.value_proj.weight', 'model.levels.2.blocks.16.dcn.value_proj.bias', 'model.levels.2.blocks.16.dcn.output_proj.weight', 'model.levels.2.blocks.16.norm2.0.weight', 'model.levels.2.blocks.16.norm2.0.bias', 'model.levels.2.blocks.16.mlp.fc1.weight', 'model.levels.2.blocks.16.mlp.fc1.bias', 'model.levels.2.blocks.16.mlp.fc2.weight', 'model.levels.2.blocks.17.norm1.0.weight', 'model.levels.2.blocks.17.norm1.0.bias', 'model.levels.2.blocks.17.dcn.offset_mask.weight', 'model.levels.2.blocks.17.dcn.offset_mask.bias', 'model.levels.2.blocks.17.dcn.value_proj.weight', 'model.levels.2.blocks.17.dcn.value_proj.bias', 'model.levels.2.blocks.17.dcn.output_proj.weight', 'model.levels.2.blocks.17.norm2.0.weight', 'model.levels.2.blocks.17.norm2.0.bias', 'model.levels.2.blocks.17.mlp.fc1.weight', 'model.levels.2.blocks.17.mlp.fc1.bias', 'model.levels.2.blocks.17.mlp.fc2.weight', 'model.levels.2.norm.0.weight', 'model.levels.2.norm.0.bias', 'model.levels.2.downsample.conv.weight', 'model.levels.2.downsample.norm.1.weight', 'model.levels.2.downsample.norm.1.bias', 'model.levels.3.blocks.0.norm1.0.weight', 'model.levels.3.blocks.0.norm1.0.bias', 'model.levels.3.blocks.0.dcn.offset_mask.weight', 'model.levels.3.blocks.0.dcn.offset_mask.bias', 'model.levels.3.blocks.0.dcn.value_proj.weight', 'model.levels.3.blocks.0.dcn.value_proj.bias', 'model.levels.3.blocks.0.dcn.output_proj.weight', 'model.levels.3.blocks.0.norm2.0.weight', 'model.levels.3.blocks.0.norm2.0.bias', 'model.levels.3.blocks.0.mlp.fc1.weight', 'model.levels.3.blocks.0.mlp.fc1.bias', 'model.levels.3.blocks.0.mlp.fc2.weight', 'model.levels.3.blocks.1.norm1.0.weight', 'model.levels.3.blocks.1.norm1.0.bias', 'model.levels.3.blocks.1.dcn.offset_mask.weight', 'model.levels.3.blocks.1.dcn.offset_mask.bias', 'model.levels.3.blocks.1.dcn.value_proj.weight', 'model.levels.3.blocks.1.dcn.value_proj.bias', 'model.levels.3.blocks.1.dcn.output_proj.weight', 'model.levels.3.blocks.1.norm2.0.weight', 'model.levels.3.blocks.1.norm2.0.bias', 'model.levels.3.blocks.1.mlp.fc1.weight', 'model.levels.3.blocks.1.mlp.fc1.bias', 'model.levels.3.blocks.1.mlp.fc2.weight', 'model.levels.3.blocks.2.norm1.0.weight', 'model.levels.3.blocks.2.norm1.0.bias', 'model.levels.3.blocks.2.dcn.offset_mask.weight', 'model.levels.3.blocks.2.dcn.offset_mask.bias', 'model.levels.3.blocks.2.dcn.value_proj.weight', 'model.levels.3.blocks.2.dcn.value_proj.bias', 'model.levels.3.blocks.2.dcn.output_proj.weight', 'model.levels.3.blocks.2.norm2.0.weight', 'model.levels.3.blocks.2.norm2.0.bias', 'model.levels.3.blocks.2.mlp.fc1.weight', 'model.levels.3.blocks.2.mlp.fc1.bias', 'model.levels.3.blocks.2.mlp.fc2.weight', 'model.levels.3.blocks.3.norm1.0.weight', 'model.levels.3.blocks.3.norm1.0.bias', 'model.levels.3.blocks.3.dcn.offset_mask.weight', 'model.levels.3.blocks.3.dcn.offset_mask.bias', 'model.levels.3.blocks.3.dcn.value_proj.weight', 'model.levels.3.blocks.3.dcn.value_proj.bias', 'model.levels.3.blocks.3.dcn.output_proj.weight', 'model.levels.3.blocks.3.norm2.0.weight', 'model.levels.3.blocks.3.norm2.0.bias', 'model.levels.3.blocks.3.mlp.fc1.weight', 'model.levels.3.blocks.3.mlp.fc1.bias', 'model.levels.3.blocks.3.mlp.fc2.weight', 'model.levels.3.norm.0.weight', 'model.levels.3.norm.0.bias', 'model.conv_head.0.weight', 'model.conv_head.1.0.weight', 'model.conv_head.1.0.bias', 'model.conv_head.1.0.running_mean', 'model.conv_head.1.0.running_var']
Unexpected keys: ['patch_embed.conv1.weight', 'patch_embed.conv1.bias', 'patch_embed.norm1.1.weight', 'patch_embed.norm1.1.bias', 'patch_embed.conv2.weight', 'patch_embed.conv2.bias', 'patch_embed.norm2.1.weight', 'patch_embed.norm2.1.bias', 'levels.0.blocks.0.norm1.0.weight', 'levels.0.blocks.0.norm1.0.bias', 'levels.0.blocks.0.dcn.offset_mask.weight', 'levels.0.blocks.0.dcn.offset_mask.bias', 'levels.0.blocks.0.dcn.value_proj.weight', 'levels.0.blocks.0.dcn.value_proj.bias', 'levels.0.blocks.0.dcn.output_proj.weight', 'levels.0.blocks.0.norm2.0.weight', 'levels.0.blocks.0.norm2.0.bias', 'levels.0.blocks.0.mlp.fc1.weight', 'levels.0.blocks.0.mlp.fc1.bias', 'levels.0.blocks.0.mlp.fc2.weight', 'levels.0.blocks.1.norm1.0.weight', 'levels.0.blocks.1.norm1.0.bias', 'levels.0.blocks.1.dcn.offset_mask.weight', 'levels.0.blocks.1.dcn.offset_mask.bias', 'levels.0.blocks.1.dcn.value_proj.weight', 'levels.0.blocks.1.dcn.value_proj.bias', 'levels.0.blocks.1.dcn.output_proj.weight', 'levels.0.blocks.1.norm2.0.weight', 'levels.0.blocks.1.norm2.0.bias', 'levels.0.blocks.1.mlp.fc1.weight', 'levels.0.blocks.1.mlp.fc1.bias', 'levels.0.blocks.1.mlp.fc2.weight', 'levels.0.blocks.2.norm1.0.weight', 'levels.0.blocks.2.norm1.0.bias', 'levels.0.blocks.2.dcn.offset_mask.weight', 'levels.0.blocks.2.dcn.offset_mask.bias', 'levels.0.blocks.2.dcn.value_proj.weight', 'levels.0.blocks.2.dcn.value_proj.bias', 'levels.0.blocks.2.dcn.output_proj.weight', 'levels.0.blocks.2.norm2.0.weight', 'levels.0.blocks.2.norm2.0.bias', 'levels.0.blocks.2.mlp.fc1.weight', 'levels.0.blocks.2.mlp.fc1.bias', 'levels.0.blocks.2.mlp.fc2.weight', 'levels.0.blocks.3.norm1.0.weight', 'levels.0.blocks.3.norm1.0.bias', 'levels.0.blocks.3.dcn.offset_mask.weight', 'levels.0.blocks.3.dcn.offset_mask.bias', 'levels.0.blocks.3.dcn.value_proj.weight', 'levels.0.blocks.3.dcn.value_proj.bias', 'levels.0.blocks.3.dcn.output_proj.weight', 'levels.0.blocks.3.norm2.0.weight', 'levels.0.blocks.3.norm2.0.bias', 'levels.0.blocks.3.mlp.fc1.weight', 'levels.0.blocks.3.mlp.fc1.bias', 'levels.0.blocks.3.mlp.fc2.weight', 'levels.0.norm.0.weight', 'levels.0.norm.0.bias', 'levels.0.downsample.conv.weight', 'levels.0.downsample.norm.1.weight', 'levels.0.downsample.norm.1.bias', 'levels.1.blocks.0.norm1.0.weight', 'levels.1.blocks.0.norm1.0.bias', 'levels.1.blocks.0.dcn.offset_mask.weight', 'levels.1.blocks.0.dcn.offset_mask.bias', 'levels.1.blocks.0.dcn.value_proj.weight', 'levels.1.blocks.0.dcn.value_proj.bias', 'levels.1.blocks.0.dcn.output_proj.weight', 'levels.1.blocks.0.norm2.0.weight', 'levels.1.blocks.0.norm2.0.bias', 'levels.1.blocks.0.mlp.fc1.weight', 'levels.1.blocks.0.mlp.fc1.bias', 'levels.1.blocks.0.mlp.fc2.weight', 'levels.1.blocks.1.norm1.0.weight', 'levels.1.blocks.1.norm1.0.bias', 'levels.1.blocks.1.dcn.offset_mask.weight', 'levels.1.blocks.1.dcn.offset_mask.bias', 'levels.1.blocks.1.dcn.value_proj.weight', 'levels.1.blocks.1.dcn.value_proj.bias', 'levels.1.blocks.1.dcn.output_proj.weight', 'levels.1.blocks.1.norm2.0.weight', 'levels.1.blocks.1.norm2.0.bias', 'levels.1.blocks.1.mlp.fc1.weight', 'levels.1.blocks.1.mlp.fc1.bias', 'levels.1.blocks.1.mlp.fc2.weight', 'levels.1.blocks.2.norm1.0.weight', 'levels.1.blocks.2.norm1.0.bias', 'levels.1.blocks.2.dcn.offset_mask.weight', 'levels.1.blocks.2.dcn.offset_mask.bias', 'levels.1.blocks.2.dcn.value_proj.weight', 'levels.1.blocks.2.dcn.value_proj.bias', 'levels.1.blocks.2.dcn.output_proj.weight', 'levels.1.blocks.2.norm2.0.weight', 'levels.1.blocks.2.norm2.0.bias', 'levels.1.blocks.2.mlp.fc1.weight', 'levels.1.blocks.2.mlp.fc1.bias', 'levels.1.blocks.2.mlp.fc2.weight', 'levels.1.blocks.3.norm1.0.weight', 'levels.1.blocks.3.norm1.0.bias', 'levels.1.blocks.3.dcn.offset_mask.weight', 'levels.1.blocks.3.dcn.offset_mask.bias', 'levels.1.blocks.3.dcn.value_proj.weight', 'levels.1.blocks.3.dcn.value_proj.bias', 'levels.1.blocks.3.dcn.output_proj.weight', 'levels.1.blocks.3.norm2.0.weight', 'levels.1.blocks.3.norm2.0.bias', 'levels.1.blocks.3.mlp.fc1.weight', 'levels.1.blocks.3.mlp.fc1.bias', 'levels.1.blocks.3.mlp.fc2.weight', 'levels.1.norm.0.weight', 'levels.1.norm.0.bias', 'levels.1.downsample.conv.weight', 'levels.1.downsample.norm.1.weight', 'levels.1.downsample.norm.1.bias', 'levels.2.blocks.0.norm1.0.weight', 'levels.2.blocks.0.norm1.0.bias', 'levels.2.blocks.0.dcn.offset_mask.weight', 'levels.2.blocks.0.dcn.offset_mask.bias', 'levels.2.blocks.0.dcn.value_proj.weight', 'levels.2.blocks.0.dcn.value_proj.bias', 'levels.2.blocks.0.dcn.output_proj.weight', 'levels.2.blocks.0.norm2.0.weight', 'levels.2.blocks.0.norm2.0.bias', 'levels.2.blocks.0.mlp.fc1.weight', 'levels.2.blocks.0.mlp.fc1.bias', 'levels.2.blocks.0.mlp.fc2.weight', 'levels.2.blocks.1.norm1.0.weight', 'levels.2.blocks.1.norm1.0.bias', 'levels.2.blocks.1.dcn.offset_mask.weight', 'levels.2.blocks.1.dcn.offset_mask.bias', 'levels.2.blocks.1.dcn.value_proj.weight', 'levels.2.blocks.1.dcn.value_proj.bias', 'levels.2.blocks.1.dcn.output_proj.weight', 'levels.2.blocks.1.norm2.0.weight', 'levels.2.blocks.1.norm2.0.bias', 'levels.2.blocks.1.mlp.fc1.weight', 'levels.2.blocks.1.mlp.fc1.bias', 'levels.2.blocks.1.mlp.fc2.weight', 'levels.2.blocks.2.norm1.0.weight', 'levels.2.blocks.2.norm1.0.bias', 'levels.2.blocks.2.dcn.offset_mask.weight', 'levels.2.blocks.2.dcn.offset_mask.bias', 'levels.2.blocks.2.dcn.value_proj.weight', 'levels.2.blocks.2.dcn.value_proj.bias', 'levels.2.blocks.2.dcn.output_proj.weight', 'levels.2.blocks.2.norm2.0.weight', 'levels.2.blocks.2.norm2.0.bias', 'levels.2.blocks.2.mlp.fc1.weight', 'levels.2.blocks.2.mlp.fc1.bias', 'levels.2.blocks.2.mlp.fc2.weight', 'levels.2.blocks.3.norm1.0.weight', 'levels.2.blocks.3.norm1.0.bias', 'levels.2.blocks.3.dcn.offset_mask.weight', 'levels.2.blocks.3.dcn.offset_mask.bias', 'levels.2.blocks.3.dcn.value_proj.weight', 'levels.2.blocks.3.dcn.value_proj.bias', 'levels.2.blocks.3.dcn.output_proj.weight', 'levels.2.blocks.3.norm2.0.weight', 'levels.2.blocks.3.norm2.0.bias', 'levels.2.blocks.3.mlp.fc1.weight', 'levels.2.blocks.3.mlp.fc1.bias', 'levels.2.blocks.3.mlp.fc2.weight', 'levels.2.blocks.4.norm1.0.weight', 'levels.2.blocks.4.norm1.0.bias', 'levels.2.blocks.4.dcn.offset_mask.weight', 'levels.2.blocks.4.dcn.offset_mask.bias', 'levels.2.blocks.4.dcn.value_proj.weight', 'levels.2.blocks.4.dcn.value_proj.bias', 'levels.2.blocks.4.dcn.output_proj.weight', 'levels.2.blocks.4.norm2.0.weight', 'levels.2.blocks.4.norm2.0.bias', 'levels.2.blocks.4.mlp.fc1.weight', 'levels.2.blocks.4.mlp.fc1.bias', 'levels.2.blocks.4.mlp.fc2.weight', 'levels.2.blocks.5.norm1.0.weight', 'levels.2.blocks.5.norm1.0.bias', 'levels.2.blocks.5.dcn.offset_mask.weight', 'levels.2.blocks.5.dcn.offset_mask.bias', 'levels.2.blocks.5.dcn.value_proj.weight', 'levels.2.blocks.5.dcn.value_proj.bias', 'levels.2.blocks.5.dcn.output_proj.weight', 'levels.2.blocks.5.norm2.0.weight', 'levels.2.blocks.5.norm2.0.bias', 'levels.2.blocks.5.mlp.fc1.weight', 'levels.2.blocks.5.mlp.fc1.bias', 'levels.2.blocks.5.mlp.fc2.weight', 'levels.2.blocks.6.norm1.0.weight', 'levels.2.blocks.6.norm1.0.bias', 'levels.2.blocks.6.dcn.offset_mask.weight', 'levels.2.blocks.6.dcn.offset_mask.bias', 'levels.2.blocks.6.dcn.value_proj.weight', 'levels.2.blocks.6.dcn.value_proj.bias', 'levels.2.blocks.6.dcn.output_proj.weight', 'levels.2.blocks.6.norm2.0.weight', 'levels.2.blocks.6.norm2.0.bias', 'levels.2.blocks.6.mlp.fc1.weight', 'levels.2.blocks.6.mlp.fc1.bias', 'levels.2.blocks.6.mlp.fc2.weight', 'levels.2.blocks.7.norm1.0.weight', 'levels.2.blocks.7.norm1.0.bias', 'levels.2.blocks.7.dcn.offset_mask.weight', 'levels.2.blocks.7.dcn.offset_mask.bias', 'levels.2.blocks.7.dcn.value_proj.weight', 'levels.2.blocks.7.dcn.value_proj.bias', 'levels.2.blocks.7.dcn.output_proj.weight', 'levels.2.blocks.7.norm2.0.weight', 'levels.2.blocks.7.norm2.0.bias', 'levels.2.blocks.7.mlp.fc1.weight', 'levels.2.blocks.7.mlp.fc1.bias', 'levels.2.blocks.7.mlp.fc2.weight', 'levels.2.blocks.8.norm1.0.weight', 'levels.2.blocks.8.norm1.0.bias', 'levels.2.blocks.8.dcn.offset_mask.weight', 'levels.2.blocks.8.dcn.offset_mask.bias', 'levels.2.blocks.8.dcn.value_proj.weight', 'levels.2.blocks.8.dcn.value_proj.bias', 'levels.2.blocks.8.dcn.output_proj.weight', 'levels.2.blocks.8.norm2.0.weight', 'levels.2.blocks.8.norm2.0.bias', 'levels.2.blocks.8.mlp.fc1.weight', 'levels.2.blocks.8.mlp.fc1.bias', 'levels.2.blocks.8.mlp.fc2.weight', 'levels.2.blocks.9.norm1.0.weight', 'levels.2.blocks.9.norm1.0.bias', 'levels.2.blocks.9.dcn.offset_mask.weight', 'levels.2.blocks.9.dcn.offset_mask.bias', 'levels.2.blocks.9.dcn.value_proj.weight', 'levels.2.blocks.9.dcn.value_proj.bias', 'levels.2.blocks.9.dcn.output_proj.weight', 'levels.2.blocks.9.norm2.0.weight', 'levels.2.blocks.9.norm2.0.bias', 'levels.2.blocks.9.mlp.fc1.weight', 'levels.2.blocks.9.mlp.fc1.bias', 'levels.2.blocks.9.mlp.fc2.weight', 'levels.2.blocks.10.norm1.0.weight', 'levels.2.blocks.10.norm1.0.bias', 'levels.2.blocks.10.dcn.offset_mask.weight', 'levels.2.blocks.10.dcn.offset_mask.bias', 'levels.2.blocks.10.dcn.value_proj.weight', 'levels.2.blocks.10.dcn.value_proj.bias', 'levels.2.blocks.10.dcn.output_proj.weight', 'levels.2.blocks.10.norm2.0.weight', 'levels.2.blocks.10.norm2.0.bias', 'levels.2.blocks.10.mlp.fc1.weight', 'levels.2.blocks.10.mlp.fc1.bias', 'levels.2.blocks.10.mlp.fc2.weight', 'levels.2.blocks.11.norm1.0.weight', 'levels.2.blocks.11.norm1.0.bias', 'levels.2.blocks.11.dcn.offset_mask.weight', 'levels.2.blocks.11.dcn.offset_mask.bias', 'levels.2.blocks.11.dcn.value_proj.weight', 'levels.2.blocks.11.dcn.value_proj.bias', 'levels.2.blocks.11.dcn.output_proj.weight', 'levels.2.blocks.11.norm2.0.weight', 'levels.2.blocks.11.norm2.0.bias', 'levels.2.blocks.11.mlp.fc1.weight', 'levels.2.blocks.11.mlp.fc1.bias', 'levels.2.blocks.11.mlp.fc2.weight', 'levels.2.blocks.12.norm1.0.weight', 'levels.2.blocks.12.norm1.0.bias', 'levels.2.blocks.12.dcn.offset_mask.weight', 'levels.2.blocks.12.dcn.offset_mask.bias', 'levels.2.blocks.12.dcn.value_proj.weight', 'levels.2.blocks.12.dcn.value_proj.bias', 'levels.2.blocks.12.dcn.output_proj.weight', 'levels.2.blocks.12.norm2.0.weight', 'levels.2.blocks.12.norm2.0.bias', 'levels.2.blocks.12.mlp.fc1.weight', 'levels.2.blocks.12.mlp.fc1.bias', 'levels.2.blocks.12.mlp.fc2.weight', 'levels.2.blocks.13.norm1.0.weight', 'levels.2.blocks.13.norm1.0.bias', 'levels.2.blocks.13.dcn.offset_mask.weight', 'levels.2.blocks.13.dcn.offset_mask.bias', 'levels.2.blocks.13.dcn.value_proj.weight', 'levels.2.blocks.13.dcn.value_proj.bias', 'levels.2.blocks.13.dcn.output_proj.weight', 'levels.2.blocks.13.norm2.0.weight', 'levels.2.blocks.13.norm2.0.bias', 'levels.2.blocks.13.mlp.fc1.weight', 'levels.2.blocks.13.mlp.fc1.bias', 'levels.2.blocks.13.mlp.fc2.weight', 'levels.2.blocks.14.norm1.0.weight', 'levels.2.blocks.14.norm1.0.bias', 'levels.2.blocks.14.dcn.offset_mask.weight', 'levels.2.blocks.14.dcn.offset_mask.bias', 'levels.2.blocks.14.dcn.value_proj.weight', 'levels.2.blocks.14.dcn.value_proj.bias', 'levels.2.blocks.14.dcn.output_proj.weight', 'levels.2.blocks.14.norm2.0.weight', 'levels.2.blocks.14.norm2.0.bias', 'levels.2.blocks.14.mlp.fc1.weight', 'levels.2.blocks.14.mlp.fc1.bias', 'levels.2.blocks.14.mlp.fc2.weight', 'levels.2.blocks.15.norm1.0.weight', 'levels.2.blocks.15.norm1.0.bias', 'levels.2.blocks.15.dcn.offset_mask.weight', 'levels.2.blocks.15.dcn.offset_mask.bias', 'levels.2.blocks.15.dcn.value_proj.weight', 'levels.2.blocks.15.dcn.value_proj.bias', 'levels.2.blocks.15.dcn.output_proj.weight', 'levels.2.blocks.15.norm2.0.weight', 'levels.2.blocks.15.norm2.0.bias', 'levels.2.blocks.15.mlp.fc1.weight', 'levels.2.blocks.15.mlp.fc1.bias', 'levels.2.blocks.15.mlp.fc2.weight', 'levels.2.blocks.16.norm1.0.weight', 'levels.2.blocks.16.norm1.0.bias', 'levels.2.blocks.16.dcn.offset_mask.weight', 'levels.2.blocks.16.dcn.offset_mask.bias', 'levels.2.blocks.16.dcn.value_proj.weight', 'levels.2.blocks.16.dcn.value_proj.bias', 'levels.2.blocks.16.dcn.output_proj.weight', 'levels.2.blocks.16.norm2.0.weight', 'levels.2.blocks.16.norm2.0.bias', 'levels.2.blocks.16.mlp.fc1.weight', 'levels.2.blocks.16.mlp.fc1.bias', 'levels.2.blocks.16.mlp.fc2.weight', 'levels.2.blocks.17.norm1.0.weight', 'levels.2.blocks.17.norm1.0.bias', 'levels.2.blocks.17.dcn.offset_mask.weight', 'levels.2.blocks.17.dcn.offset_mask.bias', 'levels.2.blocks.17.dcn.value_proj.weight', 'levels.2.blocks.17.dcn.value_proj.bias', 'levels.2.blocks.17.dcn.output_proj.weight', 'levels.2.blocks.17.norm2.0.weight', 'levels.2.blocks.17.norm2.0.bias', 'levels.2.blocks.17.mlp.fc1.weight', 'levels.2.blocks.17.mlp.fc1.bias', 'levels.2.blocks.17.mlp.fc2.weight', 'levels.2.norm.0.weight', 'levels.2.norm.0.bias', 'levels.2.downsample.conv.weight', 'levels.2.downsample.norm.1.weight', 'levels.2.downsample.norm.1.bias', 'levels.3.blocks.0.norm1.0.weight', 'levels.3.blocks.0.norm1.0.bias', 'levels.3.blocks.0.dcn.offset_mask.weight', 'levels.3.blocks.0.dcn.offset_mask.bias', 'levels.3.blocks.0.dcn.value_proj.weight', 'levels.3.blocks.0.dcn.value_proj.bias', 'levels.3.blocks.0.dcn.output_proj.weight', 'levels.3.blocks.0.norm2.0.weight', 'levels.3.blocks.0.norm2.0.bias', 'levels.3.blocks.0.mlp.fc1.weight', 'levels.3.blocks.0.mlp.fc1.bias', 'levels.3.blocks.0.mlp.fc2.weight', 'levels.3.blocks.1.norm1.0.weight', 'levels.3.blocks.1.norm1.0.bias', 'levels.3.blocks.1.dcn.offset_mask.weight', 'levels.3.blocks.1.dcn.offset_mask.bias', 'levels.3.blocks.1.dcn.value_proj.weight', 'levels.3.blocks.1.dcn.value_proj.bias', 'levels.3.blocks.1.dcn.output_proj.weight', 'levels.3.blocks.1.norm2.0.weight', 'levels.3.blocks.1.norm2.0.bias', 'levels.3.blocks.1.mlp.fc1.weight', 'levels.3.blocks.1.mlp.fc1.bias', 'levels.3.blocks.1.mlp.fc2.weight', 'levels.3.blocks.2.norm1.0.weight', 'levels.3.blocks.2.norm1.0.bias', 'levels.3.blocks.2.dcn.offset_mask.weight', 'levels.3.blocks.2.dcn.offset_mask.bias', 'levels.3.blocks.2.dcn.value_proj.weight', 'levels.3.blocks.2.dcn.value_proj.bias', 'levels.3.blocks.2.dcn.output_proj.weight', 'levels.3.blocks.2.norm2.0.weight', 'levels.3.blocks.2.norm2.0.bias', 'levels.3.blocks.2.mlp.fc1.weight', 'levels.3.blocks.2.mlp.fc1.bias', 'levels.3.blocks.2.mlp.fc2.weight', 'levels.3.blocks.3.norm1.0.weight', 'levels.3.blocks.3.norm1.0.bias', 'levels.3.blocks.3.dcn.offset_mask.weight', 'levels.3.blocks.3.dcn.offset_mask.bias', 'levels.3.blocks.3.dcn.value_proj.weight', 'levels.3.blocks.3.dcn.value_proj.bias', 'levels.3.blocks.3.dcn.output_proj.weight', 'levels.3.blocks.3.norm2.0.weight', 'levels.3.blocks.3.norm2.0.bias', 'levels.3.blocks.3.mlp.fc1.weight', 'levels.3.blocks.3.mlp.fc1.bias', 'levels.3.blocks.3.mlp.fc2.weight', 'levels.3.norm.0.weight', 'levels.3.norm.0.bias', 'conv_head.0.weight', 'conv_head.1.0.weight', 'conv_head.1.0.bias', 'conv_head.1.0.running_mean', 'conv_head.1.0.running_var', 'conv_head.1.0.num_batches_tracked', 'head.weight', 'head.bias']

可以看到,我的模型的名字每一层都比预训练的权重多了一个'model.',这就导致了无法加载权重。

于是就把预训练的权重的键名加上'model.'即可。

python 复制代码
        model_weight= {'model.' + key: value for key, value in model_weight.items()}

然后重新调试,可以看到输出:

Missing keys: []
Unexpected keys: ['model.head.weight', 'model.head.bias']

可以看到Missing keys为空,所以需要的权重全部加载了。

相关推荐
查理零世26 分钟前
保姆级讲解 python之zip()方法实现矩阵行列转置
python·算法·矩阵
刀客12337 分钟前
python3+TensorFlow 2.x(四)反向传播
人工智能·python·tensorflow
SpikeKing43 分钟前
LLM - 大模型 ScallingLaws 的设计 100B 预训练方案(PLM) 教程(5)
人工智能·llm·预训练·scalinglaws·100b·deepnorm·egs
小枫@码1 小时前
免费GPU算力,不花钱部署DeepSeek-R1
人工智能·语言模型
liruiqiang051 小时前
机器学习 - 初学者需要弄懂的一些线性代数的概念
人工智能·线性代数·机器学习·线性回归
Icomi_1 小时前
【外文原版书阅读】《机器学习前置知识》1.线性代数的重要性,初识向量以及向量加法
c语言·c++·人工智能·深度学习·神经网络·机器学习·计算机视觉
微学AI1 小时前
GPU算力平台|在GPU算力平台部署可图大模型Kolors的应用实战教程
人工智能·大模型·llm·gpu算力
西猫雷婶1 小时前
python学opencv|读取图像(四十六)使用cv2.bitwise_or()函数实现图像按位或运算
人工智能·opencv·计算机视觉
IT古董1 小时前
【深度学习】常见模型-生成对抗网络(Generative Adversarial Network, GAN)
人工智能·深度学习·生成对抗网络
Jackilina_Stone1 小时前
【论文阅读笔记】“万字”关于深度学习的图像和视频阴影检测、去除和生成的综述笔记 | 2024.9.3
论文阅读·人工智能·笔记·深度学习·ai