Parsing the model.safetensors.index.json File of the Gemma2 2B Model

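When working with Gemma2 2B or other large pretrained models, the model.safetensors.index.json file acts as an index: it lists every parameter in the model and records which shard file stores its weights, which is what a loader needs in order to find and load them. This post walks through the file's contents and what they are used for.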

The files downloaded to the local machine are shown below:

[Figure: listing of the downloaded model files]

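1. File Structure Overview

The model.safetensors.index.json file has two top-level sections:

  1. Metadata: holds the total size of the model weights.
  2. Weight Map: maps each model parameter to the file that actually stores it.

A sample of its contents: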

{
  "metadata": {
    "total_size": 10457367552
  },
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors"
  }
}
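
The two sections can be inspected with a few lines of Python. This is a minimal sketch using only the standard library; the sample is inlined as a string so the snippet runs on its own (an assumption made for illustration), whereas in practice the real file would be read from disk.

```python
import json

# Inlined copy of the sample above. In practice, read the real file instead:
#   with open("model.safetensors.index.json") as f:
#       index = json.load(f)
INDEX_TEXT = """
{
  "metadata": {"total_size": 10457367552},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors"
  }
}
"""

index = json.loads(INDEX_TEXT)
total_size = index["metadata"]["total_size"]
weight_map = index["weight_map"]
shards = sorted(set(weight_map.values()))  # distinct shard files referenced

print("top-level keys:", sorted(index))
print("total_size:", total_size)
print("parameters listed:", len(weight_map))
print("shards referenced:", shards)
```

Only two of the three shards show up here because the sample lists just three parameters; the full file in the appendix references all three.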

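2. Metadata

total_size

  • Purpose: the total size of the model's parameter data, in bytes.
  • Example: 10457367552 bytes is about 10.46 GB (or about 9.74 GiB).
  • Why it matters
    1. It helps users estimate storage requirements.
    2. It gives a rough check that the download is complete, by comparing against the expected size.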
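
The unit conversion is easy to get wrong, so here is the arithmetic spelled out. The last step assumes every tensor is stored as float32 (4 bytes per value); the index file itself does not record the dtype, so treat that figure as an estimate under that assumption.

```python
total_size = 10457367552  # bytes, from the "metadata" section

gb = total_size / 10**9    # decimal gigabytes
gib = total_size / 2**30   # binary gibibytes

# Assumption: all tensors are float32, i.e. 4 bytes per stored value.
params_if_fp32 = total_size // 4

print(f"{gb:.2f} GB, {gib:.2f} GiB")
print(f"{params_if_fp32:,} values if stored as float32")
```

Under the float32 assumption this works out to roughly 2.6 billion values, which is the right order of magnitude for a model in the 2B class.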

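3. Weight Map

weight_map

  • Purpose
    Maps each parameter of the model to a specific .safetensors file.
  • Format
    • Key: the parameter's name, which reflects where the weight sits in the model.
    • Value: the .safetensors file that stores that weight.
  • Example entries
    • model.embed_tokens.weight: the embedding-layer weights are stored in model-00001-of-00003.safetensors.
    • model.layers.0.mlp.up_proj.weight (taken from the full file in the appendix): the up-projection matrix of the layer 0 MLP is in model-00001-of-00003.safetensors.
    • model.layers.10.mlp.down_proj.weight: the down-projection matrix of the layer 10 MLP is in model-00002-of-00003.safetensors.

Uses

  1. Sharded storage: a large model is split into several smaller files that are easier to manage and load.
  2. Incremental updates: a single shard can be replaced without rewriting the whole model, as long as the index is kept consistent.
  3. On-demand loading: only the shards that hold the needed parameters have to be read.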
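
To see what the map implies for on-demand loading, it helps to invert it. The sketch below groups parameter names by shard and reports which layers are spread over more than one file. The dictionary is a small excerpt of the full file in the appendix; the variable names are my own.

```python
import re
from collections import defaultdict

# Small excerpt of the full weight_map from the appendix.
weight_map = {
    "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.8.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.norm.weight": "model-00003-of-00003.safetensors",
}

# Invert the map: shard file -> parameter names stored in it.
params_by_shard = defaultdict(list)
for name, shard in weight_map.items():
    params_by_shard[shard].append(name)

# Collect the set of shards that each transformer layer touches.
shards_by_layer = defaultdict(set)
for name, shard in weight_map.items():
    match = re.match(r"model\.layers\.(\d+)\.", name)
    if match:
        shards_by_layer[int(match.group(1))].add(shard)

split_layers = sorted(layer for layer, s in shards_by_layer.items() if len(s) > 1)

for shard in sorted(params_by_shard):
    print(shard, len(params_by_shard[shard]), "tensors")
print("layers spanning more than one shard:", split_layers)
```

In the full file, layers 8 and 24 each span two shards, which suggests the shards were cut by size rather than at layer boundaries. Loading "one layer" can therefore mean opening two files.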

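4. Model Sharding

Why shard?

  1. Storage limits: a single very large file may exceed file-system limits.
  2. Loading efficiency: shards can be loaded on demand, which lowers peak memory use.
  3. Distributed training: multiple GPUs or nodes can, in principle, process different parameter shards in parallel.

How are shards located?

  • File naming convention: model-<index>-of-<total>.safetensors
    • model-00001-of-00003.safetensors is shard 1 of 3.
  • The index file maps every parameter name to exactly one shard file; each shard file holds many parameters.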
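
The naming convention can be parsed mechanically, for example to spot a shard that is missing from a download. A small sketch: the five-digit zero padding is taken from the filenames in this particular index and is assumed, not guaranteed, to hold elsewhere, and both helper functions are hypothetical names of my own.

```python
import re

# Pattern assumed from the filenames in this index: five-digit, zero-padded numbers.
SHARD_PATTERN = re.compile(r"^model-(\d{5})-of-(\d{5})\.safetensors$")

def parse_shard_name(filename):
    """Return (index, total) for a shard filename, or None if it does not match."""
    match = SHARD_PATTERN.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def missing_shards(filenames):
    """Return the shard numbers implied by the naming scheme but absent from the list."""
    parsed = [p for p in map(parse_shard_name, filenames) if p is not None]
    if not parsed:
        return []
    total = parsed[0][1]
    present = {index for index, _ in parsed}
    return [i for i in range(1, total + 1) if i not in present]

print(parse_shard_name("model-00001-of-00003.safetensors"))
print(missing_shards([
    "model-00001-of-00003.safetensors",
    "model-00003-of-00003.safetensors",
]))
```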

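5. The Safetensors Format

Advantages

  1. Safety: the file contains only tensor data and a description of it, with no executable code, so loading weights cannot trigger the arbitrary code execution that pickle-based checkpoints allow.
  2. Efficiency: a binary storage format that supports fast reads and writes.
  3. Cross-platform compatibility: works in both CPU and GPU environments.

Loading example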

from safetensors.torch import load_file

# Load one specific shard; the result maps tensor names to tensors
weights = load_file("model-00001-of-00003.safetensors")
print(weights.keys())
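
The safety and speed claims are easier to believe once you see how simple the file layout is. As far as the published format description goes, a .safetensors file starts with an 8-byte little-endian length, followed by a JSON header of that length, followed by the raw tensor bytes. The sketch below builds a tiny stand-in file by hand and reads back only its header, using the standard library alone; `read_safetensors_header` and the `demo.weight` tensor are hypothetical names for illustration.

```python
import json
import struct
import tempfile
from pathlib import Path

def read_safetensors_header(path):
    """Read only the JSON header of a .safetensors file, without tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte little-endian length
        return json.loads(f.read(header_len).decode("utf-8"))

# Hand-built stand-in file: one float32 tensor of shape [2], i.e. 8 data bytes.
header = {"demo.weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")
payload = struct.pack("<2f", 1.0, 2.0)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.safetensors"
    path.write_bytes(struct.pack("<Q", len(header_bytes)) + header_bytes + payload)
    info = read_safetensors_header(path)

for name, meta in info.items():
    if name == "__metadata__":  # optional free-form metadata entry
        continue
    start, end = meta["data_offsets"]
    print(name, meta["dtype"], meta["shape"], end - start, "bytes")
```

Because the header is plain JSON and the rest is raw bytes, a reader can list tensor names, shapes, and sizes without touching the weights, and nothing in the file is ever executed.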

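6. Practical Use Cases

1. Model loading

  1. Read the shard information from model.safetensors.index.json.
  2. Load only the shards that are needed onto the GPU, which reduces memory use.
  3. Merge the loaded parameters into a single state dict to restore the complete model.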
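
The three steps above can be sketched as a small loop. This is not how any particular library implements loading; it is an illustration in which the shard loader is passed in as a function, so the snippet runs without model files. `load_selected`, `fake_load_shard`, and `FAKE_SHARDS` are hypothetical names; with real files, `safetensors.torch.load_file` could be passed as the loader.

```python
from collections import defaultdict

def load_selected(weight_map, wanted, load_shard):
    """Load the tensors named in `wanted`, opening each shard file only once.

    `load_shard` is any callable that maps a shard filename to a dict of
    {tensor name: tensor}.
    """
    # Step 1: use the index to find which shard holds each wanted tensor.
    names_by_shard = defaultdict(list)
    for name in wanted:
        names_by_shard[weight_map[name]].append(name)  # KeyError if name is unknown

    # Steps 2 and 3: load only those shards and merge into one state dict.
    state_dict = {}
    for shard, names in names_by_shard.items():
        tensors = load_shard(shard)
        for name in names:
            state_dict[name] = tensors[name]
    return state_dict

# Hypothetical stand-in loader so the sketch runs without model files.
FAKE_SHARDS = {
    "model-00001-of-00003.safetensors": {"model.embed_tokens.weight": [0.1, 0.2]},
    "model-00003-of-00003.safetensors": {"model.norm.weight": [1.0]},
}
opened = []

def fake_load_shard(filename):
    opened.append(filename)
    return FAKE_SHARDS[filename]

weight_map = {
    "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
    "model.norm.weight": "model-00003-of-00003.safetensors",
}
state = load_selected(weight_map, ["model.norm.weight"], fake_load_shard)
print(sorted(state), opened)
```

Note that a loader like `load_file` reads a whole shard. To read individual tensors without loading the rest of the file, the safetensors library also offers `safe_open` with `get_tensor`.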

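2. File consistency check

  • Compare total_size with the combined size of the downloaded shard files to catch incomplete downloads. Because each shard also carries a header, the on-disk total is typically slightly larger than total_size rather than exactly equal.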
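
A basic version of this check confirms that every shard named in the index exists and adds up the sizes found on disk. The sketch below uses a hypothetical two-shard index and a dummy file so that it runs anywhere; `check_shards` is my own helper, not a library function. A size comparison only catches gross problems, and a checksum comparison is a stronger test where checksums are published.

```python
import tempfile
from pathlib import Path

def check_shards(model_dir, index):
    """Return (missing shard names, total bytes on disk of the shards found)."""
    model_dir = Path(model_dir)
    referenced = sorted(set(index["weight_map"].values()))
    missing = [name for name in referenced if not (model_dir / name).is_file()]
    on_disk = sum(
        (model_dir / name).stat().st_size
        for name in referenced
        if name not in missing
    )
    return missing, on_disk

# Demo: a hypothetical two-shard index with only one (dummy) shard on disk.
index = {
    "metadata": {"total_size": 16},
    "weight_map": {
        "a.weight": "model-00001-of-00002.safetensors",
        "b.weight": "model-00002-of-00002.safetensors",
    },
}
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model-00001-of-00002.safetensors").write_bytes(b"\0" * 10)
    missing, on_disk = check_shards(tmp, index)

print("missing:", missing)
print("bytes on disk:", on_disk, "expected at least:", index["metadata"]["total_size"])
```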

3. Fine-tuning

  • Users can load only the weights of specific layers for fine-tuning and skip parameters they do not need.
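
Selecting one layer's parameters comes down to filtering the keys of weight_map, with one pitfall: a prefix without a trailing dot makes layer 1 match layers 10 through 19 as well. A minimal sketch, using an excerpt of the full file in the appendix; `layer_keys` is a hypothetical helper.

```python
def layer_keys(weight_map, layer):
    """Return the parameter names belonging to one transformer layer."""
    prefix = f"model.layers.{layer}."  # trailing dot: layer 1 must not match layer 10
    return [name for name in weight_map if name.startswith(prefix)]

# Excerpt of the full weight_map from the appendix.
weight_map = {
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
}

for layer in (1, 24):
    keys = layer_keys(weight_map, layer)
    shards = sorted({weight_map[k] for k in keys})
    print(layer, keys, shards)
```

Layer 24 resolves to two shard files here, so fine-tuning that layer alone still requires opening both of them.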

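7. Summary

The model.safetensors.index.json file is a key tool for managing the weights of large models, including multi-layer networks such as Gemma2 2B. Reading it reveals the model's storage layout and sharding strategy, and shows how to load and manage the weights efficiently.

Key points

  1. The metadata section gives the total size, which helps with storage planning and integrity checks.
  2. The weight map records which file stores each model parameter, which enables flexible loading.
  3. The Safetensors format improves loading speed and safety, making it well suited to deploying large models.

I hope this post helps you better understand what model.safetensors.index.json does and how it works, and that it proves useful in your model development and deployment work!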

Postscript

Completed in Shanghai at 13:45 on December 30, 2024, with the assistance of the GPT-4o large language model.

Appendix

Below is the complete model.safetensors.index.json file of the Gemma2 2B model:

{
  "metadata": {
    "total_size": 10457367552
  },
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
    "model.layers.24.post_feedforward_layernorm.weight": "model-00003-of-00003.safetensors",
    "model.layers.24.pre_feedforward_layernorm.weight": "model-00003-of-00003.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.post_feedforward_layernorm.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.pre_feedforward_layernorm.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
    "model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.8.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.8.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.8.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.8.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.8.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.9.input_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "model.layers.9.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
    "model.norm.weight": "model-00003-of-00003.safetensors"
  }
}

For reference only.
