LLAMA Factory 微调Qwen2.0-VL-2B视觉大模型

辣大辣条2025-10-22 17:21

本文主要参考：https://developer.aliyun.com/article/1643200

LLAMA Factory安装和使用参考：https://blog.csdn.net/pangxing6491/article/details/153682310?spm=1001.2014.3001.5501

一、制作数据集

1、增加mllm_data_tiny文件夹，并且修改dataset_info.json

2、mllm_data_tiny中包含图片和一个mllm_data_tiny.json文件，文件夹的名称和json文件名保持一致；

3、mllm_data_tiny.json格式如下，messages字段包含问题和答案，images包含图片的路径；

python 复制代码

[
  {
    "messages": [
      {
        "content": "<image>图片中的诊断结果是怎样?",
        "role": "user"
      },
      {
        "content": "The image is a non-contrasted computed tomography (CT) scan of the brain, showing the cerebral structures without any medical devices present. The region of interest, located centrally and in the middle of the image, exhibits an area of altered density, which is indicative of a brain hemorrhage. This area is distinct from the surrounding brain tissue, suggesting a possible hematoma or bleeding within the brain parenchyma. The location and characteristics of this abnormality may suggest a relationship with the surrounding brain tissue, potentially causing a mass effect or contributing to increased intracranial pressure.",
        "role": "assistant"
      }
    ],
    "images": [
      "mllm_data_tiny/8031efe0-1b5c-11ef-8929-000066532cad.jpg"
    ]
  }
]

二、选择基础大模型和目标数据集进行训练；

具体参考：https://blog.csdn.net/pangxing6491/article/details/153682310?spm=1001.2014.3001.5501