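1. Task overview
- Follow the tutorial document and video to fine-tune the model with QLoRA, reproduce the fine-tuning results, and get the model to tell a joke about a meme image.
- Try LoRA, or adjust the xtuner config (for example the LoRA rank or the learning rate). Observe how the model loss changes and record the results after the adjustment. (Optional; either using LoRA or adjusting the config is enough.)
2. Set up the environment following the documentation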
Tutorial/docs/L2/InternVL/joke_readme.md at camp3 · InternLM/Tutorial · GitHub
3. Inference and deployment with InternVL
3.1 Inference with the pipeline
3.1.1 Create test_lmdeploy.py and the image used for inference
[Screenshot: test_lmdeploy.py and the test image]
from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Load the base InternVL2-2B model
pipe = pipeline('/root/model/InternVL2-2B')
# Load the test image
image = load_image('/root/InternLM/007aPnLRgy1hb39z0im50j30ci0el0wm.jpg')
# Prompt: "Based on this image, tell a wildly imaginative joke"
response = pipe(('请你根据这张图片,讲一个脑洞大开的梗', image))
print(response.text)
3.1.2 Inference result
[Screenshot: inference output of the base model]
4. InternVL fine-tuning guide
4.1 Prepare the dataset
# For efficient training, make sure the data follows this format:
{
"id": "000000033471",
"image": ["coco/train2017/000000033471.jpg"], # for text-only samples, set this field to None or omit it
"conversations": [
{
"from": "human",
"value": "<image>\nWhat are the colors of the bus in the image?"
},
{
"from": "gpt",
"value": "The bus in the image is white and red."
}
]
}
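The record format above can be sketched in Python as a small builder plus a sanity check. The helper names `make_record` and `check_record` are hypothetical, and the rule that turns alternate between `human` and `gpt` is an assumption inferred from the single example, not a documented requirement.

```python
import json

def make_record(sample_id, image_path, question, answer):
    """Build one training record in the conversation format shown above."""
    return {
        "id": sample_id,
        "image": [image_path],
        "conversations": [
            {"from": "human", "value": "<image>\n" + question},
            {"from": "gpt", "value": answer},
        ],
    }

def check_record(record):
    """Return a list of problems found in a record (empty means it looks fine)."""
    problems = []
    for key in ("id", "conversations"):
        if key not in record:
            problems.append(f"missing field: {key}")
    turns = record.get("conversations", [])
    # Assumption: speakers alternate, starting with the human turn.
    for i, turn in enumerate(turns):
        expected = "human" if i % 2 == 0 else "gpt"
        if turn.get("from") != expected:
            problems.append(f"turn {i}: expected from={expected!r}")
    # A record that carries an image should reference it in the first turn.
    if record.get("image") and turns and "<image>" not in turns[0].get("value", ""):
        problems.append("image given but first turn lacks the <image> tag")
    return problems

record = make_record(
    "000000033471",
    "coco/train2017/000000033471.jpg",
    "What are the colors of the bus in the image?",
    "The bus in the image is white and red.",
)
print(json.dumps(record, ensure_ascii=False, indent=2))
print(check_record(record))
```

Running a check like this over the whole dataset before training is cheaper than finding a malformed sample partway through a run.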
4.2 Configure the fine-tuning parameters
Next, modify the InternVL config that ships with XTuner. The file is at: /root/InternLM/code/XTuner/xtuner/configs/internvl/v2/internvl_v2_internlm2_2b_qlora_finetune.py
[Screenshot: the modified config]
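The two knobs the task asks to vary (LoRA rank and learning rate) can be laid out as a small experiment grid. This is a sketch, not a copy of the XTuner config file: the keys `r`, `lora_alpha`, and `lora_dropout` follow the field names of peft's `LoraConfig`, while the helper `lora_settings`, the variable name `lr`, every numeric value, and the alpha = 2 × rank convention are assumptions made for illustration.

```python
def lora_settings(rank, alpha_ratio=2, dropout=0.05):
    """Return LoRA hyperparameters for a given rank (hypothetical helper)."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return {"r": rank, "lora_alpha": rank * alpha_ratio, "lora_dropout": dropout}

# Hypothetical experiment grid: change one setting per run so that
# differences in the loss curve can be attributed to that setting.
experiments = [
    {"name": "baseline", "lr": 2e-5, "lora": lora_settings(128)},
    {"name": "small_rank", "lr": 2e-5, "lora": lora_settings(32)},
    {"name": "higher_lr", "lr": 1e-4, "lora": lora_settings(128)},
]

for exp in experiments:
    print(exp["name"], exp["lr"], exp["lora"])
```

Each entry would correspond to one edited copy of the config and one training run with its own `--work-dir`.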
4.3 Start training
NPROC_PER_NODE=1 xtuner train /root/InternLM/code/XTuner/xtuner/configs/internvl/v2/internvl_v2_internlm2_2b_qlora_finetune.py --work-dir /root/InternLM/work_dir/internvl_ft_run_8_filter --deepspeed deepspeed_zero1
Training cannot run on the 30% resource tier.
[Screenshot: failed training run on the 30% tier]
After upgrading to the 50% resource tier, training runs.
[Screenshots: training run on the 50% tier]
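To see how the loss changes over a run, the loss values can be pulled out of the training log. The sketch below assumes each training log line contains a `loss: <number>` field; the sample lines are made up to exercise the parser and are not real output from this run.

```python
import re

# Matches "loss: 1.23" or "loss: 1.2e-3"; the word boundary skips e.g. "val_loss:".
LOSS_RE = re.compile(r"\bloss:\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)")

def extract_losses(log_text):
    """Collect the loss value from every log line that reports one."""
    losses = []
    for line in log_text.splitlines():
        match = LOSS_RE.search(line)
        if match:
            losses.append(float(match.group(1)))
    return losses

def moving_average(values, window=3):
    """Smooth a noisy loss curve with a trailing moving average."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Made-up log lines, for illustration only.
sample_log = """\
Iter(train) [  10/3000]  lr: 1.0e-05  loss: 2.50
Iter(train) [  20/3000]  lr: 2.0e-05  loss: 2.00
some unrelated line
Iter(train) [  30/3000]  lr: 2.0e-05  loss: 1.50
"""

losses = extract_losses(sample_log)
print(losses)
print(moving_average(losses, window=2))
```

Applying the same two functions to the logs of runs with different LoRA ranks or learning rates gives curves that can be compared directly.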
4.4 Merge weights and convert the model
python3 xtuner/configs/internvl/v1_5/convert_to_official.py xtuner/configs/internvl/v2/internvl_v2_internlm2_2b_qlora_finetune.py /root/InternLM/work_dir/internvl_ft_run_8_filter/iter_3000.pth /root/InternLM/InternVL2-2B/
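The converted model is written to the output directory given as the last argument of the command above (here /root/InternLM/InternVL2-2B/; the original note says /root/InternLM/convert_model/), with the following files: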
.
|-- added_tokens.json
|-- config.json
|-- configuration_intern_vit.py
|-- configuration_internlm2.py
|-- configuration_internvl_chat.py
|-- conversation.py
|-- generation_config.json
|-- model.safetensors
|-- modeling_intern_vit.py
|-- modeling_internlm2.py
|-- modeling_internvl_chat.py
|-- special_tokens_map.json
|-- tokenization_internlm2.py
|-- tokenizer.model
`-- tokenizer_config.json
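Before pointing the pipeline at the converted directory, it can help to confirm that the conversion actually wrote the key files from the listing above. The helper `missing_files` is hypothetical, and which files count as essential is an assumption. The demo uses a temporary directory so the sketch runs anywhere.

```python
from pathlib import Path
import tempfile

# Subset of the listing above, chosen as an assumption about what must exist.
EXPECTED_FILES = [
    "config.json",
    "generation_config.json",
    "model.safetensors",
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
]

def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]

# Demo: create all but the last expected file, then check.
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_FILES[:-1]:
        (Path(tmp) / name).touch()
    demo_missing = missing_files(tmp)

print(demo_missing)
```

On the training machine, the call would be `missing_files("/root/InternLM/InternVL2-2B")`, where an empty list means every expected file is present.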
4.5 Comparing results after fine-tuning
Replace the contents of test_lmdeploy.py with the code below, then run it to see the effect.
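from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Load the converted (fine-tuned) model produced in 4.4
pipe = pipeline('/root/InternLM/InternVL2-2B')
# Same test image as before, so the outputs can be compared
image = load_image('/root/InternLM/007aPnLRgy1hb39z0im50j30ci0el0wm.jpg')
# Prompt: "Based on this image, tell a wildly imaginative joke"
response = pipe(('请你根据这张图片,讲一个脑洞大开的梗', image))
print(response.text)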
[Screenshot: inference output after fine-tuning]
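The before/after comparison can also be scripted so that both models see exactly the same prompt and image. In this sketch `compare_models`, `run_with_lmdeploy`, and `fake_run` are hypothetical names. `run_with_lmdeploy` reuses only the lmdeploy calls already shown in test_lmdeploy.py, and the stand-in runner is there so the sketch executes without a GPU.

```python
def compare_models(model_paths, image_path, prompt, run_fn):
    """Run the same prompt and image through each model and collect the replies."""
    results = {}
    for label, path in model_paths.items():
        results[label] = run_fn(path, image_path, prompt)
    return results

def run_with_lmdeploy(model_path, image_path, prompt):
    """Run one model using the same lmdeploy calls as test_lmdeploy.py."""
    # Imported lazily: requires lmdeploy and a GPU.
    from lmdeploy import pipeline
    from lmdeploy.vl import load_image
    pipe = pipeline(model_path)
    return pipe((prompt, load_image(image_path))).text

def fake_run(model_path, image_path, prompt):
    """Stand-in runner for a dry run; returns a label instead of a model reply."""
    return f"reply from {model_path}"

MODELS = {
    "base": "/root/model/InternVL2-2B",
    "finetuned": "/root/InternLM/InternVL2-2B",
}
IMAGE = "/root/InternLM/007aPnLRgy1hb39z0im50j30ci0el0wm.jpg"
PROMPT = "请你根据这张图片,讲一个脑洞大开的梗"

replies = compare_models(MODELS, IMAGE, PROMPT, fake_run)
for label, text in replies.items():
    print(f"[{label}] {text}")
```

On the development machine, passing `run_with_lmdeploy` instead of `fake_run` would produce the real replies. Loading two pipelines in one process may exceed GPU memory on a small resource tier, in which case running each model in a separate process is the safer choice.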