Fine-tuning a large language model can be as easy as this...
https://github.com/user-attachments/assets/e6ce34b0-52d5-4f3e-a830-592106c4c272
Choose your path:
- Getting started tutorial: https://zhuanlan.zhihu.com/p/695287607
- Documentation: https://llamafactory.readthedocs.io/zh-cn/latest/
- Colab: https://colab.research.google.com/drive/1d5KQtbemerlSDSxZIfAaWXhKr30QypiK?usp=sharing
- Local machine: see the Getting Started section below
- PAI-DSW: Llama3 example | Qwen2-VL example | DeepSeek-R1-Distill example
- Amazon SageMaker: blog
> \[!NOTE\] Websites other than the links above are unauthorized third-party sites. Please use them with caution.
## Features
- Various models: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Qwen2-VL, DeepSeek, Yi, Gemma, ChatGLM, Phi, and more.
- Integrated methods: (continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO training, DPO training, KTO training, ORPO training, and more.
- Scalable precision: 16-bit full-parameter fine-tuning, freeze-tuning, LoRA fine-tuning, and 2/3/4/5/6/8-bit QLoRA fine-tuning via AQLM/AWQ/GPTQ/LLM.int8/HQQ/EETQ.
- Advanced algorithms: GaLore, BAdam, APOLLO, Adam-mini, DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ, and PiSSA.
- Practical tricks: FlashAttention-2, Unsloth, Liger Kernel, RoPE scaling, NEFTune, and rsLoRA.
- Wide range of tasks: multi-turn dialogue, tool calling, image understanding, visual grounding, video recognition, audio understanding, and more.
- Experiment monitors: LlamaBoard, TensorBoard, Wandb, MLflow, SwanLab, and more.
- Faster inference: OpenAI-style API, web UI, and CLI powered by vLLM.
## Day-N Fine-Tuning Support for Cutting-Edge Models
| Support Date | Models |
|--------------|--------|
| Day 0 | Qwen2.5 / Qwen2-VL / QwQ / QVQ / InternLM3 / MiniCPM-o-2.6 |
| Day 1 | Llama 3 / GLM-4 / Mistral Small / PaliGemma2 |
## Benchmark
Compared with ChatGLM's official P-Tuning, LoRA fine-tuning in LLaMA Factory delivers a 3.7x speedup while achieving a higher Rouge score on the advertising text generation task. Combined with 4-bit quantization, QLoRA fine-tuning in LLaMA Factory further reduces GPU memory consumption.
## Changelog
\[25/02/24\] We announced the open-sourcing of **[EasyR1](https://gitee.com/link?target=https%3A%2F%2Fgithub.com%2Fhiyouga%2FEasyR1 "EasyR1")**, an efficient and scalable framework for multimodal reinforcement learning with GRPO training support.
\[25/02/11\] We supported saving an **[Ollama](https://gitee.com/link?target=https%3A%2F%2Fgithub.com%2Follama%2Follama "Ollama")** modelfile when exporting a model. See [examples](https://gitee.com/morningwindsir/LLaMA-Factory/blob/main/examples/README_zh.md "examples") for usage.
\[25/02/05\] We supported fine-tuning the **[Qwen2-Audio](https://gitee.com/morningwindsir/LLaMA-Factory/blob/main/Qwen/Qwen2-Audio-7B-Instruct "Qwen2-Audio")** and **[MiniCPM-o-2.6](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fopenbmb%2FMiniCPM-o-2_6 "MiniCPM-o-2.6")** models on audio understanding tasks.
\[25/01/31\] We supported fine-tuning the **[DeepSeek-R1](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fdeepseek-ai%2FDeepSeek-R1 "DeepSeek-R1")** and **[Qwen2.5-VL](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FQwen%2FQwen2.5-VL-7B-Instruct "Qwen2.5-VL")** models.
### Supported Models
| Model | Model size | Template |
|------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------|---------------------|
| [Baichuan 2](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fbaichuan-inc "Baichuan 2") | 7B/13B | baichuan2 |
| [BLOOM/BLOOMZ](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fbigscience "BLOOM/BLOOMZ") | 560M/1.1B/1.7B/3B/7.1B/176B | - |
| [ChatGLM3](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FTHUDM "ChatGLM3") | 6B | chatglm3 |
| [Command R](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FCohereForAI "Command R") | 35B/104B | cohere |
| [DeepSeek (Code/MoE)](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fdeepseek-ai "DeepSeek (Code/MoE)") | 7B/16B/67B/236B | deepseek |
| [DeepSeek 2.5/3](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fdeepseek-ai "DeepSeek 2.5/3") | 236B/671B | deepseek3 |
| [DeepSeek R1 (Distill)](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fdeepseek-ai "DeepSeek R1 (Distill)") | 1.5B/7B/8B/14B/32B/70B/671B | deepseek3 |
| [Falcon](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Ftiiuae "Falcon") | 7B/11B/40B/180B | falcon |
| [Gemma/Gemma 2/CodeGemma](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fgoogle "Gemma/Gemma 2/CodeGemma") | 2B/7B/9B/27B | gemma |
| [GLM-4](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FTHUDM "GLM-4") | 9B | glm4 |
| [GPT-2](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fopenai-community "GPT-2") | 0.1B/0.4B/0.8B/1.5B | - |
| [Granite 3.0-3.1](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fibm-granite "Granite 3.0-3.1") | 1B/2B/3B/8B | granite3 |
| [Index](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FIndexTeam "Index") | 1.9B | index |
| [InternLM 2-3](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Finternlm "InternLM 2-3") | 7B/8B/20B | intern2 |
| [Llama](https://gitee.com/link?target=https%3A%2F%2Fgithub.com%2Ffacebookresearch%2Fllama "Llama") | 7B/13B/33B/65B | - |
| [Llama 2](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmeta-llama "Llama 2") | 7B/13B/70B | llama2 |
| [Llama 3-3.3](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmeta-llama "Llama 3-3.3") | 1B/3B/8B/70B | llama3 |
| [Llama 3.2 Vision](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmeta-llama "Llama 3.2 Vision") | 11B/90B | mllama |
| [LLaVA-1.5](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fllava-hf "LLaVA-1.5") | 7B/13B | llava |
| [LLaVA-NeXT](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fllava-hf "LLaVA-NeXT") | 7B/8B/13B/34B/72B/110B | llava_next |
| [LLaVA-NeXT-Video](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fllava-hf "LLaVA-NeXT-Video") | 7B/34B | llava_next_video |
| [MiniCPM](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fopenbmb "MiniCPM") | 1B/2B/4B | cpm/cpm3 |
| [MiniCPM-o-2.6/MiniCPM-V-2.6](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fopenbmb "MiniCPM-o-2.6/MiniCPM-V-2.6") | 8B | minicpm_o/minicpm_v |
| [Ministral/Mistral-Nemo](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmistralai "Ministral/Mistral-Nemo") | 8B/12B | ministral |
| [Mistral/Mixtral](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmistralai "Mistral/Mixtral") | 7B/8x7B/8x22B | mistral |
| [Mistral Small](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmistralai "Mistral Small") | 24B | mistral_small |
| [OLMo](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fallenai "OLMo") | 1B/7B | - |
| [PaliGemma/PaliGemma2](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fgoogle "PaliGemma/PaliGemma2") | 3B/10B/28B | paligemma |
| [Phi-1.5/Phi-2](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmicrosoft "Phi-1.5/Phi-2") | 1.3B/2.7B | - |
| [Phi-3/Phi-3.5](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmicrosoft "Phi-3/Phi-3.5") | 4B/14B | phi |
| [Phi-3-small](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmicrosoft "Phi-3-small") | 7B | phi_small |
| [Phi-4](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmicrosoft "Phi-4") | 14B | phi4 |
| [Pixtral](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fmistralai "Pixtral") | 12B | pixtral |
| [Qwen/QwQ (1-2.5) (Code/Math/MoE)](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FQwen "Qwen/QwQ (1-2.5) (Code/Math/MoE)") | 0.5B/1.5B/3B/7B/14B/32B/72B/110B | qwen |
| [Qwen2-Audio](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FQwen "Qwen2-Audio") | 7B | qwen2_audio |
| [Qwen2-VL/Qwen2.5-VL/QVQ](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FQwen "Qwen2-VL/Qwen2.5-VL/QVQ") | 2B/3B/7B/72B | qwen2_vl |
| [Skywork o1](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FSkywork "Skywork o1") | 8B | skywork_o1 |
| [StarCoder 2](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fbigcode "StarCoder 2") | 3B/7B/15B | - |
| [TeleChat2](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FTele-AI "TeleChat2") | 3B/7B/35B/115B | telechat2 |
| [XVERSE](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2Fxverse "XVERSE") | 7B/13B/65B | xverse |
| [Yi/Yi-1.5 (Code)](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2F01-ai "Yi/Yi-1.5 (Code)") | 1.5B/6B/9B/34B | yi |
| [Yi-VL](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2F01-ai "Yi-VL") | 6B/34B | yi_vl |
| [Yuan 2](https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FIEITYuan "Yuan 2") | 2B/51B/102B | yuan |
> \[!NOTE\] For "base" models, the `template` argument can be any of `default`, `alpaca`, `vicuna`, etc. For "instruct/chat" models, however, be sure to use the **corresponding template**.
>
> Always use the **same** template for training and inference.
Please refer to [constants.py](https://gitee.com/morningwindsir/LLaMA-Factory/blob/main/src/llamafactory/extras/constants.py "constants.py") for the full list of supported models.
You can also add your own chat template to [template.py](https://gitee.com/morningwindsir/LLaMA-Factory/blob/main/src/llamafactory/data/template.py "template.py").
### Training Approaches
| Approach                | Full-tuning | Freeze-tuning | LoRA | QLoRA |
|-------------------------|-------------|---------------|------|-------|
| Pre-Training            | ✅ | ✅ | ✅ | ✅ |
| Supervised Fine-Tuning  | ✅ | ✅ | ✅ | ✅ |
| Reward Modeling         | ✅ | ✅ | ✅ | ✅ |
| PPO Training            | ✅ | ✅ | ✅ | ✅ |
| DPO Training            | ✅ | ✅ | ✅ | ✅ |
| KTO Training            | ✅ | ✅ | ✅ | ✅ |
| ORPO Training           | ✅ | ✅ | ✅ | ✅ |
| SimPO Training          | ✅ | ✅ | ✅ | ✅ |
> \[!TIP\] For details of the PPO implementation, please refer to [this blog](https://gitee.com/link?target=https%3A%2F%2Fnewfacade.github.io%2Fnotes-on-reinforcement-learning%2F17-ppo-trl.html "this blog").
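The approach is selected via the `stage` and `finetuning_type` fields of the training yaml rather than by separate subcommands. A minimal sketch, assuming the DPO example config shipped under `examples/train_lora/`:

```bash
# Each example yaml sets `stage` (pt / sft / rm / ppo / dpo / kto) and
# `finetuning_type` (full / freeze / lora); e.g. LoRA-based DPO training:
llamafactory-cli train examples/train_lora/llama3_lora_dpo.yaml
```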
### Datasets
Some datasets require confirmation before use, so we recommend logging in to your Hugging Face account with the following commands.
```bash
pip install --upgrade huggingface_hub
huggingface-cli login
```
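For non-interactive environments (e.g. CI), the token can also be passed directly; a small sketch, assuming your access token is exported as `HF_TOKEN`:

```bash
# Log in without the interactive prompt; HF_TOKEN is assumed to hold your token.
huggingface-cli login --token "$HF_TOKEN"
```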
### Requirements
| Mandatory | Minimum | Recommended |
|--------------|--------|--------|
| python | 3.9 | 3.10 |
| torch | 1.13.1 | 2.4.0 |
| transformers | 4.41.2 | 4.49.0 |
| datasets | 2.16.0 | 3.2.0 |
| accelerate | 0.34.0 | 1.2.1 |
| peft | 0.11.1 | 0.12.0 |
| trl | 0.8.6 | 0.9.6 |

| Optional | Minimum | Recommended |
|--------------|--------|--------|
| CUDA | 11.6 | 12.2 |
| deepspeed | 0.10.0 | 0.16.2 |
| bitsandbytes | 0.39.0 | 0.43.1 |
| vllm | 0.4.3 | 0.7.2 |
| flash-attn | 2.3.0 | 2.7.2 |
#### Hardware Requirements
\* *Estimated values*
| Method | Bits | 7B | 13B | 30B | 70B | 110B | 8x7B | 8x22B |
|--------------------------|----|-------|-------|-------|--------|--------|-------|--------|
| Full | 32 | 120GB | 240GB | 600GB | 1200GB | 2000GB | 900GB | 2400GB |
| Full | 16 | 60GB | 120GB | 300GB | 600GB | 900GB | 400GB | 1200GB |
| Freeze | 16 | 20GB | 40GB | 80GB | 200GB | 360GB | 160GB | 400GB |
| LoRA/GaLore/APOLLO/BAdam | 16 | 16GB | 32GB | 64GB | 160GB | 240GB | 120GB | 320GB |
| QLoRA | 8 | 10GB | 20GB | 40GB | 80GB | 140GB | 60GB | 160GB |
| QLoRA | 4 | 6GB | 12GB | 24GB | 48GB | 72GB | 30GB | 96GB |
| QLoRA | 2 | 4GB | 8GB | 16GB | 24GB | 48GB | 18GB | 48GB |
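As a rough sanity check on these estimates: with 4-bit QLoRA, the quantized weights of a 7B model occupy about 7B × 0.5 bytes ≈ 3.5 GB, and the remainder of the 6 GB figure covers the LoRA adapters, activations, and framework overhead.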
### Getting Started
#### Install LLaMA Factory
> \[!IMPORTANT\] This step is mandatory.
```bash
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
```
Available extra dependencies: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, apollo, badam, adam-mini, qwen, minicpm_v, modelscope, openmind, swanlab, quality
> \[!TIP\] Use `pip install --no-deps -e .` to resolve package conflicts.
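For instance, to pull in several of the extras listed above in a single editable install (pick whichever your setup needs):

```bash
# Editable install with optional extras, e.g. DeepSpeed and vLLM support.
pip install -e ".[torch,metrics,deepspeed,vllm]"
```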
#### Data Preparation
Please refer to [data/README_zh.md](https://gitee.com/morningwindsir/LLaMA-Factory/blob/main/data/README_zh.md "data/README_zh.md") for details on the format of dataset files. You can use datasets from the HuggingFace / ModelScope / Modelers hubs or load datasets from local disk.
> \[!NOTE\] Please update `data/dataset_info.json` when using a custom dataset.
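As a hedged illustration of what such an entry can look like (`my_dataset` and `my_data.json` are placeholders, and the field names follow the schema described in data/README_zh.md, which remains the authoritative reference):

```json
{
  "my_dataset": {
    "file_name": "my_data.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}
```

The dataset can then be referenced from a training yaml via `dataset: my_dataset`.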
#### Quickstart
The following three commands run LoRA **fine-tuning**, **inference**, and **merging** for the Llama3-8B-Instruct model, respectively.
```bash
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```
See [examples/README_zh.md](https://gitee.com/morningwindsir/LLaMA-Factory/blob/main/examples/README_zh.md "examples/README_zh.md") for advanced usage (including multi-GPU fine-tuning).
> \[!TIP\] Use `llamafactory-cli help` to show help information.
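For orientation, an abridged sketch of the kind of fields `examples/train_lora/llama3_lora_sft.yaml` sets (illustrative, not a verbatim copy; consult the file in the repository for the actual values):

```yaml
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
stage: sft                # supervised fine-tuning
do_train: true
finetuning_type: lora
dataset: identity         # a dataset registered in data/dataset_info.json
template: llama3          # must match the model's chat template
output_dir: saves/llama3-8b/lora/sft
```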
#### Fine-Tuning with the LLaMA Board GUI (powered by [Gradio](https://gitee.com/link?target=https%3A%2F%2Fgithub.com%2Fgradio-app%2Fgradio "Gradio"))
```bash
llamafactory-cli webui
```
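If you need to share the UI or pin its port, the standard Gradio environment variables should apply; a minimal sketch under that assumption:

```bash
# Create a public share link and fix the port (standard Gradio env vars).
GRADIO_SHARE=1 GRADIO_SERVER_PORT=7860 llamafactory-cli webui
```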
#### Build Docker
For CUDA users:
```bash
cd docker/docker-cuda/
docker compose up -d
docker compose exec llamafactory bash
```
For Ascend NPU users:
```bash
cd docker/docker-npu/
docker compose up -d
docker compose exec llamafactory bash
```
For AMD ROCm users:
```bash
cd docker/docker-rocm/
docker compose up -d
docker compose exec llamafactory bash
```
Build without Docker Compose
#### Deploy an OpenAI-style API with vLLM
```bash
API_PORT=8000 llamafactory-cli api examples/inference/llama3_vllm.yaml
```
> \[!TIP\] See [here](https://gitee.com/link?target=https%3A%2F%2Fplatform.openai.com%2Fdocs%2Fapi-reference%2Fchat%2Fcreate "here") for the API documentation.
>
> Examples: [image understanding](https://gitee.com/morningwindsir/LLaMA-Factory/blob/main/scripts/api_example/test_image.py "image understanding") \| [function calling](https://gitee.com/morningwindsir/LLaMA-Factory/blob/main/scripts/api_example/test_toolcall.py "function calling")
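Once the server is running, any OpenAI-compatible client can query it; a minimal sketch with curl against the port configured above (the `model` field is required by the API format, but the response comes from whichever model the yaml loaded):

```bash
# Query the OpenAI-compatible chat completions endpoint on port 8000.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "Hello!"}]}'
```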
#### Download from the ModelScope Hub
If you have trouble downloading models and datasets from Hugging Face, you can use the ModelScope hub as follows.
```bash
export USE_MODELSCOPE_HUB=1 # use `set USE_MODELSCOPE_HUB=1` on Windows
```
Load the model by setting `model_name_or_path` to a ModelScope model ID, e.g. `LLM-Research/Meta-Llama-3-8B-Instruct`. Browse all available models at the [ModelScope hub](https://gitee.com/link?target=https%3A%2F%2Fmodelscope.cn%2Fmodels "ModelScope hub").
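Putting the two steps together, a hedged sketch (the yaml edit is shown as a comment; the same pattern applies to the Modelers hub below):

```bash
export USE_MODELSCOPE_HUB=1
# In the training yaml, point model_name_or_path at a ModelScope model ID:
#   model_name_or_path: LLM-Research/Meta-Llama-3-8B-Instruct
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```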
#### Download from the Modelers Hub
You can also download datasets and models from the Modelers hub as follows.
```bash
export USE_OPENMIND_HUB=1 # use `set USE_OPENMIND_HUB=1` on Windows
```
Load the model by setting `model_name_or_path` to a Modelers model ID, e.g. `TeleAI/TeleChat-7B-pt`. Browse all available models at the [Modelers hub](https://gitee.com/link?target=https%3A%2F%2Fmodelers.cn%2Fmodels "Modelers hub").
#### Use the W\&B Dashboard
To log experiment results with [Weights \& Biases](https://gitee.com/link?target=https%3A%2F%2Fwandb.ai "Weights & Biases"), add the following arguments to the yaml file.
```yaml
report_to: wandb
run_name: test_run # optional
```
Set `WANDB_API_KEY` to [your key](https://gitee.com/link?target=https%3A%2F%2Fwandb.ai%2Fauthorize "your key") when launching training tasks to log in to your W\&B account.
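For example (`<your_api_key>` is a placeholder for the key obtained from the link above):

```bash
# Pass the W&B API key at launch time to log in non-interactively.
WANDB_API_KEY=<your_api_key> llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```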
#### Use the SwanLab Dashboard
To log experiment results with [SwanLab](https://gitee.com/link?target=https%3A%2F%2Fgithub.com%2FSwanHubX%2FSwanLab "SwanLab"), add the following arguments to the yaml file.
```yaml
use_swanlab: true
swanlab_run_name: test_run # optional
```
When launching training tasks, you can log in to your SwanLab account in one of three ways:
Option 1: add the `swanlab_api_key=<your_api_key>` argument to the yaml file