LLM PEFT (Part 2): LoRA Instruction Fine-Tuning in Practice

Environment Setup

```bash
git clone -b v0.6.1 --depth=1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
conda create -n py310 python=3.10
source activate py310
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple --ignore-installed
```
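
Before touching the model, it is worth confirming that the environment resolved to a CUDA-enabled PyTorch and that the key libraries import cleanly. A minimal sanity-check sketch (the exact versions printed depend on what requirements.txt resolved to):

```python
# Quick environment sanity check (run inside the py310 env).
import torch
import transformers
import peft

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("transformers:", transformers.__version__)
print("peft:", peft.__version__)
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```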

A question

```bash
!git lfs install
!git clone https://huggingface.co/Qwen/Qwen1.5-0.5B
```

The output below suggests that nothing was actually downloaded. The leading `!` is Jupyter-notebook syntax; in a plain bash shell it triggers history expansion, so `!git` is replaced by the most recent command beginning with `git` (hence the mangled `git lfs install lfs install` echo) and the clone never runs.

```bash
(py310) root@intern-studio-40072860:~/LLaMA-Factory# !git lfs install
git lfs install lfs install
Updated Git hooks.
Git LFS initialized.
(py310) root@intern-studio-40072860:~/LLaMA-Factory# !git clone https://huggingface.co/Qwen/Qwen1.5-0.5B
git lfs install lfs install clone https://huggingface.co/Qwen/Qwen1.5-0.5B
Updated Git hooks.
```

Cloning directly from Hugging Face (blocked by the proxy)

```bash
(py310) root@intern-studio-40072860:~/LLaMA-Factory# git clone https://huggingface.co/Qwen/Qwen1.5-0.5B
Cloning into 'Qwen1.5-0.5B'...
fatal: unable to access 'https://huggingface.co/Qwen/Qwen1.5-0.5B/': Received HTTP code 503 from proxy after CONNECT
```

Downloading from the command line via the hf-mirror.com mirror (succeeds)

https://hf-mirror.com/ (the mirror is typically selected by exporting `HF_ENDPOINT=https://hf-mirror.com` before running the command)

```bash
huggingface-cli download --resume-download Qwen/Qwen1.5-0.5B --local-dir Qwen/Qwen1.5-0.5B
```
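
The same download can also be scripted with `huggingface_hub` instead of the CLI. A minimal sketch, assuming the hf-mirror.com mirror is selected through the `HF_ENDPOINT` environment variable (which must be set before the library is imported):

```python
import os

# Route huggingface_hub through the mirror; set this before the import below.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

from huggingface_hub import snapshot_download

# Download Qwen1.5-0.5B into ./Qwen/Qwen1.5-0.5B, resuming if interrupted
# (mirrors the --resume-download flag used above).
snapshot_download(
    repo_id="Qwen/Qwen1.5-0.5B",
    local_dir="Qwen/Qwen1.5-0.5B",
    resume_download=True,
)
```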

Inference

Before fine-tuning (there is no adapter checkpoint yet, so run the fine-tuning step first)

```bash
CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
    --model_name_or_path path_to_llama_model \
    --adapter_name_or_path path_to_checkpoint \
    --template default \
    --finetuning_type lora
```

Supervised instruction fine-tuning

```bash
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --template default \
    --model_name_or_path ./Qwen/Qwen1.5-0.5B \
    --dataset alpaca_data_zh_demo \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir ./path_to_pt_checkpoint \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --fp16
```
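
`--finetuning_type lora` with `--lora_target q_proj,v_proj` attaches low-rank adapters only to the attention query and value projections. Below is a rough sketch of what this amounts to in plain `peft` (not LLaMA-Factory's actual code), assuming the default rank of 8 that applies when `--lora_rank` is not given. For Qwen1.5-0.5B (hidden size 1024, 24 layers) that works out to 24 layers × 2 target modules × 2 low-rank matrices × 1024 × 8 = 786,432 trainable parameters, which matches the count reported in the log below.

```python
# Sketch only: the peft equivalent of --finetuning_type lora --lora_target q_proj,v_proj.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("./Qwen/Qwen1.5-0.5B")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # assumed default rank
    lora_alpha=16,                        # assumed default scaling
    lora_dropout=0.0,
    target_modules=["q_proj", "v_proj"],  # mirrors --lora_target q_proj,v_proj
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # expected to report 786,432 trainable params
```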

Training run log

```bash
(py310) root@intern-studio-40072860:~/LLaMA-Factory# CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
>     --stage sft \
>     --do_train \
>     --template default \
>     --model_name_or_path ./Qwen/Qwen1.5-0.5B \
>     --dataset alpaca_data_zh_demo \
>     --finetuning_type lora \
>     --lora_target q_proj,v_proj \
>     --output_dir ./path_to_pt_checkpoint \
>     --overwrite_cache \
>     --per_device_train_batch_size 4 \
>     --gradient_accumulation_steps 4 \
>     --lr_scheduler_type cosine \
>     --logging_steps 10 \
>     --save_steps 1000 \
>     --learning_rate 5e-5 \
>     --num_train_epochs 3.0 \
>     --plot_loss \
>     --fp16
06/09/2024 17:59:15 - INFO - llmtuner.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.float16
[INFO|tokenization_utils_base.py:2025] 2024-06-09 17:59:15,215 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2025] 2024-06-09 17:59:15,215 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2025] 2024-06-09 17:59:15,216 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2025] 2024-06-09 17:59:15,216 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2025] 2024-06-09 17:59:15,216 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2025] 2024-06-09 17:59:15,216 >> loading file tokenizer.json
[WARNING|logging.py:314] 2024-06-09 17:59:15,516 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/09/2024 17:59:15 - INFO - llmtuner.data.loader - Loading dataset alpaca_data_zh_demo.json...
06/09/2024 17:59:15 - WARNING - llmtuner.data.utils - Checksum failed: missing SHA-1 hash value in dataset_info.json.
Generating train split: 1 examples [00:00,  2.85 examples/s]
Converting format of dataset: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 17.95 examples/s]
Running tokenizer on dataset: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.98 examples/s]
input_ids:
[33975, 25, 49434, 239, 79478, 100007, 18493, 101254, 102438, 101940, 103135, 94432, 71703, 25, 220, 16, 13, 85658, 113886, 104919, 3837, 29524, 113886, 101724, 100969, 102125, 64355, 33108, 52510, 102676, 1773, 715, 17, 13, 85658, 52510, 73296, 57191, 52510, 101508, 104412, 101064, 110628, 3837, 77557, 99634, 102565, 33108, 99634, 100969, 1773, 715, 18, 13, 73562, 103935, 15946, 100627, 113886, 100708, 1773, 715, 19, 13, 6567, 96, 222, 32876, 112044, 33108, 112892, 105743, 117624, 99559, 90395, 100667, 104749, 104017, 1773, 715, 20, 13, 6567, 112, 245, 103339, 20450, 107606, 3837, 37029, 99285, 104242, 101724, 100969, 64355, 105455, 103135, 1773, 715, 21, 13, 80090, 114, 42067, 110375, 3837, 100751, 99354, 99434, 105994, 65676, 112147, 100466, 1773, 715, 22, 13, 19468, 115, 100446, 57191, 101432, 44934, 13343, 29256, 100373, 52510, 102676, 1773, 715, 23, 13, 65727, 237, 82647, 118158, 114826, 101975, 1773, 715, 24, 13, 58230, 121, 87267, 111438, 105444, 37029, 100815, 52510, 9909, 101919, 113642, 5373, 113051, 52510, 102776, 33108, 101724, 100969, 9370, 52510, 74276, 715, 16, 15, 13, 26853, 103, 103946, 100727, 101991, 100964, 99634, 102565, 32648, 33108, 113642, 1773, 151643]
inputs:
Human: 我们如何在日常生活中减少用水?
Assistant: 1. 使用节水装置,如节水淋浴喷头和水龙头。 
2. 使用水箱或水桶收集家庭废水,例如洗碗和洗浴。 
3. 在社区中提高节水意识。 
4. 检查水管和灌溉系统的漏水情况,并及时修复它们。 
5. 洗澡时间缩短,使用低流量淋浴头节约用水。 
6. 收集雨水,用于园艺或其他非饮用目的。 
7. 刷牙或擦手时关掉水龙头。 
8. 减少浇水草坪的时间。 
9. 尽可能多地重复使用灰水(来自洗衣机、浴室水槽和淋浴的水)。 
10. 只购买能源效率高的洗碗机和洗衣机。<|endoftext|>
label_ids:
[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 16, 13, 85658, 113886, 104919, 3837, 29524, 113886, 101724, 100969, 102125, 64355, 33108, 52510, 102676, 1773, 715, 17, 13, 85658, 52510, 73296, 57191, 52510, 101508, 104412, 101064, 110628, 3837, 77557, 99634, 102565, 33108, 99634, 100969, 1773, 715, 18, 13, 73562, 103935, 15946, 100627, 113886, 100708, 1773, 715, 19, 13, 6567, 96, 222, 32876, 112044, 33108, 112892, 105743, 117624, 99559, 90395, 100667, 104749, 104017, 1773, 715, 20, 13, 6567, 112, 245, 103339, 20450, 107606, 3837, 37029, 99285, 104242, 101724, 100969, 64355, 105455, 103135, 1773, 715, 21, 13, 80090, 114, 42067, 110375, 3837, 100751, 99354, 99434, 105994, 65676, 112147, 100466, 1773, 715, 22, 13, 19468, 115, 100446, 57191, 101432, 44934, 13343, 29256, 100373, 52510, 102676, 1773, 715, 23, 13, 65727, 237, 82647, 118158, 114826, 101975, 1773, 715, 24, 13, 58230, 121, 87267, 111438, 105444, 37029, 100815, 52510, 9909, 101919, 113642, 5373, 113051, 52510, 102776, 33108, 101724, 100969, 9370, 52510, 74276, 715, 16, 15, 13, 26853, 103, 103946, 100727, 101991, 100964, 99634, 102565, 32648, 33108, 113642, 1773, 151643]
labels:
1. 使用节水装置,如节水淋浴喷头和水龙头。 
2. 使用水箱或水桶收集家庭废水,例如洗碗和洗浴。 
3. 在社区中提高节水意识。 
4. 检查水管和灌溉系统的漏水情况,并及时修复它们。 
5. 洗澡时间缩短,使用低流量淋浴头节约用水。 
6. 收集雨水,用于园艺或其他非饮用目的。 
7. 刷牙或擦手时关掉水龙头。 
8. 减少浇水草坪的时间。 
9. 尽可能多地重复使用灰水(来自洗衣机、浴室水槽和淋浴的水)。 
10. 只购买能源效率高的洗碗机和洗衣机。<|endoftext|>
[INFO|configuration_utils.py:727] 2024-06-09 17:59:24,557 >> loading configuration file ./Qwen/Qwen1.5-0.5B/config.json
[INFO|configuration_utils.py:792] 2024-06-09 17:59:24,566 >> Model config Qwen2Config {
  "_name_or_path": "./Qwen/Qwen1.5-0.5B",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 2816,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "num_key_value_heads": 16,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.37.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}

[INFO|modeling_utils.py:3473] 2024-06-09 17:59:25,443 >> loading weights file ./Qwen/Qwen1.5-0.5B/model.safetensors
[INFO|modeling_utils.py:1426] 2024-06-09 17:59:27,858 >> Instantiating Qwen2ForCausalLM model under default dtype torch.float16.
[INFO|configuration_utils.py:826] 2024-06-09 17:59:27,860 >> Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151643
}

[INFO|modeling_utils.py:4350] 2024-06-09 18:00:13,863 >> All model checkpoint weights were used when initializing Qwen2ForCausalLM.

[INFO|modeling_utils.py:4358] 2024-06-09 18:00:13,863 >> All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at ./Qwen/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
[INFO|configuration_utils.py:779] 2024-06-09 18:00:13,891 >> loading configuration file ./Qwen/Qwen1.5-0.5B/generation_config.json
[INFO|configuration_utils.py:826] 2024-06-09 18:00:13,891 >> Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "max_new_tokens": 2048
}

06/09/2024 18:00:13 - INFO - llmtuner.model.patcher - Gradient checkpointing enabled.
06/09/2024 18:00:13 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
06/09/2024 18:00:14 - INFO - llmtuner.model.loader - trainable params: 786432 || all params: 464774144 || trainable%: 0.1692
/root/.conda/envs/py310/lib/python3.10/site-packages/accelerate/accelerator.py:444: FutureWarning: Passing the following arguments to `Accelerator` is deprecated and will be removed in version 1.0 of Accelerate: dict_keys(['dispatch_batches', 'split_batches']). Please pass an `accelerate.DataLoaderConfiguration` instead: 
dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False)
  warnings.warn(
[INFO|trainer.py:571] 2024-06-09 18:00:14,322 >> Using auto half precision backend
[INFO|trainer.py:1721] 2024-06-09 18:00:14,484 >> ***** Running training *****
[INFO|trainer.py:1722] 2024-06-09 18:00:14,484 >>   Num examples = 1
[INFO|trainer.py:1723] 2024-06-09 18:00:14,484 >>   Num Epochs = 3
[INFO|trainer.py:1724] 2024-06-09 18:00:14,484 >>   Instantaneous batch size per device = 4
[INFO|trainer.py:1727] 2024-06-09 18:00:14,484 >>   Total train batch size (w. parallel, distributed & accumulation) = 16
[INFO|trainer.py:1728] 2024-06-09 18:00:14,484 >>   Gradient Accumulation steps = 4
[INFO|trainer.py:1729] 2024-06-09 18:00:14,484 >>   Total optimization steps = 3
[INFO|trainer.py:1730] 2024-06-09 18:00:14,485 >>   Number of trainable parameters = 786,432
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.21s/it][INFO|trainer.py:1962] 2024-06-09 18:00:19,402 >> 

Training completed. Do not forget to share your model on huggingface.co/models =)


{'train_runtime': 4.917, 'train_samples_per_second': 0.61, 'train_steps_per_second': 0.61, 'train_loss': 0.4967418909072876, 'epoch': 3.0}                         
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.64s/it]
[INFO|trainer.py:2936] 2024-06-09 18:00:19,412 >> Saving model checkpoint to ./path_to_pt_checkpoint
/root/.conda/envs/py310/lib/python3.10/site-packages/peft/utils/save_and_load.py:195: UserWarning: Could not find a config file in ./Qwen/Qwen1.5-0.5B - will assume that the vocabulary was not modified.
  warnings.warn(
[INFO|tokenization_utils_base.py:2433] 2024-06-09 18:00:19,567 >> tokenizer config file saved in ./path_to_pt_checkpoint/tokenizer_config.json
[INFO|tokenization_utils_base.py:2442] 2024-06-09 18:00:19,573 >> Special tokens file saved in ./path_to_pt_checkpoint/special_tokens_map.json
[INFO|tokenization_utils_base.py:2493] 2024-06-09 18:00:19,576 >> added tokens file saved in ./path_to_pt_checkpoint/added_tokens.json
***** train metrics *****
  epoch                    =        3.0
  train_loss               =     0.4967
  train_runtime            = 0:00:04.91
  train_samples_per_second =       0.61
  train_steps_per_second   =       0.61
06/09/2024 18:00:19 - WARNING - llmtuner.extras.ploting - No metric loss to plot.
06/09/2024 18:00:19 - WARNING - llmtuner.extras.ploting - No metric eval_loss to plot.
[INFO|modelcard.py:452] 2024-06-09 18:00:19,934 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
```
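
Note how, in the `label_ids` dump above, every token belonging to the prompt ("Human: ...") is set to -100, the index that PyTorch's cross-entropy loss ignores, while the response tokens are kept, so the loss is computed only on the assistant's answer. A minimal sketch of that masking rule (illustrative helper, not the project's actual preprocessing code):

```python
IGNORE_INDEX = -100  # ignored by torch.nn.CrossEntropyLoss

def build_sft_example(prompt_ids: list[int], response_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and response; mask the prompt out of the loss."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels
```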

Inference (with the fine-tuned adapter)

```bash
CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
    --model_name_or_path ./Qwen/Qwen1.5-0.5B \
    --adapter_name_or_path /root/LLaMA-Factory/path_to_pt_checkpoint \
    --template default \
    --finetuning_type lora
```

```bash
(py310) root@intern-studio-40072860:~/LLaMA-Factory# CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \--model_name_or_path ./Qwen/Qwen1.5-0.5B \--adapter_name_or_path /root/LLaMA-Factory/path_to_pt_checkpoint \--template default \--finetuning_type lora
[INFO|tokenization_utils_base.py:2025] 2024-06-09 18:11:41,109 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2025] 2024-06-09 18:11:41,109 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2025] 2024-06-09 18:11:41,109 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2025] 2024-06-09 18:11:41,109 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2025] 2024-06-09 18:11:41,109 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2025] 2024-06-09 18:11:41,109 >> loading file tokenizer.json
[WARNING|logging.py:314] 2024-06-09 18:11:41,436 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:727] 2024-06-09 18:11:41,437 >> loading configuration file ./Qwen/Qwen1.5-0.5B/config.json
[INFO|configuration_utils.py:792] 2024-06-09 18:11:41,441 >> Model config Qwen2Config {
  "_name_or_path": "./Qwen/Qwen1.5-0.5B",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 2816,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "num_key_value_heads": 16,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.37.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}

06/09/2024 18:11:41 - INFO - llmtuner.model.patcher - Using KV cache for faster generation.
[INFO|modeling_utils.py:3473] 2024-06-09 18:11:41,681 >> loading weights file ./Qwen/Qwen1.5-0.5B/model.safetensors
[INFO|modeling_utils.py:1426] 2024-06-09 18:11:41,693 >> Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:826] 2024-06-09 18:11:41,695 >> Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151643
}

[INFO|modeling_utils.py:4350] 2024-06-09 18:11:55,341 >> All model checkpoint weights were used when initializing Qwen2ForCausalLM.

[INFO|modeling_utils.py:4358] 2024-06-09 18:11:55,341 >> All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at ./Qwen/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
[INFO|configuration_utils.py:779] 2024-06-09 18:11:55,345 >> loading configuration file ./Qwen/Qwen1.5-0.5B/generation_config.json
[INFO|configuration_utils.py:826] 2024-06-09 18:11:55,345 >> Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "max_new_tokens": 2048
}

06/09/2024 18:11:55 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
06/09/2024 18:11:55 - INFO - llmtuner.model.adapter - Merged 1 adapter(s).
06/09/2024 18:11:55 - INFO - llmtuner.model.adapter - Loaded adapter(s): /root/LLaMA-Factory/path_to_pt_checkpoint
06/09/2024 18:11:55 - INFO - llmtuner.model.loader - all params: 463987712
Welcome to the CLI application, use `clear` to remove the history, use `exit` to exit the application.

User: 我们如何在日常生活中减少用水?
Assistant: 为了减少用水,我们可以从以下几个方面入手:
1. 减少用水量:我们可以减少洗澡和淋浴的时间,使用节水龙头和淋浴头,关闭水龙头和淋浴头,避免洗完澡后忘记关水龙头,尽可能地使用淋浴喷头。
2. 淋浴时避免浪费水:淋浴时不要让水直接流出,应该将水缓慢地倒入盆中,以避免水流直接滴到地面,同时避免浪费水。
3. 安装节水设备:安装节水器、节水龙头、淋浴头等节水设备可以有效减少用水量。
4. 节约用水:在日常生活中,我们可以选择在不需要使用水时关闭水龙头,将水龙头换成节水型的,这样可以有效地节约用水。
5. 集中用水:将水放在一个地方,集中使用,避免浪费,同时也可以节约用水。
6. 优化用水习惯:养成良好的用水习惯,比如洗手时不要忘记关水龙头,洗完澡后及时关闭水龙头,可以有效减少用水量。
总之,减少用水需要我们从生活中的每个细节做起,从节约用水开始,从小事做起,才能更好地保护水资源,为我们的地球做出贡献。
```
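
The log line "Merged 1 adapter(s)" shows that `cli_demo.py` loads the base model and applies the LoRA checkpoint on top of it. Outside LLaMA-Factory, the same thing can be done directly with `peft`; a minimal sketch (prompt formatted like the "default" template shown in the dataset dump above, generation settings illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("./Qwen/Qwen1.5-0.5B", torch_dtype=torch.float16).cuda()
tokenizer = AutoTokenizer.from_pretrained("./Qwen/Qwen1.5-0.5B")

# Attach the LoRA checkpoint produced by the training run above.
model = PeftModel.from_pretrained(base, "/root/LLaMA-Factory/path_to_pt_checkpoint")
model.eval()

prompt = "Human: 我们如何在日常生活中减少用水?\nAssistant: "
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```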

Merging the LoRA weights and exporting the model

```bash
CUDA_VISIBLE_DEVICES=0 python src/export_model.py \
    --model_name_or_path ./Qwen/Qwen1.5-0.5B \
    --adapter_name_or_path /root/LLaMA-Factory/path_to_pt_checkpoint \
    --template default \
    --finetuning_type lora \
    --export_dir path_to_export \
    --export_size 2 \
    --export_legacy_format False
```
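
`export_model.py` folds the LoRA update back into the base weights so the exported model can be served without `peft`. A minimal sketch of the same merge done directly with `peft` (paths reused from above; `path_to_export` is the output directory):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("./Qwen/Qwen1.5-0.5B")
tokenizer = AutoTokenizer.from_pretrained("./Qwen/Qwen1.5-0.5B")

# Apply the adapter, then merge the low-rank update (B @ A) into the base weights.
model = PeftModel.from_pretrained(base, "/root/LLaMA-Factory/path_to_pt_checkpoint")
merged = model.merge_and_unload()

# The merged model no longer needs peft at inference time.
merged.save_pretrained("path_to_export")
tokenizer.save_pretrained("path_to_export")
```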