LLaMA-Factory Inference in Practice

Record of a successful run

Platform: a server with GPUs

Commands run

git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory/
conda create -n py310 python=3.10
conda activate py310

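The server could not download Qwen1.5-0.5B directly from Hugging Face, but my local machine could, so I downloaded the model locally and uploaded it to the server.

Then I ran the following command, which succeeded: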

CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli chat --model_name_or_path ./Qwen1.5-0.5B --template "qwen"
// How do you choose the value for --template? The valid names are defined in /Users/wangfeng/code/LLaMA-Factory/src/llamafactory/data/template.py
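
One way to see which values `--template` accepts is to scan `template.py` for the registered names. The sketch below assumes that templates are registered through calls that pass a `name="..."` keyword argument; the helper `list_template_names` is hypothetical, and the pattern should be checked against your own checkout.

```python
import re
from pathlib import Path

# Assumption: each template is registered with a keyword argument such as name="qwen".
NAME_PATTERN = re.compile(r'\bname\s*=\s*["\']([^"\']+)["\']')


def list_template_names(source: str) -> list[str]:
    """Return the unique template names found in the source text, in first-seen order."""
    seen: dict[str, None] = {}
    for match in NAME_PATTERN.finditer(source):
        seen.setdefault(match.group(1), None)
    return list(seen)


if __name__ == "__main__":
    # Path relative to the repository root, as referenced in the note above.
    template_file = Path("src/llamafactory/data/template.py")
    if template_file.exists():
        print(list_template_names(template_file.read_text(encoding="utf-8")))
```

If the name you expect (for example `qwen`) appears in the printed list, it should be a usable value for `--template`.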

The rest of this post records the whole process of getting there.

References

Tutorial: https://articles.zsxq.com/id_zdtwnsam9vbw.html

v0.6.1 release: https://github.com/hiyouga/LLaMA-Factory/blob/v0.6.1/README_zh.md

On a Mac

history 20

  672  conda create -n py310 python=3.10
  673  conda activate py310
  674  pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple --ignore-installed
  675  ls
  676  git lfs install
  677  history -10
  678  brew install git-lfs
  679  git lfs install
  680  git clone git@hf.co:Qwen/Qwen1.5-0.5B
  (py310) (myenv) ➜  LLaMA-Factory git:(main) git clone https://huggingface.co/Qwen/Qwen1.5-0.5B
Cloning into 'Qwen1.5-0.5B'...
remote: Enumerating objects: 76, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 76 (delta 2), reused 0 (delta 0), pack-reused 67 (from 1)
Unpacking objects: 100% (76/76), 3.62 MiB | 542.00 KiB/s, done.
Downloading model.safetensors (1.2 GB)
Error downloading object: model.safetensors (a88bcf4): Smudge error: Error downloading model.safetensors (a88bcf41b3fa9a20031b6b598abc11f694e35e0b5684d6e14dbe7e894ebbb080): batch response: Post "https://huggingface.co/Qwen/Qwen1.5-0.5B.git/info/lfs/objects/batch": dial tcp: lookup huggingface.co: no such host

Errors logged to '/Users/wangfeng/code/LLaMA-Factory/Qwen1.5-0.5B/.git/lfs/logs/20240601T165753.939959.log'.
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: model.safetensors: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
  681  git clone https://huggingface.co/Qwen/Qwen1.5-0.5B
  682* CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
    --model_name_or_path path_to_llama_model \
    --adapter_name_or_path path_to_checkpoint \
    --template default \
    --finetuning_type lora
  // This is the v0.6.1 command, but a plain git clone checks out the latest version, so it fails here
  683  git clone https://huggingface.co/Qwen/Qwen1.5-0.5B
  684* pwd
  685* CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
  // No permission to access the Llama 3 model
  686* conda env list
  687* pip install -e .[torch,metrics]
  688* ls
  689* pip install -e '.[torch,metrics]'
  690* CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
  691* llamafactory-cli help
  692* llamafactory-cli chat -h
  693  ls -al Qwen1.5-0.5B
  694  llamafactory-cli chat --model_name_or_path ./Qwen1.5-0.5B --template default 
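
The clone log above shows that the repository was fetched but the Git LFS download of `model.safetensors` failed, so the working tree may be missing the real weights or hold only a small pointer file. As a minimal sketch for checking a manually transferred copy, the code below tests whether a file is still a Git LFS pointer stub and computes its SHA-256, which can be compared with the object id printed in the error message (a Git LFS object id is the SHA-256 of the file contents). The function names are hypothetical.

```python
import hashlib
from pathlib import Path

# Every Git LFS pointer file starts with this version line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """True if the file looks like a Git LFS pointer stub rather than real data."""
    with path.open("rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of the file, read in chunks so a large file is never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Object id copied from the Git LFS error message in the log above.
    expected = "a88bcf41b3fa9a20031b6b598abc11f694e35e0b5684d6e14dbe7e894ebbb080"
    weights = Path("Qwen1.5-0.5B/model.safetensors")
    if weights.exists():
        print("pointer stub:", is_lfs_pointer(weights))
        print("hash matches:", sha256_of(weights) == expected)
```

Running the same check on the server after uploading gives some assurance that the weights arrived intact.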

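Running llamafactory-cli chat --model_name_or_path ./Qwen1.5-0.5B --template default on my local Mac produced the following error, which suggests that, in this setup, inference does not work on Apple silicon (the `mps` device):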

Traceback (most recent call last):
  File "/opt/miniconda3/envs/py310/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/miniconda3/envs/py310/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/miniconda3/envs/py310/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/generation/utils.py", line 1591, in generate
    model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
  File "/opt/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/generation/utils.py", line 468, in _prepare_attention_mask_for_generation
    raise ValueError(
ValueError: Can't infer missing attention mask on `mps` device. Please provide an `attention_mask` or use a different device.
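
The error says that no attention mask was supplied and that one could not be inferred on the `mps` device. For readers unfamiliar with the term, the sketch below illustrates in plain Python what an attention mask encodes: 1 for real tokens and 0 for padding. It is an illustration only, not the `transformers` implementation; `build_attention_mask` and the token ids are made up.

```python
def build_attention_mask(input_ids: list[list[int]], pad_token_id: int) -> list[list[int]]:
    """Mark real tokens with 1 and padding tokens with 0, row by row."""
    return [[0 if token == pad_token_id else 1 for token in row] for row in input_ids]


if __name__ == "__main__":
    # Hypothetical batch: the second sequence is right-padded with id 0.
    batch = [[11, 12, 13, 14], [21, 22, 0, 0]]
    print(build_attention_mask(batch, pad_token_id=0))
```

When calling `transformers` directly, the tokenizer typically returns an `attention_mask` alongside `input_ids`, and passing both to `generate` is what the error message asks for.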