Download the Hugging Face model
Install the package
pip install huggingface_hub -i https://pypi.tuna.tsinghua.edu.cn/simple
Download
from huggingface_hub import snapshot_download
sql_lora_path = snapshot_download(repo_id="Djs07/qwen2.5-1.5b-lora")
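By default, the files are stored under ~/.cache/huggingface/hub/. snapshot_download returns the local path of the downloaded snapshot.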
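To make it easier to find the downloaded files, here is a minimal sketch of how the hub cache lays out a model snapshot. The helper name cached_snapshot_dir is hypothetical and exists only for illustration. The default cache root is assumed, and it differs if HF_HOME or HF_HUB_CACHE is set. In practice the value returned by snapshot_download already is this path, so the helper only illustrates the layout.

```python
from pathlib import Path


def cached_snapshot_dir(repo_id: str, revision: str, cache_root: Path) -> Path:
    """Return the directory where the hub cache keeps one snapshot of a model repo.

    Hypothetical helper for illustration; it only builds a path and does not
    check that the directory exists.
    """
    # The cache folder name is "models--" plus the repo id with "/" replaced by "--".
    repo_folder = "models--" + repo_id.replace("/", "--")
    # Each downloaded revision (commit hash) gets its own folder under "snapshots".
    return cache_root / repo_folder / "snapshots" / revision


# Assumed default cache root (overridden by HF_HOME / HF_HUB_CACHE if set).
default_root = Path.home() / ".cache" / "huggingface" / "hub"

snapshot_dir = cached_snapshot_dir(
    "Djs07/qwen2.5-1.5b-lora",
    "8d7d20b1cbb95e7de29abe404e900c106fa8c8cb",
    default_root,
)
print(snapshot_dir)
```

The models--Djs07--qwen2.5-1.5b-lora folder is the one to copy before starting the service.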

Start the service
Copy the LoRA model directory into the current directory first, then run:
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --enable-lora --lora-modules Qwen-Lora=models--Djs07--qwen2.5-1.5b-lora/snapshots/8d7d20b1cbb95e7de29abe404e900c106fa8c8cb/
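If the server is launched from a script, the same command can be assembled in Python. This is only a sketch: build_serve_command is a hypothetical helper, and it uses only the flags that appear in the command above (--enable-lora and --lora-modules with a name=path pair).

```python
import shlex


def build_serve_command(base_model: str, lora_name: str, lora_path: str) -> list[str]:
    """Assemble the vllm serve argument list for one LoRA adapter.

    Hypothetical helper for illustration; it builds the command but does not run it.
    """
    return [
        "vllm", "serve", base_model,
        "--enable-lora",
        # The adapter is registered as name=path; the name is what clients
        # later pass in the "model" field of a request.
        "--lora-modules", f"{lora_name}={lora_path}",
    ]


cmd = build_serve_command(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    "Qwen-Lora",
    "models--Djs07--qwen2.5-1.5b-lora/snapshots/8d7d20b1cbb95e7de29abe404e900c106fa8c8cb/",
)
# Print the command as a single shell-safe line for copy and paste.
print(shlex.join(cmd))
```

To actually start the server from Python, the list can be passed to subprocess.run(cmd, check=True).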
Test
Set "model" to the name registered above with --lora-modules (Qwen-Lora):
curl http://172.17.0.3:10000/v1/completions -H "Content-Type: application/json" -d '{
"model": "Qwen-Lora",
"prompt": "San Francisco is a",
"max_tokens": 7,
"temperature": 0
}'
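The same request can be sent from Python with only the standard library. This is a minimal sketch: build_completion_request and send_request are hypothetical helper names, and the host and port are copied from the curl command, so adjust them to match your server.

```python
import json
import urllib.request


def build_completion_request(base_url: str, model: str, prompt: str,
                             max_tokens: int = 7, temperature: float = 0):
    """Build a POST request for the /v1/completions endpoint (hypothetical helper)."""
    payload = {
        "model": model,  # must match the name given to --lora-modules
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(request, timeout: float = 30):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not contact the server.
req = build_completion_request("http://172.17.0.3:10000", "Qwen-Lora", "San Francisco is a")
```

Calling send_request(req) performs the actual call. Assuming the server follows the OpenAI completions format, the generated text should be under choices[0]["text"] in the returned dict.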