- Environment setup
```shell
conda create -n my_vllm python==3.9.19 pip
conda activate my_vllm
pip install modelscope
pip install vllm
```
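Before pulling a 70B-class model it can be worth confirming that both packages installed cleanly into the new conda environment. A minimal sketch (the `is_installed` helper is hypothetical; it only checks importability, not version compatibility):

```python
import importlib.util


def is_installed(pkg: str) -> bool:
    """Return True if the top-level package can be imported in this environment."""
    return importlib.util.find_spec(pkg) is not None


# After the pip installs above, both should report True in the my_vllm env.
for pkg in ("modelscope", "vllm"):
    print(pkg, "installed:", is_installed(pkg))
```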
- Model download
```python
# Download the model from ModelScope
# Default modelscope cache path: /root/.cache/modelscope/hub/qwen/Qwen2.5-72B-Instruct-GPTQ-Int8
from modelscope import snapshot_download

model_dir = snapshot_download(
    'qwen/Qwen2.5-72B-Instruct-GPTQ-Int8',
    local_dir='/home/models/qwen/Qwen2.5-72B-Instruct-GPTQ-Int8',
)
```
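`snapshot_download` returns the local directory it populated. Since a 72B GPTQ download can be interrupted, a quick sanity check that key files landed on disk can save a failed `vllm serve` later. A sketch, assuming the typical repo layout (`missing_files` and the required-file list are assumptions, not a ModelScope API):

```python
import os

# Matches the local_dir passed to snapshot_download above.
MODEL_DIR = '/home/models/qwen/Qwen2.5-72B-Instruct-GPTQ-Int8'


def missing_files(model_dir, required=("config.json", "tokenizer.json")):
    """Hypothetical helper: list required files not yet present under model_dir."""
    return [f for f in required if not os.path.isfile(os.path.join(model_dir, f))]


# An empty list suggests the download completed; missing names mean a re-run is needed.
print("missing:", missing_files(MODEL_DIR))
```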
Reference docs:
- Launching and testing vLLM directly on the server
```shell
vllm serve /home/models/qwen/Qwen2.5-72B-Instruct-GPTQ-Int8 \
    --tensor-parallel-size 2 \
    --max-model-len 256
```
Reference docs:
https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html
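Once `vllm serve` is up, it exposes an OpenAI-compatible HTTP API (by default on port 8000, per the vLLM docs), and the model name in requests must match the served path. A minimal sketch of building such a request; `build_chat_request` is a hypothetical helper, not part of vLLM:

```python
import json


def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-compatible /v1/chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


payload = build_chat_request(
    # Model name defaults to the path passed to `vllm serve`.
    "/home/models/qwen/Qwen2.5-72B-Instruct-GPTQ-Int8",
    "Introduce yourself briefly.",
)
print(json.dumps(payload, indent=2))
# To call the running server (default endpoint, an assumption per vLLM docs):
#   curl http://localhost:8000/v1/chat/completions \
#     -H "Content-Type: application/json" -d "$PAYLOAD"
```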