模型下载与使用

汀江游非侠2026-04-06 15:39

根据个人电脑配置及使用场景，我选择模型为Qwen-4B-Chat-Q4_K_M

模型需要从Huggingface模型库下载，需要使用平台工具来下载

注：使用wget无法下载

安装工具

pip install -U huggingface_hub

网络问题，需要使用镜像

export HF_ENDPOINT="https://hf-mirror.com"

原始模型下载

huggingface-cli download Qwen/Qwen1.5-4B-Chat --local-dir ./models/Qwen1.5-4B-Chat

原始模型需要进行量化转换

转换需要安装依赖，进入llama.cpp-b8642目录，执行如下命令进行安装

pip install -r requirements.txt

注：安装依赖要求Python 3.10以上版本，因此笔者并没有成功转换，使用第二在方法：

直接下载现成的 GGUF 模型

huggingface-cli download itlwas/Qwen1.5-4B-Chat-Q4_K_M-GGUF qwen1.5-4b-chat-q4_k_m.gguf --local-dir ./ --local-dir-use-symlinks False

./build/bin/llama-server -m models/qwen-4b-chat.Q4_K_M.gguf -c 4096 -ngl 35 --host 0.0.0.0 --port 8080

有如下界面