1. Installing llama-cpp-python fails
```bash
$ CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python -i https://pypi.tuna.tsinghua.edu.cn/simple
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting llama-cpp-python
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/de/6d/4a20e676bdf7d9d3523be3a081bf327af958f9bdfe2a564f5cf485faeaec/llama_cpp_python-0.3.9.tar.gz (67.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.9/67.9 MB 5.3 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /home/wuwenliang/anaconda3/envs/llmtuner/lib/python3.10/site-packages (from llama-cpp-python) (4.13.2)
Requirement already satisfied: numpy>=1.20.0 in /home/wuwenliang/anaconda3/envs/llmtuner/lib/python3.10/site-packages (from llama-cpp-python) (1.26.4)
Collecting diskcache>=5.6.1 (from llama-cpp-python)
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl (45 kB)
Requirement already satisfied: jinja2>=2.11.3 in /home/wuwenliang/anaconda3/envs/llmtuner/lib/python3.10/site-packages (from llama-cpp-python) (3.1.6)
Requirement already satisfied: MarkupSafe>=2.0 in /home/wuwenliang/anaconda3/envs/llmtuner/lib/python3.10/site-packages (from jinja2>=2.11.3->llama-cpp-python) (3.0.2)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [29 lines of output]
*** scikit-build-core 0.11.5 using CMake 3.22.1 (wheel)
*** Configuring CMake...
loading initial cache file /tmp/tmp01d6kko6/build/CMakeInit.txt
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1")
CMake Error at vendor/llama.cpp/CMakeLists.txt:108 (message):
LLAMA_CUBLAS is deprecated and will be removed in the future.
Use GGML_CUDA instead
Call Stack (most recent call first):
vendor/llama.cpp/CMakeLists.txt:113 (llama_option_depr)
-- Configuring incomplete, errors occurred!
See also "/tmp/tmp01d6kko6/build/CMakeFiles/CMakeOutput.log".
*** CMake configuration failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama-cpp-python)
```

The installation fails. The error says that the LLAMA_CUBLAS option has been deprecated and GGML_CUDA must be used instead, so the CMake arguments in the install command need to change.

Full install command (with the CUDA compiler path passed explicitly):
```bash
CMAKE_ARGS="-DGGML_CUDA=on -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc" pip install llam
The installation still fails:
```bash
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
*** CMake build failed
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /llmtuner/bin/python /llmtuner/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmp2wtlptdn
cwd: /tmp/pip-install-jm38bxck/llama-cpp-python_69bdce92ceff4d4db8aec6e07d4f05e3
Building wheel for llama-cpp-python (pyproject.toml) ... error
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama-cpp-python)
```

2. Installing from source
```bash
git clone --recursive https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python
CMAKE_ARGS="-DGGML_CUDA=on" pip install . -i https://pypi.tuna.tsinghua.edu.cn/simple
```

This still fails:
```bash
$ CMAKE_ARGS="-DGGML_CUDA=on" pip install . -i https://pypi.tuna.tsinghua.edu.cn/simple
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing /home/software/llama-cpp-python
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /llmtuner/lib/python3.10/site-packages (from llama_cpp_python==0.3.9) (4.13.2)
Requirement already satisfied: numpy>=1.20.0 in /llmtuner/lib/python3.10/site-packages (from llama_cpp_python==0.3.9) (1.26.4)
Collecting diskcache>=5.6.1 (from llama_cpp_python==0.3.9)
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl (45 kB)
Requirement already satisfied: jinja2>=2.11.3 in /llmtuner/lib/python3.10/site-packages (from llama_cpp_python==0.3.9) (3.1.6)
Requirement already satisfied: MarkupSafe>=2.0 in /llmtuner/lib/python3.10/site-packages (from jinja2>=2.11.3->llama_cpp_python==0.3.9) (3.0.2)
Building wheels for collected packages: llama_cpp_python
Building wheel for llama_cpp_python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama_cpp_python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [64 lines of output]
*** scikit-build-core 0.11.5 using CMake 3.22.1 (wheel)
*** Configuring CMake...
loading initial cache file /tmp/tmpwhq61gdi/build/CMakeInit.txt
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- ccache found, compilation results will be cached. Disable with GGML_CCACHE=OFF.
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- Including CPU backend
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- x86 detected
-- Adding CPU backend variant ggml-cpu: -march=native
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.4.131")
-- CUDA Toolkit found
-- Using CUDA architectures: 50-virtual;61-virtual;70-virtual;75-virtual;80-virtual;86-real;89-real
CMake Error at /usr/share/cmake-3.22/Modules/CMakeDetermineCompilerId.cmake:726 (message):
Compiling the CUDA compiler identification source file
"CMakeCUDACompilerId.cu" failed.
Compiler: CMAKE_CUDA_COMPILER-NOTFOUND
Build flags:
Id flags: -v
The output was:
No such file or directory
Call Stack (most recent call first):
/usr/share/cmake-3.22/Modules/CMakeDetermineCompilerId.cmake:6 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
/usr/share/cmake-3.22/Modules/CMakeDetermineCompilerId.cmake:48 (__determine_compiler_id_test)
/usr/share/cmake-3.22/Modules/CMakeDetermineCUDACompiler.cmake:298 (CMAKE_DETERMINE_COMPILER_ID)
vendor/llama.cpp/ggml/src/ggml-cuda/CMakeLists.txt:43 (enable_language)
-- Configuring incomplete, errors occurred!
See also "/tmp/tmpwhq61gdi/build/CMakeFiles/CMakeOutput.log".
See also "/tmp/tmpwhq61gdi/build/CMakeFiles/CMakeError.log".
*** CMake configuration failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama_cpp_python
Failed to build llama_cpp_python
ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama_cpp_python)
```

3. Make sure the CUDA environment is configured correctly
Run the following command to check whether CUDA is installed correctly:

```bash
nvcc --version
```

If this fails with `nvcc: command not found`, the CUDA environment is not configured correctly. You need to:

- install the CUDA Toolkit (CUDA 12.x or 11.x recommended), and
- make sure `nvcc` is on the `PATH`: `export PATH=/usr/local/cuda/bin:$PATH` (a quick Python-side check is sketched below).
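As a sanity check before retrying the build, the same lookup can be done from Python (a minimal sketch, not part of the original steps; it only reports where `nvcc` resolves from):

```python
import shutil
import subprocess

# Locate nvcc on the current PATH; None matches the
# "command not found" failure shown below.
nvcc = shutil.which("nvcc")
if nvcc is None:
    print("nvcc not found on PATH - add /usr/local/cuda/bin to PATH")
else:
    print(f"nvcc found at {nvcc}")
    # Equivalent to running `nvcc --version` in the shell.
    print(subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout)
```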
```bash
$ nvcc -version
bash: nvcc: command not found
```

1. Confirm whether the CUDA Toolkit is installed

Check whether CUDA is installed at the default location:

```bash
ls /usr/local/cuda
```

If the directory does not exist, the CUDA Toolkit is not installed and must be installed first.
```bash
$ ls /usr/local/cuda
bin compute-sanitizer DOCS EULA.txt extras gds gds-12.4 include lib64 libnvvp nsight-compute-2024.1.1 nsightee_plugins nsight-systems-2023.4.4 nvml nvvm README share src targets tools version.json
```

Judging from the directory listing, the CUDA Toolkit is installed (probably version 12.4), but `nvcc` is still unavailable. Possible causes:

- the `nvcc` path has not been added to `PATH`
- the CUDA environment variables are not configured correctly
- some components were missing from the installation
Solution 1: manually add `nvcc` to `PATH`:

```bash
export PATH=/usr/local/cuda/bin:$PATH
```

Then verify with `nvcc --version`.
To make this permanent, add the following to `~/.bashrc` or `~/.zshrc`:

```bash
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```

Then reload the shell configuration:

```bash
source ~/.bashrc
```
```bash
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0
```

4. Reinstall llama-cpp-python from the command line
With `nvcc` now available, run:

```bash
CMAKE_ARGS="-DGGML_CUDA=on -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc" pip install llama-cpp-python
```
This still fails:
```bash
FAILED: vendor/llama.cpp/tools/mtmd/llama-llava-clip-quantize-cli
: && /usr/bin/g++ -pthread -B /home/anaconda3/envs/llmtuner/compiler_compat -O3 -DNDEBUG vendor/llama.cpp/tools/mtmd/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/tools/mtmd/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/tools/mtmd/CMakeFiles/llama-llava-clip-quantize-cli.dir/clip-quantize-cli.cpp.o -o vendor/llama.cpp/tools/mtmd/llama-llava-clip-quantize-cli -Wl,-rpath,/tmp/tmptjbwss02/build/bin: vendor/llama.cpp/common/libcommon.a bin/libllama.so bin/libggml.so bin/libggml-cpu.so bin/libggml-cuda.so bin/libggml-base.so && :
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: libgomp.so.1, needed by bin/libggml-cpu.so, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: libcuda.so.1, needed by bin/libggml-cuda.so, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: libdl.so.2, needed by /usr/local/cuda/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: libpthread.so.0, needed by /usr/local/cuda/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: librt.so.1, needed by /usr/local/cuda/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemCreate'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_barrier@GOMP_1.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemAddressReserve'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemUnmap'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_parallel@GOMP_4.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemSetAccess'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuDeviceGet'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_thread_num@OMP_1.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemAddressFree'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuGetErrorString'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_single_start@GOMP_1.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuDeviceGetAttribute'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemMap'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemRelease'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_num_threads@OMP_1.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemGetAllocationGranularity'
collect2: error: ld returned 1 exit status
[165/165] : && /usr/bin/g++ -pthread -B /home/anaconda3/envs/llmtuner/compiler_compat -O3 -DNDEBUG vendor/llama.cpp/tools/mtmd/CMakeFiles/mtmd.dir/mtmd.cpp.o vendor/llama.cpp/tools/mtmd/CMakeFiles/mtmd.dir/clip.cpp.o vendor/llama.cpp/tools/mtmd/CMakeFiles/llama-mtmd-cli.dir/mtmd-cli.cpp.o -o vendor/llama.cpp/tools/mtmd/llama-mtmd-cli -Wl,-rpath,/tmp/tmptjbwss02/build/bin: vendor/llama.cpp/common/libcommon.a bin/libllama.so bin/libggml.so bin/libggml-cpu.so bin/libggml-cuda.so bin/libggml-base.so && :
FAILED: vendor/llama.cpp/tools/mtmd/llama-mtmd-cli
: && /usr/bin/g++ -pthread -B /home/anaconda3/envs/llmtuner/compiler_compat -O3 -DNDEBUG vendor/llama.cpp/tools/mtmd/CMakeFiles/mtmd.dir/mtmd.cpp.o vendor/llama.cpp/tools/mtmd/CMakeFiles/mtmd.dir/clip.cpp.o vendor/llama.cpp/tools/mtmd/CMakeFiles/llama-mtmd-cli.dir/mtmd-cli.cpp.o -o vendor/llama.cpp/tools/mtmd/llama-mtmd-cli -Wl,-rpath,/tmp/tmptjbwss02/build/bin: vendor/llama.cpp/common/libcommon.a bin/libllama.so bin/libggml.so bin/libggml-cpu.so bin/libggml-cuda.so bin/libggml-base.so && :
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: libgomp.so.1, needed by bin/libggml-cpu.so, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: libcuda.so.1, needed by bin/libggml-cuda.so, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: libdl.so.2, needed by /usr/local/cuda/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: libpthread.so.0, needed by /usr/local/cuda/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: warning: librt.so.1, needed by /usr/local/cuda/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemCreate'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_barrier@GOMP_1.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemAddressReserve'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemUnmap'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_parallel@GOMP_4.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemSetAccess'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuDeviceGet'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_thread_num@OMP_1.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemAddressFree'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuGetErrorString'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_single_start@GOMP_1.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuDeviceGetAttribute'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemMap'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemRelease'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_num_threads@OMP_1.0'
/home/anaconda3/envs/llmtuner/compiler_compat/ld: bin/libggml-cuda.so: undefined reference to `cuMemGetAllocationGranularity'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
*** CMake build failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama-cpp-python)
```

Judging from the error log, the main problem is that the link stage cannot find key CUDA and OpenMP shared libraries (`libcuda.so.1`, `libgomp.so.1`, and so on), which also shows up as undefined symbols such as `cuMemCreate`. The complete fix follows.
5. Fix the CUDA shared-library paths
1. Fix the system library paths (the key step)

Add the CUDA and system library paths to the environment:

```bash
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
```

To make this permanent:

```bash
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
```

Verify that the libraries exist:

```bash
ls /usr/lib/x86_64-linux-gnu/libgomp.so.1   # OpenMP library
ls /usr/local/cuda/lib64/libcudart.so.12    # CUDA runtime
ls /usr/lib/x86_64-linux-gnu/libcuda.so.1   # NVIDIA driver library
```
```bash
$ echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH' >> ~/.bashrc
$ source ~/.bashrc
$ ls /usr/lib/x86_64-linux-gnu/libgomp.so.1
/usr/lib/x86_64-linux-gnu/libgomp.so.1
$ ls /usr/local/cuda/lib64/libcudart.so.12
/usr/local/cuda/lib64/libcudart.so.12
$ ls /usr/lib/x86_64-linux-gnu/libcuda.so.1
/usr/lib/x86_64-linux-gnu/libcuda.so.1
```
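The same check can be done at runtime with `ctypes` (a sketch; the library names are exactly the ones the linker warned about above):

```python
import ctypes

# Shared libraries that the link step reported as missing or unresolved.
for name in ("libgomp.so.1", "libcuda.so.1", "libcudart.so.12"):
    try:
        ctypes.CDLL(name)  # searches LD_LIBRARY_PATH and the ldconfig cache
        print(f"{name}: OK")
    except OSError as exc:
        print(f"{name}: NOT loadable ({exc})")
```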
6. Reinstall with the full set of CMake arguments
```bash
CMAKE_ARGS="-DGGML_CUDA=ON \
    -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc \
    -DCMAKE_CUDA_ARCHITECTURES=80 \
    -DLLAMA_CUDA_FORCE_DMMV=ON" \
pip install llama-cpp-python \
    --force-reinstall \
    --no-cache-dir \
    --verbose \
    -i https://pypi.tuna.tsinghua.edu.cn/simple
```
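Note that `-DCMAKE_CUDA_ARCHITECTURES=80` targets compute capability 8.0 (Ampere, e.g. A100) and should be adjusted to the GPU actually in the machine. One way to look the value up (a sketch that assumes a reasonably recent driver, since older `nvidia-smi` versions do not support the `compute_cap` query field):

```python
import subprocess

# Query each GPU's compute capability via nvidia-smi;
# "8.0" maps to -DCMAKE_CUDA_ARCHITECTURES=80, "8.6" to 86, and so on.
out = subprocess.run(
    ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout
for line in out.strip().splitlines():
    print("CMAKE_CUDA_ARCHITECTURES =", line.strip().replace(".", ""))
```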
Finally, the install succeeds:

```bash
Successfully installed MarkupSafe-3.0.2 diskcache-5.6.3 jinja2-3.1.6 llama-cpp-python-0.3.9 numpy-2.2.6 typing-extensions-4.14.0
```
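To confirm the wheel was really built with CUDA support, a quick check from Python (a minimal sketch; `/path/to/model.gguf` is a placeholder for any local GGUF file, and `llama_supports_gpu_offload` is the low-level llama.cpp binding, assumed to be exposed in this version):

```python
import llama_cpp

print(llama_cpp.__version__)                   # expect 0.3.9
print(llama_cpp.llama_supports_gpu_offload())  # True when the CUDA backend is compiled in

# n_gpu_layers=-1 offloads all layers to the GPU; the model path is a
# placeholder, not a file from this walkthrough.
llm = llama_cpp.Llama(model_path="/path/to/model.gguf", n_gpu_layers=-1)
print(llm("Q: What is 2 + 2? A:", max_tokens=8)["choices"][0]["text"])
```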