Note: the environment used in this article is Ubuntu 22.04 LTS + CUDA 12.4.1.
1. Create a conda environment
shell
conda create -n xinference python=3.11 -y
conda activate xinference
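Before installing anything, it can help to confirm that the new environment is actually active. The following is a minimal sketch, assuming conda's standard CONDA_DEFAULT_ENV variable is set on activation; the helper name active_env is hypothetical and only used for illustration.

```shell
# Print the name of the active conda environment, or "none" if no
# environment is active. CONDA_DEFAULT_ENV is set by `conda activate`.
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "active conda env: $(active_env)"
# The interpreter on PATH should now be the one from the new environment.
python3 --version 2>/dev/null || echo "python3 not found"
```

If the first line does not print xinference, run conda activate xinference again before continuing.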
2. Install Xinference
shell
pip install "xinference[all]"
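3. Fix installation errors
PyTorch error
Install PyTorch by following the PyTorch installation guide, then re-run the Xinference install command. Be sure to pick the build on the PyTorch website that matches your system and CUDA version. Installing with pip is sufficient. For example: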
shell
pip3 install torch torchvision torchaudio
pip install "xinference[all]"
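After reinstalling, a quick check can show whether the PyTorch build in this environment sees the GPU. This is a sketch rather than part of the original steps; the helper name check_torch is hypothetical, while torch.__version__ and torch.cuda.is_available() are standard PyTorch attributes.

```shell
# Report the installed torch version and whether CUDA is visible to it.
# Prints a fallback message instead of failing when torch cannot be imported.
check_torch() {
    python3 -c 'import torch; print("torch", torch.__version__, "cuda available:", torch.cuda.is_available())' 2>/dev/null \
        || echo "torch is not importable in this environment"
}

check_torch
```

If the output reports cuda available: False on a machine with an NVIDIA GPU, the installed wheel was probably built for CPU or for a different CUDA version, and the selection on the PyTorch website should be checked again.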
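llama-cpp-python error
If the install fails with ERROR: Failed building wheel for llama-cpp-python, you need to repair the environment manually and reinstall llama-cpp-python. The problem is fairly involved, but the following steps fix it: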
- Repair the build toolchain
shell
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt update
sudo apt install gcc-11 g++-11
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 60 --slave /usr/bin/g++ g++ /usr/bin/g++-11
pip install --upgrade pip
pip install --upgrade setuptools wheel
sudo apt-get install build-essential
sudo apt-get install libgomp1
- Check and update the environment variables
shell
export PATH=/usr/local/cuda-12.4/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64:/usr/local/cuda-12.4/extras/CUPTI/lib64:/usr/local/cuda-12.4/targets/x86_64-linux/lib:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
- Reinstall
shell
# CPU inference
pip install llama-cpp-python --verbose
# NVIDIA GPU inference
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python==0.2.57 --no-cache-dir --verbose
# Then re-run the Xinference install
pip install "xinference[all]"
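The three steps above can be double-checked with a short script. This is a minimal sketch, not part of the original procedure: the function names gcc_major, path_contains and llama_cpp_status are hypothetical helpers, and CUDA_BIN assumes the same /usr/local/cuda-12.4 location used in the PATH export above.

```shell
# Print the major version of the default gcc, or "none" if gcc is missing.
gcc_major() {
    if command -v gcc >/dev/null 2>&1; then
        gcc -dumpversion | cut -d. -f1
    else
        echo "none"
    fi
}

# Succeed if the given directory is one of the entries in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the installed llama-cpp-python version, or a fallback message.
llama_cpp_status() {
    python3 -c 'import llama_cpp; print("llama-cpp-python", llama_cpp.__version__)' 2>/dev/null \
        || echo "llama-cpp-python is not importable"
}

CUDA_BIN=/usr/local/cuda-12.4/bin   # assumed: same directory as in the PATH export

echo "default gcc major version: $(gcc_major)"
if path_contains "$CUDA_BIN"; then
    echo "CUDA bin directory is on PATH"
else
    echo "CUDA bin directory is NOT on PATH"
fi
llama_cpp_status
```

On a correctly repaired system the script should report gcc major version 11, the CUDA bin directory on PATH, and a llama-cpp-python version line. Note that the export commands only affect the current shell session, so the PATH check will fail in a new terminal unless the exports are repeated there.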