Long story short: I want to set up a once-and-for-all environment that I can reuse later, so I don't have to keep digging up tutorials and reinstalling everything.
1. Install Miniconda with Python 3.10
```sh
cd /root
wget -q https://repo.anaconda.com/miniconda/Miniconda3-py310_24.4.0-0-Linux-x86_64.sh
bash ./Miniconda3-py310_24.4.0-0-Linux-x86_64.sh -b -f -p /root/miniconda3
rm -f ./Miniconda3-py310_24.4.0-0-Linux-x86_64.sh
echo "PATH=/root/miniconda3/bin:/usr/local/bin:$PATH" >> /etc/profile
echo "source /etc/profile" >> /root/.bashrc
# Initialize miniconda
conda init
```
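To confirm the shell now resolves the Miniconda interpreter first, a quick check (the paths assume the /root/miniconda3 prefix used above):
```py
import sys

print(sys.version)     # expect Python 3.10.x from the Miniconda build
print(sys.executable)  # expect /root/miniconda3/bin/python
```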
2. Install CUDA Toolkit 12.1 system-wide
This part comes from: https://docs.infini-ai.com/posts/install-cuda-on-devmachine.html
First, update the system package lists:
```sh
sudo apt update
```
Install the system-level CUDA 12.1.1 with the runfile method:
```sh
# Download the CUDA Toolkit installer
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run
# Run the CUDA Toolkit installer
sudo sh cuda_12.1.1_530.30.02_linux.run
```
After a moment you'll be prompted with the EULA. Type accept to accept it.
```sh
┌──────────────────────────────────────────────────────────────────────────────┐
│ End User License Agreement │
│ -------------------------- │
│ │
│ NVIDIA Software License Agreement and CUDA Supplement to │
│ Software License Agreement. Last updated: October 8, 2021 │
│ │
│ The CUDA Toolkit End User License Agreement applies to the │
│ NVIDIA CUDA Toolkit, the NVIDIA CUDA Samples, the NVIDIA │
│ Display Driver, NVIDIA Nsight tools (Visual Studio Edition), │
│ and the associated documentation on CUDA APIs, programming │
│ model and development tools. If you do not agree with the │
│ terms and conditions of the license agreement, then do not │
│ download or use the software. │
│ │
│ Last updated: October 8, 2021. │
│ │
│ │
│ Preface │
│ ------- │
│ │
│──────────────────────────────────────────────────────────────────────────────│
│ Do you accept the above EULA? (accept/decline/quit): │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
```
Type accept to agree to the license, then follow the prompts: choose a custom installation and select only the CUDA Toolkit and related libraries.
```sh
┌──────────────────────────────────────────────────────────────────────────────┐
│ CUDA Installer │
│ - [ ] Driver │
│ [ ] 520.61.05 │
│ + [X] CUDA Toolkit 12.1 │
│ [X] CUDA Demo Suite 12.1 │
│ [X] CUDA Documentation 12.1 │
│ - [ ] Kernel Objects │
│ [ ] nvidia-fs │
│ Options │
│ Install │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │
└──────────────────────────────────────────────────────────────────────────────┘
```
Once the installation completes, the output looks like this:
```sh
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-12.1/
Please make sure that
- PATH includes /usr/local/cuda-12.1/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-12.1/lib64, or, add /usr/local/cuda-12.1/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.1/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 520.00 is required for CUDA 12.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run --silent --driver
Logfile is /var/log/cuda-installer.log
```
Next, configure the CUDA environment variables. Set PATH, LD_LIBRARY_PATH, and CUDA_HOME (both the generic /usr/local/cuda path and the CUDA 12.1-specific path):
```sh
echo 'export PATH=/usr/local/cuda/bin:/usr/local/cuda-12.1/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
```
Add the CUDA library paths to /etc/ld.so.conf:
```sh
echo '/usr/local/cuda/lib64' | sudo tee -a /etc/ld.so.conf
echo '/usr/local/cuda-12.1/lib64' | sudo tee -a /etc/ld.so.conf
```
Run ldconfig:
```sh
sudo ldconfig
```
Apply the changes to the current session:
```sh
source ~/.bashrc
```
Verify the path settings:
```sh
echo $PATH | grep -E "cuda|cuda-12.1"
echo $LD_LIBRARY_PATH | grep -E "cuda|cuda-12.1"
echo $CUDA_HOME
ldconfig -p | grep "libcudart"
```
Check that nvcc works:
```sh
nvcc --version
```
Output:
```sh
(base) root@autodl-container-39eb4a843f-12e69afc:~/autodl-tmp/code# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
(base) root@autodl-container-39eb4a843f-12e69afc:~/autodl-tmp/code#
```
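As a framework-independent sanity check, a short Python snippet can confirm that the CUDA runtime library now resolves through the paths configured above (this assumes libcudart.so.12 is in the linker cache):
```py
import ctypes

# Resolve libcudart through the normal linker search path (ldconfig / LD_LIBRARY_PATH).
libcudart = ctypes.CDLL("libcudart.so.12")
version = ctypes.c_int()
libcudart.cudaRuntimeGetVersion(ctypes.byref(version))
# Encoded as major * 1000 + minor * 10, so 12010 means CUDA 12.1.
print(version.value)
```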
3. Install system-wide cuDNN 8.9.2.26
We'll use the tarball installation method here. First, download the cuDNN archive with wget:
```sh
wget -O cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz
```
Then extract and install it:
```sh
# Extract the archive
tar -xvf cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz
cd cudnn-linux-x86_64-8.9.2.26_cuda12-archive
# Copy the extracted files into the CUDA install tree (-P preserves the library symlinks)
sudo cp -rP include/* /usr/local/cuda/include
sudo cp -rP lib/* /usr/local/cuda/lib64
# Alternatively, copy them to /usr/include/ and /usr/lib/x86_64-linux-gnu/
# Make the files world-readable
sudo chmod a+r /usr/local/cuda/include/cudnn*.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
# Add the library path to the environment
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# Refresh the linker cache
sudo ldconfig
```
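To confirm the headers landed where CUDA expects them, you can read the version macros back out of cudnn_version.h; a minimal sketch, assuming the /usr/local/cuda/include destination used above:
```py
import re
from pathlib import Path

hdr = Path("/usr/local/cuda/include/cudnn_version.h").read_text()
# cuDNN 8.x defines its version as CUDNN_MAJOR / CUDNN_MINOR / CUDNN_PATCHLEVEL.
parts = [re.search(rf"#define CUDNN_{k}\s+(\d+)", hdr).group(1)
         for k in ("MAJOR", "MINOR", "PATCHLEVEL")]
print("cuDNN", ".".join(parts))  # expect 8.9.2
```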
For other installation methods, such as the package-manager route, see: https://docs.nvidia.com/deeplearning/cudnn/latest/installation/linux.html#tarball-installation
4. Install PyTorch 2.3.0+cu121, torchvision 0.18.0, etc.
Install pytorch and torchvision with pip. Note: a PyTorch installed this way bundles its own cudatoolkit 12.1 and cuDNN 8.9, but only for PyTorch's own use; it is not the same copy as the system-wide cudatoolkit and cuDNN installed earlier, which every environment can use. See:
https://docs.infini-ai.com/posts/where-is-cuda.html
First, create a new environment on top of base, named pytorch (no python version is specified, so the env falls back to base's interpreter):
```sh
conda create -n pytorch
conda activate pytorch
```
Then install torch 2.3.0 in the pytorch environment:
```sh
pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121
```
To verify the installation, create the following script and run it with python test_torch.py:
```py
import torch
print('PyTorch version: ' + str(torch.__version__))
print('CUDA available: ' + str(torch.cuda.is_available()))
print('cuDNN version: ' + str(torch.backends.cudnn.version()))
a = torch.tensor([0., 0.], dtype=torch.float32, device='cuda')
print('Tensor a =', a)
b = torch.randn(2, device='cuda')
print('Tensor b =', b)
c = a + b
print('Tensor c =', c)
import torchvision
print(torchvision.__version__)
```
The output:
```sh
PyTorch version: 2.3.0+cu121
CUDA available: True
cuDNN version: 8902
Tensor a = tensor([0., 0.], device='cuda:0')
Tensor b = tensor([-0.4807, -0.8651], device='cuda:0')
Tensor c = tensor([-0.4807, -0.8651], device='cuda:0')
0.18.0+cu121
```
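To see concretely that the pip wheel carries its own CUDA libraries, separate from the system install in sections 2 and 3, you can list the nvidia-* dependency packages it pulled into site-packages; a minimal sketch, assuming the cu121 wheel installed those nvidia-* dependencies:
```py
import importlib.util
from pathlib import Path

import torch

print(torch.version.cuda)  # the CUDA version the wheel was built against: 12.1
spec = importlib.util.find_spec("nvidia")  # namespace package holding the bundled libs
if spec and spec.submodule_search_locations:
    root = Path(next(iter(spec.submodule_search_locations)))
    for pkg in sorted(p.name for p in root.iterdir() if p.is_dir()):
        print(pkg)  # e.g. cublas, cudnn, cuda_runtime, ...
```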
5. Install onnxruntime-gpu
The official onnxruntime-gpu packages on PyPI only support CUDA 12.x from version 1.19.0 onward, but versions 1.19.0+ in turn don't support cuDNN 8.x. So you'd normally have to compile an onnxruntime-gpu that supports both CUDA 12.x and cuDNN 8.x yourself; luckily such builds already exist and can be used directly:
```sh
# Install onnxruntime-gpu 1.18.0 with CUDA 12 + cuDNN 8 support:
pip install -U onnxruntime-gpu==1.18.0 --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
```
To verify the installation, create the following script and run it with python test_onnxruntime.py:
```py
import onnxruntime as ort
import torch
use_gpu = True
# Check the available execution providers
available_providers = ort.get_available_providers()
print("Available providers:", available_providers)
# Choose the execution providers
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if use_gpu and 'CUDAExecutionProvider' in available_providers else ['CPUExecutionProvider']
print("Using providers:", providers)
# Load the model
session = ort.InferenceSession('./yolov8s.onnx', providers=providers)
# Get the input and output names
input_name = session.get_inputs()[0].name
output_names = [output.name for output in session.get_outputs()]
print(f"input_name: {input_name}")
print(f"output_names: {output_names}")
providers = [("CUDAExecutionProvider", {"device_id": torch.cuda.current_device(),
"user_compute_stream": str(torch.cuda.current_stream().cuda_stream)})]
sess_options = ort.SessionOptions()
sess = ort.InferenceSession("./yolov8s.onnx", sess_options=sess_options, providers=providers)
```
The output:
```sh
Available providers: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
Using providers: ['CUDAExecutionProvider', 'CPUExecutionProvider']
2024-12-15 22:59:47.961114142 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.2/m.0/cv2/ReduceMean'
2024-12-15 22:59:47.962762629 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.4/m.0/cv2/ReduceMean'
2024-12-15 22:59:47.963180939 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.6/m.0/cv2/ReduceMean'
2024-12-15 22:59:47.966209107 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.2/m.0/cv2/ReduceMean'
2024-12-15 22:59:47.966248419 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.4/m.0/cv2/ReduceMean'
2024-12-15 22:59:47.966477201 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.6/m.0/cv2/ReduceMean'
input_name: images
output_names: ['output0']
2024-12-15 22:59:48.085628154 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.2/m.0/cv2/ReduceMean'
2024-12-15 22:59:48.085909892 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.4/m.0/cv2/ReduceMean'
2024-12-15 22:59:48.086174739 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.6/m.0/cv2/ReduceMean'
2024-12-15 22:59:48.091461531 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.2/m.0/cv2/ReduceMean'
2024-12-15 22:59:48.091661505 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.4/m.0/cv2/ReduceMean'
2024-12-15 22:59:48.091925856 [W:onnxruntime:, constant_folding.cc:269 ApplyImpl] Could not find a CPU kernel and hence can't constant fold ReduceMean node '/model.6/m.0/cv2/ReduceMean'
```
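The script above creates sessions but never runs inference. A minimal end-to-end sketch on top of it, assuming the usual 1x3x640x640 float32 input of a YOLOv8 export (adjust to your model's actual input shape):
```py
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "./yolov8s.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
# Dummy input, just to exercise the CUDA execution path end to end.
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
outputs = session.run(None, {input_name: dummy})
print([o.shape for o in outputs])  # e.g. [(1, 84, 8400)] for yolov8s
```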
6. Install TensorRT 8.6.1
Download the Linux x86_64 Tar package of TensorRT matching your CUDA version. For whatever reason, all of the TensorRT 9.x installers have vanished from the official site, so I downloaded TensorRT 8.6.1.8, which supports CUDA 12.1: "TensorRT 8.6 GA for Linux x86_64 and CUDA 12.0 and 12.1 TAR Package". Upload it to the container, then install following the official guide: https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-tar
```sh
# Extract the archive
tar -xvf TensorRT-8.6.1.8.Linux.x86_64-gnu.cuda-12.1.cudnn8.9.tar.gz
# Add the absolute path of the TensorRT lib directory to LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=/path/to/TensorRT-8.6.1.8/lib:$LD_LIBRARY_PATH
# Install the Python TensorRT wheel (replace cp310 with your Python version)
cd TensorRT-8.6.1.8/python
python3 -m pip install tensorrt-*-cp310-none-linux_x86_64.whl
# (Optional) Install the TensorRT lean and dispatch runtime wheels:
python3 -m pip install tensorrt_lean-*-cp310-none-linux_x86_64.whl
python3 -m pip install tensorrt_dispatch-*-cp310-none-linux_x86_64.whl
```
To verify the installation, create the following script and run it with python test_tensorrt.py:
```py
import onnx
from onnx import shape_inference
import tensorrt as trt
import os
print(f"TensorRT version: {trt.__version__}")
# Load the ONNX model
onnx_model_path = "./yolov8s.onnx"
onnx_model = onnx.load(onnx_model_path)
# Run shape inference
inferred_model = shape_inference.infer_shapes(onnx_model)
onnx.save(inferred_model, "inferred_" + os.path.basename(onnx_model_path))
# Create the TensorRT builder and network definition
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
# Parse the ONNX model
parser = trt.OnnxParser(network, TRT_LOGGER)
with open("inferred_" + os.path.basename(onnx_model_path), 'rb') as model:
if not parser.parse(model.read()):
print('Failed to parse the ONNX file.')
for error in range(parser.num_errors):
print(parser.get_error(error))
exit()
# Build the TensorRT engine
config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1GB workspace (max_workspace_size is deprecated in TRT 8.x)
serialized_engine = builder.build_serialized_network(network, config)
# Save the serialized engine to a file
engine_file_path = "yolov8s.trt"
with open(engine_file_path, "wb") as f:
f.write(serialized_engine)
# Loading the engine and running inference is omitted here (see the sketch below)
```
If the script runs without errors and converts the onnx model to a tensorrt engine, the installation succeeded.
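As a follow-up, here is a minimal sketch that deserializes the engine saved above and lists its I/O tensors (it uses the tensor-name API available since TensorRT 8.5; yolov8s.trt is the file written by test_tensorrt.py):
```py
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
# Round-trip check: load the serialized engine back from disk.
with open("yolov8s.trt", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    print(name, engine.get_tensor_mode(name), engine.get_tensor_shape(name))
```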
7. Build OpenCV 4.9.0 with CUDA support from source
First, install the build tools; these packages are also prerequisites for opencv to run properly.
```sh
sudo apt install cmake
sudo apt install python3-numpy
sudo apt install libavcodec-dev libavformat-dev libswscale-dev
sudo apt install libgstreamer-plugins-base1.0-dev libgstreamer1.0-dev
sudo apt install libgtk-3-dev
sudo apt install libpng-dev libjpeg-dev libopenexr-dev libtiff-dev libwebp-dev
```
Then fetch the 4.9.0 opencv sources:
```sh
sudo apt install git
cd ~/Downloads
git clone --branch=4.9.0 --single-branch https://github.com/opencv/opencv.git
git clone --branch=4.9.0 --single-branch https://github.com/opencv/opencv_contrib.git
```
Build and install it against the current conda environment's Python:
```sh
cd opencv
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE \
      -D CMAKE_INSTALL_PREFIX=/usr/local \
      -D WITH_CUDA=ON \
      -D WITH_CUDNN=ON \
      -D WITH_CUBLAS=ON \
      -D WITH_TBB=ON \
      -D OPENCV_DNN_CUDA=ON \
      -D OPENCV_ENABLE_NONFREE=ON \
      -D CUDA_ARCH_BIN=8.9 \
      -D OPENCV_EXTRA_MODULES_PATH=$HOME/Downloads/opencv_contrib/modules \
      -D BUILD_EXAMPLES=OFF \
      -D HAVE_opencv_python3=ON \
      -D ENABLE_FAST_MATH=1 \
      -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.1 \
      -D CUDNN_INCLUDE_DIR=/usr/include/ \
      -D CUDNN_LIBRARY=/usr/lib/x86_64-linux-gnu/libcudnn.so.8 \
      -D PYTHON_DEFAULT_EXECUTABLE=$(python3 -c "import sys; print(sys.executable)") \
      -D PYTHON3_EXECUTABLE=$(python3 -c "import sys; print(sys.executable)") \
      -D PYTHON3_NUMPY_INCLUDE_DIRS=$(python3 -c "import numpy; print(numpy.get_include())") \
      -D PYTHON3_PACKAGES_PATH=$(python3 -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())") \
      ..
```
Note: set CUDA_ARCH_BIN to your GPU's compute capability (e.g. CUDA_ARCH_BIN=8.9; see the snippet below if you're unsure); set CUDNN_INCLUDE_DIR and CUDNN_LIBRARY to your cudnn install paths (e.g. CUDNN_INCLUDE_DIR=/usr/include/ and CUDNN_LIBRARY=/usr/lib/x86_64-linux-gnu/libcudnn.so.8); and set OPENCV_EXTRA_MODULES_PATH to your opencv_contrib source path (e.g. OPENCV_EXTRA_MODULES_PATH=$HOME/Downloads/opencv_contrib/modules).
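Since PyTorch is already installed (section 4), it can report the compute capability directly; a small helper, not part of the build itself:
```py
import torch

# Compute capability of GPU 0, e.g. (8, 9) on an RTX 4090.
major, minor = torch.cuda.get_device_capability(0)
print(f"CUDA_ARCH_BIN={major}.{minor}")
```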
Finally, run:
```sh
make -j$(nproc)
```
Wait for the build to finish, then run:
```sh
sudo make install
sudo ldconfig
```
To verify the installation, create the following script and run it with python test_cv.py:
```py
import cv2
print(cv2.__version__)
print(cv2.cuda.getCudaEnabledDeviceCount())
print(cv2.getBuildInformation())
```
If no errors are reported, the installation succeeded.
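Beyond printing the build info, a quick functional check is to round-trip an image through the GPU; a minimal sketch, assuming the contrib cudaimgproc module was built in:
```py
import cv2
import numpy as np

# Upload a random image, convert to grayscale on the GPU, then download the result.
img = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
gpu = cv2.cuda_GpuMat()
gpu.upload(img)
gray = cv2.cuda.cvtColor(gpu, cv2.COLOR_BGR2GRAY)
print(gray.download().shape)  # expect (480, 640)
```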
Note: since my pytorch environment also uses the Python 3.10.14 from the base environment, the CUDA-enabled opencv 4.9.0 is directly usable from the pytorch environment. The Python package of this CUDA-enabled opencv 4.9.0 lives at /usr/local/lib/python3.10/dist-packages/cv2; if you use a different Python version, create a symlink into that version's site-packages yourself:
```sh
sudo ln -s /usr/local/lib/python3.10/dist-packages/cv2 /root/miniconda3/lib/python3.xx/site-packages/cv2
```
Also, don't uninstall the apt packages installed earlier; otherwise opencv will fail to import and won't run.