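I spent the weekend at home tinkering with this, and here is a record of the process.
Prerequisites
- Visual Studio C++ build tools
- A CUDA version compatible with the local GPU
- A cuDNN version matching that CUDA
- Python 3
- NumPy
- The OpenCV source code, plus the opencv-contrib module source of the matching version
- CMake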
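Before starting, it can help to confirm which of these prerequisites are already visible from the current shell. The following is a minimal sketch, not part of the original procedure: `find_tools` and `has_module` are hypothetical helper names, and the tool names assume the usual executables (`cl` is normally only on PATH inside a Visual Studio developer prompt).

```python
import importlib.util
import shutil

# Command-line tools this guide relies on. The names are assumptions about
# how the executables appear on PATH.
TOOLS = ["cmake", "nvcc", "cl"]


def find_tools(names):
    """Map each tool name to its resolved path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def has_module(name):
    """Return True if the named Python module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    for tool, path in find_tools(TOOLS).items():
        print(f"{tool:6s} {path or 'NOT FOUND'}")
    print("numpy ", "installed" if has_module("numpy") else "NOT FOUND")
```

A missing entry only means the tool is not on PATH for this shell; it may still be installed.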
Visual Studio
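Download Visual Studio (I have VS2022 on this machine) and use the Visual Studio Installer to install the C++ toolset (the C++ workload). The detailed installation process is described here.
CUDA and cuDNN
Download and install the latest CUDA Toolkit, making sure it is compatible with the local GPU, or check the local paths to see whether a CUDA Toolkit is already installed. On my machine, for example, CUDA 12.5 is installed at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5. Likewise, sign in with an NVIDIA account to download cuDNN, then copy the contents of the cuDNN package (for example C:\Program Files\NVIDIA\CUDNN\vX.X) into the bin, include and lib/x64 folders of the CUDA Toolkit directory. My machine has cuDNN 9.2.1.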
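To confirm that the cuDNN files actually landed in the CUDA Toolkit folders, a quick listing is enough. This is a rough sketch under stated assumptions: `find_cudnn_files` is a hypothetical helper, and it assumes the cuDNN file names start with `cudnn`.

```python
from pathlib import Path

# Sub-folders of the CUDA Toolkit directory that receive the cuDNN files.
CUDNN_SUBDIRS = ("bin", "include", "lib/x64")


def find_cudnn_files(cuda_root):
    """Return {subdir: sorted cuDNN file names found there} for a CUDA Toolkit root."""
    root = Path(cuda_root)
    found = {}
    for sub in CUDNN_SUBDIRS:
        folder = root / sub
        # A missing folder simply yields an empty list.
        found[sub] = sorted(p.name for p in folder.glob("cudnn*")) if folder.is_dir() else []
    return found


if __name__ == "__main__":
    # Path taken from the example above; adjust it to the local install.
    cuda_root = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5"
    for sub, names in find_cudnn_files(cuda_root).items():
        print(f"{sub:8s} {names or 'no cuDNN files found'}")
```

An empty list for any of the three folders suggests the copy step was skipped or went to a different Toolkit version.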
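Python, NumPy and pip
Install a Python 3.x release. Because the Python bindings use NumPy arrays in place of cv::Mat, NumPy is required as well: make sure it is installed (pip install numpy), and make sure every pip-installed OpenCV package, including opencv-python and opencv-contrib-python, has been removed completely.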
pip uninstall opencv-python
pip uninstall opencv-contrib-python
Then delete the cv2 directory: YOUR_PYTHON_PATH/Lib/site-packages/cv2
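To double-check that no pip-installed OpenCV wheel is left behind, the installed distributions can be listed from Python. This is only a sketch: `leftover_opencv` is a hypothetical helper, and the set of wheel names (including the headless variants, which the steps above do not mention) is my assumption about which packages ship a `cv2` module.

```python
from importlib import metadata

# pip distributions that ship their own cv2 module and would shadow a source build.
OPENCV_WHEELS = {
    "opencv-python",
    "opencv-contrib-python",
    "opencv-python-headless",
    "opencv-contrib-python-headless",
}


def leftover_opencv(installed_names):
    """Return the sorted subset of installed distribution names that are OpenCV wheels."""
    normalized = {name.lower().replace("_", "-") for name in installed_names}
    return sorted(normalized & OPENCV_WHEELS)


if __name__ == "__main__":
    names = [d.metadata["Name"] for d in metadata.distributions() if d.metadata["Name"]]
    leftovers = leftover_opencv(names)
    print("Still installed:", leftovers if leftovers else "none")
```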
OpenCV
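Download the sources from the GitHub repositories, or clone them locally. You need both OpenCV and the opencv-contrib release of the matching version.
CMake configuration
Create a build directory for opencv and opencv-contrib, then configure with CMake. For the available options, see the official reference: OpenCV: OpenCV configuration options reference.
This takes a long time. Along the way CMake downloads the third-party content referenced in the 3rdparty folder, and some of those downloads may fail and have to be fetched by hand.
In this example Python is selected as well: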
General configuration for OpenCV 4.10.0 =====================================
Version control: unknown
Platform:
Timestamp: 2024-07-20T06:31:04Z
Host: Windows 10.0.22631 AMD64
CMake: 3.29.0
CMake generator: Visual Studio 17 2022
CMake build tool: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/MSBuild/Current/Bin/amd64/MSBuild.exe
MSVC: 1940
Configuration: Debug Release
CPU/HW features:
Baseline: SSE SSE2 SSE3
requested: SSE3
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
SSE4_1 (18 files): + SSSE3 SSE4_1
SSE4_2 (2 files): + SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (9 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
AVX2 (38 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
AVX512_SKX (8 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
C/C++:
Built as dynamic libs?: YES
C++ standard: 11
C++ Compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe (ver 19.40.33812.0)
C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /O2 /Ob2 /DNDEBUG
C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /Zi /Ob0 /Od /RTC1
C Compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe
C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /O2 /Ob2 /DNDEBUG
C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /Zi /Ob0 /Od /RTC1
Linker flags (Release): /machine:x64 /INCREMENTAL:NO
Linker flags (Debug): /machine:x64 /debug /INCREMENTAL
ccache: NO
Precompiled headers: YES
Extra dependencies:
3rdparty dependencies:
OpenCV modules:
To be built: calib3d core dnn features2d flann gapi highgui imgcodecs imgproc java ml objdetect photo python3 stitching ts video videoio
Disabled: world
Disabled by dependency: -
Unavailable: python2
Applications: tests perf_tests apps
Documentation: NO
Non-free algorithms: NO
Windows RT support: NO
GUI: WIN32UI
Win32 UI: YES
VTK support: NO
Media I/O:
ZLib: build (ver 1.3.1)
JPEG: build-libjpeg-turbo (ver 3.0.3-70)
SIMD Support Request: YES
SIMD Support: NO
WEBP: build (ver encoder: 0x020f)
PNG: build (ver 1.6.43)
SIMD Support Request: YES
SIMD Support: YES (Intel SSE)
TIFF: build (ver 42 - 4.6.0)
JPEG 2000: build (ver 2.5.0)
OpenEXR: build (ver 2.3.0)
HDR: YES
SUNRASTER: YES
PXM: YES
PFM: YES
Video I/O:
DC1394: NO
FFMPEG: YES (prebuilt binaries)
avcodec: YES (58.134.100)
avformat: YES (58.76.100)
avutil: YES (56.70.100)
swscale: YES (5.9.100)
avresample: YES (4.0.0)
GStreamer: NO
DirectShow: YES
Media Foundation: YES
DXVA: YES
Parallel framework: Concurrency
Trace: YES (with Intel ITT)
Other third-party libraries:
Intel IPP: 2021.11.0 [2021.11.0]
at: D:/Data/source/collection/OpenCV/4100/build/3rdparty/ippicv/ippicv_win/icv
Intel IPP IW: sources (2021.11.0)
at: D:/Data/source/collection/OpenCV/4100/build/3rdparty/ippicv/ippicv_win/iw
Lapack: NO
Eigen: NO
Custom HAL: NO
Protobuf: build (3.19.1)
Flatbuffers: builtin/3rdparty (23.5.9)
OpenCL: YES (NVD3D11)
Include path: D:/Data/source/collection/OpenCV/4100/opencv-4.10.0/3rdparty/include/opencl/1.2
Link libraries: Dynamic load
Python 3:
Interpreter: D:/miniconda3/python.exe (ver 3.11.7)
Libraries: D:/miniconda3/libs/python311.lib (ver 3.11.7)
Limited API: NO
numpy: D:/miniconda3/Lib/site-packages/numpy/core/include (ver 1.26.1)
install path: D:/miniconda3/Lib/site-packages/cv2/python-3.11
Python (for build): D:/miniconda3/python.exe
Java:
ant: NO
Java: YES (ver 1.8.0.371)
JNI: D:/Program Files/jdk-1.8/include D:/Program Files/jdk-1.8/include/win32 D:/Program Files/jdk-1.8/include
Java wrappers: YES (JAVA)
Java tests: NO
Install to: D:/Data/source/collection/OpenCV/4100/build/install
-----------------------------------------------------------------
Configuring done (136.2s)
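Once configuration has finished, change the following two parameters. CUDA_ARCH_BIN is filled in automatically by current versions once the CUDA Toolkit has been found.
I had built VTK earlier, so I also added the VTK path (this is the directory VTK's own CMake build generated):
Click the "Generate" button to generate the Visual Studio solution.
As the screenshot shows, the Java and Python binding projects are both present.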
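The same configuration can also be driven from the command line instead of the CMake GUI. The sketch below makes one loud assumption: that the two parameters changed after the first configure pass are WITH_CUDA and OPENCV_EXTRA_MODULES_PATH, which the text above does not name. `configure_command` is a hypothetical helper, and the paths are the ones that appear in the configuration summary above.

```python
import subprocess
import sys


def configure_command(source_dir, build_dir, contrib_modules, python_exe):
    """Build the argument list for a CUDA-enabled OpenCV configure step."""
    options = {
        "WITH_CUDA": "ON",                              # assumed first parameter
        "OPENCV_EXTRA_MODULES_PATH": contrib_modules,   # assumed second parameter
        "BUILD_opencv_python3": "ON",
        "PYTHON3_EXECUTABLE": python_exe,
    }
    cmd = ["cmake", "-S", source_dir, "-B", build_dir,
           "-G", "Visual Studio 17 2022", "-A", "x64"]
    cmd += [f"-D{key}={value}" for key, value in options.items()]
    return cmd


if __name__ == "__main__":
    cmd = configure_command(
        "D:/Data/source/collection/OpenCV/4100/opencv-4.10.0",
        "D:/Data/source/collection/OpenCV/4100/build",
        "D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules",
        "D:/miniconda3/python.exe",
    )
    print(" ".join(cmd))
    if "--run" in sys.argv:  # only invoke CMake when explicitly asked to
        subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to review the options before committing to a configure run that takes several minutes.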
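Building in Visual Studio
Build the ALL_BUILD project to compile everything, then build the INSTALL project to install.
Building and installing means another long wait...
Note: to install directly into the Python environment, open Visual Studio as administrator before building the INSTALL project.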
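The two Visual Studio steps have a command-line equivalent through `cmake --build`. This is a sketch rather than the procedure used above: `build_commands` is a hypothetical helper, and the Release configuration is my choice. If the install target writes into the Python environment, the shell presumably needs the same administrator rights as Visual Studio does.

```python
import subprocess
import sys


def build_commands(build_dir, config="Release"):
    """Return the cmake invocations for the ALL_BUILD and INSTALL targets, in order."""
    return [
        ["cmake", "--build", build_dir, "--config", config, "--target", target]
        for target in ("ALL_BUILD", "INSTALL")
    ]


if __name__ == "__main__":
    for cmd in build_commands("D:/Data/source/collection/OpenCV/4100/build"):
        print(" ".join(cmd))
        if "--run" in sys.argv:  # only build when explicitly asked to
            subprocess.run(cmd, check=True)
```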
Installation result and usage
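// Link against the core library produced by this build (MSVC-specific pragma).
#pragma comment(lib, "opencv_core4100.lib")
#include <iostream>
#include <opencv2/core/cuda.hpp>

int main()
{
    // The count is 0 when OpenCV was compiled without CUDA support.
    int deviceCount = cv::cuda::getCudaEnabledDeviceCount();
    std::cout << "CUDA Device Number: " << deviceCount << std::endl;
    // Only query device 0 when at least one CUDA device was found.
    if (deviceCount > 0)
        cv::cuda::printCudaDeviceInfo(0);
    return 0;
}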
Python 3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> print(cv2.__version__)
4.10.0
>>> print(cv2.cuda.getCudaEnabledDeviceCount())
1
>>> cv2.cuda.printCudaDeviceInfo(0)
*** CUDA Device Query (Runtime API) version (CUDART static linking) ***
Device count: 1
Device 0: "NVIDIA GeForce RTX 3070 Ti Laptop GPU"
CUDA Driver Version / Runtime Version 12.50 / 12.50
CUDA Capability Major/Minor version number: 8.6
Total amount of global memory: 8192 MBytes (8589410304 bytes)
GPU Clock Speed: 1.41 GHz
Max Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072,65536), 3D=(16384,16384,16384)
Max Layered Texture Size (dim) x layers 1D=(32768) x 2048, 2D=(32768,32768) x 2048
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 1 / 0
Compute Mode:
Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.50, CUDA Runtime Version = 12.50, NumDevs = 1
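Besides querying the device, the installed module can report how it was built through `cv2.getBuildInformation()`. The sketch below pulls single entries out of that summary. `build_flag` is a hypothetical helper, and the labels "NVIDIA CUDA" and "cuDNN" are my assumption about how a CUDA-enabled build names those entries.

```python
import re


def build_flag(build_info, label):
    """Return the value printed after 'label:' in OpenCV's build summary, or None."""
    pattern = rf"^[ \t]*{re.escape(label)}:[ \t]*(.*?)[ \t]*$"
    match = re.search(pattern, build_info, re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not importable in this environment")
    else:
        info = cv2.getBuildInformation()
        for label in ("NVIDIA CUDA", "cuDNN"):
            print(label, "->", build_flag(info, label))
```

A result of `None` means the entry is absent from the summary, which would suggest the CUDA options did not make it into the build.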
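Problems encountered
- Interference from the Python version installed with Visual Studio
When I installed VS2022 I had also installed the Python development workload (Python 3.9). As a result CMake stayed locked to that environment: even when pointed at the Python environment in conda, its Libraries entry still pointed to 3.9. Uninstalling Python from Visual Studio fixes this.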
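One way to see which paths CMake ought to pick up is to print them from the intended interpreter and compare them with the Python 3 section of the configuration summary. This is a sketch, not the fix used above: `python_cmake_hints` is a hypothetical helper, and the PYTHON3_LIBRARY path assumes the standard Windows layout of `<prefix>/libs/pythonXY.lib`.

```python
import os
import sys
import sysconfig
from pathlib import Path


def python_cmake_hints():
    """Collect the running interpreter's paths as OpenCV CMake cache entries."""
    hints = {
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON3_INCLUDE_DIR": sysconfig.get_paths()["include"],
    }
    if os.name == "nt":
        # Assumption: the import library lives at <prefix>/libs/pythonXY.lib.
        lib_name = f"python{sys.version_info.major}{sys.version_info.minor}.lib"
        hints["PYTHON3_LIBRARY"] = str(Path(sys.base_prefix) / "libs" / lib_name)
    try:
        import numpy
        hints["PYTHON3_NUMPY_INCLUDE_DIRS"] = numpy.get_include()
    except ImportError:
        pass  # NumPy missing: leave the entry out rather than guess.
    return hints


if __name__ == "__main__":
    for key, value in python_cmake_hints().items():
        print(f"-D{key}={value}")
```

Run it with the conda interpreter that should host cv2. If the Libraries line in the CMake summary shows a different version, CMake has latched onto another Python.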
- Warnings caused by a missing NVIDIA Video Codec SDK
CMake Warning at D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules/cudacodec/CMakeLists.txt:26 (message):
cudacodec::VideoReader requires Nvidia Video Codec SDK. Please resolve
dependency or disable WITH_NVCUVID=OFF
CMake Warning at D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules/cudacodec/CMakeLists.txt:30 (message):
cudacodec::VideoWriter requires Nvidia Video Codec SDK. Please resolve
dependency or disable WITH_NVCUVENC=OFF
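Download the NVIDIA Video Codec SDK and copy its lib files and header files (the interface directory) into the CUDA Toolkit's lib/x64 and include directories respectively. That resolves the warnings.
- CMake error caused by the CUDA version: CMake Error at cmake/OpenCVDetectCUDAUtils.cmake:297 (list): list GET given empty list
My Visual Studio is 17.10.4, and building against CUDA 12.2 triggers this error because, according to the official documentation, CUDA Toolkit 12.2 only supports Visual Studio up to 17.0, as listed in: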
CUDA Installation Guide Microsoft Windows (nvidia.com)
Switching to CUDA 12.5 resolves the problem.
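When several CUDA Toolkits are installed side by side, it is worth checking which release the `nvcc` on PATH belongs to before re-running CMake. A minimal sketch, assuming the usual "release X.Y" wording in the output of `nvcc --version`; `parse_nvcc_release` is a hypothetical helper.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc is not on PATH")
    else:
        out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
        print("CUDA Toolkit release:", parse_nvcc_release(out))
```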
- Missing-DLL error when importing cv2 in Python
ImportError: DLL load failed while importing cv2: The specified module could not be found.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\miniconda3\Lib\site-packages\cv2\__init__.py", line 181, in <module>
bootstrap()
File "D:\miniconda3\Lib\site-packages\cv2\__init__.py", line 153, in bootstrap
native_module = importlib.import_module("cv2")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\Lib\importlib\__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
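The message says a DLL is missing. I ran Process Monitor with a filter on python.exe, reproduced the error, and tracked down the cause:
It turned out to be the fault of my self-compiled VTK. My own VTK, so I had to sort it out myself, tears and all. I added its path to cv2's config.py, but that caused a different error (my own fault again: I had put the Debug and Release builds of VTK in the same place). Extracting just the Release build and then either adding it to the environment variables, adding it to cv2's config.py, or copying it straight into the site-packages/cv2/python-3.11 directory did the trick. Problem solved.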
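Another option, not one of the fixes tried above, is to register the extra DLL folder from Python before importing cv2. Since Python 3.8 on Windows, `os.add_dll_directory` adds a directory to the search path used for dependent DLLs. In this sketch `register_dll_dirs` is a hypothetical helper and the VTK path is a placeholder, not the actual location on my machine.

```python
import os
from pathlib import Path


def register_dll_dirs(candidates):
    """Register each existing directory as a DLL search path; return those registered."""
    registered = []
    for candidate in candidates:
        path = Path(candidate)
        if not path.is_dir():
            continue  # skip folders that do not exist on this machine
        resolved = str(path.resolve())
        if hasattr(os, "add_dll_directory"):  # Windows only, Python 3.8+
            os.add_dll_directory(resolved)
        registered.append(resolved)
    return registered


if __name__ == "__main__":
    # Placeholder path: point this at the folder holding the Release VTK DLLs.
    print(register_dll_dirs([r"D:\path\to\vtk\release\bin"]))
    try:
        import cv2
        print("cv2", cv2.__version__)
    except ImportError as exc:
        print("cv2 still fails to import:", exc)
```

Registering only the Release folder avoids the Debug and Release mix-up described above.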
References
Quick and Easy OpenCV Python Installation with Cuda GPU in Under 10 Minutes (youtube.com)
Unable to enable Cudacodec VideoReader · Issue #11220 · opencv/opencv · GitHub
OpenCV: OpenCV configuration options reference