- Download the Kaldi image that matches your CUDA version. The 22.01 image contains:
Ubuntu 20.04 including Python 3.8
NVIDIA CUDA 11.6.0
cuBLAS 11.8.1.74
NVIDIA cuDNN 8.3.2.44
NVIDIA NCCL 2.11.4 (optimized for NVLink™)
rdma-core 36.0
NVIDIA HPC-X 2.10
OpenMPI 4.1.2rc4+
OpenUCX 1.12.0
GDRCopy 2.3
Nsight Systems 2021.5.2.53
TensorRT 8.2.2
SHARP 2.5
DALI 1.9
- Pull command: docker pull nvcr.io/nvidia/kaldi:22.01-py3
For how to find the right image, see my earlier article on Docker.
Start a container from the image:
docker run --gpus '"device=all"' -itd \
  -v /home/work/wang:/home/work/wang \
  -v /opt/wfs1/aivoice:/opt/wfs1/aivoice \
  --net host \
  --name wyr_tf_cuda11.6 \
  --shm-size=8g \
  nvcr.io/nvidia/kaldi:22.01-py3 bash
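The flags of the `docker run` command above can be easier to keep straight when they are assembled programmatically. The following is a minimal sketch, assuming Python 3 on the host. `build_docker_run` is a hypothetical helper, not part of Docker, and `--gpus all` is used here as the short form that I assume is equivalent to `device=all`.

```python
import shlex


def build_docker_run(image, name, mounts, shm_size="8g", gpus="all"):
    """Assemble a `docker run` argument list (hypothetical helper)."""
    argv = ["docker", "run", "--gpus", gpus, "-itd"]
    for path in mounts:
        # Mount each host directory at the same path inside the container.
        argv += ["-v", f"{path}:{path}"]
    argv += ["--net", "host", "--name", name, f"--shm-size={shm_size}"]
    argv += [image, "bash"]
    return argv


argv = build_docker_run(
    image="nvcr.io/nvidia/kaldi:22.01-py3",
    name="wyr_tf_cuda11.6",
    mounts=["/home/work/wang", "/opt/wfs1/aivoice"],
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in argv))
```

The function only builds the list. Passing `argv` to `subprocess.run` would actually start the container.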
- Configure pip and conda
mkdir -p ~/.pip
vim ~/.pip/pip.conf
Add the following:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
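A `trusted-host` entry only affects the host it names, so it is worth checking that it names the same host as `index-url`. The following is a minimal sketch using only the standard library. `check_pip_conf` is a hypothetical helper, and the config text is passed in as a string rather than read from `~/.pip/pip.conf`.

```python
import configparser
from urllib.parse import urlparse

PIP_CONF = """\
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
"""


def check_pip_conf(text):
    """Return (index_host, trusted_host, hosts_match) for a pip.conf string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Extract only the host name from the index URL.
    index_host = urlparse(parser.get("global", "index-url")).hostname
    trusted_host = parser.get("install", "trusted-host", fallback=None)
    return index_host, trusted_host, index_host == trusted_host


print(check_pip_conf(PIP_CONF))
```

A config whose `trusted-host` names a different mirror than `index-url` makes the third element `False`.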
- Configure the conda mirror
vim ~/.condarc
channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
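To make the effect of this file concrete, the sketch below mimics how a channel name is turned into a URL: `defaults` expands to the `default_channels` list, and a name listed under `custom_channels` is appended to its base URL. This is a simplified model of conda's behavior, not conda's own code. `resolve_channel` is a hypothetical helper, and only two of the custom channels are copied in.

```python
MIRROR = "https://mirrors.tuna.tsinghua.edu.cn/anaconda"

DEFAULT_CHANNELS = [
    f"{MIRROR}/pkgs/main",
    f"{MIRROR}/pkgs/r",
    f"{MIRROR}/pkgs/msys2",
]

CUSTOM_CHANNELS = {
    "conda-forge": f"{MIRROR}/cloud",
    "pytorch": f"{MIRROR}/cloud",
}


def resolve_channel(name):
    """Return the list of base URLs a channel name maps to (simplified)."""
    if name == "defaults":
        return list(DEFAULT_CHANNELS)
    base = CUSTOM_CHANNELS.get(name)
    if base is None:
        # Channels not listed fall back to the stock channel alias.
        return [f"https://conda.anaconda.org/{name}"]
    # The channel name is appended to the configured base URL.
    return [f"{base.rstrip('/')}/{name}"]


for channel in ("defaults", "conda-forge", "some-other-channel"):
    print(channel, "->", resolve_channel(channel))
```

With this model, `conda install -c conda-forge ...` would fetch from the mirror's `cloud/conda-forge` path instead of anaconda.org.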
- Install tensorflow-gpu==1.14.0
First attempt:
pip install tensorflow-gpu==1.14.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
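pip cannot find a matching version, because the Python in the image is 3.8.
tensorflow 1.14 requires Python 3.7 or older; on Python 3.8 only TensorFlow 2 releases are available.
So first create a Python 3.7 environment: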
conda create -n audio python=3.7
conda activate audio
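Before running pip again, the interpreter version of the active environment can be checked against this constraint. The following is a minimal sketch. The upper bound of 3.7 is taken from the text above, the check covers Python 3 only, and `can_install_tf114` is a hypothetical helper.

```python
import sys

# Assumption from the text above: tensorflow-gpu 1.14.0 has no wheels
# for Python versions newer than 3.7.
TF114_MAX_PYTHON = (3, 7)


def can_install_tf114(version=None):
    """Return True if a (major, minor) Python 3 version can install TF 1.14."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return major_minor[0] == 3 and major_minor <= TF114_MAX_PYTHON


print("current interpreter ok:", can_install_tf114())
print("python 3.8 ok:", can_install_tf114((3, 8)))
```

Run inside the `audio` environment, the first line should report `True`; in the image's base Python 3.8 it reports `False`.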
Second attempt:
pip install tensorflow-gpu==1.14.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
The installation succeeds, but import tensorflow fails.
Error 1:
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
- Downgrade the protobuf package to 3.20.x or lower.
- Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
Fix:
pip install protobuf==3.19.0
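The error message says that protobuf 3.20.x or lower is acceptable, so the pin can be sanity-checked with a small version comparison. The following is a minimal sketch. `protobuf_is_compatible` is a hypothetical helper and assumes a plain numeric version string such as `3.19.0`.

```python
def protobuf_is_compatible(version):
    """Return True if a protobuf version string is 3.20.x or lower."""
    # Compare only (major, minor); assumes both parts are plain integers.
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor <= (3, 20)


for candidate in ("3.19.0", "3.20.3", "4.21.0"):
    print(candidate, protobuf_is_compatible(candidate))
```

Inside the environment, the installed version is available as `google.protobuf.__version__` and can be passed to the same function.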
Error 2 (these are FutureWarning messages from numpy, not an import failure):
/home/work/wangyaru05/anaconda3/envs/audio/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/work/wangyaru05/anaconda3/envs/audio/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/work/wangyaru05/anaconda3/envs/audio/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/work/wangyaru05/anaconda3/envs/audio/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
Fix:
pip install numpy==1.16.4
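Downgrading numpy removes the messages at the source. A different technique, not the one used in this post, is to silence `FutureWarning` around the import with the standard `warnings` module. The sketch below shows the mechanism with a stand-in function in place of `import tensorflow`, so it runs without TensorFlow installed. `noisy_import` and `import_quietly` are hypothetical names.

```python
import warnings


def noisy_import():
    """Stand-in for `import tensorflow`, which emits numpy FutureWarnings."""
    warnings.warn(
        "Passing (type, 1) or '1type' as a synonym of type is deprecated",
        FutureWarning,
    )
    return "imported"


def import_quietly(importer):
    """Run `importer` with FutureWarning suppressed; return (result, caught)."""
    with warnings.catch_warnings(record=True) as caught:
        # Ignore FutureWarning only; other categories are still recorded.
        warnings.simplefilter("ignore", FutureWarning)
        result = importer()
    return result, caught


result, caught = import_quietly(noisy_import)
print(result, len(caught))
```

In a real script, the same `catch_warnings` block would wrap the `import tensorflow` statement itself. This hides the messages but does not address any real incompatibility with newer numpy, which is why the post pins numpy instead.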
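- Installing other packages
While running the project I found that some packages were missing, such as resampy and pandas. Installing each one with pip on its own pulls in the latest release, which uninstalls numpy 1.16.4 and installs a newer numpy, and then tensorflow breaks again. So versions of resampy and pandas that work with numpy 1.16.4 are needed. I could not find this documented anywhere online, and finding the versions by trial and error was painful. Eventually I found that the following works: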
pip install numpy==1.16.4 resampy numba scipy pandas h5py
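Listing everything in one command makes pip pick versions of resampy, numba, scipy and the other packages that are compatible with the pinned numpy, with no manual version hunting.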
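A related technique, different from the single command above, is a pip constraints file passed with `-c`. It keeps the numpy pin in force for later `pip install` runs as well. The following is a minimal sketch that only builds the file content and the command. `build_pip_install` and the file name `constraints.txt` are hypothetical.

```python
def build_pip_install(packages, pins, constraints_path="constraints.txt"):
    """Return (constraints_text, argv) for a constrained pip install."""
    # One "name==version" line per pinned package, sorted for stable output.
    lines = [f"{name}=={version}" for name, version in sorted(pins.items())]
    constraints_text = "\n".join(lines) + "\n"
    argv = ["pip", "install", "-c", constraints_path] + list(packages)
    return constraints_text, argv


constraints_text, argv = build_pip_install(
    packages=["resampy", "numba", "scipy", "pandas", "h5py"],
    pins={"numpy": "1.16.4"},
)
print(constraints_text, end="")
print(" ".join(argv))
```

Writing `constraints_text` to `constraints.txt` and running the printed command should give the same result as the one-line install, with the pin kept in a reusable file.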