TensorFlow Environment Installation and Configuration

Download a Kaldi image that matches the CUDA version

Ubuntu 20.04 including Python 3.8

NVIDIA CUDA 11.6.0

cuBLAS 11.8.1.74

NVIDIA cuDNN 8.3.2.44

NVIDIA NCCL 2.11.4 (optimized for NVLink™)

rdma-core 36.0

NVIDIA HPC-X 2.10

OpenMPI 4.1.2rc4+

OpenUCX 1.12.0

GDRCopy 2.3

Nsight Systems 2021.5.2.53

TensorRT 8.2.2

SHARP 2.5

DALI 1.9

Download command: docker pull nvcr.io/nvidia/kaldi:22.01-py3
For how to find the right image, see the earlier article on Docker.

docker run --gpus '"device=all"' -itd \
  -v /home/work/wang:/home/work/wang \
  -v /opt/wfs1/aivoice:/opt/wfs1/aivoice \
  --net host \
  --name wyr_tf_cuda11.6 \
  --shm-size=8g \
  nvcr.io/nvidia/kaldi:22.01-py3 bash

Configure pip and conda

vim ~/.pip/pip.conf

Add the following:

[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple

[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
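
pip reads pip.conf as an INI file, so section names go in square brackets and the file can be sanity-checked with Python's configparser. The sketch below is only illustrative: the helper name check_pip_conf and the sample text PIP_CONF are hypothetical, not part of pip. It reports whether trusted-host names the same host as index-url, since a trusted-host entry for a different mirror has no effect on the configured index.

```python
import configparser
from urllib.parse import urlparse

# Sample pip.conf content (an assumption for illustration).
PIP_CONF = """
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple

[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
"""


def check_pip_conf(text):
    """Return (index_host, trusted_host, hosts_match) for a pip.conf string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    index_url = parser.get("global", "index-url")
    trusted_host = parser.get("install", "trusted-host", fallback=None)
    index_host = urlparse(index_url).hostname
    return index_host, trusted_host, index_host == trusted_host


if __name__ == "__main__":
    print(check_pip_conf(PIP_CONF))
```

To check the real file, read ~/.pip/pip.conf into a string and pass it to the same function.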

Configure the conda mirror

vim ~/.condarc


channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud

Install tensorflow-gpu==1.14.0

First attempt:

pip install tensorflow-gpu==1.14.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

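pip cannot find a matching version. The cause is that my Python is 3.8.

TensorFlow 1.14 has no release for Python 3.8. The newest Python it supports is 3.7, while Python 3.8 is supported only by TensorFlow 2 (2.2 and later).

So the first step is to create a Python 3.7 environment.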

conda create -n audio python=3.7

conda activate audio
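
After activating the environment, it can help to confirm that the active interpreter is no longer too new before retrying the install. The following is a minimal sketch; the helper name too_new_for_tf_1_14 is hypothetical, and the only rule it encodes is the one stated above, that Python 3.8 and later cannot install tensorflow-gpu 1.14.0. It does not check the lower bound of supported versions.

```python
import sys

# Per the text above, tensorflow-gpu 1.14.0 cannot be installed on Python 3.8 or later.
FIRST_UNSUPPORTED = (3, 8)


def too_new_for_tf_1_14(version_info=None):
    """Return True if the (major, minor, ...) version is 3.8 or newer."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= FIRST_UNSUPPORTED


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("too new for TensorFlow 1.14:", too_new_for_tf_1_14())
```

Run inside the audio environment, the second line should report False.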

Second attempt:

pip install tensorflow-gpu==1.14.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

The installation succeeds, but importing TensorFlow fails.

Error 1:

TypeError: Descriptors cannot not be created directly.

If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.

If you cannot immediately regenerate your protos, some other possible workarounds are:

Downgrade the protobuf package to 3.20.x or lower.
Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

Fix:

pip install protobuf==3.19.0
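
To confirm that the downgrade took effect, the installed protobuf version can be compared with the limit quoted in the error message ("3.20.x or lower"). This is a rough sketch; the helper name protobuf_needs_downgrade is hypothetical, and it looks only at the major and minor version numbers.

```python
def protobuf_needs_downgrade(version):
    """Return True if `version` (e.g. "4.21.1") is newer than the 3.20.x series."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > (3, 20)


if __name__ == "__main__":
    try:
        import google.protobuf

        installed = google.protobuf.__version__
        print(installed, "needs downgrade:", protobuf_needs_downgrade(installed))
    except ImportError:
        print("protobuf is not installed")
```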

Error 2:

/home/work/wangyaru05/anaconda3/envs/audio/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

_np_qint8 = np.dtype([("qint8", np.int8, 1)])

/home/work/wangyaru05/anaconda3/envs/audio/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

_np_quint8 = np.dtype([("quint8", np.uint8, 1)])

/home/work/wangyaru05/anaconda3/envs/audio/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

_np_qint16 = np.dtype([("qint16", np.int16, 1)])

/home/work/wangyaru05/anaconda3/envs/audio/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

_np_quint16 = np.dtype([("quint16", np.uint16, 1)])

Fix:

pip install numpy==1.16.4
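
The messages above are FutureWarning notices, which Python prints without stopping the import. Pinning numpy removes them at the source. Where pinning is not possible, one alternative is to hide them during the import, sketched below with the standard warnings module. The helper name import_quietly is hypothetical, and this approach assumes the rest of the project works with the newer numpy, which is not verified here.

```python
import importlib
import warnings


def import_quietly(module_name):
    """Import a module while hiding FutureWarning messages raised during the import."""
    with warnings.catch_warnings():
        # The filter applies only inside this block and is restored afterwards.
        warnings.simplefilter("ignore", category=FutureWarning)
        return importlib.import_module(module_name)


if __name__ == "__main__":
    # "json" stands in for "tensorflow" so the example runs anywhere.
    module = import_quietly("json")
    print(module.__name__)
```

In the environment described here the call would be import_quietly("tensorflow").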

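Installing other packages
While running the project I found that some packages were missing, such as resampy and pandas. Installing them one at a time with pip pulls in their latest versions, which uninstalls numpy 1.16.4 and installs a newer numpy, and then TensorFlow reports errors again. So resampy and pandas versions that are compatible with numpy 1.16.4 are needed. I found no guidance online, and finding them by manual trial and error was tedious. I later found that the following command solves the problem: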

pip install numpy==1.16.4 resampy numba scipy pandas h5py

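Listing everything in one command with numpy pinned constrains the versions of resampy, numba, and scipy, so pip automatically selects ones that are compatible with numpy 1.16.4.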
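
After installing further packages later on, it is worth checking that the pin still holds, because a separate pip install can replace numpy again. The sketch below compares a table of pinned versions with what is actually installed; the helper name pin_violations is hypothetical, and only numpy is checked in the example.

```python
def pin_violations(pins, installed):
    """Return {name: (wanted, found)} for each pinned package whose installed version differs."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


if __name__ == "__main__":
    installed = {}
    try:
        import numpy

        installed["numpy"] = numpy.__version__
    except ImportError:
        pass  # numpy missing: it is reported below with found=None
    print(pin_violations({"numpy": "1.16.4"}, installed))
```

An empty dictionary means the pinned version is still in place.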