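PaddleSpeech on CentOS 7

Overview

paddlespeech is an open-source toolkit from Baidu's PaddlePaddle platform for analyzing and processing speech and audio. It bundles several selectable models and provides speech recognition, speech synthesis, speaker verification, keyword spotting, audio classification, and speech translation.

paddlespeech is fairly simple to use overall, but installation and deployment still have plenty of pitfalls; this post documents the ones I hit.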

Environment

CentOS 7.9

gcc 7.3.0 (GCC)

OpenSSL 1.1.1

Python 3.10.3

pip 23.2.1

numpy==1.23

paddlepaddle 2.4.2 and 2.5.1 (why two versions are needed is explained later)

paddlespeech 1.4.1

Installation steps

The components are installed in the following order.

gcc 7.3.0

openssl 1.1.1

python 3.10.3

paddlepaddle 2.5.1

paddlespeech 1.4.1

Base dependencies

First update CentOS 7 to the latest release, then install some base dependency libraries.

yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gcc make libffi-devel wget

gcc

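On CentOS 7 the default gcc is 4.8.5, but paddlepaddle requires a libstdc++.so.6 that provides GLIBCXX_3.4.20, so a newer gcc has to be installed first.

There are plenty of guides online for installing gcc 7.3.0, so that step is skipped here.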
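To confirm that the GLIBCXX requirement is actually met after upgrading gcc, the version tags embedded in libstdc++.so.6 can be listed. The following is a minimal sketch, not part of the original procedure: the helper names are made up for illustration, and it assumes the library lives at /usr/lib64/libstdc++.so.6, the usual location on CentOS 7.

```python
import os
import re

# Assumed location of the C++ runtime on CentOS 7 (adjust if gcc was installed elsewhere).
LIBSTDCXX_PATH = "/usr/lib64/libstdc++.so.6"


def glibcxx_versions(blob: bytes) -> list:
    """Return the sorted, de-duplicated GLIBCXX_x.y[.z] tags found in a binary."""
    found = set()
    for match in re.finditer(rb"GLIBCXX_(\d+(?:\.\d+)+)", blob):
        found.add(tuple(int(part) for part in match.group(1).split(b".")))
    return sorted(found)


def supports_glibcxx(blob: bytes, required: tuple = (3, 4, 20)) -> bool:
    """True if the binary contains exactly the required GLIBCXX tag."""
    return required in glibcxx_versions(blob)


if __name__ == "__main__" and os.path.exists(LIBSTDCXX_PATH):
    with open(LIBSTDCXX_PATH, "rb") as fh:
        data = fh.read()
    versions = glibcxx_versions(data)
    if versions:
        print("highest GLIBCXX tag:", ".".join(str(n) for n in versions[-1]))
    print("GLIBCXX_3.4.20 present:", supports_glibcxx(data))
```

If the last line prints False, the process is still picking up the old system libstdc++ rather than the one shipped with the newer gcc.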

openssl

Python 3.10 requires OpenSSL 1.1.1 or newer.

wget http://www.openssl.org/source/openssl-1.1.1.tar.gz

tar -zxvf openssl-1.1.1.tar.gz

cd openssl-1.1.1

./config --prefix=/usr/local/openssl shared zlib

make

make install

mv /usr/bin/openssl /usr/bin/openssl.bak

ln -s /usr/local/openssl/bin/openssl /usr/bin/openssl

ln -s /usr/local/openssl/lib/libssl.so.1.1 /usr/lib64/libssl.so.1.1

ln -s /usr/local/openssl/lib/libcrypto.so.1.1 /usr/lib64/libcrypto.so.1.1

openssl version

python

Python 3.10 is used.

wget https://www.python.org/ftp/python/3.10.3/Python-3.10.3.tgz

tar -zxvf Python-3.10.3.tgz

cd Python-3.10.3

./configure -C --with-openssl=/usr/local/openssl --with-openssl-rpath=auto --prefix=/usr/local/python3

make -j 8

make altinstall

ln -s /usr/local/python3/bin/python3.10 /usr/bin/python3

ln -s /usr/local/python3/bin/pip3.10 /usr/bin/pip3

python3 -V

Python 3.10.3
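Since the whole point of building OpenSSL first is to give Python a working ssl module, it is worth checking which OpenSSL the new interpreter was linked against. A small sketch using only the standard library; the function name and the MIN_OPENSSL constant are illustrative, not something from the original steps.

```python
import ssl

# Minimum OpenSSL release that Python 3.10 accepts.
MIN_OPENSSL = (1, 1, 1)


def openssl_is_new_enough(version_info: tuple, minimum: tuple = MIN_OPENSSL) -> bool:
    """Compare the (major, minor, fix) part of an OpenSSL version tuple with the minimum."""
    return tuple(version_info[:3]) >= minimum


if __name__ == "__main__":
    # ssl.OPENSSL_VERSION is the human-readable string, OPENSSL_VERSION_INFO a 5-int tuple.
    print(ssl.OPENSSL_VERSION)
    print("meets the 1.1.1 minimum:", openssl_is_new_enough(ssl.OPENSSL_VERSION_INFO))
```

Run it with the freshly built python3. If the import itself fails, the ssl module was most likely not built, which usually points back at the --with-openssl path given to configure.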

Switching the pip index

mkdir -p ~/.pip

touch ~/.pip/pip.conf

vi ~/.pip/pip.conf

[global]

index-url=https://mirrors.aliyun.com/pypi/simple/

[install]

trusted-host=mirrors.aliyun.com

ssl_verify: false
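pip.conf is an INI file, so the section names matter: index-url belongs under a global section and trusted-host under install. As a hedged illustration, the sketch below renders just those two settings with configparser; render_pip_conf is a made-up helper, and it derives the trusted host from the index URL rather than taking it as a separate argument.

```python
import configparser
import io
from urllib.parse import urlparse


def render_pip_conf(index_url: str) -> str:
    """Render a pip.conf body with index-url and a matching trusted-host."""
    parser = configparser.ConfigParser(interpolation=None)
    parser["global"] = {"index-url": index_url}
    # The trusted host is simply the host name of the index URL.
    parser["install"] = {"trusted-host": urlparse(index_url).hostname or ""}
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_pip_conf("https://mirrors.aliyun.com/pypi/simple/"))
```

The printed text can be saved as ~/.pip/pip.conf instead of typing it into vi by hand.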

paddlepaddle

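Two paddlepaddle versions are needed: 2.4.2 for ASR and 2.5.1 for TTS. Only one of them can be installed in a given Python environment at a time, so install whichever one the feature being run requires.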

pip3 install paddlepaddle==2.4.2 -i https://mirror.baidu.com/pypi/simple

pip3 install paddlepaddle==2.5.1 -i https://mirror.baidu.com/pypi/simple

paddlespeech

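Install paddlespeech. Its speech features include speech recognition, speech synthesis, audio classification, speaker (voiceprint) recognition, punctuation restoration, and speech translation.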

pip3 install pytest-runner -i https://mirror.baidu.com/pypi/simple

pip3 install paddlespeech -i https://mirror.baidu.com/pypi/simple

Successfully installed Babel-2.12.1 Flask-2.3.3 Flask-Babel-3.1.0 Jinja2-3.1.2 MarkupSafe-2.1.3 ToJyutping-0.2.3 Werkzeug-2.3.7 aiohttp-3.8.5 aiosignal-1.3.1 annotated-types-0.5.0 async-timeout-4.0.3 attrs-23.1.0 audioread-2.1.9 bce-python-sdk-0.8.90 blinker-1.6.2 bottleneck-1.3.7 braceexpand-0.1.7 cffi-1.15.1 charset-normalizer-3.2.0 click-8.1.7 colorama-0.4.6 coloredlogs-15.0.1 colorlog-6.7.0 contourpy-1.1.0 cycler-0.11.0 cython-3.0.0 datasets-2.14.4 dill-0.3.4 distance-0.1.3 editdistance-0.6.2 einops-0.6.1 fastapi-0.101.1 filelock-3.12.2 flatbuffers-23.5.26 fonttools-4.42.1 frozenlist-1.4.0 fsspec-2023.6.0 ftfy-6.1.1 future-0.18.3 g2p-en-2.1.0 g2pM-0.1.2.5 h5py-3.9.0 huggingface-hub-0.16.4 humanfriendly-10.0 hyperpyyaml-1.2.1 inflect-7.0.0 itsdangerous-2.1.2 jieba-0.42.1 joblib-1.3.2 jsonlines-3.1.0 kaldiio-2.18.0 kiwisolver-1.4.5 librosa-0.8.1 llvmlite-0.40.1 loguru-0.7.0 lxml-4.9.3 markdown-it-py-3.0.0 matplotlib-3.7.2 mdurl-0.1.1 mock-5.1.0 mpmath-1.3.0 multidict-6.0.4 multiprocess-0.70.12.2 nara-wpe-0.0.9 nltk-3.8.1 numba-0.57.1 numpy-1.24.4 onnx-1.14.0 onnxruntime-1.15.1 opencc-1.1.6 opencc-python-reimplemented-0.1.7 paddle2onnx-1.0.9 paddleaudio-1.1.0 paddlefsl-1.1.0 paddlenlp-2.6.0 paddleslim-2.4.1 paddlespeech-1.4.1 paddlespeech-feat-0.1.0 pandas-2.0.3 parameterized-0.9.0 pathos-0.2.8 pattern-singleton-1.2.0 platformdirs-3.10.0 pooch-1.7.0 portalocker-2.7.0 pox-0.3.3 ppdiffusers-0.16.3 ppft-1.7.6.7 praatio-5.1.1 prettytable-3.8.0 protobuf-3.20.2 psutil-5.9.5 pyarrow-13.0.0 pybind11-2.11.1 pycparser-2.21 pycryptodome-3.18.0 pydantic-2.3.0 pydantic-core-2.6.3 pygments-2.16.1 pygtrie-2.5.0 pyparsing-3.0.9 pypinyin-0.44.0 pypinyin-dict-0.6.0 pytz-2023.3 pyworld-0.3.4 pyzmq-25.1.1 rarfile-4.0 regex-2023.8.8 requests-2.31.0 resampy-0.4.2 rich-13.5.2 ruamel.yaml-0.17.28 ruamel.yaml.clib-0.2.7 sacrebleu-2.3.1 safetensors-0.3.3 scikit-learn-1.3.0 scipy-1.11.2 sentencepiece-0.1.99 seqeval-1.2.2 soundfile-0.12.1 starlette-0.27.0 swig-4.1.1 sympy-1.12 tabulate-0.9.0 textgrid-1.5 threadpoolctl-3.2.0 timer-0.2.2 tqdm-4.66.1 typeguard-2.13.3 typer-0.9.0 tzdata-2023.3 uvicorn-0.23.2 visualdl-2.5.3 wcwidth-0.2.6 webrtcvad-2.0.10 websockets-11.0.3 xxhash-3.3.0 yacs-0.1.8 yarl-1.9.2 zhon-2.0.2

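numpy must be pinned to 1.23, otherwise errors occur. The most likely cause is that the install above pulled in numpy 1.24.4, and NumPy 1.24 removed the deprecated np.complex alias that librosa 0.8.1 still uses.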

pip3 install numpy==1.23 -i https://mirror.baidu.com/pypi/simple
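Because a later pip install can silently bump numpy again, a quick check of the installed version saves some head-scratching. This is only a sketch with hypothetical helper names; it reads the installed version through importlib.metadata and treats anything outside the 1.23 series as a mismatch.

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple:
    """Parse the leading numeric part of a version string, e.g. '1.24.4' -> (1, 24, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def numpy_pin_ok(version: str) -> bool:
    """True if the version belongs to the 1.23 series used in this post."""
    return release_tuple(version)[:2] == (1, 23)


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed")
    else:
        print(f"numpy {installed}, pin ok: {numpy_pin_ok(installed)}")
```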

Testing

The ASR feature depends on paddlepaddle==2.4.2

$ paddlespeech asr --lang zh --input zh.wav

/usr/local/python3/lib/python3.10/site-packages/librosa/core/constantq.py:1059: DeprecationWarning: np.complex is a deprecated alias for the builtin complex. To silence this warning, use complex by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.complex128 here.

Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

dtype=np.complex,

/usr/local/python3/lib/python3.10/site-packages/paddle/fluid/dygraph/math_op_patch.py:275: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.int64, but right dtype is paddle.bool, the right dtype will convert to paddle.int64

warnings.warn(

我认为跑步最重要的就是给我带来了身体健康

The TTS feature depends on paddlepaddle==2.5.1

$ paddlespeech tts --input "苏大今天没有穿内裤,呵呵呵!" --output output3.wav

[nltk_data] Error loading averaged_perceptron_tagger:

[nltk_data] Error loading cmudict:

I0825 11:09:40.279230 2563 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changedto 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.

I0825 11:09:40.279644 2563 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changedto 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.

/usr/local/python3/lib/python3.10/site-packages/paddle/nn/layer/layers.py:1897: UserWarning: Skip loading for encoder.embed.1.alpha. encoder.embed.1.alpha receives a shape [1], but the expected shape is [].

warnings.warn(f"Skip loading for {key}. " + str(err))

/usr/local/python3/lib/python3.10/site-packages/paddle/nn/layer/layers.py:1897: UserWarning: Skip loading for decoder.embed.0.alpha. decoder.embed.0.alpha receives a shape [1], but the expected shape is [].

warnings.warn(f"Skip loading for {key}. " + str(err))

/home/adminx/test/output3.wav

The cls feature works with either paddlepaddle==2.4.2 or 2.5.1

$ paddlespeech cls --input zh.wav

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 357907/357907 [00:32<00:00, 11173.08it/s]

Speech 0.9034528136253357

The vector feature works with either paddlepaddle==2.4.2 or 2.5.1

$ paddlespeech vector --task spk --input zh.wav

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 259820/259820 [00:20<00:00, 12381.19it/s]

-0.19083646 9.474294 -14.12228 -2.0916362 0.04848658 4.92957 1.4780139 0.3733759 10.69586 3.2697136 -4.4820027 -0.6617906 -9.170393 -11.156884 -1.2358196 -3.3581464 -8.040278 -8.109016 5.271239 9.093345 4.080139 9.174555 -2.4747503 4.5701075 -6.1615624 -4.750184 2.4837155 15.827937 5.474065 3.2251058 0.10092238 11.682478 -0.47919133 3.572539 1.4974319 4.199508 9.543804 -6.7265534 7.489065 -4.7066617 0.9260804 2.6370869 -14.898721 3.6780186 -7.6924915 -1.9698792 9.436737 12.2048645 3.485145 2.6493874 -4.10985 8.051481 2.8838215 -6.756511 -1.7955961 5.8305116 -8.327385 -7.664741 12.04934 -6.977676 1.4514436 6.774237 -4.78431 10.4314 7.897736 -7.368048 -6.3448873 -11.598493 10.807491 -5.1794314 -2.6945627 10.874314 -7.6098304 11.810847 5.270554 5.2236743 2.3782775 3.3985224 -0.6136011 -6.0067887 -7.7289877 3.2568665 3.5521574 1.5729685 -8.427421 -5.4197965 -5.7204127 9.6017685 -7.11521 -10.819559 -2.041802 -8.249927 -2.3642402 5.248027 7.642632 3.8729753 -1.0397645 -2.15431 -2.7227147 5.8286257 -2.7757604 6.2585583 -0.24755064 -19.751856 3.3896728 -2.195075 -11.729757 5.008801 -3.3035963 3.6805942 0.22119749 -8.734743 -12.249261 -6.785996 -11.262364 3.8227513 4.4570937 0.43271756 -5.979373 -0.43533772 -12.417465 -7.380396 6.762073 -0.09675703 6.758829 0.47246385 -5.556693 1.654608 -5.651553 8.078561 3.1227856 17.694748 -0.91461915 -9.803121 2.3637018 -4.606942 0.2602589 5.6254964 -9.485892 -3.5908723 -6.751416 2.7892575 4.8451343 -8.851273 0.9642851 7.9920044 -0.09444531 1.8815458 -6.555375 -2.6035395 2.8816917 -5.3074865 8.416342 7.1294055 -2.4942544 9.977794 -3.4511476 7.2009816 -0.18145518 -0.28605637 10.311885 -6.427509 -4.791568 -0.1989103 -12.877758 -4.532637 -0.08484638 -10.895372 2.0810192 -5.8358116 14.491089 -2.793815 -2.0666945 -7.370983 8.564973 18.26662 6.8758926 9.029721 -11.058079 1.0859501 -4.4105487 2.5650666 0.92991847 10.917894 13.856809 2.334257 8.546575 11.740078 -5.884227 0.5982095 10.536286 2.504756
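The command prints the speaker embedding as a flat list of floats. A common way to compare two such embeddings, for example to judge whether two recordings come from the same speaker, is cosine similarity. The sketch below is my own illustration rather than anything paddlespeech ships: parse_embedding and cosine_similarity are hypothetical helpers, and the sample values are just the first four numbers of the output above.

```python
import math


def parse_embedding(text: str) -> list:
    """Turn whitespace-separated numbers (as printed by the CLI) into floats."""
    return [float(token) for token in text.split()]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # First four values of the embedding printed above, compared with itself.
    sample = parse_embedding("-0.19083646 9.474294 -14.12228 -2.0916362")
    print(cosine_similarity(sample, sample))
```

Values near 1 mean the two embeddings point in almost the same direction. Where to draw the line between "same speaker" and "different speaker" depends on the model and data, so no threshold is suggested here.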

The text feature uses paddlepaddle==2.5.1

$ paddlespeech text --task punc --input 今天的天气真不错啊你下午有空吗我想约你一起去吃饭

[nltk_data] Error loading averaged_perceptron_tagger:

[nltk_data] Error loading cmudict:

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 427595/427595 [00:19<00:00, 21717.40it/s]

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 89.5k/89.5k [00:00<00:00, 366kB/s]

今天的天气真不错啊!你下午有空吗?我想约你一起去吃饭。

The st feature failed with every dependency version tried and was never tested successfully.
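The results above boil down to a small compatibility table. As a convenience, it can be written down as a lookup that says whether paddlepaddle has to be switched before running a given subcommand. This is a sketch built only from the versions reported in this post; WORKING_PADDLE and switch_command are names invented here, and st is left out because it never worked.

```python
from typing import Optional

# paddlepaddle versions each paddlespeech subcommand was tested with in this post.
WORKING_PADDLE = {
    "asr": ("2.4.2",),
    "tts": ("2.5.1",),
    "cls": ("2.4.2", "2.5.1"),
    "vector": ("2.4.2", "2.5.1"),
    "text": ("2.5.1",),
}


def switch_command(task: str, installed: str) -> Optional[str]:
    """Return the pip command needed before running `task`, or None if `installed` already works."""
    versions = WORKING_PADDLE[task]  # raises KeyError for untested tasks such as "st"
    if installed in versions:
        return None
    return f"pip3 install paddlepaddle=={versions[0]} -i https://mirror.baidu.com/pypi/simple"


if __name__ == "__main__":
    print(switch_command("tts", "2.4.2"))
```

For example, with 2.4.2 installed, asking for tts returns the pip command for 2.5.1, while asking for cls returns None because either version works.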

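Summary

paddlespeech provides basic speech capabilities and lowers the barrier to using audio models.

Building a finished product on top of paddlespeech still takes a good deal of packaging work and business-logic development.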

空空如常

求真得真
