FunASR Offline File Transcription Service Development Guide (Debian 10.13)

Server environment

Debian 10.13, 64-bit

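Step 1: Configure a static IP address on the network interface

Add the following to /etc/network/interfaces, adjusting the addresses to your network: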
auto eth0
iface eth0 inet static
address 192.168.1.100
netmask 255.255.255.0
gateway 192.168.1.1
dns-nameservers 8.8.8.8 8.8.4.4

# Restart networking to apply the configuration
/etc/init.d/networking restart
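A common mistake in a static configuration is a gateway that lies outside the interface's subnet. The following is a minimal sketch, using only the Python standard library, that checks the three values from the configuration above before networking is restarted. The function name check_static_config is hypothetical and not part of any Debian tooling.

```python
import ipaddress


def check_static_config(address: str, netmask: str, gateway: str) -> bool:
    """Return True if the gateway lies inside the subnet given by address/netmask."""
    # ip_interface accepts the "address/netmask" form, e.g. 192.168.1.100/255.255.255.0
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return ipaddress.ip_address(gateway) in iface.network


# Values taken from the interfaces file in this step
print(check_static_config("192.168.1.100", "255.255.255.0", "192.168.1.1"))
```

If the function returns False, the default route would be unreachable after the restart, so the addresses are worth rechecking first.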

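Step 2: Configure a domestic (China) mirror and update packages

Put the following entries in /etc/apt/sources.list. Note that bullseye is the codename of Debian 11 (Debian 10 is buster), so these entries pull in Debian 11 packages; the Python 3.9.2 and pip 20.3.4 versions shown in step 3 are the bullseye versions.
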
deb http://mirrors.ustc.edu.cn/debian/ bullseye main contrib non-free
deb-src http://mirrors.ustc.edu.cn/debian/ bullseye main contrib non-free
deb http://mirrors.ustc.edu.cn/debian/ bullseye-updates main contrib non-free
deb-src http://mirrors.ustc.edu.cn/debian/ bullseye-updates main contrib non-free
deb http://mirrors.ustc.edu.cn/debian/ bullseye-backports main contrib non-free
deb-src http://mirrors.ustc.edu.cn/debian/ bullseye-backports main contrib non-free

apt update

apt upgrade
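The six entries above follow one pattern: each suite (the release, its -updates, and its -backports) gets a deb and a deb-src line. As a rough sketch, the list can be generated for any codename, which makes it easy to keep the codename consistent with the installed release. The function name sources_list is hypothetical.

```python
def sources_list(mirror: str, codename: str,
                 components=("main", "contrib", "non-free")) -> str:
    """Build sources.list text for a release plus its -updates and -backports suites."""
    suites = [codename, f"{codename}-updates", f"{codename}-backports"]
    lines = []
    for suite in suites:
        for kind in ("deb", "deb-src"):
            lines.append(f"{kind} {mirror} {suite} {' '.join(components)}")
    return "\n".join(lines) + "\n"


# Reproduces the entries used in this step
print(sources_list("http://mirrors.ustc.edu.cn/debian/", "bullseye"), end="")
```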

Step 3: Check the Python environment (used for local testing)

python3 --version
# Python 3.9.2 is sufficient for testing; no upgrade needed

pip3 --version
# -bash: pip3: command not found

# Install pip3
apt install python3-pip -y

pip3 --version
# pip 20.3.4 from /usr/lib/python3/dist-packages/pip (python 3.9), as expected

# Install the Python virtual environment module
apt install python3-venv

# Point pip at the Tsinghua PyPI mirror
mkdir -p ~/.pip
echo "[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple" > ~/.pip/pip.conf
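The pip configuration is a plain INI file, so it can also be written and read back with configparser. The sketch below writes the same [global] section into a temporary directory so that it has no side effects; in practice the target directory would be ~/.pip as in the commands above. The helper name write_pip_conf is hypothetical.

```python
import configparser
import tempfile
from pathlib import Path


def write_pip_conf(conf_dir: Path, index_url: str) -> Path:
    """Write a pip.conf with a [global] index-url entry and return its path."""
    conf_dir.mkdir(parents=True, exist_ok=True)
    parser = configparser.ConfigParser(interpolation=None)
    parser["global"] = {"index-url": index_url}
    path = conf_dir / "pip.conf"
    with path.open("w", encoding="utf-8") as fh:
        parser.write(fh)
    return path


# Demonstration in a throwaway directory instead of the real ~/.pip
with tempfile.TemporaryDirectory() as tmp:
    conf = write_pip_conf(Path(tmp) / ".pip",
                          "https://pypi.tuna.tsinghua.edu.cn/simple")
    print(conf.read_text(encoding="utf-8"))
```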

Step 4: Install Docker

apt install apt-transport-https ca-certificates curl gnupg lsb-release wget

curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null

apt update

apt install docker-ce docker-ce-cli containerd.io

docker --version
# Docker version 27.3.1, build ce12230 (this output means the installation succeeded)
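When the installation is scripted, the version line can be checked programmatically rather than by eye. This is a small sketch that parses the output format shown above; the function name parse_docker_version is hypothetical, and the sample string is the output reported in this step.

```python
import re


def parse_docker_version(output: str) -> tuple:
    """Extract (major, minor, patch) from the output of `docker --version`."""
    match = re.search(r"Docker version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognized docker version output: {output!r}")
    return tuple(int(part) for part in match.groups())


# Output line reported in this step
print(parse_docker_version("Docker version 27.3.1, build ce12230"))
```

On a real host the input string would come from something like subprocess.run(["docker", "--version"], capture_output=True, text=True).stdout.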

Step 5: Pull the FunASR image and start the service

docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6

# Host directory that will hold the downloaded models
mkdir -p /var/local/funasr-runtime-resources/models

# Start the container, mounting the host model directory at /workspace/models
docker run -p 10095:10095 -it --privileged=true -v /var/local/funasr-runtime-resources/models:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6

# The remaining commands in this step run inside the container
cd FunASR/runtime


nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx \
  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
  --itn-dir thuduj12/fst_itn_zh \
  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

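# To disable SSL, add the parameter: --certfile 0
# To deploy the SenseVoiceSmall model, the timestamp model, or the NN hotword model, set --model-dir to the corresponding model:
#   iic/SenseVoiceSmall-onnx
#   damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx (timestamps)
#   damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx (NN hotwords)
# To load hotwords on the server side, put them in hotwords.txt in the host directory that is mounted at /workspace/models (inside the container the file is /workspace/models/hotwords.txt):
#   One hotword per line, in the format "hotword weight", for example: 阿里巴巴 20 (Note: the number of hotwords is theoretically unlimited, but to balance performance and accuracy, keep each hotword to a length of at most 10, the list to at most 1k entries, and weights between 1 and 100)
# In SenseVoiceSmall-onnx results, the prefix "<|zh|><|NEUTRAL|><|Speech|> " gives the language, emotion, and event information, respectively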
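The hotword file format described above is simple enough to check before the server is started. The following is a minimal sketch that parses "hotword weight" lines and reports entries outside the recommended limits. It assumes that "length" means a count of characters and that the last whitespace-separated token on a line is the weight; the function name check_hotwords is hypothetical, and the second sample entry is illustrative only.

```python
def check_hotwords(text, max_len=10, max_count=1000, weight_range=(1, 100)):
    """Parse 'hotword weight' lines; return (entries, problems).

    The limits are the recommendations quoted in this step, so violations
    are reported as problems instead of raising an error.
    """
    entries = {}
    problems = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) < 2:
            problems.append(f"line {lineno}: expected 'hotword weight'")
            continue
        # Assumption: the last token is the weight, the rest is the hotword
        word, weight_text = " ".join(parts[:-1]), parts[-1]
        try:
            weight = int(weight_text)
        except ValueError:
            problems.append(f"line {lineno}: weight {weight_text!r} is not an integer")
            continue
        if len(word) > max_len:
            problems.append(f"line {lineno}: hotword longer than {max_len} characters")
        if not weight_range[0] <= weight <= weight_range[1]:
            problems.append(f"line {lineno}: weight {weight} outside {weight_range}")
        entries[word] = weight
    if len(entries) > max_count:
        problems.append(f"{len(entries)} hotwords exceed the recommended {max_count}")
    return entries, problems


# First line is the example from this step; the second is an illustrative bad entry
entries, problems = check_hotwords("阿里巴巴 20\n达摩院 200\n")
print(entries)
print(problems)
```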


# To deploy the 8k model instead, start the service with the following command:

cd FunASR/runtime

nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-8k-common-onnx \
  --model-dir damo/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1-onnx \
  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst-token8358 \
  --itn-dir thuduj12/fst_itn_zh \
  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &
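Because the server is started in the background with nohup and may need time to download models on first launch, a client that connects immediately can fail. The sketch below polls a TCP port until something is listening. It only confirms that the port accepts connections, not that the models have finished loading, and the function name wait_for_port is hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means some process is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For the service in this guide, a call such as wait_for_port("127.0.0.1", 10095, timeout=600) on the host would block until the published port answers; log.txt inside the container remains the place to look if it never does.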

Step 6: Test

Local test

# Requirements: Python 3.9.x and pip 20 or later

cd /opt

wget https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz

tar xvfz funasr_samples.tar.gz

cd /opt/samples/python

pip3 install websockets

# First test, using the loopback address

python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav" --output_dir "./results"

# Namespace(host='127.0.0.1', port=10095, chunk_size=[5, 10, 5], chunk_interval=10, hotword='', audio_in='../audio/asr_example.wav', audio_fs=16000, send_without_sleep=True, thread_num=1, words_max_print=10000, output_dir='./results', ssl=1, use_itn=1, mode='offline')
# connect to wss://127.0.0.1:10095
# pid0_0: demo: 欢迎大家来体验达摩院推出的语音识别模型。 timestamp: [[880,1120],[1120,1380],[1380,1540],[1540,1780],[1780,2020],[2020,2180],[2180,2480],[2480,2600],[2600,2780],[2780,3040],[3040,3240],[3240,3480],[3480,3699],[3699,3900],[3900,4180],[4180,4420],[4420,4620],[4620,4780],[4780,5195]]
# Exception: sent 1000 (OK); then received 1000 (OK)
# end

# Second test, using the server's LAN address (192.168.1.181 in this setup)

python3 funasr_wss_client.py --host "192.168.1.181" --port 10095 --mode offline --audio_in "../audio/asr_example.wav" --output_dir "./results"

# Namespace(host='192.168.1.181', port=10095, chunk_size=[5, 10, 5], chunk_interval=10, hotword='', audio_in='../audio/asr_example.wav', audio_fs=16000, send_without_sleep=True, thread_num=1, words_max_print=10000, output_dir='./results', ssl=1, use_itn=1, mode='offline')
# connect to wss://192.168.1.181:10095
# pid0_0: demo: 欢迎大家来体验达摩院推出的语音识别模型。 timestamp: [[880,1120],[1120,1380],[1380,1540],[1540,1780],[1780,2020],[2020,2180],[2180,2480],[2480,2600],[2600,2780],[2780,3040],[3040,3240],[3240,3480],[3480,3699],[3699,3900],[3900,4180],[4180,4420],[4420,4620],[4620,4780],[4780,5195]]
# Exception: sent 1000 (OK); then received 1000 (OK)
# end
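The result line printed by the client carries the recognized text followed by a list of [start, end] pairs. The following sketch splits such a line into its parts so the output can be checked in a script. The regular expression is inferred only from the output shown above, the pairs are assumed to be in milliseconds, and the function name parse_result_line is hypothetical.

```python
import json
import re

# Pattern inferred from the client output in this step, not from a documented format
RESULT_LINE = re.compile(
    r"pid(?P<pid>\d+)_(?P<idx>\d+): (?P<name>\S+): (?P<text>.*) "
    r"timestamp: (?P<spans>\[\[.*\]\])\s*$"
)


def parse_result_line(line: str):
    """Return (text, spans) where spans is a list of (start, end) tuples."""
    match = RESULT_LINE.search(line)
    if match is None:
        raise ValueError("not a recognition result line")
    spans = [tuple(pair) for pair in json.loads(match["spans"])]
    return match["text"], spans


# Result line copied from the test output above
sample = (
    "pid0_0: demo: 欢迎大家来体验达摩院推出的语音识别模型。 timestamp: "
    "[[880,1120],[1120,1380],[1380,1540],[1540,1780],[1780,2020],[2020,2180],"
    "[2180,2480],[2480,2600],[2600,2780],[2780,3040],[3040,3240],[3240,3480],"
    "[3480,3699],[3699,3900],[3900,4180],[4180,4420],[4420,4620],[4620,4780],"
    "[4780,5195]]"
)
text, spans = parse_result_line(sample)
print(text)
print(len(spans), spans[0], spans[-1])
```

In this sample the number of pairs equals the number of characters in the sentence excluding the final punctuation mark, which suggests one pair per recognized character.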

Testing from another machine on the same LAN

  • Python environment

    python3 funasr_wss_client.py --host "192.168.1.181" --port 10095 --mode offline --audio_in "./001.wav" --output_dir "./results"
    
    # Change --audio_in "./001.wav" to the path of an audio file on the local machine
    
    # Namespace(host='192.168.1.181', port=10095, chunk_size=[5, 10, 5], chunk_interval=10, hotword='', audio_in='./001.wav', audio_fs=16000, send_without_sleep=True, thread_num=1, words_max_print=10000, output_dir='./results', ssl=1, use_itn=1, mode='offline')
    # connect to wss://192.168.1.181:10095
    # pid0_0: demo: 咱们是微信支付的,不是银行这边的。 timestamp: [[90,210],[210,290],[290,410],[410,550],[550,690],[690,850],[850,1030],[1030,1310],[1310,1430],[1430,1570],[1570,1670],[1670,1850],[1850,1950],[1950,2130],[2130,2305]]
    # Exception: sent 1000 (OK); then received 1000 (OK)
    # end
  • HTML test

    Unpack the downloaded sample package and open html/static/index.html in a browser.
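When testing with your own recordings, the sample rate of the file should match the deployed model: the client output above shows audio_fs=16000, and the alternative model in step 5 is the 8k one. As a rough sketch, the standard wave module can report the rate of a WAV file before it is sent. The helper names are hypothetical, and write_silence exists only to make the demonstration self-contained.

```python
import os
import tempfile
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sample rate (Hz) recorded in a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def matches_model_rate(path: str, expected: int = 16000) -> bool:
    """True if the file's sample rate equals the rate the model expects."""
    return wav_sample_rate(path) == expected


def write_silence(path: str, rate: int, frames: int = 160) -> None:
    """Demo helper: write a short mono 16-bit silent WAV at the given rate."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * frames)


# Demonstration with generated files instead of a real recording
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo_8k.wav")
    write_silence(demo, 8000)
    print(wav_sample_rate(demo), matches_model_rate(demo))
```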

End
