FunASR离线文件转写服务开发指南-debian-10.13

FunASR离线文件转写服务开发指南-debian-10.13

服务器环境

debian10.13 64位

第一步 配置静态网卡

复制代码
auto eth0
iface eth0 inet static
address 192.168.1.100
netmask 255.255.255.0
gateway 192.168.1.1
dns-nameservers 8.8.8.8 8.8.4.4

/etc/init.d/networking restart

第二步 配置国内源 及更新软件包

bash 复制代码
deb http://mirrors.ustc.edu.cn/debian/ bullseye main contrib non-free
deb-src http://mirrors.ustc.edu.cn/debian/ bullseye main contrib non-free
deb http://mirrors.ustc.edu.cn/debian/ bullseye-updates main contrib non-free
deb-src http://mirrors.ustc.edu.cn/debian/ bullseye-updates main contrib non-free
deb http://mirrors.ustc.edu.cn/debian/ bullseye-backports main contrib non-free
deb-src http://mirrors.ustc.edu.cn/debian/ bullseye-backports main contrib non-free

apt update

apt upgrade

第三步 查看python环境 以便做本机测试

bash 复制代码
python3 --version
# Python 3.9.2  可以满足测试 无需上级

pip3 --versin
# -bash: pip3:未找到命令

# 安装pip3
apt install python3-pip -y

pip3 --version
# pip 20.3.4 from /usr/lib/python3/dist-packages/pip (python 3.9) 正常

# 安装python虚拟环境模块
apt install python3-venv

# 修改pip的源
mkdir ~/.pip
echo "[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple" > ~/.pip/pip.conf

第四步 安装docker

bash 复制代码
apt install apt-transport-https ca-certificates curl gnupg lsb-release wget

curl -fsSL https://download.docker.com/linux/debian/gpg |  gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

echo  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] http://download.docker.com/linux/debian   $(lsb_release -cs) stable" |  tee /etc/apt/sources.list.d/docker.list > /dev/null

apt update

apt install docker-ce docker-ce-cli containerd.io

docker --version
# Docker version 27.3.1, build ce12230 表示成功

第五步 拉去FunASR镜像

bash 复制代码
docker pull  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6

mkdir -p /var/local/funasr-runtime-resources/models

docker run -p 10095:10095 -it --privileged=true -v /var/local/funasr-runtime-resources/models/funasr-runtime-resources/models:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6

cd FunASR/runtime


nohup bash run_server.sh   --download-model-dir /workspace/models --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst  --itn-dir thuduj12/fst_itn_zh  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

# 如果您想关闭ssl,增加参数:--certfile 0
# 如果您想使用SenseVoiceSmall模型、时间戳、nn热词模型进行部署,请设置--model-dir为对应模型:
#   iic/SenseVoiceSmall-onnx
#   damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx(时间戳)
#   damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx(nn热词)
# 如果您想在服务端加载热词,请在宿主机文件./funasr-runtime-resources/models/hotwords.txt配置热词(docker映射地址为/workspace/models/hotwords.txt):
#   每行一个热词,格式(热词 权重):阿里巴巴 20(注:热词理论上无限制,但为了兼顾性能和效果,建议热词长度不超过10,个数不超过1k,权重1~100)
# SenseVoiceSmall-onnx识别结果中"<|zh|><|NEUTRAL|><|Speech|> "分别为对应的语种、情感、事件信息


#部署8k的模型,请使用如下命令启动服务:

cd FunASR/runtime

nohup bash run_server.sh --download-model-dir /workspace/models  --vad-dir damo/speech_fsmn_vad_zh-cn-8k-common-onnx  --model-dir damo/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1-onnx --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst-token8358 --itn-dir thuduj12/fst_itn_zh --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

第六步 测试

本机测试

bash 复制代码
# 环境
# python 3.9.X  pip 20及以上

cd /opt

wget https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz

tar xvfz funasr_samples.tar.gz

cd /opt/samples/python

pip3 install websockets

# 第一次测试

python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"  
-audio_in "../audio/asr_example.wav" --output_dir "./results"

# Namespace(host='127.0.0.1', port=10095, chunk_size=[5, 10, 5], chunk_interval=10, hotword='', audio_in='../audio/asr_example.wav', audio_fs=16000, send_without_sleep=True, thread_num=1, words_max_print=10000, output_dir='./results', ssl=1, use_itn=1, mode='offline')
# connect to wss://127.0.0.1:10095
#pid0_0: demo: 欢迎大家来体验达摩院推出的语音识别模型。 timestamp: [[880,1120],[1120,1380],[1380,1540],[1540,1780],[1780,2020],[2020,2180],[2180,2480],[2480,2600],[2600,2780],[2780,3040],[3040,3240],[3240,3480],[3480,3699],[3699,3900],[3900,4180],[4180,4420],[4420,4620],[4620,4780],[4780,5195]]
#Exception: sent 1000 (OK); then received 1000 (OK)
#end

# 第二次测试

 python3 funasr_wss_client.py --host "192.168.1.181" --port 10095 --mode offline --audio_in "../audio/asr_example.wav" --output_dir "./results"
 
 
# Namespace(host='192.168.1.181', port=10095, chunk_size=[5, 10, 5], chunk_interval=10, hotword='', audio_in='../audio/asr_example.wav', audio_fs=16000, send_without_sleep=True, thread_num=1, words_max_print=10000, output_dir='./results', ssl=1, use_itn=1, mode='offline')
# connect to wss://192.168.1.181:10095
# pid0_0: demo: 欢迎大家来体验达摩院推出的语音识别模型。 timestamp: [[880,1120],[1120,1380],[1380,1540],[1540,1780],[1780,2020],[2020,2180],[2180,2480],[2480,2600],[2600,2780],[2780,3040],[3040,3240],[3240,3480],[3480,3699],[3699,3900],[3900,4180],[4180,4420],[4420,4620],[4620,4780],[4780,5195]]
# Exception: sent 1000 (OK); then received 1000 (OK)
# end

同局域网测试

  • python环境

    bash 复制代码
    python3 funasr_wss_client.py --host "192.168.1.181" --port 10095 --mode offline --audio_in "./001.wav" --output_dir "./results"
    
    #  --audio_in "./001.wav"  更改为本机音频路径
    
    # Namespace(host='192.168.1.181', port=10095, chunk_size=[5, 10, 5], chunk_interval=10, hotword='', audio_in='./001.wav', audio_fs=16000, send_without_sleep=True, thread_num=1, words_max_print=10000, output_dir='./results', ssl=1, use_itn=1, mode='offline')
    
    # Namespace(host='192.168.1.181', port=10095, chunk_size=[5, 10, 5], chunk_interval=10, hotword='', audio_in='./001.wav', audio_fs=16000, send_without_sleep=True, thread_num=1, words_max_print=10000, output_dir='./results', ssl=1, use_itn=1, mode='offline')
    # connect to wss://192.168.1.181:10095
    # pid0_0: demo: 咱们是微信支付的,不是银行这边的。 timestamp: [[90,210],[210,290],[290,410],[410,550],[550,690],[690,850],[850,1030],[1030,1310],[1310,1430],[1430,1570],[1570,1670],[1670,1850],[1850,1950],[1950,2130],[2130,2305]]
    # Exception: sent 1000 (OK); then received 1000 (OK)
    # end
  • html测试

    打开下载的测试包,打开html/static/index.html

结束

相关推荐
乘云数字DATABUFF4 天前
5分钟部署开源APM Databuff:OpenTelemetry全链路追踪入门实战
运维·后端
荣--6 天前
一键部署不是为了省时间 —— 它是把"买来的 PaaS"变成"自己的平台"的拐点
运维·zabbix·工程化·一键部署·平台化·边界设计
江华森6 天前
动手实战学 Docker — 从零到集群编排完全指南
运维
Avan_菜菜6 天前
FRP 内网穿透完整实战:从 HTTP 映射到 HTTPS 自签代理
运维·nginx·https
SelectDB7 天前
Litefuse 开源并推出单进程轻量模式,25 秒就能跑起来的 Agent 可观测与评估平台
运维·后端·自动化运维
XIAOHEZIcode9 天前
Linux系统鼠标偏移常见原因以及修复方案
linux·运维·游戏
用户0328472220709 天前
如何搭建本地yum源(上)
运维
大树8812 天前
金刚石散热越强,管路越先见顶
大数据·运维·服务器·人工智能·ai
摇滚侠12 天前
Linux CentOS7 rpm 安装 MySQL 5.7
linux·运维·mysql
霸道流氓气质12 天前
领域驱动设计(DDD)在 Spring Boot 微服务中的实践指南
运维·spring boot·微服务