Dead-Simple Deployment of a Translation Model

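The Helsinki-NLP/opus-mt-{en}-{zh} family of translation models covers more than 200 languages. Helsinki-NLP/opus-mt-en-zh is its English-to-Chinese model (the reverse direction is a separate model, Helsinki-NLP/opus-mt-zh-en). I needed it for a project, so I set it up locally and wrote down the steps for whoever comes next.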

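1. Basic hardware environment

  • CPU: a years-old Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz, 32 GB RAM
  • GPU: an equally old NVIDIA GeForce GTX 1080 Ti, 11 GB VRAM

2. Basic software environment

  • OS: Ubuntu 20.04 LTS. I deliberately downgraded to 20.04 to match the old hardware, since newer releases ran into all sorts of software compatibility problems. Everything gets replaced once I can afford it!!!
  • CUDA: cuda_12.0.0_525.60.13_linux.run. The setup could support 12.2 or even 12.4, but I picked 12.0 to be safe.
  • cuDNN: libcudnn8_8.8.0.121-1+cuda12.0_amd64.deb, matching the CUDA version
  • NCCL: libnccl2_2.19.3-1+cuda12.0_amd64.deb, matching the CUDA version; only needed for multi-GPU setups
  • Miniconda: Miniconda3-py312_24.9.2-0-Linux-x86_64.sh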

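3. Clone the fish-speech code and install the local system packages

Note: the translation scripts in steps 7 and 8 only import transformers, fastapi and pydantic; nothing from this repository is imported there.
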
git clone https://gitclone.com/github.com/fishaudio/fish-speech.git

sudo apt-get install ffmpeg libsm6 libxext6 portaudio19-dev -y

4. Create a virtual environment

conda create -n huggingface python==3.10 -y
conda activate huggingface

5. Install the base packages with conda

conda install -c pytorch -c nvidia -c conda-forge pytorch torchvision pytorch-cuda=11.8
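
Before moving on, it can help to confirm that the PyTorch build installed above can actually see the GPU. The following is a minimal sketch and not part of the original setup; `cuda_report` is a hypothetical helper name, and it falls back to a plain message when `torch` cannot be found.

```python
import importlib.util


def cuda_report() -> str:
    """Return a one-line summary of the local PyTorch/CUDA setup."""
    # Avoid an ImportError if the conda environment is not activated yet.
    if importlib.util.find_spec("torch") is None:
        return "torch: not installed in this environment"

    import torch

    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA is not available"

    # torch.version.cuda is the CUDA runtime PyTorch was built against,
    # which may differ from the system-wide toolkit version.
    name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__}: CUDA {torch.version.cuda} on {name}"


print(cuda_report())
```

With `pytorch-cuda=11.8`, the reported CUDA version should be the 11.8 runtime bundled with PyTorch rather than the system-wide 12.0 toolkit from step 2.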

6. Install the Hugging Face components and the transformers package

pip install transformers -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install -U huggingface_hub -i https://pypi.tuna.tsinghua.edu.cn/simple

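Set an environment variable so that model downloads go through a mirror, which speeds them up (it has to be exported so that the Python process inherits it):
export HF_ENDPOINT=https://hf-mirror.com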
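
The same mirror can also be configured from Python, provided it happens before `transformers` or `huggingface_hub` is imported, since as far as I know the endpoint is read when those libraries are loaded. A minimal sketch; `use_hf_mirror` is a hypothetical helper, and its `env` parameter exists only so the function can be tried without touching the real environment.

```python
import os

HF_MIRROR = "https://hf-mirror.com"  # mirror URL from the step above


def use_hf_mirror(env=None):
    """Point Hugging Face downloads at the mirror unless already configured."""
    env = os.environ if env is None else env
    # setdefault keeps any endpoint that was already exported in the shell.
    env.setdefault("HF_ENDPOINT", HF_MIRROR)
    return env["HF_ENDPOINT"]


# Call this before importing transformers / huggingface_hub.
print(use_hf_mirror())
```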

7. Run it as a Python script

# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh")

def translate(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True)
    translated = model.generate(**inputs)
    return [tokenizer.decode(t, skip_special_tokens=True) for t in translated]

print(tokenizer.supported_language_codes)
text = ">>cmn_Hans<< Due to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`. The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results."
translated_text = translate(text)
print(translated_text)
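
The `>>cmn_Hans<<` prefix in the sample selects the target language variant (the accepted codes are what `supported_language_codes` prints), and the sample passes a whole paragraph as a single input. The sketch below shows one way to split such a paragraph into sentences and prefix each one, so that `translate` receives a batch of short inputs instead. `split_sentences` and `with_target` are hypothetical helpers, and the regex split is a rough heuristic, not a real sentence segmenter.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows '.', '!' or '?' (rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def with_target(sentences: List[str], lang: str = "cmn_Hans") -> List[str]:
    """Prefix every sentence with a >>lang<< target-language token."""
    return [f">>{lang}<< {s}" for s in sentences]


sample = "This might be a breaking change. Please pass your input's attention_mask."
batch = with_target(split_sentences(sample))
print(batch)
```

The resulting list can be passed straight to `translate`, since the tokenizer call there already uses `padding=True`; the translated sentences can be joined back together afterwards.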

The first run fails because two dependencies are missing; installing them fixes it:

pip install sentencepiece sacremoses -i https://pypi.tuna.tsinghua.edu.cn/simple
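
To find out ahead of time whether these packages are present, rather than from a failed first run, a check along the following lines could be used. This is only a sketch; `missing_packages` is a hypothetical helper built on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Module names of the two packages installed above.
missing = missing_packages(["sentencepiece", "sacremoses"])
print("missing:", missing or "none")
```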

8. Run it as a FastAPI service

# Install the fastapi and uvicorn packages
pip install fastapi uvicorn -i https://pypi.tuna.tsinghua.edu.cn/simple

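The service script is as follows (save it as fastapi_app.py, the module name that the uvicorn command at the end of this step expects):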

# Load model directly
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

app = FastAPI()

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh")

def translate(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True)
    translated = model.generate(**inputs)
    return [tokenizer.decode(t, skip_special_tokens=True) for t in translated]

# print(tokenizer.supported_language_codes)
# text = ">>cmn_Hans<< Due to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`. The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results."
# translated_text = translate(text)
# print(translated_text)

class TextRequest(BaseModel):
    text: str
 
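@app.post("/predict")
def predict(request: TextRequest):
    # Plain `def` rather than `async def`: model.generate() is blocking, so
    # FastAPI runs this handler in its thread pool instead of the event loop.
    # Preprocess and predict
    translated_text = translate(request.text)

    # Return the result
    return {
        "text": request.text,
        "predictions": translated_text
    }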

Start the service

uvicorn fastapi_app:app --host 0.0.0.0 --port 8000
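
Once the service is up, it can be called from any HTTP client. Here is a minimal standard-library sketch of a client for the `/predict` endpoint. The address `127.0.0.1:8000` assumes the client runs on the same machine as the service, and `build_request` and `translate_remote` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed address: same machine, port taken from the uvicorn command above.
URL = "http://127.0.0.1:8000/predict"


def build_request(text: str, url: str = URL) -> urllib.request.Request:
    """Build the JSON POST request expected by the /predict endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate_remote(text: str, url: str = URL, timeout: float = 60.0) -> list:
    """Send text to the running service and return its "predictions" list."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["predictions"]
```

With the service running, `translate_remote(">>cmn_Hans<< Hello, world.")` should return a list containing one translated string, mirroring the `predictions` field of the response.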