Table of Contents
- I. Prepare the image
- II. Prepare the PaddleOCR installation packages
- III. Manual startup
  - 1. Preparation
    - 1.1 Start the container
    - 1.2 Install the Python dependencies
    - 1.3 Install the web server dependencies
  - 2. PaddleX single-concurrency service
    - 2.1 Install the PaddleX serving plugin
    - 2.2 Start the PaddleX service
  - 3. Gunicorn + uvicorn + FastAPI (multi-concurrency service)
    - 3.1 Install gunicorn
    - 3.2 Code
- IV. Starting with docker-compose
  - 1. start.sh
  - 2. docker-compose.yaml
  - 3. Server + client (no API authentication)
    - 3.1 Server
    - 3.2 Client
  - 4. Server + client (with API authentication)
    - 4.1 docker-compose.yaml
    - 4.2 Server
    - 4.3 Client
I. Prepare the image
- Reference: click to open
shell
# 1. Pull the Docker image
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84
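# 2. Export the image (include the tag so that exactly this image is saved)
docker save ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84 | gzip > paddle_npu_cann800_ubuntu20_910b_aarch64_gcc84.tar.gz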
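II. Prepare the PaddleOCR installation packages
Note: download the packages on an aarch64 machine running Python 3.10, so that the wheels match both the architecture and the Python version inside the image.
- References:
- Reference 1: click to open
- Download the Python packages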
shell
pip download -d ./packs paddlepaddle -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/
pip download -d ./packs paddle-custom-npu -i https://www.paddlepaddle.org.cn/packages/nightly/npu/
pip download -d ./packs "paddleocr[all]" -i https://mirrors.aliyun.com/pypi/simple/
# Note: the pinned packages below must be downloaded too, otherwise the recognition model raises an error
pip download -d ./packs numpy==1.26.4 -i https://mirrors.aliyun.com/pypi/simple/
pip download -d ./packs opencv-python==3.4.18.65 -i https://mirrors.aliyun.com/pypi/simple/
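Because the wheels must match the container's architecture and Python version, it can help to verify the download machine before running `pip download`. The following is a minimal sketch, assuming the target is aarch64 with Python 3.10 as stated in the note above; the helper name `matches_target` is hypothetical.

```python
import platform
import sys


def matches_target(machine: str, version: tuple,
                   want_machine: str = "aarch64",
                   want_version: tuple = (3, 10)) -> bool:
    """Return True when the architecture and major.minor Python version match."""
    return machine == want_machine and tuple(version[:2]) == want_version


# Check the machine this script is running on
ok = matches_target(platform.machine(), sys.version_info)
print(f"arch={platform.machine()} "
      f"python={sys.version_info.major}.{sys.version_info.minor} ok={ok}")
```

If the check prints `ok=False`, wheels downloaded on this machine may not install inside the image.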
III. Manual startup
1. Preparation
1.1 Start the container
- Reference tutorial: click to open
shell
# Load the image
docker load -i paddle_npu_cann800_ubuntu20_910b_aarch64_gcc84.tar.gz
# Start the container
docker run -it --name paddle-npu-dev -v $(pwd):/work \
--privileged --network=host --shm-size=128G -w=/work \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/dcmi:/usr/local/dcmi \
-e ASCEND_RT_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" \
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84 /bin/bash
1.2 Install the Python dependencies
- These are the packages downloaded in Section II; install them inside the paddle-npu-dev container.
shell
pip install paddlepaddle==3.3.0 --no-index --find-links=./packs/packages
pip install paddle-custom-npu==3.3.0 --no-index --find-links=./packs/packages
pip install "paddleocr[all]" --no-index --find-links=./packs/packages
pip install numpy==1.26.4 --no-index --find-links=./packs/packages
pip install opencv-python==3.4.18.65 --no-index --find-links=./packs/packages
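1.3 Install the web server dependencies
Both the PaddleX service and the gunicorn service need these.
- Installing the PaddleX serving plugin pulls in additional dependencies. Download them yourself on a machine with network access, then install them inside the image.
- Get the list of Python dependencies required by PaddleX serving
python
from paddlex.utils.deps import get_serving_dep_specs

# Write one requirement spec per line
with open("paddlex_requirements.txt", "w") as f:
    f.write("\n".join(get_serving_dep_specs()))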
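- Download the packages
shell
# Download the Python packages
pip download -d ./packs -r paddlex_requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
# Archive the download directory
tar -czvf paddlex_packs.tar.gz ./packs
- If you use gunicorn, download the gunicorn package as well.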
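2. PaddleX single-concurrency service
This mode handles one request at a time: each image is processed to completion before the next one starts.
- Reference tutorial: click to open
2.1 Install the PaddleX serving plugin
- Install the serving plugin
shell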
paddlex --install serving
2.2 Start the PaddleX service
- Start command:
shell
paddlex --serve --pipeline OCR --device npu:0 --port 8004
- Request format:
text
URL: http://127.0.0.1:8004/ocr
Method: POST
Body (JSON):
{"file": "http://127.0.0.1:12345/test.png"}
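The request format above can be exercised from Python with the standard library alone. This is a minimal sketch, assuming the service listens on 127.0.0.1:8004 as in the start command; the helper names `build_ocr_request` and `call_ocr` are hypothetical, and because the response layout is not described here, the JSON is returned as-is.

```python
import json
import urllib.request

# Taken from the request format above; adjust host and port to your deployment
OCR_URL = "http://127.0.0.1:8004/ocr"


def build_ocr_request(file_ref: str, url: str = OCR_URL) -> urllib.request.Request:
    """Build a JSON POST request whose body is {"file": file_ref}."""
    body = json.dumps({"file": file_ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_ocr(file_ref: str, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response unchanged."""
    with urllib.request.urlopen(build_ocr_request(file_ref), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request needs no running server
req = build_ocr_request("http://127.0.0.1:12345/test.png")
print(req.get_method(), req.full_url)
```

Calling `call_ocr("http://127.0.0.1:12345/test.png")` requires the PaddleX service to be running and the image URL to be reachable from it.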
3. Gunicorn + uvicorn + FastAPI (multi-concurrency service)
3.1 Install gunicorn
- Download the gunicorn package yourself and install it.
3.2 Code
python
import base64
import binascii
import os
import uuid
from contextlib import asynccontextmanager
from typing import Any, Dict, List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from paddlex import create_pipeline


def prune_result(result: dict) -> dict:
    """Recursively remove bookkeeping keys from the pipeline output."""
    KEYS_TO_REMOVE = ["input_path", "page_index"]

    def _process_obj(obj):
        if isinstance(obj, dict):
            return {
                k: _process_obj(v) for k, v in obj.items() if k not in KEYS_TO_REMOVE
            }
        elif isinstance(obj, list):
            return [_process_obj(item) for item in obj]
        else:
            return obj
    return _process_obj(result)


# Directory for temporary image files; exist_ok avoids a race between workers
OCR_TMP = os.path.join(os.getcwd(), "ocr_tmp")
os.makedirs(OCR_TMP, exist_ok=True)


def get_random_file_path(filename):
    """Return a random path in OCR_TMP that keeps only the original extension."""
    ext = os.path.splitext(os.path.basename(filename))[1]
    return os.path.join(OCR_TMP, uuid.uuid4().hex + ext)


def init_ocr_pipeline():
    """Initialise the OCR pipeline (runs once per worker process)."""
    global ocr_pipeline
    if ocr_pipeline is None:
        try:
            ocr_pipeline = create_pipeline(pipeline="OCR", device="npu")
            print(f"OCR initialised (PID: {os.getpid()})")
        except Exception as e:
            print(f"OCR initialisation failed: {e}")
            raise RuntimeError(f"OCR initialisation failed: {e}") from e


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the model when the application starts
    init_ocr_pipeline()
    yield


# Create the FastAPI application
app = FastAPI(lifespan=lifespan)

# Global that holds the OCR pipeline after initialisation
ocr_pipeline = None


# Request body model
class OCRRequest(BaseModel):
    image_base64: str
    image_name: str


# OCR recognition endpoint
@app.post("/ocr/recognize", response_model=Dict[str, Any])
async def ocr_recognize(request: OCRRequest):
    file_path = None  # temporary file path, cleaned up in finally
    try:
        # 1. Strip a data-URI prefix such as "data:image/png;base64," if present
        base64_str = request.image_base64
        file_path = get_random_file_path(request.image_name)
        if "," in base64_str:
            base64_str = base64_str.split(",")[-1]
        # 2. Decode the Base64 string into bytes
        image_data = base64.b64decode(base64_str)
        # 3. Write the bytes to a temporary file
        with open(file_path, "wb") as f:
            f.write(image_data)
        result = ocr_pipeline.predict(file_path)
        # 4. Post-process the OCR results
        ocr_results: List[Dict[str, Any]] = []
        for item in result:
            pruned_res = prune_result(item.json["res"])
            ocr_results.append({"prunedResult": pruned_res})
        return {"code": 200, "message": "Recognition succeeded", "data": ocr_results}
    except binascii.Error:
        raise HTTPException(status_code=400, detail="Invalid Base64 string")
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Recognition failed: {e}")
    finally:
        # file_path is still None if the failure happened before it was assigned
        if file_path and os.path.exists(file_path):
            os.remove(file_path)
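One detail of the decoding step is worth illustrating: by default `base64.b64decode` silently discards characters outside the Base64 alphabet, so malformed input does not always reach the `binascii.Error` branch. The sketch below shows a stricter variant using `validate=True`; the helper name `decode_image_base64` is hypothetical and is not part of the service code above.

```python
import base64
import binascii


def decode_image_base64(data: str) -> bytes:
    """Decode a Base64 image string, tolerating an optional data-URI prefix."""
    # "data:image/png;base64,AAAA" -> "AAAA"
    if "," in data:
        data = data.split(",", 1)[1]
    # validate=True rejects characters outside the Base64 alphabet
    return base64.b64decode(data, validate=True)


print(decode_image_base64("data:image/png;base64,aGVsbG8="))
try:
    decode_image_base64("not base64!!")
except binascii.Error as exc:
    print("rejected:", exc)
```

Whether strict validation is appropriate depends on the clients: some encoders insert line breaks, which this stricter mode would reject.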
- Start command (the code above must be saved as paddle_ocr_server.py)
shell
gunicorn paddle_ocr_server:app -w 8 -b 0.0.0.0:8004 -k uvicorn.workers.UvicornWorker
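The same options can be kept in a gunicorn configuration file, which is itself a Python module. This is a sketch that mirrors the command above; the `timeout` value is an assumption added for illustration and is not part of the original command.

```python
# gunicorn.conf.py: values mirror the command-line flags used above
bind = "0.0.0.0:8004"                           # same as -b
workers = 8                                     # same as -w; each worker loads its own pipeline
worker_class = "uvicorn.workers.UvicornWorker"  # same as -k
# Assumption: seconds a worker may stay silent before gunicorn restarts it (default 30)
timeout = 300
```

With this file in place the service would be started with `gunicorn -c gunicorn.conf.py paddle_ocr_server:app`.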
IV. Starting with docker-compose
1. start.sh
sh
#!/bin/bash
set -e
echo "Installing offline packages from /work/packages..."
pip install paddlepaddle==3.3.0 --no-index --find-links=/work/packages
pip install paddle-custom-npu==3.3.0 --no-index --find-links=/work/packages
pip install "paddleocr[all]" --no-index --find-links=/work/packages
pip install -r /work/paddlex_requirements.txt --no-index --find-links=/work/packs
pip install packages/numpy-1.26.4-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
pip install packages/opencv_python-3.4.18.65-cp36-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
pip install gunicorn --no-index --find-links=/work/packages
# Work around ".so not found" errors that appear after the packages are installed
source /root/.bashrc
gunicorn paddle_ocr_server:app -w 8 -b 0.0.0.0:8004 -k uvicorn.workers.UvicornWorker
2. docker-compose.yaml
yaml
services:
  fxyj_paddle:
    image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84
    container_name: fxyj_paddle_910b
    privileged: true
    network_mode: host
    shm_size: 128G
    working_dir: /work
    volumes:
      - .:/work
      - /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro
      - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi:ro
      - /usr/local/dcmi:/usr/local/dcmi:ro
    environment:
      - ASCEND_RT_VISIBLE_DEVICES=5
      - FLAGS_npu_jit_compile=0
      - PADDLE_PDX_CACHE_HOME=/work/models
    entrypoint: ["/work/start.sh"]
    command: ["/bin/bash"]
    stdin_open: true
    tty: true
3. Server + client (no API authentication)
3.1 Server
python
import base64
import binascii
import os
import uuid
from contextlib import asynccontextmanager
from typing import Any, Dict, List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from paddlex import create_pipeline


class OCRUtils:
    @staticmethod
    def prune_result(result: dict) -> dict:
        """Recursively remove bookkeeping keys from the pipeline output."""
        KEYS_TO_REMOVE = ["input_path", "page_index"]

        def _process_obj(obj):
            if isinstance(obj, dict):
                return {
                    k: _process_obj(v) for k, v in obj.items() if k not in KEYS_TO_REMOVE
                }
            elif isinstance(obj, list):
                return [_process_obj(item) for item in obj]
            else:
                return obj
        return _process_obj(result)

    @staticmethod
    def get_random_file_path(filename):
        """Return a random path in OCR_TMP that keeps only the original extension."""
        ext = os.path.splitext(os.path.basename(filename))[1]
        return os.path.join(OCR_TMP, uuid.uuid4().hex + ext)

    @staticmethod
    def init_ocr_pipeline():
        """Initialise the OCR pipeline (runs once per worker process)."""
        global ocr_pipeline
        if ocr_pipeline is None:
            try:
                ocr_pipeline = create_pipeline(pipeline="OCR", device="npu")
                print(f"OCR initialised (PID: {os.getpid()})")
            except Exception as e:
                print(f"OCR initialisation failed: {e}")
                raise RuntimeError(f"OCR initialisation failed: {e}") from e


# Directory for temporary image files; exist_ok avoids a race between workers
OCR_TMP = os.path.join(os.getcwd(), "ocr_tmp")
os.makedirs(OCR_TMP, exist_ok=True)


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the model when the application starts
    OCRUtils.init_ocr_pipeline()
    yield


# Create the FastAPI application
app = FastAPI(lifespan=lifespan)

# Global that holds the OCR pipeline after initialisation
ocr_pipeline = None


# Request body model
class OCRRequest(BaseModel):
    image_base64: str
    image_name: str


# OCR recognition endpoint
@app.post("/ocr/recognize", response_model=Dict[str, Any])
async def ocr_recognize(request: OCRRequest):
    file_path = None  # temporary file path, cleaned up in finally
    try:
        # 1. Strip a data-URI prefix such as "data:image/png;base64," if present
        base64_str = request.image_base64
        file_path = OCRUtils.get_random_file_path(request.image_name)
        if "," in base64_str:
            base64_str = base64_str.split(",")[-1]
        # 2. Decode the Base64 string into bytes
        image_data = base64.b64decode(base64_str)
        # 3. Write the bytes to a temporary file
        with open(file_path, "wb") as f:
            f.write(image_data)
        result = ocr_pipeline.predict(file_path)
        # 4. Post-process the OCR results
        ocr_results: List[Dict[str, Any]] = []
        for item in result:
            pruned_res = OCRUtils.prune_result(item.json["res"])
            ocr_results.append({"prunedResult": pruned_res})
        return {"code": 200, "message": "Recognition succeeded", "data": ocr_results}
    except binascii.Error:
        raise HTTPException(status_code=400, detail="Invalid Base64 string")
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Recognition failed: {e}")
    finally:
        # file_path is still None if the failure happened before it was assigned
        if file_path and os.path.exists(file_path):
            os.remove(file_path)
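To make the effect of `prune_result` concrete, the sketch below runs a standalone copy of the same recursion on a small made-up dictionary. The sample data is invented for illustration; real pipeline output contains many more fields.

```python
def prune_result(result, keys_to_remove=("input_path", "page_index")):
    """Recursively drop the given keys from nested dicts and lists."""
    if isinstance(result, dict):
        return {k: prune_result(v, keys_to_remove)
                for k, v in result.items() if k not in keys_to_remove}
    if isinstance(result, list):
        return [prune_result(item, keys_to_remove) for item in result]
    return result


# Made-up sample, not real pipeline output
sample = {
    "input_path": "/work/ocr_tmp/abc.png",
    "page_index": None,
    "rec_texts": ["hello", "world"],
    "nested": [{"input_path": "x", "score": 0.9}],
}
print(prune_result(sample))
```

The server-local temporary path and the page index never reach the client, including copies nested inside lists.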
3.2 Client
python
import base64
import os.path

import requests

API_URL = "http://127.0.0.1:8004/ocr/recognize"


def ocr_recognize(img_path, retry_req_num=3, timeout=300):
    # Read the image and encode it as Base64
    with open(img_path, "rb") as file:
        file_bytes = file.read()
    file_data = base64.b64encode(file_bytes).decode()
    payload = {
        "image_base64": file_data,
        "image_name": os.path.basename(img_path)
    }
    for try_req_n in range(retry_req_num):
        try:
            response = requests.post(API_URL, json=payload,
                                     timeout=timeout)
        except requests.exceptions.RequestException as e:
            print(f"Request {try_req_n + 1} failed, error: {e}")
            continue
        if response.status_code != 200:
            print(f"Request {try_req_n + 1} failed, status code: {response.status_code}")
            continue
        # Concatenate the recognised text lines
        ocr_data = response.json().get("data")
        if not ocr_data:
            return ""
        return "".join(ocr_data[0]["prunedResult"]["rec_texts"])
    # All attempts failed
    return ""


if __name__ == '__main__':
    file_path = "./123.png"
    print(ocr_recognize(file_path))
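The client above retries immediately after a failure. If the server is briefly overloaded, spacing the attempts out may work better. The following is a generic sketch of retrying with exponential backoff; `retry_with_backoff` and the fake `flaky` call are hypothetical and stand in for the `requests.post` call.

```python
import time


def retry_with_backoff(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds, doubling the wait after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # a real client would catch RequestException only
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Demo with a fake call that fails twice before succeeding
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
print(retry_with_backoff(flaky, sleep=delays.append), delays)
```

Passing `sleep` as a parameter keeps the demo instantaneous; in real use the default `time.sleep` applies.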
4. Server + client (with API authentication)
4.1 docker-compose.yaml
- Add the SECRET_KEY setting
yaml
services:
  fxyj_paddle:
    image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84
    container_name: fxyj_paddle_910b
    privileged: true
    network_mode: host
    shm_size: 128G
    working_dir: /work
    volumes:
      - .:/work
      - /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro
      - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi:ro
      - /usr/local/dcmi:/usr/local/dcmi:ro
    environment:
      - ASCEND_RT_VISIBLE_DEVICES=5
      - FLAGS_npu_jit_compile=0
      - PADDLE_PDX_CACHE_HOME=/work/models
      - SECRET_KEY=your_secret_key
    entrypoint: ["/work/start.sh"]
    command: ["/bin/bash"]
    stdin_open: true
    tty: true
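Since the signing secret arrives through the `SECRET_KEY` environment variable, a service may prefer to refuse to start when the variable is missing rather than fall back to a built-in default. A minimal sketch of such a check follows; the helper name `load_secret_key` is hypothetical.

```python
import os


def load_secret_key(env=os.environ, name="SECRET_KEY"):
    """Return the configured secret, raising when it is missing or empty."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; define it under 'environment:' in docker-compose.yaml"
        )
    return value


# Demo with an explicit dict instead of the real environment
print(load_secret_key({"SECRET_KEY": "your_secret_key"}))
```

Failing early makes a misconfigured deployment visible at startup instead of leaving it running with a well-known key.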
4.2 Server
python
import base64
import binascii
import hashlib
import hmac
import json
import os
import time
import uuid
from contextlib import asynccontextmanager
from typing import Any, Dict, List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from paddlex import create_pipeline


class OCRUtils:
    @staticmethod
    def prune_result(result: dict) -> dict:
        """Recursively remove bookkeeping keys from the pipeline output."""
        KEYS_TO_REMOVE = ["input_path", "page_index"]

        def _process_obj(obj):
            if isinstance(obj, dict):
                return {
                    k: _process_obj(v) for k, v in obj.items() if k not in KEYS_TO_REMOVE
                }
            elif isinstance(obj, list):
                return [_process_obj(item) for item in obj]
            else:
                return obj
        return _process_obj(result)

    @staticmethod
    def init_ocr_pipeline():
        """Initialise the OCR pipeline (runs once per worker process)."""
        global ocr_pipeline
        if ocr_pipeline is None:
            try:
                ocr_pipeline = create_pipeline(pipeline="OCR", device="npu")
                print(f"OCR initialised (PID: {os.getpid()})")
            except Exception as e:
                print(f"OCR initialisation failed: {e}")
                raise RuntimeError(f"OCR initialisation failed: {e}") from e

    @staticmethod
    def get_random_file_path(filename):
        """Return a random path in OCR_TMP that keeps only the original extension."""
        ext = os.path.splitext(os.path.basename(filename))[1]
        return os.path.join(OCR_TMP, uuid.uuid4().hex + ext)

    @staticmethod
    def verify_ocr_sign(signed_request: dict, secret_key: str, timeout: int = 300) -> bool:
        """
        Verify the signature of an OCR request.
        Args:
            signed_request: signed request data {"timestamp":..., "ocr_data":..., "sign":...}
            secret_key: signing secret (string)
            timeout: signature validity in seconds, 5 minutes by default
        Returns:
            True if the signature is valid, False otherwise
        """
        try:
            # 1. Extract the required fields
            timestamp = signed_request["timestamp"]
            ocr_data = signed_request["ocr_data"]
            received_sign = signed_request["sign"]
            # 2. Reject stale timestamps (limits replay attacks)
            current_timestamp = int(time.time() * 1000)
            if abs(current_timestamp - timestamp) > timeout * 1000:
                print(f"Signature expired: now {current_timestamp}, request {timestamp}")
                return False
            # 3. Recompute the signature
            ocr_data_str = json.dumps(ocr_data, ensure_ascii=False, sort_keys=True)
            sign_str = f"{timestamp}{ocr_data_str}{secret_key}"
            calculated_sign = hashlib.sha256(sign_str.encode('utf-8')).hexdigest()
            # 4. Compare in constant time to prevent timing attacks;
            #    the expected signature is deliberately not logged
            if not hmac.compare_digest(received_sign, calculated_sign):
                print("Signature mismatch")
                return False
            return True
        except KeyError as e:
            print(f"Request is missing a required field: {e}")
            return False
        except Exception as e:
            print(f"Signature verification error: {e}")
            return False


# Directory for temporary image files; exist_ok avoids a race between workers
OCR_TMP = os.path.join(os.getcwd(), "ocr_tmp")
os.makedirs(OCR_TMP, exist_ok=True)


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the model when the application starts
    OCRUtils.init_ocr_pipeline()
    yield


# Create the FastAPI application
app = FastAPI(lifespan=lifespan)

# Global that holds the OCR pipeline after initialisation
ocr_pipeline = None

# Signing secret, read from the environment; never print it to the logs
SECRET_KEY = os.environ.get('SECRET_KEY', 'default_secret_key')


# Nested request body models matching the signed request structure
class OCRData(BaseModel):
    image_base64: str  # Base64-encoded image
    image_name: str    # original file name, used for its extension


class SignedOCRRequest(BaseModel):
    sign: str          # signature
    timestamp: int     # timestamp in milliseconds
    ocr_data: OCRData  # nested OCR data


# OCR recognition endpoint
@app.post("/ocr/recognize", response_model=Dict[str, Any])
async def ocr_recognize(request: SignedOCRRequest):
    file_path = None  # temporary file path, cleaned up in finally
    # Convert the Pydantic model to a dict for signature verification
    request_dict = request.model_dump()
    if not OCRUtils.verify_ocr_sign(request_dict, SECRET_KEY):
        raise HTTPException(
            status_code=401,
            detail={"code": 401, "message": "Signature invalid or expired", "data": []}
        )
    try:
        # 1. Strip a data-URI prefix such as "data:image/png;base64," if present
        ocr_data = request.ocr_data
        base64_str = ocr_data.image_base64
        file_path = OCRUtils.get_random_file_path(ocr_data.image_name)
        if "," in base64_str:
            base64_str = base64_str.split(",")[-1]
        # 2. Decode the Base64 string into bytes
        image_data = base64.b64decode(base64_str)
        # 3. Write the bytes to a temporary file
        with open(file_path, "wb") as f:
            f.write(image_data)
        result = ocr_pipeline.predict(file_path)
        # 4. Post-process the OCR results
        ocr_results: List[Dict[str, Any]] = []
        for item in result:
            pruned_res = OCRUtils.prune_result(item.json["res"])
            ocr_results.append({"prunedResult": pruned_res})
        return {"code": 200, "message": "Recognition succeeded", "data": ocr_results}
    except binascii.Error:
        raise HTTPException(status_code=400, detail="Invalid Base64 string")
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Recognition failed: {e}")
    finally:
        # file_path is still None if the failure happened before it was assigned
        if file_path and os.path.exists(file_path):
            os.remove(file_path)
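The server above signs by hashing timestamp, JSON, and secret with plain SHA-256. A commonly used alternative is HMAC-SHA256 from the standard `hmac` module, which treats the secret as a proper key. The sketch below is not the scheme used in this article: adopting it would require changing the server and the client together, and the function names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac
import json
import time


def sign(ocr_data: dict, secret_key: str, timestamp: int) -> str:
    """HMAC-SHA256 over the timestamp followed by the canonical JSON of ocr_data."""
    message = f"{timestamp}{json.dumps(ocr_data, ensure_ascii=False, sort_keys=True)}"
    return hmac.new(secret_key.encode("utf-8"), message.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def verify(signed: dict, secret_key: str, max_age_s: int = 300, now_ms=None) -> bool:
    """Check freshness first, then compare signatures in constant time."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    if abs(now_ms - signed["timestamp"]) > max_age_s * 1000:
        return False
    expected = sign(signed["ocr_data"], secret_key, signed["timestamp"])
    return hmac.compare_digest(expected, signed["sign"])


data = {"image_base64": "aGVsbG8=", "image_name": "a.png"}
ts = int(time.time() * 1000)
request = {"timestamp": ts, "ocr_data": data, "sign": sign(data, "your_secret_key", ts)}
print(verify(request, "your_secret_key"))  # True
print(verify(request, "wrong_key"))        # False
```

The optional `now_ms` parameter exists only so that the expiry check can be exercised without waiting.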
4.3 Client
python
import base64
import hashlib
import json
import os.path
import time

import requests

API_URL = "http://127.0.0.1:8004/ocr/recognize"
SECRET_KEY = 'your_secret_key'


def generate_ocr_sign(ocr_request: dict, secret_key: str) -> dict:
    """
    Build a signed OCR request.
    Args:
        ocr_request: raw OCR request data {"image_base64": "xxx", "image_name": "xxx.png"}
        secret_key: signing secret (string)
    Returns:
        signed request dict {"timestamp": 12345677, "ocr_data": {...}, "sign": "..."}
    """
    # 1. Millisecond timestamp
    timestamp = int(time.time() * 1000)
    # 2. Serialize ocr_request to JSON with sorted keys, so that client and
    #    server build exactly the same string
    ocr_data_str = json.dumps(ocr_request, ensure_ascii=False, sort_keys=True)
    # Concatenation rule: timestamp + ocr_data_str + secret_key
    sign_str = f"{timestamp}{ocr_data_str}{secret_key}"
    # 3. SHA-256 signature
    sign = hashlib.sha256(sign_str.encode('utf-8')).hexdigest()
    # 4. Assemble the final request structure
    result = {
        "timestamp": timestamp,
        "ocr_data": ocr_request,
        "sign": sign
    }
    return result


def ocr_recognize(img_path, secret_key, retry_req_num=3, timeout=300):
    # Read the image and encode it as Base64
    with open(img_path, "rb") as file:
        file_bytes = file.read()
    file_data = base64.b64encode(file_bytes).decode()
    payload = {
        "image_base64": file_data,
        "image_name": os.path.basename(img_path)
    }
    sign_payload = generate_ocr_sign(ocr_request=payload,
                                     secret_key=secret_key)
    for try_req_n in range(retry_req_num):
        try:
            response = requests.post(API_URL, json=sign_payload,
                                     timeout=timeout)
        except requests.exceptions.RequestException as e:
            print(f"Request {try_req_n + 1} failed, error: {e}")
            continue
        if response.status_code != 200:
            # Keep the error body for debugging
            with open("w.txt", "w") as f:
                f.write(response.text)
            print(f"Request {try_req_n + 1} failed, status code: {response.status_code}")
            continue
        # Concatenate the recognised text lines
        ocr_data = response.json().get("data")
        if not ocr_data:
            return ""
        return "".join(ocr_data[0]["prunedResult"]["rec_texts"])
    # All attempts failed
    return ""


if __name__ == '__main__':
    file_path = "./123.png"
    print(ocr_recognize(file_path, secret_key=SECRET_KEY))
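Both sides serialize `ocr_data` with `sort_keys=True` and `ensure_ascii=False` before hashing. The short sketch below illustrates why the options must be identical on client and server: the same data with a different key order only produces the same string when the keys are sorted. The helper name `canonical` and the sample values are made up for the demonstration.

```python
import json


def canonical(data: dict) -> str:
    """Serialize with the same options that the signing code uses."""
    return json.dumps(data, ensure_ascii=False, sort_keys=True)


a = {"image_name": "测试.png", "image_base64": "aGVsbG8="}
b = {"image_base64": "aGVsbG8=", "image_name": "测试.png"}  # same data, different key order

print(canonical(a) == canonical(b))    # True
print(json.dumps(a) == json.dumps(b))  # False: insertion order leaks into the string
print(canonical(a))
```

If one side used `ensure_ascii=True` instead, non-ASCII file names would be escaped differently and every such request would fail verification.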