[PaddleOCR] License Plate Recognition with the Latest PaddleOCR V5 Framework

I. Introduction to License Plate Recognition with the Latest PaddleOCR V5 Framework

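1. Workflow

  1. Environment setup

    • First, install the required dependencies: the PaddlePaddle deep learning framework, the PaddleOCR library, and OpenCV. They can be installed with pip, for example: pip install paddlepaddle paddleocr opencv-python
  2. Data processing

    • Convert the data into the detection and recognition dataset formats that PaddleOCR expects
  3. Model training

    • Train the detection and recognition models with PaddleOCR
  4. License plate detection

    • Run the PaddleOCR detection model on the preprocessed image to locate the plate in the image. This step usually returns the bounding-box coordinates of the plate.
  5. License plate recognition

    • Crop the detected plate region and run the PaddleOCR recognition model on it. This step converts the characters in the plate image into text.
  6. Result output

    • Output the recognition result, which can include the plate number and the recognition confidence. As needed, the result can be stored in a database or passed on for further processing.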
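
Steps 4 to 6 can be sketched as a small control-flow skeleton. This is only an illustration under stated assumptions: `detect` and `recognize` are hypothetical callables standing in for the PaddleOCR detection and recognition models, the image is a plain row-major list of rows, and boxes are axis-aligned. None of these names come from PaddleOCR itself.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), axis-aligned


@dataclass
class PlateResult:
    box: Box
    text: str
    score: float


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Crop a row-major image (a list of rows) to the given box."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in image[y1:y2]]


def run_pipeline(image, detect: Callable, recognize: Callable,
                 min_score: float = 0.5) -> List[PlateResult]:
    """Detect boxes, crop each one, recognize it, and keep confident results."""
    results = []
    for box in detect(image):
        text, score = recognize(crop(image, box))
        if score >= min_score:
            results.append(PlateResult(box, text, score))
    return results


# Hypothetical stand-ins, used only to exercise the control flow.
def fake_detect(image):
    return [(1, 1, 3, 2)]


def fake_recognize(patch):
    return "皖AH856P", 0.99


image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
results = run_pipeline(image, fake_detect, fake_recognize)
print(results)
```

Keeping the two models behind plain callables makes it easy to swap in a different detector or recognizer without touching the cropping and filtering logic.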

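Note that recognition quality depends on many factors, such as image quality, plate angle, and lighting. In practice the models may therefore need fine-tuning or other optimization for the target conditions to improve accuracy.

PaddleOCR also exposes many interfaces and parameters that can be adjusted to get better plate recognition for a given use case.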

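2. Dataset

This article uses the CCPD2019 license plate dataset.

Original dataset: github.com/detectRecog... The copy uploaded here excludes cars without plates, drops images whose sharpness is too low (below 70), and is deduplicated by MD5. It contains 121k images (down from 355k in the original), of which 113k are "皖" plates and 105k are "皖A" plates. The file name of each image is its annotation.

The dataset was collected in parking lots in Hefei between 7:30 a.m. and 10:00 p.m. Parking staff photographed the vehicles with handheld Android POS devices and annotated the plate positions by hand. The photos cover many difficult conditions, including blur, tilt, rain, and snow. The CCPD dataset contains nearly 300,000 images, each 720x1160x3, in 8 categories.

CCPD has no separate annotation files; the file name of each image is its label. For example, 025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-0_0_22_27_27_33_16-37-15.jpg is split by the separator '-' into the following parts: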

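  • 025: the area field, i.e. the ratio of the plate area to the whole image area
  • 95_113: two tilt angles, horizontal 95° and vertical 113°
  • 154&383_386&473: bounding-box coordinates, top-left (154, 383) and bottom-right (386, 473)
  • 386&473_177&454_154&383_363&402: coordinates of the four corner points, starting from the bottom-right corner
  • 0_0_22_27_27_33_16: the plate number, encoded as indices. The first index selects the province (0 is 皖 in the province dictionary); the remaining indices select letters and digits from the ads dictionary (for example, 0 is A and 22 is Y).
  • 37 and 15: brightness and blurriness

Only the normal plates in the dataset, i.e. the ccpd_base data, are used here.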
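
The file-name convention above can be decoded with a few lines of Python. The sketch below is illustrative only: the function name `parse_ccpd_name` is made up for this example, and the two character tables follow the index order of the CCPD README restricted to the entries this article uses (31 provinces, 24 letters without I and O, then the digits).

```python
import os

# Index tables; index 25 of PROVINCES is 藏 in the CCPD README.
PROVINCES = list("皖沪津渝冀晋蒙辽吉黑苏浙京闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新")
ALPHANUM = list("ABCDEFGHJKLMNPQRSTUVWXYZ0123456789")


def _xy(pair):
    """Turn '154&383' into (154, 383)."""
    x, y = pair.split("&")
    return int(x), int(y)


def parse_ccpd_name(filename):
    """Decode the fields of a CCPD file name into a dictionary."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    area, tilt, bbox, corners, plate, brightness, blur = stem.split("-")
    idx = [int(v) for v in plate.split("_")]
    return {
        "tilt": tuple(int(v) for v in tilt.split("_")),
        "bbox": [_xy(p) for p in bbox.split("_")],        # top-left, bottom-right
        "corners": [_xy(p) for p in corners.split("_")],  # starts at bottom-right
        "plate": PROVINCES[idx[0]] + "".join(ALPHANUM[i] for i in idx[1:]),
        "brightness": int(brightness),
        "blurriness": int(blur),
    }


name = ("025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-"
        "0_0_22_27_27_33_16-37-15.jpg")
info = parse_ccpd_name(name)
print(info["plate"], info["bbox"])
```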

II. PaddleOCR Environment Setup

1. Download PaddleOCR

  • It can be cloned directly from GitHub, although the download is slow
bash
git clone https://github.com/paddlepaddle/PaddleOCR --depth=1

# If the clone fails because of network problems, the mirror on Gitee can be used instead:
git clone https://gitee.com/paddlepaddle/PaddleOCR --depth=1
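  • I have already downloaded it from GitHub, packaged it, and published it as a public AI Studio dataset, so it can be used directly. Newly uploaded datasets are fetched a little differently and have to be downloaded manually.
python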
import paddle

# Check that PaddlePaddle is installed correctly
try:
    paddle.utils.run_check()
    print("PaddlePaddle安装成功,且运行正常。")
except Exception as e:
    print(f"PaddlePaddle运行异常:{e}")

# Run a simple tensor operation to verify that PaddlePaddle works
try:
    x = paddle.ones(shape=[2, 3], dtype='float32')
    y = paddle.ones(shape=[2, 3], dtype='float32')
    z = x + y
    print("张量操作成功,结果如下:")
    print(z.numpy())
except Exception as e:
    print(f"PaddlePaddle张量操作异常:{e}")

print(f"Paddle 版本{paddle.__version__}")
Output:
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:718: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)


Running verify PaddlePaddle program ... 


/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/pir/math_op_patch.py:219: UserWarning: Value do not have 'place' interface for pir graph mode, try not to use it. None will be returned.
  warnings.warn(
I0914 23:04:47.797250   110 pir_interpreter.cc:1524] New Executor is Running ...
W0914 23:04:47.798897   110 gpu_resources.cc:114] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 12.8, Runtime API Version: 12.6
I0914 23:04:47.800211   110 pir_interpreter.cc:1547] pir interpreter is running by multi-thread mode ...


PaddlePaddle works well on 1 GPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
PaddlePaddle安装成功,且运行正常。
张量操作成功,结果如下:
[[2. 2. 2.]
 [2. 2. 2.]]
Paddle 版本3.2.0
python
!nvidia-smi
Output:
Sun Sep 14 23:05:03 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.148.08             Driver Version: 570.148.08     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A800-SXM4-80GB          On  |   00000000:58:00.0 Off |                    0 |
| N/A   27C    P0             63W /  400W |     499MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A             110      C   ...on35-paddle120-env/bin/python        490MiB |
+-----------------------------------------------------------------------------------------+
python
!aistudio download --dataset javaroom/PaddleOCR-2025-9-13 --local_dir ./
Output:
['download', '--dataset', 'javaroom/PaddleOCR-2025-9-13', '--local_dir', './']
Downloading Model from remote to directory: /home/aistudio
Got 3 files, start to download ...
Downloading [README.md]: 100%|█████████████| 1.26k/1.26k [00:00<00:00, 3.44kB/s]
Downloading [dataset_infos.json]: 100%|██████████| 421/421 [00:00<00:00, 693B/s]
Downloading [paddleocr.zip]: 100%|████████████| 217M/217M [00:01<00:00, 137MB/s]
Processing 3 items: 100%|████████████████████| 3.00/3.00 [00:02<00:00, 1.39it/s]
Download model 'javaroom/PaddleOCR-2025-9-13' successfully.

Successfully Downloaded from dataset javaroom/PaddleOCR-2025-9-13.
python
!unzip paddleocr.zip 

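2. Install PaddleOCR

  • If only PaddleOCR inference is needed, install the inference package directly; for model training, export, and so on, the training dependencies must be installed as well.
  • The inference package and the training dependencies can be installed in the same environment; no environment isolation is required.
python
%cd ~/paddleocr

# Install the training dependencies
!python -m pip install -r requirements.txt
python
# Basic text recognition only (returns text positions and text content)
# python -m pip install paddleocr
# All features: document parsing, document understanding, document translation, key information extraction, etc.
!pip install "paddleocr[all]"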
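
After installing, it can help to confirm which packages actually ended up in the environment. The sketch below uses only the standard library; the helper name `installed_versions` is made up for this example, and the distribution names in the list are assumptions (for instance, OpenCV may be installed as opencv-python-headless or opencv-contrib-python instead).

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names are assumptions; adjust them to the actual environment.
wanted = ["paddlepaddle", "paddlepaddle-gpu", "paddleocr", "opencv-python"]
for name, version in installed_versions(wanted).items():
    print(f"{name}: {version or 'not installed'}")
```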

III. Data Processing

  • Detection data: convert to the ICDAR format
  • Recognition data: convert to the format used by PaddleOCR
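
Both label formats are plain text with one sample per line: the image path, a tab, and then either a JSON list of annotations (detection) or the transcription (recognition). The helpers below are a minimal sketch of that layout; the function names are made up for this example and are not part of PaddleOCR.

```python
import json


def det_label_line(image_path, text, points):
    """Detection label: path, a tab, then a JSON list of annotations."""
    annotations = [{"transcription": text, "points": points}]
    return image_path + "\t" + json.dumps(annotations, ensure_ascii=False)


def rec_label_line(image_path, text):
    """Recognition label: path, a tab, then the transcription."""
    return image_path + "\t" + text


def parse_det_label_line(line):
    """Inverse of det_label_line: returns (path, annotations)."""
    path, payload = line.rstrip("\n").split("\t", 1)
    return path, json.loads(payload)


quad = [[289, 288], [524, 299], [529, 382], [294, 371]]
det_line = det_label_line("work/CCPD/ccpd_base/example.jpg", "皖AH856P", quad)
rec_line = rec_label_line("work/img/000000.jpg", "皖AH856P")
print(det_line)
print(rec_line)
```

Using `json.dumps` rather than string formatting keeps the annotation valid JSON even if a transcription ever contains a quote character.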

1. Extract the data

The dataset is large, about 7 GB, so extraction takes a while.

python
# Extract the dataset
%cd ~
!unzip -q data/data17968/CCPD2019.zip -d work/CCPD
Output:
/home/aistudio

2. Convert to detection data

python
# Convert the file-name annotations into detection labels
%cd ~
import os
words_list = [
    "A", "B", "C", "D", "E",
    "F", "G", "H", "J", "K", 
    "L", "M", "N", "P", "Q", 
    "R", "S", "T", "U", "V", 
    "W", "X", "Y", "Z", "0", 
    "1", "2", "3", "4", "5", 
    "6", "7", "8", "9" ]

# Province abbreviations in CCPD index order (index 25 is 藏, Tibet)
con_list = [
    "皖", "沪", "津", "渝", "冀",
    "晋", "蒙", "辽", "吉", "黑",
    "苏", "浙", "京", "闽", "赣",
    "鲁", "豫", "鄂", "湘", "粤",
    "桂", "琼", "川", "贵", "云",
    "藏", "陕", "甘", "青", "宁",
    "新"]

# Write one label line per image; the with-block closes (and flushes) the file
# before it is read back for the train/dev split below.
with open('work/data_det.txt', 'w', encoding='UTF-8') as data:
    for item in os.listdir('work/CCPD/ccpd_base'):
        path = 'work/CCPD/ccpd_base/' + item
        # Fields: area, tilt, bbox, corner points, plate indices, brightness, blurriness
        _, _, bbox, points, label, _, _ = item.split('-')
        points = [[int(v) for v in p.split('&')] for p in points.split('_')]
        # Stored order starts at bottom-right; reorder to
        # top-left, top-right, bottom-right, bottom-left.
        points = points[-2:] + points[:2]
        label = label.split('_')
        con = con_list[int(label[0])]
        words = [words_list[int(i)] for i in label[1:]]
        label = con + ''.join(words)
        line = path + '\t' + '[{"transcription": "%s", "points": %s}]' % (label, str(points))
        data.write(line + '\n')

total = []
with open('work/data_det.txt', 'r', encoding='UTF-8') as f:
    for line in f:
        total.append(line)

with open('work/train_det.txt', 'w', encoding='UTF-8') as f:
    for line in total[:-500]:
        f.write(line)

with open('work/dev_det.txt', 'w', encoding='UTF-8') as f:
    for line in total[-500:]:
        f.write(line)
Output:
/home/aistudio
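
The conversion script takes the last 500 lines as the dev set, and those lines are in whatever order os.listdir returned them. If a reproducible random split is preferred, one option is a seeded shuffle before slicing. This is a sketch of that alternative, not what the article's run used; `split_lines` and its parameters are names chosen for this example.

```python
import random


def split_lines(lines, n_dev=500, seed=0):
    """Shuffle with a fixed seed, then hold out the last n_dev lines."""
    lines = list(lines)
    if not 0 < n_dev < len(lines):
        raise ValueError("n_dev must be between 1 and len(lines) - 1")
    random.Random(seed).shuffle(lines)
    return lines[:-n_dev], lines[-n_dev:]


# Toy input standing in for the lines of work/data_det.txt.
lines = [f"img_{i:06d}.jpg\tlabel\n" for i in range(1000)]
train, dev = split_lines(lines, n_dev=500, seed=0)
print(len(train), len(dev))
```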
python
%cd ~
!head -n3 work/data_det.txt
Output:
/home/aistudio
work/CCPD/ccpd_base/0270-2_6-289&288_529&382-529&382_294&371_289&288_524&299-0_0_7_32_29_30_13-129-131.jpg	[{"transcription": "皖AH856P", "points": [[289, 288], [524, 299], [529, 382], [294, 371]]}]
work/CCPD/ccpd_base/0158-0_2-249&463_466&524-466&524_252&524_249&463_463&463-0_0_21_27_29_26_15-127-81.jpg	[{"transcription": "皖AX352R", "points": [[249, 463], [463, 463], [466, 524], [252, 524]]}]
work/CCPD/ccpd_base/0220-8_6-103&474_297&569-295&569_103&542_105&474_297&501-0_0_14_10_24_32_30-127-104.jpg	[{"transcription": "皖AQL086", "points": [[105, 474], [297, 501], [295, 569], [103, 542]]}]
python
from PIL import Image

# Open an image file
image_path = 'work/CCPD/ccpd_base/0270-2_6-289&288_529&382-529&382_294&371_289&288_524&299-0_0_7_32_29_30_13-129-131.jpg'  # replace with your own image path
img = Image.open(image_path)

# Show the image
img.show()

3. Convert to recognition data

python
# Crop the plates and write the recognition labels
%cd ~
import os, cv2
words_list = [
    "A", "B", "C", "D", "E",
    "F", "G", "H", "J", "K", 
    "L", "M", "N", "P", "Q", 
    "R", "S", "T", "U", "V", 
    "W", "X", "Y", "Z", "0", 
    "1", "2", "3", "4", "5", 
    "6", "7", "8", "9" ]

# Province abbreviations in CCPD index order (index 25 is 藏, Tibet)
con_list = [
    "皖", "沪", "津", "渝", "冀",
    "晋", "蒙", "辽", "吉", "黑",
    "苏", "浙", "京", "闽", "赣",
    "鲁", "豫", "鄂", "湘", "粤",
    "桂", "琼", "川", "贵", "云",
    "藏", "陕", "甘", "青", "宁",
    "新"]
if not os.path.exists('work/img'):
    os.mkdir('work/img')
count = 0
data = open('work/data_rec.txt', 'w', encoding='UTF-8')
for item in os.listdir('work/CCPD/ccpd_base'):
    path = 'work/CCPD/ccpd_base/'+item
    _, _, bbox, _, label, _, _ = item.split('-')
    bbox = bbox.split('_')
    x1, y1 = bbox[0].split('&')
    x2, y2 = bbox[1].split('&')
    label = label.split('_')
    con = con_list[int(label[0])]
    words = [words_list[int(_)] for _ in label[1:]]
    label = con+''.join(words)
    bbox = [int(_) for _ in [x1, y1, x2, y2]]
    img = cv2.imread(path)
    crop = img[bbox[1]:bbox[3], bbox[0]:bbox[2], :]
    cv2.imwrite('work/img/%06d.jpg' % count, crop)
    data.write('work/img/%06d.jpg\t%s\n' % (count, label))
    count += 1
data.close()

with open('work/word_dict.txt', 'w', encoding='UTF-8') as f:
    for line in words_list+con_list:
        f.write(line+'\n')

total = []
with open('work/data_rec.txt', 'r', encoding='UTF-8') as f:
    for line in f:
        total.append(line)

with open('work/train_rec.txt', 'w', encoding='UTF-8') as f:
    for line in total[:-500]:
        f.write(line)

with open('work/dev_rec.txt', 'w', encoding='UTF-8') as f:
    for line in total[-500:]:
        f.write(line)
Output:
/home/aistudio
python
%cd ~
!head -n3 work/data_rec.txt
Output:
/home/aistudio
work/img/000000.jpg	皖AH856P
work/img/000001.jpg	皖AX352R
work/img/000002.jpg	皖AQL086
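
Because the recognition model can only predict characters listed in word_dict.txt, it is worth checking that every label uses only dictionary characters before training. The sketch below shows one way to do that on in-memory lines; `find_unknown_chars` is a name made up for this example.

```python
def find_unknown_chars(label_lines, dictionary_chars):
    """Return {image_path: [chars not in the dictionary]} for offending labels."""
    known = set(dictionary_chars)
    unknown = {}
    for line in label_lines:
        path, _, text = line.rstrip("\n").partition("\t")
        bad = [ch for ch in text if ch not in known]
        if bad:
            unknown[path] = bad
    return unknown


# Toy data: the dictionary is one character per line, as in word_dict.txt.
dictionary = "A H P 8 5 6 皖".split()
labels = ["work/img/000000.jpg\t皖AH856P\n", "work/img/000001.jpg\t皖AX352R\n"]
print(find_unknown_chars(labels, dictionary))
```

On the real files, the two arguments would be the lines of work/data_rec.txt and the stripped lines of work/word_dict.txt.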
python
from PIL import Image

# Open an image file
image_path = 'work/img/000002.jpg'  # replace with your own image path
img = Image.open(image_path)

# Show the image
img.show()

IV. Model Training

1. Download the pretrained models

We use the latest PP-OCRv5_server models; download the detection and recognition pretrained models separately.

python
%cd ~/paddleocr

# Download the PP-OCRv5_server_det pretrained model
!wget -P ./pretrain_models/ https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv5_server_det_pretrained.pdparams

# Download the PP-OCRv5_server_rec pretrained model
!wget -P ./pretrain_models/ https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv5_server_rec_pretrained.pdparams
Output:
/home/aistudio/paddleocr
--2025-09-14 22:33:00--  https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv5_server_det_pretrained.pdparams
Resolving paddle-model-ecology.bj.bcebos.com (paddle-model-ecology.bj.bcebos.com)... 100.67.184.196, 100.64.80.160, 100.64.80.202
Connecting to paddle-model-ecology.bj.bcebos.com (paddle-model-ecology.bj.bcebos.com)|100.67.184.196|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 105414496 (101M) [application/octet-stream]
Saving to: './pretrain_models/PP-OCRv5_server_det_pretrained.pdparams'

PP-OCRv5_server_det 100%[===================>] 100.53M   124MB/s    in 0.8s    

2025-09-14 22:33:01 (124 MB/s) - './pretrain_models/PP-OCRv5_server_det_pretrained.pdparams' saved [105414496/105414496]

--2025-09-14 22:33:01--  https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/PP-OCRv5_server_rec_pretrained.pdparams
Resolving paddle-model-ecology.bj.bcebos.com (paddle-model-ecology.bj.bcebos.com)... 100.64.80.160, 100.64.80.202, 100.67.184.196
Connecting to paddle-model-ecology.bj.bcebos.com (paddle-model-ecology.bj.bcebos.com)|100.64.80.160|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 214594738 (205M) [application/octet-stream]
Saving to: './pretrain_models/PP-OCRv5_server_rec_pretrained.pdparams'

PP-OCRv5_server_rec 100%[===================>] 204.65M   119MB/s    in 1.7s    

2025-09-14 22:33:03 (119 MB/s) - './pretrain_models/PP-OCRv5_server_rec_pretrained.pdparams' saved [214594738/214594738]
python
!ls pretrain_models -la
Output:
total 312520
drwxr-xr-x 2 aistudio aistudio      4096 Sep 14 22:33 .
drwxrwxrwx 1 aistudio aistudio      4096 Sep 14 22:33 ..
-rw-r--r-- 1 aistudio aistudio 105414496 May 15 23:51 PP-OCRv5_server_det_pretrained.pdparams
-rw-r--r-- 1 aistudio aistudio 214594738 May 13 12:27 PP-OCRv5_server_rec_pretrained.pdparams

2. Train the detection model

The detection training config is /home/aistudio/PP-OCRv5_server_det.yml. The main changes are:

  • Set the pretrained model
  • Set the train/eval datasets
yaml
Global:
  model_name: PP-OCRv5_server_det # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: &epoch_num 1
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/PP-OCRv5_server_det
  save_epoch_step: 10
  eval_batch_step:
  - 0
  - 1500
  cal_metric_during_train: false
  checkpoints:
  pretrained_model: ./pretrain_models/PP-OCRv5_server_det_pretrained.pdparams 
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./checkpoints/det_db/predicts_db.txt
  distributed: true
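
The excerpt above shows only the Global section. The dataset paths can be edited in the YAML file, or passed on the command line through the `-o key=value` overrides of tools/train.py. The sketch below builds such a command. It assumes the config follows the usual PaddleOCR layout with Train.dataset.data_dir, Train.dataset.label_file_list and the matching Eval keys, and that image paths in the label files are relative to /home/aistudio; verify both against the actual config before use. `build_train_command` is a name made up for this example.

```python
import shlex


def build_train_command(config, overrides):
    """Build a tools/train.py command line with `-o key=value` overrides."""
    cmd = ["python3", "tools/train.py", "-c", config]
    if overrides:
        cmd.append("-o")
    for key, value in overrides.items():
        if isinstance(value, (list, tuple)):
            # Lists are written in YAML flow style, e.g. [a.txt,b.txt].
            value = "[" + ",".join(str(v) for v in value) + "]"
        cmd.append(f"{key}={value}")
    return cmd


# Assumed keys and paths; the label files are the ones written earlier.
det_overrides = {
    "Train.dataset.data_dir": "/home/aistudio",
    "Train.dataset.label_file_list": ["/home/aistudio/work/train_det.txt"],
    "Eval.dataset.data_dir": "/home/aistudio",
    "Eval.dataset.label_file_list": ["/home/aistudio/work/dev_det.txt"],
}
cmd = build_train_command("/home/aistudio/PP-OCRv5_server_det.yml", det_overrides)
print(shlex.join(cmd))
```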
python
%cd ~/paddleocr
# GPU training; single-card and multi-card training are both supported, select the cards with CUDA_VISIBLE_DEVICES
!python3 tools/train.py -c /home/aistudio/PP-OCRv5_server_det.yml

Output:
global_step: 1906, lr: 0.000497, loss: 0.290166, loss_shrink_maps: 0.127447, loss_threshold_maps: 0.110843, loss_binary_maps: 0.025570, loss_cbn: 0.025570, avg_reader_cost: 0.00100 s, avg_batch_cost: 1.27046 s, avg_samples: 26.9, ips: 21.17346 samples/s, eta: 0:00:00, max_mem_reserved: 76064 MB, max_mem_allocated: 62609 MB
[2025/09/15 00:33:12] ppocr INFO: save model in ./output/PP-OCRv5_server_det/latest
[2025/09/15 00:33:12] ppocr INFO: best metric, hmean: 0.9950248756218906, is_float16: False, precision: 0.9900990099009901, recall: 1.0, fps: 47.51993165964418, best_epoch: 1

If training fails with the following error:

Output:
ERROR: Unexpected BUS error encountered in DataLoader worker. This might be caused by insufficient shared memory (shm), please check whether use_shared_memory is set and storage space in /dev/shm is enough
ERROR: Unexpected BUS error encountered in DataLoader worker. This might be caused by insufficient shared memory (shm), please check whether use_shared_memory is set and storage space in /dev/shm is enough

clear the shared-memory directory by running:

bash
rm /dev/shm/*
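
Before deleting anything, it can be useful to see how much shared memory is actually free. The sketch below only reads the filesystem statistics with the standard library; `free_mib` is a name made up for this example. As a separate, assumed alternative, PaddleOCR configs usually expose a use_shared_memory switch under the loader section that the error message refers to; check the config in use before relying on it.

```python
import os
import shutil


def free_mib(path):
    """Free space on the filesystem that holds `path`, in MiB."""
    return shutil.disk_usage(path).free / (1024 * 1024)


shm = "/dev/shm"
if os.path.isdir(shm):
    print(f"{shm}: {free_mib(shm):.0f} MiB free")
else:
    print(f"{shm} does not exist on this system")
```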

One epoch is estimated to take about 1 hour 34 minutes, according to the eta in the first log line:

Output:
[2025/09/14 23:17:21] ppocr INFO: epoch: [1/1], global_step: 10, lr: 0.000001, loss: 1.669999, loss_shrink_maps: 0.768423, loss_threshold_maps: 0.560487, loss_binary_maps: 0.153146, loss_cbn: 0.153146, avg_reader_cost: 0.49088 s, avg_batch_cost: 2.98450 s, avg_samples: 50.0, ips: 16.75323 samples/s, eta: 1:34:18, max_mem_reserved: 76064 MB, max_mem_allocated: 62609 MB



[2025/09/15 00:33:12] ppocr INFO: save model in ./output/PP-OCRv5_server_det/latest
[2025/09/15 00:33:12] ppocr INFO: best metric, hmean: 0.9950248756218906, is_float16: False, precision: 0.9900990099009901, recall: 1.0, fps: 47.51993165964418, best_epoch: 1

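3. Train the recognition model

The recognition training config is /home/aistudio/PP-OCRv5_server_rec.yml. The main changes are:

  • Set the pretrained model
  • Set the train/eval datasets
  • Adjust batch_size to the available GPU memory and fill the memory as far as possible; it does not have to be a power of two
yaml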
Global:
  model_name: PP-OCRv5_server_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 5
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/PP-OCRv5_server_rec
  save_epoch_step: 1
  eval_batch_step: [0, 1000]
  cal_metric_during_train: true
  calc_epoch_interval: 1
  pretrained_model: ./pretrain_models/PP-OCRv5_server_rec_pretrained.pdparams 
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: /home/aistudio/work/word_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv5.txt
  d2s_train_image_shape: [3, 48, 320]
python
%cd ~/paddleocr
# GPU training; single-card and multi-card training are both supported, select the cards with CUDA_VISIBLE_DEVICES
!python3 tools/train.py -c /home/aistudio/PP-OCRv5_server_rec.yml
Output:
[2025/09/15 01:31:41] ppocr INFO: epoch: [5/5], global_step: 820, lr: 0.000057, acc: 0.935625, norm_edit_dis: 0.989528, CTCLoss: 0.281386, NRTRLoss: 0.753598, loss: 1.033954, avg_reader_cost: 0.00861 s, avg_batch_cost: 1.48300 s, avg_samples: 679.9, ips: 458.46326 samples/s, eta: 0:00:07, max_mem_reserved: 68153 MB, max_mem_allocated: 62212 MB
[2025/09/15 01:31:49] ppocr INFO: epoch: [5/5], global_step: 825, lr: 0.000054, acc: 0.939962, norm_edit_dis: 0.989740, CTCLoss: 0.312621, NRTRLoss: 0.753528, loss: 1.065384, avg_reader_cost: 0.00385 s, avg_batch_cost: 0.72278 s, avg_samples: 293.3, ips: 405.79315 samples/s, eta: 0:00:00, max_mem_reserved: 68153 MB, max_mem_allocated: 62212 MB
[2025/09/15 01:31:50] ppocr INFO: save model in ./output/PP-OCRv5_server_rec/latest
[2025/09/15 01:31:50] ppocr INFO: save model in ./output/PP-OCRv5_server_rec/iter_epoch_5
[2025/09/15 01:31:50] ppocr INFO: best metric, acc: 0.9530721916836626, is_float16: False, norm_edit_dis: 0.9935427260596202, fps: 1411.8188540231401, best_epoch: 5
  • By the initial estimate, the 5 epochs finish in about 21 minutes
Output:
[2025/09/15 00:44:25] ppocr INFO: epoch: [1/5], global_step: 50, lr: 0.000120, acc: 0.000000, norm_edit_dis: 0.267684, CTCLoss: 38.783138, NRTRLoss: 0.983644, loss: 39.741526, avg_reader_cost: 0.01220 s, avg_batch_cost: 1.46025 s, avg_samples: 613.3, ips: 419.99745 samples/s, eta: 0:22:38, max_mem_reserved: 68153 MB, max_mem_allocated: 62212 MB
[2025/09/15 00:44:40] ppocr INFO: epoch: [1/5], global_step: 60, lr: 0.000150, acc: 0.000000, norm_edit_dis: 0.276049, CTCLoss: 30.285564, NRTRLoss: 0.856982, loss: 31.166797, avg_reader_cost: 0.01222 s, avg_batch_cost: 1.41677 s, avg_samples: 506.5, ips: 357.50347 samples/s, eta: 0:21:38, max_mem_reserved: 68153 MB, max_mem_allocated: 62212 MB
[2025/09/15 00:44:54] ppocr INFO: epoch: [1/5], global_step: 70, lr: 0.000180, acc: 0.000000, norm_edit_dis: 0.294244, CTCLoss: 26.253941, NRTRLoss: 0.824854, loss: 27.071569, avg_reader_cost: 0.01199 s, avg_batch_cost: 1.45010 s, avg_samples: 586.6, ips: 404.52491 samples/s, eta: 0:20:54, max_mem_reserved: 68153 MB, max_mem_allocated: 62212 MB
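
The training log lines are comma-separated `key: value` pairs, so metrics such as acc, loss, and eta can be pulled out with a few string operations, for example to plot them later. This is a minimal sketch that assumes the line layout shown above and an eta shorter than one day; the function names are made up for this example.

```python
def parse_train_log(line):
    """Split one ppocr training log line into a {key: value-string} dict."""
    body = line.split("ppocr INFO: ", 1)[-1]
    fields = {}
    for part in body.split(", "):
        key, sep, value = part.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def number(value):
    """Leading number of a value such as '1.46025 s' or '68153 MB'."""
    return float(value.split()[0])


def eta_seconds(eta):
    """Convert an 'H:MM:SS' eta into seconds."""
    hours, minutes, seconds = (int(v) for v in eta.split(":"))
    return hours * 3600 + minutes * 60 + seconds


sample = ("[2025/09/15 00:44:25] ppocr INFO: epoch: [1/5], global_step: 50, "
          "lr: 0.000120, acc: 0.000000, norm_edit_dis: 0.267684, "
          "CTCLoss: 38.783138, NRTRLoss: 0.983644, loss: 39.741526, "
          "avg_reader_cost: 0.01220 s, avg_batch_cost: 1.46025 s, "
          "avg_samples: 613.3, ips: 419.99745 samples/s, eta: 0:22:38, "
          "max_mem_reserved: 68153 MB, max_mem_allocated: 62212 MB")
fields = parse_train_log(sample)
print(fields["global_step"], number(fields["loss"]), eta_seconds(fields["eta"]))
```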

V. Model Export

1. Export the detection model

python
%cd ~/paddleocr
# Export the detection model
!python3 tools/export_model.py \
        -c /home/aistudio/PP-OCRv5_server_det.yml \
        -o Global.checkpoints=./output/PP-OCRv5_server_det/best_model/model \
        Global.save_inference_dir=./inference/det_db
Output:
/home/aistudio/paddleocr
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:718: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
Skipping import of the encryption module.
W0915 01:34:12.333737 267065 gpu_resources.cc:114] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 12.8, Runtime API Version: 12.6
[2025/09/15 01:34:12] ppocr INFO: resume from ./output/PP-OCRv5_server_det/best_model/model
[2025/09/15 01:34:12] ppocr INFO: Export inference config file to ./inference/det_db/inference.yml
Skipping import of the encryption module
W0915 01:34:14.080535 267065 eager_utils.cc:3441] Paddle static graph(PIR) not support input out tensor for now!!!!!
[2025/09/15 01:34:15] ppocr INFO: inference model is saved to ./inference/det_db/inference

2. Export the recognition model

python
%cd ~/paddleocr
# Export the recognition model
!python3 tools/export_model.py \
        -c /home/aistudio/PP-OCRv5_server_rec.yml \
        -o Global.checkpoints=./output/PP-OCRv5_server_rec/best_model/model \
        Global.save_inference_dir=./inference/rec_rare
Output:
/home/aistudio/paddleocr
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:718: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
Skipping import of the encryption module.
W0915 01:34:46.435449 268209 gpu_resources.cc:114] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 12.8, Runtime API Version: 12.6
[2025/09/15 01:34:46] ppocr INFO: resume from ./output/PP-OCRv5_server_rec/best_model/model
[2025/09/15 01:34:46] ppocr INFO: Export inference config file to ./inference/rec_rare/inference.yml
Skipping import of the encryption module
W0915 01:34:48.317775 268209 eager_utils.cc:3441] Paddle static graph(PIR) not support input out tensor for now!!!!!
[2025/09/15 01:34:49] ppocr INFO: inference model is saved to ./inference/rec_rare/inference

VI. Model Testing

python
%cd ~/paddleocr
!python3 tools/infer/predict_system.py \
    --image_dir="/home/aistudio/test" \
    --det_model_dir="./inference/det_db" \
    --rec_model_dir="./inference/rec_rare" \
    --rec_image_shape="3, 32, 320" \
    --rec_algorithm="RARE" \
    --use_space_char False \
    --max_text_length 7 \
    --rec_char_dict_path="/home/aistudio/work/word_dict.txt" \
    --use_gpu False 
Output:
/home/aistudio/paddleocr
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/utils/cpp_extension/extension_utils.py:718: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
[2025/09/15 01:40:22] ppocr INFO: In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320', if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
[2025/09/15 01:40:22] ppocr DEBUG: dt_boxes num : 1, elapsed : 0.2661449909210205
[2025/09/15 01:40:22] ppocr DEBUG: rec_res num  : 1, elapsed : 0.06274533271789551
[2025/09/15 01:40:22] ppocr DEBUG: 0  Predict time of /home/aistudio/test/1.jpeg: 0.331s
[2025/09/15 01:40:22] ppocr DEBUG: 苏苏DSS0000, 0.995
[2025/09/15 01:40:22] ppocr DEBUG: The visualized image saved in ./inference_results/1.jpeg
[2025/09/15 01:40:22] ppocr INFO: The predict total time is 0.347625732421875
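
The recognized text in this run, 苏苏DSS0000, has nine characters and a confidence of 0.995, while every ccpd_base label used for training has seven characters. A confidence threshold alone would not catch this, so a simple format check can serve as an extra filter. The sketch below assumes the 7-character layout of the ccpd_base plates (one province character, one letter, then five letters or digits, with I and O excluded); other plate types would need a different pattern, and the names are made up for this example.

```python
import re

# Province characters in CCPD index order (index 25 is 藏).
PROVINCES = "皖沪津渝冀晋蒙辽吉黑苏浙京闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新"
# Assumed layout: province, one letter, then five letters or digits.
PLATE_RE = re.compile(rf"[{PROVINCES}][A-HJ-NP-Z][A-HJ-NP-Z0-9]{{5}}")


def is_plausible_plate(text):
    """True if the text matches the assumed 7-character plate layout."""
    return PLATE_RE.fullmatch(text) is not None


for text in ["皖AH856P", "皖AQL086", "苏苏DSS0000"]:
    print(text, is_plausible_plate(text))
```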
python
from PIL import Image

# Open an image file
image_path = '/home/aistudio/test/1.jpeg'  # replace with your own image path
img = Image.open(image_path)

# Show the image
img.show()

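VII. Summary

  • The recognition model should be trained for more epochs: with only 5 epochs the accuracy is too low and there are too many misreads. Based on earlier experience, about 72 epochs should be enough.
  • Training the detection model is the most time-consuming step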