Linux 35.6 + JetPack v5.1.4: A Python Framework for Real-Time RTP Video
- 1. Motivation
- 2. Approach
- 3. Methodology
  - 3.1 Further Thoughts - Plan Carefully Before Deciding
  - 3.2 Further Thoughts - No Procrastination or Hesitation
  - 3.3 Further Thoughts - A Philosophical View
  - 3.4 Logic in Practice - Methodology
- 4 Preparation
- 5. Analysis
  - 5.1 gst-launch-1.0
    - 5.1.1 xvimagesink
    - 5.1.2 nv3dsink
    - 5.1.3 nv3dsink + sync=0
    - 5.1.4 xvimagesink + sync=0
  - 5.2 python framework
    - 5.2.1 xvimagesink
    - 5.2.2 nv3dsink
    - 5.2.3 xvimagesink + sync=0
    - 5.2.4 nv3dsink + sync=0
- 6. Summary & Optimization
- 7. Addendum - RTP/RTSP Push/Pull Streaming
1. Motivation
DeepStream currently ships no Python demo for analyzing a real-time RTP video stream, although it does include RTSP samples.
Application | Description |
---|---|
deepstream-test1 | 4-class object detection pipeline - now also demonstrates support for new nvstreammux |
deepstream-test2 | 4-class object detection, tracking, and attribute classification pipeline |
deepstream-test3 | Multi-stream pipeline performing 4-class object detection - now also supports Triton inference server, no-display mode, file-loop, and silent mode |
deepstream-test4 | msgbroker for sending analytics results to the cloud |
deepstream-imagedata-multistream | Multi-stream pipeline with access to image buffers |
deepstream-ssd-parser | SSD model inference via Triton server with output parsing in Python |
deepstream-test1-usbcam | deepstream-test1 pipeline with USB camera input |
deepstream-test1-rtsp-out | deepstream-test1 pipeline with RTSP output |
deepstream-opticalflow | Optical flow and visualization pipeline with flow vectors returned in NumPy array |
deepstream-segmentation | Segmentation and visualization pipeline with segmentation mask returned in NumPy array |
deepstream-nvdsanalytics | Multistream pipeline with analytics plugin |
runtime_source_add_delete | Add/delete source streams at runtime |
deepstream-imagedata-multistream-redaction | Multi-stream pipeline with face detection and redaction |
deepstream-rtsp-in-rtsp-out | Multi-stream pipeline with RTSP input/output - now takes new command line argument "--rtsp-ts" for configuring the RTSP source to attach the timestamp rather than the streammux |
deepstream-preprocess-test | Multi-stream pipeline using nvdspreprocess plugin with custom ROIs |
deepstream-demux-multi-in-multi-out | Multi-stream pipeline using nvstreamdemux plugin to generate separate buffer outputs |
deepstream-imagedata-multistream-cupy | Access imagedata buffer from GPU in a multistream source as CuPy array - x86 only |
deepstream-segmask | Access and interpret segmentation mask information from NvOSD_MaskParams |
deepstream-custom-binding-test | Demonstrate usage of NvDsUserMeta for attaching custom data structure - see also the Custom User Meta Guide |
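2. Approach
The plan is therefore to first write a framework based on an RTP video stream, and then port the DeepStream code over from the RTSP samples.
A few basic questions are usually worth asking:
- Why
- When
- What
- Who
- Where
- How
A quick first pass at these questions:
- Logically it should work: RTSP is essentially control information layered on top of RTP, so in theory GStreamer supports RTP. // Why
- Why is there no official sample for something this basic? People have asked on the forum, but no answer seems to have turned up. // Where
- Why has nobody else done it, and who would need it? // What, Who
- Is it too simple to be worth doing? Hello world is simple too, yet there are plenty of samples of exactly that kind. // How
Thinking it through from first principles, there is one issue people may not anticipate: performance.
Note: this point did not occur to me at first. As the analysis deepened step by step, it became apparent that RTP is mainly what brings a series of follow-on problems, especially in performance.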
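3. Methodology
The saying "plan before you act, which is not the same as not acting" reminds us:
- Act with a plan; do not act blindly.
- The purpose of a plan is to act better, not to bring action to a halt.
- Adjust flexibly while acting, and avoid armchair strategizing.
The saying captures an efficient way of working: thinking paves the way for action, action feeds back into thinking, and in the end knowledge and action become one.
3.1 Further Thoughts - Plan Carefully Before Deciding
Core idea: think and plan thoroughly before acting. "Planning" is not stalling; it is there to give the action a better direction.
Example:
Before launching a new product, a company conducts market research and risk analysis: that is the "planning". Once the research is done, it commits resources decisively to promotion: that is the "acting afterwards".
3.2 Further Thoughts - No Procrastination or Hesitation
Core idea: the purpose of planning is better action, not an excuse to put action off.
Example:
In study and work, setting a goal and then agonizing over details without ever starting is really a form of procrastination.
3.3 Further Thoughts - A Philosophical View
Core idea: planning is wisdom made visible and action is practice made visible; the two are unified in "the unity of knowledge and action".
Real success requires not only elevated thinking but also follow-through in action. "Plan before you act" stresses the importance of thinking while rejecting "thinking in place of doing".
3.4 Logic in Practice - Methodology
The logic is simple, and plenty of people can think and talk. Getting it to actually land takes a hands-on methodology.
An engineer's technical ability is not measured by past experience or achievements, because ability is not past tense; it is a latent future tense.
From a methodology standpoint, the main steps that make ability show are:
- Think
- Speak
- Write
- Do
- Summarize
- Optimize // this loops back to "Think", forming an upward spiral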
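4 Preparation
Note: environment installation is not covered here. For details, see: Linux 35.6 + JetPack v5.1.4@DeepStream Installation
Preparation mainly means test data, verification code, a test method, and a test environment. Without these, nobody will pay attention if you just show up asking for help. So the main things to prepare are:
- video-viewer command and parameters
- gst-launch-1.0 command and parameters
- python RTP framework test code
- Run tests with the commands and test code above, and collate the results
Then, leave specialist matters to the specialists:
- Not finding something by searching does not mean it does not exist; see whether technical support can turn up useful information.
- If there is nothing, is there a technical difficulty? Or some background reason that cannot be worked out for now?
- The test results were not satisfactory. Why? Working through data and sample code with the technical staff to find a solution is far more productive than guessing in isolation.
- Later, integrate or port the DeepStream code.
The environment is configured as follows: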
Software part of jetson-stats 4.2.12 - (c) 2024, Raffaello Bonghi
Model: NVIDIA Orin Nano Developer Kit - Jetpack 5.1.4 [L4T 35.6.0]
NV Power Mode[0]: 15W
Serial Number: [XXX Show with: jetson_release -s XXX]
Hardware:
- P-Number: p3767-0005
- Module: NVIDIA Jetson Orin Nano (Developer kit)
Platform:
- Distribution: Ubuntu 20.04 focal
- Release: 5.10.216-tegra
jtop:
- Version: 4.2.12
- Service: Active
Libraries:
- CUDA: 11.4.315
- cuDNN: 8.6.0.166
- TensorRT: 8.5.2.2
- VPI: 2.4.8
- OpenCV: 4.9.0 - with CUDA: YES
DeepStream C/C++ SDK version: 6.3
Python Environment:
Python 3.8.10
GStreamer: YES (1.16.3)
NVIDIA CUDA: YES (ver 11.4, CUFFT CUBLAS FAST_MATH)
OpenCV version: 4.9.0 CUDA True
YOLO version: 8.3.33
Torch version: 2.1.0a0+41361538.nv23.06
Torchvision version: 0.16.1+fdea156
DeepStream SDK version: 1.1.8
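5. Analysis
- I have long worked in the 0-to-1 phase, in other words the process of going from not knowing to knowing.
- In real work, the thinking starts out relatively divergent (though it is a bounded divergence based on assumed scenarios). That is a mouthful; it is worth reading twice.
- As the work goes deeper and gets more detailed, the problem usually converges.
- Local or partial conclusions then form, and at that point the root cause gradually emerges.
Given that the video-viewer tool obtains a stable 60 FPS video stream from a 1080P@60Hz RTP source, both of the following paths should logically show the same performance.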
5.1 gst-launch-1.0
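As for the gst-launch-1.0 command and its parameters: search online, build up your own notes over time until you get the general picture, and then keep asking questions tirelessly.
- xvimagesink: software rendering
- nv3dsink: hardware accelerated
- sync=0: process frames immediately (the sink renders each buffer as soon as it arrives instead of waiting on the pipeline clock) // feedback from an expert on the NVIDIA forum, a real expert rather than a self-styled one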
5.1.1 xvimagesink
gst-launch-1.0 -v udpsrc port=5600 ! application/x-rtp,encoding-name=H265,payload=96 ! rtph265depay ! h265parse ! nvv4l2decoder ! nvvidconv ! fpsdisplaysink text-overlay=0 video-sink=xvimagesink
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 9, dropped: 6, fps: 7.35, drop rate: 5.51
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 11, dropped: 10, fps: 3.86, drop rate: 7.73
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 19, dropped: 11, fps: 15.00, drop rate: 1.87
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 23, dropped: 15, fps: 7.61, drop rate: 7.61
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 30, dropped: 17, fps: 11.30, drop rate: 3.23
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 34, dropped: 20, fps: 6.83, drop rate: 5.13
5.1.2 nv3dsink
gst-launch-1.0 -v udpsrc port=5600 ! application/x-rtp,encoding-name=H265,payload=96 ! rtph265depay ! h265parse ! nvv4l2decoder ! fpsdisplaysink text-overlay=0 video-sink=nv3dsink
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 13, dropped: 6, fps: 18.44, drop rate: 1.84
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 17, dropped: 9, fps: 7.88, drop rate: 5.91
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 25, dropped: 10, fps: 15.16, drop rate: 1.90
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 29, dropped: 14, fps: 7.83, drop rate: 7.83
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 35, dropped: 17, fps: 11.38, drop rate: 5.69
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 42, dropped: 20, fps: 12.73, drop rate: 5.45
5.1.3 nv3dsink + sync=0
gst-launch-1.0 -v udpsrc port=5600 ! application/x-rtp,encoding-name=H265,payload=96 ! rtph265depay ! h265parse ! nvv4l2decoder ! fpsdisplaysink text-overlay=0 video-sink=nv3dsink sync=0
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 30, dropped: 0, current: 59.51, average: 59.51
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 61, dropped: 0, current: 60.00, average: 59.76
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 91, dropped: 0, current: 59.70, average: 59.74
5.1.4 xvimagesink + sync=0
gst-launch-1.0 -v udpsrc port=5600 ! application/x-rtp,encoding-name=H265,payload=96 ! rtph265depay ! h265parse ! nvv4l2decoder ! nvvidconv ! fpsdisplaysink text-overlay=0 video-sink=xvimagesink sync=0
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 70, dropped: 0, current: 67.39, average: 69.26
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 100, dropped: 0, current: 59.61, average: 66.05
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0: last-message = rendered: 129, dropped: 0, current: 57.52, average: 63.92
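The fpsdisplaysink messages above come in two shapes: `fps` / `drop rate` in the runs without `sync=0`, and `current` / `average` in the runs with it. To collate results across runs, a small parser can help. The following is a minimal sketch rather than part of the original test code; the function name `summarize` and the output fields are illustrative assumptions, and `dropped_share` is simply dropped / (rendered + dropped), not the `drop rate` figure that fpsdisplaysink prints.

```python
import re
from statistics import mean

# Matches both message shapes seen in the logs above:
#   "rendered: 9, dropped: 6, fps: 7.35, drop rate: 5.51"
#   "rendered: 30, dropped: 0, current: 59.51, average: 59.51"
_LINE = re.compile(
    r"rendered:\s*(?P<rendered>\d+),\s*dropped:\s*(?P<dropped>\d+),\s*"
    r"(?:fps|current):\s*(?P<fps>[\d.]+)"
)


def summarize(log_text):
    """Return the mean instantaneous FPS and the final rendered/dropped totals.

    Returns None when no fpsdisplaysink message is found in the text.
    """
    samples = [m for m in map(_LINE.search, log_text.splitlines()) if m]
    if not samples:
        return None
    last = samples[-1]  # the counters are cumulative, so the last line has the totals
    rendered, dropped = int(last["rendered"]), int(last["dropped"])
    return {
        "samples": len(samples),
        "mean_fps": mean(float(m["fps"]) for m in samples),
        "rendered": rendered,
        "dropped": dropped,
        "dropped_share": dropped / (rendered + dropped),
    }
```

Saving the output of each `gst-launch-1.0 -v ...` run to a file and passing its contents to `summarize` gives one comparable record per sink/sync combination.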
5.2 python framework
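Python actually obtains its data through the intermediate chain GST-Python --> PyBindings --> Data, so from the pipeline's point of view it is not really different from gst-launch-1.0. For installing GST-Python, the underlying libraries, and the Python components, see: Linux 35.6 + JetPack v5.1.4@DeepStream Installation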
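To make the "same pipeline, driven from Python" point concrete, here is a minimal sketch of such a framework. It is not the author's `deepstream.py`; the function names `build_pipeline_description` and `run` are hypothetical. It assumes PyGObject with the GStreamer 1.0 bindings is installed, and that the NVIDIA elements used in the gst-launch-1.0 commands above (`nvv4l2decoder`, `nvvidconv`, `nv3dsink`) are available on the board.

```python
def build_pipeline_description(port=5600, sink="nv3dsink", sync=False):
    """Compose the same H.265-over-RTP launch line used with gst-launch-1.0."""
    elements = [
        f"udpsrc port={port}",
        "application/x-rtp,encoding-name=H265,payload=96",
        "rtph265depay",
        "h265parse",
        "nvv4l2decoder",
    ]
    if sink == "xvimagesink":
        # The xvimagesink commands in section 5.1 place nvvidconv before the sink.
        elements.append("nvvidconv")
    elements.append(f"{sink} name=sink sync={int(sync)}")
    return " ! ".join(elements)


def run(description):
    """Play a launch-style pipeline until EOS, an error, or Ctrl+C."""
    # Imported lazily so the builder above works without PyGObject installed.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import GLib, Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(description)
    loop = GLib.MainLoop()

    def on_message(bus, message):
        # Stop the main loop on end-of-stream or on any pipeline error.
        if message.type in (Gst.MessageType.EOS, Gst.MessageType.ERROR):
            loop.quit()

    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect("message", on_message)

    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    except KeyboardInterrupt:
        pass
    finally:
        pipeline.set_state(Gst.State.NULL)
```

Calling `run(build_pipeline_description(5600, "nv3dsink", sync=False))` would then correspond to the nv3dsink + sync=0 case; the four sink/sync combinations tested below differ only in the arguments passed to the builder.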
5.2.1 xvimagesink
Performance: ~15 FPS
5.2.2 nv3dsink
Source code: python nv3dsink
Performance: ~20 FPS
5.2.3 xvimagesink + sync=0
daniel@daniel-nvidia:~/Work/jetson-fpv$ python3 ./utils/deepstream_xvimagesink.py 5600
Selected Pipeline: udpsrc port=5600 ! application/x-rtp,encoding-name=H265,payload=96 ! rtph265depay ! h265parse ! nvv4l2decoder ! nvvidconv ! xvimagesink sync=0
Pipeline elements:
- capsfilter0
- xvimagesink0
- nvvconv0
- nvv4l2decoder0
- h265parse0
- rtph265depay0
- udpsrc0
Opening in BLOCKING MODE
Running...
NvMMLiteOpen : Block : BlockType = 279
NvMMLiteBlockCreate : Block : BlockType = 279
FPS: 36.96
FPS: 58.96
FPS: 59.95
FPS: 59.93
FPS: 59.93
FPS: 60.98
FPS: 58.94
FPS: 59.93
FPS: 60.94
^CExiting...
deepstream done!
5.2.4 nv3dsink + sync=0
daniel@daniel-nvidia:~/Work/jetson-fpv$ python3 ./utils/deepstream.py 5600
Selected Pipeline: udpsrc port=5600 buffer-size=8388608 ! application/x-rtp,encoding-name=H265,payload=96 ! rtph265depay ! h265parse ! nvv4l2decoder ! nv3dsink name=sink sync=0
Opening in BLOCKING MODE
Running...
NvMMLiteOpen : Block : BlockType = 279
NvMMLiteBlockCreate : Block : BlockType = 279
FPS: 17.03
FPS: 59.95
FPS: 59.94
FPS: 59.93
FPS: 59.93
FPS: 59.96
FPS: 59.98
^CExiting...
deepstream done!
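The `FPS:` lines in the two logs above are printed roughly once per second. One way such a readout could be produced is to count buffers arriving at the sink and report the rate whenever a one-second window closes. The sketch below illustrates that idea and is not the author's implementation: `FpsCounter` and `attach` are hypothetical names, the injectable `clock` exists only to make the counter testable, and `attach` assumes PyGObject with GStreamer 1.0.

```python
import time


class FpsCounter:
    """Count frames and report the rate once per reporting interval."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.frames = 0
        self.window_start = clock()

    def tick(self):
        """Register one frame; return the FPS when a window closes, else None."""
        self.frames += 1
        now = self.clock()
        elapsed = now - self.window_start
        if elapsed < self.interval:
            return None
        fps = self.frames / elapsed
        # Start a fresh window from the current timestamp.
        self.frames = 0
        self.window_start = now
        return fps


def attach(counter, sink_element):
    """Count every buffer that reaches the sink's input pad."""
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    def on_buffer(pad, info):
        fps = counter.tick()
        if fps is not None:
            print(f"FPS: {fps:.2f}")
        return Gst.PadProbeReturn.OK  # let the buffer continue to the sink

    sink_element.get_static_pad("sink").add_probe(Gst.PadProbeType.BUFFER, on_buffer)
```

With a pipeline whose sink is declared as `name=sink`, the probe could be installed with `attach(FpsCounter(), pipeline.get_by_name("sink"))` before the pipeline is set to PLAYING. The first window includes start-up time, which would be consistent with the lower first reading in both logs.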
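6. Summary & Optimization
This is a brief write-up of the problems encountered with a simple Python framework for real-time RTP video, and of the general approach and steps taken to solve them with the methodology.
Engineering problems are actually far simpler than scientific ones, because engineering problems usually have an answer, whereas scientific problems sometimes lead to a dead end. Once you have the methodology, reasoning from first principles resolves engineering problems quickly.
The optimized version of the code obtains data at a stable 60 FPS, matching what we see in video-viewer.
Only on top of this version does it make sense to build DeepStream algorithms; otherwise the FPS is already compromised at the source.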
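7. Addendum - RTP/RTSP Push/Pull Streaming
Since many demos use RTSP, here are a few commonly used commands.
Note: I have a 1080P@60FPS device that can stably output video in a range of codecs, resolutions, and frame rates. Readers without such a source may need a simulated one, which is where the commands below come in.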
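- RTSP push/pull
gst-launch-1.0 -v videotestsrc ! x264enc ! rtph264pay ! udpsink host=<IP_ADDRESS> port=5000
gst-launch-1.0 -v rtspsrc location=rtsp://<RTSP_SERVER_IP>:<PORT>/stream ! decodebin ! autovideosink
(As written, the first command sends plain RTP over UDP through udpsink and is identical to the RTP push command below; serving the stream over RTSP additionally requires an RTSP server.)
- RTP push/pull
gst-launch-1.0 -v videotestsrc ! x264enc ! rtph264pay ! udpsink host=<IP_ADDRESS> port=5000
gst-launch-1.0 -v udpsrc port=5000 ! application/x-rtp,media=video,encoding-name=H264,payload=96 ! rtph264depay ! decodebin ! autovideosink
- Looping a given file
On a Jetson Orin board, the following tool makes it very simple to loop a video for testing: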
$ video-viewer file:///$(pwd)/<your video file> rtp://127.0.0.1:5000 --input-loop=-1 --headless --output-codec=h265
$ video-viewer rtp://@:5000
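When a simulated source is in use, it can be worth confirming that RTP packets really arrive on the port, and with the expected payload type, before looking at the decoding pipeline. The sketch below decodes the 12-byte fixed RTP header defined in RFC 3550. The function names `parse_rtp_header` and `peek` are hypothetical, and the default port 5000 simply mirrors the commands above.

```python
import socket
import struct


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,            # top two bits of the first byte
        "marker": bool(second & 0x80),    # top bit of the second byte
        "payload_type": second & 0x7F,    # e.g. 96, as in the caps strings above
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def peek(port=5000, count=5, timeout=2.0):
    """Print the headers of the first few datagrams arriving on a UDP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("0.0.0.0", port))
        for _ in range(count):
            data, sender = sock.recvfrom(65535)
            print(sender, parse_rtp_header(data))
```

`peek` binds the port itself, so it is best run while no receiving pipeline is listening on the same port. If nothing arrives within `timeout` seconds, `recvfrom` raises `socket.timeout`, which is itself a useful signal that the sender or the network path needs checking first.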