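Installing PyAV from Source and Basic Usage

Building PyAV from Source

PyAV is a Python binding for the ffmpeg libraries. This post covers building and installing PyAV from source, and its basic usage.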

Contents

Building PyAV from Source
Using PyAV
- NVIDIA hardware encoding and decoding
- Python/C++ debugging

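Build steps

The PyAV source is at https://github.com/PyAV-Org/PyAV; follow the README to install it.

source scripts/activate.sh
This activates a virtual environment and requires the Python module virtualenv. By default it uses the python found on the default path, for example /usr/bin/python or miniconda3/bin/python. To pick the interpreter for the virtual environment, set the PYAV_PYTHON variable, which scripts/activate.sh reads:

export PYAV_PYTHON=~/miniconda3/envs/py310/bin/python
source scripts/activate.sh
pip install --upgrade -r tests/requirements.txt
requirements.txt lists the required modules without pinning versions, so the latest versions are installed.
./scripts/build-deps
This builds ffmpeg under PyAV/vendor. If ffmpeg already exists there, the ffmpeg build is skipped, so this is the place to substitute your own ffmpeg.
make
pip install .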

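Source tree layout

PyAV/
│
├── av/            # core PyAV module; shows how each Python interface calls into the ffmpeg interface
├── docs/          # documentation
├── src/           # Cython code, translated to C and compiled into extension modules that bridge Python and C/C++
├── examples/      # example code
├── tests/         # test code; usable from outside via import tests.xxx
├── vendor         # ffmpeg install path
├── venvs          # virtual environment install path
├── CHANGELOG.rst  # change log; lists the ffmpeg and Python versions each release supports
├── setup.cfg      # setuptools configuration
└── setup.py       # install script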

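Problems encountered during the build

Building the latest version directly should mostly just work, but to match my ffmpeg version (ffmpeg-4.2) I rolled PyAV back to an older release:

PyAV release history: PyAV-tags

make then failed with an error about log_callback: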

performance hint: av\logging.pyx:232:5: Exception check on 'log_callback' will always require the GIL to be acquired.
Possible solutions:
Declare the function as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
Use an 'int' return type on the function to allow an error code to be returned.

Error compiling Cython file:
...
cdef const char *log_context_name(void *ptr) nogil:
    cdef log_context obj = <log_context >ptr
    return obj.name

cdef lib.AVClass log_class
log_class.item_name = log_context_name

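From what I found, the usual fix is to downgrade Python (Python 3.13.x -> Python 3.12.x), as in this question:
https://stackoverflow.com/questions/77410272/problems-installing-python-av-in-windows-11

Downgrading Python did not help with PyAV-8.1.0, though. What fixed it was downgrading Cython: the default install pulls in Cython 3.x, and I rolled it back to 0.29.24.

I also hit the error below, which I did fix by downgrading Python:

fatal error: longintrepr.h: No such file or directory

In short, the versions have to match. Two combinations built successfully for me: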

Python	Cython	PyAV	ffmpeg
3.10	0.29.24	8.1.0	4.2
3.11	3.0.9	11.0	6.0
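
The table above can be turned into a small pre-build check. This is only a sketch: `KNOWN_GOOD`, `installed_version`, and `matches_known_good` are hypothetical names, not part of PyAV, and the table holds just the two combinations tested in this post, so a miss does not prove that a combination is broken.

```python
import sys
from importlib import metadata

# (python, cython, pyav, ffmpeg): only the two combinations that
# built successfully in this post.
KNOWN_GOOD = [
    ("3.10", "0.29.24", "8.1.0", "4.2"),
    ("3.11", "3.0.9", "11.0", "6.0"),
]


def installed_version(package):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_known_good(python, cython, pyav, ffmpeg):
    """True if this exact combination is listed in KNOWN_GOOD."""
    return (python, cython, pyav, ffmpeg) in KNOWN_GOOD


if __name__ == "__main__":
    # Report what is installed in the current environment.
    print("Python:", "{}.{}".format(*sys.version_info[:2]))
    print("Cython:", installed_version("Cython"))
    print("PyAV:", installed_version("av"))
```

Running it inside the virtual environment before make shows at a glance whether Cython 3.x was pulled in by default.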

Using PyAV

The repository's examples directory contains many cases that are well worth studying. For video decoding, see also:
https://blog.csdn.net/weixin_43360707/article/details/131654650

NVIDIA hardware encoding and decoding

The cases below use NVIDIA's hardware codecs to decode and encode video. They assume ffmpeg was built with the h264_cuvid and h264_nvenc codecs.
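
Before running the cases, it may help to confirm that the ffmpeg linked into PyAV actually exposes these codecs. The sketch below assumes `av.codecs_available` (the set of codec names PyAV reports) is present in your PyAV version; `missing_codecs` is a hypothetical helper name. It only tells you that the codecs were compiled in, not that a usable GPU or driver is present.

```python
import importlib

REQUIRED_CODECS = ("h264_cuvid", "h264_nvenc")


def missing_codecs(required=REQUIRED_CODECS):
    """Return the required codec names PyAV cannot find.

    Returns None when PyAV itself cannot be imported.
    """
    try:
        av = importlib.import_module("av")
    except ImportError:
        return None
    available = av.codecs_available  # names known to the linked ffmpeg
    return [name for name in required if name not in available]


if __name__ == "__main__":
    missing = missing_codecs()
    if missing is None:
        print("PyAV is not installed in this environment")
    elif missing:
        print("ffmpeg was built without:", ", ".join(missing))
    else:
        print("All required NVIDIA codecs are available")
```
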
NVIDIA decoding: decode a raw stream into a YUV file


import av


# The input must be an H.264 elementary stream in Annex B byte-stream format
# (not an MP4 file); it can be extracted beforehand, e.g. with the `ffmpeg` CLI.
h264_path = 'night-sky.h264'

fh = open(h264_path, 'rb')

codec = av.CodecContext.create('h264_cuvid', 'r')
output_file = "output.yuv"
output = open(output_file, 'wb')
while True:

    chunk = fh.read(1 << 16)

    packets = codec.parse(chunk)
    print("Parsed {} packets from {} bytes:".format(len(packets), len(chunk)))

    for packet in packets:

        print('   ', packet)

        frames = codec.decode(packet)
        for frame in frames:
            print('       ', frame)
            # yuv_frame = frame.to_ndarray(format='yuv420p')
            # output.write(yuv_frame.tobytes())
            yuv_frame = frame.reformat(1920, 1080, 'yuv420p')
            output.write(yuv_frame.planes[0].to_bytes())
            output.write(yuv_frame.planes[1].to_bytes())
            output.write(yuv_frame.planes[2].to_bytes())

    # Bail out only after the final empty chunk has been passed to
    # codec.parse(), because that empty chunk flushes the parser.
    if not chunk:
        break
fh.close()
output.close()

Running watch -n 1 nvidia-smi shows whether the NVIDIA hardware is really being used.
NVIDIA encoding: encode a YUV file into a bitstream file


import av
import numpy as np

input_file = "input.yuv"

output_raw_file = "stream.h264"
output_media_file = "output.mp4"

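# Resolution of the source YUV
width = 1920
height = 1080

input_yuv = open(input_file, 'rb')
output = open(output_raw_file, 'wb')

output_container = av.open(output_media_file, 'w')
output_stream = output_container.add_stream('h264_nvenc')
# Match the encoder to the raw input; otherwise it keeps its default size.
output_stream.width = width
output_stream.height = height
output_stream.pix_fmt = 'yuv420p'

# Start encoding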
frame_count = 0
while True:

    y_data = input_yuv.read(width * height)
    u_data = input_yuv.read(width * height // 4)
    v_data = input_yuv.read(width * height // 4)
    
    # Stop at EOF or on a truncated trailing frame
    if len(y_data) < width * height or len(v_data) < width * height // 4:
        break
    # Build an AVFrame for encoding
    yuv = np.concatenate([np.frombuffer(y_data, dtype=np.uint8),
                          np.frombuffer(u_data, dtype=np.uint8),
                          np.frombuffer(v_data, dtype=np.uint8)])
    yuv = yuv.reshape((height * 3 // 2, width))

    frame = av.VideoFrame.from_ndarray(yuv, format='yuv420p')
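    # Encode
    for packet in output_stream.encode(frame):
        # Write the raw stream
        output.write(packet.to_bytes())
        # Or mux into the MP4 container instead:
        # output_container.mux(packet)

    frame_count += 1

# Flush frames still buffered in the encoder
for packet in output_stream.encode(None):
    output.write(packet.to_bytes())

input_yuv.close()
output.close()
output_container.close()
print(f"Encoded {frame_count} frames.")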
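
The read sizes in the loop above follow from the planar YUV 4:2:0 layout: one full-resolution Y plane followed by U and V planes at a quarter of its size, so a frame occupies width * height * 3 / 2 bytes. The sketch below isolates that arithmetic; `yuv420p_frame_size` and `iter_yuv420p_planes` are hypothetical helper names, and even width and height are assumed.

```python
import io


def yuv420p_frame_size(width, height):
    """Bytes in one planar YUV 4:2:0 frame (even width and height assumed)."""
    return width * height * 3 // 2


def iter_yuv420p_planes(stream, width, height):
    """Yield (y, u, v) byte strings for each complete frame in a binary stream.

    Stops at EOF or when only a truncated frame remains.
    """
    y_size = width * height
    c_size = y_size // 4  # each chroma plane is subsampled 2x2
    frame_size = y_size + 2 * c_size
    while True:
        data = stream.read(frame_size)
        if len(data) < frame_size:
            return
        yield (data[:y_size],
               data[y_size:y_size + c_size],
               data[y_size + c_size:])


# Three all-zero 4x2 frames (12 bytes each) held in memory
demo = io.BytesIO(bytes(12) * 3)
print(sum(1 for _ in iter_yuv420p_planes(demo, 4, 2)))  # prints 3
```

Reading a whole frame at once and slicing it also makes truncated input easy to detect, since a short read can only occur on the last frame.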

NVIDIA transcoding: convert an H264 MP4 into an HEVC MP4


import av

input_file = 'H264_1080P.mp4'
output_file = 'HEVC_1080P.mp4'

input_container = av.open(input_file)

# video_stream = next(s for s in input_container.streams if s.type == 'video')
video_stream = input_container.streams.video[0]
output_container = av.open(output_file, 'w')

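output_stream = output_container.add_stream('hevc_nvenc', rate=video_stream.rate)  # encoder
# Match the encoder to the source resolution
output_stream.width = video_stream.codec_context.width
output_stream.height = video_stream.codec_context.height

codec = av.CodecContext.create('h264_cuvid', 'r')  # decoder

# Copy avcodec_ctx->extradata and avcodec_ctx->extradata_size from the demuxed stream
codec.extradata = video_stream.extradata

# Optionally dump the decoded frames as YUV
yuv_file = "output.yuv"
output = open(yuv_file, 'wb')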

# Transcode
for in_packet in input_container.demux(video_stream):
    print(in_packet)

    frames = codec.decode(in_packet)

    for frame in frames:
        # mux to mp4
        for packet in output_stream.encode(frame):
            output_container.mux(packet)
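        # Also write the decoded frame to the YUV file
        yuv_frame = frame.to_ndarray(format='yuv420p')
        output.write(yuv_frame.tobytes())

# Flush frames still buffered in the encoder
for packet in output_stream.encode(None):
    output_container.mux(packet)

input_container.close()
output_container.close()
output.close()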

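In my tests, the statement codec.extradata = video_stream.extradata has to run before decoding; otherwise codec.decode(packet) returns an empty frame list. I am not sure whether this counts as a bug. Specifically, ffmpeg's demuxing demo (doc/examples/demuxing_decoding.c) copies the stream parameters with avcodec_parameters_to_context, and PyAV may be missing this step when the codec context is created separately with av.CodecContext.create.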
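
One plausible explanation, which I have not verified against the PyAV source: H.264 packets demuxed from MP4 are length-prefixed rather than start-code-delimited, and the SPS/PPS parameter sets live in the container's avcC record, which ffmpeg exposes as extradata. A decoder that never receives that record cannot interpret the packets. The sketch below is a pair of heuristics for telling the two layouts apart; the function names are hypothetical and the checks are deliberately loose.

```python
def starts_with_annexb_start_code(data):
    """True if the bytes begin with an Annex B start code (00 00 01 or 00 00 00 01)."""
    return data.startswith(b"\x00\x00\x01") or data.startswith(b"\x00\x00\x00\x01")


def looks_like_avcc_extradata(extradata):
    """Heuristic: an avcC record is at least 7 bytes and starts with version byte 1."""
    return bool(extradata) and len(extradata) >= 7 and extradata[0] == 1


# Example use with the transcoding case above (not executed here):
#   looks_like_avcc_extradata(video_stream.extradata)
#   starts_with_annexb_start_code(bytes(in_packet))
```

If the extradata looks like avcC and the packets do not start with a start code, the input is in MP4 layout and the decoder needs the extradata, which matches the behavior observed above.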

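Python/C++ debugging

Because PyAV wraps ffmpeg for Python, plain gdb does not follow the Python program and pdb cannot step into the C functions, so a mixed Python/C++ debugging approach is needed.

VS Code has a Python C++ Debugger extension whose documentation explains how to use it; it is worth a try. There are other VS Code approaches as well, such as: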
https://nadiah.org/2020/03/01/example-debug-mixed-python-c-in-visual-studio-code/

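The article below describes a Python/C++ debugging approach and notes that "a debugger works by replacing code in the executable with exception-handling code, so that execution jumps into the debugger's own handling flow". In other words, attaching gdb directly to the Python process is enough to debug the C code path:
"编译、调试PyTorch源码1" (Compiling and Debugging the PyTorch Source, 1)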

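In practice, the pdb + gdb method often hung for me, possibly because of my machine:

Terminal 1: pdb, with print(os.getpid()) followed by pdb.set_trace()

Terminal 2: gdb -p <pid>

I suspected a conflict between pdb and gdb, so I later switched to a delay in the Python code (time.sleep(10)), which leaves time to read the current Python process's PID before starting gdb. I then attach gdb to that PID and set breakpoints with break; when execution reaches a breakpoint, the stack can be inspected.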
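
The delay-based approach can be wrapped in a small helper. This is a minimal sketch: `wait_for_debugger` is a hypothetical name, and the 10-second default simply mirrors the time.sleep(10) mentioned above.

```python
import os
import time


def wait_for_debugger(seconds=10, sleep=time.sleep, report=print):
    """Print this process's PID, then pause so gdb can attach from another terminal."""
    pid = os.getpid()
    report("PID {}: run `gdb -p {}` within {} s".format(pid, pid, seconds))
    sleep(seconds)
    return pid


# Usage: call wait_for_debugger() at the top of the script, before the first
# call into PyAV. In the second terminal, attach with gdb, set breakpoints
# with `break`, then `continue`.
```

The sleep and report arguments exist only so the helper can be exercised without actually waiting.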

Comments, discussion, and corrections are welcome. Thanks ~