Audio/Video Fundamentals: H.264 Series (6): FFmpeg Source Code: Extracting the NALU Header, EBSP, RBSP, and SODB from an H.264 Bitstream

=================================================================

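Articles in the "Audio/Video Fundamentals: H.264 Series":

Audio/Video Fundamentals: H.264 Series (1): Downloading the Official H.264 Specification

Audio/Video Fundamentals: H.264 Series (2): Generating a Raw H.264 Stream File with the FFmpeg Command Line

Audio/Video Fundamentals: H.264 Series (3): EBSP, RBSP, and SODB

Audio/Video Fundamentals: H.264 Series (4): NALU Header: An Introduction to forbidden_zero_bit, nal_ref_idc, and nal_unit_type

Audio/Video Fundamentals: H.264 Series (5): Analysis of the FFmpeg Function That Parses the NALU Header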

=================================================================

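1. Introduction

FFmpeg uses the function ff_h2645_packet_split to extract the NALU Header, EBSP, RBSP, and SODB of each NALU from an H.264/H.265 bitstream. This article walks through that function, using H.264 as the example.

2. Declaration of ff_h2645_packet_split

ff_h2645_packet_split is declared in the header file libavcodec/h2645_parse.h of the FFmpeg source tree (this article uses FFmpeg 5.0.3):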

/**
 * Split an input packet into NAL units.
 *
 * If data == raw_data holds true for a NAL unit of the returned pkt, then
 * said NAL unit does not contain any emulation_prevention_three_byte and
 * the data is contained in the input buffer pointed to by buf.
 * Otherwise, the unescaped data is part of the rbsp_buffer described by the
 * packet's H2645RBSP.
 *
 * If the packet's rbsp_buffer_ref is not NULL, the underlying AVBuffer must
 * own rbsp_buffer. If not and rbsp_buffer is not NULL, use_ref must be 0.
 * If use_ref is set, rbsp_buffer will be reference-counted and owned by
 * the underlying AVBuffer of rbsp_buffer_ref.
 */
int ff_h2645_packet_split(H2645Packet *pkt, const uint8_t *buf, int length,
                          void *logctx, int is_nalff, int nal_length_size,
                          enum AVCodecID codec_id, int small_padding, int use_ref);

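This function extracts each NALU from the H.264 bitstream pointed to by buf, parses its NALU Header, and stores each NALU's header fields, EBSP, RBSP, and SODB information in the memory pointed to by pkt.

Parameter pkt: output parameter, of type H2645Packet *.

The H2645Packet struct is declared in libavcodec/h2645_parse.h: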

/* an input packet split into unescaped NAL units */
typedef struct H2645Packet {
    H2645NAL *nals;
    H2645RBSP rbsp;
    int nb_nals;
    int nals_allocated;
    unsigned nal_buffer_size;
} H2645Packet;

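After ff_h2645_packet_split has run, the pointer pkt->nals points to an array of H2645NAL elements. Each element holds the information of one NALU extracted from the H.264 bitstream: pkt->nals[0] holds the first NALU, pkt->nals[1] holds the second, and so on.

The H2645NAL struct is also declared in libavcodec/h2645_parse.h: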

typedef struct H2645NAL {
    const uint8_t *data;
    int size;

    /**
     * Size, in bits, of just the data, excluding the stop bit and any trailing
     * padding. I.e. what HEVC calls SODB.
     */
    int size_bits;

    int raw_size;
    const uint8_t *raw_data;

    GetBitContext gb;

    /**
     * NAL unit type
     */
    int type;

    /**
     * H.264 only, nal_ref_idc
     */
    int ref_idc;

    /**
     * HEVC only, nuh_temporal_id_plus_1 - 1
     */
    int temporal_id;

    /*
     * HEVC only, identifier of layer to which nal unit belongs
     */
    int nuh_layer_id;

    int skipped_bytes;
    int skipped_bytes_pos_size;
    int *skipped_bytes_pos;
} H2645NAL;

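Let subscript denote the index of an element of the array pointed to by pkt->nals (array indices start at 0, so pkt->nals[subscript] is element number subscript+1). Because pkt->nals[subscript] is a struct rather than a pointer, its members are accessed with ".". After ff_h2645_packet_split returns:

pkt->nals[subscript].data points to a buffer that holds the "NALU Header + RBSP" of NALU number subscript+1 extracted from the H.264 bitstream.

pkt->nals[subscript].size is the size, in bytes, of the buffer pointed to by pkt->nals[subscript].data.

pkt->nals[subscript].size_bits is the number of bits in this NALU's "NALU Header + SODB" (1 byte = 8 bits).

pkt->nals[subscript].raw_data points to a buffer that holds the "NALU Header + EBSP" of NALU number subscript+1.

pkt->nals[subscript].raw_size is the size, in bytes, of the buffer pointed to by pkt->nals[subscript].raw_data.

pkt->nals[subscript].type is the nal_unit_type from this NALU's NALU Header.

pkt->nals[subscript].ref_idc is the nal_ref_idc from this NALU's NALU Header.

pkt->nals[subscript].gb.buffer equals pkt->nals[subscript].data.

pkt->nals[subscript].gb.buffer_end points just past the last byte that contains bits of "NALU Header + SODB", that is, data + (size_bits + 7) / 8.

pkt->nals[subscript].gb.index is 8, meaning the first byte of this NALU (the 8-bit NALU Header) has already been read.

pkt->nals[subscript].gb.size_in_bits equals pkt->nals[subscript].size_bits.

pkt->nals[subscript].gb.size_in_bits_plus8 equals pkt->nals[subscript].gb.size_in_bits + 8.

pkt->nb_nals is the number of NALUs successfully extracted from this H.264 bitstream.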

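Parameter buf: input parameter. Pointer to a buffer that holds the H.264 bitstream, including start codes.

Parameter length: input parameter. The length, in bytes, of the buffer pointed to by buf.

Parameter logctx: input parameter. Used for log output; it can be ignored here.

Parameter is_nalff: input parameter. A nonzero value means each NALU is preceded by a length field instead of a start code. For a bitstream with start codes, as in this article, it is 0.

Parameter nal_length_size: input parameter. The size, in bytes, of that length field. It is only consulted when is_nalff is nonzero, so for a bitstream with start codes it is typically 0.

Parameter codec_id: input parameter. The codec ID. For an H.264 bitstream its value is AV_CODEC_ID_H264.

Parameter small_padding: input parameter. Usually 0 or 1. When it is 0, MAX_MBPAIR_SIZE extra bytes are requested when the RBSP buffer is allocated. It can be ignored here.

Parameter use_ref: input parameter. Usually 0. When it is set, rbsp_buffer is reference-counted, as described in the comment above the declaration. It can be ignored here.

Return value: 0 if the NALU Header, EBSP, RBSP, and SODB were extracted successfully. A negative error code such as AVERROR(ENOMEM) or AVERROR_INVALIDDATA indicates failure.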
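To make the is_nalff and nal_length_size parameters more concrete, here is a minimal sketch of reading one big-endian length prefix. The function read_nal_length is a hypothetical helper written for this article, not FFmpeg's get_nalsize, and it assumes the length field is 1 to 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of FFmpeg): read a big-endian NALU length
 * prefix of nal_length_size bytes (1..4) from the start of buf.
 * Returns the NALU length in bytes, or -1 if the prefix or the NALU it
 * announces would run past the end of the buffer. */
int64_t read_nal_length(const uint8_t *buf, size_t buf_size, int nal_length_size)
{
    uint32_t len = 0;

    assert(buf != NULL && nal_length_size >= 1 && nal_length_size <= 4);
    if (buf_size < (size_t)nal_length_size)
        return -1;

    /* Assemble the length most significant byte first. */
    for (int i = 0; i < nal_length_size; i++)
        len = (len << 8) | buf[i];

    /* The announced NALU must fit in the bytes that follow the prefix. */
    if (len > buf_size - (size_t)nal_length_size)
        return -1;
    return (int64_t)len;
}
```

For example, with a 4-byte prefix 00 00 00 02 followed by two payload bytes, the helper returns 2. With start-code bitstreams this step is not needed at all, which is why both parameters are normally 0 there.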

3. Definition of ff_h2645_packet_split

ff_h2645_packet_split is defined in libavcodec/h2645_parse.c:

int ff_h2645_packet_split(H2645Packet *pkt, const uint8_t *buf, int length,
                          void *logctx, int is_nalff, int nal_length_size,
                          enum AVCodecID codec_id, int small_padding, int use_ref)
{
    GetByteContext bc;
    int consumed, ret = 0;
    int next_avc = is_nalff ? 0 : length;
    int64_t padding = small_padding ? 0 : MAX_MBPAIR_SIZE;

    bytestream2_init(&bc, buf, length);
    alloc_rbsp_buffer(&pkt->rbsp, length + padding, use_ref);

    if (!pkt->rbsp.rbsp_buffer)
        return AVERROR(ENOMEM);

    pkt->rbsp.rbsp_buffer_size = 0;
    pkt->nb_nals = 0;
    while (bytestream2_get_bytes_left(&bc) >= 4) {
        H2645NAL *nal;
        int extract_length = 0;
        int skip_trailing_zeros = 1;

        if (bytestream2_tell(&bc) == next_avc) {
            int i = 0;
            extract_length = get_nalsize(nal_length_size,
                                         bc.buffer, bytestream2_get_bytes_left(&bc), &i, logctx);
            if (extract_length < 0)
                return extract_length;

            bytestream2_skip(&bc, nal_length_size);

            next_avc = bytestream2_tell(&bc) + extract_length;
        } else {
            int buf_index;

            if (bytestream2_tell(&bc) > next_avc)
                av_log(logctx, AV_LOG_WARNING, "Exceeded next NALFF position, re-syncing.\n");

            /* search start code */
            buf_index = find_next_start_code(bc.buffer, buf + next_avc);

            bytestream2_skip(&bc, buf_index);

            if (!bytestream2_get_bytes_left(&bc)) {
                if (pkt->nb_nals > 0) {
                    // No more start codes: we discarded some irrelevant
                    // bytes at the end of the packet.
                    return 0;
                } else {
                    av_log(logctx, AV_LOG_ERROR, "No start code is found.\n");
                    return AVERROR_INVALIDDATA;
                }
            }

            extract_length = FFMIN(bytestream2_get_bytes_left(&bc), next_avc - bytestream2_tell(&bc));

            if (bytestream2_tell(&bc) >= next_avc) {
                /* skip to the start of the next NAL */
                bytestream2_skip(&bc, next_avc - bytestream2_tell(&bc));
                continue;
            }
        }

        if (pkt->nals_allocated < pkt->nb_nals + 1) {
            int new_size = pkt->nals_allocated + 1;
            void *tmp;

            if (new_size >= INT_MAX / sizeof(*pkt->nals))
                return AVERROR(ENOMEM);

            tmp = av_fast_realloc(pkt->nals, &pkt->nal_buffer_size, new_size * sizeof(*pkt->nals));
            if (!tmp)
                return AVERROR(ENOMEM);

            pkt->nals = tmp;
            memset(pkt->nals + pkt->nals_allocated, 0, sizeof(*pkt->nals));

            nal = &pkt->nals[pkt->nb_nals];
            nal->skipped_bytes_pos_size = FFMIN(1024, extract_length/3+1); // initial buffer size
            nal->skipped_bytes_pos = av_malloc_array(nal->skipped_bytes_pos_size, sizeof(*nal->skipped_bytes_pos));
            if (!nal->skipped_bytes_pos)
                return AVERROR(ENOMEM);

            pkt->nals_allocated = new_size;
        }
        nal = &pkt->nals[pkt->nb_nals];

        consumed = ff_h2645_extract_rbsp(bc.buffer, extract_length, &pkt->rbsp, nal, small_padding);
        if (consumed < 0)
            return consumed;

        if (is_nalff && (extract_length != consumed) && extract_length)
            av_log(logctx, AV_LOG_DEBUG,
                   "NALFF: Consumed only %d bytes instead of %d\n",
                   consumed, extract_length);

        bytestream2_skip(&bc, consumed);

        /* see commit 3566042a0 */
        if (bytestream2_get_bytes_left(&bc) >= 4 &&
            bytestream2_peek_be32(&bc) == 0x000001E0)
            skip_trailing_zeros = 0;

        nal->size_bits = get_bit_length(nal, skip_trailing_zeros);

        if (nal->size <= 0 || nal->size_bits <= 0)
            continue;

        ret = init_get_bits(&nal->gb, nal->data, nal->size_bits);
        if (ret < 0)
            return ret;

        /* Reset type in case it contains a stale value from a previously parsed NAL */
        nal->type = 0;

        if (codec_id == AV_CODEC_ID_HEVC)
            ret = hevc_parse_nal_header(nal, logctx);
        else
            ret = h264_parse_nal_header(nal, logctx);
        if (ret < 0) {
            av_log(logctx, AV_LOG_WARNING, "Invalid NAL unit %d, skipping.\n",
                   nal->type);
            continue;
        }

        pkt->nb_nals++;
    }

    return 0;
}

4. How ff_h2645_packet_split works internally

ff_h2645_packet_split first calls:

bytestream2_init(&bc, buf, length);

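to initialize the GetByteContext variable bc, so that bc.buffer points to the beginning (first address) of the H.264 bitstream including start codes. (For the usage of bytestream2_init and related functions, see "FFmpeg byte-manipulation source code: the GetByteContext struct and the bytestream2_init, bytestream2_get_bytes_left, and bytestream2_tell functions".)

Then: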

while (bytestream2_get_bytes_left(&bc) >= 4){
//...
}

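checks whether at least 4 bytes of the H.264 bitstream remain unread. If so, the loop body is executed.

For a bitstream with start codes, is_nalff is 0, so next_avc equals length. Since at least 4 bytes are still unread inside the loop, bytestream2_tell(&bc) == next_avc cannot hold there, and the else { //... } branch is executed: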

if (bytestream2_tell(&bc) == next_avc) {
//...
}else{
//...
}

Then:

/* search start code */
buf_index = find_next_start_code(bc.buffer, buf + next_avc);
bytestream2_skip(&bc, buf_index);

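finds the start code (0x000001 or 0x00000001) in the H.264 bitstream and moves bc.buffer to the first byte after it, that is, to the position "with the first start code removed".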
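The start-code search can be illustrated with a small standalone function. This is a simplified sketch, not FFmpeg's find_next_start_code; the name skip_start_code and its "return size when nothing is found" convention are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first byte AFTER the first
 * 00 00 01 pattern in buf, or size if the buffer contains no start code.
 * A 4-byte start code 00 00 00 01 ends in 00 00 01, so it is matched too. */
size_t skip_start_code(const uint8_t *buf, size_t size)
{
    assert(buf != NULL || size == 0);

    for (size_t i = 0; i + 3 <= size; i++) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
            return i + 3;
    }
    return size;
}
```

Only the 3-byte pattern has to be searched for, because any leading zero byte of a 4-byte start code is simply skipped along with the bytes before it.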

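If the end of the H.264 bitstream has been reached at this point and at least one NALU has already been extracted (pkt->nb_nals > 0), the function returns 0. If the end has been reached without any start code having been found, the bitstream is invalid and the function returns AVERROR_INVALIDDATA: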

if (!bytestream2_get_bytes_left(&bc)) {
    if (pkt->nb_nals > 0) {
    // No more start codes: we discarded some irrelevant
    // bytes at the end of the packet.
        return 0;
    } else {
        av_log(logctx, AV_LOG_ERROR, "No start code is found.\n");
        return AVERROR_INVALIDDATA;
    }
}

Execution continues with:

consumed = ff_h2645_extract_rbsp(bc.buffer, extract_length, &pkt->rbsp, nal, small_padding);

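which obtains the "NALU Header + RBSP" and the "NALU Header + EBSP" of the current NALU (the first NALU of the bitstream on the first iteration). For ff_h2645_extract_rbsp, see "FFmpeg source code: analysis of the ff_h2645_extract_rbsp function".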
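The difference between "NALU Header + EBSP" and "NALU Header + RBSP" is only the emulation prevention bytes. The following is a minimal sketch of that conversion under simplifying assumptions: ebsp_to_rbsp is a hypothetical name, the input is exactly one NALU, and none of the extra bookkeeping done by ff_h2645_extract_rbsp (skipped byte positions, padding, detecting the next start code) is reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy src to dst while dropping every
 * emulation_prevention_three_byte, i.e. the 0x03 in a 00 00 03 sequence.
 * dst must have room for at least src_size bytes.
 * Returns the number of bytes written to dst. */
size_t ebsp_to_rbsp(const uint8_t *src, size_t src_size, uint8_t *dst)
{
    size_t out = 0;
    int zeros = 0; /* number of consecutive 0x00 bytes just copied */

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < src_size; i++) {
        if (zeros >= 2 && src[i] == 0x03) {
            zeros = 0; /* drop the escape byte, restart the zero count */
            continue;
        }
        dst[out++] = src[i];
        zeros = (src[i] == 0x00) ? zeros + 1 : 0;
    }
    return out;
}
```

When the input contains no 00 00 03 sequence, the output is identical to the input. That is the case in which FFmpeg can leave data equal to raw_data, as the comment above the declaration describes.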

Next:

bytestream2_skip(&bc, consumed);

moves bc.buffer past the bytes consumed by the current NALU, so that it points to where the next NALU begins.

Next:

nal->size_bits = get_bit_length(nal, skip_trailing_zeros);

obtains the number of bits in "NALU Header + SODB". For get_bit_length, see "FFmpeg source code: analysis of the get_bit_length function".
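The idea behind this bit count can be sketched as follows: drop trailing zero bytes, then drop the stop bit and the alignment zeros after it in the last nonzero byte. The function header_plus_sodb_bits is a hypothetical illustration that always skips trailing zero bytes. It does not reproduce the skip_trailing_zeros switch or the overflow check of the real get_bit_length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bits in "NALU Header + SODB", given the
 * bytes of "NALU Header + RBSP". Returns 0 if no data bits remain. */
size_t header_plus_sodb_bits(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);

    /* Drop trailing zero bytes that follow the RBSP trailing bits. */
    while (size > 0 && data[size - 1] == 0x00)
        size--;
    if (size == 0)
        return 0;

    /* The last nonzero byte ends with rbsp_stop_one_bit followed by
     * alignment zero bits: count those zeros plus the stop bit itself. */
    uint8_t last = data[size - 1];
    size_t trailing = 1;
    while ((last & 1) == 0) {
        last >>= 1;
        trailing++;
    }
    return size * 8 - trailing;
}
```

For the bytes 65 88 84, the last byte is 1000 0100 in binary, so the stop bit and two alignment zeros are removed and the result is 24 - 3 = 21 bits.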

Next:

ret = h264_parse_nal_header(nal, logctx);

parses the NALU Header. For h264_parse_nal_header, see "Audio/Video Fundamentals: H.264 Series (5): Analysis of the FFmpeg Function That Parses the NALU Header".
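As a reminder of what parsing the one-byte H.264 NALU Header means, here is a standalone sketch that uses shifts and masks instead of FFmpeg's GetBitContext. The NaluHeader struct and parse_nalu_header function are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct mirroring the three H.264 NALU Header fields. */
typedef struct NaluHeader {
    int forbidden_zero_bit; /* 1 bit, 0 in a valid stream */
    int nal_ref_idc;        /* 2 bits */
    int nal_unit_type;      /* 5 bits */
} NaluHeader;

/* Split the first byte of a NALU (the byte right after the start code)
 * into its header fields. nal must point to at least one byte. */
NaluHeader parse_nalu_header(const uint8_t *nal, size_t size)
{
    NaluHeader h;

    assert(nal != NULL && size >= 1);
    h.forbidden_zero_bit = (nal[0] >> 7) & 0x01;
    h.nal_ref_idc        = (nal[0] >> 5) & 0x03;
    h.nal_unit_type      =  nal[0]       & 0x1F;
    return h;
}
```

For the header byte 0x67 (binary 0110 0111) this gives forbidden_zero_bit = 0, nal_ref_idc = 3, and nal_unit_type = 7, which is an SPS. The three fields add up to 8 bits, which matches gb.index being 8 after the header has been read.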

The count of NALUs extracted from this H.264 bitstream is then incremented by 1:

pkt->nb_nals++;

The while loop then continues with the next NALU until the whole H.264 bitstream has been read:

while (bytestream2_get_bytes_left(&bc) >= 4) {
//...
}
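Putting the steps of this section together, the overall loop can be sketched in a few lines of standalone C. This is a deliberately simplified illustration, not a reimplementation of ff_h2645_packet_split. The names next_start_code and split_annexb are hypothetical, only start-code bitstreams are handled, and the sketch records just the nal_unit_type of each NALU without unescaping the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the next 00 00 01 pattern at or after pos, or size if none. */
static size_t next_start_code(const uint8_t *buf, size_t size, size_t pos)
{
    for (; pos + 3 <= size; pos++)
        if (buf[pos] == 0x00 && buf[pos + 1] == 0x00 && buf[pos + 2] == 0x01)
            return pos;
    return size;
}

/* Hypothetical, simplified splitter: walks a start-code bitstream, stores
 * the nal_unit_type of up to max_nals NALUs in types[], and returns how
 * many NALUs were found. */
size_t split_annexb(const uint8_t *buf, size_t size, int *types, size_t max_nals)
{
    size_t count = 0;
    size_t pos;

    assert(buf != NULL && types != NULL);
    pos = next_start_code(buf, size, 0);

    /* Continue while a start code plus at least one header byte remain. */
    while (pos + 3 < size && count < max_nals) {
        size_t start = pos + 3;                         /* byte after 00 00 01 */
        size_t end = next_start_code(buf, size, start); /* end of this NALU    */

        if (end > start)
            types[count++] = buf[start] & 0x1F;         /* nal_unit_type       */
        pos = end;                                      /* go to the next NALU */
    }
    return count;
}
```

On a buffer holding an SPS, a PPS, and an IDR slice, each behind a start code, the sketch returns 3 and fills types with 7, 8, and 5. The real function additionally fills data, raw_data, size_bits, and the GetBitContext for every NALU, as described in section 2.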