FFmpeg Basic Data Structures: An Analysis of AVInputFormat

1. AVInputFormat definition

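AVInputFormat is the core data structure in FFmpeg's public API for describing and operating a demuxer. You can think of it as the demuxer's "spec sheet" or "interface contract".

AVInputFormat is FFmpeg's demuxer object. It resembles a COM-interface-style data structure and represents an input container format; the emphasis is on what the format can do rather than on the state of any one file. Each container format (MP4, MKV, FLV, and so on) corresponds to exactly one AVInputFormat, which is responsible for recognizing and parsing that format. A running program therefore holds many AVInputFormat instances, one per available demuxer.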

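AVInputFormat: the public API structure. It is the interface exposed to developers (applications that use the FFmpeg libraries) and to the framework. It is what you see and work with in application code.
FFInputFormat: the internal implementation structure. It is used only inside the FFmpeg source tree and holds the implementation side of each concrete demuxer (callbacks, private-data size, internal flags). Its first member is the public part, AVInputFormat p.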


typedef struct AVInputFormat {
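    const char *name; // short name of the input format (may be a comma-separated list)
    const char *long_name; // descriptive, human-readable name
    int flags; // capability flags of the input format (AVFMT_*)
    const char *extensions; // comma-separated filename extensions
    const struct AVCodecTag * const *codec_tag; // pointer to a list of AVCodecTag tables,
                                                // used to map codec tags to codec IDs
    const AVClass *priv_class; // AVClass describing the demuxer's private data
    const char *mime_type; // MIME type(s) of the input format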
} AVInputFormat;

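The members have the following meaning:

name: a string holding the short name of the format; it may be a comma-separated list of names.
long_name: a string holding the descriptive name of the format.
flags: an integer holding the format's capability flags (AVFMT_* values).
extensions: a string holding the filename extensions associated with the format.
codec_tag: a pointer to a list of AVCodecTag tables, used to map codec tags to codec IDs.
priv_class: a pointer to the AVClass that describes the demuxer's private data (its private options).
mime_type: a string holding the MIME type(s) of the format.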

2. Where AVInputFormat comes into play

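AVInputFormat comes into play when a media file or a network stream is parsed. Its job is to separate the input into audio, video, subtitle, and other streams.

Automatic format detection: when you call avformat_open_input() with its third parameter (fmt, a const AVInputFormat *) set to NULL, FFmpeg runs its probing procedure. It reads some data from the beginning of the file (or of the network stream), calls the read_probe function of every registered format, matches the data against each format's signature, and selects the best-scoring demuxer.
Manual format selection: if you know the exact container format in advance, you can obtain the matching AVInputFormat with av_find_input_format() and pass it to avformat_open_input(). This skips the probing step and makes opening the file faster.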
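
The probing step described above can be modelled in a few lines. The following is only a toy sketch under stated assumptions: `ProbeData`, `Demuxer`, `probe_input()` and the three probe functions are hypothetical stand-ins, not FFmpeg's types or code, and real probe functions inspect far more than a few magic bytes. What it shows is the selection rule: every candidate scores the buffer, and the highest score wins.

```c
#include <stddef.h>
#include <string.h>

#define PROBE_SCORE_MAX 100  /* same numeric value as FFmpeg's AVPROBE_SCORE_MAX */

/* Hypothetical stand-ins for AVProbeData and a demuxer entry. */
typedef struct ProbeData { const unsigned char *buf; int buf_size; } ProbeData;
typedef struct Demuxer {
    const char *name;
    int (*read_probe)(const ProbeData *pd);
} Demuxer;

/* FLV files start with the ASCII signature "FLV". */
static int flv_probe(const ProbeData *pd) {
    return (pd->buf_size >= 3 && !memcmp(pd->buf, "FLV", 3)) ? PROBE_SCORE_MAX : 0;
}

/* MP4 files usually start with a box whose type (bytes 4..7) is "ftyp". */
static int mp4_probe(const ProbeData *pd) {
    return (pd->buf_size >= 8 && !memcmp(pd->buf + 4, "ftyp", 4)) ? PROBE_SCORE_MAX : 0;
}

/* Matroska files start with the EBML header ID 1A 45 DF A3. */
static int mkv_probe(const ProbeData *pd) {
    static const unsigned char ebml[4] = { 0x1A, 0x45, 0xDF, 0xA3 };
    return (pd->buf_size >= 4 && !memcmp(pd->buf, ebml, 4)) ? PROBE_SCORE_MAX : 0;
}

static const Demuxer demuxers[] = {
    { "flv", flv_probe }, { "mp4", mp4_probe }, { "matroska", mkv_probe },
};

/* Ask every demuxer for a score and return the best one, or NULL if none matched. */
static const Demuxer *probe_input(const ProbeData *pd) {
    const Demuxer *best = NULL;
    int best_score = 0;
    for (size_t i = 0; i < sizeof(demuxers) / sizeof(demuxers[0]); i++) {
        int score = demuxers[i].read_probe(pd);
        if (score > best_score) {
            best_score = score;
            best = &demuxers[i];
        }
    }
    return best;
}
```

Passing a buffer that begins with a 4-byte size followed by "ftyp" selects the "mp4" entry; a buffer that no probe recognizes yields NULL, which corresponds to a failed open in the real library.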


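avformat_open_input()
 └──-> av_probe_input_buffer2()   (only when no AVInputFormat was passed in)
      └──-> calls read_probe() of every AVInputFormat -> highest score wins
 └──-> allocates and binds priv_data (priv_data_size bytes)
 └──-> selected_iformat->read_header()
avformat_find_stream_info()       (called separately by the application)
 └──-> repeatedly calls selected_iformat->read_packet()
av_read_frame()                   (called in the application's read loop)
 └──-> selected_iformat->read_packet()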

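3. FFInputFormat: the input format demuxer

FFInputFormat is the core data structure that FFmpeg uses internally to describe and implement a demuxer. It manages the parsing and demuxing of a media container.

You can think of it as the "class" or "blueprint" of a demuxer.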

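Its main responsibilities are:

Identifying the container format of the input: for example, deciding whether a file is MP4, FLV, AVI, or MP3.
Parsing the container's metadata: reading the file header and similar structures to obtain the total duration, creation time, bitrate, number of audio and video streams, and so on.
Reading packets: extracting individual packets (usually compressed video or audio frames) from the container so that they can be handed to a decoder.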

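In short, FFInputFormat answers the following questions:

"What format is this file?" (via the read_probe function)
"How many streams does the file contain, and what are their codecs, resolutions, and sample rates?" (via the read_header function)
"How do I read the data from this file, frame by frame?" (via the read_packet function)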
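
To make the division of labour between these callbacks concrete, here is a small toy model of a demuxer driven through read_header, read_packet, and read_close. Everything in it is an assumption made for illustration: the `Demo*` types, the error code `DEMO_EOF`, and the one-byte "container" layout are invented and have nothing to do with FFmpeg's real structures.

```c
#include <stddef.h>

#define DEMO_EOF (-1)  /* hypothetical end-of-stream code */

/* Hypothetical stand-ins for AVFormatContext and AVPacket. */
typedef struct DemoContext {
    const unsigned char *data;  /* whole input held in memory */
    size_t size;
    size_t pos;                 /* current read position */
    int nb_streams;             /* filled in by read_header */
} DemoContext;

typedef struct DemoPacket { int stream_index; unsigned char payload; } DemoPacket;

typedef struct DemoDemuxer {
    int (*read_header)(DemoContext *s);
    int (*read_packet)(DemoContext *s, DemoPacket *pkt);
    int (*read_close)(DemoContext *s);
} DemoDemuxer;

/* Toy container: byte 0 = number of streams, then (stream_index, payload) pairs. */
static int toy_read_header(DemoContext *s) {
    if (s->size < 1)
        return -2;              /* invalid data */
    s->nb_streams = s->data[0];
    s->pos = 1;
    return 0;
}

static int toy_read_packet(DemoContext *s, DemoPacket *pkt) {
    if (s->pos + 2 > s->size)
        return DEMO_EOF;        /* no complete packet left */
    pkt->stream_index = s->data[s->pos];
    pkt->payload      = s->data[s->pos + 1];
    s->pos += 2;
    return 0;
}

static int toy_read_close(DemoContext *s) {
    s->pos = s->size;           /* nothing to free in this toy model */
    return 0;
}

static const DemoDemuxer toy_demuxer = { toy_read_header, toy_read_packet, toy_read_close };

/* Drive the callbacks in order; returns the packet count, or -1 if the header is bad. */
static int demux_all(const DemoDemuxer *d, DemoContext *s) {
    DemoPacket pkt;
    int count = 0;
    if (d->read_header(s) < 0)
        return -1;
    while (d->read_packet(s, &pkt) == 0)
        count++;
    d->read_close(s);
    return count;
}
```

The driver never looks inside the container itself; it only calls the function pointers. That is the property that lets one generic read loop serve every container format.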

The definition of FFInputFormat is shown below:


typedef struct FFInputFormat {
    AVInputFormat p; // the part visible through the public API
    enum AVCodecID raw_codec_id;
    int priv_data_size; // size of the demuxer's private data structure
    int flags_internal;
    // key function pointers
    int (*read_probe)(const AVProbeData *);
    int (*read_header)(struct AVFormatContext *);
    int (*read_packet)(struct AVFormatContext *, AVPacket *pkt);
    int (*read_close)(struct AVFormatContext *);
    int (*read_seek)(struct AVFormatContext *,
                     int stream_index, int64_t timestamp, int flags);
    int64_t (*read_timestamp)(struct AVFormatContext *s, int stream_index,
                              int64_t *pos, int64_t pos_limit);
    int (*read_play)(struct AVFormatContext *);
    int (*read_pause)(struct AVFormatContext *);
    int (*read_seek2)(struct AVFormatContext *s, int stream_index, 
        int64_t min_ts, int64_t ts, int64_t max_ts, int flags);
    int (*get_device_list)(struct AVFormatContext *s, struct AVDeviceInfoList *device_list);
} FFInputFormat;
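
Because the public AVInputFormat p is the first member, a pointer to the public part and a pointer to the enclosing FFInputFormat refer to the same address, so library code can convert one into the other with a cast. The sketch below illustrates that layout trick with hypothetical stand-in types (`PublicFormat`, `InternalFormat`, `to_internal()`); it is a model of the idea, not a copy of FFmpeg's internal helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the public structure (AVInputFormat). */
typedef struct PublicFormat {
    const char *name;
} PublicFormat;

/* Hypothetical stand-in for the internal structure (FFInputFormat). */
typedef struct InternalFormat {
    PublicFormat p;          /* must remain the first member */
    int priv_data_size;
    int (*read_probe)(const unsigned char *buf, int size);
} InternalFormat;

/* Recover the internal structure from a pointer to its public part.
 * Valid only for PublicFormat objects that really are embedded as the
 * first member of an InternalFormat. */
static const InternalFormat *to_internal(const PublicFormat *fmt) {
    assert(fmt != NULL);
    return (const InternalFormat *)fmt;
}
```

Applications only ever receive the `PublicFormat` pointer, so the internal fields can change between releases without breaking the public ABI.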

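How the key function pointers map to public API calls:

read_probe: used during format probing, for example by avformat_open_input() when no format is specified.
read_header: called by avformat_open_input().
read_packet: called by av_read_frame(), and also while avformat_find_stream_info() analyzes the streams.
read_close: called by avformat_close_input().
read_seek / read_seek2: reached through av_seek_frame() and avformat_seek_file().
read_play / read_pause: called by av_read_play() and av_read_pause().
get_device_list: called by avdevice_list_devices().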

4. FFInputFormat definition examples


const FFInputFormat ff_rtp_demuxer = {
    .p.name         = "rtp",
    .p.long_name    = NULL_IF_CONFIG_SMALL("RTP input"),
    .p.flags        = AVFMT_NOFILE,
    .p.priv_class   = &rtp_demuxer_class,
    .priv_data_size = sizeof(RTSPState),
    .read_probe     = rtp_probe,
    .read_header    = rtp_read_header,
    .read_packet    = ff_rtsp_fetch_packet,
    .read_close     = sdp_read_close,
};

const FFInputFormat ff_mov_demuxer = {
    // The public AVInputFormat part (abridged)
    .p = {
        .name           = "mov,mp4,m4a",
        .long_name      = "QuickTime / MOV",
        .flags          = AVFMT_NO_BYTE_SEEK,
        .extensions     = "mov,mp4,m4a",
        .priv_class     = &mov_class,
    },
    .priv_data_size    = sizeof(MOVContext), // size of the private data
    .read_probe        = mov_probe,          // format probing
    .read_header       = mov_read_header,    // header parsing
    .read_packet       = mov_read_packet,    // packet reading
    .read_close        = mov_read_close,     // cleanup
    .read_seek         = mov_read_seek,      // seeking
    .flags_internal    = FF_INFMT_FLAG_INIT_CLEANUP, // internal flag: run read_close if read_header fails
};

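// Using MP4 as an example (a simplified sketch, not the actual FFmpeg source):
typedef struct MOVContext {
    int64_t moov_offset;     // position of the 'moov' box
    int time_scale;          // time scale
    AVStream **streams;      // array of stream pointers
    // ... MP4-specific fields
} MOVContext;
// Initialized in read_header
static int mov_read_header(AVFormatContext *s) {
    MOVContext *mov = s->priv_data; // access the private data
    mov->time_scale = 1000;         // set a default time scale
    // ...
    return 0;                       // 0 on success, a negative AVERROR code on failure
}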
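
The link between priv_data_size and s->priv_data can be modelled as follows: the framework, not the demuxer, allocates a zeroed block of priv_data_size bytes before calling read_header, and the demuxer simply casts the pointer to its own context type. This is a toy sketch with hypothetical names (`ToyFormatContext`, `ToyDemuxer`, `ToyMovState`, `toy_open()`, `toy_close()`) and plain calloc/free; it assumes nothing about FFmpeg's actual allocation code beyond that general pattern.

```c
#include <stdlib.h>

/* Hypothetical stand-in for AVFormatContext. */
typedef struct ToyFormatContext { void *priv_data; } ToyFormatContext;

/* Hypothetical stand-in for the internal demuxer description. */
typedef struct ToyDemuxer {
    size_t priv_data_size;                   /* how much private state to allocate */
    int (*read_header)(ToyFormatContext *s);
} ToyDemuxer;

/* Private state of one demuxer; the generic code never looks inside it. */
typedef struct ToyMovState { int time_scale; long moov_offset; } ToyMovState;

static int toy_mov_read_header(ToyFormatContext *s) {
    ToyMovState *mov = s->priv_data;         /* already allocated and zeroed */
    mov->time_scale = 1000;
    return 0;
}

static const ToyDemuxer toy_mov = { sizeof(ToyMovState), toy_mov_read_header };

/* Generic open: allocate zeroed private data, then let the demuxer parse the header. */
static int toy_open(ToyFormatContext *s, const ToyDemuxer *d) {
    s->priv_data = calloc(1, d->priv_data_size);
    if (!s->priv_data)
        return -1;
    int ret = d->read_header(s);
    if (ret < 0) {                           /* undo the allocation on failure */
        free(s->priv_data);
        s->priv_data = NULL;
    }
    return ret;
}

/* Generic close: release the private data. */
static void toy_close(ToyFormatContext *s) {
    free(s->priv_data);
    s->priv_data = NULL;
}
```

Because the block is zero-initialized, fields that read_header does not touch (here `moov_offset`) start at 0, which is why demuxers can rely on default values without an explicit constructor.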

5. AVInputFormat usage example


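#include <stdio.h>
#include <libavformat/avformat.h>

int main(void) {
    AVFormatContext *fmt_ctx = NULL;
    // Look up the demuxer by short name; NULL means no such format is available
    const AVInputFormat *input_fmt = av_find_input_format("mp4");
    if (!input_fmt)
        return 1;
    // Passing input_fmt explicitly skips automatic probing
    int ret = avformat_open_input(&fmt_ctx, "input.mp4", input_fmt, NULL);
    if (ret < 0) {
        fprintf(stderr, "avformat_open_input failed: %d\n", ret);
        return 1;
    }
    avformat_close_input(&fmt_ctx); // frees the context and sets fmt_ctx to NULL
    return 0;
}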
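
The lookup av_find_input_format("mp4") succeeds even though the demuxer's name is the comma-separated list "mov,mp4,m4a", because the requested name is compared against each entry of that list. The function below is a simplified, hypothetical model of that comparison (`match_name()` is not an FFmpeg function). It is case-sensitive for brevity, whereas FFmpeg's own matching, as far as I know, ignores case.

```c
#include <string.h>

/* Returns 1 if `name` is exactly one of the entries in the
 * comma-separated list `names`, and 0 otherwise. */
static int match_name(const char *name, const char *names) {
    size_t len = strlen(name);
    while (*names) {
        const char *end = strchr(names, ',');
        size_t entry_len = end ? (size_t)(end - names) : strlen(names);
        if (entry_len == len && strncmp(name, names, len) == 0)
            return 1;
        if (!end)
            break;               /* that was the last entry */
        names = end + 1;         /* move past the comma */
    }
    return 0;
}
```

With this rule, "mov", "mp4", and "m4a" all resolve to the same demuxer entry, while a partial name such as "mp" does not match anything.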