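FFmpeg Development in Practice: From Basics to Mastery (with Complete Code Examples)

Author: 码流怪侠  Date: 2026-05-01
Tags: FFmpeg, audio/video development, H264, AAC, decoding, encoding, Filter
Reference project: ffmpeg-demo (based on FFmpeg 4.1)

Preface

FFmpeg is one of the most widely used open-source audio/video processing frameworks, found in streaming media, video transcoding, live-stream publishing, and similar systems. This article uses 7 hands-on demos to walk through the core workflows of FFmpeg development: H.264 bitstream parsing, video decoding, demuxing, H.264 encoding, video filtering, AAC bitstream parsing, and audio decoding.

Each demo comes with commented code and an explanation of the underlying principles. The material is aimed at readers who know C/C++ and want to go deeper into audio/video development.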

FFmpeg Core Architecture Overview
Environment Setup and Build
Demo 1: H.264 Bitstream Parsing (NALU Analysis)
Demo 2: Video Decoding (H.264/HEVC → YUV)
Demo 3: Demuxing
Demo 4: H.264 Video Encoding (YUV → H.264)
Demo 5: Video Filter
Demo 6: AAC Bitstream Parsing (ADTS Frame Analysis)
Demo 7: Audio Decoding (AAC → PCM)
Summary of Key FFmpeg Data Structures
Common Pitfalls and Notes
Summary

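1. FFmpeg Core Architecture Overview

FFmpeg is made up of several libraries, each responsible for a different processing layer:

| Library | Description | Main use |
| --- | --- | --- |
| `libavformat` | Muxing/demuxing | Reading and writing container formats (MP4, FLV, MKV, ...) |
| `libavcodec` | Encoding/decoding | Encoding or decoding audio and video data |
| `libavfilter` | Filtering | Video/audio effects (scaling, cropping, mixing, ...) |
| `libswscale` | Scaling and pixel-format conversion | Image scaling and conversions such as YUV ↔ RGB |
| `libswresample` | Audio resampling | Sample rate, sample format, and channel conversion |
| `libavutil` | Utilities | Common data structures, math helpers, and so on |
| `libavdevice` | Devices | Access to capture hardware such as cameras and microphones |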

Core audio/video processing pipeline

Input file
   ↓
[avformat] demux → AVPacket (compressed packet)
   ↓
[avcodec] decoder → AVFrame (raw frame)
   ↓
[avfilter] filtering → processed AVFrame
   ↓
[swscale/swresample] format conversion
   ↓
[avcodec] encoder → AVPacket (compressed packet)
   ↓
[avformat] mux → output file

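2. Environment Setup and Build

2.1 Building and installing FFmpeg on Linux (with H.264 encoding enabled)

# Install the x264 encoder library first
# Then unpack the FFmpeg source and run the commands below.
# Note: --disable-yasm turns off the x86 assembly optimizations and makes
# FFmpeg noticeably slower; drop it if nasm/yasm is installed.

./configure \
  --enable-libx264 \
  --enable-gpl \
  --enable-decoder=h264 \
  --enable-encoder=libx264 \
  --enable-shared \
  --disable-yasm \
  --enable-postproc \
  --prefix=/home/share/lib_so/ffmpeg4.1

make -j4
make install

# Register the shared-library path (requires root)
echo "/home/share/lib_so/ffmpeg4.1/lib" >> /etc/ld.so.conf
ldconfig

# Environment variables (append these two lines to /etc/profile, then reload it)
export FFMPEG=/home/share/lib_so/ffmpeg4.1
export PATH=$PATH:$FFMPEG/bin
source /etc/profile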

2.2 Building the demos (Makefile example)

CC = gcc
CFLAGS = -I/home/share/lib_so/ffmpeg4.1/include
LDFLAGS = -L/home/share/lib_so/ffmpeg4.1/lib \
          -lavformat -lavcodec -lavfilter \
          -lavutil -lswscale -lswresample -lm

demo: demo.c
	$(CC) $(CFLAGS) -o demo demo.c $(LDFLAGS)

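3. Demo 1: H.264 Bitstream Parsing (NALU Analysis)

3.1 Background: what is a NALU?

An H.264 bitstream is a sequence of NALUs (Network Abstraction Layer Units). In the Annex-B byte-stream format, each NALU is preceded by a start code:

3-byte start code: 0x000001
4-byte start code: 0x00000001 (typically used before SPS/PPS and before the first NALU of an access unit)

The first byte of a NALU is its header:

bit[7]   = forbidden_zero_bit (must be 0)
bit[6:5] = nal_ref_idc (reference priority; named nal_reference_idc in the code)
bit[4:0] = nal_unit_type (NALU type)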
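
The bit layout above maps directly onto shifts and masks. The following is a minimal sketch; `NalHeader` and `parse_nal_header` are hypothetical names introduced here for illustration and are not part of the reference project.

```c
#include <stdint.h>

/* The three fields packed into the first byte of an H.264 NALU. */
typedef struct {
    int forbidden_zero_bit; /* bit 7, must be 0 in a valid stream */
    int nal_ref_idc;        /* bits 6..5, range 0..3 */
    int nal_unit_type;      /* bits 4..0, range 0..31 */
} NalHeader;

/* Split one header byte into its fields (hypothetical helper). */
static NalHeader parse_nal_header(uint8_t b) {
    NalHeader h;
    h.forbidden_zero_bit = (b >> 7) & 0x01;
    h.nal_ref_idc        = (b >> 5) & 0x03;
    h.nal_unit_type      = b & 0x1F;
    return h;
}
```

For example, the byte 0x67 that commonly follows a start code decodes to nal_ref_idc 3 and nal_unit_type 7, that is, an SPS.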

3.2 NALU type enum

typedef enum {
    NALU_TYPE_SLICE    = 1,   // Slice of a non-IDR picture
    NALU_TYPE_DPA      = 2,   // Data partition A
    NALU_TYPE_DPB      = 3,   // Data partition B
    NALU_TYPE_DPC      = 4,   // Data partition C
    NALU_TYPE_IDR      = 5,   // IDR picture (keyframe)
    NALU_TYPE_SEI      = 6,   // Supplemental enhancement information
    NALU_TYPE_SPS      = 7,   // Sequence parameter set
    NALU_TYPE_PPS      = 8,   // Picture parameter set
    NALU_TYPE_AUD      = 9,   // Access unit delimiter
    NALU_TYPE_EOSEQ    = 10,  // End of sequence
    NALU_TYPE_EOSTREAM = 11,  // End of stream
    NALU_TYPE_FILL     = 12,  // Filler data
} NaluType;

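Notes on the most common types:

| NALU type | Value | Description |
| --- | --- | --- |
| SPS | 7 | Sequence parameter set: resolution, profile/level, optional frame-rate information, and other sequence-wide parameters |
| PPS | 8 | Picture parameter set: entropy coding mode, weighted prediction, and similar parameters |
| IDR | 5 | Keyframe that can be decoded without reference to any other frame |
| SLICE | 1 | Slice data of a non-IDR picture (usually P or B frames) |
| SEI | 6 | Supplemental information such as user data or closed captions |
| AUD | 9 | Access unit delimiter, marking the boundary between frames |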
3.3 NALU data structure

typedef struct {
    int  startcodeprefix_len;  // Start code length (3 or 4)
    unsigned len;              // NALU length in bytes (start code excluded)
    unsigned max_size;         // Capacity of buf
    int  forbidden_bit;        // forbidden_zero_bit (must be 0)
    int  nal_reference_idc;    // Reference priority (0 = disposable, 1 = low, 2 = high, 3 = highest)
    int  nal_unit_type;        // NALU type (see the enum above)
    char *buf;                 // NALU data: header byte followed by the EBSP payload
} NALU_t;

3.4 Core function: reading a NALU from an Annex-B stream

/**
 * Read one complete NALU from an Annex-B H.264 byte stream.
 * Uses the global FILE *h264bitstream and the FindStartCode2/3 helpers.
 * @param nalu  out: NALU struct (nalu->buf must hold nalu->max_size bytes)
 * @return bytes consumed (start code included); 0 at end of file; -1 on error
 */
int GetAnnexbNALU(NALU_t *nalu) {
    int pos = 0;
    int StartCodeFound = 0, back = 0;
    unsigned char *Buf;

    // Temporary buffer for the start code plus the NALU
    Buf = (unsigned char*)calloc(nalu->max_size, sizeof(char));
    if (!Buf) return -1;

    // --- Read and identify the leading start code ---
    nalu->startcodeprefix_len = 3;
    if (fread(Buf, 1, 3, h264bitstream) != 3) {   // nothing left to read
        free(Buf);
        return 0;
    }

    if (!FindStartCode2(Buf)) {       // not a 3-byte start code
        if (fread(Buf + 3, 1, 1, h264bitstream) != 1 || !FindStartCode3(Buf)) {
            free(Buf);                // not a 4-byte start code either
            return -1;
        }
        pos = 4;
        nalu->startcodeprefix_len = 4;
    } else {
        pos = 3;
    }

    // --- Scan for the next start code, which marks the end of this NALU ---
    while (!StartCodeFound) {
        int c = fgetc(h264bitstream);
        if (c == EOF) break;          // last NALU: it runs to the end of the file
        if ((unsigned)pos >= nalu->max_size) {   // NALU does not fit the buffer
            free(Buf);
            return -1;
        }
        Buf[pos++] = (unsigned char)c;
        // Check whether the bytes just read end in a start code
        if (FindStartCode3(&Buf[pos - 4]))
            back = -4;
        else if (FindStartCode2(&Buf[pos - 3]))
            back = -3;
        StartCodeFound = (back != 0);
    }

    // Step back so the next call begins at the following start code
    if (StartCodeFound)
        fseek(h264bitstream, back, SEEK_CUR);

    nalu->len = (pos + back) - nalu->startcodeprefix_len;
    memcpy(nalu->buf, &Buf[nalu->startcodeprefix_len], nalu->len);

    // Parse the NALU header (first byte); shift so the fields match NALU_t's documented ranges
    unsigned char hdr = (unsigned char)nalu->buf[0];
    nalu->forbidden_bit     = (hdr >> 7) & 0x01;  // bit 7
    nalu->nal_reference_idc = (hdr >> 5) & 0x03;  // bits 6..5
    nalu->nal_unit_type     = hdr & 0x1f;         // bits 4..0

    free(Buf);
    return pos + back;
}
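
GetAnnexbNALU relies on two helpers, FindStartCode2 and FindStartCode3, that the article does not list. The sketch below shows one plausible way to write them, assuming they simply compare the next 3 or 4 bytes against the start-code patterns; the reference project's versions may differ in detail.

```c
/* Returns 1 if buf begins with the 3-byte start code 00 00 01, else 0.
 * The caller must guarantee that at least 3 bytes are readable. */
static int FindStartCode2(const unsigned char *buf) {
    return buf[0] == 0 && buf[1] == 0 && buf[2] == 1;
}

/* Returns 1 if buf begins with the 4-byte start code 00 00 00 01, else 0.
 * The caller must guarantee that at least 4 bytes are readable. */
static int FindStartCode3(const unsigned char *buf) {
    return buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1;
}
```

Checking the 4-byte form before the 3-byte form, as GetAnnexbNALU does, matters: the last three bytes of a 4-byte start code also match the 3-byte pattern, so the order decides how far the file position is moved back.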

3.5 Sample output

Running the H.264 parser prints something like:

-----+-------- NALU Table ------+---------+
 NUM |    POS  |    IDC |  TYPE |   LEN   |
-----+---------+--------+-------+---------+
    0|        0| HIGHEST|    SPS|       26|
    1|       30| HIGHEST|    PPS|        4|
    2|       38| HIGHEST|    IDR|    22509|
    3|    22551|    HIGH|  SLICE|     5812|
    4|    28367|    HIGH|  SLICE|     2143|
    ...

Key point: in this stream SPS, PPS, and IDR NALUs carry the highest priority (HIGHEST), while ordinary P/B slices are marked HIGH.

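4. Demo 2: Video Decoding (H.264/HEVC → YUV)

4.1 Decoding flow

Decoding a raw bitstream file takes two stages:

Step 1: read bytes from the file and let AVParser split them into complete AVPackets.
Step 2: feed each AVPacket to the decoder and collect AVFrames (raw YUV frames).

File read (fread) → AVParser → AVPacket
                                     ↓
                              avcodec_send_packet()
                                     ↓
                             avcodec_receive_frame()
                                     ↓
                               AVFrame (YUV data)
                                     ↓
                              write to .yuv file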

4.2 Code walkthrough

(1) Initializing the decoder

// Find the decoder (H.264 or HEVC)
enum AVCodecID codec_id = AV_CODEC_ID_H264;  // or AV_CODEC_ID_HEVC
AVCodec *pCodec = avcodec_find_decoder(codec_id);
if (!pCodec) {
    printf("Codec not found\n");
    return -1;
}

// Allocate the AVCodecContext
AVCodecContext *pCodecCtx = avcodec_alloc_context3(pCodec);
if (!pCodecCtx) return -1;

// Create the bitstream parser (AVParser finds packet boundaries in a raw byte stream)
AVCodecParserContext *pCodecParserCtx = av_parser_init(codec_id);
if (!pCodecParserCtx) return -1;

// Open the decoder
if (avcodec_open2(pCodecCtx, pCodec, NULL) < 0) {
    printf("Could not open codec\n");
    return -1;
}

(2) AVParser: extracting AVPackets from a raw byte stream

// Read 4096 bytes at a time. The parser may read slightly past the data,
// so the buffer needs AV_INPUT_BUFFER_PADDING_SIZE zeroed bytes at the end.
const int in_buffer_size = 4096;
unsigned char in_buffer[4096 + AV_INPUT_BUFFER_PADDING_SIZE] = {0};

while (1) {
    int cur_size = fread(in_buffer, 1, in_buffer_size, fp_in);
    if (cur_size == 0) break;

    unsigned char *cur_ptr = in_buffer;
    while (cur_size > 0) {
        // av_parser_parse2 consumes input and emits at most one complete packet
        int len = av_parser_parse2(
            pCodecParserCtx, pCodecCtx,
            &packet.data, &packet.size,  // out: packet data and size
            cur_ptr, cur_size,           // in: current buffer
            AV_NOPTS_VALUE, AV_NOPTS_VALUE, AV_NOPTS_VALUE);
        if (len < 0) {
            fprintf(stderr, "Error while parsing\n");
            exit(1);
        }

        cur_ptr  += len;
        cur_size -= len;

        if (packet.size == 0) continue;  // no complete packet yet

        // Print packet info
        printf("[Packet] Size:%6d  Type:", packet.size);
        switch (pCodecParserCtx->pict_type) {
            case AV_PICTURE_TYPE_I: printf("I"); break;
            case AV_PICTURE_TYPE_P: printf("P"); break;
            case AV_PICTURE_TYPE_B: printf("B"); break;
            default:                printf("?"); break;
        }
        printf("  Number:%4d\n", pCodecParserCtx->output_picture_number);

        // Decode
        decode(pCodecCtx, pFrame, &packet, fp_out);
    }
}

// Flush the decoder (drains the frames it is still holding)
decode(pCodecCtx, pFrame, NULL, fp_out);

(3) Decode function: avcodec_send_packet + avcodec_receive_frame

/**
 * Send an AVPacket to the decoder and receive the decoded AVFrames.
 * Passing pkt == NULL flushes the decoder.
 */
static void decode(AVCodecContext *dec_ctx, AVFrame *frame,
                   AVPacket *pkt, FILE *fp_out) {
    // Send the compressed packet
    int ret = avcodec_send_packet(dec_ctx, pkt);
    if (ret < 0) {
        fprintf(stderr, "Error sending packet for decoding\n");
        exit(1);
    }

    // Receive frames in a loop (one packet can yield more than one frame)
    while (ret >= 0) {
        ret = avcodec_receive_frame(dec_ctx, frame);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
            return;  // needs more input, or all frames have been drained
        if (ret < 0) {
            fprintf(stderr, "Error during decoding\n");
            exit(1);
        }

        // Write the YUV data (assumes YUV420P: Y, U, V stored as separate planes)
        // Y plane: width x height bytes
        for (int i = 0; i < frame->height; i++)
            fwrite(frame->data[0] + frame->linesize[0] * i,
                   1, frame->width, fp_out);
        // U plane: (width/2) x (height/2) bytes
        for (int i = 0; i < frame->height / 2; i++)
            fwrite(frame->data[1] + frame->linesize[1] * i,
                   1, frame->width / 2, fp_out);
        // V plane: (width/2) x (height/2) bytes
        for (int i = 0; i < frame->height / 2; i++)
            fwrite(frame->data[2] + frame->linesize[2] * i,
                   1, frame->width / 2, fp_out);
    }
}

Important: when writing YUV data, compute each row's offset from linesize (the row stride), not from width. If the rows carry alignment padding, using width produces green stripes or a corrupted picture.

(4) Releasing resources

av_parser_close(pCodecParserCtx);
av_frame_free(&pFrame);
avcodec_free_context(&pCodecCtx);  // closes the codec and frees the context
fclose(fp_in);
fclose(fp_out);

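5. Demo 3: Demuxing

5.1 The full demux + decode flow

Demuxing separates the audio and video streams stored in a container format such as FLV or MP4. In practice it is far more common than parsing a raw elementary stream directly.

avformat_open_input()       // open the file
    ↓
avformat_find_stream_info() // probe stream info
    ↓
avcodec_find_decoder()      // find the decoder
    ↓
avcodec_alloc_context3()    // create the decoder context
    ↓
avcodec_parameters_to_context() // initialize the context from the stream parameters
    ↓
avcodec_open2()             // open the decoder
    ↓
sws_getContext()            // set up the color-space converter (YUV→RGB)
    ↓
av_read_frame() loop        // read AVPackets
    ↓
avcodec_send_packet()       // feed the decoder
    ↓
avcodec_receive_frame()     // fetch decoded frames
    ↓
sws_scale()                 // convert the color space
    ↓
saveFrame()                 // save as a PPM image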

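5.2 Key steps in code

(1) Opening the file and locating the video stream

AVFormatContext *pFormatCtx = NULL;

// Open the file; the container format (MP4, FLV, AVI, ...) is detected automatically
if (avformat_open_input(&pFormatCtx, filename, NULL, NULL) != 0) {
    fprintf(stderr, "Could not open file\n");
    return;
}

// Probe stream info (reads and decodes a little data to determine codec parameters)
if (avformat_find_stream_info(pFormatCtx, NULL) < 0) {
    fprintf(stderr, "Could not find stream info\n");
    avformat_close_input(&pFormatCtx);
    return;
}

// Print container and stream details (for debugging)
av_dump_format(pFormatCtx, 0, filename, 0);

// Walk the streams and remember the index of the first video stream
int videoStream = -1;
for (unsigned int i = 0; i < pFormatCtx->nb_streams; i++) {
    if (pFormatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
        videoStream = (int)i;
        break;
    }
}
if (videoStream < 0) {
    fprintf(stderr, "No video stream found\n");
    avformat_close_input(&pFormatCtx);
    return;
}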

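(2) Creating the decoder context (new-style API in FFmpeg 4.x)

// Find the decoder (codec_id comes from the stream parameters)
AVCodec *pCodec = avcodec_find_decoder(
    pFormatCtx->streams[videoStream]->codecpar->codec_id);

// Allocate the decoder context
AVCodecContext *pCodecCtx = avcodec_alloc_context3(pCodec);

// Copy the stream's codec parameters into the context
// (replaces the old habit of using AVStream->codec directly)
avcodec_parameters_to_context(pCodecCtx,
    pFormatCtx->streams[videoStream]->codecpar);

// Open the decoder
avcodec_open2(pCodecCtx, pCodec, NULL);

Note: the AVCodecContext embedded in each stream (AVStream->codec) has been deprecated since FFmpeg 3.1 and was removed in FFmpeg 5.0. Allocate your own AVCodecContext and fill it with avcodec_parameters_to_context() from the AVCodecParameters stored in AVStream->codecpar.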

(3) Color-space conversion (YUV420P → RGB24)

// Create the SwsContext: decoder pixel format in (YUV420P here), RGB24 out, bilinear scaling
struct SwsContext *sws_ctx = sws_getContext(
    pCodecCtx->width, pCodecCtx->height, pCodecCtx->pix_fmt,  // input
    pCodecCtx->width, pCodecCtx->height, AV_PIX_FMT_RGB24,    // output
    SWS_BILINEAR, NULL, NULL, NULL);

// Allocate memory for the RGB frame (alignment 1 = tightly packed rows)
int numBytes = av_image_get_buffer_size(AV_PIX_FMT_RGB24,
    pCodecCtx->width, pCodecCtx->height, 1);
uint8_t *buffer = (uint8_t*)av_malloc(numBytes);
av_image_fill_arrays(pFrameRGB->data, pFrameRGB->linesize,
    buffer, AV_PIX_FMT_RGB24, pCodecCtx->width, pCodecCtx->height, 1);

// Run the conversion
sws_scale(sws_ctx,
    (uint8_t const * const *)pFrame->data, pFrame->linesize,
    0, pFrame->height,          // srcSliceY=0, srcSliceH=height (whole frame)
    pFrameRGB->data, pFrameRGB->linesize);

(4) Saving a frame as a PPM image

void saveFrame(AVFrame *pFrame, int width, int height,
               int iFrame, const char *outname) {
    char szFilename[1024];
    // outname is a printf-style pattern with one %d, yielding e.g. "hello01.ppm"
    snprintf(szFilename, sizeof(szFilename), outname, iFrame);
    FILE *pFile = fopen(szFilename, "wb");
    if (!pFile) return;

    // PPM header: P6 format (binary RGB)
    fprintf(pFile, "P6\n%d %d\n255\n", width, height);

    // Write the RGB data (width * 3 bytes per row)
    for (int y = 0; y < height; y++)
        fwrite(pFrame->data[0] + y * pFrame->linesize[0],
               1, width * 3, pFile);

    fclose(pFile);
}

6. Demo 4: H.264 Video Encoding (YUV → H.264)

6.1 Encoding flow

Raw YUV file (.yuv)
    ↓
av_frame_alloc() + av_image_alloc()    // allocate frame memory
    ↓
avcodec_find_encoder(AV_CODEC_ID_H264) // find the encoder
    ↓
avcodec_alloc_context3() + parameters  // configure the encoder
    ↓
avcodec_open2()                        // open the encoder
    ↓
read YUV frame → set PTS → encode()    // encoding loop
    ↓
encode(NULL) flushes the encoder       // drain buffered frames
    ↓
write the .h264 file

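6.2 Configuring the encoder

// Find an H.264 encoder (libx264 when FFmpeg was built with --enable-libx264)
AVCodec *pCodec = avcodec_find_encoder(AV_CODEC_ID_H264);
if (!pCodec) {
    fprintf(stderr, "H.264 encoder not found\n");
    exit(1);
}
AVCodecContext *pCodecCtx = avcodec_alloc_context3(pCodec);

// ===== Core encoding parameters =====

// Bit rate: 400 kbps
pCodecCtx->bit_rate = 400000;

// Resolution: 480x272
pCodecCtx->width  = 480;
pCodecCtx->height = 272;

// Time base: 1/25 s per tick (25 fps when PTS advances by 1 per frame)
pCodecCtx->time_base.num = 1;
pCodecCtx->time_base.den = 25;

// GOP size: at most 10 frames between keyframes
pCodecCtx->gop_size = 10;

// Maximum number of consecutive B-frames
pCodecCtx->max_b_frames = 1;

// Input pixel format: YUV420P
pCodecCtx->pix_fmt = AV_PIX_FMT_YUV420P;

// libx264 private option: preset (slow trades speed for better compression)
av_opt_set(pCodecCtx->priv_data, "preset", "slow", 0);

// Open the encoder
if (avcodec_open2(pCodecCtx, pCodec, NULL) < 0) {
    fprintf(stderr, "Could not open encoder\n");
    exit(1);
}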

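Common x264 presets:

| preset | Encoding speed | Compression | Typical use |
| --- | --- | --- | --- |
| `ultrafast` | Fastest | Lowest | Low-latency live streaming |
| `superfast` | Very fast | Low | Real-time encoding |
| `fast` | Fairly fast | Medium | Ordinary recording |
| `medium` | Medium (default) | Medium | General purpose |
| `slow` | Slow | Fairly high | Offline transcoding |
| `veryslow` | Very slow | High | Storage/archiving |
| `placebo` | Extremely slow | Highest | Maximum compression |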

6.3 Allocating the AVFrame

AVFrame *pFrame = av_frame_alloc();
pFrame->format = pCodecCtx->pix_fmt;   // AV_PIX_FMT_YUV420P
pFrame->width  = pCodecCtx->width;
pFrame->height = pCodecCtx->height;

// Allocate aligned memory for the YUV planes (16-byte alignment helps SIMD code).
// Release it later with av_freep(&pFrame->data[0]).
av_image_alloc(pFrame->data, pFrame->linesize,
    pCodecCtx->width, pCodecCtx->height,
    pCodecCtx->pix_fmt, 16);

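6.4 Main encoding loop

int y_size = pCodecCtx->width * pCodecCtx->height;

for (int i = 0; i < framenum; i++) {
    av_init_packet(&pkt);
    pkt.data = NULL;  // let the encoder allocate the packet buffer
    pkt.size = 0;

    // Read one YUV420P frame, plane by plane. Reading a whole plane in one call
    // assumes linesize == width, which holds for 480x272 with 16-byte alignment.
    if (fread(pFrame->data[0], 1, y_size,     fp_in) != (size_t)y_size ||        // Y: 480x272 bytes
        fread(pFrame->data[1], 1, y_size / 4, fp_in) != (size_t)(y_size / 4) ||  // U: 240x136 bytes
        fread(pFrame->data[2], 1, y_size / 4, fp_in) != (size_t)(y_size / 4))    // V: 240x136 bytes
        break;  // short read: the input has ended

    // Timestamp: PTS = frame index; time_base converts it to real time
    pFrame->pts = i;

    encode(pCodecCtx, pFrame, &pkt, fp_out);
}

// Flush the encoder (emits delayed frames such as buffered B-frames)
encode(pCodecCtx, NULL, &pkt, fp_out);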
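
The loop above uses a `framenum` that the article never derives. One way to obtain it, sketched here under the assumption that the input is a tightly packed YUV420P file with no per-row padding, is to divide the file size by the size of one frame. The helper names are hypothetical.

```c
#include <stddef.h>

/* Bytes in one tightly packed YUV420P frame: a full-resolution Y plane
 * plus two chroma planes subsampled by 2 in each direction (rounded up). */
static size_t yuv420p_frame_size(int width, int height) {
    size_t luma     = (size_t)width * (size_t)height;
    size_t chroma_w = ((size_t)width + 1) / 2;
    size_t chroma_h = ((size_t)height + 1) / 2;
    return luma + 2 * chroma_w * chroma_h;
}

/* Number of whole frames contained in a raw file of file_size bytes. */
static size_t yuv420p_frame_count(size_t file_size, int width, int height) {
    size_t frame = yuv420p_frame_size(width, height);
    return frame ? file_size / frame : 0;
}
```

For the 480x272 clip used in this demo, one frame is 130560 + 2 × 32640 = 195840 bytes, which matches the `y_size * 3 / 2` implied by the three fread calls.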

6.5 Encode function

static void encode(AVCodecContext *enc_ctx, AVFrame *frame,
                   AVPacket *pkt, FILE *outfile) {
    if (frame)
        printf("Send frame %3"PRId64"\n", frame->pts);

    // Send the raw frame to the encoder (frame == NULL starts flushing)
    int ret = avcodec_send_frame(enc_ctx, frame);
    if (ret < 0) {
        fprintf(stderr, "Error sending frame for encoding\n");
        exit(1);
    }

    // Receive encoded packets in a loop
    while (ret >= 0) {
        ret = avcodec_receive_packet(enc_ctx, pkt);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
            return;  // needs more frames, or encoding has finished
        if (ret < 0) {
            fprintf(stderr, "Error during encoding\n");
            exit(1);
        }

        printf("Write packet %3"PRId64" (size=%5d)\n", pkt->pts, pkt->size);
        fwrite(pkt->data, 1, pkt->size, outfile);
        av_packet_unref(pkt);  // release the packet data
    }
}

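7. Demo 5: Video Filter

7.1 FFmpeg filter architecture

FFmpeg's filter system (libavfilter) is built around the **filter graph (FilterGraph)**: filters are connected along the data flow into a graph:

Decoded frame → [buffersrc (input node)]
              ↓
         [scale (scaling filter)]   ← filter_descr = "scale=iw*0.5:ih*0.5"
              ↓
         [buffersink (output node)]
              ↓
         Processed frame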

7.2 Initializing the filter graph

// Filter description string (scales the video to 50% of its size)
const char *filter_descr = "scale=iw*0.5:ih*0.5";

static int init_filters(const char *filters_descr) {
    char args[512];
    int ret = 0;

    // Look up the buffer and buffersink filters (the graph's input and output nodes)
    const AVFilter *buffersrc  = avfilter_get_by_name("buffer");
    const AVFilter *buffersink = avfilter_get_by_name("buffersink");

    AVFilterInOut *outputs = avfilter_inout_alloc();  // output endpoint
    AVFilterInOut *inputs  = avfilter_inout_alloc();  // input endpoint

    AVRational time_base = fmt_ctx->streams[video_stream_index]->time_base;
    enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_YUV420P, AV_PIX_FMT_NONE };

    // Create the FilterGraph (the container for all filters)
    filter_graph = avfilter_graph_alloc();
    if (!outputs || !inputs || !filter_graph) {
        ret = AVERROR(ENOMEM);
        goto end;
    }

    // ===== Parameters for buffersrc (the input source) =====
    // The chain must know the input's size, pixel format, time base, and aspect ratio
    snprintf(args, sizeof(args),
        "video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d",
        dec_ctx->width, dec_ctx->height, dec_ctx->pix_fmt,
        time_base.num, time_base.den,
        dec_ctx->sample_aspect_ratio.num, dec_ctx->sample_aspect_ratio.den);

    // Create the input buffer node
    ret = avfilter_graph_create_filter(&buffersrc_ctx, buffersrc, "in",
                                       args, NULL, filter_graph);
    if (ret < 0) goto end;

    // Create the output buffersink node
    ret = avfilter_graph_create_filter(&buffersink_ctx, buffersink, "out",
                                       NULL, NULL, filter_graph);
    if (ret < 0) goto end;

    // Restrict the pixel formats buffersink will accept
    ret = av_opt_set_int_list(buffersink_ctx, "pix_fmts", pix_fmts,
                              AV_PIX_FMT_NONE, AV_OPT_SEARCH_CHILDREN);
    if (ret < 0) goto end;

    // ===== Describe the open ends of the graph =====
    outputs->name       = av_strdup("in");   // matches the buffersrc instance name
    outputs->filter_ctx = buffersrc_ctx;
    outputs->pad_idx    = 0;
    outputs->next       = NULL;

    inputs->name        = av_strdup("out");  // matches the buffersink instance name
    inputs->filter_ctx  = buffersink_ctx;
    inputs->pad_idx     = 0;
    inputs->next        = NULL;

    // ===== Parse the description and insert the filters into the graph =====
    ret = avfilter_graph_parse_ptr(filter_graph, filters_descr,
                                   &inputs, &outputs, NULL);
    if (ret < 0) goto end;

    // ===== Validate and configure every link in the graph =====
    ret = avfilter_graph_config(filter_graph, NULL);

end:
    avfilter_inout_free(&inputs);
    avfilter_inout_free(&outputs);
    return ret;
}

7.3 Main filtering loop

while (1) {
    // Read a packet from the container
    if (av_read_frame(fmt_ctx, &packet) < 0) break;

    if (packet.stream_index == video_stream_index) {
        // Decode
        if (avcodec_send_packet(dec_ctx, &packet) < 0) {
            av_packet_unref(&packet);
            break;
        }
        while (avcodec_receive_frame(dec_ctx, frame) >= 0) {
            // Use best_effort_timestamp as the frame's timestamp
            frame->pts = frame->best_effort_timestamp;

            // ===== Push the decoded frame into the FilterGraph =====
            if (av_buffersrc_add_frame_flags(buffersrc_ctx, frame,
                                             AV_BUFFERSRC_FLAG_KEEP_REF) < 0) {
                av_frame_unref(frame);
                break;
            }

            // ===== Pull processed frames out of the FilterGraph =====
            while (av_buffersink_get_frame(buffersink_ctx, filt_frame) >= 0) {
                write_frame(filt_frame);      // save the processed frame
                av_frame_unref(filt_frame);   // drop the frame reference
            }
            av_frame_unref(frame);
        }
    }
    av_packet_unref(&packet);
}

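7.4 More filter examples

| Filter description | Effect |
| --- | --- |
| `scale=1280:720` | Scale to 1280x720 |
| `scale=iw*0.5:ih*0.5` | Scale to 50% of the original size |
| `crop=640:480:0:0` | Crop 640x480 from the top-left corner |
| `hflip` | Horizontal flip (mirror) |
| `vflip` | Vertical flip (upside down) |
| `rotate=PI/4` | Rotate by 45 degrees |
| `drawtext=text='Hello':fontsize=24:x=10:y=10` | Draw a text watermark |
| `eq=brightness=0.1:contrast=1.2` | Adjust brightness/contrast |
| `overlay=10:10` | Overlay a second video input |
| `fps=30` | Convert the frame rate to 30 fps |
| `scale=iw*0.5:ih*0.5,hflip` | Scale down, then mirror (filter chain) |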
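
As the last table row shows, a linear filter chain is just the individual filter descriptions joined with commas. When the chain is assembled at run time, a small helper keeps the string building safe. This is an illustrative sketch only; `join_filter_chain` is a hypothetical name and not an FFmpeg function.

```c
#include <string.h>

/* Join n filter descriptions with ',' into out.
 * Returns 0 on success, -1 if out (of out_size bytes) is too small. */
static int join_filter_chain(char *out, size_t out_size,
                             const char *const *filters, int n) {
    size_t used = 0;
    if (out_size == 0) return -1;
    out[0] = '\0';
    for (int i = 0; i < n; i++) {
        size_t len  = strlen(filters[i]);
        size_t need = len + (i > 0 ? 1 : 0);      /* separator + text */
        if (used + need + 1 > out_size) return -1; /* +1 for the terminator */
        if (i > 0) out[used++] = ',';
        memcpy(out + used, filters[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

The resulting string can be passed wherever a description such as `filter_descr` is expected, for example to the `init_filters()` function from section 7.2.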

7.5 Releasing resources

avfilter_graph_free(&filter_graph);   // frees the whole FilterGraph
avcodec_free_context(&dec_ctx);       // closes the decoder and frees the context
avformat_close_input(&fmt_ctx);
av_frame_free(&frame);
av_frame_free(&filt_frame);

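8. Demo 6: AAC Bitstream Parsing (ADTS Frame Analysis)

8.1 AAC stream formats

AAC (Advanced Audio Coding) is one of the most widely used audio codecs today. Its raw streams come in two formats:

| Format | Full name | Characteristics |
| --- | --- | --- |
| ADTS | Audio Data Transport Stream | Every frame carries its own complete header, so a stream can be parsed or played from any frame |
| ADIF | Audio Data Interchange Format | A single header at the start of the stream; no random access |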

8.2 ADTS frame structure

An ADTS frame consists of a 7-byte header (9 bytes when a CRC is present) followed by the audio data:

ADTS Header (7 bytes without CRC):
+--------+--------+--------+--------+--------+--------+--------+
|  Sync  | Sync+  |Profile |Freq+Ch | FullLen|  Full  |  Num   |
| 0xFF   |0xF0+ID |/Freq Hi| Layout |  Hi    | Length |  Bufs  |
+--------+--------+--------+--------+--------+--------+--------+
Bits:
[0]     : 0xFF (upper 8 bits of the syncword)
[1][7:4]: 0xF (lower 4 bits of the syncword)
[1][3]  : ID (0 = MPEG-4, 1 = MPEG-2)
[1][2:1]: Layer (always 0b00)
[1][0]  : Protection Absent (1 = no CRC)
[2][7:6]: Profile (0 = Main, 1 = LC, 2 = SSR)
[2][5:2]: Sampling Frequency Index (see the table below)
[2][1]  : Private Bit
[2][0]+[3][7:6]: Channel Configuration (equals the channel count for values 1 to 6)
...
[3][1:0]+[4][7:0]+[5][7:5]: Frame Length (whole frame, header included)

Sampling frequency index table:

| Index | Sample rate | Index | Sample rate |
| --- | --- | --- | --- |
| 0 | 96000 Hz | 6 | 24000 Hz |
| 1 | 88200 Hz | 7 | 22050 Hz |
| 2 | 64000 Hz | 8 | 16000 Hz |
| 3 | 48000 Hz | 9 | 12000 Hz |
| 4 | 44100 Hz | 10 | 11025 Hz |
| 5 | 32000 Hz | 11 | 8000 Hz |
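
The bit positions and the index table above are enough to decode a whole header. The following is a minimal sketch; `AdtsInfo`, `parse_adts_header`, and `kAdtsSampleRates` are hypothetical names chosen for this example, and only the indices listed in the table are handled.

```c
#include <stdint.h>

/* Sample rates indexed by the Sampling Frequency Index (0..11). */
static const int kAdtsSampleRates[12] = {
    96000, 88200, 64000, 48000, 44100, 32000,
    24000, 22050, 16000, 12000, 11025, 8000
};

typedef struct {
    int profile;       /* 0 = Main, 1 = LC, 2 = SSR */
    int sample_rate;   /* Hz, or 0 if the index is outside the table */
    int channels;      /* Channel Configuration field */
    int frame_length;  /* whole frame in bytes, header included */
    int has_crc;       /* 1 when Protection Absent is 0 */
} AdtsInfo;

/* Parse a 7-byte ADTS header.
 * Returns 0 on success, -1 if the 12-bit syncword 0xFFF is missing. */
static int parse_adts_header(const uint8_t h[7], AdtsInfo *out) {
    if (h[0] != 0xFF || (h[1] & 0xF0) != 0xF0) return -1;

    int sf_index = (h[2] >> 2) & 0x0F;
    out->has_crc      = (h[1] & 0x01) == 0;
    out->profile      = (h[2] >> 6) & 0x03;
    out->sample_rate  = sf_index < 12 ? kAdtsSampleRates[sf_index] : 0;
    out->channels     = ((h[2] & 0x01) << 2) | ((h[3] >> 6) & 0x03);
    out->frame_length = ((h[3] & 0x03) << 11) | (h[4] << 3) | ((h[5] >> 5) & 0x07);
    return 0;
}
```

As a check, the header bytes FF F1 50 80 2E 1F FC describe an AAC-LC frame at 44100 Hz with 2 channels, no CRC, and a total length of 368 bytes.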

8.3 Core ADTS frame extraction code

/**
 * Extract one ADTS frame from a buffer.
 * @param buffer    input buffer
 * @param buf_size  number of bytes in the buffer
 * @param data      out: ADTS frame data
 * @param data_size out: ADTS frame size
 * @return 0 = success, -1 = not enough data for a header, 1 = incomplete frame (more data needed)
 */
int getADTSframe(unsigned char *buffer, int buf_size,
                 unsigned char *data, int *data_size) {
    int size = 0;

    while (1) {
        if (buf_size < 7) return -1;  // less than one ADTS header left

        // Check the syncword (0xFFF)
        if ((buffer[0] == 0xFF) && ((buffer[1] & 0xF0) == 0xF0)) {
            // Extract the 13-bit frame length from the header
            size  = (buffer[3] & 0x03) << 11;  // upper 2 bits
            size |= buffer[4] << 3;             // middle 8 bits
            size |= (buffer[5] & 0xE0) >> 5;   // lower 3 bits
            if (size >= 7) break;               // a frame is never shorter than its header
        }
        --buf_size;
        ++buffer;  // no (valid) syncword here: advance by one byte
    }

    if (buf_size < size) return 1;   // the buffer does not hold the whole frame

    memcpy(data, buffer, size);
    *data_size = size;
    return 0;
}

8.4 AAC profiles

unsigned char profile = aacframe[2] & 0xC0;
profile >>= 6;
switch (profile) {
    case 0: printf("Main");    break;  // AAC Main (high complexity, rarely used today)
    case 1: printf("LC");      break;  // AAC-LC (Low Complexity, the most common)
    case 2: printf("SSR");     break;  // Scalable Sample Rate
    default: printf("Unknown"); break;
}

Sample output:

-----+- ADTS Frame Table -+------+
 NUM | Profile | Frequency| Size |
-----+---------+----------+------+
    0|       LC|   44100Hz|   368|
    1|       LC|   44100Hz|   373|
    2|       LC|   44100Hz|   365|
    ...

9. Demo 7: Audio Decoding (AAC → PCM)

9.1 Audio decoding and resampling flow

AAC file
    ↓
avformat_open_input()   // demux
    ↓
find the AVMEDIA_TYPE_AUDIO stream
    ↓
avcodec_find_decoder()  // find the AAC decoder
    ↓
avcodec_open2()         // open the decoder
    ↓
swr_alloc_set_opts()    // configure the resampler
    ↓
swr_init()              // initialize the resampler
    ↓
av_read_frame() loop
    ↓
avcodec_send_packet() + avcodec_receive_frame()  // decode
    ↓
swr_convert()           // convert format / sample rate / channels
    ↓
write the PCM file

9.2 Configuring the resampler

// Target output parameters
int out_nb_samples = 2048;                  // samples per channel used to size the output buffer
enum AVSampleFormat sample_fmt = AV_SAMPLE_FMT_S16;  // signed 16-bit integer
int out_sample_rate = 44100;               // target sample rate: 44100 Hz
uint64_t out_channel_layout = AV_CH_LAYOUT_MONO;     // mono

// Size of the output buffer for one conversion
int out_channels = av_get_channel_layout_nb_channels(out_channel_layout);
int buffer_size = av_samples_get_buffer_size(
    NULL, out_channels, out_nb_samples, sample_fmt, 1);

// Channel layout of the input audio
int64_t in_channel_layout = av_get_default_channel_layout(dec_ctx->channels);

// Create and configure the resampling context (SwrContext)
struct SwrContext *convert_ctx = swr_alloc_set_opts(
    NULL,                                               // NULL: allocate a new context
    out_channel_layout, sample_fmt, out_sample_rate,   // output parameters
    in_channel_layout, dec_ctx->sample_fmt, dec_ctx->sample_rate, // input parameters
    0, NULL);
if (!convert_ctx || swr_init(convert_ctx) < 0) {       // initialize the resampler
    fprintf(stderr, "Could not initialize the resampler\n");
    exit(1);
}

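Common sample formats:

| AVSampleFormat | Description | Bytes per sample |
| --- | --- | --- |
| `AV_SAMPLE_FMT_U8` | Unsigned 8-bit integer | 1 |
| `AV_SAMPLE_FMT_S16` | Signed 16-bit integer (the most common) | 2 |
| `AV_SAMPLE_FMT_S32` | Signed 32-bit integer | 4 |
| `AV_SAMPLE_FMT_FLT` | 32-bit float | 4 |
| `AV_SAMPLE_FMT_DBL` | 64-bit double | 8 |
| `AV_SAMPLE_FMT_S16P` | Signed 16-bit integer (planar) | 2 |
| `AV_SAMPLE_FMT_FLTP` | 32-bit float (planar) | 4 |

Packed vs. planar: in packed formats (such as S16) the channels are interleaved (LRLRLR...); in planar formats (such as S16P) each channel is stored in its own buffer (LLLL...RRRR...).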
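
To make the two layouts concrete, the sketch below converts planar 16-bit audio into the packed (interleaved) layout by hand. In real code swr_convert() performs this conversion; `interleave_s16` is a hypothetical helper written only to illustrate the memory layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave planar S16 audio into a packed buffer.
 * planes[c][i] is sample i of channel c; dst must hold
 * channels * nb_samples values and receives L R L R ... order. */
static void interleave_s16(int16_t *dst, const int16_t *const *planes,
                           int channels, size_t nb_samples) {
    for (size_t i = 0; i < nb_samples; i++)
        for (int c = 0; c < channels; c++)
            dst[i * (size_t)channels + (size_t)c] = planes[c][i];
}
```

With a left plane {1, 2, 3} and a right plane {-1, -2, -3}, the packed result is {1, -1, 2, -2, 3, -3}.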

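9.3 Core decode and resample loop

while (av_read_frame(fmt_ctx, packet) >= 0) {
    if (packet->stream_index == stream_index) {
        // Send the compressed packet
        ret = avcodec_send_packet(dec_ctx, packet);
        if (ret < 0) break;

        while (ret >= 0) {
            // Receive a decoded frame (FFmpeg's native AAC decoder outputs AV_SAMPLE_FMT_FLTP)
            ret = avcodec_receive_frame(dec_ctx, frame);
            if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) break;
            if (ret < 0) break;

            // Resample: FLTP → S16 (mono, 44100 Hz).
            // The output count is in samples per channel, so buffer must be
            // large enough for MAX_AUDIO_FRAME_SIZE samples per channel.
            int converted = swr_convert(
                convert_ctx,
                &buffer, MAX_AUDIO_FRAME_SIZE,        // output buffer
                (const uint8_t **)frame->data,         // input data
                frame->nb_samples);                    // input sample count
            if (converted < 0) break;

            // Write only the bytes this call actually produced; writing a fixed
            // buffer_size would append stale data whenever fewer samples come out
            int out_bytes = converted * out_channels *
                            av_get_bytes_per_sample(sample_fmt);
            fwrite(buffer, 1, out_bytes, out_fb);
        }
    }
    av_packet_unref(packet);
}

// Release resources
swr_free(&convert_ctx);
av_frame_free(&frame);
av_packet_free(&packet);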

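10. Summary of Key FFmpeg Data Structures

10.1 How the core structures relate

AVFormatContext (container-level context)
  ├── AVStream[] (array of media streams)
  │     └── AVCodecParameters (codec parameters)
  └── AVIOContext (I/O context)

AVCodecContext (codec context)
  └── AVCodec (codec)

AVPacket (compressed packet)
  ├── data (compressed data)
  ├── size (data size)
  ├── pts (presentation timestamp)
  ├── dts (decoding timestamp)
  └── stream_index (index of the owning stream)

AVFrame (uncompressed frame)
  ├── data[8] (array of plane pointers)
  ├── linesize[8] (bytes per row of each plane)
  ├── width/height (video dimensions)
  ├── nb_samples (audio samples per channel)
  ├── format (pixel or sample format)
  └── pts (presentation timestamp)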

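10.2 The important structures in detail

| Structure | Library | Role |
| --- | --- | --- |
| `AVFormatContext` | libavformat | Top-level container context: file format, stream list, and so on |
| `AVStream` | libavformat | One media stream (video, audio, or subtitles) |
| `AVCodecParameters` | libavcodec | A stream's codec parameters (plain data, no codec state) |
| `AVCodec` | libavcodec | Codec description (static information) |
| `AVCodecContext` | libavcodec | Codec instance (stateful; holds the encode/decode state) |
| `AVCodecParserContext` | libavcodec | Bitstream parser that finds packet boundaries in a byte stream |
| `AVPacket` | libavcodec | Compressed packet (one or more compressed frames) |
| `AVFrame` | libavutil | Uncompressed frame (raw audio or video data) |
| `SwsContext` | libswscale | Image scaling and pixel-format conversion context |
| `SwrContext` | libswresample | Audio resampling, format, and channel conversion context |
| `AVFilterGraph` | libavfilter | Filter graph (owns all filter nodes) |
| `AVFilterContext` | libavfilter | A single filter node instance |
| `AVFilterInOut` | libavfilter | Input/output endpoint of a filter graph |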

10.3 Timestamps and time bases

FFmpeg expresses time with a **rational time base (time_base)**, which avoids floating-point error:

Time base (time_base): AVRational {num, den}, meaning num/den seconds per tick
Timestamp (PTS/DTS): an integer; real time = timestamp × time_base

// Unit conversion (from the stream time base to microseconds)
int64_t pts_us = av_rescale_q(frame->pts,
    fmt_ctx->streams[video_index]->time_base,
    AV_TIME_BASE_Q);  // AV_TIME_BASE_Q = {1, 1000000} (microsecond base)

// Inter-frame delay (for render synchronization)
int64_t delay = av_rescale_q(frame->pts - last_pts,
    time_base, AV_TIME_BASE_Q);
if (delay > 0 && delay < 1000000)
    usleep(delay);  // sleep for the delay to pace playback
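
The arithmetic behind the rescaling is ts × src / dst, evaluated on rationals. The sketch below spells it out in plain C with rounding to nearest. `Rational` and `rescale_ts` are hypothetical stand-ins for AVRational and av_rescale_q(); this simplified version assumes non-negative timestamps and products that fit in 64 bits, whereas the real function also guards against overflow.

```c
#include <stdint.h>

typedef struct { int num; int den; } Rational;  /* stand-in for AVRational */

/* Convert ts from time base src to time base dst, rounding to nearest.
 * Simplified: assumes ts >= 0 and no 64-bit overflow. */
static int64_t rescale_ts(int64_t ts, Rational src, Rational dst) {
    int64_t mul = (int64_t)src.num * dst.den;
    int64_t div = (int64_t)src.den * dst.num;
    return (ts * mul + div / 2) / div;
}
```

For example, PTS 50 in a {1, 25} time base (frame 50 of a 25 fps stream) rescaled to {1, 1000000} gives 2000000 microseconds, that is, 2 seconds.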

11. Common Pitfalls and Notes

11.1 Row stride when writing YUV data

❌ Wrong:

// Wrong: width may be smaller than linesize (rows carry alignment padding)
fwrite(frame->data[0], 1, frame->width * frame->height, fp);

✅ Right:

// Right: write row by row, using linesize to locate each row
for (int i = 0; i < frame->height; i++)
    fwrite(frame->data[0] + frame->linesize[0] * i, 1, frame->width, fp);

Why: FFmpeg aligns image rows in memory for SIMD code. linesize can therefore be larger than width; the extra bytes are padding and must be skipped when writing.
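
The same rule applies whenever a plane is copied into a tightly packed buffer, not only when it is written to a file. A minimal sketch follows; `copy_plane_tight` is a hypothetical helper, and `linesize` is assumed to be positive and at least `width`.

```c
#include <stdint.h>
#include <string.h>

/* Copy a plane whose rows are linesize bytes apart into a buffer
 * whose rows are exactly width bytes apart, dropping the padding. */
static void copy_plane_tight(uint8_t *dst, const uint8_t *src,
                             int linesize, int width, int height) {
    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * (size_t)width,
               src + (size_t)y * (size_t)linesize,
               (size_t)width);
}
```

For a 3-pixel-wide plane stored with a stride of 4, each source row contributes its first 3 bytes and the fourth (padding) byte is skipped.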

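11.2 Old vs. new APIs in FFmpeg 4.x

| Task | Old API (deprecated) | New API (recommended) |
| --- | --- | --- |
| Decoding | `avcodec_decode_video2()` | `avcodec_send_packet()` + `avcodec_receive_frame()` |
| Encoding | `avcodec_encode_video2()` | `avcodec_send_frame()` + `avcodec_receive_packet()` |
| Stream parameters | Reading `AVStream->codec` directly | `AVStream->codecpar` + `avcodec_parameters_to_context()` |
| Packet initialization | `av_init_packet()` (deprecated from FFmpeg 4.4 on) | `av_packet_alloc()` |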

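11.3 Memory management rules

// AVPacket: release the data after use (this does not free the struct itself)
av_packet_unref(&packet);   // or av_packet_free(&pkt_ptr) for heap-allocated packets

// AVFrame: release the reference after use
av_frame_unref(frame);      // drops the buffer references (keeps the struct)
av_frame_free(&frame);      // frees the struct

// AVFormatContext:
avformat_close_input(&fmt_ctx);  // closes and frees

// AVCodecContext:
avcodec_close(codec_ctx);        // closes the codec
av_free(codec_ctx);              // frees the context
// or (the recommended way):
avcodec_free_context(&codec_ctx);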

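11.4 Error handling best practices

// FFmpeg functions that return int generally report errors as negative AVERROR codes
int ret = avformat_open_input(&fmt_ctx, filename, NULL, NULL);
if (ret < 0) {
    char errbuf[128];
    av_strerror(ret, errbuf, sizeof(errbuf));
    fprintf(stderr, "Error: %s\n", errbuf);
    return ret;
}

// Common error codes
// AVERROR(EAGAIN)    : no output available yet; supply more input
// AVERROR_EOF        : end of file or end of stream reached
// AVERROR(ENOMEM)    : memory allocation failed
// AVERROR_INVALIDDATA: invalid input data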

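12. Summary

Through 7 FFmpeg demos, this article has walked through the core workflows of audio/video development:

| Demo | Key concepts | Key APIs |
| --- | --- | --- |
| H.264 bitstream parsing | NALU structure, start-code detection, SPS/PPS/IDR | Manual parsing, no FFmpeg API needed |
| Video decoding | Two-stage flow: AVParser + decoder | `av_parser_parse2` + `avcodec_send/receive` |
| Demuxing + decoding | Container handling, SwsContext color conversion | `avformat_open_input` + `sws_scale` |
| H.264 encoding | Encoder parameters, x264 presets | `avcodec_find_encoder` + `avcodec_send/receive` |
| Video filter | Building a FilterGraph and moving data through it | `avfilter_graph_*` + `av_buffersrc/sink` |
| AAC bitstream parsing | ADTS frame structure, profile and sample-rate parsing | Manual parsing of the ADTS header |
| Audio decoding | Audio resampling with SwrContext | `swr_alloc_set_opts` + `swr_convert` |

Study tips

Go step by step: learn the decoding flow first (Demo 2/3), then encoding (Demo 4), and finally filters (Demo 5).
Debug often: print file information with av_dump_format() and turn on verbose logging with av_log_set_level(AV_LOG_DEBUG).
Watch the version: FFmpeg's API changes a lot. This article targets 4.1; newer releases (5.x/6.x) differ in the details, so pay attention to deprecation warnings.
Use the tools: install ffplay and ffprobe alongside your code so you can verify intermediate files at any time.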
