Common FFmpeg APIs

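Appendix: Overview of the FFmpeg Libraries

Library	Description
libavcodec	Core audio/video encoding and decoding library. Encoding: `avcodec_send_frame`, `avcodec_receive_packet`. Decoding: `avcodec_send_packet`, `avcodec_receive_frame`.
libavformat	Demuxing and muxing of audio/video streams across many container formats (MP4, MKV, FLV, TS, AVI, and others). Allocate and initialize a context: `avformat_alloc_context`, `avformat_alloc_output_context2`. Open and parse input: `avformat_open_input`. Write output: `avformat_write_header`, `av_write_frame`, `av_write_trailer`.
libavutil	Shared utility code used by the other libraries.
libswscale	Image scaling and pixel format conversion, for example converting RGB to YUV420 or changing the video resolution. Create a context: `sws_getContext`. Convert and scale: `sws_scale`.
libswresample	Audio resampling and sample format conversion. Initialize the resampling context: `swr_alloc_set_opts`, `swr_init`. Convert audio: `swr_convert`.
libavdevice	Device input and output, with support for capture devices such as cameras and microphones.
libpostproc	Video post-processing, mainly quality enhancement such as deblocking and denoising. Used together with a video decoder to improve the decoded picture.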

Appendix 1: References

FFmpeg video encoding/decoding workflow: https://www.cnblogs.com/fxw1/p/17229792.html

Common APIs: https://www.cnblogs.com/linuxAndMcu/p/12041359.html

Differences between FFmpeg versions: https://juejin.cn/post/7261245655128424509

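Appendix 2: Encoding/Decoding Flowcharts

Newer API (FFmpeg 4.0): [flowchart image not included]

Older API (FFmpeg 3.0): [flowchart image not included]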

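1. General APIs

1.1 `av_register_all()`

Initializes libavformat and registers all muxers, demuxers, and protocols. Deprecated as of FFmpeg 4.0, where explicit registration is no longer needed, and removed in FFmpeg 5.0.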

void av_register_all(void);

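1.2 `avcodec_find_encoder()`, `avcodec_find_decoder()`

Look up a registered encoder or decoder by codec ID. Declared in libavcodec/avcodec.h (newer releases move the declarations to libavcodec/codec.h).

// Takes an encoder ID; returns the matching encoder, or NULL if none is found.
// FFmpeg 5.0 and later return const AVCodec * instead.
AVCodec *avcodec_find_encoder(enum AVCodecID id);

// Takes a decoder ID; returns the matching decoder, or NULL if none is found.
AVCodec *avcodec_find_decoder(enum AVCodecID id);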

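1.3 `avcodec_open2()`

Initializes an audio/video AVCodecContext to use the given AVCodec. Declared in libavcodec/avcodec.h; in older source trees the implementation lives in libavcodec/utils.c.

int avcodec_open2(AVCodecContext *avctx,
                  const AVCodec *codec,
                  AVDictionary **options);

avctx: the AVCodecContext to initialize.
codec: the AVCodec to open the context for.
options: a dictionary of options. For example, when encoding with libx264, "preset" and "tune" can be set through this parameter.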

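1.4 `avcodec_close()`

Closes the given AVCodecContext and frees all data associated with it, but not the context itself. Declared in libavcodec/avcodec.h. Current FFmpeg documentation recommends avcodec_free_context() instead.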

int avcodec_close(AVCodecContext *avctx);

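2. Decoding APIs

2.1 `avformat_open_input()`

Opens an input stream and reads its header. The stream must be closed with avformat_close_input().

int avformat_open_input(AVFormatContext **ps,
                        const char *url,
                        AVInputFormat *fmt,
                        AVDictionary **options);

ps: pointer to a user-supplied AVFormatContext (allocated with avformat_alloc_context). It may point to NULL, in which case the function allocates the context itself.
url: URL of the audio/video stream to open.
fmt: if non-NULL, forces the given input format; otherwise the format is detected automatically. FFmpeg 5.0 and later declare this parameter as const AVInputFormat *.
options: dictionary of AVFormatContext and demuxer-private options; NULL is fine in most cases.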

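2.2 `avformat_find_stream_info()`

Reads packets from the media file to obtain detailed stream information, such as the codec used by each stream.

int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);

ic: the media file context.
options: optional array of per-stream codec option dictionaries; usually NULL.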

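2.3 `av_read_frame()`

Returns the next frame of a stream as a packet: exactly one frame for video, or a whole number of frames for audio.

For example, when decoding video, each decode step first calls av_read_frame() to obtain the compressed data for one frame, and only then decodes that data.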

int av_read_frame(AVFormatContext *s, AVPacket *pkt);

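2.4 `avcodec_send_packet()`

Part of the newer send/receive decoding API (available since FFmpeg 3.1): submits a compressed packet to the decoder as input.

int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);

avctx: pointer to the decoder's AVCodecContext, which must already have been opened with avcodec_open2.
avpkt: pointer to the AVPacket to feed to the decoder, for example one filled by av_read_frame. Passing NULL signals end of stream and starts draining.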

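2.5 `avcodec_receive_frame()`

Part of the same send/receive API: retrieves a decoded frame from the decoder.

int avcodec_receive_frame(AVCodecContext *avctx, AVFrame *frame);

avctx: pointer to the decoder's AVCodecContext, which holds the decoder state and configuration. It must already have been opened with avcodec_open2.
frame: pointer to an AVFrame that receives the decoded video or audio data. The call returns AVERROR(EAGAIN) when the decoder needs more input and AVERROR_EOF once it has been fully drained.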

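Why use avcodec_send_packet and avcodec_receive_frame

Decoupled input and output: one packet does not have to map to exactly one frame. A decoder may buffer several packets before producing any output, or produce several frames from one packet, and the two separate calls let the caller handle both cases.
Threading: the two calls are not synchronized for concurrent use on the same AVCodecContext. If packets are sent from one thread and frames are received in another, the caller has to serialize access itself.
Performance and flexibility: data is processed incrementally, so the caller is not forced to push the whole decode step through a single call.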

2.6 `avformat_close_input()`

Counterpart to 2.1: closes an opened input, frees the AVFormatContext and all of its contents, and sets *s to NULL.

void avformat_close_input(AVFormatContext **s);

3. Encoding APIs

3.1 `avformat_alloc_output_context2()`

Allocates and initializes an AVFormatContext for an output format. It is usually the first call made when muxing.

```c++
int avformat_alloc_output_context2(AVFormatContext **ctx,
                                   AVOutputFormat *oformat,
                                   const char *format_name,
                                   const char *filename);
```

ctx: pointer to the output context pointer; receives the allocated AVFormatContext.
oformat: the output format (AVOutputFormat) to use; may be NULL, in which case the format is inferred from format_name or filename.
format_name: name of the output format (for example "mp4" or "mkv"), used to select the container explicitly; may be NULL.
filename: name of the output file. It is used to infer the format when both oformat and format_name are NULL.

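3.2 `avformat_write_header()`

Writes the file header to the output, including the metadata that the container format requires.

int avformat_write_header(AVFormatContext *s, AVDictionary **options);

s (AVFormatContext*): pointer to the output context, typically created with avformat_alloc_output_context2, with its audio/video streams (AVStream) already set up.
options (AVDictionary**): pointer to a dictionary of AVFormatContext and muxer-private options; may be NULL.
- Sets container format options (for example movflags).
- Codec parameters such as the bit rate are normally set on the encoder's AVCodecContext rather than here.
- After the call, the caller must free the dictionary manually with av_dict_free.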

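3.3 `av_write_frame()`

Writes a single packet (AVPacket) to the output file. This is the muxing step, and it operates on encoded data, not on decoded frames.

int av_write_frame(AVFormatContext *s, AVPacket *pkt);

s (AVFormatContext*): pointer to the output context, usually created and initialized with avformat_alloc_output_context2.
pkt (AVPacket*): the packet holding the data to write. It must carry the index of the target stream (stream_index), timestamps (PTS/DTS) expressed in that stream's time base, and the data buffer.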

3.4 `av_write_trailer()`

Writes the file trailer and finalizes the output file.

int av_write_trailer(AVFormatContext *s);

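4. Image Processing APIs

4.1 `sws_getContext()`

Creates and initializes a scaling context (SwsContext) for pixel format conversion or image resizing.

struct SwsContext *sws_getContext(
    int srcW, int srcH, enum AVPixelFormat srcFormat,
    int dstW, int dstH, enum AVPixelFormat dstFormat,
    int flags, SwsFilter *srcFilter, SwsFilter *dstFilter, const double *param
);

srcW, srcH: width and height of the input image. srcFormat: pixel format of the input image (an AVPixelFormat value, for example AV_PIX_FMT_YUV420P).
dstW, dstH: width and height of the output image. dstFormat: pixel format of the output image (for example AV_PIX_FMT_RGB24).
flags: selects the scaling algorithm. Common values include:
- SWS_FAST_BILINEAR: fast bilinear scaling.
- SWS_BILINEAR: bilinear scaling.
- SWS_BICUBIC: bicubic interpolation (higher quality).
- SWS_LANCZOS: Lanczos resampling (the highest quality of the four).
srcFilter, dstFilter: filters applied to the input and output images; usually NULL.
param: extra parameters for tuning the selected scaling algorithm; usually NULL.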

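4.2 `sws_scale()`

The key function in libswscale: it performs the actual pixel format conversion and resizing, using a context created by sws_getContext.

int sws_scale(struct SwsContext *c,
              const uint8_t *const srcSlice[],
              const int srcStride[], int srcSliceY,
              int srcSliceH, uint8_t *const dst[],
              const int dstStride[]);

c (struct SwsContext*): the context returned by sws_getContext, which defines the conversion and scaling parameters.

srcSlice (const uint8_t *const[]): array of pointers to the planes of the input image (usually AVFrame->data).

srcStride (const int[]): array with the number of bytes per row for each input plane (usually AVFrame->linesize).

srcSliceY (int): first row of the input image to process, usually 0.

srcSliceH (int): number of input rows to process, for example AVFrame->height.

dst (uint8_t *const[]): array of pointers to the planes of the output image, which receive the converted data.

dstStride (const int[]): array with the number of bytes per row for each output plane.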

4.3 `sws_freeContext()`

Frees a SwsContext.

void sws_freeContext(struct SwsContext *swsContext);
