SVT-AV1 Source Code Analysis: the svt_aom_motion_estimation_kernel Function

I. Purpose of svt_aom_motion_estimation_kernel

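This code is a motion estimation kernel function in EBSDK that handles the motion estimation tasks of video encoding. Motion estimation is a key step in video encoding: for each block of the current frame it searches the reference frames for the best matching block, which reduces temporal redundancy in the video data.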

Main functionality

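svt_aom_motion_estimation_kernel runs an infinite loop: it fetches a task from the input queue, processes it, and posts the result to the results queue. It handles the following task types.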

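1 PAME: block-by-block motion estimation, used for P or B frames.

2 TFME: temporal filtering motion estimation, used inside the temporal filter.

3 DG_DETECTOR_HME: the HME (hierarchical motion estimation) part of the dynamic GOP detector.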

Code structure walkthrough

1 Get the input object

EB_GET_FULL_OBJECT(me_context_ptr->picture_decision_results_input_fifo_ptr, &in_results_wrapper_ptr)

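Fetch a full object from the input queue picture_decision_results_input_fifo_ptr. This object carries the information needed for the motion estimation task that is about to be processed.

2 Get the task type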

if (in_results_ptr->task_type == TASK_TFME)

me_context_ptr->me_ctx->me_type = ME_MCTF;

else if (in_results_ptr->task_type == TASK_PAME || in_results_ptr->task_type == TASK_SUPERRES_RE_ME)

me_context_ptr->me_ctx->me_type = ME_OPEN_LOOP;

else if (in_results_ptr->task_type == TASK_DG_DETECTOR_HME)

me_context_ptr->me_ctx->me_type = ME_DG_DETECTOR;

The motion estimation type me_type is set according to the task type.
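The mapping from task type to ME type can be sketched as a small standalone function. The two enums below are simplified stand-ins defined only for this example (the encoder's real definitions are not shown in this article and may contain other members or values), and me_type_for_task is a hypothetical helper name. The branches mirror the snippet above.

```c
#include <assert.h>

/* Simplified stand-ins for the encoder's enums; values are illustrative only. */
typedef enum { TASK_PAME, TASK_TFME, TASK_SUPERRES_RE_ME, TASK_DG_DETECTOR_HME } TaskType;
typedef enum { ME_OPEN_LOOP, ME_MCTF, ME_DG_DETECTOR } MeType;

/* Hypothetical helper: choose the ME type used while processing a task. */
MeType me_type_for_task(TaskType task) {
    switch (task) {
    case TASK_TFME:
        return ME_MCTF; /* temporal filtering */
    case TASK_DG_DETECTOR_HME:
        return ME_DG_DETECTOR; /* dynamic GOP detector */
    case TASK_PAME:
    case TASK_SUPERRES_RE_ME:
        return ME_OPEN_LOOP; /* regular open-loop ME */
    }
    assert(0 && "unknown task type");
    return ME_OPEN_LOOP;
}
```

Both PAME and SUPERRES_RE_ME land on ME_OPEN_LOOP, which is why the later steps treat the two task types together.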

3 Derive the ME kernel signals

if ((in_results_ptr->task_type == TASK_PAME) ||

(in_results_ptr->task_type == TASK_SUPERRES_RE_ME))

svt_aom_sig_deriv_me(scs, pcs, me_context_ptr->me_ctx);

else if (in_results_ptr->task_type == TASK_TFME)

svt_aom_sig_deriv_me_tf(pcs, me_context_ptr->me_ctx);

4 Handle the PAME or SUPERRES_RE_ME task

if (in_results_ptr->task_type == TASK_PAME || in_results_ptr->task_type == TASK_SUPERRES_RE_ME)

This part of the code handles the PAME and SUPERRES_RE_ME tasks. It consists of the following steps.

1 Get the picture buffer pointers

EbPictureBufferDesc *sixteenth_picture_ptr;

EbPictureBufferDesc *quarter_picture_ptr;

EbPictureBufferDesc *input_padded_pic;

EbPictureBufferDesc *input_pic;

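These pointers refer, respectively, to the buffers of the 1/16 downsampled picture, the 1/4 downsampled picture, the padded input picture, and the original input picture.

2 Get the segment index and picture dimensions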

uint32_t segment_index = in_results_ptr->segment_index;

uint32_t pic_width_in_b64 = (pcs->aligned_width + scs->b64_size - 1) / scs->b64_size;

uint32_t picture_height_in_b64 = (pcs->aligned_height + scs->b64_size - 1) / scs->b64_size;

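Compute the picture width and height in units of 64x64 blocks (rounding up), and read the current segment index.

3 Convert the segment index to XY coordinates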
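The width and height expressions above are a ceiling division. A minimal sketch, with size_in_blocks as a hypothetical helper name that is not part of the encoder:

```c
#include <assert.h>
#include <stdint.h>

/* Number of block_size-wide blocks needed to cover `length` pixels.
 * Adding block_size - 1 before dividing rounds the result up. */
uint32_t size_in_blocks(uint32_t length, uint32_t block_size) {
    assert(block_size > 0);
    return (length + block_size - 1) / block_size;
}
```

For a 1920x1080 picture and 64x64 blocks this gives 30 blocks across and 17 blocks down; the last row of blocks is only partially covered by picture data.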

SEGMENT_CONVERT_IDX_TO_XY(segment_index, x_segment_index, y_segment_index, pcs->me_segments_column_count);

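Convert the one-dimensional segment index into two-dimensional XY coordinates.

4 Compute the start and end indices of the 64x64 blocks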
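The macro definition is not shown in this article. Assuming segments are numbered in row-major order, which is what passing the column count suggests, the conversion can be sketched as follows (segment_idx_to_xy is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>

/* Row-major conversion: segments are numbered left to right, then top to bottom. */
void segment_idx_to_xy(uint32_t index, uint32_t columns, uint32_t *x, uint32_t *y) {
    assert(columns > 0 && x != 0 && y != 0);
    *y = index / columns; /* full rows that precede this segment */
    *x = index % columns; /* position inside its row */
}
```

With 4 segment columns, index 5 maps to x = 1, y = 1.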

uint32_t x_b64_start_index = SEGMENT_START_IDX(x_segment_index, pic_width_in_b64, pcs->me_segments_column_count);

uint32_t x_b64_end_index = SEGMENT_END_IDX(x_segment_index, pic_width_in_b64, pcs->me_segments_column_count);

uint32_t y_b64_start_index = SEGMENT_START_IDX(y_segment_index, picture_height_in_b64, pcs->me_segments_row_count);

uint32_t y_b64_end_index = SEGMENT_END_IDX(y_segment_index, picture_height_in_b64, pcs->me_segments_row_count);

Compute the start and end indices of the 64x64 blocks that belong to the current segment.

5 Motion estimation processing
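The definitions of SEGMENT_START_IDX and SEGMENT_END_IDX are not shown here. A minimal sketch, assuming the blocks are split proportionally among the segments and that the end index is exclusive (consistent with the `<` comparison used in the block loops later on); segment_start and segment_end are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* First block index of segment `seg` when `total` blocks are split into `count` segments. */
uint32_t segment_start(uint32_t seg, uint32_t total, uint32_t count) {
    assert(count > 0 && seg < count);
    return (seg * total) / count;
}

/* One past the last block index of segment `seg` (exclusive end). */
uint32_t segment_end(uint32_t seg, uint32_t total, uint32_t count) {
    assert(count > 0 && seg < count);
    return ((seg + 1) * total) / count;
}
```

Splitting 30 block columns into 4 segment columns yields the half-open ranges [0, 7), [7, 15), [15, 22), and [22, 30). Each segment's end equals the next segment's start, so every block is assigned to exactly one segment.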

if (!skip_me) {

if (pcs->slice_type != I_SLICE) {

// Use scaled source references if needed

svt_aom_use_scaled_source_refs_if_needed(pcs, input_pic, pa_ref_obj_, &input_padded_pic, &quarter_picture_ptr, &sixteenth_picture_ptr);

// Loop over the 64x64 blocks

for (uint32_t y_b64_index = y_b64_start_index; y_b64_index < y_b64_end_index; ++y_b64_index) {

for (uint32_t x_b64_index = x_b64_start_index; x_b64_index < x_b64_end_index; ++x_b64_index) {

// Index and origin of the 64x64 block

uint32_t b64_index = (uint16_t)(x_b64_index + y_b64_index * pic_width_in_b64);

uint32_t b64_origin_x = x_b64_index * scs->b64_size;

uint32_t b64_origin_y = y_b64_index * scs->b64_size;

// Point the ME context at the 64x64 source block

uint32_t buffer_index = (input_pic->org_y + b64_origin_y) * input_pic->stride_y +

input_pic->org_x + b64_origin_x;

me_context_ptr->me_ctx->b64_src_ptr = &input_padded_pic->buffer_y[buffer_index];

me_context_ptr->me_ctx->b64_src_stride = input_padded_pic->stride_y;

// Locate the 1/4 and 1/16 downsampled block data

if (me_context_ptr->me_ctx->enable_hme_level1_flag) {

buffer_index = (quarter_picture_ptr->org_y + (b64_origin_y >> 1)) * quarter_picture_ptr->stride_y +

quarter_picture_ptr->org_x + (b64_origin_x >> 1);

me_context_ptr->me_ctx->quarter_b64_buffer = &quarter_picture_ptr->buffer_y[buffer_index];

me_context_ptr->me_ctx->quarter_b64_buffer_stride = quarter_picture_ptr->stride_y;

}

if (me_context_ptr->me_ctx->enable_hme_level0_flag) {

buffer_index = (sixteenth_picture_ptr->org_y + (b64_origin_y >> 2)) * sixteenth_picture_ptr->stride_y +

sixteenth_picture_ptr->org_x + (b64_origin_x >> 2);

me_context_ptr->me_ctx->sixteenth_b64_buffer = &sixteenth_picture_ptr->buffer_y[buffer_index];

me_context_ptr->me_ctx->sixteenth_b64_buffer_stride = sixteenth_picture_ptr->stride_y;

}

// Set the motion estimation type

me_context_ptr->me_ctx->me_type = ME_OPEN_LOOP;

// Set up the reference picture information

if ((in_results_ptr->task_type == TASK_PAME) || (in_results_ptr->task_type == TASK_SUPERRES_RE_ME)) {

me_context_ptr->me_ctx->num_of_list_to_search = (pcs->slice_type == P_SLICE) ? 1 : 2;

me_context_ptr->me_ctx->num_of_ref_pic_to_search[0] = pcs->ref_list0_count_try;

if (pcs->slice_type == B_SLICE)

me_context_ptr->me_ctx->num_of_ref_pic_to_search[1] = pcs->ref_list1_count_try;

me_context_ptr->me_ctx->temporal_layer_index = pcs->temporal_layer_index;

me_context_ptr->me_ctx->is_ref = pcs->is_ref;

// Super-resolution or frame resize: use the downscaled reference pictures

if (pcs->frame_superres_enabled || pcs->frame_resize_enabled) {

for (int i = 0; i < me_context_ptr->me_ctx->num_of_list_to_search; i++) {

for (int j = 0; j < me_context_ptr->me_ctx->num_of_ref_pic_to_search[i]; j++) {

uint8_t sr_denom_idx = svt_aom_get_denom_idx(pcs->superres_denom);

uint8_t resize_denom_idx = svt_aom_get_denom_idx(pcs->resize_denom);

EbPaReferenceObject *ref_object = (EbPaReferenceObject *)pcs->ref_pa_pic_ptr_array[i][j]->object_ptr;

me_context_ptr->me_ctx->me_ds_ref_array[i][j].picture_ptr = ref_object->downscaled_input_padded_picture_ptr[sr_denom_idx][resize_denom_idx];

me_context_ptr->me_ctx->me_ds_ref_array[i][j].quarter_picture_ptr = ref_object->downscaled_quarter_downsampled_picture_ptr[sr_denom_idx][resize_denom_idx];

me_context_ptr->me_ctx->me_ds_ref_array[i][j].sixteenth_picture_ptr = ref_object->downscaled_sixteenth_downsampled_picture_ptr[sr_denom_idx][resize_denom_idx];

me_context_ptr->me_ctx->me_ds_ref_array[i][j].picture_number = ref_object->picture_number;

}

}

} else {

for (int i = 0; i < me_context_ptr->me_ctx->num_of_list_to_search; i++) {

for (int j = 0; j < me_context_ptr->me_ctx->num_of_ref_pic_to_search[i]; j++) {

EbPaReferenceObject *ref_object = (EbPaReferenceObject *)pcs->ref_pa_pic_ptr_array[i][j]->object_ptr;

me_context_ptr->me_ctx->me_ds_ref_array[i][j].picture_ptr = ref_object->input_padded_pic;

me_context_ptr->me_ctx->me_ds_ref_array[i][j].quarter_picture_ptr = ref_object->quarter_downsampled_picture_ptr;

me_context_ptr->me_ctx->me_ds_ref_array[i][j].sixteenth_picture_ptr = ref_object->sixteenth_downsampled_picture_ptr;

me_context_ptr->me_ctx->me_ds_ref_array[i][j].picture_number = ref_object->picture_number;

}

}

}

}

// Run motion estimation on the 64x64 block

svt_aom_motion_estimation_b64(pcs, b64_index, b64_origin_x, b64_origin_y, me_context_ptr->me_ctx, input_pic);

// Global motion estimation, once every 64x64 block has been processed

if ((in_results_ptr->task_type == TASK_PAME) || (in_results_ptr->task_type == TASK_SUPERRES_RE_ME)) {

svt_block_on_mutex(pcs->me_processed_b64_mutex);

pcs->me_processed_b64_count++;

if (pcs->me_processed_b64_count == pcs->b64_total_count) {

if (pcs->gm_ctrls.enabled && (!pcs->gm_ctrls.pp_enabled || pcs->gm_pp_detected)) {

svt_aom_global_motion_estimation(pcs, input_pic);

} else {

memset(pcs->is_global_motion, FALSE, MAX_NUM_OF_REF_PIC_LIST * REF_LIST_MAX_DEPTH);

}

}

svt_release_mutex(pcs->me_processed_b64_mutex);

}

}

}

}

// Open-loop intra search (OIS)

if (scs->in_loop_ois == 0 && pcs->tpl_ctrls.enable)

for (uint32_t y_b64_index = y_b64_start_index; y_b64_index < y_b64_end_index; ++y_b64_index)

for (uint32_t x_b64_index = x_b64_start_index; x_b64_index < x_b64_end_index; ++x_b64_index) {

uint32_t b64_index = (uint16_t)(x_b64_index + y_b64_index * pic_width_in_b64);

svt_aom_open_loop_intra_search_mb(pcs, b64_index, input_pic);

}

}

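This part of the code performs motion estimation for the PAME and SUPERRES_RE_ME tasks. It includes:

Locating the 64x64 block data: the ME context is pointed at the current 64x64 block of the input picture (the code sets a pointer and a stride rather than copying samples).

Locating the downsampled block data: if HME (Hierarchical Motion Estimation) level 1 or level 0 is enabled, the matching 1/4 or 1/16 downsampled block is located as well.

Setting the reference picture information: the number of reference lists and of reference pictures is set according to the slice type (P or B).

Running motion estimation: `svt_aom_motion_estimation_b64` is called to perform motion estimation.

Global motion estimation: once all 64x64 blocks have been processed, global motion estimation is run.

Open-loop intra search: if scs->in_loop_ois is 0 and TPL is enabled, an open-loop intra search is run for every 64x64 block.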
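The buffer_index arithmetic used for the full-resolution, 1/4, and 1/16 pictures follows one pattern, which can be sketched as below. PicLayout is a hypothetical, reduced stand-in for EbPictureBufferDesc that keeps only the three fields the calculation reads, and b64_luma_offset is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced picture descriptor: only the fields the offset needs. */
typedef struct {
    uint32_t org_x;    /* left padding, in samples */
    uint32_t org_y;    /* top padding, in samples */
    uint32_t stride_y; /* luma row stride, in samples */
} PicLayout;

/* Offset of a block's top-left luma sample inside the padded buffer.
 * origin_x / origin_y are full-resolution coordinates; shift is 0 for the
 * full picture, 1 for the 1/4 picture, and 2 for the 1/16 picture. */
uint32_t b64_luma_offset(const PicLayout *pic, uint32_t origin_x, uint32_t origin_y, uint32_t shift) {
    assert(pic != 0 && shift <= 2);
    return (pic->org_y + (origin_y >> shift)) * pic->stride_y + pic->org_x + (origin_x >> shift);
}
```

The shift halves the coordinates once per downsampling level because the 1/4 picture is half the size in each dimension and the 1/16 picture is a quarter of the size in each dimension.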

5 Handle the TFME task

else if (in_results_ptr->task_type == TASK_TFME) {

// gm pre-processing for only base B

if (pcs->gm_ctrls.pp_enabled && pcs->gm_pp_enabled && in_results_ptr->segment_index == 0)

svt_aom_gm_pre_processor(pcs, pcs->temp_filt_pcs_list);

// temporal filtering start

me_context_ptr->me_ctx->me_type = ME_MCTF;

svt_av1_init_temporal_filtering(pcs->temp_filt_pcs_list, pcs, me_context_ptr, in_results_ptr->segment_index);

// Release the Input Results

svt_release_object(in_results_wrapper_ptr);

}

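This part of the code handles the TFME task, that is, motion estimation for the temporal filter. It includes:

Global motion pre-processing: if global motion pre-processing is enabled and the current segment index is 0, svt_aom_gm_pre_processor is called to do the pre-processing.

Temporal filtering initialization: svt_av1_init_temporal_filtering is called to initialize temporal filtering.

6 Handle the DG_DETECTOR_HME task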

else if (in_results_ptr->task_type == TASK_DG_DETECTOR_HME) {

// dynamic gop detector

dg_detector_hme_level0(pcs, in_results_ptr->segment_index);

// Release the Input Results

svt_release_object(in_results_wrapper_ptr);

}

This part of the code handles the DG_DETECTOR_HME task, the HME part of the dynamic GOP detector, by calling dg_detector_hme_level0.

7 Post the results

// Get Empty Results Object

svt_get_empty_object(me_context_ptr->motion_estimation_results_output_fifo_ptr,

&out_results_wrapper);

MotionEstimationResults *out_results = (MotionEstimationResults *)out_results_wrapper->object_ptr;

out_results->pcs_wrapper = in_results_ptr->pcs_wrapper;

out_results->segment_index = segment_index;

out_results->task_type = in_results_ptr->task_type;

// Release the Input Results

svt_release_object(in_results_wrapper_ptr);

// Post the Full Results Object

svt_post_full_object(out_results_wrapper);

Get an empty results object, fill in the result fields (pcs_wrapper, segment_index, and task_type), and post it to the output queue.

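Summary

This function is the core of the motion estimation module in EBSDK. It handles the different kinds of motion estimation tasks: it fetches a task from the input queue, processes it according to its task type, and posts the result to the output queue.

Main functions

1 Task dispatch: runs the motion estimation logic that matches the task type (PAME, TFME, or DG_DETECTOR_HME).

2 Motion estimation kernel: performs motion estimation on 64x64 blocks, which includes locating the block data, setting the reference picture information, and running the motion estimation algorithm.

3 Global motion estimation: runs global motion estimation after all blocks have been processed.

4 Result posting: posts the results to the output queue for the downstream modules.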
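The "run global motion estimation after all blocks have been processed" step relies on a shared per-picture counter: every worker increments it, and the worker whose increment reaches the total does the extra work. The function analyzed here protects the counter with a mutex. The sketch below swaps in a C11 atomic counter so that it needs no threading library, so it illustrates the idea rather than the encoder's actual locking. BlockProgress and both function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-picture progress tracker. */
typedef struct {
    atomic_uint processed; /* blocks finished so far */
    uint32_t    total;     /* blocks in the picture */
} BlockProgress;

void progress_init(BlockProgress *p, uint32_t total) {
    assert(p != NULL && total > 0);
    atomic_init(&p->processed, 0);
    p->total = total;
}

/* Record one finished block. Returns true only for the call that completes
 * the picture, so exactly one caller runs the follow-up step. */
bool progress_mark_done(BlockProgress *p) {
    /* atomic_fetch_add returns the value held before the increment. */
    unsigned previous = atomic_fetch_add(&p->processed, 1u);
    return previous + 1 == p->total;
}
```

One difference from the original is worth noting: there, the follow-up step runs while the mutex is still held, whereas in this sketch the caller that receives true runs it outside any lock.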