Android 16 Display System | From View to Screen Series - 10 | SurfaceFlinger Composition (Part 3)

CompositionEngine::present is the core method of SurfaceFlinger composition.

void CompositionEngine::present(CompositionRefreshArgs& args) {
    // Call onPreComposition on each LayerFE and record the composition start time
    preComposition(args);

    {
        LayerFESet latchedLayers;
        // As noted earlier: each Output represents one display device
        for (const auto& output : args.outputs) {
            // Key point 1: Output prepare
            output->prepare(args, latchedLayers);
        }
    }

    offloadOutputs(args.outputs);
    // ftl::Future<T> represents an asynchronous task
    ui::DisplayVector<ftl::Future<std::monostate>> presentFutures;
    for (const auto& output : args.outputs) {
        // For each display, add output->present to the list of futures
        presentFutures.push_back(output->present(args));
    }

    {
        for (auto& future : presentFutures) {
            // Wait for output->present to complete
            future.get();
        }
    }
    postComposition(args);
}

In juejin.cn/post/753567..., we already analyzed the implementation of prepare, which completes all the preparation work before composition. Building on that, this section continues with the remaining steps: present and postComposition.
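The fan-out/join shape of present above can be sketched with standard C++ futures. This is a simplified stand-in for ftl::Future; presentOutput and the int ids are hypothetical placeholders for real Output objects:

```cpp
#include <future>
#include <vector>

// Hypothetical stand-in for one display's present() work; in SurfaceFlinger
// each Output::present() returns a future that the loop later waits on.
int presentOutput(int outputId) {
    // ... composition work for this display would happen here ...
    return outputId;  // result consumed by future.get()
}

// Fan-out/join pattern from CompositionEngine::present: kick off present()
// for every output first, then block on each future in turn.
std::vector<int> presentAllOutputs(const std::vector<int>& outputIds) {
    std::vector<std::future<int>> futures;
    for (int id : outputIds)
        futures.push_back(std::async(std::launch::async, presentOutput, id));

    std::vector<int> results;
    for (auto& f : futures)
        results.push_back(f.get());  // equivalent of the future.get() loop
    return results;
}
```

Starting every output's work before waiting on any of them lets multiple displays compose in parallel instead of serializing per display.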

Output::present

ftl::Future<std::monostate> Output::present(
        const compositionengine::CompositionRefreshArgs& refreshArgs) {
    ...
    // 1. Update the composition type each OutputLayer needs to use
    updateCompositionState(refreshArgs);
    planComposition();
    // 2. Pass the composition state to the HWComposer
    writeCompositionState(refreshArgs);
    beginFrame();
    ...
    GpuCompositionResult result;
    const bool predictCompositionStrategy = canPredictCompositionStrategy(refreshArgs);
    // prepareFrameAsync is basically the same as prepareFrame, just asynchronous instead of synchronous
    // 3. prepareFrame: hardware composition
    if (predictCompositionStrategy) {
        result = prepareFrameAsync();
    } else {
        prepareFrame();
    }
    ...
    // 4. GPU composition
    finishFrame(std::move(result));
    ftl::Future<std::monostate> future;
    const bool flushEvenWhenDisabled = !refreshArgs.bufferIdsToUncache.empty();
    if (mOffloadPresent) {
        ...
    } else {
        // 5. Submit the GPU composition result to the HWC
        presentFrameAndReleaseLayers(flushEvenWhenDisabled);
        future = ftl::yield<std::monostate>({});
    }
    renderCachedSets(refreshArgs);
    return future;
}

1. updateCompositionState

void Output::updateCompositionState(const compositionengine::CompositionRefreshArgs& refreshArgs) {

    // Iterate over all OutputLayers and
    // return the Layer that requests a "blurred background", if any
    mLayerRequestingBackgroundBlur = findLayerRequestingBackgroundComposition();
    // If a "blurred background" Layer exists, forceClientComposition is true, meaning this Layer must be composed by the GPU
    bool forceClientComposition = mLayerRequestingBackgroundBlur != nullptr;

    auto* properties = getOverlaySupport();
    // Iterate over all OutputLayers
    for (auto* layer : getOutputLayersOrderedByZ()) {
        layer->updateCompositionState(refreshArgs.updatingGeometryThisFrame,
                                      refreshArgs.devOptForceClientComposition ||
                                              forceClientComposition,
                                      refreshArgs.internalDisplayRotationFlags,
                                      properties ? properties->lutProperties : std::nullopt);

        if (mLayerRequestingBackgroundBlur == layer) {
            forceClientComposition = false;
        }
    }
    commitPictureProfilesToCompositionState();
}

This iterates over all OutputLayers looking for a Layer with a "blurred background" (a visual effect). If one exists, that Layer and the Layers beneath it (by Z order) must be composed by the GPU. This is where each Layer's required composition type is determined.
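The forced-client-composition propagation above can be sketched as follows. Layer and its field names are simplified stand-ins for the real OutputLayer state:

```cpp
#include <vector>

struct Layer {
    bool requestsBackgroundBlur = false;
    bool forcedClientComposition = false;  // set when GPU composition is required
};

// Sketch of the loop in Output::updateCompositionState: layers are visited in
// ascending Z order, and every layer up to and including the blur layer is
// forced to client (GPU) composition, since the blur must sample the layers
// beneath it.
void markForcedClientComposition(std::vector<Layer>& layersByZ) {
    const Layer* blurLayer = nullptr;
    for (const auto& l : layersByZ)
        if (l.requestsBackgroundBlur) { blurLayer = &l; break; }

    bool force = (blurLayer != nullptr);
    for (auto& l : layersByZ) {
        l.forcedClientComposition = force;
        if (&l == blurLayer) force = false;  // layers above the blur can still use HWC
    }
}
```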

2. Passing the composition state to the HWComposer

void Output::writeCompositionState(const compositionengine::CompositionRefreshArgs& refreshArgs) {
    ...
    // Iterate over all OutputLayers
    for (auto* layer : getOutputLayersOrderedByZ()) {
        ...
        layer->writeStateToHWC(includeGeometry, skipLayer, z++, overrideZ, isPeekingThrough,
                               hasLutsProperties);
        ...
    }
}

The implementation of layer->writeStateToHWC is as follows:

void OutputLayer::writeStateToHWC(bool includeGeometry, bool skipLayer, uint32_t z,
                                  bool zIsOverridden, bool isPeekingThrough,
                                  bool hasLutsProperties) {
    const auto& state = getState();

    // Get the HWC::Layer
    auto& hwcLayer = (*state.hwc).hwcLayer;
    
    // outputIndependentState here comes from the LayerFE
    const auto* outputIndependentState = getLayerFE().getCompositionState();
    
    // Get the composition type this Layer requires
    auto requestedCompositionType = outputIndependentState->compositionType;

    if (requestedCompositionType == Composition::SOLID_COLOR && state.overrideInfo.buffer) {
        requestedCompositionType = Composition::DEVICE;
    }

    const bool isOverridden =
            state.overrideInfo.buffer != nullptr || isPeekingThrough || zIsOverridden;
    const bool prevOverridden = state.hwc->stateOverridden;
    if (isOverridden || prevOverridden || skipLayer || includeGeometry) {
        // Write the geometry state
        writeOutputDependentGeometryStateToHWC(hwcLayer.get(), requestedCompositionType, z);
        writeOutputIndependentGeometryStateToHWC(hwcLayer.get(), *outputIndependentState,
                                                 skipLayer);
    }

    // Write the per-frame state
    writeOutputDependentPerFrameStateToHWC(hwcLayer.get());
    writeOutputIndependentPerFrameStateToHWC(hwcLayer.get(), *outputIndependentState,
                                             requestedCompositionType, skipLayer);
    // Write the composition type
    writeCompositionTypeToHWC(hwcLayer.get(), requestedCompositionType, isPeekingThrough,
                              skipLayer);
    if (hasLutsProperties) {
        writeLutToHWC(hwcLayer.get(), *outputIndependentState);
    }

    if (requestedCompositionType == Composition::SOLID_COLOR) {
        writeSolidColorStateToHWC(hwcLayer.get(), *outputIndependentState);
    }
    ...
   
}
  1. outputIndependentState comes from the LayerFE. As mentioned several times before, the LayerFE comes from the Layer, and the Layer's data comes from the App. This outputIndependentState will come up again later in the analysis.
  2. Many functions starting with "write" are called here to write various pieces of state to the HWComposer. They all follow the same internal pattern, so we take writeOutputIndependentPerFrameStateToHWC as an example to see how the per-frame state is written to the HWComposer.
void OutputLayer::writeOutputIndependentPerFrameStateToHWC(
        HWC2::Layer* hwcLayer, const LayerFECompositionState& outputIndependentState,
        Composition compositionType, bool skipLayer) {
    // Set the color transform
    switch (auto error = hwcLayer->setColorTransform(outputIndependentState.colorTransform)) {
        case hal::Error::NONE:
            break;
        // 1. If unsupported, mark this Layer as requiring GPU composition
        case hal::Error::UNSUPPORTED:
            editState().forceClientComposition = true;
            break;
        default:
            break;
    }
   
    const Region& surfaceDamage = getState().overrideInfo.buffer
            ? getState().overrideInfo.damageRegion
            : (getState().hwc->stateOverridden ? Region::INVALID_REGION
                                               : outputIndependentState.surfaceDamage);
    
    switch (compositionType) {
        case Composition::SOLID_COLOR:
            break;
        case Composition::SIDEBAND:
            writeSidebandStateToHWC(hwcLayer, outputIndependentState);
            break;
        case Composition::CURSOR:
        // If a Layer doesn't need GPU composition, it defaults to hardware composition and takes this branch
        case Composition::DEVICE:
        case Composition::DISPLAY_DECORATION:
        case Composition::REFRESH_RATE_INDICATOR:
            // 2. Write the buffer-related data
            writeBufferStateToHWC(hwcLayer, outputIndependentState, skipLayer);
            break;
        case Composition::INVALID:
        // If this Layer needs GPU composition, it is skipped
        case Composition::CLIENT:
            // Ignored
            break;
    }
}
  1. If a Layer does not need GPU composition, it goes through the hardware composition path by default.
  2. If a Layer is composed by the GPU, it can simply be skipped here.
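The branching on composition type can be condensed into a small sketch. The enum mirrors Composition from the AOSP headers; perFrameAction is a hypothetical helper that just names the action taken:

```cpp
#include <string>

enum class Composition { INVALID, CLIENT, DEVICE, SOLID_COLOR, CURSOR, SIDEBAND,
                         DISPLAY_DECORATION, REFRESH_RATE_INDICATOR };

// Sketch of the switch in writeOutputIndependentPerFrameStateToHWC: only
// buffer-backed composition types push a buffer to the HWC; CLIENT layers are
// skipped because their pixels arrive later through the GPU-composed client
// target instead.
std::string perFrameAction(Composition type) {
    switch (type) {
        case Composition::SOLID_COLOR:   return "solid-color";
        case Composition::SIDEBAND:      return "sideband";
        case Composition::CURSOR:
        case Composition::DEVICE:
        case Composition::DISPLAY_DECORATION:
        case Composition::REFRESH_RATE_INDICATOR:
                                         return "write-buffer";
        case Composition::INVALID:
        case Composition::CLIENT:        return "skip";
    }
    return "skip";
}
```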

Let's focus on the implementation of writeBufferStateToHWC.

1. Writing buffer data to the HWComposer

The implementation of writeBufferStateToHWC is as follows:

void OutputLayer::writeBufferStateToHWC(HWC2::Layer* hwcLayer,
                                        const LayerFECompositionState& outputIndependentState,
                                        bool skipLayer) {
    ...
    // 1. Get the buffer
    HwcSlotAndBuffer hwcSlotAndBuffer;
    sp<Fence> hwcFence;
    {

        auto& state = editState();

        if (state.overrideInfo.buffer != nullptr && !skipLayer) {
            hwcSlotAndBuffer = state.hwc->hwcBufferCache.getOverrideHwcSlotAndBuffer(
                    state.overrideInfo.buffer->getBuffer());
            hwcFence = state.overrideInfo.acquireFence;
            state.hwc->activeBufferId = state.overrideInfo.buffer->getBuffer()->getId();
        } else {
            // The buffer data comes from the LayerFE's outputIndependentState
            hwcSlotAndBuffer = state.hwc->hwcBufferCache.getHwcSlotAndBuffer(outputIndependentState.buffer);
            hwcFence = outputIndependentState.acquireFence;
            state.hwc->activeBufferId = outputIndependentState.buffer->getId();
        }
        state.hwc->activeBufferSlot = hwcSlotAndBuffer.slot;
    }
    // 2. Set the buffer on the hwcLayer
    if (auto error = hwcLayer->setBuffer(hwcSlotAndBuffer.slot, hwcSlotAndBuffer.buffer, hwcFence);
        error != hal::Error::NONE) {
        ...
    }
}
  1. By default, state.overrideInfo.buffer is null, so the buffer comes from the LayerFE's outputIndependentState, which in turn comes from the Layer's Snapshot.
  2. hwcLayer->setBuffer is called to hand the buffer over to the HWC2::Layer.
// frameworks/native/services/surfaceflinger/DisplayHardware/HWC2.cpp
Error Layer::setBuffer(uint32_t slot, const sp<GraphicBuffer>& buffer,
        const sp<Fence>& acquireFence)
{

    mBufferSlot = slot;
    int32_t fenceFd = acquireFence->dup();
    auto intError = mComposer.setLayerBuffer(mDisplay->getId(), mId, slot, buffer, fenceFd);
}

This ultimately calls AidlComposer's setLayerBuffer:

// frameworks/native/services/surfaceflinger/DisplayHardware/AidlComposerHal.cpp
Error AidlComposer::setLayerBuffer(Display display, Layer layer, uint32_t slot,
                                   const sp<GraphicBuffer>& buffer, int acquireFence) {
    const native_handle_t* handle = nullptr;
    if (buffer.get()) {
        // Get the native handle of this GraphicBuffer
        handle = buffer->getNativeBuffer()->handle;
    }

    if (auto writer = getWriter(display)) {
        // Stage the data in the Writer
        writer->get().setLayerBuffer(translate<int64_t>(display), translate<int64_t>(layer), slot,
                                     handle, acquireFence);
    } else {
        error = Error::BAD_DISPLAY;
    }
    return error;
}

Here the data is staged in the Writer; it has not been sent to the HWC yet.

We have only analyzed one of the "write" functions here, but the others all follow the same flow: through the hwcLayer, various pieces of state are written into the HWComposer's Writer and staged there. From this we can see that the HWC2::Layer hwcLayer is essentially the bridge between the OutputLayer and the HWComposer.
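The staging role of the Writer can be illustrated with a minimal sketch. CommandWriter is a hypothetical class; the real ComposerClientWriter serializes AIDL command structs rather than strings:

```cpp
#include <string>
#include <vector>

// Sketch of the Writer staging pattern: set-calls only record commands;
// nothing crosses the process boundary until execute() flushes the batch
// in a single call (one Binder transaction in the real AidlComposer).
class CommandWriter {
public:
    void setLayerBuffer(int layer, int slot) {
        mPending.push_back("setLayerBuffer(" + std::to_string(layer) + "," +
                           std::to_string(slot) + ")");
    }
    // Returns the batch that would be sent to the HWC server and clears it.
    std::vector<std::string> execute() {
        std::vector<std::string> batch = std::move(mPending);
        mPending.clear();
        return batch;
    }
private:
    std::vector<std::string> mPending;
};
```

Batching keeps the per-frame IPC cost at one transaction per display rather than one per state change.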

Next, let's continue with the present flow: prepareFrame().

3. Prepare Frame -- Hardware Composition

void Output::prepareFrame() {

    std::optional<android::HWComposer::DeviceRequestedChanges> changes;
    // 1. Decide the final composition strategy based on whether each Layer needs GPU composition
    bool success = chooseCompositionStrategy(&changes);
    resetCompositionStrategy();
    // Note: strategyPrediction will matter later during GPU composition
    outputState.strategyPrediction = CompositionStrategyPredictionState::DISABLED;
    outputState.previousDeviceRequestedChanges = changes;
    outputState.previousDeviceRequestedSuccess = success;
    if (success) {
        // 2. Apply the chosen composition strategy to the OutputLayers
        applyCompositionStrategy(changes);
    }
    finishPrepareFrame();
}

1. chooseCompositionStrategy

bool Display::chooseCompositionStrategy(
        std::optional<android::HWComposer::DeviceRequestedChanges>* outChanges) {


    // Get any composition changes requested by the HWC device, and apply them.
    auto& hwc = getCompositionEngine().getHwComposer();
    // 1. Check whether any Layer needs GPU composition
    const bool requiresClientComposition = anyLayersRequireClientComposition();

    // 2. getDeviceCompositionChanges
    if (status_t result = hwc.getDeviceCompositionChanges(*halDisplayId, requiresClientComposition,
                                                          getState().earliestPresentTime,
                                                          getState().expectedPresentTime,
                                                          getState().frameInterval, outChanges);
        result != NO_ERROR) {
        ...
    }
    ...
}
1. Checking whether any Layer requires GPU composition
bool Output::anyLayersRequireClientComposition() const {
    // Iterate over all OutputLayers
    const auto layers = getOutputLayersOrderedByZ();
    // any_of checks whether at least one element in the container satisfies the predicate
    // In other words: if at least one Layer needs GPU composition, return true
    return std::any_of(layers.begin(), layers.end(),
                       // requiresClientComposition is shown below
                       [](const auto& layer) { return layer->requiresClientComposition(); });
}

// frameworks/native/services/surfaceflinger/CompositionEngine/src/OutputLayer.cpp
bool OutputLayer::requiresClientComposition() const {
    const auto& state = getState();
    // hwcCompositionType defaults to Composition::INVALID
    return !state.hwc || state.hwc->hwcCompositionType == Composition::CLIENT;
}

So the conclusions are:

  1. By default, all Layers go through Device composition; the GPU does not participate.
  2. The GPU participates in composition only when some Layer must be composed by the GPU.

Next, let's analyze point 2.
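The check above can be condensed into a sketch. OutputLayerState is a simplified stand-in; std::optional models the "no HWC state" case:

```cpp
#include <algorithm>
#include <optional>
#include <vector>

enum class Composition { INVALID, CLIENT, DEVICE };

struct OutputLayerState {
    std::optional<Composition> hwcType;  // empty when the layer has no HWC state
};

// Mirror of OutputLayer::requiresClientComposition: a layer needs the GPU if
// it has no HWC state at all, or if the HWC already marked it CLIENT.
bool requiresClientComposition(const OutputLayerState& s) {
    return !s.hwcType || *s.hwcType == Composition::CLIENT;
}

// Mirror of Output::anyLayersRequireClientComposition.
bool anyLayersRequireClientComposition(const std::vector<OutputLayerState>& layers) {
    return std::any_of(layers.begin(), layers.end(), requiresClientComposition);
}
```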

2. getDeviceCompositionChanges
status_t HWComposer::getDeviceCompositionChanges(
        HalDisplayId displayId, bool frameUsesClientComposition,
        std::optional<std::chrono::steady_clock::time_point> earliestPresentTime,
        nsecs_t expectedPresentTime, Fps frameInterval,
        std::optional<android::HWComposer::DeviceRequestedChanges>* outChanges) {

    ...
    // Check whether Validate can be skipped
    const bool canSkipValidate = [&] {
         ...
    }();

    displayData.validateWasSkipped = false;

    if (canSkipValidate) {
        // Run presentOrValidate
        error = hwcDisplay->presentOrValidate(expectedPresentTime, frameInterval.getPeriodNsecs(),
                                              &numTypes, &numRequests, &outPresentFence, &state);
        if (!hasChangesError(error)) {
            RETURN_IF_HWC_ERROR_FOR("presentOrValidate", error, displayId, UNKNOWN_ERROR);
        }
        if (state == 1) { // Present Succeeded.
            std::unordered_map<HWC2::Layer*, sp<Fence>> releaseFences;
            error = hwcDisplay->getReleaseFences(&releaseFences);
            displayData.releaseFences = std::move(releaseFences);
            displayData.lastPresentFence = outPresentFence;
            displayData.validateWasSkipped = true;
            displayData.presentError = error;
            return NO_ERROR;
        }
        // Present failed but Validate ran.
    } else {
        // Run validate
        error = hwcDisplay->validate(expectedPresentTime, frameInterval.getPeriodNsecs(), &numTypes,
                                     &numRequests);
    }
 
    android::HWComposer::DeviceRequestedChanges::ChangedTypes changedTypes;
    changedTypes.reserve(numTypes);
    error = hwcDisplay->getChangedCompositionTypes(&changedTypes);

    auto displayRequests = static_cast<hal::DisplayRequest>(0);
    android::HWComposer::DeviceRequestedChanges::LayerRequests layerRequests;
    layerRequests.reserve(numRequests);
    error = hwcDisplay->getRequests(&displayRequests, &layerRequests);

    DeviceRequestedChanges::ClientTargetProperty clientTargetProperty;
    error = hwcDisplay->getClientTargetProperty(&clientTargetProperty);

    DeviceRequestedChanges::LayerLuts layerLuts;
    error = hwcDisplay->getRequestedLuts(&layerLuts, mLutFileDescriptorMapper);
    // 2. Save the result of presentOrValidate/validate into DeviceRequestedChanges
    outChanges->emplace(DeviceRequestedChanges{std::move(changedTypes), std::move(displayRequests),
                                               std::move(layerRequests),
                                               std::move(clientTargetProperty),
                                               std::move(layerLuts)});
    error = hwcDisplay->acceptChanges();
    ...
}
  1. Depending on whether validate can be skipped, one of two branches is taken: presentOrValidate or validate.
  2. The result of presentOrValidate/validate is saved into DeviceRequestedChanges.

Let's look at point 1 first:

presentOrValidate and validate follow roughly the same logic; here we analyze presentOrValidate.

Error Display::presentOrValidate(nsecs_t expectedPresentTime, int32_t frameIntervalNs,
                                 uint32_t* outNumTypes, uint32_t* outNumRequests,
                                 sp<android::Fence>* outPresentFence, uint32_t* state) {
    auto intError =
            mComposer.presentOrValidateDisplay(mId, expectedPresentTime, frameIntervalNs, &numTypes,
                                               &numRequests, &presentFenceFd, state);
}

hwcDisplay->presentOrValidate calls mComposer's presentOrValidateDisplay:

Error AidlComposer::presentOrValidateDisplay(Display display, nsecs_t expectedPresentTime,
                                             int32_t frameIntervalNs, uint32_t* outNumTypes,
                                             uint32_t* outNumRequests, int* outPresentFence,
                                             uint32_t* state) {
    const auto displayId = translate<int64_t>(display);
    // 1. Get the Writer
    auto writer = getWriter(display);
    auto reader = getReader(display);
    if (writer && reader) {
        // 2. Issue the Binder call, sending the data to the HWComposer's Binder server
        writer->get().presentOrvalidateDisplay(displayId,
                                               ClockMonotonicTimestamp{expectedPresentTime},
                                               frameIntervalNs);
        error = execute(display);
    } 

    // 3. Parse and save the data returned by the Binder call
    const auto result = reader->get().takePresentOrValidateStage(displayId);
    
    // PresentOrValidate::Result::Presented means composition has already finished
    // When all Layers use Device composition, composition is done at this point!
    if (*result == PresentOrValidate::Result::Presented) {
        auto fence = reader->get().takePresentFence(displayId);
        // take ownership
        *outPresentFence = fence.get();
        *fence.getR() = -1;
    }
    // The remaining Layers need GPU composition
    if (*result == PresentOrValidate::Result::Validated) {
        reader->get().hasChanges(displayId, outNumTypes, outNumRequests);
    }
}
  1. Remember the Writer mentioned above? All of hwcLayer's set functions (such as setBuffer) stage their data into this Writer.
  2. execute performs the actual Binder call, sending the data to the HWComposer server.
Error AidlComposer::execute(Display display) {
    ...
    auto status = mAidlComposerClient->executeCommands(commands, &results);
    ...    
}

Once this Binder call completes, the HWComposer server performs the composition.

  1. The Reader parses the data returned by the Binder server:
  • If PresentOrValidate::Result::Presented is returned, composition has finished.
  • If PresentOrValidate::Result::Validated is returned, the remaining composition waits until the GPU composition result is submitted to the HWComposer, which then performs the final composition.

So, if no Layer needs GPU composition, composition is already finished at this point.

This covers the core flow of prepareFrame. Next, let's continue with the next step of present: finishFrame.

4. finishFrame -- GPU Composition

ftl::Future<std::monostate> Output::present(
        const compositionengine::CompositionRefreshArgs& refreshArgs) {
    ...

    // 4. Perform GPU composition
    finishFrame(std::move(result));
    ...
    return future;
}

The implementation of finishFrame is as follows:

void Output::finishFrame(GpuCompositionResult&& result) {


    std::optional<base::unique_fd> optReadyFence;
    std::shared_ptr<renderengine::ExternalTexture> buffer;
    base::unique_fd bufferFence;
    // strategyPrediction was set to CompositionStrategyPredictionState::DISABLED earlier,
    // so the else branch is taken
    if (outputState.strategyPrediction == CompositionStrategyPredictionState::SUCCESS) {
        optReadyFence = std::move(result.fence);
    } else {
        if (result.bufferAvailable()) {
            buffer = std::move(result.buffer);
            bufferFence = std::move(result.fence);
        } else {
            updateProtectedContentState();
            // 1. Dequeue a buffer before GPU composition
            if (!dequeueRenderBuffer(&bufferFence, &buffer)) {
                return;
            }
        }
        // 2. GPU composition
        optReadyFence = composeSurfaces(Region::INVALID_REGION, buffer, bufferFence);
    }
    ...
    
    // 3. Queue the composed buffer back to the BufferQueue
    mRenderSurface->queueBuffer(std::move(*optReadyFence), getHdrSdrRatio(buffer));
}

GPU composition is very similar to the App rendering described earlier: the producer dequeues a buffer from the BufferQueue, renders into it, queues the buffer back into the BufferQueue, and notifies the consumer.
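The dequeue → render → queue → acquire round trip can be modeled with a toy queue. TinyBufferQueue is hypothetical; buffers are plain ints instead of GraphicBuffers:

```cpp
#include <deque>
#include <stdexcept>

// Minimal stand-in for the BufferQueue round trip in finishFrame: the
// producer dequeues a free buffer, the GPU "renders" into it, queueBuffer
// hands it to the consumer side, and the consumer eventually returns it.
class TinyBufferQueue {
public:
    explicit TinyBufferQueue(int slots) {
        for (int i = 0; i < slots; ++i) mFree.push_back(i);
    }
    int dequeueBuffer() {  // producer side (RenderSurface via Surface)
        if (mFree.empty()) throw std::runtime_error("no free buffer");
        int b = mFree.front();
        mFree.pop_front();
        return b;
    }
    void queueBuffer(int b) { mQueued.push_back(b); }  // producer side
    int acquireBuffer() {  // consumer side (FramebufferSurface)
        int b = mQueued.front();
        mQueued.pop_front();
        return b;
    }
    void releaseBuffer(int b) { mFree.push_back(b); }  // consumer returns it
private:
    std::deque<int> mFree, mQueued;
};
```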

1. dequeueBuffer

bool Output::dequeueRenderBuffer(base::unique_fd* bufferFence,
                                 std::shared_ptr<renderengine::ExternalTexture>* tex) {
    const auto& outputState = getState();

    if (outputState.usesClientComposition || outputState.flipClientTarget) {
        // mRenderSurface, as the producer, dequeues a GraphicBuffer from the BufferQueue
        *tex = mRenderSurface->dequeueBuffer(bufferFence);
        if (*tex == nullptr) {
            return false;
        }
    }
    return true;
}

The BufferQueue was created during the composition initialization phase (see: juejin.cn/post/753501...) precisely for this moment. When the BufferQueue was created, the corresponding producer (RenderSurface) and consumer (FramebufferSurface) were created along with it.

The implementation of mRenderSurface->dequeueBuffer is as follows:

std::shared_ptr<renderengine::ExternalTexture> RenderSurface::dequeueBuffer(
        base::unique_fd* bufferFence) {
 
    ANativeWindowBuffer* buffer = nullptr;
    // 1. Dequeue a buffer through mNativeWindow
    status_t result = mNativeWindow->dequeueBuffer(mNativeWindow.get(), &buffer, &fd);


    sp<GraphicBuffer> newBuffer = GraphicBuffer::from(buffer);

    std::shared_ptr<renderengine::ExternalTexture> texture;

    for (auto it = mTextureCache.begin(); it != mTextureCache.end(); it++) {
        const auto& cachedTexture = *it;
        if (cachedTexture->getBuffer()->getId() == newBuffer->getId()) {
            texture = cachedTexture;
            mTextureCache.erase(it);
            break;
        }
    }

    if (texture) {
        mTexture = texture;
    } else {
        // 2. Wrap the GraphicBuffer in an ExternalTexture
        mTexture = std::make_shared<
                renderengine::impl::ExternalTexture>(GraphicBuffer::from(buffer),
                                                     mCompositionEngine.getRenderEngine(),
                                                     renderengine::impl::ExternalTexture::Usage::
                                                             WRITEABLE);
    }
    mTextureCache.push_back(mTexture);
    if (mTextureCache.size() > mMaxTextureCacheSize) {
        mTextureCache.erase(mTextureCache.begin());
    }

    *bufferFence = base::unique_fd(fd);
    // Return the ExternalTexture
    return mTexture;
}
  1. From the analysis in juejin.cn/post/753501..., we know that mNativeWindow is essentially a Surface object, so mNativeWindow->dequeueBuffer is implemented in Surface:
int Surface::dequeueBuffer(android_native_buffer_t** buffer, int* fenceFd) {
    ...
    status_t result = mGraphicBufferProducer->dequeueBuffer(&buf, &fence, dqInput.width,
                                                            dqInput.height, dqInput.format,
                                                            dqInput.usage, &mBufferAge,
                                                            dqInput.getTimestamps ?
                                                                    &frameTimestamps : nullptr);
    ...
}

An earlier article (juejin.cn/post/752699...) mentioned that mGraphicBufferProducer is the producer interface the BufferQueue exposes to the Surface, so mGraphicBufferProducer->dequeueBuffer obtains a GraphicBuffer from the BufferQueue. The implementation of mGraphicBufferProducer->dequeueBuffer was analyzed in a previous article; see juejin.cn/post/752982...

  2. Once the GraphicBuffer is obtained, it is wrapped in an ExternalTexture and returned.
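The texture-cache lookup in dequeueBuffer boils down to "reuse the wrapper if the buffer id matches, otherwise create one, and bound the cache". Texture and getOrCreateTexture below are simplified stand-ins for ExternalTexture and the mTextureCache logic:

```cpp
#include <cstdint>
#include <memory>
#include <vector>

struct Texture { uint64_t bufferId; };

// Sketch of the mTextureCache lookup in RenderSurface::dequeueBuffer: reuse an
// existing wrapper when the same GraphicBuffer comes back from the queue,
// otherwise wrap the new buffer; the cache is bounded by evicting the oldest.
std::shared_ptr<Texture> getOrCreateTexture(
        std::vector<std::shared_ptr<Texture>>& cache, uint64_t bufferId,
        size_t maxSize) {
    std::shared_ptr<Texture> tex;
    for (auto it = cache.begin(); it != cache.end(); ++it) {
        if ((*it)->bufferId == bufferId) {
            tex = *it;
            cache.erase(it);  // re-inserted at the back below
            break;
        }
    }
    if (!tex) tex = std::make_shared<Texture>(Texture{bufferId});
    cache.push_back(tex);
    if (cache.size() > maxSize) cache.erase(cache.begin());
    return tex;
}
```

Reusing the wrapper matters because creating an ExternalTexture can involve importing the buffer into the render engine, which is expensive to repeat every frame.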

2. GPU Composition

std::optional<base::unique_fd> Output::composeSurfaces(
        const Region& debugRegion, std::shared_ptr<renderengine::ExternalTexture> tex,
        base::unique_fd& fd) {

    renderengine::DisplaySettings clientCompositionDisplay =
            generateClientCompositionDisplaySettings(tex);
    // Get the GPU RenderEngine
    auto& renderEngine = getCompositionEngine().getRenderEngine();
    const bool supportsProtectedContent = renderEngine.supportsProtectedContent();
    std::vector<LayerFE*> clientCompositionLayersFE;
    // 1. Collect the layers that need GPU composition, wrapped as LayerFE::LayerSettings
    std::vector<LayerFE::LayerSettings> clientCompositionLayers =
            generateClientCompositionRequests(supportsProtectedContent,
                                              clientCompositionDisplay.outputDataspace,
                                              clientCompositionLayersFE);
    appendRegionFlashRequests(debugRegion, clientCompositionLayers);

    OutputCompositionState& outputCompositionState = editState();
    ...

    std::vector<renderengine::LayerSettings> clientRenderEngineLayers;
    clientRenderEngineLayers.reserve(clientCompositionLayers.size());
    // 2. Convert each GPU-composed Layer's LayerFE::LayerSettings into a renderengine::LayerSettings and append it to clientRenderEngineLayers
    std::transform(clientCompositionLayers.begin(), clientCompositionLayers.end(),
                   std::back_inserter(clientRenderEngineLayers),
                   [](LayerFE::LayerSettings& settings) -> renderengine::LayerSettings {
                       return settings;
                   });

    const nsecs_t renderEngineStart = systemTime();
    // 3. Have renderEngine drive the GPU composition; the result ends up in tex
    // tex is an ExternalTexture, a wrapper around a GraphicBuffer
    auto fenceResult = renderEngine
                               .drawLayers(clientCompositionDisplay, clientRenderEngineLayers, tex,
                                           std::move(fd))
                               .get();

    ...
    for (auto* clientComposedLayer : clientCompositionLayersFE) {
        clientComposedLayer->setWasClientComposed(fence);
    }

    return base::unique_fd(fence->dup());
}
  1. The information of each Layer that needs GPU composition is packed into LayerFE::LayerSettings.
  2. The LayerFE::LayerSettings are then converted into the object RenderEngine needs: renderengine::LayerSettings.
  3. Finally, the renderengine::LayerSettings are submitted to the renderEngine, which was created during composition initialization (see juejin.cn/post/753501...). We take RenderEngineThreaded as an example:
ftl::Future<FenceResult> RenderEngineThreaded::drawLayers(
        const DisplaySettings& display, const std::vector<LayerSettings>& layers,
        const std::shared_ptr<ExternalTexture>& buffer, base::unique_fd&& bufferFence) {
 
    const auto resultPromise = std::make_shared<std::promise<FenceResult>>();
    std::future<FenceResult> resultFuture = resultPromise->get_future();
    int fd = bufferFence.release();
    {
        std::lock_guard lock(mThreadMutex);
        mNeedsPostRenderCleanup = true;
        mFunctionCalls.push(
                [resultPromise, display, layers, buffer, fd](renderengine::RenderEngine& instance) {         
                    // instance here is the real RenderEngine passed in when RenderEngineThreaded was created
                    instance.updateProtectedContext(layers, {buffer.get()});
                    instance.drawLayersInternal(std::move(resultPromise), display, layers, buffer,
                                                base::unique_fd(fd));
                });
    }
    mCondition.notify_one();
    return resultFuture;
}

By default, instance is a SkiaRenderEngine. The implementation details of SkiaRenderEngine::drawLayersInternal live in frameworks/native/libs/renderengine/skia/SkiaRenderEngine.cpp; we won't discuss them here. Ultimately, the composition result is stored in the ExternalTexture that was dequeued in step 1.
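The thread hand-off in RenderEngineThreaded (queue a closure, return a future, let a worker thread fulfill the promise) can be sketched in isolation. WorkerQueue is a hypothetical class using std::promise in place of the ftl types:

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

// Sketch of the RenderEngineThreaded pattern: callers enqueue a closure and
// get a future back; a single worker thread drains the queue and fulfills
// each promise. drawLayers captures its result promise the same way.
class WorkerQueue {
public:
    WorkerQueue() : mThread([this] { run(); }) {}
    ~WorkerQueue() {
        { std::lock_guard<std::mutex> l(mMutex); mDone = true; }
        mCv.notify_one();
        mThread.join();
    }
    std::future<int> submit(std::function<int()> fn) {
        auto p = std::make_shared<std::promise<int>>();
        auto fut = p->get_future();
        {
            std::lock_guard<std::mutex> l(mMutex);
            mWork.push([p, fn] { p->set_value(fn()); });
        }
        mCv.notify_one();  // mirrors mCondition.notify_one() in drawLayers
        return fut;
    }
private:
    void run() {
        std::unique_lock<std::mutex> l(mMutex);
        while (true) {
            mCv.wait(l, [this] { return mDone || !mWork.empty(); });
            if (mWork.empty() && mDone) return;
            auto job = std::move(mWork.front());
            mWork.pop();
            l.unlock();  // run the job without holding the queue lock
            job();
            l.lock();
        }
    }
    std::mutex mMutex;
    std::condition_variable mCv;
    std::queue<std::function<void()>> mWork;
    bool mDone = false;
    std::thread mThread;
};
```

This design keeps all GPU work serialized on one thread while callers like composeSurfaces stay free to continue until they actually need the fence from the future.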

After GPU composition completes, the result (the ExternalTexture) is queued back into the BufferQueue via mRenderSurface->queueBuffer.

3. Queuing the composed buffer back to the BufferQueue

The implementation of mRenderSurface->queueBuffer is as follows:

void RenderSurface::queueBuffer(base::unique_fd readyFence, float hdrSdrRatio) {
    auto& state = mDisplay.getState();

    if (state.usesClientComposition || state.flipClientTarget) {
       
        if (mTexture == nullptr) {
            ALOGE("No buffer is ready for display [%s]", mDisplay.getName().c_str());
        } else {
            // 1. Queue the buffer into the BufferQueue
            status_t result = mNativeWindow->queueBuffer(mNativeWindow.get(),
                                                         mTexture->getBuffer()->getNativeBuffer(),
                                                         dup(readyFence));
           ...

            mTexture = nullptr;
        }
    }
    // 2. Have the consumer consume the buffer
    status_t result = mDisplaySurface->advanceFrame(hdrSdrRatio);
}
  1. As mentioned earlier, mNativeWindow's actual type is Surface, and Surface holds the BufferQueue's producer interface mGraphicBufferProducer, so the buffer can be queued into the BufferQueue through mGraphicBufferProducer. The source of mGraphicBufferProducer->dequeueBuffer was analyzed earlier (juejin.cn/post/752982...). We won't analyze mGraphicBufferProducer->queueBuffer here; the code is in frameworks/native/libs/gui/BufferQueueProducer.cpp if you want to dig in. For the overall composition flow, it is a secondary concern.

  2. Notify the consumer to consume the buffer: mDisplaySurface->advanceFrame.

status_t FramebufferSurface::advanceFrame(float hdrSdrRatio) {
    Mutex::Autolock lock(mMutex);

    BufferItem item;
    // 1. FramebufferSurface, as the BufferQueue's consumer,
    // can acquire the composed buffer from the BufferQueue via its acquireBufferLocked interface
    status_t err = acquireBufferLocked(&item, 0);
    if (mCurrentBufferSlot != BufferQueue::INVALID_BUFFER_SLOT &&
        item.mSlot != mCurrentBufferSlot) {
        mHasPendingRelease = true;
        mPreviousBufferSlot = mCurrentBufferSlot;
        mPreviousBuffer = mCurrentBuffer;
    }
    mCurrentBufferSlot = item.mSlot;
    mCurrentBuffer = mSlots[mCurrentBufferSlot].mGraphicBuffer;
    mCurrentFence = item.mFence;
    mDataspace = static_cast<Dataspace>(item.mDataSpace);

    // assume HWC has previously seen the buffer in this slot
    sp<GraphicBuffer> hwcBuffer = sp<GraphicBuffer>(nullptr);
    if (mCurrentBuffer->getId() != mHwcBufferIds[mCurrentBufferSlot]) {
        mHwcBufferIds[mCurrentBufferSlot] = mCurrentBuffer->getId();
        hwcBuffer = mCurrentBuffer; // HWC hasn't previously seen this buffer in this slot
    }
    // 2. Stage the GPU-composed buffer in the HWComposer's Writer
    status_t result = mHwc.setClientTarget(mDisplayId, mCurrentBufferSlot, mCurrentFence, hwcBuffer,
                                           mDataspace, hdrSdrRatio);
     ...
}
  1. As noted in juejin.cn/post/753501..., FramebufferSurface is the BufferQueue's consumer (it derives from ConsumerBase), so it can obtain the GraphicBuffer, i.e. the result the GPU just composed, from the BufferQueue via acquireBufferLocked:

The implementation of acquireBufferLocked is as follows:

status_t ConsumerBase::acquireBufferLocked(BufferItem *item,
        nsecs_t presentWhen, uint64_t maxFrameNumber) {
    ...
    // mConsumer here is an IGraphicBufferConsumer instance
    status_t err = mConsumer->acquireBuffer(item, presentWhen, maxFrameNumber);
    if (item->mGraphicBuffer != nullptr) {
        if (mSlots[item->mSlot].mGraphicBuffer != nullptr) {
            freeBufferLocked(item->mSlot);
        }
        mSlots[item->mSlot].mGraphicBuffer = item->mGraphicBuffer;
    }

    mSlots[item->mSlot].mFrameNumber = item->mFrameNumber;
    mSlots[item->mSlot].mFence = item->mFence;

    return OK;
}

mConsumer->acquireBuffer is implemented in frameworks/native/libs/gui/BufferQueueConsumer.cpp; the details were already analyzed in juejin.cn/post/753467...

  2. The call chain of mHwc.setClientTarget is as follows:
    HWComposer::setClientTarget
        --> Display::setClientTarget
            --> AidlComposer::setClientTarget

AidlComposer::setClientTarget stages the buffer in the Writer:

Error AidlComposer::setClientTarget(Display display, uint32_t slot, const sp<GraphicBuffer>& target,
                                    int acquireFence, Dataspace dataspace,
                                    const std::vector<IComposerClient::Rect>& damage,
                                    float hdrSdrRatio) {
    const native_handle_t* handle = nullptr;
    if (target.get()) {
        handle = target->getNativeBuffer()->handle;
    }
    mMutex.lock_shared();
    if (auto writer = getWriter(display)) {
        writer->get()
                .setClientTarget(translate<int64_t>(display), slot, handle, acquireFence,
                                 translate<aidl::android::hardware::graphics::common::Dataspace>(
                                         dataspace),
                                 translate<AidlRect>(damage), hdrSdrRatio);
    } 
    return error;
}

The Writer's setClientTarget is implemented as follows:

    //hardware/interfaces/graphics/composer/aidl/include/android/hardware/graphics/composer3/ComposerClientWriter.h
    void setClientTarget(int64_t display, uint32_t slot, const native_handle_t* target,
                         int acquireFence, Dataspace dataspace, const std::vector<Rect>& damage,
                         float hdrSdrRatio) {
        ClientTarget clientTargetCommand;
        // The buffer produced by GPU composition
        clientTargetCommand.buffer = getBufferCommand(slot, target, acquireFence);
        clientTargetCommand.dataspace = dataspace;
        clientTargetCommand.damage.assign(damage.begin(), damage.end());
        clientTargetCommand.hdrSdrRatio = hdrSdrRatio;
        getDisplayCommand(display).clientTarget.emplace(std::move(clientTargetCommand));
    }

That wraps up finishFrame within Output::present. Next, let's look at the last key method of Output::present: presentFrameAndReleaseLayers.

5. presentFrameAndReleaseLayers

cpp
void Output::presentFrameAndReleaseLayers(bool flushEvenWhenDisabled) {
    ...
    // 1. Submit the composition result to the HWC
    auto frame = presentFrame();

    mRenderSurface->onPresentDisplayCompleted();
    // 2. Iterate over all OutputLayers and release their buffers
    for (auto* layer : getOutputLayersOrderedByZ()) {
        // The layer buffer from the previous frame (if any) is released
        // by HWC only when the release fence from this frame (if any) is
        // signaled.  Always get the release fence from HWC first.
        sp<Fence> releaseFence = Fence::NO_FENCE;

        if (auto hwcLayer = layer->getHwcLayer()) {
            if (auto f = frame.layerFences.find(hwcLayer); f != frame.layerFences.end()) {
                releaseFence = f->second;
            }
        }

        if (outputState.usesClientComposition) {
            releaseFence =
                    Fence::merge("LayerRelease", releaseFence, frame.clientTargetAcquireFence);
        }
        layer->getLayerFE().setReleaseFence(releaseFence);
        layer->getLayerFE().setReleasedBuffer(layer->getLayerFE().getCompositionState()->buffer);
    }
    for (auto& weakLayer : mReleasedLayers) {
        if (const auto layer = weakLayer.promote()) {
            layer->setReleaseFence(frame.presentFence);
        }
    }

    // Clear out the released layers now that we're done with them.
    mReleasedLayers.clear();
}

1. Submitting the GPU composition result to the HWC

presentFrame resolves to the following call chain

text
    Display::presentFrame
        --> HWComposer::presentAndGetReleaseFences
          --> Display::present
            --> AidlComposer::presentDisplay     

AidlComposer::presentDisplay is implemented as follows

cpp
Error AidlComposer::presentDisplay(Display display, int* outPresentFence) {
    const auto displayId = translate<int64_t>(display);

    Error error = Error::NONE;
    mMutex.lock_shared();
    // Fetch the Writer/Reader pair for this display
    auto writer = getWriter(display);
    auto reader = getReader(display);
    if (writer && reader) {
        writer->get().presentDisplay(displayId);
        // Binder call: flush the commands staged in the Writer (including the
        // GPU composition result) to the HWComposer server
        error = execute(display);
    } else {
        error = Error::BAD_DISPLAY;
    }
    if (error != Error::NONE) {
        mMutex.unlock_shared();
        return error;
    }

    auto fence = reader->get().takePresentFence(displayId);
    mMutex.unlock_shared();
    // Hand ownership of the present-fence fd over to the caller
    *outPresentFence = fence.get();
    *fence.getR() = -1;
    return Error::NONE;
}

The method worth focusing on here is execute

cpp
Error AidlComposer::execute(Display display) {
    auto writer = getWriter(display);
    auto reader = getReader(display);
    if (!writer || !reader) {
        return Error::BAD_DISPLAY;
    }

    // Take the commands staged in the Writer
    auto commands = writer->get().takePendingCommands();

    { // scope for results
        std::vector<CommandResultPayload> results;
        // Binder call: send the batch to the HWComposer server, which executes the present
        auto status = mAidlComposerClient->executeCommands(commands, &results);
        if (!status.isOk()) {
            ALOGE("executeCommands failed %s", status.getDescription().c_str());
            return static_cast<Error>(status.getServiceSpecificError());
        }

        reader->get().parse(std::move(results));
    }
    return Error::NONE;
}
  1. Take the pending commands staged in the Writer: the buffer produced by GPU composition
  2. Send them to the HWComposer server over a binder call to perform the present; the server-side HWComposer implementation varies from one device vendor to another.

Once the submission is done, the resources are released. That concludes presentFrameAndReleaseLayers, and with it the analysis of Output::present: layer composition, covering both hardware composition and GPU composition, is now complete. To summarize:

  1. If even one of the layers participating in composition has to be composited by the GPU, a GPU pass is required
  2. If every layer can be composited in hardware, hardware composition is preferred, and a single composition pass suffices
  3. If the GPU participates as well, composition effectively runs twice: the GPU pass first, then the hardware pass that presents its output together with the remaining layers

Back to CompositionEngine::present

cpp
void CompositionEngine::present(CompositionRefreshArgs& args) {
    // Call onPreComposition on every LayerFE to record the composition start time
    preComposition(args);

    {
        LayerFESet latchedLayers;
        // As mentioned earlier: each Output represents one display device
        for (const auto& output : args.outputs) {
            // Key point 1: Output prepare
            output->prepare(args, latchedLayers);
        }
    }

    offloadOutputs(args.outputs);
    // ftl::Future<T> is an asynchronous task
    ui::DisplayVector<ftl::Future<std::monostate>> presentFutures;
    for (const auto& output : args.outputs) {
        // For every display device, queue output->present into the futures container
        presentFutures.push_back(output->present(args));
    }

    {
        for (auto& future : presentFutures) {
            // Wait for each output->present to complete
            future.get();
        }
    }
    postComposition(args);
}

Only one method is left now: postComposition.

CompositionEngine::postComposition

cpp
// If a buffer is latched but the layer is not presented, such as when
// obscured by another layer, the previous buffer needs to be released. We find
// these buffers and fire a NO_FENCE to release it. This ensures that all
// promises for buffer releases are fulfilled at the end of composition.
void CompositionEngine::postComposition(CompositionRefreshArgs& args) {

    for (auto& layerFE : args.layers) {
        if (layerFE->getReleaseFencePromiseStatus() ==
            LayerFE::ReleaseFencePromiseStatus::INITIALIZED) {
            layerFE->setReleaseFence(Fence::NO_FENCE);
        }
    }

    // List of layersWithQueuedFrames does not necessarily overlap with
    // list of layers, so those layersWithQueuedFrames also need any
    // unfulfilled promises to be resolved for completeness.
    for (auto& layerFE : args.layersWithQueuedFrames) {
        if (layerFE->getReleaseFencePromiseStatus() ==
            LayerFE::ReleaseFencePromiseStatus::INITIALIZED) {
            layerFE->setReleaseFence(Fence::NO_FENCE);
        }
    }
}

postComposition is mainly about resolving outstanding fences. A fence is a synchronization primitive that indicates whether an asynchronous operation (such as GPU rendering) has completed; if a layer buffer's fence is never set to Fence::NO_FENCE, the system considers that buffer still in use and cannot release it.
