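Android Hardware Acceleration
This article is based on Android 11.
Starting with Android 3.0 (API level 11), the Android 2D rendering pipeline supports hardware acceleration, meaning that all drawing operations performed on a View's canvas use the GPU. Because enabling hardware acceleration requires additional resources, your app will consume more memory.
Software-based drawing: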
- Invalidate the hierarchy
- Draw the hierarchy
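Software drawing has to do a lot of work on every draw. For example, if a Button sits on top of another View and the Button calls invalidate(), the system also redraws that View even though nothing about it has changed.
Hardware-accelerated drawing: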
- Invalidate the hierarchy
- Record and update display lists
- Draw the display lists
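Unlike software drawing, hardware drawing does not execute drawing operations immediately. The UI thread records them into display lists, and the RenderThread executes the recorded commands. Compared with software drawing, only the dirty Views, that is, the Views on which invalidate() was called, need to be recorded and updated; the other Views reuse what is already recorded in their display lists.
The implementation lives in the hwui module.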
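The reuse of display lists can be illustrated with a small sketch. It is not hwui code: `ToyView`, `drawFrame`, and the string-based "commands" are hypothetical names made up for this example, and the only point is that a frame re-records the Views that were invalidated while replaying the cached lists of all the others.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a View that owns a cached display list.
struct ToyView {
    std::string name;
    bool dirty = true;                     // nothing has been recorded yet
    std::vector<std::string> displayList;  // recorded drawing commands
    int recordCount = 0;                   // how often this view was re-recorded

    explicit ToyView(std::string n) : name(std::move(n)) {}

    void invalidate() { dirty = true; }

    // Re-record only when dirty; otherwise keep the cached commands.
    void updateDisplayListIfDirty() {
        if (!dirty) return;
        displayList.assign(1, "drawRect(" + name + ")");
        ++recordCount;
        dirty = false;
    }
};

// One frame: refresh the dirty display lists, then replay every list.
inline std::vector<std::string> drawFrame(std::vector<ToyView>& views) {
    std::vector<std::string> issued;
    for (ToyView& v : views) {
        v.updateDisplayListIfDirty();
        assert(!v.dirty);  // every list is up to date before it is replayed
        issued.insert(issued.end(), v.displayList.begin(), v.displayList.end());
    }
    return issued;
}
```

If a background View and a Button are drawn once and only the Button is invalidated, the second frame re-records the Button alone, yet both cached lists are still replayed.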
hwui UML:
1. RenderProxy
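RenderProxy is the interface that hwui exposes to applications. The application layer calls RenderProxy through ThreadedRenderer. Internally, RenderProxy holds a RenderThread, a CanvasContext, and a DrawFrameTask: CanvasContext is what actually operates on the frame, and DrawFrameTask wraps CanvasContext's capabilities.
ThreadedRenderer extends HardwareRenderer, and HardwareRenderer holds the mNativeProxy field, a reference to the RenderProxy in the native hwui module.
RenderProxy provides APIs such as setSurface() and syncAndDrawFrame() for the application to use.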
2. RenderThread
java
//ThreadedRenderer.java
void draw(View view, AttachInfo attachInfo, DrawCallbacks callbacks) {
final Choreographer choreographer = attachInfo.mViewRootImpl.mChoreographer;
choreographer.mFrameInfo.markDrawStart();
updateRootDisplayList(view, callbacks);
// register animating rendernodes which started animating prior to renderer
// creation, which is typical for animators started prior to first draw
if (attachInfo.mPendingAnimatingRenderNodes != null) {
final int count = attachInfo.mPendingAnimatingRenderNodes.size();
for (int i = 0; i < count; i++) {
registerAnimatingRenderNode(
attachInfo.mPendingAnimatingRenderNodes.get(i));
}
attachInfo.mPendingAnimatingRenderNodes.clear();
// We don't need this anymore as subsequent calls to
// ViewRootImpl#attachRenderNodeAnimator will go directly to us.
attachInfo.mPendingAnimatingRenderNodes = null;
}
int syncResult = syncAndDrawFrame(choreographer.mFrameInfo);
if ((syncResult & SYNC_LOST_SURFACE_REWARD_IF_FOUND) != 0) {
Log.w("OpenGLRenderer", "Surface lost, forcing relayout");
// We lost our surface. For a relayout next frame which should give us a new
// surface from WindowManager, which hopefully will work.
attachInfo.mViewRootImpl.mForceNextWindowRelayout = true;
attachInfo.mViewRootImpl.requestLayout();
}
if ((syncResult & SYNC_REDRAW_REQUESTED) != 0) {
attachInfo.mViewRootImpl.invalidate();
}
}
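On devices with hardware acceleration, drawing starts a new thread, RenderThread, which does the drawing work. RenderThread here means hwui's native RenderThread, which extends the Thread class, not the Java-layer ThreadedRenderer class; the Java-layer ThreadedRenderer can be thought of as an interface to RenderThread.
ThreadedRenderer's draw method has two main steps.
- updateRootDisplayList updates the DisplayList. There are two cases, LAYER_TYPE_SOFTWARE and LAYER_TYPE_HARDWARE:
  - LAYER_TYPE_SOFTWARE: drawBitmap; each View caches a Bitmap object, mDrawingCache.
  - LAYER_TYPE_HARDWARE: update the DisplayList.
- syncAndDrawFrame syncs and submits the draw request: "Syncs the RenderNode tree to the render thread and requests a frame to be drawn." Through the mNativeProxy reference described above, syncAndDrawFrame calls RenderProxy's syncAndDrawFrame method; RenderProxy posts a new task to the RenderThread, which executes DrawFrameTask's run() method.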
3. ReliableSurface
Once the Surface has been initialized, it can be passed to hwui objects such as RenderProxy, CanvasContext, and IRenderPipeline.
java
//ViewRootImpl.java
private void performTraversals() {
boolean surfaceCreated = !hadSurface && mSurface.isValid();
boolean surfaceDestroyed = hadSurface && !mSurface.isValid();
boolean surfaceReplaced = (surfaceGenerationId != mSurface.getGenerationId())
&& mSurface.isValid();
if (surfaceCreated) {
if (mAttachInfo.mThreadedRenderer != null) {
hwInitialized = mAttachInfo.mThreadedRenderer.initialize(mSurface);
if (hwInitialized && (host.mPrivateFlags
& View.PFLAG_REQUEST_TRANSPARENT_REGIONS) == 0) {
// Don't pre-allocate if transparent regions
// are requested as they may not be needed
mAttachInfo.mThreadedRenderer.allocateBuffers();
}
}
} else if (surfaceDestroyed) {
if (mAttachInfo.mThreadedRenderer != null &&
mAttachInfo.mThreadedRenderer.isEnabled()) {
mAttachInfo.mThreadedRenderer.destroy();
}
} else if (surfaceReplaced
|| surfaceSizeChanged || windowRelayoutWasForced || colorModeChanged) {
mAttachInfo.mThreadedRenderer.updateSurface(mSurface);
}
}
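ViewRootImpl checks whether the surface was created (surfaceCreated), destroyed (surfaceDestroyed), or updated (surfaceReplaced or otherwise changed). Creation, destruction, and update all end up in the same method: setSurface(null) on destruction, and setSurface(mSurface) on creation and update.
mThreadedRenderer passes mSurface through RenderProxy to CanvasContext, which updates its mNativeSurface member, declared as std::unique_ptr<ReliableSurface> mNativeSurface;.
ReliableSurface holds the member ANativeWindow* mWindow; and acts as a decorator around ANativeWindow. ANativeWindow exposes extension hooks that let ReliableSurface add behavior to a Surface object dynamically, without changing the existing object structure. In its init() method, ReliableSurface registers interceptors through those hooks, inserting its own methods into the Surface's interface. By intercepting and managing ANativeWindow operations it tightens control over the graphics buffers, which improves stability and rendering results: for example, it can check that a buffer's state is valid, try to recover or warn when an operation fails, and optimize how buffers are allocated and released.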
cpp
//ReliableSurface.cpp
void ReliableSurface::init() {
int result = ANativeWindow_setCancelBufferInterceptor(mWindow, hook_cancelBuffer, this);
LOG_ALWAYS_FATAL_IF(result != NO_ERROR, "Failed to set cancelBuffer interceptor: error = %d",
result);
result = ANativeWindow_setDequeueBufferInterceptor(mWindow, hook_dequeueBuffer, this);
LOG_ALWAYS_FATAL_IF(result != NO_ERROR, "Failed to set dequeueBuffer interceptor: error = %d",
result);
result = ANativeWindow_setQueueBufferInterceptor(mWindow, hook_queueBuffer, this);
LOG_ALWAYS_FATAL_IF(result != NO_ERROR, "Failed to set queueBuffer interceptor: error = %d",
result);
result = ANativeWindow_setPerformInterceptor(mWindow, hook_perform, this);
LOG_ALWAYS_FATAL_IF(result != NO_ERROR, "Failed to set perform interceptor: error = %d",
result);
result = ANativeWindow_setQueryInterceptor(mWindow, hook_query, this);
LOG_ALWAYS_FATAL_IF(result != NO_ERROR, "Failed to set query interceptor: error = %d",
result);
}
ANativeWindow provides extension functions such as ANativeWindow_setCancelBufferInterceptor, ANativeWindow_setDequeueBufferInterceptor, and ANativeWindow_setQueueBufferInterceptor. ReliableSurface uses them to put its own hook_cancelBuffer, hook_dequeueBuffer, hook_queueBuffer, and related methods in place of the native Surface's implementation.
cpp
//ANativeWindow.cpp
int ANativeWindow_setDequeueBufferInterceptor(ANativeWindow* window,
ANativeWindow_dequeueBufferInterceptor interceptor,
void* data) {
return window->perform(window, NATIVE_WINDOW_SET_DEQUEUE_INTERCEPTOR, interceptor, data);
}
This is the extension interface that ANativeWindow provides; it forwards to the window's perform function pointer.
cpp
//window.h
int (*perform)(struct ANativeWindow* window,
int operation, ... );
Surface, as the implementation of the ANativeWindow interface, implements the perform method.
cpp
//Surface.cpp
int Surface::perform(int operation, va_list args)
{
int res = NO_ERROR;
switch (operation) {
case NATIVE_WINDOW_SET_DEQUEUE_INTERCEPTOR:
res = dispatchAddDequeueInterceptor(args);
break;
}
return res;
}
int Surface::dispatchAddDequeueInterceptor(va_list args) {
ANativeWindow_dequeueBufferInterceptor interceptor =
va_arg(args, ANativeWindow_dequeueBufferInterceptor);
void* data = va_arg(args, void*);
std::lock_guard<std::shared_mutex> lock(mInterceptorMutex);
mDequeueInterceptor = interceptor;
mDequeueInterceptorData = data;
return NO_ERROR;
}
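The variadic dispatch used by perform can be tried in isolation. The sketch below is a simplified model under stated assumptions, not the libgui code: `ToyWindow`, `ToyOperation`, `ToyInterceptor`, and `addOffset` are hypothetical names. It only shows the mechanism of packing a function pointer and a `void*` into `...` and unpacking them with `va_arg` in the same order.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical operation code and interceptor signature, for illustration only.
enum ToyOperation { TOY_SET_DEQUEUE_INTERCEPTOR = 1 };
typedef int (*ToyInterceptor)(void* data, int slot);

struct ToyWindow {
    ToyInterceptor interceptor = nullptr;
    void* interceptorData = nullptr;

    // Unpack the variadic arguments in the order the caller passed them.
    int dispatchSetInterceptor(va_list args) {
        interceptor = va_arg(args, ToyInterceptor);
        interceptorData = va_arg(args, void*);
        return 0;
    }

    // Single entry point that multiplexes operations by an integer code.
    int perform(int operation, ...) {
        va_list args;
        va_start(args, operation);
        int res = -1;  // unknown operation
        switch (operation) {
            case TOY_SET_DEQUEUE_INTERCEPTOR:
                res = dispatchSetInterceptor(args);
                break;
            default:
                break;
        }
        va_end(args);
        return res;
    }
};

// Typed wrapper, playing the role of a setter built on top of perform.
inline int setDequeueInterceptor(ToyWindow* w, ToyInterceptor fn, void* data) {
    assert(w != nullptr);
    return w->perform(TOY_SET_DEQUEUE_INTERCEPTOR, fn, data);
}

// Example interceptor: adds the int stored behind data to the slot number.
inline int addOffset(void* data, int slot) {
    return slot + *static_cast<int*>(data);
}
```

The typed wrapper matters because `...` carries no type information: the caller and the `va_arg` reads have to agree on both the types and the order of the arguments.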
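dispatchAddDequeueInterceptor assigns ReliableSurface's hook_dequeueBuffer implementation to Surface's mDequeueInterceptor member. In its own hook_dequeueBuffer, Surface checks whether the interceptor is null and, if it is not, calls the interceptor.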
cpp
//Surface.cpp
int Surface::hook_dequeueBuffer(ANativeWindow* window,
ANativeWindowBuffer** buffer, int* fenceFd) {
Surface* c = getSelf(window);
{
std::shared_lock<std::shared_mutex> lock(c->mInterceptorMutex);
if (c->mDequeueInterceptor != nullptr) {
auto interceptor = c->mDequeueInterceptor;
auto data = c->mDequeueInterceptorData;
return interceptor(window, Surface::dequeueBufferInternal, data, buffer, fenceFd);
}
}
return c->dequeueBuffer(buffer, fenceFd);
}
Surface's hook_dequeueBuffer is bound to ANativeWindow's dequeueBuffer function pointer in the Surface constructor, so from then on every dequeueBuffer call goes through the hook_dequeueBuffer that ReliableSurface inserted.
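The interception pattern itself fits in a few lines. This is a minimal sketch with hypothetical names (`ToySurface`, `CallCounter`, `countingInterceptor`); locking is left out for brevity, whereas the real Surface guards the interceptor with a shared mutex. The key idea is that the interceptor receives the original implementation as a function pointer, so it can add behavior and then forward the call.

```cpp
#include <cassert>

struct ToySurface;
typedef int (*DequeueFn)(ToySurface* self, int* outSlot);
typedef int (*DequeueInterceptor)(ToySurface* self, DequeueFn original, void* data,
                                  int* outSlot);

struct ToySurface {
    DequeueInterceptor interceptor = nullptr;
    void* interceptorData = nullptr;
    int nextSlot = 0;

    // The "real" implementation: hands out increasing slot numbers.
    static int dequeueInternal(ToySurface* self, int* outSlot) {
        *outSlot = self->nextSlot++;
        return 0;
    }

    void setInterceptor(DequeueInterceptor fn, void* data) {
        interceptor = fn;
        interceptorData = data;
    }

    // Public entry point: route through the interceptor when one is installed.
    int dequeue(int* outSlot) {
        assert(outSlot != nullptr);
        if (interceptor != nullptr) {
            return interceptor(this, &ToySurface::dequeueInternal, interceptorData, outSlot);
        }
        return dequeueInternal(this, outSlot);
    }
};

// Decorator state and behavior: count the calls, then forward to the original.
struct CallCounter {
    int calls = 0;
};

inline int countingInterceptor(ToySurface* self, DequeueFn original, void* data,
                               int* outSlot) {
    static_cast<CallCounter*>(data)->calls++;
    return original(self, outSlot);
}
```

Callers keep using `dequeue` and never see whether an interceptor is installed, which is what lets the wrapper be added without changing the wrapped object's interface.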
4. IRenderPipeline
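As described above, the application-layer ViewRootImpl instantiates the Surface object and passes it to the hwui module through the RenderProxy interface. CanvasContext and IRenderPipeline need the Surface to start drawing. Android supports two rendering pipelines, OpenGL and Vulkan; this section covers the OpenGL implementation, SkiaOpenGLPipeline. SkiaOpenGLPipeline manages the OpenGL ES context through EGL, a cross-platform interface that can be regarded as the interface OpenGL ES offers to applications.
For setSurface(mSurface), SkiaOpenGLPipeline ultimately calls eglCreateWindowSurface through EglManager, with the window object mSurface as a parameter. EGL creates a new EGLSurface object and connects it to the producer interface of the window object's BufferQueue. From then on, rendering to that EGLSurface causes a buffer to be dequeued, rendered into, and queued for the consumer.
setSurface(null) is called when !mSurface.isValid(). It decides whether the current buffer should be preserved or discarded, and finally changes EGL's buffer behavior through eglSurfaceAttrib.
eglCreateWindowSurface only creates an EGLSurface. Nothing appears on screen until the application submits the current frame with eglSwapBuffersWithDamageKHR, which issues the draw commands.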
4.1 EGLSurface
Let's look at how the EGLSurface is created and how it relates to Surface.
cpp
//SkiaOpenGLPipeline.cpp
bool SkiaOpenGLPipeline::setSurface(ANativeWindow* surface, SwapBehavior swapBehavior) {
if (surface) {
mRenderThread.requireGlContext();
auto newSurface = mEglManager.createSurface(surface, mColorMode, mSurfaceColorSpace);
if (!newSurface) {
return false;
}
mEglSurface = newSurface.unwrap();
}
}
ANativeWindow* surface is passed on to EglManager.
cpp
Result<EGLSurface, EGLint> EglManager::createSurface(EGLNativeWindowType window,
ColorMode colorMode,
sk_sp<SkColorSpace> colorSpace) {
EGLSurface surface = eglCreateWindowSurface(
mEglDisplay, wideColorGamut ? mEglConfigWideGamut : mEglConfig, window, attribs);
return surface;
}
Note that the surface object is converted here from the ANativeWindow type to EGLNativeWindowType, which is defined in the EGL module.
cpp
//EGL/eglplatform.h
#elif defined(__ANDROID__) || defined(ANDROID)
struct ANativeWindow;
struct egl_native_pixmap_t;
typedef void* EGLNativeDisplayType;
typedef struct egl_native_pixmap_t* EGLNativePixmapType;
typedef struct ANativeWindow* EGLNativeWindowType;
#elif defined(USE_OZONE)
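EGL's eglplatform.h header defines EGLNativeWindowType as ANativeWindow* on Android. The native-layer Surface object, as the implementation of ANativeWindow, is passed to eglCreateWindowSurface to create the EGLSurface object, and this is the same object whose buffers eglSwapBuffersWithDamageKHR swaps later.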
5. DrawFrameTask
cpp
//DrawFrameTask.cpp
void DrawFrameTask::run() {
ATRACE_NAME("DrawFrame");
bool canUnblockUiThread;
bool canDrawThisFrame;
{
TreeInfo info(TreeInfo::MODE_FULL, *mContext);
canUnblockUiThread = syncFrameState(info);
canDrawThisFrame = info.out.canDrawThisFrame;
if (mFrameCompleteCallback) {
mContext->addFrameCompleteListener(std::move(mFrameCompleteCallback));
mFrameCompleteCallback = nullptr;
}
}
// Grab a copy of everything we need
CanvasContext* context = mContext;
std::function<void(int64_t)> callback = std::move(mFrameCallback);
mFrameCallback = nullptr;
// From this point on anything in "this" is *UNSAFE TO ACCESS*
if (canUnblockUiThread) {
unblockUiThread();
}
// Even if we aren't drawing this vsync pulse the next frame number will still be accurate
if (CC_UNLIKELY(callback)) {
context->enqueueFrameWork(
[callback, frameNr = context->getFrameNumber()]() { callback(frameNr); });
}
if (CC_LIKELY(canDrawThisFrame)) {
context->draw();
} else {
// wait on fences so tasks don't overlap next frame
context->waitOnFences();
}
if (!canUnblockUiThread) {
unblockUiThread();
}
}
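The UI thread (main thread) posts a new task to the RenderThread, which executes DrawFrameTask's run() method. The UI thread blocks until the RenderThread has synced everything it needs for drawing from the UI thread, including each RenderNode's DisplayList and RenderProperties. After the sync, the task decides whether it can call unblockUiThread to signal the UI thread; only then can the UI thread return and carry on with other work. The method to focus on is context->draw();.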
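The ordering inside DrawFrameTask::run() is easy to miss, so here is a single-threaded sketch of just that ordering. It involves no real threads or rendering; `ToySyncResult` and `runFrameTask` are hypothetical names, and the returned event list simply records which step would run when.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical summary of what the sync step reported.
struct ToySyncResult {
    bool canUnblockUiThread;
    bool canDrawThisFrame;
};

// Returns the order in which the steps of one frame task would happen.
inline std::vector<std::string> runFrameTask(const ToySyncResult& sync) {
    std::vector<std::string> events;
    events.push_back("sync");

    // Normal case: release the UI thread before the expensive draw.
    if (sync.canUnblockUiThread) events.push_back("unblock-ui");

    if (sync.canDrawThisFrame) {
        events.push_back("draw");
    } else {
        events.push_back("wait-on-fences");
    }

    // Otherwise the UI thread stays blocked until drawing has finished.
    if (!sync.canUnblockUiThread) events.push_back("unblock-ui");

    assert(events.size() == 3);  // exactly one unblock per frame
    return events;
}
```

In the common case the UI thread is released right after the sync, so recording the next frame can overlap with the RenderThread drawing the current one.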
cpp
void CanvasContext::draw() {
Frame frame = mRenderPipeline->getFrame(); // dequeueBuffer
setPresentTime();
SkRect windowDirty = computeDirtyRect(frame, &dirty);
bool drew = mRenderPipeline->draw(frame, windowDirty, dirty, mLightGeometry, &mLayerUpdateQueue,
mContentDrawBounds, mOpaque, mLightInfo, mRenderNodes,
&(profiler()));
int64_t frameCompleteNr = getFrameNumber();
waitOnFences();
bool requireSwap = false;
int error = OK;
// queueBuffer
bool didSwap =
mRenderPipeline->swapBuffers(frame, drew, windowDirty, mCurrentFrameInfo, &requireSwap);
}
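CanvasContext::draw performs a series of rendering operations and presents the result on the display.
- Get a frame: mRenderPipeline->getFrame(). Acting as the producer of the buffer queue, getFrame requests a GraphicBuffer through dequeueBuffer on the gui module's Surface object, which was passed in by the setSurface method described above.
- Compute the dirty region (the area that needs updating): computeDirtyRect(frame, &dirty).
- Draw the current frame: mRenderPipeline->draw fills the requested GraphicBuffer with data.
- Wait for all tasks to finish: waitOnFences.
- Swap buffers and submit the result: mRenderPipeline->swapBuffers. Once the buffer is filled, queueBuffer on the gui module's Surface object puts the GraphicBuffer into the queue.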
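The dequeue, fill, queue cycle in the steps above can be sketched with a toy buffer queue. This is only a model under simplifying assumptions: `ToyBufferQueue` and `produceFrame` are hypothetical names, a "buffer" is a single int, and there are no fences or threads.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct ToyBufferQueue {
    std::deque<int> freeSlots;    // buffers the producer may dequeue
    std::deque<int> queuedSlots;  // filled buffers waiting for the consumer
    std::vector<int> pixels;      // one fake "pixel value" per slot

    explicit ToyBufferQueue(int slotCount)
        : pixels(static_cast<std::size_t>(slotCount), 0) {
        assert(slotCount > 0);
        for (int i = 0; i < slotCount; ++i) freeSlots.push_back(i);
    }

    // Producer side: take a free buffer, or -1 if none is available.
    int dequeueBuffer() {
        if (freeSlots.empty()) return -1;
        int slot = freeSlots.front();
        freeSlots.pop_front();
        return slot;
    }
    void queueBuffer(int slot) { queuedSlots.push_back(slot); }

    // Consumer side: take the oldest filled buffer, or -1 if none is queued.
    int acquireBuffer() {
        if (queuedSlots.empty()) return -1;
        int slot = queuedSlots.front();
        queuedSlots.pop_front();
        return slot;
    }
    void releaseBuffer(int slot) { freeSlots.push_back(slot); }
};

// One producer frame: dequeue a buffer, "draw" into it, queue it.
inline bool produceFrame(ToyBufferQueue& q, int color) {
    int slot = q.dequeueBuffer();
    if (slot < 0) return false;  // the consumer has not released a buffer yet
    q.pixels[static_cast<std::size_t>(slot)] = color;
    q.queueBuffer(slot);
    return true;
}
```

With two slots, a third `produceFrame` call fails until the consumer acquires and releases a buffer, which is the back-pressure a producer experiences when it outruns the consumer.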
5.1 draw
mRenderPipeline->draw
cpp
void SkiaPipeline::renderFrame(const LayerUpdateQueue& layers, const SkRect& clip,
const std::vector<sp<RenderNode>>& nodes, bool opaque,
const Rect& contentDrawBounds, sk_sp<SkSurface> surface,
const SkMatrix& preTransform) {
// Initialize the canvas for the current frame, that might be a recording canvas if SKP
// capture is enabled.
SkCanvas* canvas = tryCapture(surface.get(), nodes[0].get(), layers);
// draw all layers up front
renderLayersImpl(layers, opaque);
renderFrameImpl(clip, nodes, opaque, contentDrawBounds, canvas, preTransform);
endCapture(surface.get());
if (CC_UNLIKELY(Properties::debugOverdraw)) {
renderOverdraw(clip, nodes, contentDrawBounds, surface, preTransform);
}
ATRACE_NAME("flush commands");
surface->getCanvas()->flush();
}
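- tryCapture: Returns the canvas that records the drawing commands.
- renderFrameImpl: executes the drawing commands.
- endCapture: Signal that the caller is done recording.
- surface->getCanvas()->flush(): flushes the canvas's pending drawing commands.
renderFrameImpl executes the drawing operations recorded in the DisplayList by issuing SkCanvas drawing commands, for example canvas->drawRect(bounds, layerPaint). Those operations were recorded through RecordingCanvas, which extends SkCanvas; a drawRect call on it lands in its onDrawRect method: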
cpp
void RecordingCanvas::onDrawRect(const SkRect& rect, const SkPaint& paint) {
fDL->drawRect(rect, paint);
}
fDL is declared as DisplayListData* fDL;.
cpp
void DisplayListData::drawRect(const SkRect& rect, const SkPaint& paint) {
this->push<DrawRect>(0, rect, paint);
}
cpp
template <typename T, typename... Args>
void* DisplayListData::push(size_t pod, Args&&... args) {
size_t skip = SkAlignPtr(sizeof(T) + pod);
SkASSERT(skip < (1 << 24));
if (fUsed + skip > fReserved) {
static_assert(SkIsPow2(SKLITEDL_PAGE), "This math needs updating for non-pow2.");
// Next greater multiple of SKLITEDL_PAGE.
fReserved = (fUsed + skip + SKLITEDL_PAGE) & ~(SKLITEDL_PAGE - 1);
fBytes.realloc(fReserved);
}
SkASSERT(fUsed + skip <= fReserved);
auto op = (T*)(fBytes.get() + fUsed);
fUsed += skip;
new (op) T{std::forward<Args>(args)...};
op->type = (uint32_t)T::kType;
op->skip = skip;
return op + 1;
}
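The growth arithmetic in push can be checked on its own. The sketch below assumes a page size of 4096 for the hypothetical constant `kToyPage` (the value of SKLITEDL_PAGE is not shown in the excerpt), and `grownCapacity` and `reserveFor` are illustrative helpers, not functions from hwui.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kToyPage = 4096;  // assumed page size, for illustration

constexpr bool isPow2(std::size_t v) { return v != 0 && (v & (v - 1)) == 0; }
static_assert(isPow2(kToyPage), "the mask trick only works for powers of two");

// Same expression shape as in push: the next multiple of the page size that is
// strictly greater than used + skip.
constexpr std::size_t grownCapacity(std::size_t used, std::size_t skip) {
    return (used + skip + kToyPage) & ~(kToyPage - 1);
}

// Grow only when the new op does not fit into the current reservation.
inline std::size_t reserveFor(std::size_t used, std::size_t skip, std::size_t reserved) {
    if (used + skip > reserved) reserved = grownCapacity(used, skip);
    assert(used + skip <= reserved);
    return reserved;
}
```

Because the expression adds a full page before masking, a request that lands exactly on a page boundary still rounds up to the next page, which matches the "next greater multiple" wording in the original comment.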
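fBytes, declared as SkAutoTMalloc<uint8_t> fBytes;, is the memory that holds all recorded drawing operations. DisplayListData::push appends operations to it, and displayList->draw(canvas) later reads the stored data back to perform the actual drawing: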
cpp
void DisplayListData::draw(SkCanvas* canvas) const {
SkAutoCanvasRestore acr(canvas, false);
this->map(draw_fns, canvas, canvas->getTotalMatrix());
}
The draw_fns table is generated from "DisplayListOps.in".
cpp
#define X(T) \
[](const void* op, SkCanvas* c, const SkMatrix& original) { \
((const T*)op)->draw(c, original); \
},
static const draw_fn draw_fns[] = {
#include "DisplayListOps.in"
};
#undef X
DisplayListOps.in lists all drawing operations. The X(T) macro generates a lambda that casts the const void* object to type T and calls that type's draw method to perform the drawing.
cpp
X(Flush)
X(Save)
X(Restore)
...
X(Scale)
X(Translate)
X(ClipPath)
X(ClipRect)
X(ClipRRect)
...
X(DrawPaint)
X(DrawBehind)
X(DrawPath)
X(DrawRect)
...
For example, DrawRect:
cpp
struct Op {
uint32_t type : 8;
uint32_t skip : 24;
};
struct DrawRect final : Op {
static const auto kType = Type::DrawRect;
DrawRect(const SkRect& rect, const SkPaint& paint) : rect(rect), paint(paint) {}
SkRect rect;
SkPaint paint;
void draw(SkCanvas* c, const SkMatrix&) const { c->drawRect(rect, paint); }
};
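DisplayListData::map is a template method that walks the operations stored in fBytes and dispatches each one through the function table by its type. For an entry of Type::DrawRect, this ends in a call to drawRect(rect, paint).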
cpp
template <typename Fn, typename... Args>
inline void DisplayListData::map(const Fn fns[], Args... args) const {
auto end = fBytes.get() + fUsed;
for (const uint8_t* ptr = fBytes.get(); ptr < end;) {
auto op = (const Op*)ptr;
auto type = op->type;
auto skip = op->skip;
if (auto fn = fns[type]) { // We replace no-op functions with nullptrs
fn(op, args...); // to avoid the overhead of a pointless call.
}
ptr += skip;
}
}
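Recording and replay can be put together in one self-contained sketch. It follows the same shape as the excerpts (a type/skip header, placement new into a byte buffer, an X-macro function table indexed by type), but everything in it is a simplified assumption: `ToyCanvas`, `ToyDisplayList`, and the `TOY_OPS` list macro are hypothetical, the list macro stands in for including DisplayListOps.in, and the two ops only touch an int and a string log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <string>
#include <utility>
#include <vector>

struct ToyCanvas {
    int originX = 0;
    std::vector<std::string> log;
};

enum ToyType : uint32_t { kTranslate = 0, kDrawRect = 1 };

struct ToyOp {
    uint32_t type : 8;   // index into the function table
    uint32_t skip : 24;  // bytes to the next op
};

struct Translate final : ToyOp {
    static const ToyType kType = kTranslate;
    explicit Translate(int dx) : dx(dx) {}
    int dx;
    void draw(ToyCanvas* c) const { c->originX += dx; }
};

struct DrawRect final : ToyOp {
    static const ToyType kType = kDrawRect;
    DrawRect(int w, int h) : w(w), h(h) {}
    int w, h;
    void draw(ToyCanvas* c) const {
        c->log.push_back("rect " + std::to_string(w) + "x" + std::to_string(h) +
                         " at x=" + std::to_string(c->originX));
    }
};

// Op list in ToyType order; stands in for the contents of an ".in" file.
#define TOY_OPS(X) X(Translate) X(DrawRect)

typedef void (*ToyDrawFn)(const void* op, ToyCanvas* c);
#define X(T) [](const void* op, ToyCanvas* c) { static_cast<const T*>(op)->draw(c); },
static const ToyDrawFn kToyDrawFns[] = {TOY_OPS(X)};
#undef X

struct ToyDisplayList {
    std::vector<unsigned char> bytes;

    // Record: construct the op in place at the end of the byte buffer.
    template <typename T, typename... Args>
    void push(Args&&... args) {
        const std::size_t skip = (sizeof(T) + 7) & ~static_cast<std::size_t>(7);
        const std::size_t used = bytes.size();
        bytes.resize(used + skip);
        T* op = new (bytes.data() + used) T{std::forward<Args>(args)...};
        op->type = T::kType;
        op->skip = static_cast<uint32_t>(skip);
    }

    // Replay: walk the buffer and dispatch each op through the table.
    void draw(ToyCanvas* canvas) const {
        const unsigned char* ptr = bytes.data();
        const unsigned char* const end = ptr + bytes.size();
        while (ptr < end) {
            const ToyOp* op = reinterpret_cast<const ToyOp*>(ptr);
            assert(op->type < sizeof(kToyDrawFns) / sizeof(kToyDrawFns[0]));
            assert(op->skip > 0);
            kToyDrawFns[op->type](ptr, canvas);
            ptr += op->skip;
        }
    }
};
```

Nothing is drawn while ops are pushed; the canvas is only touched when `draw` walks the buffer, which is the separation between recording on the UI thread and replay on the RenderThread that the article describes.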
5.2 swapBuffers
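Finally, SkiaOpenGLPipeline calls eglSwapBuffersWithDamageKHR through EglManager to swap the buffer contents of the given dirty region and submit the current frame. EGL works in double-buffered mode, with a back frame buffer and a front frame buffer. Normal drawing targets the back frame buffer; once rendering is done, calling eglSwapBuffersWithDamageKHR swaps the finished back frame buffer with the current front frame buffer, and the buffer's rendering through EGL is complete.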
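The front/back exchange described here can be modeled in a few lines. This is a deliberately simplified sketch with a hypothetical `ToyWindowSurface` type: it makes no EGL calls, ignores damage regions, and uses exactly two buffers, whereas a window surface backed by a BufferQueue may cycle through more.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct ToyWindowSurface {
    std::vector<int> front;  // what is currently shown
    std::vector<int> back;   // what drawing commands write to

    explicit ToyWindowSurface(std::size_t pixelCount)
        : front(pixelCount, 0), back(pixelCount, 0) {
        assert(pixelCount > 0);
    }

    // All drawing targets the back buffer only.
    void drawToBack(int color) {
        for (int& p : back) p = color;
    }

    // Presenting a frame exchanges the roles of the two buffers.
    void swapBuffers() { std::swap(front, back); }

    const std::vector<int>& visible() const { return front; }
};
```

Until `swapBuffers` is called, the visible buffer keeps its old contents no matter how much is drawn, which is why creating the surface alone shows nothing on screen.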