Android 13 - Media Framework (11) - MediaCodec (Part 1)

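MediaCodec is the standard interface for audio and video encoding and decoding on Android. Whether you use a software codec or a hardware codec, everything goes through MediaCodec, so it is a part of Android multimedia that cannot be skipped. The MediaCodec implementation runs to several thousand lines, and the header alone is several hundred. That is enough to scare off a beginner like me, and if you grit your teeth and read it line by line, it is easy to get lost among all the states and member variables. This note tries to understand MediaCodec from the angle of its design rather than just pasting the code flow, in the hope of making its principles easier to grasp.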

PS: My own understanding is limited and there are parts of MediaCodec I have not figured out yet. If you spot mistakes, please point them out.

1. Preparation

Some classes used by MediaCodec can be ignored for now. ResourceManagerServiceProxy appears to handle resource management, and Histogram appears to collect decoder performance statistics. We skip both whenever they show up.

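So far I have only read ACodec, so the parts that involve CCodec are skipped for now. As for ACodec, we only need to know which interfaces it exposes, not how it works internally.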

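MediaCodec is built on the asynchronous message mechanism (AMessage/ALooper/AHandler). If you are not familiar with it, see the earlier article "Android 13 - Media Framework - Asynchronous Message Mechanism".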

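Looking at MediaCodec's public interface, many methods handle their message synchronously: configure, setCallback, queueInputBuffer, dequeueInputBuffer and others all end up calling AMessage's postAndAwaitResponse. Why not do the work directly in the function body instead of posting a message and waiting for the reply? As mentioned in the earlier notes, the message mechanism gives us thread synchronization and keeps the object out of inconsistent states. ACodec frequently posts messages up to MediaCodec; if such an event arrived in the middle of an API call running on the caller's thread, things could easily go wrong. Routing both the commands from above and the callbacks from below through the same looper thread keeps everything in order.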
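
To make the idea concrete, here is a minimal sketch (not AOSP code) of how a blocking call can be funneled through a single handler thread. MiniLooper and postAndAwait are hypothetical names standing in for ALooper and AMessage::postAndAwaitResponse, and the sketch uses std::promise/std::future where the real code uses a reply token.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for ALooper: one thread that runs queued tasks in order.
class MiniLooper {
public:
    MiniLooper() : mThread([this] { loop(); }) {}

    ~MiniLooper() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mQuit = true;
        }
        mCond.notify_one();
        mThread.join();
    }

    // Fire-and-forget, in the spirit of AMessage::post().
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mTasks.push(std::move(task));
        }
        mCond.notify_one();
    }

    // Blocks the caller until the looper thread has produced a result.
    // Must not be called from the looper thread itself, or it would deadlock.
    int postAndAwait(std::function<int()> task) {
        auto reply = std::make_shared<std::promise<int>>();
        std::future<int> result = reply->get_future();
        post([reply, task = std::move(task)] { reply->set_value(task()); });
        return result.get();
    }

private:
    void loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mCond.wait(lock, [this] { return mQuit || !mTasks.empty(); });
                if (mTasks.empty()) return;  // only reached when quitting
                task = std::move(mTasks.front());
                mTasks.pop();
            }
            task();  // every task runs on this one thread, so no extra locking is needed
        }
    }

    std::mutex mMutex;
    std::condition_variable mCond;
    std::queue<std::function<void()>> mTasks;
    bool mQuit = false;
    std::thread mThread;  // declared last so the other members exist before it starts
};
```

Because commands and callbacks both become tasks on the same queue, they can never interleave halfway through each other, which is the property the paragraph above relies on.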

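Someone may also ask why some of MediaCodec's public methods are not asynchronous. Some of them are a poor fit for asynchronous calls, for example queueInputBuffer mentioned above; making them asynchronous would complicate the caller. I am not sure about the rest, but my guess is that the layer above, NuPlayerDecoder, is already asynchronous. If MediaCodec were asynchronous too, nesting one asynchronous layer inside another would increase the number of states to reason about and might not be worth it.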

With that, let's look at the MediaCodec implementation.

2. Creating a MediaCodec


    static sp<MediaCodec> CreateByType(
            const sp<ALooper> &looper, const AString &mime, bool encoder, status_t *err = NULL,
            pid_t pid = kNoPid, uid_t uid = kNoUid);

    static sp<MediaCodec> CreateByType(
            const sp<ALooper> &looper, const AString &mime, bool encoder, status_t *err,
            pid_t pid, uid_t uid, sp<AMessage> format);

    static sp<MediaCodec> CreateByComponentName(
            const sp<ALooper> &looper, const AString &name, status_t *err = NULL,
            pid_t pid = kNoPid, uid_t uid = kNoUid);

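MediaCodec hides its constructor and exposes three static functions for creating instances. (PS: does this count as the builder pattern? It looks closer to a static factory method.)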

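CreateByType selects a suitable codec component based on the media type (MIME type); the caller must say whether a decoder or an encoder is wanted. Note that the second CreateByType overload takes a format parameter. It is not used for configure, only for choosing the component;
CreateByComponentName creates the component that the caller names explicitly;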

All three end up calling the MediaCodec constructor and then the init method, which receives the component name.
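
The overall shape (private constructor, static creators, a separate init step) can be sketched in a few lines. This is only an illustration under my own assumptions: MiniCodec, findComponent and the "toy.*" component names are made up, and the real CreateByType consults MediaCodecList rather than a hard-coded table.

```cpp
#include <memory>
#include <string>

// Hypothetical miniature of the pattern: private constructor + static creators + init().
class MiniCodec {
public:
    // Returns nullptr and reports the error code if init() fails.
    static std::shared_ptr<MiniCodec> CreateByComponentName(
            const std::string &name, int *err = nullptr) {
        std::shared_ptr<MiniCodec> codec(new MiniCodec());
        const int ret = codec->init(name);
        if (err != nullptr) *err = ret;
        return ret == 0 ? codec : nullptr;
    }

    // Picks a component for the media type, then takes the name-based path.
    static std::shared_ptr<MiniCodec> CreateByType(
            const std::string &mime, bool encoder, int *err = nullptr) {
        const std::string name = findComponent(mime, encoder);
        if (name.empty()) {
            if (err != nullptr) *err = -1;
            return nullptr;
        }
        return CreateByComponentName(name, err);
    }

    const std::string &name() const { return mName; }

private:
    MiniCodec() = default;  // callers cannot construct directly

    int init(const std::string &name) {
        if (name.empty()) return -1;
        mName = name;
        return 0;
    }

    // Toy lookup table standing in for a codec list; the names are invented.
    static std::string findComponent(const std::string &mime, bool encoder) {
        if (mime == "video/avc") return encoder ? "toy.avc.encoder" : "toy.avc.decoder";
        if (mime == "audio/mp4a-latm" && !encoder) return "toy.aac.decoder";
        return "";
    }

    std::string mName;
};
```

Keeping the constructor private means a caller can only ever hold an object whose init step has already succeeded.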

The constructor mainly initializes member variables. Two members deserve attention, mGetCodecBase and mGetCodecInfo, both of which hold a function:

mGetCodecBase: creates the CodecBase object (ACodec/CCodec) for a component name;
mGetCodecInfo: fetches the media type information that a component supports (MediaCodecInfo);

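If the caller does not supply these two functions, MediaCodec falls back to default implementations, which we will not look at for now. The init method does the actual initialization work: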


status_t MediaCodec::init(const AString &name) {
    // save init parameters for reset
    mInitName = name;
    mCodecInfo.clear();

    bool secureCodec = false;
    const char *owner = "";
    if (!name.startsWith("android.filter.")) {
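        // 1. Look up the codecInfo for this component name
        //    (the declaration of err is elided from this excerpt)
        err = mGetCodecInfo(name, &mCodecInfo);
        // 2. Double check: the status and the returned info must both be valid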
        if (err != OK) {
            mCodec = NULL;  // remove the codec.
            return err;
        }
        if (mCodecInfo == nullptr) {
            ALOGE("Getting codec info with name '%s' failed", name.c_str());
            return NAME_NOT_FOUND;
        }
        // 3. Check whether a secure component was selected
        secureCodec = name.endsWith(".secure");
        Vector<AString> mediaTypes;
        // 4. Read the media types from codecInfo to work out what kind of component this is
        mCodecInfo->getSupportedMediaTypes(&mediaTypes);
        for (size_t i = 0; i < mediaTypes.size(); ++i) {
            if (mediaTypes[i].startsWith("video/")) {
                mDomain = DOMAIN_VIDEO;
                break;
            } else if (mediaTypes[i].startsWith("audio/")) {
                mDomain = DOMAIN_AUDIO;
                break;
            } else if (mediaTypes[i].startsWith("image/")) {
                mDomain = DOMAIN_IMAGE;
                break;
            }
        }
        // 5. Get the framework (owner) the component belongs to
        owner = mCodecInfo->getOwnerName();
    }
    // 6. Create the CodecBase from the owner name and the component name
    mCodec = mGetCodecBase(name, owner);
    // 7. A video codec gets its own dedicated looper
    if (mDomain == DOMAIN_VIDEO) {
        // video codec needs dedicated looper
        if (mCodecLooper == NULL) {
            status_t err = OK;
            mCodecLooper = new ALooper;
            mCodecLooper->setName("CodecLooper");
            err = mCodecLooper->start(false, false, ANDROID_PRIORITY_AUDIO);
        }

        mCodecLooper->registerHandler(mCodec);
    } else {
        mLooper->registerHandler(mCodec);
    }

    mLooper->registerHandler(this);
    // 8. Create the CodecCallback and the BufferCallback
    mCodec->setCallback(
            std::unique_ptr<CodecBase::CodecCallback>(
                    new CodecCallback(new AMessage(kWhatCodecNotify, this))));
    mBufferChannel = mCodec->getBufferChannel();
    mBufferChannel->setCallback(
            std::unique_ptr<CodecBase::BufferCallback>(
                    new BufferCallback(new AMessage(kWhatCodecNotify, this))));
    // 9. Calls that touch CodecBase state go through a message
    sp<AMessage> msg = new AMessage(kWhatInit, this);
    if (mCodecInfo) {
        msg->setObject("codecInfo", mCodecInfo);
    }
    msg->setString("name", name);
    sp<AMessage> response;
    err = PostAndAwaitResponse(msg, &response);
    return err;
}

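Look up the codec info from the component name passed in, then work backwards to the MIME types the component supports. You might ask why we go backwards when a MIME type was already passed in. I think this exists for CreateByComponentName: when only a component name is given, MediaCodec does not know whether it is an audio or a video component, nor whether the system supports it at all, so it checks here. For CreateByType the step is redundant, because the component name was already obtained from MediaCodecList for the requested MIME type before init was called;
Get the framework the component belongs to. MediaCodec currently sits on top of two codec frameworks, the older ACodec/OMX one and the newer CCodec one. The owner name tells them apart, and the matching CodecBase (ACodec or CCodec) is created from it;
Call mGetCodecBase with the owner name and the component name to create the CodecBase object; we only look at ACodec for now;
Register the AHandlers with an ALooper. Note that a video codec gets a looper of its own, while MediaCodec itself and audio codecs share the looper passed in from above;
Register the callback messages with ACodec and the BufferChannel;
Post a message so that kWhatInit is handled on the looper thread, where ACodec's initiateAllocateComponent is called. Note that the message is sent with PostAndAwaitResponse, which wraps AMessage::postAndAwaitResponse and blocks until the message has been answered.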
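
The domain detection and looper choice from the steps above can be restated as two small functions. This is a sketch with std::string in place of AString; Domain, domainFromMediaTypes and needsDedicatedLooper are names I made up for the illustration.

```cpp
#include <string>
#include <vector>

enum class Domain { Unknown, Video, Audio, Image };

// Mirrors the loop in init(): the first media type with a known prefix decides the domain.
inline Domain domainFromMediaTypes(const std::vector<std::string> &mediaTypes) {
    for (const std::string &type : mediaTypes) {
        // rfind(prefix, 0) == 0 is the usual pre-C++20 "starts with" idiom.
        if (type.rfind("video/", 0) == 0) return Domain::Video;
        if (type.rfind("audio/", 0) == 0) return Domain::Audio;
        if (type.rfind("image/", 0) == 0) return Domain::Image;
    }
    return Domain::Unknown;
}

// Only video codecs get a dedicated looper; everything else shares the caller's looper.
inline bool needsDedicatedLooper(Domain domain) {
    return domain == Domain::Video;
}
```

Note that the first recognizable media type wins, so a component that lists an audio type before a video type would be treated as audio.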

Next, let's see how kWhatInit is handled:


        case kWhatInit:
        {
            // 1. Check the state
            if (mState != UNINITIALIZED) {
                PostReplyWithError(msg, INVALID_OPERATION);
                break;
            }
            // 2. Is another call still waiting for its reply?
            if (mReplyID) {
                // If so, park this message until that reply has been sent
                mDeferredMessages.push_back(msg);
                break;
            }
            sp<AReplyToken> replyID;
            CHECK(msg->senderAwaitsResponse(&replyID));

            mReplyID = replyID;
            // 3. Enter the new state
            setState(INITIALIZING);

            sp<RefBase> codecInfo;
            (void)msg->findObject("codecInfo", &codecInfo);
            AString name;
            CHECK(msg->findString("name", &name));

            sp<AMessage> format = new AMessage;
            if (codecInfo) {
                format->setObject("codecInfo", codecInfo);
            }
            format->setString("componentName", name);
            // 4. Call into CodecBase
            mCodec->initiateAllocateComponent(format);
            break;
        }

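Check the current MediaCodec state. As I see it, the state serves two purposes: it validates that a method is being called in a legal state, and it lets the handler react differently to the same event depending on the state, for example when a callback arrives.
MediaCodec has intermediate states such as INITIALIZING here, meaning that init is still in progress. There is also mDeferredMessages, which stores messages whose handling is postponed. When does that happen? Suppose we are handling a flush call from above: after calling ACodec's asynchronous method we wait for its reply, and in the meantime a message from the BufferChannel arrives on the looper thread. We must let the ACodec flush finish before handling the BufferChannel message, because flush invalidates all buffers. My reading is that mDeferredMessages implements a kind of priority: while a command from above is outstanding, other messages wait until it has completed.
Call setState to set the new state. Besides updating the state, it also resets flags and other bookkeeping; what the flags do is covered later;
Call CodecBase's initiateAllocateComponent. From this we can roughly tell that creating a component essentially needs only one piece of information, the component name.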

From the code we can see that the format passed to initiateAllocateComponent carries two entries:

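codecInfo: the codecInfo that belongs to the component. If it is NULL, no matching entry was found in MediaCodecList, which means the system does not support this component;
componentName: the component name; ACodec creates the OMX component from it.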

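A note on secure versus regular components: most of the time we create regular components. To get a secure one, the usual way is to call CreateByComponentName with the explicit name; CreateByType does not seem to offer a way to choose between secure and non-secure.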

Once ACodec has created the component, it invokes a callback that ends MediaCodec's blocking wait. The available callbacks are listed in CodecBase::CodecCallback; the one used here is onComponentAllocated.

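When reading onMessageReceived, keep in mind that the message's own what() identifies which module the message comes from, while the "what" field stored inside the message identifies the specific event from that module.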
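
A two-level dispatch of this kind can be sketched as follows. MiniMessage, the enum values and the returned strings are all made up for the illustration; only the structure (outer id first, then the "what" field) follows the text above.

```cpp
#include <map>
#include <string>

// Hypothetical message: an outer id plus named int fields, loosely like AMessage.
struct MiniMessage {
    int what = 0;                       // outer id: which module or category
    std::map<std::string, int> fields;  // payload, may contain an inner "what"
};

enum OuterWhat { kWhatInit = 1, kWhatCodecNotify = 2 };                          // invented values
enum InnerWhat { kWhatComponentAllocated = 10, kWhatComponentConfigured = 11 };  // invented values

inline std::string dispatch(const MiniMessage &msg) {
    switch (msg.what) {
        case kWhatInit:
            return "command: init";
        case kWhatCodecNotify: {
            // Second level: the callback stored the concrete event in the "what" field.
            const auto it = msg.fields.find("what");
            if (it == msg.fields.end()) break;
            switch (it->second) {
                case kWhatComponentAllocated:
                    return "codec: component allocated";
                case kWhatComponentConfigured:
                    return "codec: component configured";
            }
            break;
        }
    }
    return "unhandled";
}
```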


                case kWhatComponentAllocated:
                {
                    if (mState == RELEASING || mState == UNINITIALIZED) {
                        // In case a kWhatError or kWhatRelease message came in and replied,
                        // we log a warning and ignore.
                        ALOGW("allocate interrupted by error or release, current state %d/%s",
                              mState, stateString(mState).c_str());
                        break;
                    }
                    // Check the current state, move to the new state and set the flag
                    CHECK_EQ(mState, INITIALIZING);
                    setState(INITIALIZED);
                    mFlags |= kFlagIsComponentAllocated;
                    // Read back the name of the component that was created
                    CHECK(msg->findString("componentName", &mComponentName));

                    const char *owner = mCodecInfo ? mCodecInfo->getOwnerName() : "";
                    if (mComponentName.startsWith("OMX.google.")
                            && strncmp(owner, "default", 8) == 0) {
                        mFlags |= kFlagUsesSoftwareRenderer;
                    } else {
                        mFlags &= ~kFlagUsesSoftwareRenderer;
                    }
                    mOwnerName = owner;

                    if (mComponentName.endsWith(".secure")) {
                        mFlags |= kFlagIsSecure;
                        mediametrics_setInt32(mMetricsHandle, kCodecSecure, 1);
                    } else {
                        mFlags &= ~kFlagIsSecure;
                        mediametrics_setInt32(mMetricsHandle, kCodecSecure, 0);
                    }
                    // End the blocking call and handle the deferred messages
                    postPendingRepliesAndDeferredMessages("kWhatComponentAllocated");
                    break;
                }

Handling kWhatComponentAllocated is fairly simple: update the state, update the MediaCodec flags, end the blocking call and process the deferred messages.
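
Two details of that flag update are easy to miss, so here is a small sketch of them. It uses std::string instead of AString, the flag values are invented, and the component names in the usage are made up. The point about strncmp is that comparing 8 bytes against "default" (7 characters) includes the terminating NUL, so only an owner that is exactly "default" matches.

```cpp
#include <cstring>
#include <string>

// Invented bit values; the real constants live in MediaCodec.h.
constexpr unsigned kFlagUsesSoftwareRenderer = 1u << 0;
constexpr unsigned kFlagIsSecure = 1u << 1;

// n = 8 covers the 7 characters of "default" plus the NUL terminator,
// so "default2" or "defaults" do not match.
inline bool isDefaultOwner(const char *owner) {
    return std::strncmp(owner, "default", 8) == 0;
}

inline bool endsWith(const std::string &s, const std::string &suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Same set-or-clear structure as the kWhatComponentAllocated handler.
inline unsigned updateFlags(unsigned flags, const std::string &component, const char *owner) {
    if (component.rfind("OMX.google.", 0) == 0 && isDefaultOwner(owner)) {
        flags |= kFlagUsesSoftwareRenderer;
    } else {
        flags &= ~kFlagUsesSoftwareRenderer;
    }
    if (endsWith(component, ".secure")) {
        flags |= kFlagIsSecure;
    } else {
        flags &= ~kFlagIsSecure;
    }
    return flags;
}
```

Each flag is explicitly cleared in the else branch, which matters because the same MediaCodec object can be reset and initialized again with a different component.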


void MediaCodec::postPendingRepliesAndDeferredMessages(
        std::string origin, status_t err /* = OK */) {
    sp<AMessage> response{new AMessage};
    if (err != OK) {
        response->setInt32("err", err);
    }
    postPendingRepliesAndDeferredMessages(origin, response);
}

void MediaCodec::postPendingRepliesAndDeferredMessages(
        std::string origin, const sp<AMessage> &response) {
    LOG_ALWAYS_FATAL_IF(
            !mReplyID,
            "postPendingRepliesAndDeferredMessages: mReplyID == null, from %s following %s",
            origin.c_str(),
            mLastReplyOrigin.c_str());
    mLastReplyOrigin = origin;
    // Send the result back to the blocked caller
    response->postReply(mReplyID);
    mReplyID.clear();
    ALOGV_IF(!mDeferredMessages.empty(),
            "posting %zu deferred messages", mDeferredMessages.size());
    // Re-post every deferred message so that it gets handled
    for (sp<AMessage> msg : mDeferredMessages) {
        msg->post();
    }
    mDeferredMessages.clear();
}
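
Putting the deferral check in kWhatInit together with the re-posting above, the mechanism can be sketched like this. MiniHandler is a hypothetical toy: it records what happens in a log and re-runs deferred commands inline, whereas the real code re-posts them to the looper.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical miniature of the "one pending reply + deferred queue" idea.
class MiniHandler {
public:
    // A command that needs an asynchronous answer from the codec.
    // Returns true if it started, false if it was parked behind a pending command.
    bool onCommand(const std::string &command) {
        if (mPending) {
            mDeferred.push_back(command);
            return false;
        }
        mPending = true;
        mLog.push_back("start " + command);
        return true;
    }

    // The codec finished the pending command: reply first, then retry everything parked.
    void onCodecDone() {
        mLog.push_back("reply");
        mPending = false;
        std::deque<std::string> deferred;
        deferred.swap(mDeferred);
        for (const std::string &command : deferred) {
            onCommand(command);  // the first one starts, the rest are parked again
        }
    }

    const std::vector<std::string> &log() const { return mLog; }

private:
    bool mPending = false;              // plays the role of mReplyID
    std::deque<std::string> mDeferred;  // plays the role of mDeferredMessages
    std::vector<std::string> mLog;
};
```

The effect is that commands needing a reply are processed strictly one at a time, in arrival order, even though the codec answers asynchronously.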

3. configure

MediaCodec provides two overloads of configure:


    status_t configure(
            const sp<AMessage> &format,
            const sp<Surface> &nativeWindow,
            const sp<ICrypto> &crypto,
            uint32_t flags);

    status_t configure(
            const sp<AMessage> &format,
            const sp<Surface> &nativeWindow,
            const sp<ICrypto> &crypto,
            const sp<IDescrambler> &descrambler,
            uint32_t flags);

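format: format information of the data to be encoded or decoded;
nativeWindow: the window (Surface) used for display. setSurface can switch to a different Surface later, although as far as I know that only works when the codec was configured with a Surface in the first place;
crypto, descrambler: used for encrypted content and descrambled playback;
flags: there are currently two flags, CONFIGURE_FLAG_ENCODE (marks whether this codec is used for encoding or decoding) and CONFIGURE_FLAG_USE_BLOCK_MODEL (I do not yet know what it is for).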

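Let me first raise a question: what information must the arguments passed to configure contain? What is required and what is optional? Let's read the configure implementation next, ignoring the mediametrics parts for now.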


status_t MediaCodec::configure(
        const sp<AMessage> &format,
        const sp<Surface> &surface,
        const sp<ICrypto> &crypto,
        const sp<IDescrambler> &descrambler,
        uint32_t flags) {
    sp<AMessage> msg = new AMessage(kWhatConfigure, this);
    ...
    if (mIsVideo) {
    	format->findString("log-session-id", &mLogSessionId);
	    format->findInt32("width", &mVideoWidth);
	    format->findInt32("height", &mVideoHeight);
	    
        if (mVideoWidth < 0 || mVideoHeight < 0 ||
               (uint64_t)mVideoWidth * mVideoHeight > (uint64_t)INT32_MAX / 4) {
            ALOGE("Invalid size(s), width=%d, height=%d", mVideoWidth, mVideoHeight);
            return BAD_VALUE;
        }
    } else {
        if (nextMetricsHandle != 0) {
            int32_t channelCount;
            if (format->findInt32(KEY_CHANNEL_COUNT, &channelCount)) {
            }
            int32_t sampleRate;
            if (format->findInt32(KEY_SAMPLE_RATE, &sampleRate)) {
            }
        }
    }
    ......
    msg->setMessage("format", format);
    msg->setInt32("flags", flags);
    msg->setObject("surface", surface);

    if (crypto != NULL || descrambler != NULL) {
        if (crypto != NULL) {
            msg->setPointer("crypto", crypto.get());
        } else {
            msg->setPointer("descrambler", descrambler.get());
        }
    } 
    ......
}

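As the code shows, for video the width and height are read from the format and validated: negative values, or a width * height product above INT32_MAX / 4, make configure return BAD_VALUE. In practice a video decoder should always be given a width and a height (the values may be rough guesses, but they should be present). Note that the excerpt does not check the return value of findInt32, so this particular check rejects bad values rather than missing ones. Audio may need the channel count and sample rate, but nothing here fails if they are absent.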
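
The size check is worth isolating, because the cast is what makes it safe. Below is a standalone restatement of the same arithmetic; isValidVideoSize is a name I made up, and my guess is that the division by 4 leaves headroom for buffer-size calculations further down.

```cpp
#include <cstdint>

// Same arithmetic as the check in configure(): widen to 64 bits before multiplying
// so the product cannot overflow, then compare against INT32_MAX / 4.
inline bool isValidVideoSize(int32_t width, int32_t height) {
    if (width < 0 || height < 0) return false;
    return static_cast<uint64_t>(width) * static_cast<uint64_t>(height)
            <= static_cast<uint64_t>(INT32_MAX) / 4;
}
```

With 32-bit arithmetic, a product such as 65536 * 65536 would wrap around and could slip past a naive comparison; in 64 bits it is simply a large number that fails the check.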


        case kWhatConfigure:
        {
            if (mState != INITIALIZED) {
                PostReplyWithError(msg, INVALID_OPERATION);
                break;
            }

            if (mReplyID) {
                mDeferredMessages.push_back(msg);
                break;
            }
            sp<AReplyToken> replyID;
            CHECK(msg->senderAwaitsResponse(&replyID));
            // Look up the surface in msg
            sp<RefBase> obj;
            CHECK(msg->findObject("surface", &obj));
            // Look up the format in msg
            sp<AMessage> format;
            CHECK(msg->findMessage("format", &format));
            // Push blank buffers on shutdown so the last frame is not left on screen
            int32_t push;
            if (msg->findInt32("push-blank-buffers-on-shutdown", &push) && push != 0) {
                mFlags |= kFlagPushBlankBuffersOnShutdown;
            }
            // Set up the surface
            if (obj != NULL) {
                if (!format->findInt32(KEY_ALLOW_FRAME_DROP, &mAllowFrameDroppingBySurface)) {
                    // allow frame dropping by surface by default
                    mAllowFrameDroppingBySurface = true;
                }

                format->setObject("native-window", obj);
                status_t err = handleSetSurface(static_cast<Surface *>(obj.get()));
                if (err != OK) {
                    PostReplyWithError(replyID, err);
                    break;
                }
            } else {
                // we are not using surface so this variable is not used, but initialize sensibly anyway
                mAllowFrameDroppingBySurface = false;

                handleSetSurface(NULL);
            }

            uint32_t flags;
            CHECK(msg->findInt32("flags", (int32_t *)&flags));
            // Decide from the flags whether to use the block model; it seems to require async mode and looks like a Codec2 feature
            if (flags & CONFIGURE_FLAG_USE_BLOCK_MODEL) {
                if (!(mFlags & kFlagIsAsync)) {
                    PostReplyWithError(replyID, INVALID_OPERATION);
                    break;
                }
                mFlags |= kFlagUseBlockModel;
            }
            mReplyID = replyID;
            setState(CONFIGURING);

            void *crypto;
            if (!msg->findPointer("crypto", &crypto)) {
                crypto = NULL;
            }
            // Decryption and descrambling are done by the buffer channel
            mCrypto = static_cast<ICrypto *>(crypto);
            mBufferChannel->setCrypto(mCrypto);
            void *descrambler;
            if (!msg->findPointer("descrambler", &descrambler)) {
                descrambler = NULL;
            }
            mDescrambler = static_cast<IDescrambler *>(descrambler);
            mBufferChannel->setDescrambler(mDescrambler);
            // If the encoder flag is set, record that in the format
            format->setInt32("flags", flags);
            if (flags & CONFIGURE_FLAG_ENCODE) {
                format->setInt32("encoder", true);
                mFlags |= kFlagIsEncoder;
            }
            // Extract the CSD (codec-specific data) from the format
            extractCSD(format);
            // Is this tunnel mode?
            int32_t tunneled;
            if (format->findInt32("feature-tunneled-playback", &tunneled) && tunneled != 0) {
                ALOGI("Configuring TUNNELED video playback.");
                mTunneled = true;
            } else {
                mTunneled = false;
            }
            // Background playback?
            int32_t background = 0;
            if (format->findInt32("android._background-mode", &background) && background) {
                androidSetThreadPriority(gettid(), ANDROID_PRIORITY_BACKGROUND);
            }

            mCodec->initiateConfigureComponent(format);
            break;
        }

Here is a summary of what some of these format keys and flags do:

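push-blank-buffers-on-shutdown: do not keep the last frame on screen when playback ends;
CONFIGURE_FLAG_USE_BLOCK_MODEL: used by CCodec;
CONFIGURE_FLAG_ENCODE: marks encoder versus decoder. Without it the codec is treated as a decoder; with it, an encoder entry is added to the format;
feature-tunneled-playback: configures tunnel mode;
android._background-mode: possibly for background playback; in the code above it lowers the priority of the current thread;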
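
The flag handling in kWhatConfigure boils down to a couple of bit tests. The sketch below restates it with made-up names and illustrative values (kEncode, kUseBlockModel and ConfigureResult are not AOSP identifiers); the one rule taken from the code above is that the block model is rejected unless the codec is in asynchronous mode.

```cpp
#include <cstdint>

// Illustrative stand-ins for CONFIGURE_FLAG_ENCODE and CONFIGURE_FLAG_USE_BLOCK_MODEL.
constexpr uint32_t kEncode = 1u << 0;
constexpr uint32_t kUseBlockModel = 1u << 1;

struct ConfigureResult {
    bool ok;             // false corresponds to replying INVALID_OPERATION
    bool isEncoder;
    bool useBlockModel;
};

inline ConfigureResult applyConfigureFlags(uint32_t flags, bool isAsync) {
    ConfigureResult result{true, false, false};
    if (flags & kUseBlockModel) {
        if (!isAsync) {
            result.ok = false;  // block model without async mode is refused
            return result;
        }
        result.useBlockModel = true;
    }
    if (flags & kEncode) {
        result.isEncoder = true;  // no flag means decoder
    }
    return result;
}
```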

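Besides the entries above, what else does configure need in the format? That depends on how configure is written in the CodecBase implementation. Taking ACodec as the example, mime is also required, and configure fails without it. We will come back to this when we read ACodec.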

The input format is parsed out by the Extractor, but not every entry in it is used for configure.

When CodecBase has finished, it reports the result through a callback:


void CodecCallback::onComponentConfigured(
        const sp<AMessage> &inputFormat, const sp<AMessage> &outputFormat) {
    sp<AMessage> notify(mNotify->dup());
    notify->setInt32("what", kWhatComponentConfigured);
    notify->setMessage("input-format", inputFormat);
    notify->setMessage("output-format", outputFormat);
    notify->post();
}

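onComponentConfigured takes two parameters: inputFormat is the format that was set on the component (information about the input data), and outputFormat is the format the component reports back (the format of the output data).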
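
The callback follows a simple pattern: keep one notification template, copy it for every event, fill in the event-specific fields and post the copy. Here is a sketch of that pattern with made-up types; Notification and MiniCodecCallback are hypothetical, and a std::function stands in for posting to the looper.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

struct Notification {
    int target;                                 // outer id, copied from the template
    std::map<std::string, std::string> fields;  // event-specific payload
};

class MiniCodecCallback {
public:
    MiniCodecCallback(Notification notifyTemplate,
                      std::function<void(const Notification &)> post)
        : mNotify(std::move(notifyTemplate)), mPost(std::move(post)) {}

    void onComponentConfigured(const std::string &inputFormat,
                               const std::string &outputFormat) {
        Notification notify = mNotify;  // copy, like mNotify->dup(), so the template stays clean
        notify.fields["what"] = "component-configured";
        notify.fields["input-format"] = inputFormat;
        notify.fields["output-format"] = outputFormat;
        mPost(notify);
    }

private:
    const Notification mNotify;
    const std::function<void(const Notification &)> mPost;
};
```

Copying the template each time means one event's fields can never leak into the next notification.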


                case kWhatComponentConfigured:
                {
                    if (mState == RELEASING || mState == UNINITIALIZED || mState == INITIALIZED) {
                        // In case a kWhatError or kWhatRelease message came in and replied,
                        // we log a warning and ignore.
                        ALOGW("configure interrupted by error or release, current state %d/%s",
                              mState, stateString(mState).c_str());
                        break;
                    }
                    CHECK_EQ(mState, CONFIGURING);

                    // reset input surface flag
                    mHaveInputSurface = false;

                    CHECK(msg->findMessage("input-format", &mInputFormat));
                    CHECK(msg->findMessage("output-format", &mOutputFormat));

                    // limit to confirming the opt-in behavior to minimize any behavioral change
                    if (mSurface != nullptr && !mAllowFrameDroppingBySurface) {
                        // signal frame dropping mode in the input format as this may also be
                        // meaningful and confusing for an encoder in a transcoder scenario
                        mInputFormat->setInt32(KEY_ALLOW_FRAME_DROP, mAllowFrameDroppingBySurface);
                    }
                    sp<AMessage> interestingFormat =
                            (mFlags & kFlagIsEncoder) ? mOutputFormat : mInputFormat;
                    ALOGV("[%s] configured as input format: %s, output format: %s",
                            mComponentName.c_str(),
                            mInputFormat->debugString(4).c_str(),
                            mOutputFormat->debugString(4).c_str());
                    int32_t usingSwRenderer;
                    if (mOutputFormat->findInt32("using-sw-renderer", &usingSwRenderer)
                            && usingSwRenderer) {
                        mFlags |= kFlagUsesSoftwareRenderer;
                    }
                    setState(CONFIGURED);
                    postPendingRepliesAndDeferredMessages("kWhatComponentConfigured");
                    ......
                    break;
                }

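We will not go into the handling of kWhatComponentConfigured in detail and only look at using-sw-renderer. If the output format carries this entry, the software renderer (SoftwareRenderer) is used. "Rendering" here probably means post-processing the output data, for example cropping, rotation and color space conversion (ColorConverter).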