Android音频学习(十八)——混音流程

在上一节有提到,音频数据会通过共享内存传到播放线程播放,如果是混音,则音频数据传入到MixerThread进行处理。当调用output打开输出流后会根据output_flag创建对应的播放线程。

thread = new MixerThread(this, outputStream, *output, mSystemReady);

创建MixerThread对象时,会创建混音器AudioMixer,实现了混音的主要逻辑。

mAudioMixer = new AudioMixer(mNormalFrameCount, mSampleRate);

mNormalFrameCount:AudioMixer会根据传进来的mNormalFrameCount作为一次输送数据的长度,把源buffer的音频数据写入目的buffer

mSampleRate:AudioMixer会把传进来的mSampleRate作为音频数据输出的采样率

MixerThread继承于PlaybackThread,当PlaybackThread的核心方法threadLoop被唤醒时,就开始拿到音频数据进行混音处理。

混音关键的三个步骤有

(1)prepareTracks_l

将需要混音的Track赋值到AudioMixer中,并配置相关混音参数Paramter

(2)threadLoop_mix

开始混音

(3)threadLoop_write

将混音后的数据写入到HAL中去

1.混音准备

准备工作主要做了3件事情:

(1)查看音频数据是否充足

(2)计算音量

(3)将外部Track属性、参数等工作设置到AudioMixer内部的Track中

设置的内容包含format、sampleRate、音量volume、音频数据输入地址mainBuffer、重采样器等等,通过setParameter设置

prepareTracks函数就是混音之前做好准备工作,为AudioMixer混音前创建一一对应的内部Track,并为Track赋值音量、buffer、format、重采样管理器等参数,为混音前做好准备工作,具体看的不是很明白,以后再补充。

2.混音

在上一节提到,PlaybackThread的threadLoop中调用了threadLoop_mix(), 其中调用了AudioMixer->process,我们先来看下process具体实现:

AudioMixer继承于AudioMixerBase, process在AudioMixerBase.h中定义如下:

void process() {

preProcess();
(this->*mHook)();

postProcess();

}

重点就是执行mHook函数指针指向的函数,下面分析下具体指向了哪个函数。

在第一步时调用AudioMixer的setParameter()时会调用 invalidate()方法,该方法在AudioMixerBase.h中定义了hook的实现。

void invalidate() {

mHook = &AudioMixerBase::process__validate;

}

再具体看下process__validate,该方法主要做了两件事情,一是为当前Track指定hook函数,即对应的混音函数;二是为AudioMixer指定mHook函数指针成员,处理当前混音器下面所有Track逻辑。

cpp 复制代码
void AudioMixerBase::process__validate()
{
    ......
    for (const auto &pair : mTracks) {
        const int name = pair.first;
        const std::shared_ptr<TrackBase> &t = pair.second;
        if (!t->enabled) continue;

        mEnabled.emplace_back(name);  // we add to mEnabled in order of name.
        mGroups[t->mainBuffer].emplace_back(name); // mGroups also in order of name.

        uint32_t n = 0;
        // FIXME can overflow (mask is only 3 bits)
        n |= NEEDS_CHANNEL_1 + t->channelCount - 1;
        if (t->doesResample()) {
            n |= NEEDS_RESAMPLE;
        }
        if (t->auxLevel != 0 && t->auxBuffer != NULL) {
            n |= NEEDS_AUX;
        }

        if (t->volumeInc[0]|t->volumeInc[1]) {
            volumeRamp = true;
        } else if (!t->doesResample() && t->volumeRL == 0) {
            n |= NEEDS_MUTE;
        }
        t->needs = n;

        if (n & NEEDS_MUTE) {
            t->hook = &TrackBase::track__nop;
        } else {
            if (n & NEEDS_AUX) {
                all16BitsStereoNoResample = false;
            }
            if (n & NEEDS_RESAMPLE) {
                all16BitsStereoNoResample = false;
                resampling = true;
                if ((n & NEEDS_CHANNEL_COUNT__MASK) == NEEDS_CHANNEL_1
                        && t->channelMask == AUDIO_CHANNEL_OUT_MONO // MONO_HACK
                        && isAudioChannelPositionMask(t->mMixerChannelMask)) {
                    t->hook = TrackBase::getTrackHook(
                            TRACKTYPE_RESAMPLEMONO, t->mMixerChannelCount,
                            t->mMixerInFormat, t->mMixerFormat);
                } else if ((n & NEEDS_CHANNEL_COUNT__MASK) >= NEEDS_CHANNEL_2
                        && t->useStereoVolume()) {
                    t->hook = TrackBase::getTrackHook(
                            TRACKTYPE_RESAMPLESTEREO, t->mMixerChannelCount,
                            t->mMixerInFormat, t->mMixerFormat);
                } else {
                    t->hook = TrackBase::getTrackHook(
                            TRACKTYPE_RESAMPLE, t->mMixerChannelCount,
                            t->mMixerInFormat, t->mMixerFormat);
                }
                ALOGV_IF((n & NEEDS_CHANNEL_COUNT__MASK) > NEEDS_CHANNEL_2,
                        "Track %d needs downmix + resample", name);
            } else {
                if ((n & NEEDS_CHANNEL_COUNT__MASK) == NEEDS_CHANNEL_1){
                    t->hook = TrackBase::getTrackHook(
                            (isAudioChannelPositionMask(t->mMixerChannelMask)  // TODO: MONO_HACK
                                    && t->channelMask == AUDIO_CHANNEL_OUT_MONO)
                                ? TRACKTYPE_NORESAMPLEMONO : TRACKTYPE_NORESAMPLE,
                            t->mMixerChannelCount,
                            t->mMixerInFormat, t->mMixerFormat);
                    all16BitsStereoNoResample = false;
                }
                if ((n & NEEDS_CHANNEL_COUNT__MASK) >= NEEDS_CHANNEL_2){
                    t->hook = TrackBase::getTrackHook(
                            t->useStereoVolume() ? TRACKTYPE_NORESAMPLESTEREO
                                    : TRACKTYPE_NORESAMPLE,
                            t->mMixerChannelCount, t->mMixerInFormat,
                            t->mMixerFormat);
                    ALOGV_IF((n & NEEDS_CHANNEL_COUNT__MASK) > NEEDS_CHANNEL_2,
                            "Track %d needs downmix", name);
                }
            }
        }
    }
    // select the processing hooks
    mHook = &AudioMixerBase::process__nop;
    if (mEnabled.size() > 0) {
        if (resampling) {
            if (mOutputTemp.get() == nullptr) {
                mOutputTemp.reset(new int32_t[MAX_NUM_CHANNELS * mFrameCount]);
            }
            if (mResampleTemp.get() == nullptr) {
                mResampleTemp.reset(new int32_t[MAX_NUM_CHANNELS * mFrameCount]);
            }
            mHook = &AudioMixerBase::process__genericResampling;
        } else {
            // we keep temp arrays around.
            mHook = &AudioMixerBase::process__genericNoResampling;
            if (all16BitsStereoNoResample && !volumeRamp) {
                if (mEnabled.size() == 1) {
                    const std::shared_ptr<TrackBase> &t = mTracks[mEnabled[0]];
                    if ((t->needs & NEEDS_MUTE) == 0) {
                        // The check prevents a muted track from acquiring a process hook.
                        //
                        // This is dangerous if the track is MONO as that requires
                        // special case handling due to implicit channel duplication.
                        // Stereo or Multichannel should actually be fine here.
                        mHook = getProcessHook(PROCESSTYPE_NORESAMPLEONETRACK,
                                t->mMixerChannelCount, t->mMixerInFormat, t->mMixerFormat,
                                t->useStereoVolume());
                    }
                }
            }
        }
    }

    ALOGV("mixer configuration change: %zu "
        "all16BitsStereoNoResample=%d, resampling=%d, volumeRamp=%d",
        mEnabled.size(), all16BitsStereoNoResample, resampling, volumeRamp);

    process();

    // Now that the volume ramp has been done, set optimal state and
    // track hooks for subsequent mixer process
    if (mEnabled.size() > 0) {
        bool allMuted = true;

        for (const int name : mEnabled) {
            const std::shared_ptr<TrackBase> &t = mTracks[name];
            if (!t->doesResample() && t->volumeRL == 0) {
                t->needs |= NEEDS_MUTE;
                t->hook = &TrackBase::track__nop;
            } else {
                allMuted = false;
            }
        }
        if (allMuted) {
            mHook = &AudioMixerBase::process__nop;
        } else if (all16BitsStereoNoResample) {
            if (mEnabled.size() == 1) {
                //const int i = 31 - __builtin_clz(enabledTracks);
                const std::shared_ptr<TrackBase> &t = mTracks[mEnabled[0]];
                // Muted single tracks handled by allMuted above.
                mHook = getProcessHook(PROCESSTYPE_NORESAMPLEONETRACK,
                        t->mMixerChannelCount, t->mMixerInFormat, t->mMixerFormat,
                        t->useStereoVolume());
            }
        }
    }
}

调用getTrackHook根据tracktype指定混音的hook函数:

cpp 复制代码
AudioMixerBase::hook_t AudioMixerBase::TrackBase::getTrackHook(int trackType, uint32_t channelCount,
        audio_format_t mixerInFormat, audio_format_t mixerOutFormat __unused)
{
    if (!kUseNewMixer && channelCount == FCC_2 && mixerInFormat == AUDIO_FORMAT_PCM_16_BIT) {
        switch (trackType) {
        case TRACKTYPE_NOP:
            return &TrackBase::track__nop;
        case TRACKTYPE_RESAMPLE:
            return &TrackBase::track__genericResample;
        case TRACKTYPE_NORESAMPLEMONO:
            return &TrackBase::track__16BitsMono;
        case TRACKTYPE_NORESAMPLE:
            return &TrackBase::track__16BitsStereo;
        default:
            LOG_ALWAYS_FATAL("bad trackType: %d", trackType);
            break;
        }
    }
    LOG_ALWAYS_FATAL_IF(channelCount > MAX_NUM_CHANNELS);
    switch (trackType) {
    case TRACKTYPE_NOP:
        return &TrackBase::track__nop;
    case TRACKTYPE_RESAMPLE:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_MULTI, float /*TO*/, float /*TI*/, TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_MULTI, int32_t /*TO*/, int16_t /*TI*/, TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    case TRACKTYPE_RESAMPLESTEREO:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_MULTI_STEREOVOL, float /*TO*/, float /*TI*/,
                    TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_MULTI_STEREOVOL, int32_t /*TO*/, int16_t /*TI*/,
                    TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    // RESAMPLEMONO needs MIXTYPE_STEREOEXPAND since resampler will upmix mono
    // track to stereo track
    case TRACKTYPE_RESAMPLEMONO:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_STEREOEXPAND, float /*TO*/, float /*TI*/,
                    TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_STEREOEXPAND, int32_t /*TO*/, int16_t /*TI*/,
                    TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    case TRACKTYPE_NORESAMPLEMONO:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                            MIXTYPE_MONOEXPAND, float /*TO*/, float /*TI*/, TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                            MIXTYPE_MONOEXPAND, int32_t /*TO*/, int16_t /*TI*/, TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    case TRACKTYPE_NORESAMPLE:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                    MIXTYPE_MULTI, float /*TO*/, float /*TI*/, TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                    MIXTYPE_MULTI, int32_t /*TO*/, int16_t /*TI*/, TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    case TRACKTYPE_NORESAMPLESTEREO:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                    MIXTYPE_MULTI_STEREOVOL, float /*TO*/, float /*TI*/,
                    TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                    MIXTYPE_MULTI_STEREOVOL, int32_t /*TO*/, int16_t /*TI*/,
                    TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    default:
        LOG_ALWAYS_FATAL("bad trackType: %d", trackType);
        break;
    }
    return NULL;
}

调用getProcessHook,根据音频格式返回对应的process函数,例如AUDIO_FORMAT_PCM_16_BIT,返回process_noResampleOneTrack

cpp 复制代码
AudioMixerBase::process_hook_t AudioMixerBase::getProcessHook(
        int processType, uint32_t channelCount,
        audio_format_t mixerInFormat, audio_format_t mixerOutFormat,
        bool stereoVolume)
{
    if (processType != PROCESSTYPE_NORESAMPLEONETRACK) { // Only NORESAMPLEONETRACK
        LOG_ALWAYS_FATAL("bad processType: %d", processType);
        return NULL;
    }
    if (!kUseNewMixer && channelCount == FCC_2 && mixerInFormat == AUDIO_FORMAT_PCM_16_BIT) {
        return &AudioMixerBase::process__oneTrack16BitsStereoNoResampling;
    }
    LOG_ALWAYS_FATAL_IF(channelCount > MAX_NUM_CHANNELS);

    if (stereoVolume) { // templated arguments require explicit values.
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            switch (mixerOutFormat) {
            case AUDIO_FORMAT_PCM_FLOAT:
                return &AudioMixerBase::process__noResampleOneTrack<
                        MIXTYPE_MULTI_SAVEONLY_STEREOVOL, float /*TO*/,
                        float /*TI*/, TYPE_AUX>;
            case AUDIO_FORMAT_PCM_16_BIT:
                return &AudioMixerBase::process__noResampleOneTrack<
                        MIXTYPE_MULTI_SAVEONLY_STEREOVOL, int16_t /*TO*/,
                        float /*TI*/, TYPE_AUX>;
            default:
                LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
                break;
            }
            break;
        case AUDIO_FORMAT_PCM_16_BIT:
            switch (mixerOutFormat) {
            case AUDIO_FORMAT_PCM_FLOAT:
                return &AudioMixerBase::process__noResampleOneTrack<
                        MIXTYPE_MULTI_SAVEONLY_STEREOVOL, float /*TO*/,
                        int16_t /*TI*/, TYPE_AUX>;
            case AUDIO_FORMAT_PCM_16_BIT:
                return &AudioMixerBase::process__noResampleOneTrack<
                        MIXTYPE_MULTI_SAVEONLY_STEREOVOL, int16_t /*TO*/,
                        int16_t /*TI*/, TYPE_AUX>;
            default:
                LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
                break;
            }
            break;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
    } else {
          switch (mixerInFormat) {
          case AUDIO_FORMAT_PCM_FLOAT:
              switch (mixerOutFormat) {
              case AUDIO_FORMAT_PCM_FLOAT:
                  return &AudioMixerBase::process__noResampleOneTrack<
                          MIXTYPE_MULTI_SAVEONLY, float /*TO*/,
                          float /*TI*/, TYPE_AUX>;
              case AUDIO_FORMAT_PCM_16_BIT:
                  return &AudioMixerBase::process__noResampleOneTrack<
                          MIXTYPE_MULTI_SAVEONLY, int16_t /*TO*/,
                          float /*TI*/, TYPE_AUX>;
              default:
                  LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
                  break;
              }
              break;
          case AUDIO_FORMAT_PCM_16_BIT:
              switch (mixerOutFormat) {
              case AUDIO_FORMAT_PCM_FLOAT:
                  return &AudioMixerBase::process__noResampleOneTrack<
                          MIXTYPE_MULTI_SAVEONLY, float /*TO*/,
                          int16_t /*TI*/, TYPE_AUX>;
              case AUDIO_FORMAT_PCM_16_BIT:
                  return &AudioMixerBase::process__noResampleOneTrack<
                          MIXTYPE_MULTI_SAVEONLY, int16_t /*TO*/,
                          int16_t /*TI*/, TYPE_AUX>;
              default:
                  LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
                  break;
              }
              break;
          default:
              LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
              break;
          }
    }
    return NULL;
}

有两种hook函数,一种是需要重采样的,一种是不需要重采样:

需要重采样:process__genericResampling

cpp 复制代码
void AudioMixerBase::process__genericResampling()
{
    ALOGVV("process__genericResampling\n");
    int32_t * const outTemp = mOutputTemp.get(); // naked ptr
    size_t numFrames = mFrameCount;

    for (const auto &pair : mGroups) {
        const auto &group = pair.second;
        const std::shared_ptr<TrackBase> &t1 = mTracks[group[0]];

        // clear temp buffer
        memset(outTemp, 0, sizeof(*outTemp) * t1->mMixerChannelCount * mFrameCount);
        for (const int name : group) {
            const std::shared_ptr<TrackBase> &t = mTracks[name];
            int32_t *aux = NULL;
            if (CC_UNLIKELY(t->needs & NEEDS_AUX)) {
                aux = t->auxBuffer;
            }

            // this is a little goofy, on the resampling case we don't
            // acquire/release the buffers because it's done by
            // the resampler.
            if (t->needs & NEEDS_RESAMPLE) {
                (t.get()->*t->hook)(outTemp, numFrames, mResampleTemp.get() /* naked ptr */, aux);
            } else {

                size_t outFrames = 0;

                while (outFrames < numFrames) {
                    t->buffer.frameCount = numFrames - outFrames;
                    t->bufferProvider->getNextBuffer(&t->buffer);
                    t->mIn = t->buffer.raw;
                    // t->mIn == nullptr can happen if the track was flushed just after having
                    // been enabled for mixing.
                    if (t->mIn == nullptr) break;

                    (t.get()->*t->hook)(
                            outTemp + outFrames * t->mMixerChannelCount, t->buffer.frameCount,
                            mResampleTemp.get() /* naked ptr */,
                            aux != nullptr ? aux + outFrames : nullptr);
                    outFrames += t->buffer.frameCount;

                    t->bufferProvider->releaseBuffer(&t->buffer);
                }
            }
        }
        convertMixerFormat(t1->mainBuffer, t1->mMixerFormat,
                outTemp, t1->mMixerInFormat, numFrames * t1->mMixerChannelCount);
    }
}

该函数中,会遍历所有的Track,如果Track需要重采样则不需要获取额外的buffer,调用track的重采样函数并进行混音。如果不需要重采样了则需要获取buffer,并进行混音后把数据拷贝到输出buffer中,

不需要混音的hook函数:process__genericNoResampling也是类似的,需要获取到buffer,混音后把数据拷贝到输出buffer中。

在设置参数或enable时会调用到invalidate,会重新指定mhook函数,从而刷新track,在调用process__validate会调用一次process()函数,后续都是在threadLoop_mix中调用。

混音以track为源,mainBuffer为目标,frameCount为一次混音长度。AudioMixer最多能维护32个track。track可以对应不同mainBuffer,尽管一般情况下他们的mainBuffer都是同一个。

相关推荐
MongoVIP1 小时前
音频类AI工具扩展
人工智能·音视频·ai工具使用
悠哉悠哉愿意4 小时前
【ROS2学习笔记】 TF 坐标系
笔记·学习·ros2
代码萌新知5 小时前
设计模式学习(五)装饰者模式、桥接模式、外观模式
java·学习·设计模式·桥接模式·装饰器模式·外观模式
驱动探索者7 小时前
find 命令使用介绍
java·linux·运维·服务器·前端·学习·microsoft
小雨凉如水8 小时前
k8s学习-pod的生命周期
java·学习·kubernetes
charlie1145141918 小时前
理解C++20的革命特性——协程支持2:编写简单的协程调度器
c++·学习·算法·设计模式·c++20·协程·调度器
文火冰糖的硅基工坊8 小时前
[人工智能-综述-21]:学习人工智能的路径
大数据·人工智能·学习·系统架构·制造
JJJJ_iii8 小时前
【深度学习01】快速上手 PyTorch:环境 + IDE+Dataset
pytorch·笔记·python·深度学习·学习·jupyter
好奇龙猫9 小时前
日语学习-日语知识点小记-进阶-JLPT-N1阶段应用练习(5):语法 +考え方18+2022年7月N1
学习
。TAT。9 小时前
Linux - 进程状态
linux·学习