Smooth, Low-Latency Audio Playback on the WebRTC Receive Side: Principles Mapped to Source Code (NetEQ / Opus)

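Note: Drawing on the src tree of the WebRTC source checkout and the official NetEQ documentation, this article ties together what the receive side does for smooth playback with controlled latency, in the order theory → modules → code anchors.
Path convention: all source paths are written as src/..., relative to the root of your local WebRTC checkout (the directory that contains src/, AUTHORS, and so on in the official layout). Prepend your actual clone path when following them.
Caution: exact behavior may shift slightly between branches and commits; treat the source you have checked out and specifications such as RFC 3550 (RTP/RTCP) as authoritative.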

What problem the receive side solves
Common question: is NetEQ the only thing that processes audio?
Source overview: from RTP to the loudspeaker
Where NetEQ sits in the codebase
Two entry points: InsertPacket and GetAudio
Operations and signal processing inside GetAudio
Key objects inside NetEQ (matching source class names)
Packet loss: PLC, NACK, RED / FEC
Timing, RTCP, and audio/video sync hooks
Configuration: jitter buffer and acceleration
Statistics and observability
Opus and decoder-side PLC
References and further reading

What problem the receive side solves

The sender optimizes bitrate, FEC, DTX, and so on, but the receiver still has to deal with:

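| Phenomenon | What the receiver observes |
| --- | --- |
| Jitter | Packet inter-arrival times vary |
| Reordered / late packets | Arrival order disagrees with sequence numbers and timestamps |
| Packet loss | Gaps in the sequence numbers |
| Clock drift | Sender and local clocks are not synchronized |
| Too much or too little buffering | High latency, or frequent underruns |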
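To make "inter-arrival times vary" concrete, here is a minimal sketch of the running interarrival jitter estimate defined in RFC 3550, section 6.4.1. It is offered only as a familiar reference point: the struct and method names are invented for this example, and NetEQ's own delay estimation is a different, more elaborate mechanism.

```cpp
#include <cstdint>
#include <cstdlib>

// Running interarrival jitter as defined in RFC 3550, section 6.4.1.
// All values are in RTP timestamp units. JitterEstimator is an illustrative
// name and is not taken from the WebRTC source tree.
struct JitterEstimator {
  bool has_previous = false;
  int64_t prev_transit = 0;  // arrival - rtp_timestamp of the previous packet
  double jitter = 0.0;       // smoothed estimate J

  // `arrival` is the local arrival time converted to RTP timestamp units.
  void OnPacket(int64_t arrival, int64_t rtp_timestamp) {
    const int64_t transit = arrival - rtp_timestamp;
    if (has_previous) {
      const long long d = std::llabs(transit - prev_transit);
      // J += (|D| - J) / 16
      jitter += (static_cast<double>(d) - jitter) / 16.0;
    }
    prev_transit = transit;
    has_previous = true;
  }
};
```

With a perfectly steady network the transit time is constant and the estimate stays at zero; a single packet delayed by 160 timestamp units moves it to 10.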

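Where NetEQ fits (official g3doc): an adaptive jitter buffer plus packet loss concealment, continuously trading off low latency against low artifacts and continuous-sounding audio.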

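Common question: is NetEQ the only thing that processes audio?

No. WebRTC contains a lot of audio-related code. This article focuses on NetEQ because, in the default RTP voice receive path, de-jittering, queuing, clocked pulls, packet loss concealment, and time stretching are almost all done inside NetEQ (wrapped by AcmReceiver), which can make the article read as if NetEQ were all there is.

Other modules that matter for smooth, low-latency playback (each with a different job)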

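| Module / layer | Role in receive-side playback (summary) |
| --- | --- |
| `ChannelReceive` | Receives RTP, parses headers, handles the NACK list and `ResendPackets`, RTCP and clock estimation, volume/gain, an optional AudioSink, and then calls `acm_receiver_.InsertPacket` / `GetAudio` |
| `AcmReceiver` (ACM receive side) | Owns the `NetEq` instance; pre-processing such as RED; can resample to the device sample rate after `GetAudio` |
| `AudioMixer` + `AudioDeviceModule` | Mixes multiple streams and delivers PCM to the sound card in 10 ms callbacks; not responsible for RTP jitter buffering |
| Decoder (e.g. Opus) | Turns compressed frames into PCM; in-codec PLC can work together with NetEQ's Expand / `DoCodecPlc` |
| Optional extensions | For example `FrameTransformer` (modifies the payload before NetEQ) and a custom `NetEqFactory` (replaces the default NetEQ implementation) |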

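"Is there any other way?"

When using the official audio receive pipeline: jitter buffering and playout scheduling default to NetEQ. To change behavior you normally tune NetEQ/ACM parameters or inject a custom NetEQ (see src/api/neteq/custom_neteq_factory.h and related files) rather than building a second jitter buffer alongside the official one.
Fully custom: if you bypass the ChannelReceive + ACM path, you have to implement the packet queue, clocking, decoding, PLC, and so on yourself, which amounts to replacing the responsibilities that NetEQ carries.

In one sentence: NetEQ is not the only audio module in WebRTC; it is the layer in the default receive path dedicated to jitter buffering and playout decisions. NACK, RTCP, mixing, and device output live in other classes.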

Source overview: from RTP to the loudspeaker

Call chain (simplified)

Network thread / receive path:
Call::DeliverRtpPacket (MediaType::AUDIO) → RtpStreamReceiverController (demuxes to the receive stream) → ChannelReceive::OnRtpPacket → ChannelReceive::ReceivePacket → OnReceivedPayloadData → acm_receiver_.InsertPacket → NetEq::InsertPacket

Audio playout pull path (e.g. the mixer thread):
AudioReceiveStreamImpl::GetAudioFrameWithInfo → ChannelReceive::GetAudioFrameWithInfo → AcmReceiver::GetAudio → NetEqImpl::GetAudio → GetAudioInternal → AudioFrame (fixed 10 ms output)

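Push side: after an RTP packet arrives it ends up in AcmReceiver::InsertPacket, which calls NetEq::InsertPacket.
Pull side: the device/mixer periodically asks for data → ChannelReceive::GetAudioFrameWithInfo → AcmReceiver::GetAudio → NetEq::GetAudio, fetching 10 ms of PCM per call (see kOutputSizeMs below).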
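The push/pull split can be illustrated with a toy buffer. This is a sketch under strong assumptions: `ToyJitterBuffer` and its members are hypothetical names, each packet is assumed to carry exactly one 10 ms frame of already-decoded PCM, timestamp wrap-around is ignored, and a missing frame is replaced by silence where the real NetEq would conceal the loss.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Toy model of the push/pull split. All names are hypothetical; the real
// NetEq also decodes, time stretches, conceals losses and handles wrap-around.
class ToyJitterBuffer {
 public:
  // Push side: called whenever a packet arrives, in any order.
  void InsertPacket(uint32_t rtp_timestamp, std::vector<int16_t> pcm_10ms) {
    // A packet older than the next frame to play is too late and is dropped.
    if (next_timestamp_ && rtp_timestamp < *next_timestamp_) return;
    packets_[rtp_timestamp] = std::move(pcm_10ms);
  }

  // Pull side: called once per 10 ms tick by the playout side. Returns the
  // frame for the expected timestamp, or silence if it has not arrived.
  std::vector<int16_t> GetAudio(size_t samples_per_10ms) {
    std::vector<int16_t> out(samples_per_10ms, 0);
    if (!next_timestamp_) {
      if (packets_.empty()) return out;  // nothing received yet
      next_timestamp_ = packets_.begin()->first;
    }
    auto it = packets_.find(*next_timestamp_);
    if (it != packets_.end()) {
      out = std::move(it->second);
      packets_.erase(it);
    }
    *next_timestamp_ += static_cast<uint32_t>(samples_per_10ms);
    return out;
  }

 private:
  std::map<uint32_t, std::vector<int16_t>> packets_;  // ordered by timestamp
  std::optional<uint32_t> next_timestamp_;            // next frame to play
};
```

The point of the design is that the two sides never wait for each other: arrival timing only affects what is in the map, while the playout clock alone decides when a frame leaves.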

Code anchor: packets entering NetEQ

Call dispatches RTP by media type; audio goes through audio_receiver_controller_:

`src/call/call.cc`, lines 1351-1391:

```cpp
void Call::DeliverRtpPacket(
    MediaType media_type,
    RtpPacketReceived packet,
    OnUndemuxablePacketHandler undemuxable_packet_handler) {
  RTC_DCHECK_RUN_ON(worker_thread_);
  RTC_DCHECK(packet.arrival_time().IsFinite());
  // ...
  RtpStreamReceiverController& receiver_controller =
      media_type == MediaType::AUDIO ? audio_receiver_controller_
                                     : video_receiver_controller_;
  if (!receiver_controller.OnRtpPacket(packet)) {
    // ...
  }
  // ...
}
```

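After parsing the RTP header, ChannelReceive pushes the payload into the ACM (which contains NetEQ). If NACK is enabled, it then fetches the sequence numbers to retransmit, based on the RTT, and calls ResendPackets: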

`src/audio/channel_receive.cc`, lines 320-359:

```cpp
void ChannelReceive::OnReceivedPayloadData(
    rtc::ArrayView<const uint8_t> payload,
    const RTPHeader& rtpHeader) {
  if (!playing_) {
    // ...
    return;
  }

  // Push the incoming payload (parsed and ready for decoding) into the ACM
  if (acm_receiver_.InsertPacket(rtpHeader, payload) != 0) {
    RTC_DLOG(LS_ERROR) << "ChannelReceive::OnReceivedPayloadData() unable to "
                          "push data to the ACM";
    return;
  }

  TimeDelta round_trip_time = rtp_rtcp_->LastRtt().value_or(TimeDelta::Zero());

  std::vector<uint16_t> nack_list =
      acm_receiver_.GetNackList(round_trip_time.ms());
  if (!nack_list.empty()) {
    ResendPackets(&(nack_list[0]), static_cast<int>(nack_list.size()));
  }
}
```

AcmReceiver identifies the RED payload type (reading the actual PT from the redundancy header) and then forwards the packet to NetEQ:

`src/modules/audio_coding/acm2/acm_receiver.cc`, lines 106-148:

```cpp
int AcmReceiver::InsertPacket(const RTPHeader& rtp_header,
                              rtc::ArrayView<const uint8_t> incoming_payload) {
  if (incoming_payload.empty()) {
    neteq_->InsertEmptyPacket(rtp_header);
    return 0;
  }

  int payload_type = rtp_header.payloadType;
  auto format = neteq_->GetDecoderFormat(payload_type);
  if (format && absl::EqualsIgnoreCase(format->sdp_format.name, "red")) {
    // This is a RED packet. Get the format of the audio codec.
    payload_type = incoming_payload[0] & 0x7f;
    format = neteq_->GetDecoderFormat(payload_type);
  }
  // ...
  if (neteq_->InsertPacket(rtp_header, incoming_payload) < 0) {
    RTC_LOG(LS_ERROR) << "AcmReceiver::InsertPacket "
                      << static_cast<int>(rtp_header.payloadType)
                      << " Failed to insert packet";
    return -1;
  }
  return 0;
}
```
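The `& 0x7f` mask above follows the RED header layout of RFC 2198: the top bit of each block header is the F ("another header follows") flag and the low seven bits are the block's payload type. The sketch below parses the whole header chain of a RED payload. `RedBlockHeader` and `ParseRedHeaders` are names made up for this example; it is not the code of `RedPayloadSplitter`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// One block header from a RED payload (RFC 2198). Illustrative names only.
struct RedBlockHeader {
  uint8_t payload_type;       // 7-bit payload type of the carried codec
  uint16_t timestamp_offset;  // 14 bits; 0 for the final (primary) block
  uint16_t block_length;      // 10 bits; 0 for the final block (implicit)
};

// Parses the chain of RED block headers at the start of `payload`.
// Returns std::nullopt if the payload ends before the final header.
std::optional<std::vector<RedBlockHeader>> ParseRedHeaders(
    const std::vector<uint8_t>& payload) {
  std::vector<RedBlockHeader> headers;
  size_t pos = 0;
  while (pos < payload.size()) {
    const bool follow = (payload[pos] & 0x80) != 0;  // F bit
    const uint8_t pt = payload[pos] & 0x7f;          // block payload type
    if (!follow) {
      headers.push_back({pt, 0, 0});  // final header is a single byte
      return headers;
    }
    if (pos + 4 > payload.size()) return std::nullopt;  // truncated header
    const uint16_t ts_offset = static_cast<uint16_t>(
        (payload[pos + 1] << 6) | (payload[pos + 2] >> 2));
    const uint16_t length = static_cast<uint16_t>(
        ((payload[pos + 2] & 0x03) << 8) | payload[pos + 3]);
    headers.push_back({pt, ts_offset, length});
    pos += 4;  // redundant-block headers are four bytes long
  }
  return std::nullopt;
}
```

For example, the bytes `EF 0F 00 32 6F` describe one redundant block (payload type 111, timestamp offset 960, length 50) followed by the primary block with payload type 111.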

Code anchor: pulling 10 ms of playout data

`src/audio/channel_receive.cc`, lines 383-405:

```cpp
AudioMixer::Source::AudioFrameInfo ChannelReceive::GetAudioFrameWithInfo(
    int sample_rate_hz,
    AudioFrame* audio_frame) {
  // ...
  // Get 10ms raw PCM data from the ACM (mixer limits output frequency)
  if (acm_receiver_.GetAudio(audio_frame->sample_rate_hz_, audio_frame) == -1) {
    RTC_DLOG(LS_ERROR)
        << "ChannelReceive::GetAudioFrame() PlayoutData10Ms() failed!";
    // ...
    return AudioMixer::Source::AudioFrameInfo::kError;
  }
  // ... post-processing: gain, level, timestamp, etc. ...
}
```

`src/modules/audio_coding/acm2/acm_receiver.cc`, lines 151-159:

```cpp
int AcmReceiver::GetAudio(int desired_freq_hz,
                          AudioFrame* audio_frame,
                          bool* muted) {
  int current_sample_rate_hz = 0;
  if (neteq_->GetAudio(audio_frame, muted, &current_sample_rate_hz) !=
      NetEq::kOK) {
    RTC_LOG(LS_ERROR) << "AcmReceiver::GetAudio - NetEq Failed.";
    return -1;
  }
  // ... (rest of the function elided)
}
```

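Where NetEQ sits in the codebase

Implementation class: NetEqImpl (src/modules/audio_coding/neteq/neteq_impl.{h,cc}).
Public interface: the abstract class NetEq in src/api/neteq/neteq.h.
Design document (English): src/modules/audio_coding/neteq/g3doc/index.md. It matches the InsertPacket / GetAudio description below and is worth reading side by side.

Two entry points: InsertPacket and GetAudio

Interface definition

The comment on GetAudio spells out the contract: each call delivers 10 ms of audio and refreshes the AudioFrame's sample rate, channel count, and related fields: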

`src/api/neteq/neteq.h`, lines 197-215:

```cpp
  // Instructs NetEq to deliver 10 ms of audio data. The data is written to
  // `audio_frame`. All data in `audio_frame` is wiped; `data_`, `speech_type_`,
  // `num_channels_`, `sample_rate_hz_` and `samples_per_channel_` are updated
  // upon success. If an error is returned, some fields may not have been
  // updated, or may contain inconsistent values. If muted state is enabled
  // (through Config::enable_muted_state), `muted` may be set to true after a
  // prolonged expand period. When this happens, the `data_` in `audio_frame`
  // is not written, but should be interpreted as being all zeros. For testing
  // purposes, an override can be supplied in the `action_override` argument,
  // which will cause NetEq to take this action next, instead of the action it
  // would normally choose. An optional output argument for fetching the current
  // sample rate can be provided, which will return the same value as
  // last_output_sample_rate_hz() but will avoid additional synchronization.
  // Returns kOK on success, or kFail in case of an error.
  virtual int GetAudio(
      AudioFrame* audio_frame,
      bool* muted = nullptr,
      int* current_sample_rate_hz = nullptr,
      absl::optional<Operation> action_override = absl::nullopt) = 0;
```

The implementation takes the lock and calls the internal logic:

`src/modules/audio_coding/neteq/neteq_impl.cc`, lines 215-242:

```cpp
int NetEqImpl::GetAudio(AudioFrame* audio_frame,
                        bool* muted,
                        int* current_sample_rate_hz,
                        absl::optional<Operation> action_override) {
  TRACE_EVENT0("webrtc", "NetEqImpl::GetAudio");
  MutexLock lock(&mutex_);
  if (GetAudioInternal(audio_frame, action_override) != 0) {
    return kFail;
  }
  // ...
  if (current_sample_rate_hz) {
    *current_sample_rate_hz = last_output_sample_rate_hz_;
  }

  return kOK;
}
```

The output cadence matches NetEqImpl::kOutputSizeMs (10 ms):

`src/modules/audio_coding/neteq/neteq_impl.h`, line 196:

```cpp
  static const int kOutputSizeMs = 10;
```
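A fixed 10 ms cadence means the number of samples per pull depends only on the sample rate. The helper below is a hypothetical illustration of that arithmetic (the function name is not from the WebRTC source, and sample rates are assumed to be multiples of 1000 Hz).

```cpp
#include <cstddef>

// Same value as NetEqImpl::kOutputSizeMs quoted above.
constexpr int kOutputSizeMs = 10;

// Samples per channel delivered by one 10 ms pull at the given sample rate.
// Assumes `sample_rate_hz` is a multiple of 1000.
constexpr size_t SamplesPerChannelPerTick(int sample_rate_hz) {
  return static_cast<size_t>(sample_rate_hz / 1000 * kOutputSizeMs);
}

static_assert(SamplesPerChannelPerTick(8000) == 80, "narrowband");
static_assert(SamplesPerChannelPerTick(16000) == 160, "wideband");
static_assert(SamplesPerChannelPerTick(48000) == 480, "fullband");
```

So a 48 kHz stereo frame carries 480 samples per channel, 960 interleaved values in total.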

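InsertPacket (key points from the official g3doc, plus source)

According to src/modules/audio_coding/neteq/g3doc/index.md, the InsertPacket path roughly does the following:

Packets that arrive too late to be played are discarded; otherwise they go into the packet buffer. If the buffer is full, the existing packets are flushed (rare).
Packet inter-arrival times are measured (in GetAudio ticks) and used to update the target playout delay, while also accounting for clock drift between sender and receiver.

Corresponding entry point: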

`src/modules/audio_coding/neteq/neteq_impl.cc`, lines 195-203:

```cpp
int NetEqImpl::InsertPacket(const RTPHeader& rtp_header,
                            rtc::ArrayView<const uint8_t> payload) {
  rtc::MsanCheckInitialized(payload);
  TRACE_EVENT0("webrtc", "NetEqImpl::InsertPacket");
  MutexLock lock(&mutex_);
  if (InsertPacketInternal(rtp_header, payload) != 0) {
    return kFail;
  }
  return kOK;
}
```

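GetAudio (simplified logic from the official g3doc)

The heavily simplified decision order given in the official documentation can be summarized as:

| Step | Behavior |
| --- | --- |
| 1 | If the sync buffer already holds at least 10 ms of audio, that audio can be used directly for output |
| 2 | If the next packet (by RTP timestamp) is already in the packet buffer, decode it and append the result to the sync buffer; then time stretch according to the filtered buffer level versus the target delay |
| 3 | In DTX scenarios, generate comfort noise |
| 4 | If no packet is available, run PLC: extrapolate the waveform already in the sync buffer, or have the decoder generate it |

The actual code implements this as GetDecision → Decode → switch (operation); see the next section.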
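Steps 2 to 4 of the table can be written down as a small decision function. This is only a sketch of the simplified logic: `ToyOperation`, `ToyState`, `Decide` and the 20 ms dead band `kSlackMs` are all assumptions made for the example, and the real `GetDecision` weighs many more inputs.

```cpp
// Hypothetical, heavily reduced model of steps 2-4 in the table above.
enum class ToyOperation {
  kNormal,
  kAccelerate,
  kPreemptiveExpand,
  kComfortNoise,
  kExpand
};

struct ToyState {
  bool next_packet_available;    // next RTP timestamp is in the packet buffer
  bool last_packet_was_dtx;      // sender is in discontinuous transmission
  int filtered_buffer_level_ms;  // smoothed amount of buffered audio
  int target_delay_ms;           // delay the buffer is trying to hold
};

// Assumed dead band around the target; not a value from the WebRTC source.
constexpr int kSlackMs = 20;

ToyOperation Decide(const ToyState& s) {
  if (s.next_packet_available) {
    // Step 2: decode, then time stretch if the level is off target.
    if (s.filtered_buffer_level_ms > s.target_delay_ms + kSlackMs) {
      return ToyOperation::kAccelerate;  // too much buffered: play faster
    }
    if (s.filtered_buffer_level_ms < s.target_delay_ms - kSlackMs) {
      return ToyOperation::kPreemptiveExpand;  // too little: play slower
    }
    return ToyOperation::kNormal;
  }
  // Step 3: silence period signalled by the sender.
  if (s.last_packet_was_dtx) return ToyOperation::kComfortNoise;
  // Step 4: nothing to decode, conceal the gap.
  return ToyOperation::kExpand;
}
```

The dead band keeps the sketch from time stretching on every small fluctuation, which would itself be audible.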

Operations and signal processing inside GetAudio

The `NetEq::Operation` enum (publicly visible)

`src/api/neteq/neteq.h`, lines 145-157:

```cpp
  enum class Operation {
    kNormal,
    kMerge,
    kExpand,
    kAccelerate,
    kFastAccelerate,
    kPreemptiveExpand,
    kRfc3389Cng,
    kRfc3389CngNoPacket,
    kCodecInternalCng,
    kDtmf,
    kUndefined,
  };
```

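Dispatch inside `GetAudioInternal` (directly tied to perceived quality)

After tick_timer_ is incremented on each call, GetDecision runs first, then Decode, and finally the operation selects DoNormal / DoMerge / DoExpand / DoAccelerate / DoPreemptiveExpand and so on: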

`src/modules/audio_coding/neteq/neteq_impl.cc`, lines 791-878:

```cpp
  int return_value = GetDecision(&operation, &packet_list, &dtmf_event,
                                 &play_dtmf, action_override);
  if (return_value != 0) {
    last_mode_ = Mode::kError;
    return return_value;
  }

  AudioDecoder::SpeechType speech_type;
  int length = 0;
  const size_t start_num_packets = packet_list.size();
  int decode_return_value =
      Decode(&packet_list, &operation, &length, &speech_type);
  // ...
  algorithm_buffer_->Clear();
  switch (operation) {
    case Operation::kNormal: {
      DoNormal(decoded_buffer_.get(), length, speech_type, play_dtmf);
      // ...
      break;
    }
    case Operation::kMerge: {
      DoMerge(decoded_buffer_.get(), length, speech_type, play_dtmf);
      break;
    }
    case Operation::kExpand: {
      RTC_DCHECK_EQ(return_value, 0);
      if (!current_rtp_payload_type_ || !DoCodecPlc()) {
        return_value = DoExpand(play_dtmf);
      }
      // ...
      break;
    }
    case Operation::kAccelerate:
    case Operation::kFastAccelerate: {
      const bool fast_accelerate =
          enable_fast_accelerate_ && (operation == Operation::kFastAccelerate);
      return_value = DoAccelerate(decoded_buffer_.get(), length, speech_type,
                                  play_dtmf, fast_accelerate);
      break;
    }
    case Operation::kPreemptiveExpand: {
      return_value = DoPreemptiveExpand(decoded_buffer_.get(), length,
                                        speech_type, play_dtmf);
      break;
    }
    case Operation::kRfc3389Cng:
    case Operation::kRfc3389CngNoPacket: {
      return_value = DoRfc3389Cng(&packet_list, play_dtmf);
      break;
    }
    case Operation::kCodecInternalCng: {
      DoCodecInternalCng(decoded_buffer_.get(), length);
      break;
    }
    case Operation::kDtmf: {
      return_value = DoDtmf(dtmf_event, &play_dtmf);
      break;
    }
    case Operation::kUndefined: {
      RTC_LOG(LS_ERROR) << "Invalid operation kUndefined.";
      RTC_DCHECK_NOTREACHED();  // This should not happen.
      last_mode_ = Mode::kError;
      return kInvalidOperation;
    }
  }  // End of switch.
```

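Operations and perceived quality (cross-reference table)

| `Operation` | Typical meaning (engineering view) | Implementation entry point |
| --- | --- | --- |
| kNormal | Normal decode and playout | `DoNormal` |
| kMerge | Splices a concealment segment with real decoded data to soften boundary artifacts | `DoMerge` |
| kExpand | Underrun / loss: extends the waveform or uses codec PLC (`DoCodecPlc`) | `DoExpand` / `DoCodecPlc` |
| kAccelerate / kFastAccelerate | Buffer too high: consumes samples faster to hold latency down | `DoAccelerate` |
| kPreemptiveExpand | Buffer too low: stretches audio preemptively (slows down) to reduce the chance of an imminent underrun | `DoPreemptiveExpand` |
| kRfc3389Cng* | RFC 3389 comfort noise (with-packet / no-packet branches) | `DoRfc3389Cng` |
| kCodecInternalCng | Decoder-internal CNG (the decoder generates noise while nothing is transmitted) | `DoCodecInternalCng` |
| kDtmf | Playback of telephone-event tones | `DoDtmf` |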

Decision flow: GetDecision → Operation?

kNormal → DoNormal
kMerge → DoMerge
kExpand → DoCodecPlc or DoExpand
kAccelerate → DoAccelerate
kPreemptiveExpand → DoPreemptiveExpand
CNG / DTMF → DoRfc3389Cng / DoDtmf ...

Every branch then writes into algorithm_buffer → PushBack to sync_buffer → 10 ms is taken out into the AudioFrame.
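To give a feel for what "softening boundary artifacts" means for kMerge, the sketch below blends the tail of a concealment segment into the head of freshly decoded audio with a linear cross-fade. This is a deliberately simplified stand-in, not the algorithm in `DoMerge`; `CrossFade` and its parameters are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Linear cross-fade over the overlap of two signals: `concealed_tail` fades
// out while `decoded_head` fades in. Hypothetical helper for illustration.
std::vector<int16_t> CrossFade(const std::vector<int16_t>& concealed_tail,
                               const std::vector<int16_t>& decoded_head) {
  const size_t n = std::min(concealed_tail.size(), decoded_head.size());
  std::vector<int16_t> out(n);
  for (size_t i = 0; i < n; ++i) {
    // Weight of the decoded signal rises from 0 towards 1 across the overlap.
    const double w = static_cast<double>(i) / static_cast<double>(n);
    const double mixed = (1.0 - w) * concealed_tail[i] + w * decoded_head[i];
    out[i] = static_cast<int16_t>(std::lround(mixed));
  }
  return out;
}
```

Because each output sample is a weighted average of two 16-bit samples, the result stays within the 16-bit range and there is no step discontinuity at the splice point.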

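Key objects inside NetEQ (matching source class names)

NetEqImpl::Dependencies gathers the construction of the jitter buffer, the controller, RED splitting, the decoder table, and more (excerpt):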

`src/modules/audio_coding/neteq/neteq_impl.h`, lines 108-115:

```cpp
    std::unique_ptr<DecoderDatabase> decoder_database;
    std::unique_ptr<DtmfBuffer> dtmf_buffer;
    std::unique_ptr<DtmfToneGenerator> dtmf_tone_generator;
    std::unique_ptr<PacketBuffer> packet_buffer;
    std::unique_ptr<NetEqController> neteq_controller;
    std::unique_ptr<RedPayloadSplitter> red_payload_splitter;
    std::unique_ptr<TimestampScaler> timestamp_scaler;
```

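| Component | Role (from the g3doc and the class names) |
| --- | --- |
| PacketBuffer | Queues and orders logical RTP packets; discards packets that are too late |
| NetEqController | Target buffer depth, filtered buffer level, whether time stretching is allowed, and other decision inputs |
| SyncBuffer | Ring-style synchronization area for decoded/processed PCM; closely tied to "take another 10 ms" |
| DecoderDatabase | Looks up decoder instances such as Opus by payload type |
| RedPayloadSplitter | Parses multi-layer RED payloads for recovery and decoding |
| NackTracker | Maintains the list of lost packets; `GetNackList` lets the upper layer send NACKs (after `EnableNack`) |
| Accelerate / Expand / Merge / PreemptiveExpand | The concrete DSP and state machines live in the corresponding `.cc` files |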

FilteredCurrentDelayMs combines the controller's filtered buffer level with the sync_buffer's future length into a millisecond delay estimate, useful for statistics and debugging:

`src/modules/audio_coding/neteq/neteq_impl.cc`, lines 316-323:

```cpp
int NetEqImpl::FilteredCurrentDelayMs() const {
  MutexLock lock(&mutex_);
  const int delay_samples =
      controller_->GetFilteredBufferLevel() + sync_buffer_->FutureLength();
  return delay_samples / rtc::CheckedDivExact(fs_hz_, 1000);
}
```
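The arithmetic is easier to check with concrete numbers. The function below restates it with plain integers and no WebRTC types; `DelayMs` and its parameter names are made up for this example, and the sample rate is assumed to be a multiple of 1000 Hz.

```cpp
// Plain-integer restatement of the delay arithmetic shown above.
// Assumes `fs_hz` is a multiple of 1000 so that samples-per-ms is exact.
int DelayMs(int filtered_buffer_level_samples,
            int sync_buffer_future_samples,
            int fs_hz) {
  const int delay_samples =
      filtered_buffer_level_samples + sync_buffer_future_samples;
  const int samples_per_ms = fs_hz / 1000;
  return delay_samples / samples_per_ms;  // integer division truncates
}
```

At 48 kHz, 1920 samples of filtered buffer level plus 480 future samples in the sync buffer give 2400 / 48 = 50 ms. At 16 kHz, 150 samples give 9 ms, because the division truncates 9.375.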

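Packet loss: PLC, NACK, RED / FEC

| Mechanism | Place in the receive chain (summary) |
| --- | --- |
| NACK | `NackTracker` inside NetEQ; `ChannelReceive` fetches the list after `InsertPacket` and calls `ResendPackets` (see `OnReceivedPayloadData` above) |
| RED | `AcmReceiver::InsertPacket` recognizes PT=`red`; `RedPayloadSplitter` splits the redundancy inside NetEQ |
| PLC | The `Operation::kExpand` path; `DoCodecPlc()` (built into the decoder) is tried first, otherwise NetEQ Expand |

The official g3doc also lists, among NetEQ's responsibilities, FEC/RED splitting, the NACK list, and adding delay for A/V sync (see "Other responsibilities" in src/modules/audio_coding/neteq/g3doc/index.md).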
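The first step of any NACK list is spotting holes in the 16-bit RTP sequence number space, including across wrap-around. The sketch below shows only that step under simplifying assumptions; `MissingBetween` is a hypothetical helper, and a production tracker also has to age entries out and decide, using the RTT, which losses are still worth requesting.

```cpp
#include <cstdint>
#include <vector>

// Returns the sequence numbers strictly between `last_received` and
// `newest`, handling 16-bit wrap-around. Returns an empty list when `newest`
// is a duplicate of or older than `last_received`. Hypothetical helper.
std::vector<uint16_t> MissingBetween(uint16_t last_received, uint16_t newest) {
  std::vector<uint16_t> missing;
  // Modular distance; values of 0x8000 and above mean "behind, not ahead".
  const uint16_t gap = static_cast<uint16_t>(newest - last_received);
  if (gap == 0 || gap >= 0x8000) return missing;
  for (uint16_t seq = static_cast<uint16_t>(last_received + 1); seq != newest;
       ++seq) {
    missing.push_back(seq);
  }
  return missing;
}
```

For instance, receiving 13 after 10 reports 11 and 12 as missing, and receiving 1 after 65534 reports 65535 and 0.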

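Timing, RTCP, and audio/video sync hooks

The RTP timestamp and the decoded frame length define the media timeline.
ChannelReceive::ReceivedRTCPPacket hands RTCP to rtp_rtcp_ and uses SR + RTT to update the NTP estimate and the capture clock offset (used for interpreting timestamps and for sync-related features).
AudioReceiveStreamInterface::Config::sync_group keeps the A/V sync group identifier at the configuration level (for the actual policy, see code comments and issue tracking).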
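An RTCP sender report pairs one RTP timestamp with the sender's NTP wall-clock time, which is what makes it possible to place any RTP timestamp on a common timeline. The sketch below shows that mapping from a single report with a known clock rate. `RtpToNtpMs` and its parameters are hypothetical names; it illustrates the idea and is not WebRTC's estimator.

```cpp
#include <cstdint>

// Maps an RTP timestamp to sender NTP time in milliseconds, using the
// (RTP timestamp, NTP time) pair from one RTCP sender report.
// Hypothetical helper; assumes the RTP clock rate is known and constant.
int64_t RtpToNtpMs(uint32_t rtp_timestamp,
                   uint32_t sr_rtp_timestamp,
                   int64_t sr_ntp_ms,
                   int clock_rate_hz) {
  // The signed 32-bit difference tolerates timestamp wrap-around and
  // packets that are slightly older than the sender report.
  const int32_t diff = static_cast<int32_t>(rtp_timestamp - sr_rtp_timestamp);
  return sr_ntp_ms + static_cast<int64_t>(diff) * 1000 / clock_rate_hz;
}
```

With a 48 kHz clock, a packet stamped 960 ticks after the report maps to 20 ms after the report's NTP time. Doing the same for an audio and a video stream from one sender puts both on the same clock, which is the basis for lip sync.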

Configuration: jitter buffer and acceleration

The receive stream Config exposes NetEQ-related fields directly:

`src/call/audio_receive_stream.h`, lines 127-131:

```cpp
    // NetEq settings.
    size_t jitter_buffer_max_packets = 200;
    bool jitter_buffer_fast_accelerate = false;
    int jitter_buffer_min_delay_ms = 0;
```

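| Field | Meaning (in terms of `NetEq::Config`) |
| --- | --- |
| jitter_buffer_max_packets | Becomes `max_packets_in_buffer` when NetEQ is constructed (default constant 200, matching the `Config` default in `src/api/neteq/neteq.h`) |
| jitter_buffer_fast_accelerate | Maps to `NetEq::Config::enable_fast_accelerate`; controls whether the kFastAccelerate branch is enabled |
| jitter_buffer_min_delay_ms | Related to the minimum playout delay; passed to the controller through `AcmReceiver::SetBaseMinimumDelayMs` and similar calls |

At runtime the delay bounds can also be adjusted through SetMinimumDelay / SetMaximumDelay (the implementation clamps the millisecond upper limit).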
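Conceptually, minimum and maximum delay settings act as bounds around whatever delay the network statistics suggest. The sketch below shows one way such bounding could work. It is an assumption for illustration, not NetEQ's implementation: `BoundTargetDelayMs` is a hypothetical name, and letting the minimum win when the limits conflict is a choice made only for this example.

```cpp
#include <algorithm>

// Hypothetical helper: bounds a network-derived delay estimate by
// caller-supplied limits. Not taken from the WebRTC source.
int BoundTargetDelayMs(int estimated_ms, int minimum_ms, int maximum_ms) {
  // If the limits conflict, this sketch lets the minimum win so that the
  // bounds passed to std::clamp are always ordered.
  const int upper = std::max(minimum_ms, maximum_ms);
  return std::clamp(estimated_ms, minimum_ms, upper);
}
```

Raising the minimum trades latency for robustness on bursty networks, while lowering the maximum caps latency at the cost of more concealment when jitter exceeds it.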

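Statistics and observability

NetEqNetworkStatistics: current/target buffer size in milliseconds, expand/accelerate rates, waiting-time distribution, and so on (see src/api/neteq/neteq.h).
NetEqLifetimeStatistics: cumulative counters aligned with W3C inbound-rtp metrics, such as concealed_samples and jitter_buffer_delay_ms (see the struct comments in the same header).
Tools: neteq_rtpplay and related tools in the repository can replay RTP dumps / event logs (described in src/modules/audio_coding/neteq/g3doc/index.md).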
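Cumulative counters become useful once you difference two snapshots. The sketch below derives a concealment ratio and an average jitter buffer delay for the interval between two readings. `LifetimeCounters` is a local mirror struct written for this example: `concealed_samples` and `jitter_buffer_delay_ms` are the counters named above, while `total_samples_received` and `jitter_buffer_emitted_count` are assumed companions modelled on the W3C stats totalSamplesReceived and jitterBufferEmittedCount; check the header in your checkout before relying on them.

```cpp
#include <cstdint>

// Local mirror of a few cumulative counters, for illustration only.
struct LifetimeCounters {
  uint64_t total_samples_received = 0;       // assumed field, see lead-in
  uint64_t concealed_samples = 0;
  uint64_t jitter_buffer_delay_ms = 0;       // summed over emitted samples
  uint64_t jitter_buffer_emitted_count = 0;  // assumed field, see lead-in
};

// Fraction of samples in the interval that were concealment output.
double ConcealmentRatio(const LifetimeCounters& before,
                        const LifetimeCounters& now) {
  const uint64_t received =
      now.total_samples_received - before.total_samples_received;
  if (received == 0) return 0.0;
  return static_cast<double>(now.concealed_samples -
                             before.concealed_samples) /
         static_cast<double>(received);
}

// Mean time, in ms, that emitted samples spent in the jitter buffer.
double AverageJitterBufferDelayMs(const LifetimeCounters& before,
                                  const LifetimeCounters& now) {
  const uint64_t emitted =
      now.jitter_buffer_emitted_count - before.jitter_buffer_emitted_count;
  if (emitted == 0) return 0.0;
  return static_cast<double>(now.jitter_buffer_delay_ms -
                             before.jitter_buffer_delay_ms) /
         static_cast<double>(emitted);
}
```

Sampling once per second and plotting both values side by side makes the latency versus concealment trade-off visible for a given network.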

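Opus and decoder-side PLC

When a frame is lost, Opus can perform PLC inside the decoder. In WebRTC, if DoCodecPlc() succeeds, the Expand path may skip NetEQ's own Expand.
Whether it triggers, and how it combines with kExpand, depends on the current AudioDecoder::SpeechType and the return value of Decode() (see the Decode and Expand branches in src/modules/audio_coding/neteq/neteq_impl.cc).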

References and further reading

| Resource | Path or link |
| --- | --- |
| NetEQ design notes (English) | `src/modules/audio_coding/neteq/g3doc/index.md` |
| NetEQ interface | `src/api/neteq/neteq.h` |
| Implementation | `src/modules/audio_coding/neteq/neteq_impl.{h,cc}` |
| ACM receive side | `src/modules/audio_coding/acm2/acm_receiver.{h,cc}` |
| Channel receive | `src/audio/channel_receive.{h,cc}` |
| Call packet delivery | `src/call/call.cc` |
| Notes in the same directory | `WebRTC源码结构与学习路线图.md`, `Opus音频编码格式详解.md` |

If you switch WebRTC branches, run git grep -n GetAudioInternal (and similar) from the checkout root to check quickly whether the line numbers have drifted.
