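iOS Audio/Video Low-Level Architecture
On iOS, audio/video processing is a large engineering effort that spans hardware drivers, system frameworks, multimedia digital signal processing, and network transport. To understand the underlying architecture, start from one core idea: audio and video run as separate, parallel pipelines, both in hardware and in processing logic. Each starts at its own capture hardware and passes through its own encoding, transport, and decoding stages. Only at the final rendering stage do the two meet, and the PTS (Presentation Time Stamp) is the single "meeting point" that lets them play back in sync.
The sections below walk through this architecture in terms of the core iOS frameworks (AVFoundation, VideoToolbox, AudioToolbox, CoreMedia), covering theory, code, and engineering practice.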
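Core Concepts
Audio/Video Independence and PTS-Based Unification
How do the two streams play together? Through the PTS, the one point where the otherwise independent pipelines meet.
1. Independent, Parallel Pipelines
On iOS, audio and video each have their own hardware abstraction layer and driver stack. Audio relies on the audio codec chip and the I2S bus; video relies on the ISP (image signal processor) and the GPU/display pipeline. From capture through decoding, developers usually maintain two separate sets of threads and buffer queues. The network transport layer (for example RTP over UDP, or RTMP over TCP) likewise carries audio and video as separate packets.
2. PTS: the Time Scale
Because of network jitter, differences in encoding time (especially B-frame reordering in video), and system scheduling, audio and video packets reach the receiver in inconsistent order and with inconsistent delay. So that they can "play together", every frame is stamped with a PTS at capture time, taken from a high-precision clock (such as mach_absolute_time or the CoreVideo host time returned by CVGetCurrentHostTime).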
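- DTS (Decoding Time Stamp): tells the decoder when to decode a frame. With B-frames the decode order differs from the display order, so the DTS can differ from the PTS.
- PTS (Presentation Time Stamp): tells the renderer when to show the frame to the user.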
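To make the DTS/PTS distinction concrete, here is a minimal sketch in C. The `Frame` struct, the helper names, and the sample timestamps (a 4-frame group at 40 ms per frame) are illustrative assumptions, not part of any iOS API. Frames are listed in decode (DTS) order, and sorting by PTS recovers the display order.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical frame record: type is 'I', 'P' or 'B'; timestamps in ms. */
typedef struct {
    char    type;
    int64_t dts;
    int64_t pts;
} Frame;

static int compare_pts(const void *a, const void *b) {
    const Frame *fa = (const Frame *)a;
    const Frame *fb = (const Frame *)b;
    return (fa->pts > fb->pts) - (fa->pts < fb->pts);
}

/* Reorders frames in place from decode order into presentation order. */
void sort_by_pts(Frame *frames, size_t count) {
    qsort(frames, count, sizeof(Frame), compare_pts);
}

/* Fills `out` (capacity >= 4) with a made-up group in decode order.
   The P frame is decoded before the two B frames that reference it,
   but it is displayed after them. Returns the number of frames. */
size_t sample_decode_order(Frame *out) {
    out[0] = (Frame){ 'I',   0,  40 };
    out[1] = (Frame){ 'P',  40, 160 };
    out[2] = (Frame){ 'B',  80,  80 };
    out[3] = (Frame){ 'B', 120, 120 };
    return 4;
}
```

In decode order the sequence is I, P, B, B; after `sort_by_pts` it becomes I, B, B, P, which is the order the renderer needs.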
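3. Synchronization: Audio as the Master Clock
In most real-time audio/video systems (WebRTC, live-stream players), audio is chosen as the master clock, for two reasons:
- The ear is extremely sensitive to gaps and delay in sound (a glitch or pitch change of a few tens of milliseconds is noticeable), whereas the eye tolerates slight video stutter much better.
- The audio hardware consumes PCM data at a fixed sample rate (e.g. 44100 Hz). This physical "constant-rate tap" naturally forms a high-precision, linear clock.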
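The "constant-rate tap" idea can be sketched as follows: if the hardware consumes samples at a fixed rate, the current audio time is just the number of frames consumed divided by the sample rate. The `AudioClock` struct and function names are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical audio clock driven purely by consumed sample frames. */
typedef struct {
    int64_t  first_pts_us;   /* PTS of the first frame handed to the hardware */
    uint64_t frames_played;  /* total sample frames consumed so far */
    uint32_t sample_rate;    /* e.g. 44100 */
} AudioClock;

/* Called once per render cycle with the number of frames just consumed. */
void audio_clock_advance(AudioClock *clock, uint32_t frames) {
    clock->frames_played += frames;
}

/* Current audio time in microseconds on the media (PTS) timeline. */
int64_t audio_clock_now_us(const AudioClock *clock) {
    uint64_t elapsed_us = clock->frames_played * 1000000ULL / clock->sample_rate;
    return clock->first_pts_us + (int64_t)elapsed_us;
}
```

After 44100 frames at 44100 Hz the clock has advanced by exactly one second, regardless of how irregularly the callbacks were scheduled.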
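Sync strategy: before rendering, video aligns itself to audio. Compute diff = video_pts - audio_pts:
- diff > threshold (video ahead of audio): do not render the current frame yet; hold it until the audio clock catches up with the video PTS.
- diff < -threshold (video behind audio, e.g. after a network stall): drop the stale frame and keep dropping until the video PTS is close to the audio PTS again.
- Otherwise: render the frame.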
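The decision rule above fits in a few lines. This is a minimal sketch; the enum and function names are assumptions made for illustration, and all times are in microseconds.

```c
#include <stdint.h>

typedef enum { SYNC_WAIT, SYNC_DROP, SYNC_RENDER } SyncAction;

/* Decides what to do with the next video frame, given the audio master clock.
   diff > threshold  -> video is ahead, wait
   diff < -threshold -> video is late, drop
   otherwise         -> close enough, render */
SyncAction sync_decide(int64_t video_pts_us, int64_t audio_clock_us, int64_t threshold_us) {
    int64_t diff = video_pts_us - audio_clock_us;
    if (diff > threshold_us) {
        return SYNC_WAIT;
    }
    if (diff < -threshold_us) {
        return SYNC_DROP;
    }
    return SYNC_RENDER;
}
```

A single symmetric threshold keeps the sketch short. A player may prefer a larger threshold on the drop side than on the wait side, so that slightly late frames are still shown.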
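iOS System Framework Overview
The iOS audio/video stack is made up of several core frameworks, each responsible for a specific stage of the pipeline: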
[Figure: layered architecture. Application layer: your app. System framework layer: AVFoundation (capture/playback), VideoToolbox (video codec), AudioToolbox (audio codec), CoreMedia (timestamps/data structures), CoreAnimation (rendering/display). Hardware abstraction layer: camera, microphone, GPU/encoder chip, audio DSP, display, speaker. Two edges are labeled "timestamp sync".]
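Framework responsibilities:
| Framework | Main responsibility | Key APIs |
|---|---|---|
| AVFoundation | Camera/microphone capture, player implementation | AVCaptureSession, AVPlayer |
| VideoToolbox | Hardware video encoding/decoding | VTCompressionSession, VTDecompressionSession |
| AudioToolbox | Audio encoding/decoding, audio units | AudioConverter, AudioUnit |
| CoreMedia | Time model (CMTime), buffer data structures | CMTime, CMSampleBuffer, CVPixelBuffer (defined in CoreVideo) |
| CoreAnimation | Render compositing, display-driven timing | CALayer, CADisplayLink |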
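Audio/Video Pipeline Overview
The full path from capture to presentation reflects the architecture described above: physically separate pipelines, unified by PTS.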
[Figure: sequence diagram. Participants: audio hardware (microphone/speaker), audio app (threads/buffers), network transport (RTP/RTCP), video hardware (camera/GPU), video app (threads/buffers), A/V sync (PTS unification). Capture stage: PCM samples + PTS_A, YUV frames + PTS_V. Encode/transport stage: AAC encoding, H.264 encoding, audio and video packets sent and received. Decode/playback stage: AAC decoding, H.264 decoding; the audio master clock and the video PTS give diff = V-PTS - A-Clock, which drives audio playback and the per-frame decision (wait/drop/render) before video rendering.]
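Key design points:
- Physical isolation: audio and video run independently from hardware up through software
- PTS unification: timestamps are applied at capture and aligned at playback
- Master clock: the audio playback clock is the sync reference
- Frame decision: video waits, drops, or renders according to diff
Audio Pipeline
Audio is characterized by small data volume, strict continuity, extreme sensitivity to latency, and low tolerance for packet loss (lost packets can produce audible pops).
1. Capture
Core process
iOS frameworks: AVAudioEngine (high level) or Audio Unit (kAudioUnitSubType_RemoteIO) (low level).
Core process: the microphone captures an analog signal, which the ADC converts into a raw digital PCM stream.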
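Details:
- Callback mechanism: at the low level, the AURenderCallback calls up into your code at a fixed cadence (for example roughly every 10 ms, depending on the configured I/O buffer duration). The AudioBufferList obtained in the callback must be read or copied immediately. Never take locks or perform slow I/O inside the callback; otherwise the underlying ring buffer overflows and the audio glitches.
- Data layout: distinguish interleaved (left and right samples alternate in one buffer) from non-interleaved (each channel sits in its own AudioBuffer). On iOS the data is often non-interleaved Float32, but the actual layout follows the configured stream format, so check the AudioStreamBasicDescription.
- Timestamp: at this point, record the PTS of the first PCM frame, based on CMClock or mach_absolute_time.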
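One common way to honor the "no locks in the callback" rule is a single-producer/single-consumer ring buffer: the audio callback only writes, a worker thread only reads, and the two sides coordinate through atomic counters. The sketch below uses C11 atomics. `SampleRing`, `RING_CAPACITY`, and the function names are illustrative assumptions, not an iOS API.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_CAPACITY 4096  /* in samples; must be a power of two */

typedef struct {
    float          data[RING_CAPACITY];
    _Atomic size_t head;  /* total samples written; modified by the producer only */
    _Atomic size_t tail;  /* total samples read; modified by the consumer only */
} SampleRing;

void ring_init(SampleRing *ring) {
    atomic_init(&ring->head, 0);
    atomic_init(&ring->tail, 0);
}

/* Producer side (audio callback). Never blocks: if the ring is full,
   the excess samples are not written. Returns the number written. */
size_t ring_write(SampleRing *ring, const float *src, size_t count) {
    size_t head = atomic_load_explicit(&ring->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&ring->tail, memory_order_acquire);
    size_t free_space = RING_CAPACITY - (head - tail);
    if (count > free_space) {
        count = free_space;
    }
    for (size_t i = 0; i < count; i++) {
        ring->data[(head + i) & (RING_CAPACITY - 1)] = src[i];
    }
    /* Publish the samples only after they are fully written. */
    atomic_store_explicit(&ring->head, head + count, memory_order_release);
    return count;
}

/* Consumer side (worker thread). Returns the number of samples read. */
size_t ring_read(SampleRing *ring, float *dst, size_t count) {
    size_t tail = atomic_load_explicit(&ring->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&ring->head, memory_order_acquire);
    size_t available = head - tail;
    if (count > available) {
        count = available;
    }
    for (size_t i = 0; i < count; i++) {
        dst[i] = ring->data[(tail + i) & (RING_CAPACITY - 1)];
    }
    atomic_store_explicit(&ring->tail, tail + count, memory_order_release);
    return count;
}
```

The design only holds for exactly one writer thread and one reader thread. With that constraint, the callback does a bounded copy and two atomic operations, and heavier work such as encoding or file I/O happens on the reader side.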
Data flow diagram
[Figure: microphone (analog signal) → ADC (analog-to-digital conversion) → raw PCM stream (Float32) → AURenderCallback (fixed-cadence callback) → application buffer queue (must copy immediately) → stamp PTS (mach_absolute_time).]
AudioUnit callback core code
objc
// AudioUnit input (recording) callback
static OSStatus recordingCallback(
    void *inRefCon,
    AudioUnitRenderActionFlags *ioActionFlags,
    const AudioTimeStamp *inTimeStamp,
    UInt32 inBusNumber,
    UInt32 inNumberFrames,
    AudioBufferList *ioData // NULL for an input callback; data is pulled below
) {
    // Recover the recorder context
    AudioRecorder *recorder = (__bridge AudioRecorder *)inRefCon;
    // Point a buffer list at preallocated storage
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = recorder->audioFormat.mChannelsPerFrame;
    bufferList.mBuffers[0].mDataByteSize = inNumberFrames * recorder->audioFormat.mBytesPerFrame;
    bufferList.mBuffers[0].mData = recorder->audioBuffer;
    // Pull the PCM data from the AudioUnit
    OSStatus status = AudioUnitRender(
        recorder->audioUnit, // audio unit
        ioActionFlags,       // render flags
        inTimeStamp,         // timestamp
        inBusNumber,         // bus number
        inNumberFrames,      // frame count
        &bufferList          // output buffer
    );
    if (status == noErr) {
        // Key point: take the PTS right away. mHostTime is the capture time of
        // this buffer, in the same units as mach_absolute_time().
        uint64_t pts = inTimeStamp->mHostTime;
        // processAudioBuffer should only copy into a queue (no locks, no I/O)
        [recorder processAudioBuffer:bufferList.mBuffers[0].mData
                          frameCount:inNumberFrames
                                 pts:pts];
    }
    return status;
}
// Configure the AudioUnit for recording
- (void)setupAudioUnit {
    AudioComponentDescription desc = {
        .componentType = kAudioUnitType_Output,
        .componentSubType = kAudioUnitSubType_RemoteIO,
        .componentManufacturer = kAudioUnitManufacturer_Apple,
        .componentFlags = 0,
        .componentFlagsMask = 0
    };
    AudioComponent component = AudioComponentFindNext(NULL, &desc);
    AudioComponentInstanceNew(component, &_audioUnit);
    // Enable recording
    UInt32 enable = 1;
    AudioUnitSetProperty(_audioUnit,
        kAudioOutputUnitProperty_EnableIO,
        kAudioUnitScope_Input,
        1, // input bus
        &enable,
        sizeof(enable)
    );
    // Set the audio format delivered to the app (output scope of the input bus)
    AudioStreamBasicDescription format = {
        .mSampleRate = 44100.0,
        .mFormatID = kAudioFormatLinearPCM,
        .mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,
        .mFramesPerPacket = 1,
        .mChannelsPerFrame = 1, // mono
        .mBitsPerChannel = 32,
        .mBytesPerPacket = 4,
        .mBytesPerFrame = 4
    };
    AudioUnitSetProperty(_audioUnit,
        kAudioUnitProperty_StreamFormat,
        kAudioUnitScope_Output,
        1,
        &format,
        sizeof(format)
    );
    // Install the input callback (global scope, input bus)
    AURenderCallbackStruct callbackStruct;
    callbackStruct.inputProc = recordingCallback;
    callbackStruct.inputProcRefCon = (__bridge void * _Nullable)(self);
    AudioUnitSetProperty(_audioUnit,
        kAudioOutputUnitProperty_SetInputCallback,
        kAudioUnitScope_Global,
        1,
        &callbackStruct,
        sizeof(callbackStruct)
    );
    AudioUnitInitialize(_audioUnit);
    // Capture begins once AudioOutputUnitStart(_audioUnit) is called
}
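Code notes:
- ✅ Handle the data immediately in the callback and never block the audio thread
- ✅ Record a high-precision PTS on the host clock (the mach_absolute_time() time base, which the callback's inTimeStamp->mHostTime also uses)
- ✅ Set an appropriate audio format (Float32 PCM)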
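2. Encoding
Core process
iOS framework: AudioToolbox.framework (AudioConverterFillComplexBuffer).
Core process: compress PCM into a format such as AAC, Opus, or MP3 to fit the available network bandwidth.
Details:
- Hardware acceleration: where a hardware-assisted codec is available, iOS can offload encoding so that CPU usage stays low; otherwise AudioToolbox falls back to a software codec.
- ADTS header handling: when transmitting a raw AAC stream, you typically prepend a 7-byte ADTS (Audio Data Transport Stream) header to every AAC frame by hand. It carries the sample rate, channel count, frame length, and so on, so that the receiver can decode.
- Algorithmic delay: the encoder itself adds look-ahead delay (for AAC-LC typically around 2048 samples, about 46 ms at 44.1 kHz). In RTC scenarios that need the lowest possible latency, Opus is the usual choice, with DTX (discontinuous transmission, which suppresses packets during silence) and FEC (forward error correction) enabled.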
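The delay figures above are plain arithmetic on sample counts. A small sketch, with hypothetical helper names, converts a delay in samples to time and estimates the average encoded packet size for a given bitrate:

```c
#include <stdint.h>

/* Converts a duration expressed in sample frames to microseconds (truncating). */
int64_t samples_to_us(int64_t samples, int32_t sample_rate) {
    return samples * 1000000LL / sample_rate;
}

/* Average encoded payload size in bytes for one packet at a constant bitrate.
   frames_per_packet is 1024 for AAC-LC; 960 corresponds to 20 ms of Opus at 48 kHz. */
int32_t average_packet_bytes(int32_t bitrate_bps, int32_t frames_per_packet, int32_t sample_rate) {
    int64_t bits = (int64_t)bitrate_bps * frames_per_packet / sample_rate;
    return (int32_t)(bits / 8);
}
```

For example, `samples_to_us(2048, 44100)` gives 46439 µs, which is the "about 46 ms" quoted for AAC-LC, and a 64 kbps AAC-LC stream at 44.1 kHz averages roughly 185 bytes per 1024-frame packet.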
AudioConverter configuration core code
objc
// Configure the audio converter (PCM -> AAC)
- (void)setupAudioConverter {
    // Input format: PCM
    AudioStreamBasicDescription inputFormat = {
        .mSampleRate = 44100.0,
        .mFormatID = kAudioFormatLinearPCM,
        .mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,
        .mBytesPerPacket = 4,
        .mFramesPerPacket = 1,
        .mBytesPerFrame = 4,
        .mChannelsPerFrame = 1,
        .mBitsPerChannel = 32,
        .mReserved = 0
    };
    // Output format: AAC
    AudioStreamBasicDescription outputFormat = {
        .mSampleRate = 44100.0,
        .mFormatID = kAudioFormatMPEG4AAC,
        .mChannelsPerFrame = 1
    };
    // Let Core Audio fill in the remaining AAC fields (e.g. 1024 frames per packet)
    UInt32 size = sizeof(outputFormat);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
        0, NULL, &size, &outputFormat);
    // Create the converter
    AudioConverterNew(&inputFormat, &outputFormat, &_audioConverter);
    // Set the codec quality
    UInt32 codecQuality = kAudioConverterQuality_High;
    AudioConverterSetProperty(_audioConverter,
        kAudioConverterCodecQuality,
        sizeof(codecQuality),
        &codecQuality
    );
    // Set the bit rate (optional)
    UInt32 bitRate = 64000; // 64 kbps
    AudioConverterSetProperty(_audioConverter,
        kAudioConverterEncodeBitRate,
        sizeof(bitRate),
        &bitRate
    );
}
// Encode PCM into AAC. The PCM itself is handed to the converter by
// audioConverterInputCallback (not shown), which must supply 1024 frames per packet.
- (void)encodePCM:(AudioBufferList *)pcmBuffer frameCount:(UInt32)frameCount {
    // Size the output buffer from the converter's largest possible packet
    UInt32 outputBufferSize = 0;
    UInt32 propertySize = sizeof(outputBufferSize);
    AudioConverterGetProperty(_audioConverter,
        kAudioConverterPropertyMaximumOutputPacketSize,
        &propertySize,
        &outputBufferSize
    );
    uint8_t *outputBuffer = malloc(outputBufferSize);
    AudioBufferList outputBufferList = {
        .mNumberBuffers = 1,
        .mBuffers = {{
            .mNumberChannels = 1,
            .mDataByteSize = outputBufferSize,
            .mData = outputBuffer
        }}
    };
    // Convert. The in/out argument counts packets, not bytes: request one AAC packet.
    UInt32 ioOutputDataPacketSize = 1;
    AudioStreamPacketDescription packetDescription;
    OSStatus status = AudioConverterFillComplexBuffer(
        _audioConverter,
        audioConverterInputCallback, // input callback
        (__bridge void * _Nullable)(self),
        &ioOutputDataPacketSize,
        &outputBufferList,
        &packetDescription // output packet description
    );
    if (status == noErr && ioOutputDataPacketSize > 0) {
        // Prepend the ADTS header
        NSData *aacData = [NSData dataWithBytes:outputBuffer length:outputBufferList.mBuffers[0].mDataByteSize];
        NSData *adtsData = [self addADTSHeaderToAAC:aacData sampleRate:44100 channels:1];
        // Send or save
        [self sendEncodedFrame:adtsData pts:_currentPTS];
    }
    free(outputBuffer);
}
ADTS header generation code
objc
// Declared up front because it is used before its definition
static int getSamplingFrequencyIndex(int sampleRate);
// Prepend an ADTS header to an AAC frame
- (NSData *)addADTSHeaderToAAC:(NSData *)aacData sampleRate:(int)sampleRate channels:(int)channels {
    int packetLength = (int)aacData.length + 7; // AAC payload + ADTS header length
    uint8_t adts[7];
    // Fixed part
    adts[0] = 0xFF; // syncword (high 8 bits)
    adts[1] = 0xF1; // syncword (low 4 bits) + ID (MPEG-4) + layer + protection_absent = 1
    // Configuration part
    int profile = 2; // AAC LC
    int samplingFrequencyIndex = getSamplingFrequencyIndex(sampleRate); // 44100 Hz -> 4
    int channelConfiguration = channels; // 1 = mono
    adts[2] = ((profile - 1) << 6) + (samplingFrequencyIndex << 2) + (channelConfiguration >> 2);
    // The 13-bit frame length spans adts[3] (top 2 bits), adts[4], and adts[5] (low 3 bits)
    adts[3] = ((channelConfiguration & 3) << 6) + ((packetLength >> 11) & 0x03);
    adts[4] = (packetLength >> 3) & 0xFF;
    adts[5] = ((packetLength & 0x07) << 5) + 0x1F; // + buffer fullness (high bits, all ones)
    adts[6] = 0xFC; // buffer fullness (low bits) + one raw data block per frame
    // Concatenate ADTS + AAC
    NSMutableData *result = [NSMutableData dataWithBytes:adts length:7];
    [result appendData:aacData];
    return result;
}
// Map a sample rate to its ADTS sampling frequency index
static int getSamplingFrequencyIndex(int sampleRate) {
    switch (sampleRate) {
        case 96000: return 0;
        case 88200: return 1;
        case 64000: return 2;
        case 48000: return 3;
        case 44100: return 4;
        case 32000: return 5;
        // ... other sample rates
        default: return 4; // falls back to 44100 Hz
    }
}
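On the receiving side the same 7 bytes have to be read back before the payload can go to a decoder. The following is a minimal parser sketch in C for the header layout written above; `AdtsInfo` and `adts_parse` are hypothetical names, and the code assumes one complete frame starts at `buf`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int profile;              /* audio object type, 2 = AAC-LC */
    int sampling_freq_index;  /* 4 = 44100 Hz */
    int channel_config;       /* 1 = mono, 2 = stereo */
    int frame_length;         /* header + payload, in bytes */
    int header_length;        /* 7, or 9 when a CRC follows the header */
} AdtsInfo;

/* Parses the ADTS header at the start of buf. Returns false if the buffer
   is too short, the syncword is missing, or the length field is implausible. */
bool adts_parse(const uint8_t *buf, size_t len, AdtsInfo *out) {
    if (len < 7) {
        return false;
    }
    /* 12-bit syncword: 0xFFF */
    if (buf[0] != 0xFF || (buf[1] & 0xF0) != 0xF0) {
        return false;
    }
    bool protection_absent = (buf[1] & 0x01) != 0;
    out->profile = ((buf[2] >> 6) & 0x03) + 1;
    out->sampling_freq_index = (buf[2] >> 2) & 0x0F;
    out->channel_config = ((buf[2] & 0x01) << 2) | ((buf[3] >> 6) & 0x03);
    /* 13-bit frame length: 2 bits in buf[3], 8 in buf[4], 3 in buf[5] */
    out->frame_length = ((buf[3] & 0x03) << 11) | (buf[4] << 3) | ((buf[5] >> 5) & 0x07);
    out->header_length = protection_absent ? 7 : 9;
    return out->frame_length >= out->header_length;
}
```

The raw AAC payload is then `buf + header_length`, with `frame_length - header_length` bytes.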
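Encoder comparison
| Codec | Encoding delay | Bitrate range | Typical use | Encoder availability on iOS |
|---|---|---|---|---|
| AAC-LC | ~46 ms | 32-320 kbps | General audio streaming | ✅ Built in (AudioToolbox) |
| AAC-ELD | ~15 ms | 16-64 kbps | Real-time communication (patent-encumbered) | ✅ Built in (AudioToolbox) |
| Opus | ~20 ms | 6-510 kbps | WebRTC, real-time streaming (open source) | ❌ Software encoding (e.g. libopus) |
| MP3 | ~50 ms | 32-320 kbps | Compatibility requirements | ❌ No system encoder (decode only); requires a third-party library |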
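3. Transport
Core process
Core process: encapsulate the encoded audio frames in a network protocol stack (such as RTP/RTCP or RTMP).
Details:
- Audio packets are usually small (10 ms of Opus is only tens to a few hundred bytes), so each one typically goes into a single UDP datagram.
- Weak-network strategy: audio is generally not retransmitted (a retransmitted packet may arrive after its playout deadline and only add jitter). When packets are lost, the receiver relies on PLC (Packet Loss Concealment) to interpolate a replacement and avoid pops.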
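As an illustration of the encapsulation step, the sketch below writes the fixed 12-byte RTP header (version 2, no padding, no extension, no CSRC entries) in front of an audio payload. The function name is hypothetical. The sequence number lets the receiver detect loss and reordering; the timestamp carries the media time in sample-clock units.

```c
#include <stddef.h>
#include <stdint.h>

#define RTP_HEADER_SIZE 12

/* Writes a fixed RTP header into buf using network byte order.
   Returns RTP_HEADER_SIZE, or 0 if the buffer is too small. */
size_t rtp_write_header(uint8_t *buf, size_t buf_len,
                        uint8_t payload_type, int marker,
                        uint16_t sequence, uint32_t timestamp, uint32_t ssrc) {
    if (buf_len < RTP_HEADER_SIZE) {
        return 0;
    }
    buf[0] = 0x80;  /* V=2, P=0, X=0, CC=0 */
    buf[1] = (uint8_t)((marker ? 0x80 : 0x00) | (payload_type & 0x7F));
    buf[2] = (uint8_t)(sequence >> 8);
    buf[3] = (uint8_t)(sequence & 0xFF);
    buf[4] = (uint8_t)(timestamp >> 24);
    buf[5] = (uint8_t)(timestamp >> 16);
    buf[6] = (uint8_t)(timestamp >> 8);
    buf[7] = (uint8_t)(timestamp);
    buf[8] = (uint8_t)(ssrc >> 24);
    buf[9] = (uint8_t)(ssrc >> 16);
    buf[10] = (uint8_t)(ssrc >> 8);
    buf[11] = (uint8_t)(ssrc);
    return RTP_HEADER_SIZE;
}
```

For 10 ms Opus packets on a 48 kHz RTP clock, a sender would increase `sequence` by 1 and `timestamp` by 480 for each packet. The payload type is negotiated out of band, so any value used here is an assumption.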
How the Jitter Buffer Works
[Figure: network side: packets arrive unevenly (some fast, some slow) → buffer: Jitter Buffer (200 ms capacity) with a queue sorted by PTS → playback side: audio player consuming at a constant rate, with feedback that dynamically adjusts the buffer.]
Jitter Buffer dynamic adjustment, pseudocode:
objc
// Dynamic jitter buffer adjustment (similar in spirit to NetEQ)
@interface JitterBuffer : NSObject
@property (nonatomic, assign) int bufferSize;    // current buffer size (ms)
@property (nonatomic, assign) int minBufferSize; // minimum buffer (50 ms)
@property (nonatomic, assign) int maxBufferSize; // maximum buffer (500 ms)
@property (nonatomic, assign) int targetDelay;   // target delay (200 ms)
@property (nonatomic, strong) NSMutableArray *packets; // packet queue (sorted by PTS)
@end
@implementation JitterBuffer
- (instancetype)init {
    self = [super init];
    if (self) {
        _bufferSize = 200; // start at 200 ms
        _minBufferSize = 50;
        _maxBufferSize = 500;
        _targetDelay = 200;
        _packets = [NSMutableArray array];
    }
    return self;
}
// Receive a packet from the network
- (void)pushPacket:(AudioPacket *)packet {
    // Insert into the queue in PTS order
    [self insertSorted:packet];
    // Measure how much audio is currently buffered
    int currentDelay = [self calculateCurrentDelay];
    // Adjust the buffer size dynamically
    if (currentDelay > _targetDelay + 50) {
        // Too much buffered: consume faster (time-compress the audio)
        _bufferSize = MAX(_minBufferSize, _bufferSize - 20);
        [self acceleratePlayback];
    } else if (currentDelay < _targetDelay - 50) {
        // Too little buffered: consume slower (time-stretch the audio)
        _bufferSize = MIN(_maxBufferSize, _bufferSize + 20);
        [self deceleratePlayback];
    }
}
// Take the packet that is due for playback
- (AudioPacket *)popPacketForTime:(uint64_t)currentTime {
    // The queue is sorted by PTS, so only the head can be due.
    // (Removing elements while fast-enumerating the array would raise an exception.)
    AudioPacket *packet = _packets.firstObject;
    if (packet && packet.pts <= currentTime) {
        [_packets removeObjectAtIndex:0];
        return packet;
    }
    // Nothing is due: run PLC (packet loss concealment)
    return [self generatePLCFrame];
}
// Packet loss concealment (placeholder)
- (AudioPacket *)generatePLCFrame {
    // A real implementation interpolates from neighboring frames and is more
    // complex, typically using frequency-domain analysis.
    // This placeholder only produces comfort noise.
    AudioPacket *comfortNoise = [self makeComfortNoisePacket]; // hypothetical helper
    return comfortNoise;
}
@end
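4. Decoding
Core process
iOS framework: AudioToolbox.framework (AudioConverter) or a third-party library (such as libopus).
Core process: decompress the received data back into PCM.
Details:
- The receiver must maintain a Jitter Buffer. Because packets arrive unevenly, the Jitter Buffer holds a certain duration of audio (e.g. 200 ms), re-sorts the packets by PTS, and feeds them to the decoder at a steady rate, smoothing out network jitter.
- NetEQ (a core WebRTC component) adjusts the Jitter Buffer length dynamically: shorter when the network is good, to cut latency; longer when it is poor, to avoid stalls. It also time-stretches audio when the buffer is nearly empty and speeds it up when the buffer is nearly full.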
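To adjust a buffer dynamically, the receiver first needs a number for "how jittery is the network right now". One widely used measure is the interarrival jitter estimate from the RTP specification (RFC 3550): a running average of how much the transit time changes from one packet to the next, smoothed with a gain of 1/16. The sketch below implements that estimator. The mapping from jitter to a target delay (`jitter_target_delay_ms`, four times the jitter, clamped) is an illustrative assumption and is not how NetEQ actually works.

```c
typedef struct {
    int    has_prev;         /* 0 until the first packet has been seen */
    double prev_transit_ms;  /* arrival time minus send time of the previous packet */
    double jitter_ms;        /* smoothed interarrival jitter */
} JitterEstimator;

void jitter_init(JitterEstimator *est) {
    est->has_prev = 0;
    est->prev_transit_ms = 0.0;
    est->jitter_ms = 0.0;
}

/* Feeds one packet. send_ms comes from the packet timestamp, arrival_ms from
   the local clock; the two clocks need not share an origin because only
   differences of transit times are used. */
void jitter_update(JitterEstimator *est, double send_ms, double arrival_ms) {
    double transit = arrival_ms - send_ms;
    if (est->has_prev) {
        double d = transit - est->prev_transit_ms;
        if (d < 0) {
            d = -d;
        }
        est->jitter_ms += (d - est->jitter_ms) / 16.0;  /* RFC 3550: J += (|D| - J) / 16 */
    }
    est->prev_transit_ms = transit;
    est->has_prev = 1;
}

/* Illustrative heuristic (an assumption): buffer four times the jitter,
   clamped to [min_ms, max_ms]. */
int jitter_target_delay_ms(const JitterEstimator *est, int min_ms, int max_ms) {
    int target = (int)(4.0 * est->jitter_ms);
    if (target < min_ms) {
        target = min_ms;
    }
    if (target > max_ms) {
        target = max_ms;
    }
    return target;
}
```

With the 50 ms and 500 ms bounds used in the pseudocode above, a quiet network keeps the target at the 50 ms floor, and the target grows as the measured jitter rises.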
Decoder core code
objc
// AAC decoder
@interface AudioDecoder : NSObject
@property (nonatomic, assign) AudioConverterRef decoder; // opaque Core Audio type, not an Objective-C object
@property (nonatomic, assign) AudioStreamBasicDescription outputFormat; // PCM format
@end
@implementation AudioDecoder
- (void)setupDecoder {
    // Input format: AAC
    AudioStreamBasicDescription inputFormat = {
        .mSampleRate = 44100.0,
        .mFormatID = kAudioFormatMPEG4AAC,
        .mChannelsPerFrame = 1
    };
    // Let Core Audio fill in the remaining AAC fields (e.g. frames per packet)
    UInt32 size = sizeof(inputFormat);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
        0, NULL, &size, &inputFormat);
    // Output format: PCM
    _outputFormat = (AudioStreamBasicDescription){
        .mSampleRate = 44100.0,
        .mFormatID = kAudioFormatLinearPCM,
        .mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,
        .mFramesPerPacket = 1,
        .mChannelsPerFrame = 1,
        .mBitsPerChannel = 32,
        .mBytesPerPacket = 4,
        .mBytesPerFrame = 4
    };
    // Create the decoder
    AudioConverterNew(&inputFormat, &_outputFormat, &_decoder);
    // AudioConverter takes raw AAC packets and does not parse ADTS headers,
    // so the header is stripped in -decodeAAC: before the data is fed in.
}
// Decode one AAC frame; returns the PCM bytes, or nil on failure
- (NSData *)decodeAAC:(NSData *)aacData {
    // Strip the ADTS header (if present)
    NSData *pureAAC = [self stripADTSHeader:aacData];
    // Prepare the output buffer
    UInt32 outputBufferSize = 4096; // 1024 frames x 4 bytes (one AAC-LC packet, mono Float32)
    NSMutableData *pcm = [NSMutableData dataWithLength:outputBufferSize];
    AudioBufferList outputBufferList = {
        .mNumberBuffers = 1,
        .mBuffers = {{
            .mNumberChannels = 1,
            .mDataByteSize = outputBufferSize,
            .mData = pcm.mutableBytes
        }}
    };
    // Decode (for PCM output, one packet is one frame)
    UInt32 ioOutputDataPacketSize = outputBufferSize / _outputFormat.mBytesPerPacket;
    OSStatus status = AudioConverterFillComplexBuffer(
        _decoder,
        audioDecoderInputCallback,
        (__bridge void * _Nullable)(pureAAC),
        &ioOutputDataPacketSize,
        &outputBufferList,
        NULL
    );
    if (status != noErr) {
        return nil;
    }
    // Returning an object avoids handing back a pointer to a stack-allocated
    // AudioBufferList. Trim to the bytes actually produced.
    pcm.length = outputBufferList.mBuffers[0].mDataByteSize;
    return pcm;
}
@end
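5. Playback
Core process
iOS frameworks: AVAudioEngine or the lower-level Audio Unit (kAudioUnitSubType_RemoteIO).
Core process: push PCM data to the audio hardware driver for output.
Details:
- The audio hardware consumes PCM at a fixed sample rate, and this constant-rate consumption forms the audio master clock for the whole system.
- The hardware playback timestamp serves as the reference anchor for A/V sync. AudioDeviceGetCurrentTime is a macOS HAL call; on iOS the usual source is the AudioTimeStamp (mSampleTime, mHostTime) passed to the render callback.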
Audio master clock code
objc
// Get the audio playback timestamp (the A/V sync reference).
// Requires #include <mach/mach_time.h>. _lastRenderHostTime is an ivar
// introduced here for illustration; playbackCallback below updates it.
- (CMTime)getCurrentAudioTime {
    // kAudioUnitProperty_CurrentPlayTime belongs to AUScheduledSoundPlayer-style
    // units rather than RemoteIO, so the render timestamp is used instead.
    uint64_t hostTime = _lastRenderHostTime;
    if (hostTime == 0) {
        return kCMTimeInvalid; // audio has not started
    }
    // Convert host ticks to nanoseconds, then to CMTime
    mach_timebase_info_data_t timebase;
    mach_timebase_info(&timebase);
    uint64_t nanos = hostTime * timebase.numer / timebase.denom;
    return CMTimeMake((int64_t)nanos, 1000000000);
}
// Audio playback callback
static OSStatus playbackCallback(
    void *inRefCon,
    AudioUnitRenderActionFlags *ioActionFlags,
    const AudioTimeStamp *inTimeStamp,
    UInt32 inBusNumber,
    UInt32 inNumberFrames,
    AudioBufferList *ioData
) {
    AudioPlayer *player = (__bridge AudioPlayer *)inRefCon;
    // Remember the host time of this render cycle
    player->_lastRenderHostTime = inTimeStamp->mHostTime;
    // Take data from the decoded queue
    AudioBufferList *bufferList = [player dequeueBufferWithFrameCount:inNumberFrames];
    for (UInt32 i = 0; i < ioData->mNumberBuffers; i++) {
        AudioBuffer *dst = &ioData->mBuffers[i];
        if (bufferList && i < bufferList->mNumberBuffers) {
            // Copy into the output buffer without overrunning it,
            // and pad any shortfall with silence
            UInt32 n = MIN(dst->mDataByteSize, bufferList->mBuffers[i].mDataByteSize);
            memcpy(dst->mData, bufferList->mBuffers[i].mData, n);
            memset((uint8_t *)dst->mData + n, 0, dst->mDataByteSize - n);
        } else {
            // No data: play silence
            memset(dst->mData, 0, dst->mDataByteSize);
        }
    }
    return noErr;
}
Sync decision code:
```objc
// Audio/video sync decision
- (void)syncVideoWithAudio {
    // Read the current audio playback time
    CMTime audioTime = [self getCurrentAudioTime];
    if (CMTIME_IS_INVALID(audioTime)) {
        return; // audio has not started playing
    }
    // Peek at the PTS of the earliest frame in the video queue
    VideoFrame *frame = [self peekNextVideoFrame];
    if (!frame) {
        return;
    }
    // Compute the time difference
    CMTime diff = CMTimeSubtract(frame.pts, audioTime);
    double diffSeconds = CMTimeGetSeconds(diff);
    const double kThreshold = 0.01; // 10 ms threshold
    if (diffSeconds > kThreshold) {
        // Video is ahead: leave the frame queued and wait
        NSLog(@"Video ahead, waiting %.0fms", diffSeconds * 1000);
        [self skipVideoRender];
    } else if (diffSeconds < -kThreshold) {
        // Video is behind: drop the frame
        NSLog(@"Video behind, dropping frame");
        [self dropNextVideoFrame];
    } else {
        // Close enough: render
        [self renderVideoFrame:frame];
    }
}
```
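Video pipeline
Video has four defining traits: large data volume, long per-frame processing time, tolerance for dropped frames (a drop only shows up as stutter), and strong inter-frame dependencies (I/P/B frames).
1. Capture
Core process
iOS framework: AVFoundation (AVCaptureSession + AVCaptureVideoDataOutput).
Core process: the camera captures light, and the ISP processes it into uncompressed YUV image data.
Details:
- Format and zero copy: the iOS camera delivers frames as CVPixelBufferRef (a typedef of CVImageBufferRef), usually in kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange (NV12). Avoid converting to RGB at the capture stage, to save CPU time and memory bandwidth.
- Timestamps: the CMSampleBufferRef delivered to the AVCaptureVideoDataOutput callback carries a precise CMTime PTS taken from the capture clock. Preserve it unchanged through the whole pipeline.
- Frame dropping: set alwaysDiscardsLateVideoFrames = YES. When processing is slower than capture, the system discards late frames, which prevents memory growth and accumulated latency.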
AVCaptureSession configuration code
```objc
#import <AVFoundation/AVFoundation.h>

@interface VideoCapture : NSObject <AVCaptureVideoDataOutputSampleBufferDelegate>
@property (nonatomic, strong) AVCaptureSession *session;
@property (nonatomic, strong) AVCaptureVideoDataOutput *videoOutput;
@property (nonatomic, strong) dispatch_queue_t videoQueue;
@end

@implementation VideoCapture

- (void)setupCamera {
    // Create the session
    _session = [[AVCaptureSession alloc] init];
    _session.sessionPreset = AVCaptureSessionPreset1280x720; // 720p
    // Get the back camera
    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithDeviceType:AVCaptureDeviceTypeBuiltInWideAngleCamera
                                                                 mediaType:AVMediaTypeVideo
                                                                  position:AVCaptureDevicePositionBack];
    NSError *error = nil;
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:&error];
    if (!input) {
        NSLog(@"Failed to open camera: %@", error);
        return;
    }
    // Configure the video output
    _videoOutput = [[AVCaptureVideoDataOutput alloc] init];
    _videoOutput.videoSettings = @{
        (id)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange) // NV12
    };
    // Key setting: drop frames that arrive while the delegate is still busy
    _videoOutput.alwaysDiscardsLateVideoFrames = YES;
    // Create a dedicated serial queue for frame delivery
    _videoQueue = dispatch_queue_create("com.example.video", DISPATCH_QUEUE_SERIAL);
    [_videoOutput setSampleBufferDelegate:self queue:_videoQueue];
    // Add input and output to the session
    [_session beginConfiguration];
    if ([_session canAddInput:input]) {
        [_session addInput:input];
    }
    if ([_session canAddOutput:_videoOutput]) {
        [_session addOutput:_videoOutput];
    }
    [_session commitConfiguration];
    // Start the session. startRunning blocks until capture has started,
    // so call setupCamera off the main thread.
    [_session startRunning];
}

// Video frame callback
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    // Key step: read the PTS immediately
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    // Get the CVPixelBuffer
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (!pixelBuffer) {
        return;
    }
    // Lock the base address before touching pixel data on the CPU
    CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
    // Get the YUV planes. Rows may be padded: use
    // CVPixelBufferGetBytesPerRowOfPlane for the stride, not the width.
    void *yPlane = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);  // Y plane
    void *uvPlane = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1); // interleaved CbCr plane
    size_t yWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0);
    size_t yHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0);
    size_t uvWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer, 1);
    size_t uvHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 1);
    // Process or encode (processVideoFrame:... is this class's own helper)
    [self processVideoFrame:yPlane
                    uvPlane:uvPlane
                     yWidth:yWidth
                    yHeight:yHeight
                    uvWidth:uvWidth
                   uvHeight:uvHeight
                        pts:pts];
    // Unlock
    CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
}

@end
```
CVPixelBufferRef memory layout
[Figure: memory layout of an NV12 CVPixelBufferRef. Y plane (luma, 720 x 480 bytes), base address from CVPixelBufferGetBaseAddressOfPlane 0; UV plane (chroma, interleaved, 360 x 240 CbCr pairs), base address from CVPixelBufferGetBaseAddressOfPlane 1. Both planes are backed by an IOSurface, whose handle can be shared across processes and accessed directly by the GPU (zero copy).]
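Zero-copy mechanism:
- CVPixelBufferRef is backed by an IOSurface, so its memory can be shared between processes.
- The encoder, the decoder, and the GPU operate on that same memory directly, with no CPU copy. iOS devices use unified memory, so there is no separate video RAM to transfer to.
- This saves memory bandwidth and CPU time.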
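The NV12 layout can be checked with a little arithmetic: the Y plane holds one byte per pixel, and the UV plane holds one interleaved CbCr pair per 2 x 2 pixel block, so a tightly packed frame needs 1.5 bytes per pixel. The sketch below computes plane sizes from strides in plain C; Nv12Layout and nv12_layout are hypothetical names, and on a device the strides would come from CVPixelBufferGetBytesPerRowOfPlane.

```c
#include <assert.h>
#include <stddef.h>

/* Plane sizes of one NV12 frame (illustrative only). */
typedef struct {
    size_t y_bytes;     /* luma plane, one byte per pixel plus row padding */
    size_t uv_bytes;    /* chroma plane, interleaved Cb/Cr at half resolution */
    size_t total_bytes;
} Nv12Layout;

/* Strides are bytes per row and may exceed the visible width (padding). */
Nv12Layout nv12_layout(size_t width, size_t height,
                       size_t y_stride, size_t uv_stride) {
    /* Each chroma row holds ceil(width / 2) CbCr pairs of 2 bytes each */
    size_t chroma_row_bytes = ((width + 1) / 2) * 2;
    size_t chroma_rows = (height + 1) / 2;
    assert(y_stride >= width);
    assert(uv_stride >= chroma_row_bytes);

    Nv12Layout layout;
    layout.y_bytes = y_stride * height;
    layout.uv_bytes = uv_stride * chroma_rows;
    layout.total_bytes = layout.y_bytes + layout.uv_bytes;
    return layout;
}
```

For a tightly packed 720 x 480 frame this gives 345600 bytes of luma and 172800 bytes of chroma. With padded rows the totals grow, which is why per-row copies must use the stride rather than the width.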
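2. Encoding
Core process
iOS framework: VideoToolbox.framework (VTCompressionSessionCreate).
Core process: compress YUV into an H.264 (AVC) or H.265 (HEVC) bitstream.
Details:
- Hardware encoding: iOS passes the IOSurface handle behind the CVPixelBufferRef straight to the GPU or the dedicated encoder hardware, with no CPU copy.
- Rate control: CBR (constant bitrate), VBR (variable bitrate), and ABR (average bitrate) are supported. RTC scenarios mostly use CBR, or VBR with an upper cap, to avoid congesting the network.
- GOP and keyframes: the GOP (group of pictures) size must be configured. Too large, and the stream recovers poorly on weak networks (a lost packet leaves the picture broken for a long time); too small, and the bitrate balloons. Live streaming typically uses 2 to 4 seconds. On a weak network, the receiver can send an RTCP FIR (Full Intra Request) to force the sender to encode an I frame immediately.
- B frames and the DTS/PTS split: with B frames enabled (bidirectional prediction, better compression but added delay), the encoder emits frames in decode order (DTS), which differs from display order (PTS). Real-time communication usually disables B frames for low latency, in which case DTS == PTS.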
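Comparison of the four main video codecs
| Feature | H.264 (AVC) | H.265 (HEVC) | VP9 | AV1 (emerging) |
|---|---|---|---|---|
| Year introduced | 2003 | 2013 | 2013 | 2018 (widespread only in recent years) |
| Compression efficiency | Baseline (100% size) | About 50% smaller than H.264 | About 45% smaller than H.264 | A further 30%+ smaller than H.265 |
| Patent licensing | Low / effectively free | Very high and complex | Royalty-free (open) | Royalty-free (open) |
| Hardware decoding coverage (rough estimate) | ~100% (all platforms) | ~95% (standard on modern devices) | ~90% (mainly web and Android) | Growing quickly (supported by mainstream new devices) |
| Main use cases | Surveillance, legacy devices, ordinary live streaming | 4K Blu-ray, mainstream video apps in China, film post-production | YouTube, Google ecosystem | Next-generation industry standard, ultra-HD streaming, AI-generated video |
| Computational complexity | Very low | Fairly high | Fairly high | Very high (software decoding is also CPU-heavy) |
Older phones may lack AV1 support (on Apple hardware, AV1 hardware decoding arrived with the A17 Pro and M3 chips). Large platforms therefore usually keep several compatible renditions: AV1 is served first where it is supported, because it saves bandwidth, with H.265 as the fallback. This requires multi-rendition transcoding on the backend, which costs extra storage and compute, so in practice H.265 remains the default and AV1 is reserved for very popular videos.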
VideoToolbox initialization code
```objc
#import <VideoToolbox/VideoToolbox.h>

@interface VideoEncoder : NSObject
@property (nonatomic, assign) VTCompressionSessionRef compressionSession;
@property (nonatomic, assign) int width;
@property (nonatomic, assign) int height;
@end

// Forward declaration of the output callback defined below
static void encodingCallback(void *outputCallbackRefCon,
                             void *sourceFrameRefCon,
                             OSStatus status,
                             VTEncodeInfoFlags infoFlags,
                             CMSampleBufferRef sampleBuffer);

@implementation VideoEncoder

- (void)setupEncoder:(int)width height:(int)height {
    _width = width;
    _height = height;
    // Codec selection (a FourCC, not a CFString)
    CMVideoCodecType codecType = kCMVideoCodecType_H264; // H.264
    // Create the compression session
    OSStatus status = VTCompressionSessionCreate(
        NULL,                    // allocator
        width,                   // width
        height,                  // height
        codecType,               // codec type
        NULL,                    // encoderSpecification
        NULL,                    // sourceImageBufferAttributes
        NULL,                    // compressedDataAllocator
        encodingCallback,        // outputCallback
        (__bridge void *)self,   // outputCallbackRefCon
        &_compressionSession
    );
    if (status != noErr) {
        NSLog(@"Failed to create compression session: %d", (int)status);
        return;
    }
    // Configure encoding parameters
    [self configureEncoder];
    // Let the encoder allocate its resources up front
    VTCompressionSessionPrepareToEncodeFrames(_compressionSession);
}

- (void)configureEncoder {
    // Real-time encoding (low latency)
    VTSessionSetProperty(
        _compressionSession,
        kVTCompressionPropertyKey_RealTime,
        kCFBooleanTrue
    );
    // Target bitrate. AverageBitRate is a long-term average (ABR), not
    // strict CBR; the hard cap is set separately below.
    int bitrate = 2000000; // 2 Mbps
    CFNumberRef bitrateRef = CFNumberCreate(NULL, kCFNumberIntType, &bitrate);
    VTSessionSetProperty(
        _compressionSession,
        kVTCompressionPropertyKey_AverageBitRate,
        bitrateRef
    );
    CFRelease(bitrateRef);
    // Bitrate cap. DataRateLimits takes an array of (bytes, seconds) pairs,
    // not a single number: here at most 2.5 Mbit in any 1-second window.
    int bitrateLimit = 2500000; // 2.5 Mbps
    NSArray *limits = @[@(bitrateLimit / 8), @1.0];
    VTSessionSetProperty(
        _compressionSession,
        kVTCompressionPropertyKey_DataRateLimits,
        (__bridge CFArrayRef)limits
    );
    // Expected frame rate
    int frameRate = 30;
    CFNumberRef frameRateRef = CFNumberCreate(NULL, kCFNumberIntType, &frameRate);
    VTSessionSetProperty(
        _compressionSession,
        kVTCompressionPropertyKey_ExpectedFrameRate,
        frameRateRef
    );
    CFRelease(frameRateRef);
    // Keyframe interval (GOP size)
    int keyFrameInterval = 120; // 4 seconds (120 / 30 fps)
    CFNumberRef intervalRef = CFNumberCreate(NULL, kCFNumberIntType, &keyFrameInterval);
    VTSessionSetProperty(
        _compressionSession,
        kVTCompressionPropertyKey_MaxKeyFrameInterval,
        intervalRef
    );
    CFRelease(intervalRef);
    // Disable B frames (real-time communication needs low latency)
    VTSessionSetProperty(
        _compressionSession,
        kVTCompressionPropertyKey_AllowFrameReordering,
        kCFBooleanFalse
    );
    // H.264 profile
    VTSessionSetProperty(
        _compressionSession,
        kVTCompressionPropertyKey_ProfileLevel,
        kVTProfileLevel_H264_Baseline_AutoLevel
    );
}

// Encoder output callback
static void encodingCallback(
    void *outputCallbackRefCon,
    void *sourceFrameRefCon,
    OSStatus status,
    VTEncodeInfoFlags infoFlags,
    CMSampleBufferRef sampleBuffer
) {
    if (status != noErr || !sampleBuffer) {
        return;
    }
    VideoEncoder *encoder = (__bridge VideoEncoder *)outputCallbackRefCon;
    // A frame is a keyframe when its attachments lack the NotSync key
    BOOL isKeyFrame = YES;
    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
    if (attachments && CFArrayGetCount(attachments) > 0) {
        CFDictionaryRef attachment = (CFDictionaryRef)CFArrayGetAtIndex(attachments, 0);
        isKeyFrame = !CFDictionaryContainsKey(attachment, kCMSampleAttachmentKey_NotSync);
    }
    // Get the encoded data
    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t totalLength = 0;
    char *dataPointer = NULL;
    if (CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &totalLength, &dataPointer) != kCMBlockBufferNoErr) {
        return;
    }
    // Extract SPS/PPS on keyframes. VideoToolbox keeps them in the sample
    // buffer's format description
    // (CMVideoFormatDescriptionGetH264ParameterSetAtIndex), not in the data.
    // extractSPSPPSFromSampleBuffer: and sendEncodedFrame:... are this
    // class's own helpers, not shown here.
    if (isKeyFrame) {
        [encoder extractSPSPPSFromSampleBuffer:sampleBuffer];
    }
    // Send the encoded data
    NSData *encodedData = [NSData dataWithBytes:dataPointer length:totalLength];
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    [encoder sendEncodedFrame:encodedData isKeyFrame:isKeyFrame pts:pts];
}

// Encode one frame
- (void)encodeFrame:(CVPixelBufferRef)pixelBuffer pts:(CMTime)pts {
    // The session accepts the CVPixelBuffer directly; no CMSampleBuffer or
    // format description is needed on the input side.
    OSStatus status = VTCompressionSessionEncodeFrame(
        _compressionSession,
        pixelBuffer,
        pts,
        kCMTimeInvalid, // duration
        NULL,           // frameProperties
        NULL,           // sourceFrameRefCon
        NULL            // infoFlagsOut
    );
    if (status != noErr) {
        NSLog(@"Encoding failed: %d", (int)status);
    }
}

// Release resources
- (void)dealloc {
    if (_compressionSession) {
        // Flush pending frames before tearing the session down
        VTCompressionSessionCompleteFrames(_compressionSession, kCMTimeInvalid);
        VTCompressionSessionInvalidate(_compressionSession);
        CFRelease(_compressionSession);
        _compressionSession = NULL;
    }
}

@end
```
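Rate control parameter reference
| Scenario | Resolution | Frame rate | Target bitrate | Max bitrate | GOP size | B frames | Profile |
|---|---|---|---|---|---|---|---|
| RTC call | 480p | 15 fps | 500 Kbps | 750 Kbps | 30 frames (2 s) | ❌ | Baseline |
| Live streaming | 720p | 30 fps | 2-4 Mbps | 5 Mbps | 90 frames (3 s) | ❌ | Main |
| HD recording | 1080p | 60 fps | 8-12 Mbps | 15 Mbps | 240 frames (4 s) | ✅ | High |
GOP configuration suggestions
Weak network (high packet loss):
├── GOP size: 1-2 seconds (more frequent I frames for recovery)
├── Bitrate: constant (CBR)
└── Keyframes: requested on demand (FIR)
Good network:
├── GOP size: 4-8 seconds (higher compression)
├── Bitrate: variable (VBR)
└── Keyframes: generated at the configured interval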
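3. Transport
Core process
Core process: the video bitstream (NAL units) is packetized and sent.
Details:
- Format conversion: the VideoToolbox hardware encoder prefixes each H.264 NALU with a 4-byte big-endian length (the AVCC format, which iOS prefers). Byte-stream targets such as MPEG-TS or raw .h264 files need the Annex B format instead, where each NALU is preceded by a start code (00 00 00 01). RTP (RFC 6184) carries bare NALUs with neither a length prefix nor a start code, so the prefix has to be stripped in either case.
- Fragmentation: a video frame (especially an I frame) is often larger than the MTU (typically 1500 bytes on Ethernet), so it must be split at the application layer into several packets, for example with RTP's FU-A fragmentation.
- Congestion control on weak networks: the sender adjusts the encoder's resolution, frame rate, and bitrate according to the bandwidth estimate fed back by the receiver (for example WebRTC's GCC algorithm).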
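One frame contains multiple NALUs
Key concept: a single encoded frame (especially an I frame) usually consists of several NALUs. This is the standard structure of H.264/H.265 streams.
Typical I frame structure:
┌──────────────────────────────┐
│ SPS NALU │ ← sequence parameter set
├──────────────────────────────┤
│ PPS NALU │ ← picture parameter set
├──────────────────────────────┤
│ IDR Slice NALU │ ← keyframe image data
│ (possibly several slices) │
└──────────────────────────────┘
NALU types (H.264):
- NALU 7 (SPS): sequence parameter set, defines the stream-level coding configuration
- NALU 8 (PPS): picture parameter set, defines the picture-level configuration
- NALU 5 (IDR): slice of a keyframe (I frame)
- NALU 1 (non-IDR slice): slice of a P frame (or B frame)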
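The type numbers above come from the one-byte H.264 NAL header, which packs three fields: forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), and nal_unit_type (5 bits). A minimal sketch of decoding that byte in plain C follows; the struct and function names are hypothetical, and H.265 uses a different two-byte header that this code does not handle.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the one-byte H.264 NAL unit header (illustrative only). */
typedef struct {
    uint8_t forbidden_zero_bit; /* must be 0 in a valid stream */
    uint8_t nal_ref_idc;        /* 0 means no other picture depends on this NALU */
    uint8_t nal_unit_type;      /* 1 = non-IDR slice, 5 = IDR, 7 = SPS, 8 = PPS */
} H264NalHeader;

H264NalHeader h264_parse_nal_header(uint8_t byte) {
    H264NalHeader header;
    header.forbidden_zero_bit = (uint8_t)((byte >> 7) & 0x01);
    header.nal_ref_idc        = (uint8_t)((byte >> 5) & 0x03);
    header.nal_unit_type      = (uint8_t)(byte & 0x1F);
    return header;
}

/* Human-readable name for the types discussed in the text. */
const char *h264_nal_type_name(uint8_t nal_unit_type) {
    switch (nal_unit_type) {
        case 1:  return "non-IDR slice";
        case 5:  return "IDR slice";
        case 6:  return "SEI";
        case 7:  return "SPS";
        case 8:  return "PPS";
        default: return "other";
    }
}
```

For example, the common header byte 0x67 decodes to nal_ref_idc 3 and type 7 (SPS), and 0x65 decodes to type 5 (IDR slice).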
Handling in code:
```objc
// Walk all NALUs in one CMSampleBuffer
- (void)parseAllNALUsInSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t totalLength = 0;
    char *dataPointer = NULL;
    if (CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &totalLength, &dataPointer) != kCMBlockBufferNoErr) {
        return;
    }
    size_t offset = 0;
    while (offset + 4 < totalLength) {
        // Read the NALU length (AVCC: 4-byte big-endian prefix).
        // memcpy avoids an unaligned 32-bit load.
        uint32_t naluLength = 0;
        memcpy(&naluLength, dataPointer + offset, 4);
        naluLength = CFSwapInt32BigToHost(naluLength);
        offset += 4;
        if (naluLength == 0 || naluLength > totalLength - offset) {
            break; // malformed length, stop parsing
        }
        // Read the NALU header and determine the type
        uint8_t naluHeader = (uint8_t)dataPointer[offset];
        uint8_t naluType = naluHeader & 0x1F; // the low 5 bits are the type
        switch (naluType) {
            case 7: // SPS
                NSLog(@"Found SPS, length: %u", naluLength);
                [self saveSPS:dataPointer + offset length:naluLength];
                break;
            case 8: // PPS
                NSLog(@"Found PPS, length: %u", naluLength);
                [self savePPS:dataPointer + offset length:naluLength];
                break;
            case 5: // IDR slice (I frame)
            case 1: // non-IDR slice (P frame)
                NSLog(@"Found Slice, type: %d, length: %u", naluType, naluLength);
                [self processSlice:dataPointer + offset length:naluLength];
                break;
            default: // SEI, AUD, and other types are ignored here
                break;
        }
        offset += naluLength;
    }
}
```
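Key points:
- ✅ For I frames, extract and cache the SPS/PPS first; the decoder needs them for initialization
- ✅ One CMSampleBuffer may contain three or more NALUs
- ✅ When converting to Annex B, every NALU must get its own start code
- ✅ SPS/PPS can be cached on the sender and only need to be sent with keyframes or at initialization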
NALU fragmentation diagram
[Figure: encoder output (H.264 bitstream, AVCC format with 4-byte length prefix) → format conversion (AVCC → Annex B, length prefix replaced by start code) → fragmentation (frame split with RTP FU-A into packet 1 of 1500 bytes, packet 2 of 1500 bytes, packet 3 with the remaining bytes) → network transport (UDP/RTP, in order or out of order).]
AVCC to Annex B code
```objc
// Convert AVCC (length-prefixed) data to Annex B (start-code-delimited)
- (NSData *)convertAVCCToAnnexB:(NSData *)avccData {
    static const uint8_t kStartCode[] = {0x00, 0x00, 0x00, 0x01};
    NSMutableData *annexBData = [NSMutableData dataWithCapacity:avccData.length];
    const uint8_t *bytes = (const uint8_t *)avccData.bytes;
    NSUInteger length = avccData.length;
    NSUInteger offset = 0;
    while (offset + 4 <= length) {
        // Read the 4-byte big-endian length prefix (memcpy avoids an unaligned load)
        uint32_t naluLength = 0;
        memcpy(&naluLength, bytes + offset, 4);
        naluLength = CFSwapInt32BigToHost(naluLength);
        offset += 4;
        // Validate the length
        if (naluLength > length - offset) break;
        // Append the start code, then the NALU data
        [annexBData appendBytes:kStartCode length:sizeof(kStartCode)];
        [annexBData appendBytes:bytes + offset length:naluLength];
        offset += naluLength;
    }
    return [annexBData copy];
}

// RTP FU-A fragmentation (RFC 6184), simplified. `mtu` is the link MTU.
// Each returned item is an RTP payload; the sender prepends the RTP header.
- (NSArray<NSData *> *)splitNALUToFragments:(NSData *)naluData mtu:(int)mtu {
    // Per-packet overhead: IPv4 (20) + UDP (8) + RTP (12) + FU indicator and header (2)
    static const int kOverhead = 20 + 8 + 12 + 2;
    NSMutableArray<NSData *> *fragments = [NSMutableArray array];
    if (naluData.length < 2 || mtu <= kOverhead) {
        return fragments;
    }
    const uint8_t *nalu = (const uint8_t *)naluData.bytes;
    uint8_t naluHeader = nalu[0];               // original NALU header
    size_t payloadLength = naluData.length - 1; // the header travels in the FU bytes
    size_t maxPayload = (size_t)(mtu - kOverhead);
    // A NALU that fits in one packet is sent as a single NAL unit packet
    if (naluData.length <= maxPayload + 2) {
        [fragments addObject:naluData];
        return fragments;
    }
    size_t offset = 0;
    while (offset < payloadLength) {
        size_t fragmentSize = MIN(maxPayload, payloadLength - offset);
        // FU indicator: F and NRI bits of the original header + type 28 (FU-A)
        uint8_t fuIndicator = (naluHeader & 0xE0) | 28;
        // FU header: S/E flags + original NALU type in the low 5 bits
        uint8_t fuHeader = naluHeader & 0x1F;
        if (offset == 0) {
            fuHeader |= 0x80; // Start bit
        }
        if (offset + fragmentSize == payloadLength) {
            fuHeader |= 0x40; // End bit
        }
        // Assemble the fragment
        NSMutableData *fragment = [NSMutableData dataWithBytes:&fuIndicator length:1];
        [fragment appendBytes:&fuHeader length:1];
        [fragment appendBytes:nalu + 1 + offset length:fragmentSize]; // NALU payload
        [fragments addObject:fragment];
        offset += fragmentSize;
    }
    return fragments;
}
```
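On the receiving side the FU-A fragments have to be put back together before decoding. The sketch below shows that inverse step in plain C, under the assumption that fragments arrive in order and without loss (a real receiver would first reorder by RTP sequence number and discard a unit when a fragment is missing). FuaAssembler and fua_push are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reassembles one NAL unit from in-order FU-A payloads (illustrative only). */
typedef struct {
    uint8_t *buffer;      /* caller-provided storage for the rebuilt NALU */
    size_t   capacity;
    size_t   length;      /* bytes written so far */
    int      in_progress; /* non-zero once a start fragment has been seen */
} FuaAssembler;

/* Feeds one RTP payload (FU indicator, FU header, fragment data).
 * Returns 1 when buffer[0..length) holds a complete NALU, 0 when more
 * fragments are needed, and -1 on malformed or unexpected input. */
int fua_push(FuaAssembler *a, const uint8_t *payload, size_t size) {
    if (size < 3 || (payload[0] & 0x1F) != 28) {
        return -1; /* not an FU-A payload */
    }
    uint8_t fu_indicator = payload[0];
    uint8_t fu_header = payload[1];
    int start = (fu_header & 0x80) != 0;
    int end = (fu_header & 0x40) != 0;
    if (start && end) {
        return -1; /* RFC 6184 forbids S and E in the same fragment */
    }
    if (start) {
        if (a->capacity < 1) {
            return -1;
        }
        /* Rebuild the original NAL header: F + NRI from the indicator,
         * type from the FU header */
        a->buffer[0] = (uint8_t)((fu_indicator & 0xE0) | (fu_header & 0x1F));
        a->length = 1;
        a->in_progress = 1;
    } else if (!a->in_progress) {
        return -1; /* the start fragment was never received */
    }
    size_t data_size = size - 2;
    if (a->length + data_size > a->capacity) {
        a->in_progress = 0;
        return -1;
    }
    memcpy(a->buffer + a->length, payload + 2, data_size);
    a->length += data_size;
    if (end) {
        a->in_progress = 0;
        return 1;
    }
    return 0;
}
```

The rebuilt unit is a bare NALU; before it is handed to VideoToolbox it still needs a 4-byte length prefix.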
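4. Decoding
Core process
iOS framework: VideoToolbox.framework (VTDecompressionSessionCreate).
Core process: decode the H.264/H.265 bitstream back into YUV images.
Details:
- Hardware decoding and zero copy: hardware decoding delivers its results through a callback. The input is a CMSampleBuffer whose format description was built from the SPS/PPS, and the callback returns the decoded CVPixelBufferRef. Because the buffer is backed by an IOSurface, the decoded image can go straight to the rendering pipeline without a CPU copy.
- Reorder buffer: if the stream contains B frames, frames leave the decoder in decode order and have to be reordered by PTS before display.
- Error recovery: if a P frame is lost in the middle of a GOP, every following P/B frame refers to missing data. Decoding them anyway shows corruption, so players usually discard frames until the next I frame, which appears as a frozen picture followed by a sudden jump.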
VTDecompressionSession configuration code
```objc
#import <VideoToolbox/VideoToolbox.h>

@interface VideoDecoder : NSObject
@property (nonatomic, assign) VTDecompressionSessionRef decompressionSession;
@property (nonatomic, assign) CMVideoFormatDescriptionRef formatDescription;
@property (nonatomic, strong) NSData *sps;
@property (nonatomic, strong) NSData *pps;
@end

// Forward declaration of the output callback defined below
static void decodingCallback(void *decompressionOutputRefCon,
                             void *sourceFrameRefCon,
                             OSStatus status,
                             VTDecodeInfoFlags infoFlags,
                             CVImageBufferRef imageBuffer,
                             CMTime presentationTimeStamp,
                             CMTime presentationDuration);

@implementation VideoDecoder

- (void)setupDecoder {
    // Register the output callback. Without this record, decoded frames
    // are never delivered.
    VTDecompressionOutputCallbackRecord callbackRecord = {
        .decompressionOutputCallback = decodingCallback,
        .decompressionOutputRefCon = (__bridge void *)self
    };
    // Create the decompression session (call after SPS/PPS are available)
    OSStatus status = VTDecompressionSessionCreate(
        NULL,               // allocator
        _formatDescription, // videoFormatDescription
        NULL,               // videoDecoderSpecification
        NULL,               // destinationImageBufferAttributes
        &callbackRecord,    // outputCallback
        &_decompressionSession
    );
    if (status != noErr) {
        NSLog(@"Failed to create decompression session: %d", (int)status);
    }
}

// sps and pps are bare NALUs, without start code or length prefix
- (void)setSPS:(NSData *)sps PPS:(NSData *)pps {
    _sps = sps;
    _pps = pps;
    // Drop the description of a previous configuration, if any
    if (_formatDescription) {
        CFRelease(_formatDescription);
        _formatDescription = NULL;
    }
    // Parameter set arrays
    const uint8_t *parameterSetPointers[2] = {
        (const uint8_t *)sps.bytes,
        (const uint8_t *)pps.bytes
    };
    const size_t parameterSetSizes[2] = {sps.length, pps.length};
    // Build the format description from SPS/PPS
    OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(
        NULL,
        2, // parameterSetCount
        parameterSetPointers,
        parameterSetSizes,
        4, // NALUnitHeaderLength (AVCC: 4-byte length prefix)
        &_formatDescription
    );
    if (status == noErr) {
        [self setupDecoder];
    }
}

// Decode one frame. frameData must be AVCC (4-byte length prefix per NALU),
// matching the NALUnitHeaderLength given above.
- (void)decodeFrame:(NSData *)frameData pts:(CMTime)pts {
    // Wrap the data in a CMBlockBuffer without copying
    CMBlockBufferRef blockBuffer = NULL;
    OSStatus status = CMBlockBufferCreateWithMemoryBlock(
        NULL,
        (void *)frameData.bytes,
        frameData.length,
        kCFAllocatorNull, // blockAllocator: the bytes stay owned by frameData
        NULL,             // customBlockSource
        0,                // offsetToData
        frameData.length,
        0,                // flags
        &blockBuffer
    );
    if (status != noErr) {
        return;
    }
    // Create the CMSampleBuffer, carrying the PTS in its timing info
    CMSampleTimingInfo timing = {
        .duration = kCMTimeInvalid,
        .presentationTimeStamp = pts,
        .decodeTimeStamp = kCMTimeInvalid
    };
    const size_t sampleSize = frameData.length;
    CMSampleBufferRef sampleBuffer = NULL;
    status = CMSampleBufferCreateReady(
        NULL,
        blockBuffer,
        _formatDescription,
        1,           // numSamples
        1,           // numSampleTimingEntries
        &timing,     // sampleTimingArray
        1,           // numSampleSizeEntries
        &sampleSize, // sampleSizeArray
        &sampleBuffer
    );
    if (status != noErr) {
        CFRelease(blockBuffer);
        return;
    }
    // Decode. With flags == 0 the call is synchronous, so frameData
    // outlives the decode.
    VTDecodeFrameFlags flags = 0;
    VTDecodeInfoFlags infoFlags = 0;
    status = VTDecompressionSessionDecodeFrame(
        _decompressionSession,
        sampleBuffer,
        flags,
        NULL, // sourceFrameRefCon
        &infoFlags
    );
    if (status != noErr) {
        NSLog(@"Decode failed: %d", (int)status);
    }
    CFRelease(sampleBuffer);
    CFRelease(blockBuffer);
}

// Decoder output callback
static void decodingCallback(
    void *decompressionOutputRefCon,
    void *sourceFrameRefCon,
    OSStatus status,
    VTDecodeInfoFlags infoFlags,
    CVImageBufferRef imageBuffer,
    CMTime presentationTimeStamp,
    CMTime presentationDuration
) {
    if (status != noErr || !imageBuffer) {
        return;
    }
    VideoDecoder *decoder = (__bridge VideoDecoder *)decompressionOutputRefCon;
    // The decoded image is a CVPixelBuffer
    CVPixelBufferRef pixelBuffer = (CVPixelBufferRef)imageBuffer;
    // Hand it to the rendering pipeline (renderPixelBuffer:pts: is this
    // class's own helper, not shown here)
    [decoder renderPixelBuffer:pixelBuffer pts:presentationTimeStamp];
}

- (void)dealloc {
    if (_decompressionSession) {
        VTDecompressionSessionInvalidate(_decompressionSession);
        CFRelease(_decompressionSession);
        _decompressionSession = NULL;
    }
    if (_formatDescription) {
        CFRelease(_formatDescription);
        _formatDescription = NULL;
    }
}

@end
```
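The decoder above expects AVCC input, while streams received as Annex B (for example from MPEG-TS or a raw .h264 file) are delimited by start codes. A minimal sketch of the reverse conversion in plain C is shown below. The function names are hypothetical, and the sketch assumes the whole access unit is already in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Finds the next start code (00 00 01 or 00 00 00 01) at or after `from`.
 * Returns its index and stores its size in *code_size, or returns `length`
 * (with *code_size == 0) when no further start code exists. */
static size_t find_start_code(const uint8_t *data, size_t length,
                              size_t from, size_t *code_size) {
    for (size_t i = from; i + 3 <= length; i++) {
        if (data[i] != 0 || data[i + 1] != 0) {
            continue;
        }
        if (data[i + 2] == 1) {
            *code_size = 3;
            return i;
        }
        if (i + 4 <= length && data[i + 2] == 0 && data[i + 3] == 1) {
            *code_size = 4;
            return i;
        }
    }
    *code_size = 0;
    return length;
}

/* Rewrites Annex B data as AVCC: each NALU gets a 4-byte big-endian length.
 * Returns the number of bytes written, or 0 if `out` is too small or the
 * input contains no NALU. Illustrative only. */
size_t annexb_to_avcc(const uint8_t *in, size_t in_length,
                      uint8_t *out, size_t out_capacity) {
    size_t code_size = 0;
    size_t out_length = 0;
    size_t pos = find_start_code(in, in_length, 0, &code_size);
    while (pos < in_length) {
        size_t nalu_start = pos + code_size;
        size_t next_code_size = 0;
        size_t next = find_start_code(in, in_length, nalu_start, &next_code_size);
        size_t nalu_length = next - nalu_start;
        if (nalu_length > 0) {
            if (out_length + 4 + nalu_length > out_capacity) {
                return 0;
            }
            out[out_length + 0] = (uint8_t)(nalu_length >> 24);
            out[out_length + 1] = (uint8_t)(nalu_length >> 16);
            out[out_length + 2] = (uint8_t)(nalu_length >> 8);
            out[out_length + 3] = (uint8_t)(nalu_length);
            memcpy(out + out_length + 4, in + nalu_start, nalu_length);
            out_length += 4 + nalu_length;
        }
        pos = next;
        code_size = next_code_size;
    }
    return out_length;
}
```

SPS and PPS units found this way would go to setSPS:PPS: rather than into the frame data, since VideoToolbox takes parameter sets through the format description.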
How the B-frame reorder buffer works
[Figure: sequence diagram with participants encoder, network transport, reorder buffer, decoder, and renderer. The encoder emits frames in decode order (DTS): I(0), P(3), B(1), B(2), P(6). The network may deliver them out of order. The decoder decodes I(0) and P(3) on arrival, then B(1) and B(2), which depend on the I/P frames; P(6) waits. The reorder buffer releases frames to the renderer in display order (PTS): I(0), B(1), B(2), P(3).]
Key points:
- Decode order (DTS) differs from display order (PTS)
- B frames depend on the I/P frames before and after them, so decoded frames must be reordered
- Real-time communication usually disables B frames to avoid the added delay
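The reordering step can be sketched as a small buffer that holds back a fixed number of decoded frames and always releases the one with the smallest PTS. The plain C example below tracks only PTS values; ReorderBuffer and its functions are hypothetical names, and the fixed depth is an assumption (a real decoder would derive it from the stream, for example from max_num_reorder_frames in the SPS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REORDER_CAPACITY 8

/* Holds back up to `depth` decoded frames (illustrative only). */
typedef struct {
    int64_t pts[REORDER_CAPACITY];
    size_t  count;
    size_t  depth; /* must be smaller than REORDER_CAPACITY */
} ReorderBuffer;

/* Removes and returns the smallest PTS. The buffer must not be empty. */
static int64_t reorder_pop_min(ReorderBuffer *rb) {
    assert(rb->count > 0);
    size_t min_index = 0;
    for (size_t i = 1; i < rb->count; i++) {
        if (rb->pts[i] < rb->pts[min_index]) {
            min_index = i;
        }
    }
    int64_t min_pts = rb->pts[min_index];
    rb->count--;
    rb->pts[min_index] = rb->pts[rb->count]; /* fill the gap with the last entry */
    return min_pts;
}

/* Adds a frame in decode order. Returns 1 and stores the next frame to
 * display in *out once more than `depth` frames are buffered, else 0. */
int reorder_push(ReorderBuffer *rb, int64_t pts, int64_t *out) {
    assert(rb->depth < REORDER_CAPACITY && rb->count <= rb->depth);
    rb->pts[rb->count++] = pts;
    if (rb->count <= rb->depth) {
        return 0;
    }
    *out = reorder_pop_min(rb);
    return 1;
}

/* Drains the buffer at end of stream. Returns 0 when it is empty. */
int reorder_flush(ReorderBuffer *rb, int64_t *out) {
    if (rb->count == 0) {
        return 0;
    }
    *out = reorder_pop_min(rb);
    return 1;
}
```

With a depth of 2, feeding the decode order 0, 3, 1, 2, 6 from the diagram yields the display order 0, 1, 2, 3, 6. The held-back frames are exactly the latency that B frames add, which is why real-time pipelines turn them off.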
5. 播放
核心过程
iOS 框架 :AVSampleBufferDisplayLayer (系统级零拷贝渲染) 或 Metal / OpenGL ES (自定义渲染)。
核心过程:将 YUV 图像数据渲染到屏幕。
深入细节(统一同步的核心点):
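- Using AVSampleBufferDisplayLayer: this is the least-effort option. Enqueue CMSampleBuffers that carry accurate PTS values and the layer schedules them itself (waiting when a frame is early, dropping it when it is late). By default the timestamps are interpreted against the host clock; to follow the audio clock, set the layer's controlTimebase or drive the layer together with an AVSampleBufferAudioRenderer through AVSampleBufferRenderSynchronizer (iOS 11+).
- Synchronization algorithm for custom Metal rendering: if you render with Metal yourself, for example to add a watermark or a beauty filter, you have to write the synchronization logic by hand:
  - Receive the screen refresh callback (VSync signal) through CADisplayLink.
  - Read the current hardware playback time of the audio, audio_clock.
  - Take one frame from the video decode queue and read its video_pts.
  - Compute diff = video_pts - audio_clock.
  - If diff > 0 by more than the threshold (video is ahead): do not render this frame yet; compare again on the next VSync.
  - If diff < 0 by more than the threshold (video is behind): drop this frame immediately and take the next one from the queue, until a frame whose diff is close to 0 is found and rendered.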
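The audio_clock used in the steps above is typically derived from how many sample frames the audio device has consumed. The following is a minimal sketch in plain C under stated assumptions: the `AudioClock` struct and its function names are hypothetical, `audio_clock_advance` is assumed to be called from the audio render callback with the number of frames it just supplied, and `output_latency` is assumed to come from a value such as `AVAudioSession.outputLatency`.

```c
#include <stdint.h>

/* Hypothetical audio master clock driven by the render callback. */
typedef struct {
    double   base_pts;        /* PTS in seconds of the first sample given to the device */
    double   sample_rate;     /* for example 44100.0 */
    double   output_latency;  /* seconds between handing samples over and hearing them */
    uint64_t frames_rendered; /* total sample frames consumed so far */
} AudioClock;

void audio_clock_reset(AudioClock *c, double base_pts,
                       double sample_rate, double output_latency) {
    c->base_pts = base_pts;
    c->sample_rate = sample_rate;
    c->output_latency = output_latency;
    c->frames_rendered = 0;
}

/* Call once per render callback with the number of frames supplied. */
void audio_clock_advance(AudioClock *c, uint32_t frames) {
    c->frames_rendered += frames;
}

/* PTS in seconds of the sample that is audible right now. */
double audio_clock_now(const AudioClock *c) {
    double played = (double)c->frames_rendered / c->sample_rate - c->output_latency;
    if (played < 0.0) played = 0.0; /* nothing has reached the speaker yet */
    return c->base_pts + played;
}
```

Because the device drains samples at a fixed rate, `frames_rendered / sample_rate` advances linearly with real playback, which is what makes audio a convenient master clock. In a multithreaded renderer the counter would need atomic access; that is left out of the sketch.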
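AVSampleBufferDisplayLayer usage
objc
#import <AVFoundation/AVFoundation.h>

@interface VideoRenderer : NSObject
@property (nonatomic, strong) AVSampleBufferDisplayLayer *displayLayer;
@property (nonatomic, strong) dispatch_queue_t rendererQueue;
@end

@implementation VideoRenderer

- (void)setupRenderer {
    // Create the display layer (the caller adds it to a view's layer tree)
    _displayLayer = [[AVSampleBufferDisplayLayer alloc] init];
    _displayLayer.bounds = CGRectMake(0, 0, 720, 480);
    // Aspect-fit the video inside the layer bounds
    _displayLayer.videoGravity = AVLayerVideoGravityResizeAspect;
    // Dedicated serial queue so buffers are enqueued in order
    _rendererQueue = dispatch_queue_create("com.example.renderer", DISPATCH_QUEUE_SERIAL);
}

- (void)enqueueSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    // Blocks do not retain Core Foundation objects, so keep the buffer
    // alive until the asynchronous block has finished with it.
    CFRetain(sampleBuffer);
    dispatch_async(_rendererQueue, ^{
        if (self.displayLayer.status == AVQueuedSampleBufferRenderingStatusFailed) {
            // Recover a failed layer (for example after returning from the background)
            [self.displayLayer flush];
        }
        // Only enqueue when the layer can accept more data
        if (self.displayLayer.isReadyForMoreMediaData) {
            [self.displayLayer enqueueSampleBuffer:sampleBuffer];
        } else {
            NSLog(@"DisplayLayer not ready, dropping frame");
        }
        CFRelease(sampleBuffer); // balances the CFRetain above
    });
}

// Discard all pending frames (for example on seek or stream switch)
- (void)flush {
    dispatch_async(_rendererQueue, ^{
        [self.displayLayer flush];
    });
}

@end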
Custom Metal rendering with a synchronization algorithm
objc
#import <Metal/Metal.h>
#import <QuartzCore/QuartzCore.h>
#import <CoreMedia/CoreMedia.h>
#import <UIKit/UIKit.h>

// VideoFrame (pts + pixelBuffer) and AudioClock are app-defined helper classes.
@interface MetalVideoRenderer : NSObject
@property (nonatomic, strong) id<MTLDevice> device;
@property (nonatomic, strong) id<MTLCommandQueue> commandQueue;
@property (nonatomic, strong) id<MTLRenderPipelineState> pipelineState;
@property (nonatomic, strong) CAMetalLayer *metalLayer;
@property (nonatomic, strong) CADisplayLink *displayLink;
@property (nonatomic, strong) NSMutableArray *videoFrames; // frames waiting to be rendered
@property (nonatomic, strong) AudioClock *audioClock; // audio master clock
@end

@implementation MetalVideoRenderer

- (void)setupRenderer:(UIView *)view {
    // Initialize Metal
    _device = MTLCreateSystemDefaultDevice();
    _commandQueue = [_device newCommandQueue];
    // Configure the Metal layer (the view must return CAMetalLayer from +layerClass)
    _metalLayer = (CAMetalLayer *)view.layer;
    _metalLayer.device = _device;
    _metalLayer.pixelFormat = MTLPixelFormatBGRA8Unorm;
    // Build the render pipeline (YUV -> RGB conversion)
    [self createPipelineState];
    // Start the display link (fires once per VSync on the main run loop)
    _displayLink = [CADisplayLink displayLinkWithTarget:self selector:@selector(displayLinkCallback:)];
    [_displayLink addToRunLoop:[NSRunLoop mainRunLoop] forMode:NSRunLoopCommonModes];
}
- (void)createPipelineState {
    // Shader that converts YUV to RGB
    NSString *shaderSource = @""
    "#include <metal_stdlib>\n"
    "using namespace metal;\n"
    "\n"
    "struct VertexOut {\n"
    "    float4 position [[position]];\n"
    "    float2 texCoord;\n"
    "};\n"
    "\n"
    "vertex VertexOut vertex_main(uint vertexID [[vertex_id]]) {\n"
    "    float4 positions[4] = {\n"
    "        float4(-1, -1, 0, 1),\n"
    "        float4(1, -1, 0, 1),\n"
    "        float4(-1, 1, 0, 1),\n"
    "        float4(1, 1, 0, 1)\n"
    "    };\n"
    "    float2 texCoords[4] = {\n"
    "        float2(0, 1),\n"
    "        float2(1, 1),\n"
    "        float2(0, 0),\n"
    "        float2(1, 0)\n"
    "    };\n"
    "\n"
    "    VertexOut out;\n"
    "    out.position = positions[vertexID];\n"
    "    out.texCoord = texCoords[vertexID];\n"
    "    return out;\n"
    "}\n"
    "\n"
    "fragment float4 fragment_main(VertexOut in [[stage_in]],\n"
    "                              texture2d<float> textureY [[texture(0)]],\n"
    "                              texture2d<float> textureUV [[texture(1)]]) {\n"
    "    constexpr sampler s(mag_filter::linear, min_filter::linear);\n"
    "\n"
    "    float y = textureY.sample(s, in.texCoord).r;\n"
    "    float2 uv = textureUV.sample(s, in.texCoord).rg - 0.5;\n"
    "\n"
    "    // YUV -> RGB conversion (BT.601 coefficients, full-range input)\n"
    "    float3 yuv = float3(y, uv);\n"
    "    float4 rgba;\n"
    "\n"
    "    rgba.r = yuv.r + 1.402 * yuv.b;\n"
    "    rgba.g = yuv.r - 0.344 * yuv.g - 0.714 * yuv.b;\n"
    "    rgba.b = yuv.r + 1.772 * yuv.g;\n"
    "    rgba.a = 1.0;\n"
    "\n"
    "    return rgba;\n"
    "}";
    // Compile the shader and build the pipeline state
    NSError *error = nil;
    id<MTLLibrary> library = [_device newLibraryWithSource:shaderSource options:nil error:&error];
    MTLRenderPipelineDescriptor *descriptor = [[MTLRenderPipelineDescriptor alloc] init];
    descriptor.vertexFunction = [library newFunctionWithName:@"vertex_main"];
    descriptor.fragmentFunction = [library newFunctionWithName:@"fragment_main"];
    descriptor.colorAttachments[0].pixelFormat = _metalLayer.pixelFormat;
    _pipelineState = [_device newRenderPipelineStateWithDescriptor:descriptor error:&error];
    if (!_pipelineState) {
        NSLog(@"Failed to create pipeline state: %@", error);
    }
}
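// Display link callback (invoked once per screen refresh)
- (void)displayLinkCallback:(CADisplayLink *)displayLink {
    // Read the current audio time
    CMTime audioTime = [_audioClock getCurrentTime];
    // Pick a frame from the queue
    VideoFrame *frame = [self findBestFrameForAudioTime:audioTime];
    if (frame) {
        [self renderFrame:frame];
    }
}

// Find the best matching frame (the core A/V sync algorithm).
// _videoFrames is read on the main thread here, so the decoder must hand
// frames over on the main queue or protect the array with a lock.
- (VideoFrame *)findBestFrameForAudioTime:(CMTime)audioTime {
    const double kThreshold = 0.01; // 10 ms threshold
    while (_videoFrames.count > 0) {
        VideoFrame *frame = _videoFrames.firstObject;
        double diff = CMTimeGetSeconds(CMTimeSubtract(frame.pts, audioTime));
        if (diff > kThreshold) {
            // Video is ahead: wait for a later VSync
            return nil;
        } else if (diff < -kThreshold) {
            // Video is behind: drop the stale frame and try the next one
            [_videoFrames removeObjectAtIndex:0];
            continue;
        } else {
            // Frame is within the threshold: render it
            [_videoFrames removeObjectAtIndex:0];
            return frame;
        }
    }
    return nil;
}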
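// Render one frame
- (void)renderFrame:(VideoFrame *)frame {
    CVPixelBufferRef pixelBuffer = frame.pixelBuffer;
    // Wrap the Y and CbCr planes as Metal textures (app-defined helper,
    // typically built on CVMetalTextureCacheCreateTextureFromImage)
    id<MTLTexture> textureY = [self createTextureFromPlane:pixelBuffer planeIndex:0];
    id<MTLTexture> textureUV = [self createTextureFromPlane:pixelBuffer planeIndex:1];
    // Acquire the drawable once and render into its texture
    id<CAMetalDrawable> drawable = [_metalLayer nextDrawable];
    if (!drawable || !textureY || !textureUV) {
        return;
    }
    MTLRenderPassDescriptor *pass = [MTLRenderPassDescriptor renderPassDescriptor];
    pass.colorAttachments[0].texture = drawable.texture;
    pass.colorAttachments[0].loadAction = MTLLoadActionClear;
    pass.colorAttachments[0].storeAction = MTLStoreActionStore;
    // Create the command buffer and encoder
    id<MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];
    id<MTLRenderCommandEncoder> encoder = [commandBuffer renderCommandEncoderWithDescriptor:pass];
    [encoder setRenderPipelineState:_pipelineState];
    // Bind the textures
    [encoder setFragmentTexture:textureY atIndex:0];
    [encoder setFragmentTexture:textureUV atIndex:1];
    // Draw a full-screen quad
    [encoder drawPrimitives:MTLPrimitiveTypeTriangleStrip vertexStart:0 vertexCount:4];
    [encoder endEncoding];
    // Present and submit
    [commandBuffer presentDrawable:drawable];
    [commandBuffer commit];
}

@end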
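The fragment shader's colour conversion can be checked on the CPU. The sketch below, in plain C, mirrors the shader's BT.601 coefficients for full-range input and adds a video-range variant, since camera and decoder output is often video range (luma 16-235, chroma 16-240) and must be expanded first. The function names are made up for this example, and which matrix and range apply to a given stream is an assumption you should confirm from the pixel buffer's attachments.

```c
typedef struct { float r, g, b; } RGB;

static float clamp01(float v) {
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Shared BT.601 matrix: y in [0,1], u and v centred on 0. */
static RGB bt601_matrix(float y, float u, float v) {
    RGB out;
    out.r = clamp01(y + 1.402f * v);
    out.g = clamp01(y - 0.344f * u - 0.714f * v);
    out.b = clamp01(y + 1.772f * u);
    return out;
}

/* Full-range input, same arithmetic as the shader (chroma centred at 0.5). */
RGB yuv_to_rgb_bt601_full(float y, float cb, float cr) {
    return bt601_matrix(y, cb - 0.5f, cr - 0.5f);
}

/* Video-range input: expand luma from 16..235 and chroma from 16..240
   to full scale before applying the matrix. */
RGB yuv_to_rgb_bt601_video(float y, float cb, float cr) {
    float yf = (y - 16.0f / 255.0f) * (255.0f / 219.0f);
    float u  = (cb - 128.0f / 255.0f) * (255.0f / 224.0f);
    float v  = (cr - 128.0f / 255.0f) * (255.0f / 224.0f);
    return bt601_matrix(yf, u, v);
}
```

If video-range frames are fed through the full-range formula, blacks render as dark grey and whites look dull, which is a quick visual hint that the wrong variant is in use.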
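Engineering practice
1. Performance tuning guide
Choosing bitrate and resolution
Pick parameters that match the scenario and the network conditions: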
| Scenario | Resolution | Frame rate | Target bitrate | GOP size | B-frames | Suitable network |
|---|---|---|---|---|---|---|
| RTC call | 480p (640x480) | 15 fps | 500 Kbps | 30 frames (2 s) | ❌ | 3G / weak Wi-Fi |
| RTC call | 720p (1280x720) | 30 fps | 1.5 Mbps | 60 frames (2 s) | ❌ | 4G / good Wi-Fi |
| Live streaming | 720p | 30 fps | 3 Mbps | 90 frames (3 s) | ❌ | Good Wi-Fi |
| Live streaming | 1080p | 60 fps | 6-8 Mbps | 120 frames (2 s) | ❌ | Excellent Wi-Fi |
| Local recording | 1080p | 60 fps | 12 Mbps | 240 frames (4 s) | ✅ | No network constraint |
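The GOP column is simply frame rate multiplied by GOP duration, and the bitrate column can be sanity-checked as bits per pixel. A small sketch in plain C, with helper names invented for this example:

```c
/* Number of frames in a GOP of the given duration, rounded to nearest. */
int gop_frames(int fps, double gop_seconds) {
    return (int)(fps * gop_seconds + 0.5);
}

/* Average encoded bits spent on each pixel of each frame. */
double bits_per_pixel(double bitrate_bps, int width, int height, int fps) {
    return bitrate_bps / ((double)width * (double)height * (double)fps);
}
```

For the 720p RTC row, `gop_frames(30, 2.0)` gives 60 and `bits_per_pixel(1500000.0, 1280, 720, 30)` is roughly 0.054. With VideoToolbox the frame count maps to `kVTCompressionPropertyKey_MaxKeyFrameInterval` and the duration in seconds to `kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration`.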
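CPU/GPU/power optimization checklist
✅ Recommended:
- Use hardware encoding and decoding; avoid software codecs
- Keep frames in YUV until the rendering stage
- Enable alwaysDiscardsLateVideoFrames
- Choose the GOP length and keyframe interval deliberately
- Render with the system's AVSampleBufferDisplayLayer
- Use Metal offscreen rendering when post-processing is required
⚠️ Watch out for:
- Avoid frequent format conversions (YUV ↔ RGB)
- Avoid time-consuming work inside callbacks
- Avoid encoding or decoding on the main thread
- Limit the number of preview layers
❌ Never:
- Do not block the main thread waiting for codec results
- Do not create and destroy codec sessions frequently
- Do not ignore memory leaks (every CVPixelBufferRef you create or retain must be released)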
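Handling performance differences across devices
| Device | Chip | Conservative real-time encoding target | Suggested configuration |
|---|---|---|---|
| iPhone 6s/7 | A9/A10 Fusion | 720p@30fps (H.264) | Lower resolution and frame rate |
| iPhone 8/X | A11 Bionic | 1080p@30fps (H.264) | Standard configuration |
| iPhone 11/12 | A13/A14 Bionic | 4K@30fps (H.265) | High-quality configuration |
| iPhone 13/14 | A15/A16 Bionic | 4K@60fps (H.265) | Highest quality |
| iPad Air/Pro | A-series chips | 4K@30fps | Standard configuration |
These are conservative targets for a real-time pipeline that encodes while capturing, rendering and sending, not hardware ceilings: the iPhone 6s camera already records 4K at 30 fps, and hardware HEVC encoding requires an A10 or later.
Dynamic adjustment strategy: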
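objc
#import <sys/utsname.h>

// Adjust the configuration to the device's capability.
// -[UIDevice model] only returns "iPhone" or "iPad", so read the hardware
// identifier instead (e.g. "iPhone8,1" = iPhone 6s, "iPhone14,5" = iPhone 13).
- (void)adjustConfigurationForDevice {
    struct utsname systemInfo;
    uname(&systemInfo);
    int major = 0;
    // Stays 0 on iPad and on the simulator, which then keep the defaults
    sscanf(systemInfo.machine, "iPhone%d,", &major);
    if (major >= 8 && major <= 9) {
        // iPhone 6s / 7 family: downgraded configuration
        self.targetResolution = CGSizeMake(640, 480); // 480p
        self.targetFrameRate = 15;
        self.targetBitrate = 500000; // 500 Kbps
    } else if (major >= 14) {
        // iPhone 13 / 14 family and newer: high-quality configuration
        self.targetResolution = CGSizeMake(1920, 1080); // 1080p
        self.targetFrameRate = 60;
        self.targetBitrate = 8000000; // 8 Mbps
    }
    // All other devices keep the default configuration
}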
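2. Troubleshooting common problems
Debugging A/V desynchronization
Symptom checklist:
- Video ahead of audio (picture leads sound)
- Video behind audio (sound leads picture)
- Audio stutter or crackling
- Frequent video stalls or dropped frames
- Latency that accumulates over time
Debugging steps: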
- Verify timestamp continuity:
objc
// Check that PTS values are continuous (use one checker per stream)
- (void)checkPTSContinuity:(CMTime)pts {
    // Zero-initialized CMTime has no valid flag, so it starts out invalid.
    // (kCMTimeInvalid is not a compile-time constant in Objective-C.)
    static CMTime lastPTS = {0};
    if (CMTIME_IS_VALID(lastPTS)) {
        CMTime diff = CMTimeSubtract(pts, lastPTS);
        double diffSeconds = CMTimeGetSeconds(diff);
        if (diffSeconds < 0 || diffSeconds > 0.1) {
            NSLog(@"PTS discontinuity detected: %.3f seconds", diffSeconds);
        }
    }
    lastPTS = pts;
}
- Monitor the audio/video offset:
objc
// Log the offset between the video and audio clocks
- (void)logAVSyncStats {
    CMTime audioTime = [self getCurrentAudioTime];
    CMTime videoTime = [self getCurrentVideoTime];
    if (CMTIME_IS_VALID(audioTime) && CMTIME_IS_VALID(videoTime)) {
        double diff = CMTimeGetSeconds(CMTimeSubtract(videoTime, audioTime));
        NSLog(@"A/V Sync: %.0f ms", diff * 1000);
    }
}
- Profile with Instruments:
  - Use Time Profiler to check whether the main thread is blocked
  - Use System Trace to analyze encode and decode time
  - Use Allocations to check for memory leaks
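Causes of A/V desynchronization and their fixes
| Symptom | Likely cause | Fix |
|---|---|---|
| Video ahead of audio | Audio is not used as the master clock | Review the sync logic and make audio the master |
| Video ahead of audio | Video frames lack a correct PTS | Verify the value of CMSampleBufferGetPresentationTimeStamp |
| Video behind audio | Decoding is too slow | Lower resolution/frame rate; check that hardware acceleration is active |
| Video behind audio | Render thread is blocked | Use a dedicated render queue |
| Audio stutter | Jitter buffer too small | Enlarge the buffer (200 ms or more) |
| Audio crackling | Packet loss not handled | Enable PLC (packet loss concealment) |
| Accumulating latency | Late frames are not dropped in time | Review the frame sync decision logic |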
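To tell a constant offset apart from accumulating latency, one option is to fit a line through the logged A/V offsets and look at its slope. The sketch below, in plain C, uses an ordinary least-squares fit; the function name and the idea of sampling the offset once per second are assumptions for illustration, not part of any Apple API.

```c
#include <stddef.h>

/* Least-squares slope of the A/V offset over time.
   t[i]    : sample time in seconds
   diff[i] : video_pts - audio_pts in seconds at that time
   Returns seconds of drift per second; 0 for fewer than two samples. */
double av_drift_slope(const double *t, const double *diff, size_t n) {
    if (n < 2) return 0.0;
    double sum_t = 0.0, sum_d = 0.0, sum_tt = 0.0, sum_td = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum_t  += t[i];
        sum_d  += diff[i];
        sum_tt += t[i] * t[i];
        sum_td += t[i] * diff[i];
    }
    double denom = (double)n * sum_tt - sum_t * sum_t;
    if (denom == 0.0) return 0.0; /* all samples taken at the same instant */
    return ((double)n * sum_td - sum_t * sum_d) / denom;
}
```

A slope near zero with a non-zero mean suggests a fixed offset (for example a missing latency compensation), while a persistently positive or negative slope suggests drift, such as frames queuing up faster than they are dropped.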
Handling codec failures
Common error codes and their meaning:
objc
#import <VideoToolbox/VideoToolbox.h>

// Codec error handling
- (void)handleCodecError:(OSStatus)status operation:(NSString *)operation {
    switch (status) {
        case kVTInvalidSessionErr:
            // Also common after the app returns from the background
            NSLog(@"%@: Session not created or invalidated", operation);
            [self recreateSession];
            break;
        case kVTParameterErr:
            NSLog(@"%@: Invalid parameters", operation);
            [self validateParameters];
            break;
        case kVTVideoDecoderBadDataErr:
            NSLog(@"%@: Bad or corrupt input data", operation);
            break;
        case kVTCouldNotFindVideoEncoderErr:
            NSLog(@"%@: No hardware encoder for this configuration", operation);
            [self fallbackToSoftwareCodec];
            break;
        case kVTVideoDecoderMalfunctionErr:
            NSLog(@"%@: Decoder malfunction, resetting", operation);
            [self resetDecoder];
            break;
        default:
            NSLog(@"%@: Unknown error %d", operation, (int)status);
            break;
    }
}
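Common failure scenarios and recovery:
| Scenario | Error code | Recovery strategy |
|---|---|---|
| Format description no longer valid | kVTInvalidSessionErr | Recreate the CMVideoFormatDescription |
| SPS/PPS missing | kVTParameterErr | Extract SPS/PPS from a keyframe |
| Hardware codec unavailable | kVTCouldNotFindVideoEncoderErr | Fall back to software encoding |
| Decoder malfunction | kVTVideoDecoderMalfunctionErr | Destroy and rebuild the decoder |
| Out of memory | kVTAllocationFailedErr | Lower resolution/frame rate |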
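Extracting SPS/PPS from a keyframe means locating the parameter-set NAL units in the bitstream. The following sketch in plain C splits an H.264 Annex-B buffer on its start codes and reports each NAL unit type (7 = SPS, 8 = PPS, 5 = IDR slice). It assumes Annex-B input; AVCC streams use length prefixes instead, and HEVC stores the type in different bits. The `NalUnit` struct and function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data; /* first byte of the NAL unit (the NAL header) */
    size_t         size; /* length without start code */
    uint8_t        type; /* H.264 nal_unit_type: low 5 bits of the header */
} NalUnit;

/* Find the next 00 00 01 at or after pos. Stores where the start code
   begins in *sc_begin and returns the index just past it, or size if
   there is none. */
static size_t find_start_code(const uint8_t *buf, size_t size,
                              size_t pos, size_t *sc_begin) {
    for (size_t i = pos; i + 3 <= size; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            *sc_begin = i;
            return i + 3;
        }
    }
    *sc_begin = size;
    return size;
}

/* Split an Annex-B buffer into NAL units. Returns how many were written. */
size_t annexb_split(const uint8_t *buf, size_t size,
                    NalUnit *out, size_t max_out) {
    size_t count = 0, sc_begin = 0;
    size_t start = find_start_code(buf, size, 0, &sc_begin);
    while (start < size && count < max_out) {
        size_t next_sc = 0;
        size_t next = find_start_code(buf, size, start, &next_sc);
        size_t end = next_sc;
        /* A NAL unit never ends in 0x00, so trailing zeros belong to a
           4-byte start code (00 00 00 01) or to padding. */
        while (end > start && buf[end - 1] == 0) end--;
        if (end > start) {
            out[count].data = buf + start;
            out[count].size = end - start;
            out[count].type = buf[start] & 0x1F;
            count++;
        }
        start = next;
    }
    return count;
}
```

The SPS and PPS payloads found this way are what `CMVideoFormatDescriptionCreateFromH264ParameterSets` expects as its parameter-set pointers and sizes.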
Memory leak checklist
✅ Resources that must be released:
objc
// Cleanup checklist
- (void)cleanupChecklist {
    // 1. Codec sessions
    if (_compressionSession) {
        VTCompressionSessionInvalidate(_compressionSession);
        CFRelease(_compressionSession);
        _compressionSession = NULL;
    }
    if (_decompressionSession) {
        VTDecompressionSessionInvalidate(_decompressionSession);
        CFRelease(_decompressionSession);
        _decompressionSession = NULL;
    }
    // 2. Audio unit (stop it before tearing it down)
    if (_audioUnit) {
        AudioOutputUnitStop(_audioUnit);
        AudioUnitUninitialize(_audioUnit);
        AudioComponentInstanceDispose(_audioUnit);
        _audioUnit = NULL;
    }
    // 3. Audio converter
    if (_audioConverter) {
        AudioConverterDispose(_audioConverter);
        _audioConverter = NULL;
    }
    // 4. Format description
    if (_formatDescription) {
        CFRelease(_formatDescription);
        _formatDescription = NULL;
    }
    // 5. Capture session
    if (_captureSession) {
        [_captureSession stopRunning];
        _captureSession = nil;
    }
}

// Rules for working with a CVPixelBufferRef
- (void)usePixelBuffer:(CVPixelBufferRef)pixelBuffer {
    // Lock the base address before touching the pixels on the CPU
    CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
    // Access the data (for planar formats use CVPixelBufferGetBaseAddressOfPlane)
    void *data = CVPixelBufferGetBaseAddress(pixelBuffer);
    // ... process ...
    // Unlock with the same flags used for the lock
    CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
    // If you created or retained the pixel buffer yourself, release it:
    // CVPixelBufferRelease(pixelBuffer);
}
Checking with Instruments:
- Use the Allocations instrument
- Enable Record Reference Counts
- Filter for VTCompressionSession, VTDecompressionSession, CVPixelBuffer
- Check for instances that were never released
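3. iOS version compatibility
Privacy permission handling on iOS 14+
Required Info.plist entries (both usage-description keys have been mandatory since iOS 10):
xml
<key>NSCameraUsageDescription</key>
<string>Camera access is needed to capture video</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone access is needed to capture audio</string>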
Permission request code:
objc
#import <AVFoundation/AVFoundation.h>
#import <UIKit/UIKit.h>

// Request camera access first, then microphone access
- (void)requestAVPermissions {
    // Camera permission (the handler runs on an arbitrary queue)
    [AVCaptureDevice requestAccessForMediaType:AVMediaTypeVideo
                             completionHandler:^(BOOL granted) {
        if (granted) {
            NSLog(@"Camera permission granted");
            [self checkMicrophonePermission];
        } else {
            NSLog(@"Camera permission denied");
            [self showPermissionDeniedAlert:@"camera"];
        }
    }];
}

- (void)checkMicrophonePermission {
    AVAudioSessionRecordPermission permission = [[AVAudioSession sharedInstance] recordPermission];
    switch (permission) {
        case AVAudioSessionRecordPermissionGranted:
            NSLog(@"Microphone permission granted");
            [self startAVSession];
            break;
        case AVAudioSessionRecordPermissionDenied:
            NSLog(@"Microphone permission denied");
            [self showPermissionDeniedAlert:@"microphone"];
            break;
        case AVAudioSessionRecordPermissionUndetermined:
            [[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
                if (granted) {
                    NSLog(@"Microphone permission granted");
                    [self startAVSession];
                } else {
                    NSLog(@"Microphone permission denied");
                    [self showPermissionDeniedAlert:@"microphone"];
                }
            }];
            break;
    }
}

- (void)showPermissionDeniedAlert:(NSString *)mediaType {
    // The permission handlers run off the main thread; UIKit must not
    dispatch_async(dispatch_get_main_queue(), ^{
        UIAlertController *alert = [UIAlertController
            alertControllerWithTitle:@"Permission denied"
                             message:[NSString stringWithFormat:@"Please enable %@ access in Settings", mediaType]
                      preferredStyle:UIAlertControllerStyleAlert];
        [alert addAction:[UIAlertAction
            actionWithTitle:@"Open Settings"
                      style:UIAlertActionStyleDefault
                    handler:^(UIAlertAction *action) {
            [[UIApplication sharedApplication] openURL:
                [NSURL URLWithString:UIApplicationOpenSettingsURLString]
                                               options:@{}
                                     completionHandler:nil];
        }]];
        [alert addAction:[UIAlertAction
            actionWithTitle:@"Cancel"
                      style:UIAlertActionStyleCancel
                    handler:nil]];
        [self presentViewController:alert animated:YES completion:nil];
    });
}
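Capture features in recent iOS versions
AVCaptureMultiCamSession (iOS 13+):
objc
// Capture from several cameras at once (front + back)
- (void)setupMultiCamSession {
    // Multi-cam is only available on some devices
    if (![AVCaptureMultiCamSession isMultiCamSupported]) {
        NSLog(@"Multi-cam capture is not supported on this device");
        return;
    }
    AVCaptureMultiCamSession *multiCamSession = [[AVCaptureMultiCamSession alloc] init];
    NSError *error = nil;
    // Front camera
    AVCaptureDevice *frontCamera = [AVCaptureDevice defaultDeviceWithDeviceType:AVCaptureDeviceTypeBuiltInWideAngleCamera
                                                                      mediaType:AVMediaTypeVideo
                                                                       position:AVCaptureDevicePositionFront];
    AVCaptureDeviceInput *frontInput = [AVCaptureDeviceInput deviceInputWithDevice:frontCamera error:&error];
    // Back camera
    AVCaptureDevice *backCamera = [AVCaptureDevice defaultDeviceWithDeviceType:AVCaptureDeviceTypeBuiltInWideAngleCamera
                                                                     mediaType:AVMediaTypeVideo
                                                                      position:AVCaptureDevicePositionBack];
    AVCaptureDeviceInput *backInput = [AVCaptureDeviceInput deviceInputWithDevice:backCamera error:&error];
    // Add the inputs to the session
    if (frontInput && [multiCamSession canAddInput:frontInput]) {
        [multiCamSession addInput:frontInput];
    }
    if (backInput && [multiCamSession canAddInput:backInput]) {
        [multiCamSession addInput:backInput];
    }
    // Configure the outputs
    // ... (one independent data output per camera)
    // Keep a strong reference (e.g. in a property), or the session is deallocated on return
    [multiCamSession startRunning];
}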
AVCaptureDeviceTypeBuiltInLiDARDepthCamera (iOS 15.4+):
objc
// LiDAR depth camera (for depth-aware scenarios such as AR); only present on LiDAR-equipped devices
- (void)setupLiDARDepthCamera {
    if (@available(iOS 15.4, *)) {
        AVCaptureDevice *lidarCamera = [AVCaptureDevice
            defaultDeviceWithDeviceType:AVCaptureDeviceTypeBuiltInLiDARDepthCamera
                              mediaType:AVMediaTypeVideo
                               position:AVCaptureDevicePositionBack];
        if (lidarCamera) {
            NSLog(@"LiDAR depth camera available");
            // Configure depth capture here
        }
    }
}
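4. Debugging toolchain
Instruments essentials
Time Profiler (find CPU hot spots):
- Open Instruments → Time Profiler
- Select the target device and app
- Start recording and reproduce the problem
- Stop recording and inspect the Call Tree
- Filter for VTCompressionSession, VTDecompressionSession
- Look for the functions that consume the most CPU time
Key checks:
- Is encoding or decoding happening on the main thread?
- Are there frequent format conversions?
- Do callbacks contain time-consuming work?
Allocations (check for memory leaks):
- Open Instruments → Allocations
- Use Mark Generation to snapshot the heap periodically
- Run the app through several capture/encode/decode cycles
- Stop recording and compare the growth between generations
- Filter for the keywords VTCompressionSession, CVPixelBuffer, CMSampleBuffer
System Trace (analyze performance):
- Open Instruments → System Trace
- Start recording and reproduce the stutter
- Inspect thread states and look for blocked threads
- Inspect GPU activity and check whether rendering is overloaded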
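Console log filtering tips
Connect the iOS device and select it in Console.app on macOS, or stream its log from the command line:
bash
# Stream the device log and keep only A/V-related lines
# (idevicesyslog comes from the third-party libimobiledevice tools)
idevicesyslog | grep -E "AVFoundation|VideoToolbox|AudioToolbox"
# Or filter in Console.app with a predicate such as:
# subsystem == "com.apple.AVFoundation" OR subsystem == "com.apple.videotoolbox"
Common filter expressions (subsystem names can differ between OS versions, so confirm the exact strings in Console.app):
| Category | Filter expression | Notes |
|---|---|---|
| Audio | subsystem == "com.apple.coreaudio" | CoreAudio related |
| Video | subsystem == "com.apple.video" | Video/camera related |
| Codec | subsystem == "com.apple.videotoolbox" | VideoToolbox related |
| Capture | subsystem == "com.apple.AVFoundation" | AVFoundation related |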
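Developer debugging tips
Enable system logging (these commands run on the Mac and act on its own log store):
bash
# Enable debug-level system logging
sudo log config --mode "level:debug"
# Show the log of a specific subsystem
log show --predicate 'subsystem == "com.apple.videotoolbox"' --last 5m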
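Tracing performance with os_signpost:
objc
#import <os/signpost.h>
// Mark the key operations
os_log_t logger = os_log_create("com.example.app", "VideoEncoder");
// Encoding starts
os_signpost_interval_begin(logger, OS_SIGNPOST_ID_EXCLUSIVE, "VideoEncoder", "EncodeFrame");
// Encode (for overlapping asynchronous encodes, create one ID per frame
// with os_signpost_id_generate instead of OS_SIGNPOST_ID_EXCLUSIVE)
VTCompressionSessionEncodeFrame(...);
// Encoding ends (same log, ID and name as the matching begin)
os_signpost_interval_end(logger, OS_SIGNPOST_ID_EXCLUSIVE, "VideoEncoder", "EncodeFrame");
// View the intervals in Instruments with the os_signpost instrument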
Custom PTS tracing log:
objc
// Trace how a PTS travels through the pipeline
- (void)trackPTS:(CMTime)pts stage:(NSString *)stage {
    static uint64_t frameCount = 0;
    double ptsSeconds = CMTimeGetSeconds(pts);
    NSLog(@"[PTS] Frame #%llu: %.3f s at stage: %@", frameCount++, ptsSeconds, stage);
}

// Call it at every stage. The calls are shown together here for brevity;
// in a real pipeline each one lives in the callback of its own stage.
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    [self trackPTS:pts stage:@"capture"];
    // ... encode ...
    [self trackPTS:pts stage:@"encoded"];
    // ... transmit ...
    [self trackPTS:pts stage:@"received"];
    // ... decode ...
    [self trackPTS:pts stage:@"decoded"];
    // ... render ...
    [self trackPTS:pts stage:@"rendered"];
}
// Compare the PTS logged at each stage to verify that the timestamp is preserved
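Summary
Through CoreMedia (CMTime, CMSampleBuffer), iOS provides one timestamp system that runs through the whole pipeline, and through VideoToolbox and AudioToolbox it provides highly optimized hardware encoding and decoding.
When designing the architecture, treat audio and video as two fully decoupled systems, each with its own threading model, buffer queues and flow-control strategy. At the final presentation layer, however, synchronization must be anchored to the audio hardware clock. Understanding and exploiting iOS's low-level zero-copy path (IOSurface, CVPixelBufferRef) and its asynchronous hardware pipeline is the key to building high-performance iOS audio/video applications.
Best practices at a glance
- Use the system renderer: prefer AVSampleBufferDisplayLayer and avoid custom rendering unless you need post-processing
- Stay in YUV: convert to RGB only at the rendering stage
- Enable frame dropping: alwaysDiscardsLateVideoFrames = YES
- Size the GOP to the network: small GOP (1-2 s) on weak networks, large GOP (4-8 s) on good ones
- Disable B-frames: in real-time communication this lowers latency
- Monitor PTS continuity: make sure timestamps are preserved from capture to display
- Use Instruments: profile performance and check for memory leaks regularly
- Handle permissions: camera and microphone access must be requested and handled correctly (iOS 14+ also shows recording indicators to the user)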