Detailed Steps for iOS Video Encoding (a VideoToolbox-based video encoder with hardware H264/H265 support)

Step-by-step flow of iOS video encoding

1. Video capture stage

The capture code is the same as before, so it is not repeated here.

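  • Initial configuration
    • Use VideoCaptureConfig to set the resolution to 1920x1080, the frame rate to 30 fps, and the pixel format to kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
    • Set the camera position (front camera by default) and the mirroring mode
  • Authorization and setup
    • Check and request camera permission
    • Create the AVCaptureSession
    • Configure the camera input source, AVCaptureDeviceInput
    • Set up the video output, AVCaptureVideoDataOutput
    • Create the preview layer, AVCaptureVideoPreviewLayer
  • Data callback
    • Implement AVCaptureVideoDataOutputSampleBufferDelegate to receive video frames
    • Pass each CMSampleBuffer on through sampleBufferOutputCallBack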

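2. Preparing for video encoding

  • Encoding parameter configuration
    • Create a KFVideoEncoderConfig object
    • Set the resolution to 1080x1920, the bitrate to about 5 Mbps (5000 x 1024 bps in the code), the frame rate to 30 fps, and the GOP size to 150 frames
    • Check device support: prefer HEVC and fall back to H264 when HEVC is not supported
    • Set the matching encoding profile (High Profile for H264, Main Profile for HEVC)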
Swift:
//
//  KFVideoEncoderConfig.swift
//  VideoDemo
//
//  Created by ricard.li on 2025/5/14.
//


import Foundation
import AVFoundation
import VideoToolbox

class KFVideoEncoderConfig {
    
    /// Resolution
    var size: CGSize
    /// Bitrate (bps)
    var bitrate: Int
    /// Frame rate (fps)
    var fps: Int
    /// GOP size in frames (key frame interval)
    var gopSize: Int
    /// Whether B-frames are enabled
    var openBFrame: Bool
    /// Codec type
    var codecType: CMVideoCodecType
    /// Encoding profile
    var profile: String

    init() {
        self.size = CGSize(width: 1080, height: 1920)
        self.bitrate = 5000 * 1024
        self.fps = 30
        self.gopSize = self.fps * 5
        self.openBFrame = true

        var supportHEVC = false
        if #available(iOS 11.0, *) {
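            // Note: in Swift, VTIsHardwareDecodeSupported can be called directly.
            // It reports hardware *decode* support, which is used here as a proxy
            // for HEVC encode support; the encoder itself is not queried.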
            supportHEVC = VTIsHardwareDecodeSupported(kCMVideoCodecType_HEVC)
        }
        
        if supportHEVC {
            self.codecType = kCMVideoCodecType_HEVC
            self.profile = kVTProfileLevel_HEVC_Main_AutoLevel as String
        } else {
            self.codecType = kCMVideoCodecType_H264
            self.profile = AVVideoProfileLevelH264HighAutoLevel
        }
    }
}
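To make the default values concrete, the sketch below works out a few numbers that follow from them. The `EncoderNumbers` struct and its property names are hypothetical and exist only for this illustration; the 3/16 factor mirrors the data rate limit that the encoder sets when B-frames are disabled.

```swift
import Foundation

/// Hypothetical helper (not part of the article's code) that mirrors the
/// defaults of KFVideoEncoderConfig so the derived values can be checked.
struct EncoderNumbers {
    let bitrate: Int   // bits per second
    let fps: Int       // frames per second
    let gopSize: Int   // frames between key frames

    /// Key frame interval in seconds: GOP frames divided by frame rate.
    var gopDuration: Double { Double(gopSize) / Double(fps) }

    /// Average size budget of one encoded frame, in bytes (integer division).
    var averageBytesPerFrame: Int { bitrate / 8 / fps }

    /// Cap in bytes per second: 1.5 x the average bitrate,
    /// that is bitrate * 1.5 / 8 = bitrate * 3 / 16.
    var dataRateLimitBytesPerSecond: Int { bitrate * 3 / 16 }
}

let numbers = EncoderNumbers(bitrate: 5000 * 1024, fps: 30, gopSize: 30 * 5)
print(numbers.gopDuration, numbers.averageBytesPerFrame, numbers.dataRateLimitBytesPerSecond)
```

With these defaults a key frame is requested every 5 seconds, and the average budget is roughly 21 KB per frame. A shorter GOP would make seeking and error recovery faster at the cost of more bits spent on key frames.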
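  • Encoder initialization
    • Create a KFVideoEncoder instance
    • Create the VTCompressionSession encoding session
    • Configure properties: kVTCompressionPropertyKey_RealTime, kVTCompressionPropertyKey_ProfileLevel
    • Set bitrate control, GOP size, frame rate, and other parameters
    • Configure the encode output callback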
Swift:
//
//  KFVideoEncoder.swift
//  VideoDemo
//
//  Created by ricard.li on 2025/5/14.
//

import Foundation
import AVFoundation
import VideoToolbox
import UIKit

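/// Video encoder built on VideoToolbox; supports hardware H264/H265 encoding
class KFVideoEncoder {
    
    /// Compression session
    private var compressionSession: VTCompressionSession?
    /// Encoder configuration
    private(set) var config: KFVideoEncoderConfig
    /// Dedicated serial queue for encoding, to avoid thread contention
    private let encoderQueue = DispatchQueue(label: "com.KeyFrameKit.videoEncoder")
    /// Semaphore for serialization (disabled; the serial queue already serializes the work)
//    private let semaphore = DispatchSemaphore(value: 1)
    
    /// Whether the session needs to be refreshed (for example after entering the background)
    private var needRefreshSession = false
    /// Number of consecutive failed attempts to create the session
    private var retrySessionCount = 0
    /// Number of consecutive frames that failed to encode
    private var encodeFrameFailedCount = 0
    
    /// Called with each successfully encoded CMSampleBuffer
    var sampleBufferOutputCallBack: ((CMSampleBuffer) -> Void)?
    /// Called when an unrecoverable error occurs
    var errorCallBack: ((Error) -> Void)?
    
    /// Maximum number of session creation retries
    private let maxRetrySessionCount = 5
    /// Maximum number of consecutive frames allowed to fail
    private let maxEncodeFrameFailedCount = 20
    
    /// Initializer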
    init(config: KFVideoEncoderConfig) {
        self.config = config
        NotificationCenter.default.addObserver(self, selector: #selector(didEnterBackground), name: UIApplication.didEnterBackgroundNotification, object: nil)
    }
    
    deinit {
        NotificationCenter.default.removeObserver(self)
//        semaphore.wait()
        releaseCompressionSession()
//        semaphore.signal()
    }
    
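    /// Mark the session as needing a refresh
    func refresh() {
        // Set the flag on the encoder queue so it is only touched from one queue.
        encoderQueue.async { [weak self] in
            self?.needRefreshSession = true
        }
    }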
    
    /// Force the encoder to finish pending frames (no completion handler)
    func flush() {
        encoderQueue.async { [weak self] in
            guard let self = self else { return }
//            self.semaphore.wait()
            self.flushInternal()
//            self.semaphore.signal()
        }
    }
    
    /// Force the encoder to finish pending frames (with a completion handler)
    func flush(withCompleteHandler handler: @escaping () -> Void) {
        encoderQueue.async { [weak self] in
            guard let self = self else { return }
//            self.semaphore.wait()
            self.flushInternal()
//            self.semaphore.signal()
            handler()
        }
    }
    
    /// Encode a single video frame
    func encode(pixelBuffer: CVPixelBuffer, ptsTime: CMTime) {
        guard retrySessionCount < maxRetrySessionCount, encodeFrameFailedCount < maxEncodeFrameFailedCount else { return }
        
        encoderQueue.async { [weak self] in
            guard let self = self else { return }
//            self.semaphore.wait()
            
            var setupStatus: OSStatus = noErr
            
            /// Check whether the session needs to be rebuilt
            if self.compressionSession == nil || self.needRefreshSession {
                self.releaseCompressionSession()
                setupStatus = self.setupCompressionSession()
                self.retrySessionCount = (setupStatus == noErr) ? 0 : (self.retrySessionCount + 1)
                
                if setupStatus != noErr {
                    print("KFVideoEncoder setupCompressionSession error: \(setupStatus)")
                    self.releaseCompressionSession()
                } else {
                    self.needRefreshSession = false
                }
            }
            
            guard let session = self.compressionSession else {
//                self.semaphore.signal()
                if self.retrySessionCount >= self.maxRetrySessionCount {
                    DispatchQueue.main.async {
                        self.errorCallBack?(NSError(domain: "\(KFVideoEncoder.self)", code: Int(setupStatus), userInfo: nil))
                    }
                }
                return
            }
            
            var flags: VTEncodeInfoFlags = []
            let frameDuration = CMTime(value: 1, timescale: CMTimeScale(self.config.fps))
            /// Encode the current frame
            var encodeStatus = VTCompressionSessionEncodeFrame(session, imageBuffer: pixelBuffer, presentationTimeStamp: ptsTime, duration: frameDuration, frameProperties: nil, sourceFrameRefcon: nil, infoFlagsOut: &flags)
            
            /// The session became invalid: rebuild it and retry once
            if encodeStatus == kVTInvalidSessionErr {
                print("KFVideoEncoder kVTInvalidSessionErr")
                self.releaseCompressionSession()
                setupStatus = self.setupCompressionSession()
                self.retrySessionCount = (setupStatus == noErr) ? 0 : (self.retrySessionCount + 1)
                if setupStatus == noErr, let newSession = self.compressionSession {
                    // Retry on the rebuilt session; the old `session` was invalidated above.
                    encodeStatus = VTCompressionSessionEncodeFrame(newSession, imageBuffer: pixelBuffer, presentationTimeStamp: ptsTime, duration: frameDuration, frameProperties: nil, sourceFrameRefcon: nil, infoFlagsOut: &flags)
                } else {
                    self.releaseCompressionSession()
                }
            }
            
            /// Log encode failures
            if encodeStatus != noErr {
                print("KFVideoEncoder VTCompressionSessionEncodeFrame error: \(encodeStatus)")
            }
            
            /// Count consecutive failures; a success resets the counter
            self.encodeFrameFailedCount = (encodeStatus == noErr) ? 0 : (self.encodeFrameFailedCount + 1)
            
//            self.semaphore.signal()
            
            /// The failure threshold was reached: report the error
            if self.encodeFrameFailedCount >= self.maxEncodeFrameFailedCount {
                DispatchQueue.main.async {
                    self.errorCallBack?(NSError(domain: "\(KFVideoEncoder.self)", code: Int(encodeStatus), userInfo: nil))
                }
            }
        }
    }
    
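    /// On entering the background, mark the session as needing a refresh
    @objc private func didEnterBackground() {
        // The notification is delivered off the encoder queue, so set the flag there.
        encoderQueue.async { [weak self] in
            self?.needRefreshSession = true
        }
    }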
    
    /// Create the compression session
    private func setupCompressionSession() -> OSStatus {
        var session: VTCompressionSession?
        
        let status = VTCompressionSessionCreate(allocator: nil,
                                                width: Int32(config.size.width),
                                                height: Int32(config.size.height),
                                                codecType: config.codecType,
                                                encoderSpecification: nil,
                                                imageBufferAttributes: nil,
                                                compressedDataAllocator: nil,
                                                outputCallback: { (outputCallbackRefCon, _, status, infoFlags, sampleBuffer) in
            guard status == noErr, let sampleBuffer = sampleBuffer else {
                if infoFlags.contains(.frameDropped) {
                    print("VideoToolboxEncoder kVTEncodeInfo_FrameDropped")
                }
                return
            }
            /// Hand the encoded sampleBuffer to the owner through its callback
            guard let refCon = outputCallbackRefCon else { return }
            let encoder = Unmanaged<KFVideoEncoder>.fromOpaque(refCon).takeUnretainedValue()
            encoder.sampleBufferOutputCallBack?(sampleBuffer)
        },
                                                refcon: UnsafeMutableRawPointer(Unmanaged.passUnretained(self).toOpaque()),
                                                compressionSessionOut: &session)
        if status != noErr {
            return status
        }
        
        guard let compressionSession = session else { return status }
        self.compressionSession = compressionSession
        
        /// Basic properties
        VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
        VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_ProfileLevel, value: config.profile as CFString)
        VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_AllowFrameReordering, value: config.openBFrame as CFTypeRef)
        
        /// For H264, use CABAC entropy coding
        if config.codecType == kCMVideoCodecType_H264 {
            VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_H264EntropyMode, value: kVTH264EntropyMode_CABAC)
        }
        
        /// Pixel transfer properties (letterbox when the input size differs)
        let transferDict: [String: Any] = [kVTPixelTransferPropertyKey_ScalingMode as String: kVTScalingMode_Letterbox]
        VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_PixelTransferProperties, value: transferDict as CFTypeRef)
        
        /// Average bitrate
        VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_AverageBitRate, value: config.bitrate as CFTypeRef)
        
        /// For H264 with B-frames disabled, cap the data rate.
        /// The value is a (bytes, seconds) pair: 1.5x the average bitrate over 1 second.
        if !config.openBFrame && config.codecType == kCMVideoCodecType_H264 {
            let limits = [NSNumber(value: config.bitrate * 3 / 16), NSNumber(value: 1)]
            VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_DataRateLimits, value: limits as CFArray)
        }
        
        /// Frame rate and GOP
        VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_ExpectedFrameRate, value: config.fps as CFTypeRef)
        VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_MaxKeyFrameInterval, value: config.gopSize as CFTypeRef)
        VTSessionSetProperty(compressionSession, key: kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration, value: (Double(config.gopSize) / Double(config.fps)) as CFTypeRef)
        
        /// Prepare to encode
        return VTCompressionSessionPrepareToEncodeFrames(compressionSession)
    }
    
    /// Release the compression session
    private func releaseCompressionSession() {
        if let session = compressionSession {
            VTCompressionSessionCompleteFrames(session, untilPresentationTimeStamp: .invalid)
            VTCompressionSessionInvalidate(session)
            self.compressionSession = nil
        }
    }
    
    /// Internal flush logic: finish all pending frames
    private func flushInternal() {
        if let session = compressionSession {
            VTCompressionSessionCompleteFrames(session, untilPresentationTimeStamp: .invalid)
        }
    }
}

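As the code shows, once capture succeeds, each captured frame arrives through a video frame output callback.

That callback calls the encode method of the file above. encode creates and configures the compression session when needed, and when encoding succeeds, the session's output callback returns the encoded sampleBuffer through the sampleBufferOutputCallBack closure.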

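3. Running the encoding process

  • Frame input
    • The camera delivers CMSampleBuffer data
    • The CVPixelBuffer and the timestamp are extracted from it
  • Encoding
    • Frames are submitted for encoding through VTCompressionSessionEncodeFrame
    • The presentation timestamp, frame duration, and other attributes are set
    • The encode status is checked and errors are handled
  • Handling interruptions
    • When the app enters the background, the session is marked as needing a refresh
    • When the session becomes invalid, it is rebuilt
    • Session creation is retried at most 5 times, and every failure is counted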
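The retry and failure counting described above follows one pattern: a counter of consecutive failures that resets on success and blocks further work once a threshold is reached. The sketch below isolates that pattern; `FailureGate` is a hypothetical type introduced for illustration and does not appear in the encoder, which keeps plain integer counters instead.

```swift
import Foundation

/// Hypothetical model of the consecutive-failure gate used for both
/// session creation (limit 5) and frame encoding (limit 20).
struct FailureGate {
    let maxFailures: Int
    private(set) var consecutiveFailures = 0

    init(maxFailures: Int) { self.maxFailures = maxFailures }

    /// True while further attempts are still allowed.
    var isOpen: Bool { consecutiveFailures < maxFailures }

    /// Records one attempt: a success resets the counter, a failure increments it.
    mutating func record(success: Bool) {
        consecutiveFailures = success ? 0 : consecutiveFailures + 1
    }
}

var sessionGate = FailureGate(maxFailures: 5)
for _ in 0..<4 { sessionGate.record(success: false) }
let openAfterFourFailures = sessionGate.isOpen      // still below the limit
sessionGate.record(success: true)
let countAfterSuccess = sessionGate.consecutiveFailures
for _ in 0..<5 { sessionGate.record(success: false) }
let openAfterFiveFailures = sessionGate.isOpen      // limit reached
```

Because a single success resets the counter, only an unbroken run of failures closes the gate, so occasional dropped frames do not stop the encoder.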

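4. Data processing and storage

  • Parameter set extraction
    • Read the H264 SPS and PPS, or the HEVC VPS, SPS, and PPS, from the CMFormatDescription
    • Detect key frames (a frame is a key frame when kCMSampleAttachmentKey_NotSync is absent or false)
  • Format conversion
    • The encoder output is in AVCC/HVCC format: [extradata]|[length][NALU]|[length][NALU]|...
    • Convert it to Annex B format: [startcode][NALU]|[startcode][NALU]|...
    • Prepend the start code 0x00000001 to each NALU
  • Writing data
    • For a key frame, write the parameter sets (VPS, SPS, PPS) followed by the frame data
    • For other frames, write only the frame data
    • Use FileHandle to write to a .h264/.h265 file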
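The format conversion step can be sketched on plain bytes, without any VideoToolbox types. The function below is an illustrative sketch, not the article's implementation: the name `annexB(fromLengthPrefixed:lengthSize:)` is hypothetical, and the 4-byte length field is an assumption (in practice the length size is reported together with the parameter sets in the format description).

```swift
import Foundation

/// Converts a length-prefixed (AVCC/HVCC) buffer to Annex B.
/// `lengthSize` is the size of each big-endian length field in bytes
/// (assumed to be 4 here). Returns nil for empty NALUs or truncated input.
func annexB(fromLengthPrefixed data: Data, lengthSize: Int = 4) -> Data? {
    let startCode: [UInt8] = [0x00, 0x00, 0x00, 0x01]
    let bytes = [UInt8](data)
    var output = Data()
    var offset = 0
    while offset < bytes.count {
        // Read the big-endian NALU length.
        guard offset + lengthSize <= bytes.count else { return nil }
        var naluLength = 0
        for i in 0..<lengthSize {
            naluLength = (naluLength << 8) | Int(bytes[offset + i])
        }
        offset += lengthSize
        // Reject empty NALUs and lengths that run past the buffer.
        guard naluLength > 0, offset + naluLength <= bytes.count else { return nil }
        // Replace the length field with the start code and copy the payload.
        output.append(contentsOf: startCode)
        output.append(contentsOf: bytes[offset..<(offset + naluLength)])
        offset += naluLength
    }
    return output
}

// Two NALUs: a 2-byte one and a 1-byte one.
let sample = Data([0, 0, 0, 2, 0x67, 0x42, 0, 0, 0, 1, 0x68])
let converted = annexB(fromLengthPrefixed: sample)
let truncated = annexB(fromLengthPrefixed: Data([0, 0, 0, 5, 0x65]))
```

With a 4-byte length field the output has the same size as the input, since each length prefix is swapped for a 4-byte start code. The parameter sets taken from the format description carry no length prefix, so each one only needs a start code written in front of it.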

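5. Concurrency and thread control

  • Dedicated queues
    • Capture runs on the captureQueue
    • Encoding runs on the encoderQueue
    • This avoids thread contention and keeps the UI from blocking
  • Error handling
    • Encode failures are counted and compared against a threshold
    • Errors are reported to the upper layer through a callback
    • The encode status is monitored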
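The queue isolation above amounts to confining mutable state to one serial queue. A minimal sketch of that idea follows; `QueueConfinedCounter` and the queue label are hypothetical names used only for this example, and the `@unchecked Sendable` annotation is an assumption made so that the sketch also compiles under stricter concurrency checking.

```swift
import Foundation

/// Hypothetical counter whose state is only read and written on one serial queue.
/// Marked @unchecked Sendable because the queue, not the compiler, enforces safety.
final class QueueConfinedCounter: @unchecked Sendable {
    private let queue = DispatchQueue(label: "example.encoder.state")
    private var failures = 0

    /// Safe to call from any thread; the mutation itself runs on `queue`.
    func recordFailure() {
        queue.async { self.failures += 1 }
    }

    /// Waits for previously submitted work, then reads the value on `queue`.
    func currentFailures() -> Int {
        queue.sync { failures }
    }
}

let counter = QueueConfinedCounter()
// Hammer the counter from many threads at once.
DispatchQueue.concurrentPerform(iterations: 100) { _ in counter.recordFailure() }
let total = counter.currentFailures()
```

Because the queue is serial and runs blocks in submission order, no increment is lost even though the calls come from many threads. The same reasoning applies to flags such as needRefreshSession: they stay consistent as long as every read and write happens on the encoder queue.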

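6. Controls and interaction

  • User interface controls
    • Start button: start encoding
    • Stop button: stop encoding and flush the encoder
    • Camera button: switch between the front and back cameras
    • Double tap on the screen: quickly switch cameras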