Android WebRTC VideoFrame

VideoFrame

VideoFrame is a fairly important concept in WebRTC: the video frame. From an Android developer's point of view it lives in the org.webrtc package; the web and iOS SDKs expose a corresponding VideoFrame as well, with most of the same fields, because each of them is just a platform-side abstraction over the same native (C++) entity.

If the core lives in C++, why should we bother to understand the Java class? Apart from cases such as frame compensation, where we may need to intercept and hold on to a particular frame, we may also need to convert a frame into a Bitmap or into a byte[] for storage. More to the point, while doing some business logic with it I ran into the following error:

java.lang.IllegalStateException: retain() called on an object with refcount < 1
    at org.webrtc.RefCountDelegate.retain(RefCountDelegate.java:34)
    at org.webrtc.TextureBufferImpl.retain(TextureBufferImpl.java:119)
    at org.webrtc.VideoFrame.retain(VideoFrame.java:196)
    at org.webrtc.EglRenderer.onFrame(EglRenderer.java:521)
    at org.webrtc.SurfaceEglRenderer.onFrame(SurfaceEglRenderer.java:106)
    at org.webrtc.SurfaceViewRenderer.onFrame(SurfaceViewRenderer.java:196)
    at org.webrtc.SurfaceViewRenderer.triggerLastFrameRefresh(SurfaceViewRenderer.java:28)

Why is this error thrown, and what exactly do retain and release mean?

public class VideoFrame implements RefCounted

To answer the question above: VideoFrame implements RefCounted. The name literally reads as "reference counted", and that is indeed what it is about:

public interface RefCounted {
  /** Increases ref count by one. */
  @CalledByNative void retain();

  /**
   * Decreases ref count by one. When the ref count reaches zero, resources related to the object
   * will be freed.
   */
  @CalledByNative void release();
}

The retain and release methods manage the reference count of a VideoFrame. Reference counting is a memory-management technique that tracks how many owners currently reference an object; when the count drops to zero, i.e. nothing is using the object anymore, it should be destroyed and its memory freed.

Concretely, the two methods mean the following:

  • retain(): increments the reference count of the VideoFrame. It means another owner now holds a reference to the frame, which prevents it from being destroyed while it is still needed. In multi-threaded code, when a VideoFrame has to be handed over to another thread, you normally call retain() before handing it off so the frame cannot be destroyed while in transit.
  • release(): decrements the reference count. When the count reaches zero, nobody holds a reference to the frame anymore and it can be destroyed to free memory. Once you are done with a VideoFrame you should call release() to tell the system you no longer need it, so its resources are reclaimed promptly.

Knowing what retain and release mean, we should be able to avoid the error shown at the beginning of this article (a minimal sketch of the correct pattern follows this list):

  1. Make sure the object's reference count is at least 1 before calling retain().

  2. After calling release(), do not touch the object again, so that it cannot be released twice.
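
Here is a minimal sketch of that pattern, assuming frames arrive through an org.webrtc.VideoSink and need to outlive the onFrame() callback because they are processed on a worker thread. The RetainingSink class and the process() helper are made up for illustration:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.webrtc.VideoFrame;
import org.webrtc.VideoSink;

public class RetainingSink implements VideoSink {
  private final ExecutorService worker = Executors.newSingleThreadExecutor();

  @Override
  public void onFrame(VideoFrame frame) {
    // The frame is only guaranteed to stay alive for the duration of onFrame(),
    // so take our own reference before leaving the callback.
    frame.retain();
    worker.execute(() -> {
      try {
        process(frame); // hypothetical processing step
      } finally {
        // Balance the retain() above exactly once, and never touch the frame afterwards.
        frame.release();
      }
    });
  }

  private void process(VideoFrame frame) {
    // ... convert to I420, copy pixels, etc.
  }
}

The rule of thumb: every retain() must be balanced by exactly one release(), and the final release() must come after the last access to the frame.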

Now let's look at what else in VideoFrame deserves our attention:

public class VideoFrame implements RefCounted {
  /**
   * Implements image storage medium. Might be for example an OpenGL texture or a memory region
   * containing I420-data.
   *
   * <p>Reference counting is needed since a video buffer can be shared between multiple VideoSinks,
   * and the buffer needs to be returned to the VideoSource as soon as all references are gone.
   */
  public interface Buffer extends RefCounted {
    /**
     * Representation of the underlying buffer. Currently, only NATIVE and I420 are supported.
     */
    @CalledByNative("Buffer")
    @VideoFrameBufferType
    default int getBufferType() {
      return VideoFrameBufferType.NATIVE;
    }

    /**
     * Resolution of the buffer in pixels.
     */
    @CalledByNative("Buffer") int getWidth();
    @CalledByNative("Buffer") int getHeight();

    /**
     * Returns a memory-backed frame in I420 format. If the pixel data is in another format, a
     * conversion will take place. All implementations must provide a fallback to I420 for
     * compatibility with e.g. the internal WebRTC software encoders.
     */
    @CalledByNative("Buffer") I420Buffer toI420();

    @Override @CalledByNative("Buffer") void retain();
    @Override @CalledByNative("Buffer") void release();

    /**
     * Crops a region defined by `cropx`, `cropY`, `cropWidth` and `cropHeight`. Scales it to size
     * `scaleWidth` x `scaleHeight`.
     */
    @CalledByNative("Buffer")
    Buffer cropAndScale(
        int cropX, int cropY, int cropWidth, int cropHeight, int scaleWidth, int scaleHeight);
  }

  /**
   * Interface for I420 buffers.
   */
  public interface I420Buffer extends Buffer {
    @Override
    default int getBufferType() {
      return VideoFrameBufferType.I420;
    }

    /**
     * Returns a direct ByteBuffer containing Y-plane data. The buffer capacity is at least
     * getStrideY() * getHeight() bytes. The position of the returned buffer is ignored and must
     * be 0. Callers may mutate the ByteBuffer (eg. through relative-read operations), so
     * implementations must return a new ByteBuffer or slice for each call.
     */
    @CalledByNative("I420Buffer") ByteBuffer getDataY();
    /**
     * Returns a direct ByteBuffer containing U-plane data. The buffer capacity is at least
     * getStrideU() * ((getHeight() + 1) / 2) bytes. The position of the returned buffer is ignored
     * and must be 0. Callers may mutate the ByteBuffer (eg. through relative-read operations), so
     * implementations must return a new ByteBuffer or slice for each call.
     */
    @CalledByNative("I420Buffer") ByteBuffer getDataU();
    /**
     * Returns a direct ByteBuffer containing V-plane data. The buffer capacity is at least
     * getStrideV() * ((getHeight() + 1) / 2) bytes. The position of the returned buffer is ignored
     * and must be 0. Callers may mutate the ByteBuffer (eg. through relative-read operations), so
     * implementations must return a new ByteBuffer or slice for each call.
     */
    @CalledByNative("I420Buffer") ByteBuffer getDataV();

    @CalledByNative("I420Buffer") int getStrideY();
    @CalledByNative("I420Buffer") int getStrideU();
    @CalledByNative("I420Buffer") int getStrideV();
  }

  /**
   * Interface for buffers that are stored as a single texture, either in OES or RGB format.
   */
  public interface TextureBuffer extends Buffer {
    enum Type {
      OES(GLES11Ext.GL_TEXTURE_EXTERNAL_OES),
      RGB(GLES20.GL_TEXTURE_2D);

      private final int glTarget;

      private Type(final int glTarget) {
        this.glTarget = glTarget;
      }

      public int getGlTarget() {
        return glTarget;
      }
    }

    Type getType();
    int getTextureId();

    /**
     * Retrieve the transform matrix associated with the frame. This transform matrix maps 2D
     * homogeneous coordinates of the form (s, t, 1) with s and t in the inclusive range [0, 1] to
     * the coordinate that should be used to sample that location from the buffer.
     */
    Matrix getTransformMatrix();
  }

  private final Buffer buffer;
  private final int rotation;
  private final long timestampNs;

  /**
   * Constructs a new VideoFrame backed by the given {@code buffer}.
   *
   * @note Ownership of the buffer object is tranferred to the new VideoFrame.
   */
  @CalledByNative
  public VideoFrame(Buffer buffer, int rotation, long timestampNs) {
    if (buffer == null) {
      throw new IllegalArgumentException("buffer not allowed to be null");
    }
    if (rotation % 90 != 0) {
      throw new IllegalArgumentException("rotation must be a multiple of 90");
    }
    this.buffer = buffer;
    this.rotation = rotation;
    this.timestampNs = timestampNs;
  }

  @CalledByNative
  public Buffer getBuffer() {
    return buffer;
  }

  /**
   * Rotation of the frame in degrees.
   */
  @CalledByNative
  public int getRotation() {
    return rotation;
  }

  /**
   * Timestamp of the frame in nano seconds.
   */
  @CalledByNative
  public long getTimestampNs() {
    return timestampNs;
  }

  public int getRotatedWidth() {
    if (rotation % 180 == 0) {
      return buffer.getWidth();
    }
    return buffer.getHeight();
  }

  public int getRotatedHeight() {
    if (rotation % 180 == 0) {
      return buffer.getHeight();
    }
    return buffer.getWidth();
  }

  @Override
  public void retain() {
    buffer.retain();
  }

  @Override
  @CalledByNative
  public void release() {
    buffer.release();
  }
}

A few keywords stand out in this code:

  • Buffer
  • Y/U/V
  • I420

This Buffer is not the one from java.io; it is an interface that standardizes how a WebRTC video frame is backed. There are generally two kinds of implementations: one backed by YUV data in memory, and one backed by an OpenGL texture holding RGB (or OES) data.
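
As a rough sketch (BufferInspector is just an illustration, not part of WebRTC), this is how the two flavours can be told apart inside a sink callback:

import android.util.Log;

import org.webrtc.VideoFrame;

/** Sketch: check whether a frame is texture-backed or memory-backed. */
public final class BufferInspector {
  private static final String TAG = "BufferInspector";

  public static void inspect(VideoFrame frame) {
    VideoFrame.Buffer buffer = frame.getBuffer();
    if (buffer instanceof VideoFrame.TextureBuffer) {
      // GPU path: the pixels live in an OES/RGB texture, not in CPU memory.
      VideoFrame.TextureBuffer tex = (VideoFrame.TextureBuffer) buffer;
      Log.d(TAG, "texture buffer, type=" + tex.getType()
          + ", textureId=" + tex.getTextureId());
    } else {
      // CPU path: every Buffer must be convertible to I420. toI420() hands back a
      // buffer that the caller owns (for texture buffers it implies a GPU readback),
      // so it has to be released separately from the frame itself.
      VideoFrame.I420Buffer i420 = buffer.toI420();
      Log.d(TAG, "I420 " + i420.getWidth() + "x" + i420.getHeight()
          + ", strideY=" + i420.getStrideY());
      i420.release();
    }
  }
}

On Android, camera capture through SurfaceTextureHelper typically produces texture-backed frames, so code that needs raw pixels should go through toI420() rather than assume a memory-backed buffer.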

So what is YUV? RGB we all know well; think of Android's Bitmap, where you run into configurations such as ARGB_8888 or RGB_565.

My first contact with YUV was back in 2015, when I tried to use the raw YUV data with the ZXing library to improve the success rate of scanning and decoding QR codes. The main differences between YUV and RGB are:

Representation

  1. RGB: represents a color as a combination of three channels, red (R), green (G) and blue (B). Each pixel is described by the intensity of these three channels, usually in the range 0 to 255.
  2. YUV: a color encoding that separates luminance from chrominance. Y is the luminance component, while U and V are the chrominance components; Y carries the brightness information of the image and U and V carry the color information.

Storage

  1. RGB: each pixel takes 3 bytes (or 4 bytes if an alpha channel is included, i.e. ARGB), one byte each for the red, green and blue channels. This is very intuitive for computer graphics, but it needs comparatively more storage.

  2. YUV: encodes the color information more compactly and usually needs less storage. In video compression and transmission YUV is much more common than RGB, because the chrominance information, to which the human eye is less sensitive, can be compressed more aggressively.

Typical uses

  1. RGB: mostly used for computer displays and graphics processing, because it maps directly onto the color output of a display.

  2. YUV: more common in video processing and transmission, because it is better suited to compressing and transmitting video signals. Most video codec standards (MPEG, H.264 and so on) operate on YUV data.

In short, RGB suits graphics processing and display, while YUV suits video encoding, decoding and transmission, especially when bandwidth and storage are limited.
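
A quick calculation for a single 1920x1080 frame makes the difference concrete:

  • RGB at 3 bytes per pixel: 1920 × 1080 × 3 ≈ 6.2 MB per frame (about 8.3 MB with an alpha channel).
  • I420 (YUV 4:2:0) at 1.5 bytes per pixel on average: 1920 × 1080 × 1.5 ≈ 3.1 MB per frame, half the raw size before any codec has even touched the data.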

The most important part of YUV is the Y (luminance) plane. If you have ever played with a tool that lets you tweak the three YUV components, you will have noticed that no matter how you change U and V you can still tell what the object is; at worst it looks like a black-and-white photo.
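
To see that in code, here is a minimal sketch (LumaPreview is a made-up helper, not a WebRTC class) that builds a grayscale Bitmap from nothing but the Y plane of an I420Buffer:

import android.graphics.Bitmap;

import java.nio.ByteBuffer;

import org.webrtc.VideoFrame;

/** Sketch: render only the Y plane of an I420 buffer as a grayscale Bitmap. */
public final class LumaPreview {
  public static Bitmap fromI420(VideoFrame.I420Buffer i420) {
    int width = i420.getWidth();
    int height = i420.getHeight();
    int strideY = i420.getStrideY();
    ByteBuffer y = i420.getDataY();

    int[] argb = new int[width * height];
    for (int row = 0; row < height; row++) {
      for (int col = 0; col < width; col++) {
        int luma = y.get(row * strideY + col) & 0xFF; // absolute get(), ignores position
        // Put the same luma value into R, G and B: a black-and-white image.
        argb[row * width + col] = 0xFF000000 | (luma << 16) | (luma << 8) | luma;
      }
    }
    return Bitmap.createBitmap(argb, width, height, Bitmap.Config.ARGB_8888);
  }
}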

And what is I420? See the comments in the source above.

I420 is a storage format for images and video frames; NV12, YUY2 and others are similar formats. WebRTC simply requires every implementation to be able to fall back to this common format when returning frame data, to stay compatible with its internal software encoders.
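
In I420 the three planes are stored back to back: a full-resolution Y plane followed by quarter-resolution U and V planes. A small sketch of the sizes and offsets for a width x height frame, assuming the planes are tightly packed with no stride padding:

/** Sketch: byte layout of a tightly packed I420 frame (no stride padding). */
final class I420Layout {
  static void print(int width, int height) {
    int chromaWidth = (width + 1) / 2;   // chroma is subsampled 2x horizontally...
    int chromaHeight = (height + 1) / 2; // ...and 2x vertically
    int ySize = width * height;
    int uSize = chromaWidth * chromaHeight;
    int vSize = chromaWidth * chromaHeight;
    // The planes follow each other in memory: [ Y | U | V ]
    System.out.println("Y at offset 0, " + ySize + " bytes");
    System.out.println("U at offset " + ySize + ", " + uSize + " bytes");
    System.out.println("V at offset " + (ySize + uSize) + ", " + vSize + " bytes");
    System.out.println("total " + (ySize + uSize + vSize)
        + " bytes (width * height * 3 / 2 for even dimensions)");
  }
}

For a 1920x1080 frame this gives a 2,073,600-byte Y plane, two 518,400-byte chroma planes and 3,110,400 bytes in total, which is exactly the width * height * 3 / 2 array size used in the next example.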

Below are a few VideoFrame usage examples collected from around the internet.

VideoFrame to byte[]

public void onFrame(VideoFrame frame) {
    // toI420() returns a buffer that we own and must release when done.
    VideoFrame.I420Buffer buffer = frame.getBuffer().toI420();
    int height = buffer.getHeight();
    int width = buffer.getWidth();

    ByteBuffer yBuffer = buffer.getDataY();
    ByteBuffer uBuffer = buffer.getDataU();
    ByteBuffer vBuffer = buffer.getDataV();

    int yStride = buffer.getStrideY();
    int uStride = buffer.getStrideU();
    int vStride = buffer.getStrideV();

    // Tightly packed I420 output: Y plane, then U, then V.
    byte[] data = new byte[width * height * 3 / 2];

    // Copy the Y plane row by row, skipping any stride padding.
    int yOffset = 0;
    for (int i = 0; i < height; i++) {
        yBuffer.position(i * yStride);
        yBuffer.get(data, yOffset, width);
        yOffset += width;
    }

    // Copy the quarter-resolution U and V planes the same way.
    int uOffset = width * height;
    int vOffset = width * height * 5 / 4;
    for (int i = 0; i < height / 2; i++) {
        uBuffer.position(i * uStride);
        uBuffer.get(data, uOffset, width / 2);
        uOffset += width / 2;
        vBuffer.position(i * vStride);
        vBuffer.get(data, vOffset, width / 2);
        vOffset += width / 2;
    }
    buffer.release();
}

VideoFrame to Bitmap

    // Must run on a thread with a current EGL context (e.g. the render thread),
    // because it draws the frame into an offscreen framebuffer and reads it back.
    public Bitmap saveImgBitmap(VideoFrame frame) {
        final Matrix drawMatrix = new Matrix();
        // Used for bitmap capturing.
        final GlTextureFrameBuffer bitmapTextureFramebuffer =
                new GlTextureFrameBuffer(GLES20.GL_RGBA);
        drawMatrix.reset();
        drawMatrix.preTranslate(0.5f, 0.5f);
        // Flip vertically around the center so the rows read back by glReadPixels
        // end up in Bitmap order (top-down) instead of GL order (bottom-up).
        drawMatrix.preScale(1f, -1f);
        drawMatrix.preTranslate(-0.5f, -0.5f);

        final int scaledWidth = frame.getRotatedWidth();
        final int scaledHeight = frame.getRotatedHeight();
        bitmapTextureFramebuffer.setSize(scaledWidth, scaledHeight);

        GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, bitmapTextureFramebuffer.getFrameBufferId());
        GLES20.glFramebufferTexture2D(GLES20.GL_FRAMEBUFFER, GLES20.GL_COLOR_ATTACHMENT0,
                GLES20.GL_TEXTURE_2D, bitmapTextureFramebuffer.getTextureId(), 0);

        GLES20.glClearColor(0 /* red */, 0 /* green */, 0 /* blue */, 0 /* alpha */);
        GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);
        VideoFrameDrawer frameDrawer = new VideoFrameDrawer();
        RendererCommon.GlDrawer drawer = new GlRectDrawer();
        frameDrawer.drawFrame(frame, drawer, drawMatrix, 0 /* viewportX */,
                0 /* viewportY */, scaledWidth, scaledHeight);

        final ByteBuffer bitmapBuffer = ByteBuffer.allocateDirect(scaledWidth * scaledHeight * 4);
        GLES20.glViewport(0, 0, scaledWidth, scaledHeight);
        GLES20.glReadPixels(
                0, 0, scaledWidth, scaledHeight, GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, bitmapBuffer);

        GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, 0);
        GlUtil.checkNoGLES2Error("saveImgBitmap");

        // Free the GL resources created for this one-off capture.
        frameDrawer.release();
        drawer.release();
        bitmapTextureFramebuffer.release();

        final Bitmap bitmap = Bitmap.createBitmap(scaledWidth, scaledHeight, Bitmap.Config.ARGB_8888);
        bitmap.copyPixelsFromBuffer(bitmapBuffer);

        // Optionally persist the capture as a JPEG.
        File file = new File("/data/data/com.xxx.diagnose/files" + "/test.jpg");
        try (OutputStream outputStream = new FileOutputStream(file)) {
            bitmap.compress(Bitmap.CompressFormat.JPEG, 100, outputStream);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return bitmap;
    }