Netty源码—6.ByteBuf原理一

大纲

1.关于ByteBuf的问题整理

2.ByteBuf结构以及重要API

3.ByteBuf的分类

4.ByteBuf分类的补充说明

5.ByteBuf的主要内容分三大方面

6.内存分配器ByteBufAllocator

7.ByteBufAllocator的两大子类

8.PoolArena分配内存的流程

1.关于ByteBuf的问题整理

问题一:Netty的内存类别有哪些?

答:ByteBuf可以按三个维度来进行分类:一个是堆内和堆外,一个是Unsafe和非Unsafe,一个是Pooled和非Pooled。

堆内是基于byte字节数组内存进行分配,堆外是基于JDK的DirectByteBuffer内存进行分配。

Unsafe是通过JDK的一个unsafe对象基于物理内存地址进行数据读写,非Unsafe是直接调用JDK的API进行数据读写。

Pooled是预先分配的一整块内存,分配时直接用一定的算法从这整块内存里取出一块连续内存。UnPooled是每次分配内存都直接申请内存。

问题二:如何减少多线程内存分配之间的竞争?如何确保多线程对于同一内存分配不产生冲突?

答:一个内存分配器里维护着一个PoolArena数组,所有的内存分配都在PoolArena上进行。通过一个PoolThreadCache对象将线程和PoolArena进行一一绑定(利用ThreadLocal原理)。默认一个线程对应一个PoolArena,这样就能做到多线程内存分配相互不受影响。

问题三:不同大小的内存是如何进行分配的?

答:对于Page级别的内存分配与释放是直接通过完全二叉树的标记来寻找某一个连续内存的。对于Page级别以下的内存分配与释放,首先是找到一个Page,然后把此Page按照SubPage大小进行划分,最后通过位图的方式来进行内存分配与释放。

不管是Page级别的内存还是SubPage级别的内存,当内存被释放掉时有可能会被加入到不同级别的一个缓存队列供下一次分配使用。

2.ByteBuf结构以及重要API

(1)ByteBuf的结构

(2)read、write和set方法

(3)mark和reset方法

(4)retain和release方法

(5)slice、duplicate和copy方法

(1)ByteBuf的结构

ByteBuf是一个字节容器,分三部分:

第一部分是已经丢弃的字节,这部分数据是无效的;

第二部分是可读字节,这部分数据是ByteBuf的主体数据,从ByteBuf里读取的数据都是来自这部分;

第三部分是可写字节,所有写到ByteBuf的数据都会写到这一段;

从ByteBuf中每读取一字节,读指针readerIndex + 1。从ByteBuf中每写入一字节,写指针writerIndex + 1。ByteBuf里的可读字节数是:writerIndex - readerIndex。当writerIndex = readerIndex时,表示ByteBuf不可读。

capacity表示ByteBuf底层内存总容量,当writerIndex = capacity时,表示ByteBuf不可写。

当向ByteBuf写数据时,如果容量不足,可以进行扩容,直到capacity扩容到maxCapacity。

注意:ByteBuf的三个核心变量是readerIndex、writerIndex、capacity,ByteBuffer的三个核心变量是position、limit、capacity。

(2)read、write和set方法

readableBytes()表示ByteBuf当前可读的字节数,它的值等于writerIndex - readerIndex。readBytes(byte[] dst)表示把ByteBuf里的数据全部读取到dst,这里dst字节数组的大小通常等于readableBytes()。

writableBytes()表示ByteBuf当前可写的字节数,它的值等于capacity - writerIndex。writeBytes(byte[] src)表示把字节数组src里的全部数据写到ByteBuf,这里src字节数组的大小通常小于等于writableBytes()。

(3)mark和reset方法

markReaderIndex()表示将当前的readerIndex备份到markedReaderIndex中,markWriterIndex()表示将当前的writerIndex备份到markedWriterIndex中,resetReaderIndex()表示将当前的readerIndex设置为markedReaderIndex,resetWriterIndex()表示将当前的writerIndex设置为markedWriterIndex。

(4)retain和release方法

由于Netty使用了堆外内存,而堆外内存是不被JVM直接管理的。也就是说,申请到的堆外内存无法被垃圾回收器自动回收,所以需要手动回收。

Netty的ByteBuf是通过引用计数的方式管理的,如果一个ByteBuf没有地方被引用到,则需要回收底层内存。

在默认情况下,当创建完一个ByteBuf时,它的引用计数为1。然后每次调用retain()方法,它的引用计数就会加1。接着每次调用release()方法,它的引用计数就会减1。release()方法减完之后如果发现引用计数为0,则直接回收ByteBuf底层的内存。

所以在一个函数体内,只要增加了引用计数(包括ByteBuf的创建和手动调用retain()方法),就必须调用release()方法以免内存泄露。

(5)slice、duplicate和copy方法

这三个方法的返回值分别是一个新的ByteBuf对象。

slice()方法会从原始ByteBuf中截取一段,这段数据是从readIndex到writerIndex的,同时返回的新的ByteBuf的最大容量maxCapacity为原始ByteBuf的readableBytes()。往slice()方法返回的ByteBuf中写数据会影响原始ByteBuf。

duplicate()方法会把原始ByteBuf全都截取出来,包括所有的数据和指针信息。往duplicate()方法返回的ByteBuf中写数据会影响原始ByteBuf。

copy()方法会从原始ByteBuf中复制所有的信息,包括读写指针和底层对应的数据。往copy()方法返回的ByteBuf中写数据不会影响原始ByteBuf。

slice()方法与duplicate()方法的相同点是:底层内存及引用计数与原始ByteBuf共享。slice()方法或者duplicate()方法返回的ByteBuf调用write系列方法都会影响到原始ByteBuf。

retainedSlice()方法等价于slice().retain(),retainedDuplicate()方法等价于duplicate().retain()。

复制代码
//1.多次释放的例子
ByteBuf buffer = xxx;
doWith(buffer);
//重复释放
buffer.release();

public void doWith(ByteBuf buffer) {
    //没有增加引用计数
    ByteBuf slice = buffer.slice();
    foo(slice);
}

public void foo(ByteBuf buffer) {
    //一次释放
    buffer.release();
}

//2.不释放造成内存泄露的例子
ByteBuf buffer = xxx;
doWith(buffer);
//引用计数为2,调用release()方法后引用计数为1,无法释放内存
buffer.release();

public void doWith(ByteBuf buffer) {
    //增加引用计数
    ByteBuf slice = buffer.retainedSlice();
    foo(slice);
}

public void foo(ByteBuf buffer) {
    //没有调用release()方法
}

3.ByteBuf的分类

(1)ByteBuf的类结构

(2)Pooled和Unpooled

(3)Unsafe和非Unsafe

(4)Heap和Direct

(1)ByteBuf的类结构

AbstractByteBuf继承自ByteBuf,主要实现了一些基本骨架的方法,如ByteBuf的一些公共属性和功能,而具体的读字节和写字节操作会放到其子类通过下划线前缀的方法来实现。

(2)Pooled和Unpooled

Pooled和Unpooled就是池化和非池化的分类。Pooled的内存分配每一次都是从预先分配好的一块内存里去取一段连续内存来封装成一个ByteBuf对象给应用程序。Unpooled的内存分配每一次都是直接调用系统API去向操作系统申请一块内存。所以两者最大区别是:一个是从预先分配好的内存里分配,一个是直接去分配。

(3)Unsafe和非Unsafe

Unsafe和非Unsafe不需要我们操心,Netty会根据系统是否有unsafe对象自行选择。JDK里有个Unsafe对象,它可以直接拿到对象的内存地址,然后可以基于这个内存地址进行读写操作。Unsafe类型的ByteBuf可以拿到ByteBuf对象在JVM里的具体内存地址,然后直接通过JDK的Unsafe进行读写。非Unsafe类型的ByteBuf则是不会依赖到JDK里的Unsafe对象。

(4)Heap和Direct

Heap和Direct就是堆内和堆外的分类。Heap分配出来的内存会自动受到GC的管理,不需要手动释放。Direct分配出来的内存则不受JVM控制,不参与GC垃圾回收的过程,需要手动释放以免内存泄露。UnpooledHeapByteBuf底层会依赖于一个字节数组byte[]进行所有内存相关的操作,UnpooledDirectByteBuf底层则依赖于一个JDK堆外内存对象DirectByteBuffer进行所有内存相关的操作。

4.ByteBuf分类的补充说明

(1)堆内存HeapByteBuf

(2)直接内存DIrectByteBuf

(3)Pooled和Unpooled

(4)池化和非池化的HeapByteBuf

(1)堆内存HeapByteBuf

优点是内存的分配和回收速度快,可以被JVM自动回收。缺点是如果进行Socket的IO读写,需要额外做一次内存复制。也就是将堆内存对应的缓冲区复制到内核Channel中,性能下降。

(2)直接内存DIrectByteBuf

在堆外进行内存分配,相比于堆内存,它的分配和回收速度还会慢一些。但是如果进行Scoket的IO读写,由于少了一次内存复制,所以速度比堆内存快。

所以最佳实践应该是:IO通信线程的读写缓冲区使用DirectByteBuf,后端业务消息的编解码模块使用HeapByteBuf。

(3)Pooled和Unpooled

Pooled的ByteBuf可以重用ByteBuf对象,它自己维护了一个内存池,可以循环利用已创建的ByteBuf,从而提升了内存的使用效率,降低了由于高负载导致的频繁GC。

尽管推荐使用基于Pooled的ByteBuf,但是内存池的管理和维护更加复杂,使用起来需要更加谨慎。

(4)池化和非池化的HeapByteBuf

UnpooledHeapByteBuf是基于堆内存进行内存分配的字节缓冲区,每次IO读写都会创建一个新的UnpooledHeapByteBuf。

频繁进行大块内存的分配和回收会对性能造成影响,但相比于堆外内存的申请和释放,成本还是低些。

UnpooledHeapByteBuf的实现原理比PooledHeapByteBuf简单,不容易出现内存管理方面的问题,满足性能下推荐UnpooledHeapByteBuf。

5.ByteBuf的主要内容分三大方面

一.内存与内存分配器的抽象

二.不同规格大小和不同类别的内存的分配策略

三.内存的回收过程

6.内存分配器ByteBufAllocator

(1)ByteBufAllocator的功能

(2)AbstractByteBufAllocator

(1)ByteBufAllocator的核心功能

所有类型的ByteBuf最终都是通过Netty里的内存分配器分配出来的,Netty里的内存分配器都有一个最顶层的抽象ByteBufAllocator,用于负责分配所有类型的内存。

ByteBufAllocator的核心功能如下:

一.buffer()

分配一块内存或者说分配一个字节缓冲区,由子类具体实现决定是Heap还是Direct。

二.ioBuffer()

分配一块DirectByteBuffer的内存。

三.heapBuffer()

在堆上进行内存分配。

四.directBuffer()

在堆外进行内存分配。

复制代码
//Implementations are responsible to allocate buffers. Implementations of this interface are expected to be thread-safe. 
public interface ByteBufAllocator {
    ByteBufAllocator DEFAULT = ByteBufUtil.DEFAULT_ALLOCATOR;

    //Allocate a ByteBuf. If it is a direct or heap buffer depends on the actual implementation.
    ByteBuf buffer();

    //Allocate a ByteBuf with the given initial capacity.
    //If it is a direct or heap buffer depends on the actual implementation.
    ByteBuf buffer(int initialCapacity);

    //Allocate a ByteBuf} with the given initial capacity and the given maximal capacity. 
    //If it is a direct or heap buffer depends on the actual implementation.
    ByteBuf buffer(int initialCapacity, int maxCapacity);

    //Allocate a ByteBuf, preferably a direct buffer which is suitable for I/O.
    ByteBuf ioBuffer();

    //Allocate a ByteBuf, preferably a direct buffer which is suitable for I/O.
    ByteBuf ioBuffer(int initialCapacity);

    //Allocate a ByteBuf, preferably a direct buffer which is suitable for I/O.
    ByteBuf ioBuffer(int initialCapacity, int maxCapacity);

    //Allocate a heap ByteBuf.
    ByteBuf heapBuffer();

    //Allocate a heap ByteBuf with the given initial capacity.
    ByteBuf heapBuffer(int initialCapacity);

    //Allocate a heap ByteBuf with the given initial capacity and the given maximal capacity.
    ByteBuf heapBuffer(int initialCapacity, int maxCapacity);

    //Allocate a direct ByteBuf.
    ByteBuf directBuffer();

    //Allocate a direct ByteBuf with the given initial capacity.
    ByteBuf directBuffer(int initialCapacity);

    //Allocate a direct ByteBuf with the given initial capacity and the given maximal capacity.
    ByteBuf directBuffer(int initialCapacity, int maxCapacity);

    //Allocate a CompositeByteBuf.
    //If it is a direct or heap buffer depends on the actual implementation.
    CompositeByteBuf compositeBuffer();

    //Allocate a CompositeByteBuf with the given maximum number of components that can be stored in it.
    //If it is a direct or heap buffer depends on the actual implementation.
    CompositeByteBuf compositeBuffer(int maxNumComponents);

    //Allocate a heap CompositeByteBuf.
    CompositeByteBuf compositeHeapBuffer();

    //Allocate a heap CompositeByteBuf with the given maximum number of components that can be stored in it.
    CompositeByteBuf compositeHeapBuffer(int maxNumComponents);

    //Allocate a direct CompositeByteBuf.
    CompositeByteBuf compositeDirectBuffer();

    //Allocate a direct CompositeByteBuf with the given maximum number of components that can be stored in it.
    CompositeByteBuf compositeDirectBuffer(int maxNumComponents);

    //Returns true if direct ByteBuf's are pooled
    boolean isDirectBufferPooled();

    //Calculate the new capacity of a ByteBuf that is used when a ByteBuf needs to expand by the minNewCapacity with maxCapacity as upper-bound. 
    int calculateNewCapacity(int minNewCapacity, int maxCapacity);
 }

(2)AbstractByteBufAllocator

AbstractByteBufAllocator实现了ByteBufAllocator的大部分功能,并最终暴露出两个基本的API,也就是抽象方法newDirectBuffer()和newHeapBuffer()。

这两个抽象方法会由PooledByteBufAllocator和UnpooledByteBufAllocator来实现。

复制代码
//Skeletal ByteBufAllocator implementation to extend.
public abstract class AbstractByteBufAllocator implements ByteBufAllocator {
    private static final int DEFAULT_INITIAL_CAPACITY = 256;
    private final boolean directByDefault;
    private final ByteBuf emptyBuf;

    //Instance use heap buffers by default
    protected AbstractByteBufAllocator() {
        this(false);
    }

    //Create new instance
    //@param preferDirect true if #buffer(int) should try to allocate a direct buffer rather than a heap buffer
    protected AbstractByteBufAllocator(boolean preferDirect) {
        directByDefault = preferDirect && PlatformDependent.hasUnsafe();
        emptyBuf = new EmptyByteBuf(this);
    }

    @Override
    public ByteBuf buffer() {
        if (directByDefault) {
            return directBuffer();
        }
        return heapBuffer();
    }
    
    @Override
    public ByteBuf directBuffer() {
        return directBuffer(DEFAULT_INITIAL_CAPACITY, Integer.MAX_VALUE);
    }
    
    @Override
    public ByteBuf heapBuffer() {
        return heapBuffer(DEFAULT_INITIAL_CAPACITY, Integer.MAX_VALUE);
    }
    
    @Override
    public ByteBuf directBuffer(int initialCapacity, int maxCapacity) {
        if (initialCapacity == 0 && maxCapacity == 0) {
            return emptyBuf;
        }
        validate(initialCapacity, maxCapacity);
        return newDirectBuffer(initialCapacity, maxCapacity);
    }
    
    @Override
    public ByteBuf heapBuffer(int initialCapacity, int maxCapacity) {
        if (initialCapacity == 0 && maxCapacity == 0) {
            return emptyBuf;
        }
        validate(initialCapacity, maxCapacity);
        return newHeapBuffer(initialCapacity, maxCapacity);
    }
    
    private static void validate(int initialCapacity, int maxCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("initialCapacity: " + initialCapacity + " (expectd: 0+)");
        }
        if (initialCapacity > maxCapacity) {
            throw new IllegalArgumentException(String.format("initialCapacity: %d (expected: not greater than maxCapacity(%d)", initialCapacity, maxCapacity)); 
        }
    }
    
    //Create a heap {@link ByteBuf} with the given initialCapacity and maxCapacity.
    protected abstract ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity);


    //Create a direct {@link ByteBuf} with the given initialCapacity and maxCapacity.
    protected abstract ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity);
    ...
}

7.ByteBufAllocator的两大子类

(1)UnpooledByteBufAllocator介绍

(2)PooledByteBufAllocator介绍

(3)PooledByteBufAllocator的结构

(4)PooledByteBufAllocator如何创建一个ByteBuf总结

(1)UnpooledByteBufAllocator介绍

对于UnpooledHeadByteBuf的创建,会直接new一个字节数组,并且读写指针初始化为0。对于UnpooledDirectByteBuf的创建,会直接new一个DirectByteBuffer对象。注意:其中unsafe是Netty自行判断的,如果系统支持获取unsafe对象就使用unsafe对象。

对比UnpooledUnsafeHeadByteBuf和UnpooledHeadByteBuf的getByte()方法可知,unsafe和非unsafe的区别如下:unsafe最终会通过对象的内存地址 + 偏移量的方式去拿到对应的数据,非unsafe最终会通过数组 + 下标或者JDK底层的ByteBuffer的API去拿到对应的数据。一般情况下,通过unsafe对象去取数据要比非unsafe要快一些,因为unsafe对象是直接通过内存地址操作的。

复制代码
//Simplistic ByteBufAllocator implementation that does not pool anything.
public final class UnpooledByteBufAllocator extends AbstractByteBufAllocator {
    ...
    @Override
    protected ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity) {
        return PlatformDependent.hasUnsafe() ? 
            new UnpooledUnsafeHeapByteBuf(this, initialCapacity, maxCapacity) : 
            new UnpooledHeapByteBuf(this, initialCapacity, maxCapacity);
    }

    @Override
    protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
        ByteBuf buf = PlatformDependent.hasUnsafe() ?
            UnsafeByteBufUtil.newUnsafeDirectByteBuf(this, initialCapacity, maxCapacity) :
            new UnpooledDirectByteBuf(this, initialCapacity, maxCapacity);

        return disableLeakDetector ? buf : toLeakAwareBuffer(buf);
    }
    ...
}

//1.使用UnpooledUnsafeHeapByteBuf通过unsafe创建一个非池化的堆内存ByteBuf
final class UnpooledUnsafeHeapByteBuf extends UnpooledHeapByteBuf {
    ...
    //Creates a new heap buffer with a newly allocated byte array.
    //@param initialCapacity the initial capacity of the underlying byte array
    //@param maxCapacity the max capacity of the underlying byte array
    UnpooledUnsafeHeapByteBuf(ByteBufAllocator alloc, int initialCapacity, int maxCapacity) {
        super(alloc, initialCapacity, maxCapacity);
    }
    
    @Override
    public byte getByte(int index) {
        checkIndex(index);
        return _getByte(index);
    }

    @Override
    protected byte _getByte(int index) {
        return UnsafeByteBufUtil.getByte(array, index);
    }
    ...
}

//2.使用UnpooledHeapByteBuf通过非unsafe创建一个非池化的堆内存ByteBuf
public class UnpooledHeapByteBuf extends AbstractReferenceCountedByteBuf {
    private final ByteBufAllocator alloc;
    byte[] array;
    ...
    //Creates a new heap buffer with a newly allocated byte array.
    //@param initialCapacity the initial capacity of the underlying byte array
    //@param maxCapacity the max capacity of the underlying byte array
    protected UnpooledHeapByteBuf(ByteBufAllocator alloc, int initialCapacity, int maxCapacity) {
        this(alloc, new byte[initialCapacity], 0, 0, maxCapacity);
    }
    
    private UnpooledHeapByteBuf(ByteBufAllocator alloc, byte[] initialArray, int readerIndex, int writerIndex, int maxCapacity) { 
        ...
        this.alloc = alloc;
        setArray(initialArray);//设置直接创建的字节数组
        setIndex(readerIndex, writerIndex);//设置读写指针为0
    }

    private void setArray(byte[] initialArray) {
        array = initialArray;
        ...
    }
    
    @Override
    public byte getByte(int index) {
        ensureAccessible();
        return _getByte(index);
    }

    @Override
    protected byte _getByte(int index) {
        return HeapByteBufUtil.getByte(array, index);
    }
    ...
}

//3.使用UnsafeByteBufUtil.newUnsafeDirectByteBuf()创建一个非池化的直接内存ByteBuf
final class UnsafeByteBufUtil {
    ...
    static UnpooledUnsafeDirectByteBuf newUnsafeDirectByteBuf(ByteBufAllocator alloc, int initialCapacity, int maxCapacity) {
        if (PlatformDependent.useDirectBufferNoCleaner()) {
            return new UnpooledUnsafeNoCleanerDirectByteBuf(alloc, initialCapacity, maxCapacity);
        }
        return new UnpooledUnsafeDirectByteBuf(alloc, initialCapacity, maxCapacity);
    }
}

public class UnpooledUnsafeDirectByteBuf extends AbstractReferenceCountedByteBuf {
    private final ByteBufAllocator alloc;
    ByteBuffer buffer;
    ...
    //Creates a new direct buffer.
    //@param initialCapacity the initial capacity of the underlying direct buffer
    //@param maxCapacity     the maximum capacity of the underlying direct buffer
    protected UnpooledUnsafeDirectByteBuf(ByteBufAllocator alloc, int initialCapacity, int maxCapacity) {
        ...
        this.alloc = alloc;
        setByteBuffer(allocateDirect(initialCapacity), false);
    }
    
    //Allocate a new direct ByteBuffer with the given initialCapacity.
    protected ByteBuffer allocateDirect(int initialCapacity) {
        //使用ByteBuffer直接分配一个DirectByteBuffer对象
        return ByteBuffer.allocateDirect(initialCapacity);
    }
    
    final void setByteBuffer(ByteBuffer buffer, boolean tryFree) {
        ...
        this.buffer = buffer;
        ...
    }
}

//4.使用UnpooledDirectByteBuf创建一个非池化的直接内存ByteBuf
public class UnpooledDirectByteBuf extends AbstractReferenceCountedByteBuf {
    private final ByteBufAllocator alloc;
    private ByteBuffer buffer;
    ...
    //Creates a new direct buffer.  
    //@param initialCapacity the initial capacity of the underlying direct buffer
    //@param maxCapacity     the maximum capacity of the underlying direct buffer
    protected UnpooledDirectByteBuf(ByteBufAllocator alloc, int initialCapacity, int maxCapacity) {
        ...
        this.alloc = alloc;
        //使用ByteBuffer直接分配一个DirectByteBuffer对象
        setByteBuffer(ByteBuffer.allocateDirect(initialCapacity));
    }
    
    private void setByteBuffer(ByteBuffer buffer) {
        ...
        this.buffer = buffer;
        ...
    }
}

public abstract class ByteBuffer extends Buffer implements Comparable<ByteBuffer> {
    ...
    //Allocates a new direct byte buffer.
    public static ByteBuffer allocateDirect(int capacity) {
        return new DirectByteBuffer(capacity);
    }
    ...
}

//5.unsafe和非unsafe的区别
final class UnsafeByteBufUtil {
    //unsafe会调用这个方法
    static byte getByte(byte[] array, int index) {
        return PlatformDependent.getByte(array, index);
    }
    ...
}
    
final class HeapByteBufUtil {
    //非unsafe会调用这个方法
    static byte getByte(byte[] memory, int index) {
        return memory[index];
    }
    ...
}

(2)PooledByteBufAllocator介绍

PooledByteBufAllocator的newHeapBuffer()方法和newDirectBuffer()方法,都会首先通过threadCache获取一个PoolThreadCache对象,然后再从该对象获取一个heapArena对象或directArena对象,最后通过heapArena对象或directArena对象的allocate()方法去分配内存。

具体步骤如下:

步骤一:拿到线程局部缓存PoolThreadCache

因为newHeapBuffer()和newDirectBuffer()可能会被多线程同时调用,所以threadCache.get()拿到的是当前线程的cache,一个PoolThreadLocalCache对象。

PoolThreadLocalCache继承自FastThreadLocal,FastThreadLocal可以当作JDK的ThreadLocal,只不过比ThreadLocal更快。

每个线程都有唯一的PoolThreadCache,PoolThreadCache里维护两大内存:一个是堆内存heapArena,一个是堆外内存directArena。

步骤二:在线程局部缓存的Arena上进行内存分配

Arena可以翻译成竞技场的意思。

创建PooledByteBufAllocator内存分配器时,会创建两种类型的PoolArena数组:heapArenas和directArenas。这两个数组的大小默认都是两倍CPU核数,因为这样就和创建的NIO线程数一样了。这样每个线程都可以有一个独立的PoolArena。

PoolThreadLocalCache的initialValue()方法中,会从PoolArena数组中获取一个PoolArena与当前线程进行绑定。对于PoolArena数组里的每个PoolArena,在分配内存时是不用加锁的。

复制代码
public class PooledByteBufAllocator extends AbstractByteBufAllocator {
    private final PoolThreadLocalCache threadCache;
    private final PoolArena<byte[]>[] heapArenas;//一个线程会和一个PoolArena绑定
    private final PoolArena<ByteBuffer>[] directArenas;//一个线程会和一个PoolArena绑定
    //表示threadCache.tinySubPageHeapCaches数组里的每个MemoryRegionCache元素,最多可以缓存512个ByteBuf
    private final int tinyCacheSize;
    //表示threadCache.smallSubPageHeapCaches数组里的每个MemoryRegionCache元素,最多可以缓存256个ByteBuf
    private final int smallCacheSize;
    //表示threadCache.normalHeapCaches数组里的每个MemoryRegionCache元素,最多可以缓存64个ByteBuf
    private final int normalCacheSize;
    ...
    static {
        int defaultPageSize = SystemPropertyUtil.getInt("io.netty.allocator.pageSize", 8192);
        DEFAULT_PAGE_SIZE = defaultPageSize;
       
        int defaultMaxOrder = SystemPropertyUtil.getInt("io.netty.allocator.maxOrder", 11);
        DEFAULT_MAX_ORDER = defaultMaxOrder;
      
        final Runtime runtime = Runtime.getRuntime();
        final int defaultMinNumArena = runtime.availableProcessors() * 2;
        final int defaultChunkSize = DEFAULT_PAGE_SIZE << DEFAULT_MAX_ORDER;//8K * 2^11 = 16M
        DEFAULT_NUM_HEAP_ARENA = Math.max(0,SystemPropertyUtil.getInt("io.netty.allocator.numHeapArenas",
            (int) Math.min(defaultMinNumArena, runtime.maxMemory() / defaultChunkSize / 2 / 3)));
        DEFAULT_NUM_DIRECT_ARENA = Math.max(0,SystemPropertyUtil.getInt("io.netty.allocator.numDirectArenas",
            (int) Math.min(defaultMinNumArena, PlatformDependent.maxDirectMemory() / defaultChunkSize / 2 / 3)));

        //cache sizes
        DEFAULT_TINY_CACHE_SIZE = SystemPropertyUtil.getInt("io.netty.allocator.tinyCacheSize", 512);
        DEFAULT_SMALL_CACHE_SIZE = SystemPropertyUtil.getInt("io.netty.allocator.smallCacheSize", 256);
        DEFAULT_NORMAL_CACHE_SIZE = SystemPropertyUtil.getInt("io.netty.allocator.normalCacheSize", 64);

        //32 kb is the default maximum capacity of the cached buffer. Similar to what is explained in 'Scalable memory allocation using jemalloc'
        DEFAULT_MAX_CACHED_BUFFER_CAPACITY = SystemPropertyUtil.getInt("io.netty.allocator.maxCachedBufferCapacity", 32 * 1024);

        //the number of threshold of allocations when cached entries will be freed up if not frequently used
        DEFAULT_CACHE_TRIM_INTERVAL = SystemPropertyUtil.getInt("io.netty.allocator.cacheTrimInterval", 8192);
        ...
    }
    
    public PooledByteBufAllocator() {
        this(false);
    }
      
    public PooledByteBufAllocator(boolean preferDirect) {
        this(preferDirect, DEFAULT_NUM_HEAP_ARENA, DEFAULT_NUM_DIRECT_ARENA, DEFAULT_PAGE_SIZE, DEFAULT_MAX_ORDER);
    }
      
    public PooledByteBufAllocator(boolean preferDirect, int nHeapArena, int nDirectArena, int pageSize, int maxOrder) { 
        this(preferDirect, nHeapArena, nDirectArena, pageSize, maxOrder,
            DEFAULT_TINY_CACHE_SIZE, DEFAULT_SMALL_CACHE_SIZE, DEFAULT_NORMAL_CACHE_SIZE);
    }
      
    //默认的pageSize=8K=8192,maxOrder=11,tinyCacheSize=512,smallCacheSize=256,normalCacheSize=64
    public PooledByteBufAllocator(boolean preferDirect, int nHeapArena, int nDirectArena, 
        int pageSize, int maxOrder, int tinyCacheSize, int smallCacheSize, int normalCacheSize) {
        super(preferDirect);
        //初始化PoolThreadLocalCache
        this.threadCache = new PoolThreadLocalCache();
        this.tinyCacheSize = tinyCacheSize;//512
        this.smallCacheSize = smallCacheSize;//256
        this.normalCacheSize = normalCacheSize;//64
        //chunkSize = 8K * 2^11 = 16M
        final int chunkSize = validateAndCalculateChunkSize(pageSize, maxOrder);
        ...
        //pageShifts = 13
        int pageShifts = validateAndCalculatePageShifts(pageSize);

        if (nHeapArena > 0) {
            heapArenas = newArenaArray(nHeapArena);
            List<PoolArenaMetric> metrics = new ArrayList<PoolArenaMetric>(heapArenas.length);
            for (int i = 0; i < heapArenas.length; i ++) {
                PoolArena.HeapArena arena = new PoolArena.HeapArena(this, pageSize, maxOrder, pageShifts, chunkSize);
                heapArenas[i] = arena;
                metrics.add(arena);
            }
            heapArenaMetrics = Collections.unmodifiableList(metrics);
        } else {
            heapArenas = null;
            heapArenaMetrics = Collections.emptyList();
        }

        if (nDirectArena > 0) {
            directArenas = newArenaArray(nDirectArena);
            List<PoolArenaMetric> metrics = new ArrayList<PoolArenaMetric>(directArenas.length);
            for (int i = 0; i < directArenas.length; i ++) {
                PoolArena.DirectArena arena = new PoolArena.DirectArena(this, pageSize, maxOrder, pageShifts, chunkSize);
                directArenas[i] = arena;
                metrics.add(arena);
            }
            directArenaMetrics = Collections.unmodifiableList(metrics);
        } else {
            directArenas = null;
            directArenaMetrics = Collections.emptyList();
        }
    }

    @Override
    protected ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity) {
        PoolThreadCache cache = threadCache.get();
        PoolArena<byte[]> heapArena = cache.heapArena;
        ByteBuf buf;
        if (heapArena != null) {
            //分配堆内存
            buf = heapArena.allocate(cache, initialCapacity, maxCapacity);
        } else {
            buf = new UnpooledHeapByteBuf(this, initialCapacity, maxCapacity);
        }
        return toLeakAwareBuffer(buf);
    }

    @Override
    protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
        PoolThreadCache cache = threadCache.get();
        PoolArena<ByteBuffer> directArena = cache.directArena;
        ByteBuf buf;
        if (directArena != null) {
            //分配直接内存
            buf = directArena.allocate(cache, initialCapacity, maxCapacity);
        } else {
            if (PlatformDependent.hasUnsafe()) {
                buf = UnsafeByteBufUtil.newUnsafeDirectByteBuf(this, initialCapacity, maxCapacity);
            } else {
                buf = new UnpooledDirectByteBuf(this, initialCapacity, maxCapacity);
            }
        }
        return toLeakAwareBuffer(buf);
    }
    
    final class PoolThreadLocalCache extends FastThreadLocal<PoolThreadCache> {
        @Override
        protected synchronized PoolThreadCache initialValue() {
            final PoolArena<byte[]> heapArena = leastUsedArena(heapArenas);
            final PoolArena<ByteBuffer> directArena = leastUsedArena(directArenas);
            return new PoolThreadCache(
                heapArena, directArena, tinyCacheSize, smallCacheSize, normalCacheSize,
                DEFAULT_MAX_CACHED_BUFFER_CAPACITY, DEFAULT_CACHE_TRIM_INTERVAL);
        }

        @Override
        protected void onRemoval(PoolThreadCache threadCache) {
            threadCache.free();
        }
    
        private <T> PoolArena<T> leastUsedArena(PoolArena<T>[] arenas) {
            if (arenas == null || arenas.length == 0) {
                return null;
            }
            PoolArena<T> minArena = arenas[0];
            for (int i = 1; i < arenas.length; i++) {
                PoolArena<T> arena = arenas[i];
                if (arena.numThreadCaches.get() < minArena.numThreadCaches.get()) {
                    minArena = arena;
                }
            }
            return minArena;
        }
    }
    ...
}

final class PoolThreadCache {
    //PoolArena对象
    final PoolArena<byte[]> heapArena;
    final PoolArena<ByteBuffer> directArena;
    
    //ByteBuffer缓存队列
    //Hold the caches for the different size classes, which are tiny, small and normal.
    //有32个MemoryRegionCache元素,分别存放16B、32B、48B、...、480B、496B的SubPage级别的内存
    private final MemoryRegionCache<byte[]>[] tinySubPageHeapCaches;
    //有4个MemoryRegionCache元素,分别存放512B、1K、2K、4K的SubPage级别的内存
    private final MemoryRegionCache<byte[]>[] smallSubPageHeapCaches;
    //有3个MemoryRegionCache元素,分别存放8K、16K、32K的Page级别的内存
    private final MemoryRegionCache<byte[]>[] normalHeapCaches;
    private final MemoryRegionCache<ByteBuffer>[] tinySubPageDirectCaches;
    private final MemoryRegionCache<ByteBuffer>[] smallSubPageDirectCaches;
    private final MemoryRegionCache<ByteBuffer>[] normalDirectCaches;
    ...
}

(3)PooledByteBufAllocator的结构

(4)PooledByteBufAllocator如何创建一个ByteBuf总结

每个线程调用PoolThreadLocalCache的get()方法时,都会拿到一个PoolThreadCache对象。然后通过PoolThreadCache对象可以拿到该线程对应的一个PoolArena对象。

这个PoolThreadCache对象的作用就是通过FastThreadLocal的方式,把PooledByteBufAllocator内存分配器的PoolArena数组中的一个PoolArena对象放入它的成员变量里。

比如第1个线程就会拿到内存分配器的heapArenas数组中的第1个PoolArena对象,第n个线程就会拿到内存分配器的heapArenas数组中的第n个PoolArena对象,从而将一个线程和一个PoolArena进行绑定了。

PoolThreadCache除了可以直接在这个PoolArena内存上进行分配外,还可以在它维护的一些ByteBuffer或者byte[]缓存列表上进行分配。比如我们通过PooledByteBufAllocator内存分配器创建了一个1024字节的ByteBuf,该ByteBuf被用完并释放后,可能还需要在其他地方继续分配1024字节大小的内存。这时其实不需要重新在PoolArena上进行内存分配,而可以直接从PoolThreadCache里维护的ByteBuffer或byte[]缓存列表里拿出来返回即可。

PooledByteBufAllocator内存分配器里维护了三个值:tinyCacheSize、smallCacheSize、normalCacheSize,tinyCacheSize表明tiny类型的ByteBuf最多可以缓存512个,smallCacheSize表明small类型的ByteBuf最多可以缓存256个,normalCacheSize表明normal类型的ByteBuf最多可以缓存64个。

在创建PoolThreadCache对象时,会把这3个值传递进去。然后用于创建:

tinySubPageHeapCaches、

smallSubPageHeapCaches、

normalHeapCaches、

tinySubPageDirectCaches、

smallSubPageDirectCaches、

normalDirectCaches。

8.PoolArena分配内存的流程

PooledByteBufAllocator内存分配器在使用其方法newHeapBuffer()和newDirectBuffer()分配内存时,会分别执行代码heapArena.allocate()和directArena.allocate(),其实就是调用PoolArena的allocate()方法。在PoolArena的allocate()方法里,会通过其抽象方法newByteBuf()创建一个PooledByteBuf对象,而具体的newByteBuf()方法会由PoolArena的子类DirectArena和HeapArena来实现。

PoolArena的allocate()方法分配内存的大体逻辑如下:首先通过由PoolArena子类实现的newByteBuf()方法获取一个PooledByteBuf对象,接着通过PoolArena的allocate()方法在线程私有的PoolThreadCache上分配内存,这个分配过程其实就是拿一块内存,然后初始化PooledByteBuf对象里的内存地址值。

复制代码
public class PooledByteBufAllocator extends AbstractByteBufAllocator {
    private final PoolThreadLocalCache threadCache;
    private final PoolArena<byte[]>[] heapArenas;//一个线程会和一个PoolArena绑定
    private final PoolArena<ByteBuffer>[] directArenas;//一个线程会和一个PoolArena绑定
    ...
    @Override
    protected ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity) {
        PoolThreadCache cache = threadCache.get();
        PoolArena<byte[]> heapArena = cache.heapArena;
        ByteBuf buf;
        if (heapArena != null) {
            //分配堆内存
            buf = heapArena.allocate(cache, initialCapacity, maxCapacity);
        } else {
            buf = new UnpooledHeapByteBuf(this, initialCapacity, maxCapacity);
        }
        return toLeakAwareBuffer(buf);
    }

    @Override
    protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
        PoolThreadCache cache = threadCache.get();
        PoolArena<ByteBuffer> directArena = cache.directArena;
        ByteBuf buf;
        if (directArena != null) {
            //分配直接内存
            buf = directArena.allocate(cache, initialCapacity, maxCapacity);
        } else {
            if (PlatformDependent.hasUnsafe()) {
                buf = UnsafeByteBufUtil.newUnsafeDirectByteBuf(this, initialCapacity, maxCapacity);
            } else {
                buf = new UnpooledDirectByteBuf(this, initialCapacity, maxCapacity);
            }
        }
        return toLeakAwareBuffer(buf);
    }
    ...
}

abstract class PoolArena<T> implements PoolArenaMetric {
    ...
    PooledByteBuf<T> allocate(PoolThreadCache cache, int reqCapacity, int maxCapacity) {
        PooledByteBuf<T> buf = newByteBuf(maxCapacity);//创建ByteBuf对象
        allocate(cache, buf, reqCapacity);//基于PoolThreadCache对ByteBuf对象进行内存分配
        return buf;
    }
    
    private void allocate(PoolThreadCache cache, PooledByteBuf<T> buf, final int reqCapacity) {
        final int normCapacity = normalizeCapacity(reqCapacity);
        if (isTinyOrSmall(normCapacity)) {//capacity < pageSize,需要分配的内存小于8K
            int tableIdx;
            PoolSubpage<T>[] table;
            boolean tiny = isTiny(normCapacity);
            if (tiny) {//< 512
                if (cache.allocateTiny(this, buf, reqCapacity, normCapacity)) {
                    //命中缓存,was able to allocate out of the cache so move on
                    return;
                }
                tableIdx = tinyIdx(normCapacity);
                table = tinySubpagePools;
            } else {
                if (cache.allocateSmall(this, buf, reqCapacity, normCapacity)) {
                    //命中缓存,was able to allocate out of the cache so move on
                    return;
                }
                tableIdx = smallIdx(normCapacity);
                table = smallSubpagePools;
            }

            final PoolSubpage<T> head = table[tableIdx];

            //Synchronize on the head. 
            //This is needed as PoolChunk#allocateSubpage(int) and PoolChunk#free(long) may modify the doubly linked list as well.
            synchronized (head) {
                final PoolSubpage<T> s = head.next;
                if (s != head) {
                    assert s.doNotDestroy && s.elemSize == normCapacity;
                    long handle = s.allocate();
                    assert handle >= 0;
                    s.chunk.initBufWithSubpage(buf, handle, reqCapacity);
                    if (tiny) {
                        allocationsTiny.increment();
                    } else {
                        allocationsSmall.increment();
                    }
                    return;
                }
            }
            //没有命中缓存
            allocateNormal(buf, reqCapacity, normCapacity);
            return;
        }
        if (normCapacity <= chunkSize) {//需要分配的内存大于8K,但小于16M
            if (cache.allocateNormal(this, buf, reqCapacity, normCapacity)) {
                //命中缓存,was able to allocate out of the cache so move on
                return;
            }
            //没有命中缓存
            allocateNormal(buf, reqCapacity, normCapacity);
        } else {//需要分配的内存大于16M
            //Huge allocations are never served via the cache so just call allocateHuge
            allocateHuge(buf, reqCapacity);
        }
    }

    protected abstract PooledByteBuf<T> newByteBuf(int maxCapacity);
    
    static final class HeapArena extends PoolArena<byte[]> {
        ...
        @Override
        protected PooledByteBuf<byte[]> newByteBuf(int maxCapacity) {
            return HAS_UNSAFE ? PooledUnsafeHeapByteBuf.newUnsafeInstance(maxCapacity)
                : PooledHeapByteBuf.newInstance(maxCapacity);
        }
    }
    
    static final class DirectArena extends PoolArena<ByteBuffer> {
        ...
        @Override
        protected PooledByteBuf<ByteBuffer> newByteBuf(int maxCapacity) {
            if (HAS_UNSAFE) {
                return PooledUnsafeDirectByteBuf.newInstance(maxCapacity);
            } else {
                return PooledDirectByteBuf.newInstance(maxCapacity);
            }
        }
    }
    ...
}

class PooledHeapByteBuf extends PooledByteBuf<byte[]> {
    private static final Recycler<PooledHeapByteBuf> RECYCLER = new Recycler<PooledHeapByteBuf>() {
        @Override
        protected PooledHeapByteBuf newObject(Handle<PooledHeapByteBuf> handle) {
            return new PooledHeapByteBuf(handle, 0);
        }
    };

    static PooledHeapByteBuf newInstance(int maxCapacity) {
        PooledHeapByteBuf buf = RECYCLER.get();
        buf.reuse(maxCapacity);
        return buf;
    }
    ...
}

final class PooledUnsafeHeapByteBuf extends PooledHeapByteBuf {
    private static final Recycler<PooledUnsafeHeapByteBuf> RECYCLER = new Recycler<PooledUnsafeHeapByteBuf>() {
        @Override
        protected PooledUnsafeHeapByteBuf newObject(Handle<PooledUnsafeHeapByteBuf> handle) {
            return new PooledUnsafeHeapByteBuf(handle, 0);
        }
    };

    static PooledUnsafeHeapByteBuf newUnsafeInstance(int maxCapacity) {
        PooledUnsafeHeapByteBuf buf = RECYCLER.get();
        buf.reuse(maxCapacity);
        return buf;
    }
    ...
}

final class PooledUnsafeDirectByteBuf extends PooledByteBuf<ByteBuffer> {
    private static final Recycler<PooledUnsafeDirectByteBuf> RECYCLER = new Recycler<PooledUnsafeDirectByteBuf>() {
        @Override
        protected PooledUnsafeDirectByteBuf newObject(Handle<PooledUnsafeDirectByteBuf> handle) {
            return new PooledUnsafeDirectByteBuf(handle, 0);
        }
    };

    static PooledUnsafeDirectByteBuf newInstance(int maxCapacity) {
        PooledUnsafeDirectByteBuf buf = RECYCLER.get();
        buf.reuse(maxCapacity);
        return buf;
    }
    ...
}

final class PooledDirectByteBuf extends PooledByteBuf<ByteBuffer> {
    private static final Recycler<PooledDirectByteBuf> RECYCLER = new Recycler<PooledDirectByteBuf>() {
        @Override
        protected PooledDirectByteBuf newObject(Handle<PooledDirectByteBuf> handle) {
            return new PooledDirectByteBuf(handle, 0);
        }
    };

    static PooledDirectByteBuf newInstance(int maxCapacity) {
        PooledDirectByteBuf buf = RECYCLER.get();
        buf.reuse(maxCapacity);
        return buf;
    }
    ...
}

PoolArena.allocate()方法分配内存的逻辑如下:

**步骤一:**首先PoolArena.newByteBuf()方法会从RECYCLER对象池中,尝试获取一个PooledByteBuf对象并进行复用,若获取不到就创建一个PooledByteBuf。以DirectArena的newByteBuf()方法为例,它会通过RECYCLER.get()拿到一个PooledByteBuf。RECYCLER是一个带有回收特性的对象池,RECYCLER.get()的含义是:若对象池里有一个PooledByteBuf就拿出一个,没有就创建一个。拿到一个PooledByteBuf之后,由于可能是从回收站里拿出来的,所以要调用buf.reuse()进行复用,然后才是返回。

**步骤二:**接着PoolArena.allocate()方法会在PoolThreadCache缓存上尝试进行内存分配。如果有一个ByteBuf对象之前已使用过并且被释放掉了,而这次需要分配的内存是差不多规格大小的一个ByteBuf,那么就可以直接在该规格大小对应的一个缓存列表里获取这个ByteBuf缓存,然后进行分配。

**步骤三:**如果没有命中PoolThreadCache的缓存,那么就进行实际的内存分配。

相关推荐
东阳马生架构14 小时前
Netty源码—9.性能优化和设计模式
netty应用与源码
东阳马生架构5 天前
Netty源码—5.Pipeline和Handler
netty应用与源码
东阳马生架构6 天前
Netty源码—4.客户端接入流程
netty应用与源码
东阳马生架构8 天前
Netty源码—3.Reactor线程模型二
netty应用与源码