Unity Job System in Depth (3): NativeList Source Code Analysis

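【Preface】

NativeList lives in Unity's Collections package (com.unity.collections). Installing the Entities package pulls it in as a dependency, which is one way to get the source.

NativeList provides roughly the same basic functionality as C# List:

Construction, disposal, reading and writing elements

Growing, adding and removing elements

(Simple members that work the same way as in NativeArray are not explained again.)

The walkthrough covers the basic functions, common operations, common properties, and the interfaces.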

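【Source Code Analysis】

Definition

// A generic struct marked unsafe. It implements three interfaces and constrains T to unmanaged types, so using a reference type for T is a compile error.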
public unsafe struct NativeList<T> : INativeDisposable, INativeList<T>, IIndexable<T> where T : unmanaged
{

}

Constructors

        public NativeList(AllocatorManager.AllocatorHandle allocator)
            : this(1, allocator)
        {
        }

        public NativeList(int initialCapacity, AllocatorManager.AllocatorHandle allocator)
        {
            this = default;
            AllocatorManager.AllocatorHandle temp = allocator;
            Initialize(initialCapacity, ref temp);
        }

        internal void Initialize<U>(int initialCapacity, ref U allocator) where U : unmanaged, AllocatorManager.IAllocator
        {
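            var totalSize = sizeof(T) * (long)initialCapacity; // sizeof gives the size in bytes of the unmanaged type T; multiplied by the capacity it is the number of bytes needed initially (not used further in this excerpt)
            m_ListData = UnsafeList<T>.Create(initialCapacity, ref allocator, NativeArrayOptions.UninitializedMemory); // m_ListData is a field of this struct; it is created by UnsafeList's static Create method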
        }

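        // Where the data lives
        [NativeDisableUnsafePtrRestriction] // Normally the job system rejects job data that contains raw pointers; this attribute lifts that restriction for the field
        internal UnsafeList<T>* m_ListData; // pointer to the UnsafeList that holds the data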

Next, UnsafeList

// Declared the same way as NativeList
public unsafe struct UnsafeList<T> : INativeDisposable, INativeList<T>, IIndexable<T> where T : unmanaged
{
}

Go straight to the Create method

        internal static UnsafeList<T>* Create<U>(int initialCapacity, ref U allocator, NativeArrayOptions options) where U : unmanaged, AllocatorManager.IAllocator
        {
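            UnsafeList<T>* listData = allocator.Allocate(default(UnsafeList<T>), 1); // Allocate room for one UnsafeList<T> through the allocator and get back a pointer to it. NativeArrayOptions has two values: ClearMemory and UninitializedMemory
            *listData = new UnsafeList<T>(initialCapacity, allocator.Handle, options); // The constructor produces an UnsafeList<T> value, which is copied into the memory listData points to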
            return listData;
        }

        public UnsafeList(int initialCapacity, AllocatorManager.AllocatorHandle allocator, NativeArrayOptions options = NativeArrayOptions.UninitializedMemory)
    {
        Ptr = null; // where the elements live
        m_length = 0; // m_length and m_capacity correspond to Count and Capacity of C# List
        m_capacity = 0;
        Allocator = allocator;
        padding = 0;

        SetCapacity(math.max(initialCapacity, 1));

        if (options == NativeArrayOptions.ClearMemory && Ptr != null)
        {
            var sizeOf = sizeof(T);
            UnsafeUtility.MemClear(Ptr, Capacity * sizeOf); // With ClearMemory the freshly allocated buffer is zeroed; this is essentially a memset that sets Capacity * sizeOf bytes starting at Ptr to 0
        }
    }

Next, the SetCapacity method

        public void SetCapacity(int capacity)
        {
            SetCapacity(ref Allocator, capacity);
        }

        void SetCapacity<U>(ref U allocator, int capacity) where U : unmanaged, AllocatorManager.IAllocator
        {
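            CollectionHelper.CheckCapacityInRange(capacity, Length); // First check that the requested capacity is not smaller than the current length

            var sizeOf = sizeof(T); // size of T in bytes
            // CacheLineSize is the L1 cache line size of the current platform. L1 is the cache level closest to the CPU and the fastest to access; it is split into a data cache and an instruction cache. A cache line is the smallest unit the cache reads or writes, measured in bytes.
            // The line size depends on the processor architecture; 64 bytes is common. When the CPU reads or writes data it checks L1 first. On a hit the access is fast; on a miss the data has to come from a slower level (L2 or main memory).
            // Keeping data and instructions in this fast cache reduces the time the CPU spends waiting on memory.
            var newCapacity = math.max(capacity, CollectionHelper.CacheLineSize / sizeOf); // CacheLineSize / sizeOf is the number of T elements that fit in one cache line; taking the max with the requested capacity means the buffer holds at least one cache line's worth of elements even when the request is small
            newCapacity = math.ceilpow2(newCapacity); // Round up to the next power of two, so 10 becomes 16. Presumably capacities are kept at powers of two for cache friendliness as well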

            if (newCapacity == Capacity)
            {
                return;
            }

            ResizeExact(ref allocator, newCapacity);
        }
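
The two capacity steps in SetCapacity can be tried out in isolation. The following is a minimal sketch in plain C#, not the Unity code: the class and method names are hypothetical, the cache line size is assumed to be 64 bytes, and System.Numerics.BitOperations.RoundUpToPowerOf2 stands in for math.ceilpow2.

```csharp
using System;
using System.Numerics;

public static class CapacitySketch
{
    // Assumption: 64-byte cache lines, the common value mentioned above.
    public const int CacheLineSize = 64;

    // Mirrors the two steps in SetCapacity: take at least one cache line's
    // worth of elements, then round up to a power of two.
    public static int ComputeCapacity(int requested, int sizeOfT)
    {
        int atLeastOneLine = Math.Max(requested, CacheLineSize / sizeOfT);
        return (int)BitOperations.RoundUpToPowerOf2((uint)atLeastOneLine);
    }
}
```

Under these assumptions, ComputeCapacity(1, 4) is 16 (a list of int created with capacity 1 really holds 16), ComputeCapacity(17, 4) is 32, and ComputeCapacity(3, 12) is 8.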

Next, ResizeExact

// Similar to how C# List grows, with the extra work of allocating native memory
        void ResizeExact<U>(ref U allocator, int newCapacity) where U : unmanaged, AllocatorManager.IAllocator
        {
            newCapacity = math.max(0, newCapacity);

            CollectionHelper.CheckAllocator(Allocator);
            T* newPointer = null;

            var alignOf = UnsafeUtility.AlignOf<T>(); // alignment of T in bytes
            var sizeOf = sizeof(T);

            if (newCapacity > 0)
            {
                // Allocate memory: pass the size of T, its alignment and the element count, and get back a pointer to the new block
                newPointer = (T*)allocator.Allocate(sizeOf, alignOf, newCapacity);

                if (Ptr != null && m_capacity > 0)
                {
                    var itemsToCopy = math.min(newCapacity, Capacity); // number of elements to copy; note that this uses the capacity, not the length
                    var bytesToCopy = itemsToCopy * sizeOf;
                    UnsafeUtility.MemCpy(newPointer, Ptr, bytesToCopy); // Copy: takes the destination pointer, the source pointer and the byte count; underneath it is the native memcpy
                }
            }

            allocator.Free(Ptr, Capacity); // free the old block

            Ptr = newPointer; // pointer to the new block
            m_capacity = newCapacity; // new capacity
            m_length = math.min(m_length, newCapacity); // new length
        }

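Reading and writing elements

// NativeList's indexer
public T this[int index]
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining)] // This attribute asks the compiler to inline the method where it can.
// Inlining replaces a method call with the body of the called method, which removes the call overhead and makes the code faster.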
            get
            {
                return (*m_ListData)[index];
            }

            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            set
            {
                (*m_ListData)[index] = value;
            }
        }

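// UnsafeList's indexer
        public T this[int index]
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            get
            {
                CollectionHelper.CheckIndexInRange(index, m_length); // check that index lies in [0, m_length)
                return Ptr[CollectionHelper.AssumePositive(index)]; // AssumePositive does no run-time check; it tells the compiler that index is non-negative so it can generate better code
                // Ptr is declared as public T* Ptr; it points to the first element of a contiguous buffer of T, so Ptr[index] accesses the element directly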
            }

            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            set
            {
                CollectionHelper.CheckIndexInRange(index, m_length);
                Ptr[CollectionHelper.AssumePositive(index)] = value;
            }
        }

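Take care when reading and writing elements: when T is a struct, the indexer returns a copy, so you cannot modify a field in place the way you would with a class. Modify the copy and then write it back, for example:

/// T t = NativeList[Index];

/// t.a = 10;t.b = 15;

/// NativeList[Index] = t;

Using the ElementAt method is recommended instead, for example:

/// ref T t = ref NativeList.ElementAt(Index);

/// t.a = 10;t.b = 15;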

        public ref T ElementAt(int index)
        {
            return ref m_ListData->ElementAt(index); // note: members are accessed through a pointer with ->
        }


        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public ref T ElementAt(int index)
        {
            CollectionHelper.CheckIndexInRange(index, m_length);
            return ref Ptr[CollectionHelper.AssumePositive(index)];
        }


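// The difference from the indexer: when T is a struct, the indexer hands back a copy, so changing a field on that copy has no effect on the list, whereas a change made through the reference returned by this method does.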
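
The copy-versus-reference behaviour is ordinary C# and can be reproduced without Unity. The sketch below is only an illustration: TinyList and Point are hypothetical types, and a managed array stands in for the native buffer.

```csharp
public struct Point { public int a; public int b; }

// Hypothetical stand-in for the list: a managed array replaces the native buffer.
public sealed class TinyList
{
    private readonly Point[] items = new Point[4];

    // Returns a copy of the element, like the NativeList indexer.
    public Point this[int index]
    {
        get => items[index];
        set => items[index] = value;
    }

    // Returns a reference to the element itself, like ElementAt.
    public ref Point ElementAt(int index) => ref items[index];
}

public static class ElementAtSketch
{
    public static (int viaCopy, int viaRef) Run()
    {
        var list = new TinyList();

        Point copy = list[0];    // a copy of the stored struct
        copy.a = 10;             // changes only the local copy
        int viaCopy = list[0].a; // the stored element is unchanged

        ref Point r = ref list.ElementAt(0); // an alias of the stored struct
        r.a = 10;                            // changes the stored element
        int viaRef = list[0].a;

        return (viaCopy, viaRef);
    }
}
```

Run() returns (0, 10): the write through the copy is lost, while the write through the ref lands in the list.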

Disposal

There are two methods: void Dispose(), required by the IDisposable interface, and JobHandle Dispose(JobHandle inputDeps), required by INativeDisposable.

        public void Dispose()
        {
            if (!IsCreated)
            {
                return;
            }

            UnsafeList<T>.Destroy(m_ListData);
            m_ListData = null;
        }

        // Destroy in UnsafeList
        public static void Destroy(UnsafeList<T>* listData)
        {
            CheckNull(listData); // make sure the pointer is not null
            var allocator = listData->Allocator;
            listData->Dispose();
            AllocatorManager.Free(allocator, listData); // free the memory of the UnsafeList struct itself
        }

        public void Dispose()
        {
            if (!IsCreated)
            {
                return;
            }

            if (CollectionHelper.ShouldDeallocate(Allocator))
            {
                AllocatorManager.Free(Allocator, Ptr, m_capacity);
                Allocator = AllocatorManager.Invalid;
            }

            Ptr = null;
            m_length = 0;
            m_capacity = 0;
        }

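Disposing while jobs still depend on the list: as with NativeArray, a NativeListDisposeJob is scheduled with the input job as its dependency. Once the input job has completed, the NativeListDisposeJob frees the NativeList.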

        public JobHandle Dispose(JobHandle inputDeps)
        {
            if (!IsCreated)
            {
                return inputDeps;
            }

            var jobHandle = new NativeListDisposeJob { Data = new NativeListDispose { m_ListData = (UntypedUnsafeList*)m_ListData } }.Schedule(inputDeps); // Schedule a new job that depends on inputDeps; the new job holds the list data pointer and frees that memory in its Execute method
            m_ListData = null;

            return jobHandle;
        }

Adding elements

Adding a single element works like C# List: when the new length would exceed the capacity, the list grows.

        public void Add(in T value)
        {
            m_ListData->Add(in value);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public void Add(in T value)
        {
            var idx = m_length;
            if (m_length < m_capacity)
            {
                Ptr[idx] = value;
                m_length++;
                return;
            }
            
            Resize(idx + 1); // UnsafeList's Resize calls SetCapacity to grow the buffer and then sets m_length to the new length
            Ptr[idx] = value;
        }

Adding a range of elements with AddRange

        public void AddRange(NativeArray<T> array)
        {
            AddRange(array.GetUnsafeReadOnlyPtr(), array.Length); // The argument is a NativeArray; GetUnsafeReadOnlyPtr() is an extension method that returns a pointer to the array's underlying buffer
        }


        public void AddRange(void* ptr, int count)
        {
            CheckArgPositive(count);
            m_ListData->AddRange(ptr, CollectionHelper.AssumePositive(count));
        }


        public void AddRange(void* ptr, int count)
        {
            var idx = m_length;

            if (m_length + count > Capacity)
            {
                Resize(m_length + count);
            }
            else
            {
                m_length += count;
            }

            var sizeOf = sizeof(T);
            void* dst = (byte*)Ptr + idx * sizeOf; // cast to byte* because the offset here is computed in bytes
            UnsafeUtility.MemCpy(dst, ptr, count * sizeOf);
        }

Adding elements in parallel: the difference from the methods above is that the length is incremented atomically.

        public int AddNoResizeParallel(T value)
        {
            return m_ListData->AddNoResizeParallel(value);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public int AddNoResizeParallel(T value)
        {
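            var idx = Interlocked.Increment(ref m_length) - 1; // Interlocked.Increment is atomic, so the +1 is thread safe and every caller gets its own index. Note the order: normally you would write the element and then bump the length, but here the length is bumped first and the element is written afterwards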
            CheckNoResizeHasEnoughCapacity(idx, 1);
            UnsafeUtility.WriteArrayElement(Ptr, idx, value);
            return idx;
        }
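
The "reserve a slot atomically, then write into it" pattern can be tried in plain C#. This is a sketch under stated assumptions rather than the Unity implementation: ParallelAppendSketch is a hypothetical class, a managed int array stands in for the native buffer, and Parallel.For stands in for a parallel job.

```csharp
using System.Threading;
using System.Threading.Tasks;

public sealed class ParallelAppendSketch
{
    private readonly int[] buffer;
    private int length; // bumped atomically, like m_length

    public ParallelAppendSketch(int capacity) { buffer = new int[capacity]; }

    public int Length => Volatile.Read(ref length);

    // Reserve a slot first, then write into it. The buffer never grows here.
    public int AddNoResize(int value)
    {
        int idx = Interlocked.Increment(ref length) - 1;
        buffer[idx] = value; // throws IndexOutOfRangeException if the capacity is exceeded
        return idx;
    }

    // Appends n values from many threads and checks that every slot
    // was handed out exactly once.
    public static bool Demo()
    {
        const int n = 10000;
        var list = new ParallelAppendSketch(n);
        var seen = new int[n];
        Parallel.For(0, n, i =>
        {
            int idx = list.AddNoResize(i);
            Interlocked.Increment(ref seen[idx]);
        });
        foreach (int count in seen)
        {
            if (count != 1) return false;
        }
        return list.Length == n;
    }
}
```

Because each caller receives a distinct index from Interlocked.Increment, no two threads write to the same slot. The order of the elements in the buffer is not deterministic.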

Appending the same element several times

        public void AddReplicate(in T value, int count) // in passes the argument by read-only reference
        {
            CheckArgPositive(count);
            m_ListData->AddReplicate(in value, CollectionHelper.AssumePositive(count));
        }


        public void AddReplicate(in T value, int count)
        {
            var idx = m_length;
            if (m_length + count > Capacity)
            {
                Resize(m_length + count);
            }
            else
            {
                m_length += count;
            }

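            fixed (void* ptr = &value) // & is the address-of operator. The fixed statement takes the address of value and stores it in the void* variable ptr for the duration of the block
            {
                // This is typed pointer arithmetic. Other places cast to byte* and compute offsets in bytes; here Ptr is a T*, so Ptr + idx advances by idx * sizeof(T) bytes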
                UnsafeUtility.MemCpyReplicate(Ptr + idx, ptr, UnsafeUtility.SizeOf<T>(), count);
            }
        }

Removing elements

Removing a single element

        public void RemoveAt(int index)
        {
            m_ListData->RemoveAt(index);
        }

        public void RemoveAt(int index)
        {
            CollectionHelper.CheckIndexInRange(index, m_length);

            index = CollectionHelper.AssumePositive(index);

            T* dst = Ptr + index;
            T* src = dst + 1;
            m_length--;

            // Because these tend to be smaller (< 1MB), and the cost of jumping context to native and back is
            // so high, this consistently optimizes to better code than UnsafeUtility.MemCpy
            for (int i = index; i < m_length; i++)
            {
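                *dst++ = *src++; // For performance this avoids the native MemCpy call: each iteration copies one element one slot toward the front and advances both pointers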
            }
        }

Removing a range

        public void RemoveRange(int index, int count)
        {
            m_ListData->RemoveRange(index, count);
        }

        public void RemoveRange(int index, int count)
        {
            CheckIndexCount(index, count);

            index = CollectionHelper.AssumePositive(index);
            count = CollectionHelper.AssumePositive(count);

            if (count > 0)
            {
                int copyFrom = math.min(index + count, m_length);
                var sizeOf = sizeof(T);
                void* dst = (byte*)Ptr + index * sizeOf;
                void* src = (byte*)Ptr + copyFrom * sizeOf;
                UnsafeUtility.MemCpy(dst, src, (m_length - copyFrom) * sizeOf); // move the elements after the removed range forward in one copy
                m_length -= count;
            }
        }
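
The index arithmetic of the removal can be followed on a small managed array. The sketch below is an illustration only: RemoveSketch is a hypothetical helper, and Array.Copy (which is documented to handle overlapping ranges) replaces the pointer-based copy.

```csharp
using System;

public static class RemoveSketch
{
    // Removes count elements starting at index from the first length
    // elements of items and returns the new length.
    public static int RemoveRange(int[] items, int length, int index, int count)
    {
        if (index < 0 || count < 0 || index + count > length)
            throw new ArgumentOutOfRangeException();
        if (count == 0) return length;

        int copyFrom = index + count;
        // Shift the tail forward over the removed range.
        Array.Copy(items, copyFrom, items, index, length - copyFrom);
        return length - count;
    }
}
```

With items = { 0, 1, 2, 3, 4, 5 } and length 6, RemoveRange(items, 6, 1, 2) returns 4 and the first four slots become 0, 3, 4, 5. The slots past the new length keep their old values, which is why Length and Capacity have to be kept apart. RemoveAt corresponds to count = 1.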

The Length property

In general, only read it; avoid setting it.

        public int Length
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            readonly get
            {
                return CollectionHelper.AssumePositive(m_ListData->Length); // returns the UnsafeList's length
            }

            set
            {
                m_ListData->Resize(value, NativeArrayOptions.ClearMemory); // setting the length calls Resize
            }
        }


        public int Length
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            readonly get => CollectionHelper.AssumePositive(m_length); // m_length is set to 0 in the constructor

            set
            {
                if (value > Capacity)
                {
                    Resize(value);
                }
                else
                {
                    m_length = value;
                }
            }
        }

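IEnumerable implementation

Enumeration uses NativeArray<T>.Enumerator, which means the list is first viewed as a NativeArray. AsArray makes no copy and only hands the pointer over; ToArray copies the data.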

        public NativeArray<T> AsArray()
        {

            var array = NativeArrayUnsafeUtility.ConvertExistingDataToNativeArray<T>(m_ListData->Ptr, m_ListData->Length, Allocator.None);

            return array;
        }

        public unsafe static NativeArray<T> ConvertExistingDataToNativeArray<T>(void* dataPointer, int length, Allocator allocator) where T : struct
        {
            CheckConvertArguments<T>(length);
            NativeArray<T> result = default(NativeArray<T>);
            result.m_Buffer = dataPointer;
            result.m_Length = length;
            result.m_AllocatorLabel = allocator;
            result.m_MinIndex = 0;
            result.m_MaxIndex = length - 1;
            return result;
        }



        public NativeArray<T> ToArray(AllocatorManager.AllocatorHandle allocator)
        {
            NativeArray<T> result = CollectionHelper.CreateNativeArray<T>(Length, allocator, NativeArrayOptions.UninitializedMemory);

            UnsafeUtility.MemCpy((byte*)result.m_Buffer, (byte*)m_ListData->Ptr, Length * UnsafeUtility.SizeOf<T>());
            return result;
        }
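
The view-versus-copy distinction between AsArray and ToArray has a close analogue in plain C#. In this sketch a Span<int> over a managed array plays the part of AsArray and Span<int>.ToArray plays the part of ToArray; the names are hypothetical and none of this is Unity code.

```csharp
using System;

public static class ViewVsCopySketch
{
    public static (int seenByView, int seenByCopy) Run()
    {
        int[] storage = { 1, 2, 3, 4 }; // stands in for the list's buffer
        int length = 3;                 // only the first 3 slots are in use

        Span<int> view = storage.AsSpan(0, length); // like AsArray: same memory, no copy
        int[] copy = view.ToArray();                // like ToArray: new memory, contents copied

        storage[0] = 99; // write through the original storage

        return (view[0], copy[0]);
    }
}
```

Run() returns (99, 1): the view observes the later write and the copy does not. The flip side of sharing memory is that a view is only meaningful while the underlying buffer stays where it is, so an array obtained from AsArray should not be kept across operations that may reallocate the list.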

INativeList

An abstracted interface that other native containers use as well.

// An indexable interface
public interface IIndexable<T> where T : unmanaged
    {
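        int Length { get; set; } // length of the collection, i.e. the number of elements

        ref T ElementAt(int index); // get an element by index; it returns ref so that struct elements can be written in place
    }

// The INativeList interface defines the members that list-like containers share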
    public interface INativeList<T> : IIndexable<T> where T : unmanaged
    {

        int Capacity { get; set; }

        bool IsEmpty { get; }

        T this[int index] { get; set; }

        void Clear();
    }

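Summary

NativeList is a wrapper around UnsafeList, and the core logic all lives in UnsafeList. The overall implementation differs little from an ordinary C# List, apart from working on native memory. Keep the difference between the element count (Length) and the allocated size (Capacity) in mind. The distinctive part is how the capacity is chosen: at least one L1 cache line's worth of elements, rounded up to a power of two.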

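【Memory Allocation】

This is fairly fine-grained memory allocation. Each piece of memory usually stores the data of a specific type T and is called a Block. A data structure records the following information about it:

Memory pointer: Block.Range.Pointer
Requested size in bytes: Block.Bytes. Because of alignment, the memory actually allocated can be larger than requested
Alignment required by T: Block.Alignment
Size in bytes of one T: Block.BytesPerItem
Number of T objects the block can hold: Block.Range.Items. Multiplying it by Block.BytesPerItem gives the size of the memory in bytes; callers rarely need that, so it is not listed separately
How the memory was allocated (if recorded): Block.Range.Allocator

Memory can also be reused. Suppose a Block held 20 T objects and the upper layer later releases it because it is no longer needed. The underlying memory manager does not necessarily free it right away and may keep it cached. If a new request for 10 T objects arrives shortly afterwards, the already allocated Block can be handed out directly. This kind of caching is very common, and application code uses it often as well.

(When to reuse an existing block and when not to cache depends on where it is used; the strategy differs from case to case.)

With reuse, some extra information has to be recorded:

Number of objects already allocated: AllocatedItems
Size in bytes of the objects already allocated: AllocatedBytes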

Allocation goes through AllocatorManager.TryLegacy(), which calls Memory.Unmanaged.Allocate and then Array.Resize, and finally ends up in UnsafeUtility.MallocTracked. As the earlier article showed, NativeArray allocation ends up in the same method.

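    static unsafe int TryLegacy(ref Block block) // handles both allocating and freeing; I do not fully understand why the two are combined, presumably because an allocator exposes a single Try(ref Block) entry point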
    {
        if (block.Range.Pointer == IntPtr.Zero) // Allocate
        {
            block.Range.Pointer = (IntPtr)Memory.Unmanaged.Allocate(block.Bytes, block.Alignment, LegacyOf(block.Range.Allocator));
            block.AllocatedItems = block.Range.Items;
            return (block.Range.Pointer == IntPtr.Zero) ? -1 : 0;
        }
        if (block.Bytes == 0) // Free: the caller sets this to 0 when releasing the block
        {
            if (LegacyOf(block.Range.Allocator) != Allocator.None)
            {
                Memory.Unmanaged.Free((void*)block.Range.Pointer, LegacyOf(block.Range.Allocator));
            }
            block.Range.Pointer = IntPtr.Zero;
            block.AllocatedItems = 0;
            return 0;
        }
        // Reallocate (keep existing pointer and change size if possible. otherwise, allocate new thing and copy)
        return -1;
    }
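
The "one entry point, direction decided by the state of the block" shape of TryLegacy can be mimicked in a few lines of plain C#. This is a rough sketch and not the Unity implementation: BlockSketch and TrySketch are hypothetical, the struct is a much simplified stand-in for AllocatorManager.Block, and Marshal.AllocHGlobal and Marshal.FreeHGlobal replace Memory.Unmanaged.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, simplified stand-in for AllocatorManager.Block.
public struct BlockSketch
{
    public IntPtr Pointer;     // plays the role of Range.Pointer
    public int Items;          // plays the role of Range.Items
    public int BytesPerItem;
    public int AllocatedItems;
    public long Bytes => (long)BytesPerItem * Items;
}

public static class TrySketch
{
    // A null pointer means "allocate", zero bytes means "free".
    public static int Try(ref BlockSketch block)
    {
        if (block.Pointer == IntPtr.Zero) // Allocate
        {
            block.Pointer = Marshal.AllocHGlobal((IntPtr)block.Bytes);
            block.AllocatedItems = block.Items;
            return 0;
        }
        if (block.Bytes == 0) // Free
        {
            Marshal.FreeHGlobal(block.Pointer);
            block.Pointer = IntPtr.Zero;
            block.AllocatedItems = 0;
            return 0;
        }
        return -1; // resizing an existing block is not supported in this sketch
    }

    public static bool Demo()
    {
        var block = new BlockSketch { Items = 20, BytesPerItem = 4 };
        bool allocated = Try(ref block) == 0
            && block.Pointer != IntPtr.Zero
            && block.AllocatedItems == 20;

        block.Items = 0; // the caller signals "free" by asking for zero items
        bool freed = Try(ref block) == 0
            && block.Pointer == IntPtr.Zero
            && block.AllocatedItems == 0;

        return allocated && freed;
    }
}
```

Demo() allocates room for 20 four-byte items and then frees it through the same function, returning true when both steps leave the block in the expected state.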

    // Callers pass the Allocator enum, yet the parameter is the AllocatorHandle struct. That works because of this implicit conversion:
    public static implicit operator AllocatorHandle(Allocator a) => new AllocatorHandle
    {
        Index = (ushort)((uint)a & 0xFFFF),
        Version = 0
    };
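
To see the implicit conversion at work outside Unity, here is a minimal sketch with hypothetical stand-in types: AllocatorKind for the Allocator enum and HandleSketch for AllocatorHandle. Only the conversion operator itself follows the excerpt above.

```csharp
// Hypothetical stand-in for the Allocator enum.
public enum AllocatorKind { Invalid = 0, None = 1, Temp = 2, TempJob = 3, Persistent = 4 }

// Hypothetical stand-in for AllocatorManager.AllocatorHandle.
public struct HandleSketch
{
    public ushort Index;
    public ushort Version;

    // Lets callers pass the enum wherever the struct is expected.
    public static implicit operator HandleSketch(AllocatorKind a) => new HandleSketch
    {
        Index = (ushort)((uint)a & 0xFFFF),
        Version = 0
    };
}

public static class ImplicitConversionSketch
{
    // The parameter type is the struct...
    public static ushort IndexOf(HandleSketch handle) => handle.Index;

    // ...yet the call site passes the enum; the compiler inserts the conversion.
    public static ushort Demo() => IndexOf(AllocatorKind.Persistent);
}
```

Demo() returns 4, the numeric value of the enum member, which ends up in the Index field of the struct.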