How an Overlapped Is Located in C# Async I/O

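I: Background

1. The story

A while ago a friend from my training camp asked: "When I read a file asynchronously with ReadAsync, I know that at the Win32 level an lpOverlapped is passed down to the kernel. When the kernel comes back, how does it use that lpOverlapped to find the Task behind ReadAsync?"

It is a good question. Answering it requires a clear picture of the whole async flow, and even with that picture it is hard to explain out loud in a way the listener can follow. So I will try to work through it in two articles instead.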

II: How lpOverlapped Is Mapped

1. Test case

To make things concrete, let's first write an asynchronous read with fileStream.ReadAsync so that an Overlapped gets created. Reference code:

        // Requires: using System; using System.IO; using System.Text; using System.Threading.Tasks;
        static void Main(string[] args)
        {
            // Fire and forget; Console.ReadLine keeps the process alive until the read completes.
            UseAwaitAsync();
            Console.ReadLine();
        }

        static async Task<string> UseAwaitAsync()
        {
            string filePath = "D:\\dumps\\trace-1\\GenHome.DMP";
            Console.WriteLine($"{DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss:fff")} request started...");
            // useAsync: true opens the handle for overlapped I/O; 'using' disposes it afterwards.
            using (FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 16, useAsync: true))
            {
                byte[] buffer = new byte[fileStream.Length];

                int bytesRead = await fileStream.ReadAsync(buffer, 0, buffer.Length);

                string content = Encoding.UTF8.GetString(buffer, 0, bytesRead);

                var query = $"{DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss:fff")} result received: {content.Length}";

                Console.WriteLine(query);

                return query;
            }
        }

The method above clearly ends up calling Win32 ReadFile, so here are its signature and the _OVERLAPPED structure.

BOOL ReadFile(
  [in]                HANDLE       hFile,
  [out]               LPVOID       lpBuffer,
  [in]                DWORD        nNumberOfBytesToRead,
  [out, optional]     LPDWORD      lpNumberOfBytesRead,
  [in, out, optional] LPOVERLAPPED lpOverlapped
);

typedef struct _OVERLAPPED {
  ULONG_PTR Internal;
  ULONG_PTR InternalHigh;
  union {
    struct {
      DWORD Offset;
      DWORD OffsetHigh;
    } DUMMYSTRUCTNAME;
    PVOID Pointer;
  } DUMMYUNIONNAME;
  HANDLE    hEvent;
} OVERLAPPED, *LPOVERLAPPED;

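2. Finding the two ends of the mapping

Since this is a mapping, we need to find its two ends: NativeOverlapped on the unmanaged side and ThreadPoolBoundHandleOverlapped on the managed side.

  1. Unmanaged _OVERLAPPED

In C#, the Win32 _OVERLAPPED structure is represented by the NativeOverlapped struct: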

public struct NativeOverlapped
{
	public nint InternalLow;
	public nint InternalHigh;
	public int OffsetLow;
	public int OffsetHigh;
	public nint EventHandle;
}
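
As a small illustration of how the fields line up: the Offset / OffsetHigh pair in _OVERLAPPED carries the 64-bit file position of the request, split into two 32-bit halves, and OffsetLow / OffsetHigh play the same role in NativeOverlapped. The sketch below shows that split and the reverse step. The helper names WithOffset and ToOffset are made up for this example and are not part of the runtime.

```csharp
using System;
using System.Threading;

public static class OffsetDemo
{
    // Hypothetical helper: fill the offset fields for a given file position.
    public static NativeOverlapped WithOffset(long fileOffset)
    {
        return new NativeOverlapped
        {
            OffsetLow = unchecked((int)fileOffset),   // low 32 bits
            OffsetHigh = (int)(fileOffset >> 32)      // high 32 bits
        };
    }

    // Hypothetical helper: recombine the two halves into the 64-bit position.
    public static long ToOffset(in NativeOverlapped o) =>
        ((long)o.OffsetHigh << 32) | (uint)o.OffsetLow;

    public static void Main()
    {
        NativeOverlapped o = WithOffset(0x1_2345_6789L);
        Console.WriteLine($"{o.OffsetHigh:X} {o.OffsetLow:X8}"); // 1 23456789
        Console.WriteLine(ToOffset(o) == 0x1_2345_6789L);        // True
    }
}
```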

  2. Managed ThreadPoolBoundHandleOverlapped

The Task<int> returned by ReadAsync is produced underneath through a ValueTask and an OverlappedValueTaskSource, and it ends up tucked away inside ThreadPoolBoundHandleOverlapped, a subclass of Overlapped. The relevant code is shown below:

        public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
        {
            ValueTask<int> valueTask = this.ReadAsync(new Memory<byte>(buffer, offset, count), cancellationToken);
            if (!valueTask.IsCompletedSuccessfully)
            {
                return valueTask.AsTask();
            }
            return this._lastSyncCompletedReadTask.GetTask(valueTask.Result);
        }

        private unsafe static ValueTuple<SafeFileHandle.OverlappedValueTaskSource, int> QueueAsyncReadFile(SafeFileHandle handle, Memory<byte> buffer, long fileOffset, CancellationToken cancellationToken, OSFileStreamStrategy strategy)
        {
            SafeFileHandle.OverlappedValueTaskSource overlappedValueTaskSource = handle.GetOverlappedValueTaskSource();
            
            NativeOverlapped* ptr = overlappedValueTaskSource.PrepareForOperation(buffer, fileOffset, strategy);
            if (Interop.Kernel32.ReadFile(handle, (byte*)overlappedValueTaskSource._memoryHandle.Pointer, buffer.Length, IntPtr.Zero, ptr) == 0)
            {
                overlappedValueTaskSource.RegisterForCancellation(cancellationToken);
            }
            overlappedValueTaskSource.FinishedScheduling();
            return new ValueTuple<SafeFileHandle.OverlappedValueTaskSource, int>(overlappedValueTaskSource, -1);
        }
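
To make the ValueTask part less mysterious, here is a minimal sketch of how a ValueTask<int> can be backed by a reusable source object and completed later from a callback. DemoValueTaskSource is a hypothetical stand-in written for this article, not the real OverlappedValueTaskSource. It only assumes the public IValueTaskSource<int> and ManualResetValueTaskSourceCore<int> APIs, and the "I/O completion" is simulated with Task.Run.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical stand-in for OverlappedValueTaskSource; not the runtime's type.
public sealed class DemoValueTaskSource : IValueTaskSource<int>
{
    // Mutable struct that stores the continuation and its state; must not be readonly.
    private ManualResetValueTaskSourceCore<int> _core;

    public ValueTask<int> AsValueTask() => new ValueTask<int>(this, _core.Version);

    // Called by whoever observes the completion of the operation.
    public void Complete(int bytesTransferred) => _core.SetResult(bytesTransferred);

    public int GetResult(short token) => _core.GetResult(token);

    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);

    public void OnCompleted(Action<object?> continuation, object? state, short token,
        ValueTaskSourceOnCompletedFlags flags) =>
        _core.OnCompleted(continuation, state, token, flags);
}

public static class ValueTaskSourceDemo
{
    public static async Task<int> RunAsync()
    {
        var source = new DemoValueTaskSource();

        // Same ValueTask -> Task conversion that ReadAsync performs above.
        Task<int> pending = source.AsValueTask().AsTask();

        // Simulate the completion callback arriving on another thread.
        _ = Task.Run(() => source.Complete(42));

        return await pending;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync()); // 42
    }
}
```

The point of the sketch is that the awaiting code registers its continuation inside the source object, so whoever can find that object again at completion time can resume the await.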

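Finally, the mapping between the two ends. A private block is allocated with malloc (through NativeMemory.Alloc). The NativeOverlapped sits at the front, followed by an 8-byte GCHandle count, and then the GCHandle that refers to the managed object. On x64 the model looks like this:

[NativeOverlapped, 32 bytes][GCHandle count, 8 bytes][GCHandle, 8 bytes]

3. Seeing is believing

To see it for ourselves, we can look for the answer in the Overlapped.AllocateNativeOverlapped method in the C# source.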

    public unsafe class Overlapped
    {
        private NativeOverlapped* AllocateNativeOverlapped(object? userData)
        {
            NativeOverlapped* pNativeOverlapped = null;

            nuint handleCount = 1;

            pNativeOverlapped = (NativeOverlapped*)NativeMemory.Alloc((nuint)(sizeof(NativeOverlapped) + sizeof(nuint)) + handleCount * (nuint)sizeof(GCHandle));

            GCHandleCountRef(pNativeOverlapped) = 0;

            pNativeOverlapped->InternalLow = default;
            pNativeOverlapped->InternalHigh = default;
            pNativeOverlapped->OffsetLow = _offsetLow;
            pNativeOverlapped->OffsetHigh = _offsetHigh;
            pNativeOverlapped->EventHandle = _eventHandle;

            GCHandleRef(pNativeOverlapped, 0) = GCHandle.Alloc(this);
            GCHandleCountRef(pNativeOverlapped)++;

            return pNativeOverlapped;
        }

        private static ref nuint GCHandleCountRef(NativeOverlapped* pNativeOverlapped)
                               => ref *(nuint*)(pNativeOverlapped + 1);

        private static ref GCHandle GCHandleRef(NativeOverlapped* pNativeOverlapped, nuint index)
                              => ref *((GCHandle*)((nuint*)(pNativeOverlapped + 1) + 1) + index);
    }

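The code above first allocates a private block with NativeMemory.Alloc and then roots the Overlapped object with GCHandle.Alloc. This is a normal (non-pinning) handle, so its job is to keep the object from being collected while the I/O is in flight. It does not stop the GC from moving the object, which is fine because native code only ever sees the NativeOverlapped pointer in unmanaged memory. With the code in hand, let's confirm it in WinDbg: set a breakpoint on Kernel32!ReadFile and inspect the fifth argument.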

0:000> bp Kernel32!ReadFile
0:000> g
Breakpoint 0 hit
KERNEL32!ReadFile:
00007ffd`fa2f56a0 ff25caca0500    jmp     qword ptr [KERNEL32!_imp_ReadFile (00007ffd`fa352170)] ds:00007ffd`fa352170={KERNELBASE!ReadFile (00007ffd`f85c5520)}
0:000> k 5
 # Child-SP          RetAddr               Call Site
00 000000ff`8837e1c8 00007ffd`96229ce3     KERNEL32!ReadFile
01 000000ff`8837e1d0 00007ffd`96411a4a     System_Private_CoreLib!Interop.Kernel32.ReadFile+0xa3 [/_/src/coreclr/System.Private.CoreLib/Microsoft.Interop.LibraryImportGenerator/Microsoft.Interop.LibraryImportGenerator/LibraryImports.g.cs @ 6797] 
02 000000ff`8837e2d0 00007ffd`96411942     System_Private_CoreLib!System.IO.RandomAccess.QueueAsyncReadFile+0x8a
03 000000ff`8837e350 00007ffd`96433677     System_Private_CoreLib!System.IO.RandomAccess.ReadAtOffsetAsync+0x112 [/_/src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs @ 238] 
04 000000ff`8837e3f0 00007ffd`9642d5f8     System_Private_CoreLib!System.IO.Strategies.OSFileStreamStrategy.ReadAsync+0xb7 [/_/src/libraries/System.Private.CoreLib/src/System/IO/Strategies/OSFileStreamStrategy.cs @ 290] 
0:000> uf 00007ffd`96229ce3
...
 6797 00007ffd`96229c98 4c8b7d30        mov     r15,qword ptr [rbp+30h]
 6797 00007ffd`96229c9c 4c897c2420      mov     qword ptr [rsp+20h],r15
 6797 00007ffd`96229ca1 498bce          mov     rcx,r14
 6797 00007ffd`96229ca4 48894dac        mov     qword ptr [rbp-54h],rcx
 6797 00007ffd`96229ca8 488bd3          mov     rdx,rbx
 6797 00007ffd`96229cab 488955a4        mov     qword ptr [rbp-5Ch],rdx
 6797 00007ffd`96229caf 448bc6          mov     r8d,esi
 6797 00007ffd`96229cb2 448945b4        mov     dword ptr [rbp-4Ch],r8d
 6797 00007ffd`96229cb6 4c8bcf          mov     r9,rdi
 6797 00007ffd`96229cb9 4c894d9c        mov     qword ptr [rbp-64h],r9
 6797 00007ffd`96229cbd 488d8d40ffffff  lea     rcx,[rbp-0C0h]
 6797 00007ffd`96229cc4 ff159e909e00    call    qword ptr [System_Private_CoreLib!Interop.CallStringMethod+0x5ab9c8 (00007ffd`96c12d68)]
 6797 00007ffd`96229cca 488b055708a100  mov     rax,qword ptr [System_Private_CoreLib!Interop.CallStringMethod+0x5d3188 (00007ffd`96c3a528)]
 6797 00007ffd`96229cd1 488b4dac        mov     rcx,qword ptr [rbp-54h]
 6797 00007ffd`96229cd5 488b55a4        mov     rdx,qword ptr [rbp-5Ch]
 6797 00007ffd`96229cd9 448b45b4        mov     r8d,dword ptr [rbp-4Ch]
 6797 00007ffd`96229cdd 4c8b4d9c        mov     r9,qword ptr [rbp-64h]
 6797 00007ffd`96229ce1 ff10            call    qword ptr [rax]
 6797 00007ffd`96229ce3 8bd8            mov     ebx,eax

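Reading the disassembly carefully, mov r15,qword ptr [rbp+30h] followed by mov qword ptr [rsp+20h],r15 tells us that pNativeOverlapped is held in r15: under the Windows x64 calling convention the first four arguments go in rcx, rdx, r8 and r9, and the fifth is passed on the stack at [rsp+20h].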

0:000> r r15
r15=00000241ca2d4d70
0:000> dp 00000241ca2d4d70
00000241`ca2d4d70  00000000`00000000 00000000`00000000
00000241`ca2d4d80  00000000`00000000 00000000`00000000
00000241`ca2d4d90  00000000`00000001 00000241`c8761358

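According to the layout described earlier, 00000241ca2d4d90 holds the GCHandle count (1 here), and the next slot at 00000241ca2d4d98 holds 00000241c8761358, the GCHandle that leads to our ThreadPoolBoundHandleOverlapped. A GCHandle value is the address of a handle-table slot, so !do poi(00000241c8761358) dumps the object it refers to.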
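
The addresses in the dump follow from plain pointer arithmetic on the value in r15. The sketch below recomputes them. The class and method names are invented for this example, and the printed values assume a 64-bit process, where NativeOverlapped is 32 bytes and both the count and a GCHandle are 8 bytes.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public static class LayoutMath
{
    // Address of the GCHandle count that sits right after the NativeOverlapped.
    public static ulong CountAddress(ulong pNativeOverlapped) =>
        pNativeOverlapped + (ulong)Unsafe.SizeOf<NativeOverlapped>();

    // Address of the GCHandle slot with the given index (after the count).
    public static ulong HandleAddress(ulong pNativeOverlapped, uint index) =>
        CountAddress(pNativeOverlapped) + (ulong)IntPtr.Size + index * (ulong)IntPtr.Size;

    public static void Main()
    {
        ulong p = 0x00000241ca2d4d70; // value of r15 in the session above

        // On a 64-bit process these print 241ca2d4d90 and 241ca2d4d98.
        Console.WriteLine($"count  @ {CountAddress(p):x}");
        Console.WriteLine($"handle @ {HandleAddress(p, 0):x}");
    }
}
```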

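Finally, use dnSpy to set a breakpoint in Overlapped.GetOverlappedFromNative. This method runs after the asynchronous operation has completed and performs the lookup from the NativeOverlapped to the ThreadPoolBoundHandleOverlapped. The ReadAsync continuation is stored in the internal _continuationState field.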
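
The whole round trip can be reproduced in a few lines. This is a minimal sketch of the technique, not the runtime's implementation: it allocates the same kind of block, stores a GCHandle behind the NativeOverlapped, and later recovers the managed object from nothing but the native pointer. The names OverlappedLayoutDemo, Allocate, Recover and Free are made up for illustration, and the project must allow unsafe code (AllowUnsafeBlocks).

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

public static unsafe class OverlappedLayoutDemo
{
    // The GCHandle slot lives after the NativeOverlapped and the count field.
    private static GCHandle* HandleSlot(NativeOverlapped* p) =>
        (GCHandle*)((nuint*)(p + 1) + 1);

    // Allocate [NativeOverlapped][count][GCHandle] and root 'owner' via the handle.
    public static NativeOverlapped* Allocate(object owner)
    {
        nuint size = (nuint)(sizeof(NativeOverlapped) + sizeof(nuint) + sizeof(GCHandle));
        NativeOverlapped* p = (NativeOverlapped*)NativeMemory.Alloc(size);

        *p = default;                           // zero the fields the OS will use
        *(nuint*)(p + 1) = 1;                   // one GCHandle follows
        *HandleSlot(p) = GCHandle.Alloc(owner); // normal handle: keeps 'owner' alive
        return p;
    }

    // What a completion callback can do with only the native pointer in hand.
    public static object? Recover(NativeOverlapped* p) => HandleSlot(p)->Target;

    public static void Free(NativeOverlapped* p)
    {
        HandleSlot(p)->Free();
        NativeMemory.Free(p);
    }

    // Safe wrapper so callers do not need an unsafe context.
    public static bool RoundTrip()
    {
        object owner = new object();
        NativeOverlapped* p = Allocate(owner);
        try
        {
            return ReferenceEquals(Recover(p), owner);
        }
        finally
        {
            Free(p);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip()); // True
    }
}
```

Because the handle lives in the same native block as the NativeOverlapped, no lookup table is needed: the pointer the kernel hands back is enough to reach the managed object.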

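III: Summary

C# code usually establishes this kind of mapping by passing arguments around. Carving out a private malloc'd region to tie the two sides together, as is done here, is quite unusual, and it is really a matter of necessity.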