http请求超时, 底层发生了什么?

业务方反应调用接口超时,但是在服务端监控并没有看到5xx异常, 于是我们模拟一下请求超时时发生了什么?

1.openresty模拟长耗时服务端

延迟5s响应

error_log logs/error.log;

http {
    server {
      listen 80;
      charset  utf-8;
      
      location /reqtimeout {
      default_type text/html;
      content_by_lua '
      local  start  = os.clock()
          while os.clock() - start  <  5 do  end
            ngx.say("delay  success!!")
      ';
    }
  }  
}

2.golang和.net默认的httpclient对外都只有一个timeout设置

用于控制请求、响应的整体时间

.net httpclient 默认timeout= 100s;

golang net/http 无默认值设置,强烈推荐设置timeout,以避免服务端慢响应拖垮客户端。

C# 复制代码
  static void Main(string[] args)
    {
        Console.WriteLine("Hello, World!");
       var a =  HttpReqTimeout();
       Console.WriteLine(a.Result);
    }

    static async  Task<string> HttpReqTimeout()
    {
        var handler = new SocketsHttpHandler
        {
            PooledConnectionLifetime = TimeSpan.FromMinutes(1)
        };
        using (var hc = new HttpClient(handler))
        {
            hc.Timeout = TimeSpan.FromSeconds(3);
            return  await hc.GetStringAsync("http://localhost/reqtimeout");
        }
    }

dotnet run ./ 显示客户端请求3s超时,爆出异常

Hello, World!
Unhandled exception. System.AggregateException: One or more errors occurred. (A task was canceled.)
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at System.Threading.Tasks.Task.GetExceptions(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at ConsoleApp1.Program.Main(String[] args) in /Users/admin/RiderProjects/TestHttpClientFactory/ConsoleApp1/Program.cs:line 9
--- End of stack trace from previous location ---

   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at ConsoleApp1.Program.Main(String[] args) in /Users/admin/RiderProjects/TestHttpClientFactory/ConsoleApp1/Program.cs:line 9

openresty服务端日志,显示执行完成,返回200ok:

127.0.0.1 - - [04/Dec/2024:15:17:50 +0800] "GET /reqtimeout HTTP/1.1" 200 28 "-" "-"

这也正是对应上了业务方的反馈和服务端的监控现象(无5xx报错)。

3.wireshark抓包看实质

tcp.port == 80 && ip.addr ==127.0.0.1 && ip.dst ==127.0.0.1

从tcp抓包过程看,分为三阶段:

1>. httpclient请求, 正常tcp三次握手+ 请求确认;

2>. 客户端3s之后超时, 客户端发送FIN+ACK 数据包(客户端标记连接已经被关闭), 服务端确认收到客户端的FIN包;

3>. 服务端5s尝试响应给客户端,最终会检测到客户端已经关闭而释放资源。

也就是说客户端请求超时,只会影响客户端, 服务端还会继续处理并响应, 这也是我们在服务端监控上看不到5xx报错的原因,可以通过在服务端设置: request_time between (-xx, 3s) 监测请求耗时占比。

正常的请求/响应读者可以参考下图:

本文记录了httpclient客户端超时在双端的现象, 服务端会继续响应,在服务端可能检测不到客户端认定的报错, 经验,唯手熟尔。