A study of connection-error behavior when nginx forwards to an upstream

This test uses 3 servers:

192.168.10.115: forwarding server A

192.168.10.209: upstream server 1

192.168.10.210: upstream server 2

1 client: 192.168.10.112

The main nginx configuration on server A is as follows:

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    keepalive_timeout  65;

    #gzip  on;

    upstream testup{
        server 192.168.10.209 weight=1 max_fails=1 fail_timeout=30s;
        server 192.168.10.210 weight=1 max_fails=1 fail_timeout=30s;
    }

    server {
        listen       80;
        server_name  localhost;

        #charset koi8-r;

        access_log  logs/host.access.log  main;

        location / {
            #root   html;
            #index  index.html index.htm;
            proxy_next_upstream http_502 http_504 error timeout invalid_header;
            proxy_ignore_client_abort on;
            proxy_send_timeout 60s;
            proxy_read_timeout 300s;
            proxy_next_upstream_tries 0;
            proxy_pass http://testup;
            proxy_set_header Host $host:$server_port;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_connect_timeout 3;
            proxy_redirect default;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }

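nginx is configured with 2 worker processes.

Servers 1 and 2 are both ordinary web servers, so their configuration is not shown here.

Page 1, served by server 1:

Page 2, served by server 2:

Under normal conditions, requests to server A alternate between pages 1 and 2 above.

Test case 1: shut down server 1

Requesting A's address first stalls for about 3 s (presumably because of 'proxy_connect_timeout 3') and then shows page 2. Subsequent refreshes respond without any stall. After 30 s (per the nginx settings 'max_fails=1 fail_timeout=30s'), the next request again stalls for 3 s before succeeding, and the requests after it do not stall. The nginx error log: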
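The "stall once, then no stall for 30 s" pattern can be pictured with a small model of the per-server failure state. This is only a minimal sketch of how 'max_fails' and 'fail_timeout' appear to behave in this test, not nginx's actual implementation; the class `Peer` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Simplified failure state for one upstream server (hypothetical model)."""
    name: str
    max_fails: int = 1
    fail_timeout: float = 30.0
    fails: int = 0
    checked: float = 0.0  # time of the most recent failure

    def available(self, now: float) -> bool:
        # A peer that has reached max_fails is skipped until more than
        # fail_timeout seconds have passed since its last failure.
        if self.max_fails and self.fails >= self.max_fails:
            return now - self.checked > self.fail_timeout
        return True

    def on_failure(self, now: float) -> None:
        self.fails += 1
        self.checked = now


peer = Peer("192.168.10.209")
peer.on_failure(now=0.0)         # the first request times out while connecting
print(peer.available(now=5.0))   # False: requests skip this server, no stall
print(peer.available(now=31.0))  # True: the server gets tried (and stalls) again
```

In this model a request only pays the 3 s connect timeout when `available()` returns True for the dead server, which is why the stall comes back roughly every 30 s.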

2024/09/21 18:23:19 [error] 6056#0: *114 upstream timed out (110: Connection timed out) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.209:80/", host: "192.168.10.115"

Test case 2: keep server 1 running and stop only its nginx service

Requesting A's address does not stall, and the page always shows page 2. The nginx error log:

2024/09/21 18:30:14 [error] 6055#0: *133 connect() failed (113: No route to host) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.209:80/", host: "192.168.10.115"

2024/09/21 18:30:48 [error] 6055#0: *133 connect() failed (113: No route to host) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.209:80/", host: "192.168.10.115"

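This shows that 'proxy_connect_timeout 3' only concerns whether the connection attempt reaches the server at all. Here the host is up and connect() fails immediately, so nginx moves on to the next server without waiting out the timeout. The timeout is unrelated to whether the nginx service on the target server is running normally.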
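The difference between a connect attempt that fails at once and one that has to wait for the timeout can be checked outside nginx with a plain TCP connect. The helper `probe` below is a hypothetical name, and this is only a sketch of the idea using Python's standard `socket` module.

```python
import socket
import time
from typing import Tuple


def probe(host: str, port: int, timeout: float = 3.0) -> Tuple[str, float]:
    """Try a TCP connect and report how it ended and how long it took."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except socket.timeout:
        # No answer at all within the timeout (host down, packets dropped).
        outcome = "timed out"
    except OSError as exc:
        # The attempt was rejected right away (refused, no route to host, ...).
        outcome = f"failed fast: {exc.strerror or exc}"
    return outcome, time.monotonic() - start


# In the test environment this would be called from server A, for example:
# print(probe("192.168.10.209", 80, timeout=3.0))
```

A "timed out" result corresponds to test case 1, where nginx waits the full 'proxy_connect_timeout'; a "failed fast" result corresponds to test case 2, where no wait is visible.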

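Test case 3: take down both servers 1 and 2 (for server 2, its externally facing port was blocked)

Requesting A's address stalls for about 6 to 7 s, and then the following page appears:

This is presumably the two servers being tried one after the other, taking 3 s × 2. Subsequent requests return the page above immediately without stalling; after 30 s, the next request again stalls for 6 s, and the ones after it do not. The nginx error log:

The first two entries are the timeout logs of the two servers. The later "no live upstreams while connecting to upstream" entries appear because, after the earlier timeouts, nginx has already judged both machines unavailable and will not forward requests to them for 30 s; with no usable upstream left, it reports this error directly.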
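The timing seen in this case (6 s, then immediate errors, then 6 s again after 30 s) can be reproduced with a rough timeline model. This is a sketch that mirrors the observations rather than nginx internals; `handle_request` and `down_since` are hypothetical names, and 'max_fails=1' is assumed.

```python
CONNECT_TIMEOUT = 3.0   # proxy_connect_timeout 3
FAIL_TIMEOUT = 30.0     # fail_timeout=30s, with max_fails=1


def handle_request(now, down_since, peers=("192.168.10.209", "192.168.10.210")):
    """Return (seconds spent, outcome) for one request when no peer is reachable.

    down_since maps a peer to the time at which it was last marked as failed.
    """
    elapsed = 0.0
    tried = 0
    for peer in peers:
        marked = down_since.get(peer)
        if marked is not None and now + elapsed - marked <= FAIL_TIMEOUT:
            continue                      # still inside the fail_timeout window
        elapsed += CONNECT_TIMEOUT        # the connect attempt times out
        down_since[peer] = now + elapsed  # the peer is marked as failed
        tried += 1
    outcome = "upstream timed out" if tried else "no live upstreams"
    return elapsed, outcome


state = {}
print(handle_request(0.0, state))   # both servers tried in turn
print(handle_request(10.0, state))  # both still marked as failed
print(handle_request(40.0, state))  # fail_timeout has passed, both tried again
```

The three calls return 6.0 s with a timeout, 0.0 s with "no live upstreams", and 6.0 s with a timeout again, which lines up with the stalls described above.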

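Test case 4: simulating a slow network

Restore normal access to servers 1 and 2, and make sure that requests to A's address alternate between pages 1 and 2 again.

Simulate network delay: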

tc qdisc add dev ens33 root netem delay 1000ms

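Reference: "Learn to simulate network latency on Linux in 1 minute" (CSDN blog)

Here the 'proxy_connect_timeout' value in server A's nginx configuration is changed to 1. If you enjoy waiting you can leave it unchanged, but then the delay above has to be raised to 3000ms. This delay really does slow everything down, even the input and output of the xshell session to the server ╮(╯▽╰)╭; anything that goes through the network card gets held up.

Requesting A's address now stalls for a while and then shows the following page:

This is roughly the same as the earlier case with servers 1 and 2 shut down. The first request stalls for 4 to 5 s before returning the page above, subsequent refreshes return in about 2 s, and after 30 s a request again stalls for 4 to 5 s. The error log output is also the same as in case 3.

Change 'proxy_connect_timeout' in server A's nginx configuration to 2. Requests to A's address now return page 1 or page 2 in about 4 to 5 s, and the page content alternates normally.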
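The 4 to 5 s and 2 s figures can be roughly accounted for by counting the packets that leave server A, since a netem qdisc on the root of ens33 delays outgoing packets only. The sketch below rests on several assumptions: the client opens a fresh TCP connection for each request, the delay applies equally to A's traffic towards the client and towards the upstream servers, and everything except the 1000 ms delay is negligible. `estimate_wait` is a hypothetical helper.

```python
DELAY = 1.0  # seconds netem adds to every packet leaving server A (delay 1000ms)


def estimate_wait(connect_timeout, candidates, delay=DELAY):
    """Rough client-side wait in seconds, counting only delayed packets sent by A.

    candidates: upstream servers not currently marked as failed.
    """
    total = delay                              # A's SYN-ACK to the client
    if candidates:
        if delay >= connect_timeout:
            # The delayed SYN cannot be answered in time: each candidate times out.
            total += candidates * connect_timeout
        else:
            total += delay                     # A's SYN to the upstream
            total += delay                     # A's request to the upstream
    total += delay                             # A's response to the client
    return total


print(estimate_wait(connect_timeout=1, candidates=2))  # first failing request
print(estimate_wait(connect_timeout=1, candidates=0))  # refresh within 30 s
print(estimate_wait(connect_timeout=2, candidates=2))  # after raising the timeout
```

Under these assumptions the three estimates are 4.0, 2.0 and 4.0 s, in line with the roughly 4 to 5 s, 2 s and 4 to 5 s observed.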

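Test case 4.2: adjusting proxy_next_upstream_tries

With requests still failing under the slow-network conditions of case 4, set 'proxy_next_upstream_tries' to 1 and request A's address 3 times. The nginx error log: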

2024/09/21 21:03:10 [notice] 6399#0: signal process started

2024/09/21 21:03:29 [error] 6401#0: *396 upstream timed out (110: Connection timed out) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.209:80/", host: "192.168.10.115"

2024/09/21 21:03:40 [error] 6401#0: *396 upstream timed out (110: Connection timed out) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.210:80/", host: "192.168.10.115"

2024/09/21 21:03:49 [error] 6401#0: *396 no live upstreams while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://testup/", host: "192.168.10.115"

Set 'proxy_next_upstream_tries' to 2 and request A's address 2 times. The error log:

2024/09/21 21:23:16 [error] 6432#0: *413 upstream timed out (110: Connection timed out) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.209:80/", host: "192.168.10.115"

2024/09/21 21:23:17 [error] 6432#0: *413 upstream timed out (110: Connection timed out) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.210:80/", host: "192.168.10.115"

2024/09/21 21:23:22 [error] 6432#0: *413 no live upstreams while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://testup/", host: "192.168.10.115"

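Compared with the previous setting, the first request now tries 209 and then 210 in succession. Both fail and are marked unavailable, and the second request directly returns the "no live upstreams" error.

Set 'proxy_next_upstream_tries' to 3 and request A's address 2 times. The error log: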

2024/09/21 21:28:59 [error] 6446#0: *423 upstream timed out (110: Connection timed out) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.209:80/", host: "192.168.10.115"

2024/09/21 21:29:00 [error] 6446#0: *423 upstream timed out (110: Connection timed out) while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.10.210:80/", host: "192.168.10.115"

2024/09/21 21:29:12 [error] 6446#0: *423 no live upstreams while connecting to upstream, client: 192.168.10.112, server: localhost, request: "GET / HTTP/1.1", upstream: "http://testup/", host: "192.168.10.115"

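The log output is the same as in the previous test. The upstream contains only 2 servers, and nginx does not loop back to the first server just because proxy_next_upstream_tries is greater than 2. So when proxy_next_upstream_tries exceeds the number of upstream servers, the number of servers is the effective limit. With proxy_next_upstream_tries set to 0 (which turns the limit off), the logs suggest that nginx tries as many connections as there are servers.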
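The conclusion above fits in a few lines. This is only a model of what the logs in this test show, and `attempts_per_request` is a hypothetical helper, not an nginx function.

```python
def attempts_per_request(tries: int, peer_count: int) -> int:
    """Upper bound on upstream servers tried for one request (model of the logs)."""
    if tries == 0:
        # proxy_next_upstream_tries 0 turns the limit off,
        # so only the number of servers bounds the attempts.
        return peer_count
    return min(tries, peer_count)


for tries in (0, 1, 2, 3):
    print(tries, attempts_per_request(tries, peer_count=2))
```

For the 2-server upstream used here, the values 0, 2 and 3 all allow 2 attempts per request, and only the value 1 restricts a request to a single server.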

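Test case 4.3: adjusting max_fails

Change max_fails to 2 in the nginx configuration and request address A 3 times. The first two requests take noticeably longer. The error log:

Only after each upstream server has timed out on connect 2 times within 30 s are both judged unavailable, and the last request then gets "no live upstreams".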
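How 'max_fails' and 'proxy_next_upstream_tries' combine can be explored with a small failure-counting model. It is a sketch under stated assumptions: every connect attempt times out, all requests fall within one fail_timeout window, and the round-robin start simply advances by one per request. `simulate` is a hypothetical helper, and the value of 'proxy_next_upstream_tries' in effect during this test is assumed to allow both servers to be tried.

```python
def simulate(requests, max_fails, tries=0,
             peers=("192.168.10.209", "192.168.10.210")):
    """Return, per request, the peers that timed out, or 'no live upstreams'."""
    fails = {peer: 0 for peer in peers}
    limit = tries or len(peers)    # 0 means no explicit limit
    start = 0                      # simplified round-robin position
    results = []
    for _ in range(requests):
        attempted = []
        for offset in range(len(peers)):
            peer = peers[(start + offset) % len(peers)]
            if fails[peer] >= max_fails or len(attempted) >= limit:
                continue           # peer marked unavailable, or tries used up
            fails[peer] += 1       # the connect attempt times out
            attempted.append(peer)
        start = (start + 1) % len(peers)
        results.append(attempted or "no live upstreams")
    return results


for outcome in simulate(requests=3, max_fails=2):
    print(outcome)
```

With max_fails=2 the first two requests each time out on both servers and the third gets "no live upstreams". Calling `simulate(3, max_fails=1, tries=1)` gives one timed-out server per request for the first two requests and then "no live upstreams", the same sequence as the log for 'proxy_next_upstream_tries' set to 1.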

That about wraps up what I wanted to test.

Finally, remember to remove the network delay once testing is done:

tc qdisc del dev ens33 root

If these tests were helpful for your study or work, please give this post a like!
