k8s ipvs 模式下不支持 localhost:<nodeport>方式访问服务

简介

今天去定位一个nodeport的问题,发现curl 127.0.0.1:32000 访问nodeport的时候会规律的hang住,本来以为是后端服务的问题,但是curl管理ip:nodeport 是正常的。这个就奇怪了,深入研究了下发现 ipvs模式下是不支持这样访问的,如果想使用 localhost: 得使用iptables模式。

下面是一个歪果仁的解释

After a little dig into the kernel, failing to connect localhost: can be explained.

Assume we are visiting http://127.0.0.1:, every packet will first pass through ip_vs_nat_xmit. Then after some check, it runs to https://github.com/torvalds/linux/blob/v4.18/net/netfilter/ipvs/ip_vs_xmit.c#L756. __ip_vs_get_out_rt is used to search for route to remote server, k8s pods in our case. Take a deeper look at __ip_vs_get_out_rt, after validating route cache or finding route via do_output_route4, it comes to crosses_local_route_boundary to judge whether the searched route can pass cross-local-route-boundary check.

Copy the code of crosses_local_route_boundary here and go deeper.

c 复制代码
static inline bool crosses_local_route_boundary(int skb_af, struct sk_buff *skb,
						int rt_mode,
						bool new_rt_is_local)
{
	bool rt_mode_allow_local = !!(rt_mode & IP_VS_RT_MODE_LOCAL);
	bool rt_mode_allow_non_local = !!(rt_mode & IP_VS_RT_MODE_NON_LOCAL);
	bool rt_mode_allow_redirect = !!(rt_mode & IP_VS_RT_MODE_RDR);
	bool source_is_loopback;
	bool old_rt_is_local;

#ifdef CONFIG_IP_VS_IPV6
	/* omit ipv6 */
#endif
	{
		source_is_loopback = ipv4_is_loopback(ip_hdr(skb)->saddr);
		old_rt_is_local = skb_rtable(skb)->rt_flags & RTCF_LOCAL;
	}

	if (unlikely(new_rt_is_local)) {
		if (!rt_mode_allow_local)
			return true;
		if (!rt_mode_allow_redirect && !old_rt_is_local)
			return true;
	} else {
		if (!rt_mode_allow_non_local)
			return true;
		if (source_is_loopback)
			return true;
	}
	return false;
}

For nat mode of IPVS, rt_mode is assigned as IP_VS_RT_MODE_LOCAL | IP_VS_RT_MODE_NON_LOCAL | IP_VS_RT_MODE_RDR as https://github.com/torvalds/linux/blob/v4.18/net/netfilter/ipvs/ip_vs_xmit.c#L757-L759 indicates and new_rt_is_local is 0 due to https://github.com/torvalds/linux/blob/v4.18/net/netfilter/ipvs/ip_vs_xmit.c#L363.

source_is_loopback will be true because source address ip_hdr(skb)->saddr is 127.0.0.1. The five booleans defined in crosses_local_route_boundary will all be true in this case. So we finally fall into here

if (source_is_loopback)

return true;

and never pass the cross-local-route-boundary check.

解决方案

目前看如果想这样使用1是修改内核,这个可能比较难实现,另一个方案是修改路由表,修改源ip的地址,也是一种曲线救国的方式

bash 复制代码
sudo ip route change 127.0.0.1 dev lo proto  kernel scope host src <node ip> table local

引用

https://github.com/kubernetes/kubernetes/issues/96879

https://github.com/kubernetes/kubernetes/issues/67730

https://serverfault.com/questions/851221/what-is-the-local-routing-table-used-for

相关推荐
MonkeyKing_sunyuhua1 小时前
docker compose up -d --build 完全使用新代码打包的方法
docker·容器·eureka
醇氧2 小时前
【docker】mysql 8 的健康检查(Health Check)
mysql·docker·容器
匀泪5 小时前
云原生(LVS NAT模式集群实验)
服务器·云原生·lvs
70asunflower6 小时前
用Docker创建不同的容器类型
运维·docker·容器
CodeGolang6 小时前
Docker容器化部署Zabbix监控系统完整指南
docker·容器·zabbix
DolitD6 小时前
云流技术深度剖析:国内云渲染主流技术与开源和海外厂商技术实测对比
功能测试·云原生·开源·云计算·实时云渲染
ghostwritten8 小时前
春节前夕,运维的「年关」:用 Kubeowler 给集群做一次「年终体检」
运维·云原生·kubernetes
[shenhonglei]17 小时前
灰度发布功能需求说明书
kubernetes
lpruoyu17 小时前
【Docker进阶-03】存储原理
docker·容器
文静小土豆18 小时前
Docker 与 containerd 代理配置详解:镜像拉取速度慢的终极解决方案
运维·docker·容器