A deep dive into the Docker Swarm overlay network model

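Contents

1. Introduction

2. Network model

3. The docker_gwbridge network

3.1. docker_gwbridge gateway address

3.2. Inspecting the docker_gwbridge network

3.2.1. Finding the task container's eth interface

3.2.2. Finding the ingress-sbox container's eth interface

4. Inspecting the ingress network

4.1. Inspecting the ingress network

4.2. Inspecting the ingress network namespace

4.2.1. Finding the task container's eth interface

4.2.2. Finding the ingress-sbox container's eth interface

5. Tracing the network access path

5.1. Viewing the nat table

5.2. Viewing the mangle table

5.3. IPVS load balancing

5.4. Network path access

6. VXLAN

6.1. Listening with tcpdump

6.2. Test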

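1. Introduction

An overlay network is a logical network built on top of an underlay network. That is, on top of the physical network, hosts are connected to one another through unicast tunnels between nodes, forming a virtual, independent network.

In a Docker Swarm cluster, the overlay network is implemented mainly with iptables, IPVS, and VXLAN, and its model is shaped by Swarm's own communication needs.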

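2. Network model

When the overlay network model of a Docker Swarm cluster is set up, two networks are created: the docker_gwbridge network and the ingress network. This is a typical overlay arrangement, in which new networks are created on top of the physical network. Along with them come the docker_gwbridge gateway, the br0 gateway, and the ingress-sbox container.

When a request arrives, it is first forwarded through the docker_gwbridge network to the ingress-sbox container. ingress-sbox knows the IPs of all containers of the target service and picks one of them as the destination by round-robin load balancing. The request then goes to the br0 gateway, which decides based on the host where the destination lives: if the destination is a local container IP, the request is forwarded directly to that container; otherwise it is sent into the VXLAN network.

Run docker network ls on the host: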

root@wss05:/var/run/docker/netns# docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
3ee266d99388   bridge            bridge    local
3662a36aaafd   docker_gwbridge   bridge    local
19b6b58500e8   host              host      local
qh13cp9hcd9f   ingress           overlay   swarm
wdwvazbeweo8   mongo             overlay   swarm
f62673a1ab1f   none              null      local

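The output shows both the docker_gwbridge network and the ingress network:

docker_gwbridge network: bridge driver, scope is the local host (local)

ingress network: overlay driver (cross-host), scope is the whole swarm cluster (swarm)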
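
The same filtering can be done on captured output. The following is a minimal sketch, assuming the docker network ls text above has been saved as a string; the helper names parse_network_ls and swarm_overlays are made up for illustration and are not part of Docker.

```python
# Sample text copied from the `docker network ls` output above.
SAMPLE = """\
NETWORK ID     NAME              DRIVER    SCOPE
3ee266d99388   bridge            bridge    local
3662a36aaafd   docker_gwbridge   bridge    local
19b6b58500e8   host              host      local
qh13cp9hcd9f   ingress           overlay   swarm
wdwvazbeweo8   mongo             overlay   swarm
f62673a1ab1f   none              null      local
"""


def parse_network_ls(text):
    """Turn the table into a list of dicts, skipping the header row."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        net_id, name, driver, scope = line.split()
        rows.append({"id": net_id, "name": name, "driver": driver, "scope": scope})
    return rows


def swarm_overlays(rows):
    """Names of networks that use the overlay driver with swarm scope."""
    return [r["name"] for r in rows if r["driver"] == "overlay" and r["scope"] == "swarm"]


networks = parse_network_ls(SAMPLE)
print(swarm_overlays(networks))  # ['ingress', 'mongo']
```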

3. The docker_gwbridge network

3.1. docker_gwbridge gateway address

Run on the host: ip a | grep docker_gwbridge

root@wss05:/var/run/docker/netns# ip a | grep docker_gwbridge

8: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default

inet 172.18.0.1/16 brd 172.18.255.255 scope global docker_gwbridge

10: veth9315eea@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default

175: veth01b86d6@if174: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default

The docker_gwbridge gateway address is 172.18.0.1, and two veth interfaces are attached to this bridge:

10: veth9315eea@if9

175: veth01b86d6@if174

3.2. Inspecting the docker_gwbridge network

Run on the host: docker network inspect docker_gwbridge

root@wss05:/var/run/docker/netns# docker network inspect docker_gwbridge

[

{

"Name": "docker_gwbridge",

"Id": "3662a36aaafd66289145afcb40da3d1cd630ba06008050caaf0db70450601966",

......

"IPAM": {

"Driver": "default",

"Options": null,

"Config": [

{

"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"

}

]

},

......

"Containers": {

"7fe83a80e505157f15dfbe93f692d8791204c1d6c7fabd0d508bfb028d755b52": {

"Name": "gateway_68395c879fa8",

"EndpointID": "81b7d123133649363f694cbff8e20416350695160d8b75c748ba91609bcfeb46",

"MacAddress": "02:42:ac:12:00:03",

"IPv4Address": "172.18.0.3/16",

"IPv6Address": ""

},

"ingress-sbox": {

"Name": "gateway_ingress-sbox",

"EndpointID": "83d3e9a378d09b39aaf6abee36f9b57c635b43cc939cc77bad9055f6eafb9faf",

"MacAddress": "02:42:ac:12:00:02",

"IPv4Address": "172.18.0.2/16",

"IPv6Address": ""

}

},

......

}

]

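This confirms that the docker_gwbridge gateway address is 172.18.0.1 and the subnet is 172.18.0.0/16.

Two containers are attached to this network: 7fe83a80e505 and ingress-sbox.

Running docker ps shows only one matching container, 7fe83a80e505. According to the inspect output above, its IP on this network is 172.18.0.3: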

root@wss05:/var/run/docker/netns# docker ps

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7fe83a80e505 openjdk-omc:8 "java -jar -Dfile.en..." 2 days ago Up 2 days omc_o.2.m15gi46y82su0meap1qyo87u2
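
A side observation on the inspect output in 3.2: in this capture every MacAddress is 02:42 followed by the four bytes of the endpoint's IPv4 address (ac:12:00:03 is 172.18.0.3). The sketch below checks that relationship and confirms the address falls inside the docker_gwbridge subnet. It only describes the values seen here and is not a guarantee for every Docker version; the function name ip_from_docker_mac is hypothetical.

```python
import ipaddress


def ip_from_docker_mac(mac):
    """Decode the IPv4 address embedded in a 02:42:xx:xx:xx:xx MAC."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or octets[:2] != [0x02, 0x42]:
        raise ValueError("not a 02:42-prefixed MAC: " + mac)
    # The last four bytes are read as a big-endian IPv4 address.
    return ipaddress.IPv4Address(bytes(octets[2:]))


subnet = ipaddress.ip_network("172.18.0.0/16")  # Subnet from the inspect output
for mac in ("02:42:ac:12:00:03", "02:42:ac:12:00:02"):
    ip = ip_from_docker_mac(mac)
    print(mac, "->", ip, "in subnet:", ip in subnet)
```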

3.2.1. Finding the task container's eth interface

To check the IPs inside the container, run docker exec -it 7fe83a80e505 ip a on the host:

root@wss05:/var/run/docker/netns# docker exec -it 7fe83a80e505 ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

172: eth0@if173: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

link/ether 02:42:0a:00:00:ba brd ff:ff:ff:ff:ff:ff link-netnsid 0

inet 10.0.0.186/24 brd 10.0.0.255 scope global eth0

valid_lft forever preferred_lft forever
174: eth2@if175: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default

link/ether 02:42:ac:12:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 2

inet 172.18.0.3/16 brd 172.18.255.255 scope global eth2

valid_lft forever preferred_lft forever

176: eth1@if177: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

link/ether 02:42:0a:00:01:0d brd ff:ff:ff:ff:ff:ff link-netnsid 1

inet 10.0.1.13/24 brd 10.0.1.255 scope global eth1

valid_lft forever preferred_lft forever

Inside the container, 172.18.0.3 sits on interface 174: eth2@if175, which pairs with 175: veth01b86d6@if174 from 3.1. docker_gwbridge gateway address.
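
The pairing rule used here is that each end of a veth pair names the other end's interface index after @if. A minimal sketch of that matching, assuming the two ip a listings are available as text (the sample lines are shortened copies of the output above, and the helper names are made up):

```python
import re

# Matches lines such as "174: eth2@if175: <BROADCAST,...>" from `ip a`.
LINK_RE = re.compile(r"^(\d+):\s*([^@:\s]+)@if(\d+):")


def parse_links(ip_a_text):
    """Return (ifindex, name, peer_ifindex) for every 'name@ifN' line."""
    links = []
    for line in ip_a_text.splitlines():
        m = LINK_RE.match(line.strip())
        if m:
            links.append((int(m.group(1)), m.group(2), int(m.group(3))))
    return links


def veth_pairs(side_a, side_b):
    """Pair interfaces whose ifindex and @ifN reference each other."""
    return [(name_a, name_b)
            for idx_a, name_a, peer_a in parse_links(side_a)
            for idx_b, name_b, peer_b in parse_links(side_b)
            if peer_a == idx_b and peer_b == idx_a]


CONTAINER = """\
172: eth0@if173: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450
174: eth2@if175: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
176: eth1@if177: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450
"""
HOST = """\
10: veth9315eea@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
175: veth01b86d6@if174: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
"""
print(veth_pairs(CONTAINER, HOST))  # [('eth2', 'veth01b86d6')]
```

Only eth2 finds a partner on the host side; eth0 and eth1 have their peers in other namespaces.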

3.2.2. Finding the ingress-sbox container's eth interface

Change into the directory: cd /var/run/docker/netns

root@wss05:/var/run/docker/netns# ll -al

total 0

drwxr-xr-x 2 root root 140 Apr 19 09:37 ./

drwx------ 8 root root 180 Apr 9 02:09 ../

-r--r--r-- 1 root root 0 Apr 8 03:20 1-qh13cp9hcd

-r--r--r-- 1 root root 0 Apr 19 09:37 1-wdwvazbewe

-r--r--r-- 1 root root 0 Apr 19 09:37 68395c879fa8

-r--r--r-- 1 root root 0 Apr 8 03:20 ingress_sbox

-r--r--r-- 1 root root 0 Apr 19 09:37 lb_wdwvazbew

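This directory holds Docker's network namespaces. It contains 68395c879fa8 and ingress_sbox, which are the network namespaces of the task container and of the ingress-sbox container respectively.

Look inside the ingress_sbox network namespace.

Run on the host: nsenter --net=ingress_sbox ip a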

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

6: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

link/ether 02:42:0a:00:00:06 brd ff:ff:ff:ff:ff:ff link-netnsid 0

inet 10.0.0.6/24 brd 10.0.0.255 scope global eth0

valid_lft forever preferred_lft forever

inet 10.0.0.22/32 scope global eth0

valid_lft forever preferred_lft forever

inet 10.0.0.169/32 scope global eth0

valid_lft forever preferred_lft forever

inet 10.0.0.177/32 scope global eth0

valid_lft forever preferred_lft forever

inet 10.0.0.180/32 scope global eth0

valid_lft forever preferred_lft forever
9: eth1@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default

link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 1

inet 172.18.0.2/16 brd 172.18.255.255 scope global eth1

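The ingress-sbox IP on this network is 172.18.0.2, on interface 9: eth1@if10, which pairs with 10: veth9315eea@if9 from 3.1. docker_gwbridge gateway address.

Taken together, this can be drawn as a diagram; this is the docker_gwbridge network.

4. Inspecting the ingress network

4.1. Inspecting the ingress network

Run on the host: docker network inspect ingress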

root@wss05:/var/run/docker/netns# docker network inspect ingress

[

{

"Name": "ingress",

"Id": "qh13cp9hcd9f9qqmfll9u7vwu",

"Created": "2024-04-08T03:20:32.270499152Z",

"Scope": "swarm",

"Driver": "overlay",

"EnableIPv6": false,

"IPAM": {

"Driver": "default",

"Options": null,

"Config": [

{

"Subnet": "10.0.0.0/24",
"Gateway": "10.0.0.1"

}

]

},

"Internal": false,

"Attachable": false,

"Ingress": true,

"ConfigFrom": {

"Network": ""

},

"ConfigOnly": false,

"Containers": {

"7fe83a80e505157f15dfbe93f692d8791204c1d6c7fabd0d508bfb028d755b52": {

"Name": "omc_o.2.m15gi46y82su0meap1qyo87u2",

"EndpointID": "8d654001245103247eae54d3cf45482e8548b7824dcf84c8bfc2d460c03f3962",

"MacAddress": "02:42:0a:00:00:ba",

"IPv4Address": "10.0.0.186/24",

"IPv6Address": ""

},

"ingress-sbox": {

"Name": "ingress-endpoint",

"EndpointID": "986f3a9b4552ee28538135d24459d648a59fea78ae469ef08464b06ba00dc8b0",

"MacAddress": "02:42:0a:00:00:06",

"IPv4Address": "10.0.0.6/24",

"IPv6Address": ""

}

},

"Options": {

"com.docker.network.driver.overlay.vxlanid_list": "4096"

},

"Labels": {},

"Peers": [

{

"Name": "3ce0cc96ea64",

"IP": "192.168.90.212"

},

{

"Name": "6b0cd2b904e8",

"IP": "192.168.90.213"

},

{

"Name": "28f90f999bef",

"IP": "192.168.90.214"

},

{

"Name": "677884795e0e",

"IP": "192.168.90.211"

},

{

"Name": "295a2e8a0868",

"IP": "192.168.90.215"

},

{

"Name": "f3d9c50f8c3e",

"IP": "192.168.90.216"

},

{

"Name": "0403ffa52d19",

"IP": "192.168.90.217"

},

{

"Name": "628b1ba6aef7",

"IP": "192.168.90.218"

}

]

}

]

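The gateway address of this network is 10.0.0.1, and the network ID is qh13cp9hcd9f9qqmfll9u7vwu.

4.2. Inspecting the ingress network namespace

Change into the directory with cd /var/run/docker/netns; the entry 1-qh13cp9hcd corresponds to this network ID: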

root@wss05:/var/run/docker/netns# ll -al

total 0

drwxr-xr-x 2 root root 140 Apr 19 09:37 ./

drwx------ 8 root root 180 Apr 9 02:09 ../

-r--r--r-- 1 root root 0 Apr 8 03:20 1-qh13cp9hcd

-r--r--r-- 1 root root 0 Apr 19 09:37 1-wdwvazbewe

-r--r--r-- 1 root root 0 Apr 19 09:37 68395c879fa8

-r--r--r-- 1 root root 0 Apr 8 03:20 ingress_sbox

-r--r--r-- 1 root root 0 Apr 19 09:37 lb_wdwvazbew
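
In this listing, the overlay namespace name looks like a numeric prefix, a dash, and the leading characters of the network ID. The sketch below finds the matching entry on that assumption; the naming pattern is inferred from the output above rather than taken from Docker's documentation, and the function name is hypothetical.

```python
def overlay_netns_candidates(network_id, entries):
    """Entries shaped like '<digits>-<prefix of network_id>'."""
    matches = []
    for name in entries:
        head, sep, tail = name.partition("-")
        # Keep names such as '1-qh13cp9hcd' whose tail starts the network ID.
        if sep and head.isdigit() and tail and network_id.startswith(tail):
            matches.append(name)
    return matches


ENTRIES = ["1-qh13cp9hcd", "1-wdwvazbewe", "68395c879fa8", "ingress_sbox", "lb_wdwvazbew"]
print(overlay_netns_candidates("qh13cp9hcd9f9qqmfll9u7vwu", ENTRIES))  # ['1-qh13cp9hcd']
```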

Run on the host: nsenter --net=1-qh13cp9hcd ip a

root@wss05:/var/run/docker/netns# nsenter --net=1-qh13cp9hcd ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

2: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

link/ether 02:50:91:48:7a:4a brd ff:ff:ff:ff:ff:ff

inet 10.0.0.1/24 brd 10.0.0.255 scope global br0

valid_lft forever preferred_lft forever
5: vxlan0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master br0 state UNKNOWN group default

link/ether 02:50:91:48:7a:4a brd ff:ff:ff:ff:ff:ff link-netnsid 0
7: veth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master br0 state UP group default

link/ether ee:2c:44:7c:0f:dc brd ff:ff:ff:ff:ff:ff link-netnsid 1
173: veth16@if172: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master br0 state UP group default

link/ether 1e:71:0f:d1:7c:f5 brd ff:ff:ff:ff:ff:ff link-netnsid 2

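The address 10.0.0.1 belongs to br0, which acts as the gateway. Three interfaces are attached to it, one VXLAN interface and two veth interfaces: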

5: vxlan0@if5

7: veth0@if6

173: veth16@if172

4.2.1. Finding the task container's eth interface

The procedure is the same as in 3.2.1. Finding the task container's eth interface:

root@wss05:/var/run/docker/netns# docker exec -it 7fe83a80e505 ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever
172: eth0@if173: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

link/ether 02:42:0a:00:00:ba brd ff:ff:ff:ff:ff:ff link-netnsid 0

inet 10.0.0.186/24 brd 10.0.0.255 scope global eth0

valid_lft forever preferred_lft forever

174: eth2@if175: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default

link/ether 02:42:ac:12:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 2

inet 172.18.0.3/16 brd 172.18.255.255 scope global eth2

valid_lft forever preferred_lft forever

176: eth1@if177: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

link/ether 02:42:0a:00:01:0d brd ff:ff:ff:ff:ff:ff link-netnsid 1

inet 10.0.1.13/24 brd 10.0.1.255 scope global eth1

valid_lft forever preferred_lft forever

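The container's interface on the ingress network is 172: eth0@if173, which pairs with 173: veth16@if172 in the br0 namespace listing from 4.2.

4.2.2. Finding the ingress-sbox container's eth interface

The procedure is the same as in 3.2.2. Finding the ingress-sbox container's eth interface.

Run on the host: nsenter --net=ingress_sbox ip a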

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever
6: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default

link/ether 02:42:0a:00:00:06 brd ff:ff:ff:ff:ff:ff link-netnsid 0

inet 10.0.0.6/24 brd 10.0.0.255 scope global eth0

valid_lft forever preferred_lft forever

inet 10.0.0.22/32 scope global eth0

valid_lft forever preferred_lft forever

inet 10.0.0.169/32 scope global eth0

valid_lft forever preferred_lft forever

inet 10.0.0.177/32 scope global eth0

valid_lft forever preferred_lft forever

inet 10.0.0.180/32 scope global eth0

valid_lft forever preferred_lft forever

9: eth1@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default

link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 1

inet 172.18.0.2/16 brd 172.18.255.255 scope global eth1

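The ingress-sbox interface on the ingress network is 6: eth0@if7, which pairs with 7: veth0@if6 in the br0 namespace listing from 4.2.

Taken together, the docker_gwbridge network and the ingress network can be drawn as a diagram.

5. Tracing the network access path

Background: the four tables and five chains of iptables.

5.1. Viewing the nat table

Run iptables -nvL -t nat on the host: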

root@wss05:/var/run/docker/netns# iptables -nvL -t nat

......

Chain DOCKER (2 references)

pkts bytes target prot opt in out source destination

0 0 RETURN all -- docker_gwbridge * 0.0.0.0/0 0.0.0.0/0

0 0 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-INGRESS (2 references)

pkts bytes target prot opt in out source destination

0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:31241 to:172.18.0.2:31241

0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:31232 to:172.18.0.2:31232

0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:443 to:172.18.0.2:443

0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 to:172.18.0.2:80

0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:9090 to:172.18.0.2:9090

0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:27017 to:172.18.0.2:27017

183K 26M RETURN all -- * * 0.0.0.0/0 0.0.0.0/0

DNAT means destination address translation: any packet whose destination port is 9090 has its destination rewritten to 172.18.0.2:9090.
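
To make the rewrite concrete, here is a minimal sketch that reads the DNAT lines of the DOCKER-INGRESS chain shown above into a lookup table and applies it to a destination. It is a toy model of the rule matching, not how the kernel implements NAT, and the function names are invented for illustration.

```python
import re

# Extracts protocol, matched port, and the "to:" target from a DNAT rule line.
DNAT_RE = re.compile(r"DNAT\s+(tcp|udp)\s.*\bdpt:(\d+)\s+to:(\d+\.\d+\.\d+\.\d+):(\d+)")

RULES = """\
0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 to:172.18.0.2:80
0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:9090 to:172.18.0.2:9090
183K 26M RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
"""


def parse_dnat_rules(text):
    """Map (proto, dport) to the (ip, port) the rule rewrites to."""
    table = {}
    for line in text.splitlines():
        m = DNAT_RE.search(line)
        if m:
            proto, dport, ip, tport = m.groups()
            table[(proto, int(dport))] = (ip, int(tport))
    return table


def apply_dnat(table, proto, dst_ip, dst_port):
    """Return the destination after DNAT; unmatched traffic is unchanged."""
    return table.get((proto, dst_port), (dst_ip, dst_port))


table = parse_dnat_rules(RULES)
print(apply_dnat(table, "tcp", "192.168.90.215", 9090))  # ('172.18.0.2', 9090)
```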

Run ip route on the host. To reach 172.18.0.2, traffic has to leave through the docker_gwbridge gateway interface (172.18.0.1):

root@wss05:/var/run/docker/netns# ip route

default via 10.0.2.2 dev eth0 proto dhcp src 10.0.2.15 metric 100

10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15

10.0.2.2 dev eth0 proto dhcp scope link src 10.0.2.15 metric 100

172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1

192.168.90.0/24 dev eth1 proto kernel scope link src 192.168.90.215
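
The route that applies to 172.18.0.2 is the most specific prefix containing it. A minimal longest-prefix-match sketch over the table above, as a simplification that ignores metrics and the linkdown flag:

```python
import ipaddress

# (prefix, device) pairs copied from the `ip route` output above.
ROUTES = [
    ("0.0.0.0/0", "eth0"),            # default via 10.0.2.2
    ("10.0.2.0/24", "eth0"),
    ("10.0.2.2/32", "eth0"),
    ("172.17.0.0/16", "docker0"),
    ("172.18.0.0/16", "docker_gwbridge"),
    ("192.168.90.0/24", "eth1"),
]


def lookup(dst, routes=ROUTES):
    """Return the device of the longest prefix that contains dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, dev in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return best[1] if best else None


print(lookup("172.18.0.2"))  # docker_gwbridge
```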

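Taken together, this can be drawn as a diagram.

5.2. Viewing the mangle table

The request has now been steered into the network namespace of the ingress-sbox container. What happens next?

The next step is to look at the mangle table inside that namespace.

Change into the directory: cd /var/run/docker/netns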

root@wss05:/var/run/docker/netns# ll -al

total 0

drwxr-xr-x 2 root root 140 Apr 19 09:37 ./

drwx------ 8 root root 180 Apr 9 02:09 ../

-r--r--r-- 1 root root 0 Apr 8 03:20 1-qh13cp9hcd

-r--r--r-- 1 root root 0 Apr 19 09:37 1-wdwvazbewe

-r--r--r-- 1 root root 0 Apr 19 09:37 68395c879fa8

-r--r--r-- 1 root root 0 Apr 8 03:20 ingress_sbox

-r--r--r-- 1 root root 0 Apr 19 09:37 lb_wdwvazbew

These are Docker's network namespaces.

Run nsenter --net=ingress_sbox iptables -nvL -t mangle on the host:

root@wss05:/var/run/docker/netns# nsenter --net=ingress_sbox iptables -nvL -t mangle

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)

pkts bytes target prot opt in out source destination

0 0 MARK tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:27017 MARK set 0x297

0 0 MARK tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:9090 MARK set 0x2e8

0 0 MARK tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 MARK set 0x2ec

0 0 MARK tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:443 MARK set 0x2ec

0 0 MARK tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:31232 MARK set 0x2ed

0 0 MARK tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:31241 MARK set 0x2ed

......

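Every packet destined for port 9090 is given a MARK. The mark value is 0x2e8, which is 744 in decimal:

8 + 14*16 + 2*16*16 = 744

In effect, packets for port 9090 are tagged with mark 744, and that mark is used later for filtering and routing.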
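
The same conversion can be checked for every mark in the PREROUTING chain above. This short sketch only uses the hex values shown in that output; the dictionary name is arbitrary.

```python
# Destination port -> hex mark, copied from the mangle PREROUTING chain above.
MARKS = {
    27017: "0x297",
    9090: "0x2e8",
    80: "0x2ec",
    443: "0x2ec",
    31232: "0x2ed",
    31241: "0x2ed",
}


def fwmark_decimal(hex_mark):
    """Convert an iptables hex mark such as '0x2e8' to decimal."""
    return int(hex_mark, 16)


for port, mark in MARKS.items():
    print(port, mark, fwmark_decimal(mark))

# Spelled-out positional expansion for 0x2e8: digits 2, e (14), 8.
print(2 * 16 * 16 + 14 * 16 + 8)  # 744
```

The decimal values are the form in which ipvsadm lists firewall marks.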

5.3. IPVS load balancing

If the ipvsadm command is missing, install it, for example: apt install ipvsadm -y

Run nsenter --net=ingress_sbox ipvsadm on the host:

root@wss05:/var/run/docker/netns# nsenter --net=ingress_sbox ipvsadm

IP Virtual Server version 1.2.1 (size=4096)

Prot LocalAddress:Port Scheduler Flags

-> RemoteAddress:Port Forward Weight ActiveConn InActConn

FWM 663 rr

-> 10.0.0.101:0 Masq 1 0 0
FWM 744 rr

-> 10.0.0.170:0 Masq 1 0 0

-> 10.0.0.186:0 Masq 1 0 0

FWM 748 rr

-> 10.0.0.178:0 Masq 1 0 0

-> 10.0.0.179:0 Masq 1 0 0

FWM 749 rr

-> 10.0.0.181:0 Masq 1 0 0

-> 10.0.0.184:0 Masq 1 0 0

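FWM: firewall mark

rr: round-robin scheduling

In other words, when a service consists of several tasks, requests to the service are ultimately spread across the tasks by load balancing. The scheduling policy is round-robin and cannot be changed through the service's properties, although the policy and weights can be modified with the ipvsadm command.

10.0.0.186 is exactly one of the task container's IP addresses.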
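
As an illustration of what rr means for mark 744, the sketch below cycles through the real servers listed by ipvsadm above. It is a user-space toy with equal weights; the real scheduling happens inside the kernel's IPVS module, and the class name is invented.

```python
from itertools import cycle

# Firewall mark -> real servers, copied from the ipvsadm output above.
BACKENDS = {
    663: ["10.0.0.101"],
    744: ["10.0.0.170", "10.0.0.186"],
    748: ["10.0.0.178", "10.0.0.179"],
    749: ["10.0.0.181", "10.0.0.184"],
}


class RoundRobin:
    """Hand out the backends of each firewall mark in rotation."""

    def __init__(self, backends):
        self._iters = {mark: cycle(ips) for mark, ips in backends.items()}

    def pick(self, fwmark):
        return next(self._iters[fwmark])


lb = RoundRobin(BACKENDS)
print([lb.pick(744) for _ in range(4)])
# ['10.0.0.170', '10.0.0.186', '10.0.0.170', '10.0.0.186']
```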

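5.4. Network path access

Continuing from 5.2: once the packet has been marked 744 and has gone through IPVS load balancing, how does it reach the 10.0.0.x subnet?

Run nsenter --net=ingress_sbox ip route on the host: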

root@wss05:/var/run/docker/netns# nsenter --net=ingress_sbox ip route

default via 172.18.0.1 dev eth1
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.6

172.18.0.0/16 dev eth1 proto kernel scope link src 172.18.0.2

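Traffic for 10.0.0.0/24 leaves through eth0, whose address is 10.0.0.6. In the diagram, 10.0.0.6 is the endpoint connected to the br0 gateway.

The network namespace that holds br0 is 1-qh13cp9hcd.

Run nsenter --net=1-qh13cp9hcd ip route on the host: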

root@wss05:/var/run/docker/netns# nsenter --net=1-qh13cp9hcd ip route
10.0.0.0/24 dev br0 proto kernel scope link src 10.0.0.1

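To reach 10.0.0.0/24, traffic goes through the gateway 10.0.0.1, and 10.0.0.1 is br0.

From here there are two cases:

1. If the load balancer picked 10.0.0.186, br0 decides based on the host where the destination lives; because the destination is a local container IP, the request is delivered directly to the local container.

2. If the destination is not local, the request is handed to VXLAN (the tunneling technology).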
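
The two cases above can be sketched as a single decision function. This is a deliberately simplified model: it keys on IP addresses for readability, whereas a real Linux bridge forwards by MAC address using its forwarding database. The port names come from the br0 namespace listing in 4.2; the function and table names are hypothetical.

```python
# IPs reachable through local veth ports of br0 on this host (from section 4.2).
LOCAL_ENDPOINTS = {
    "10.0.0.186": "veth16",  # task container eth0
    "10.0.0.6": "veth0",     # ingress-sbox eth0
}


def br0_forward(dst_ip, local=LOCAL_ENDPOINTS):
    """Deliver locally when the destination is on this host, else use VXLAN."""
    port = local.get(dst_ip)
    if port is not None:
        return ("local", port)
    return ("vxlan", "vxlan0")


print(br0_forward("10.0.0.186"))  # ('local', 'veth16')
print(br0_forward("10.0.0.170"))  # ('vxlan', 'vxlan0')
```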

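Taken together, this can be drawn as a diagram.

Summary: a request to the service on container port 9090

192.168.90.215 ----> iptables nat table (DNAT) ----> docker_gwbridge gateway 172.18.0.1 ---->

ingress-sbox container (namespace) 172.18.0.2 ----> iptables mangle table marks it 744 ---->

IPVS round-robin ----> br0, the ingress network gateway, 10.0.0.1 ----> if 10.0.0.186 is picked, handled by the local container; otherwise sent to the VXLAN network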

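6. VXLAN

VXLAN is a tunneling technology: it re-encapsulates packets of another protocol and sends them on. The new header carries the routing information that lets the encapsulated packet travel between the two tunnel endpoints across a shared network such as the public internet. The logical path the encapsulated packet follows across that network is called the tunnel. Once it reaches the far endpoint, the packet is decapsulated and forwarded to its final destination.

In essence, the original packet and protocol are left unchanged; the packet is wrapped, forwarded to the new destination, and unwrapped there.

VXLAN uses UDP port 4789.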
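
For a concrete view of the wrapping, the sketch below builds and parses the 8-byte VXLAN header as laid out in RFC 7348: one flags byte (0x08 means the VNI is valid), three reserved bytes, a 24-bit VNI, and one reserved byte. It also reproduces the mtu 1450 seen on the overlay interfaces earlier, assuming an outer IPv4 header without options and no VLAN tag. The function names are made up for illustration.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag


def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte. Word 2: VNI in the top three bytes.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vxlan_header(header):
    """Return (flags, vni) from the first 8 bytes of a VXLAN payload."""
    word1, word2 = struct.unpack("!II", header[:8])
    return word1 >> 24, word2 >> 8


header = pack_vxlan_header(4096)
print(header.hex())                 # 0800000000100000
print(unpack_vxlan_header(header))  # (8, 4096)

# Per-packet overhead: outer IPv4 + outer UDP + VXLAN header + inner Ethernet.
OVERHEAD = 20 + 8 + 8 + 14
print(1500 - OVERHEAD)              # 1450
```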

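6.1. Listening with tcpdump

If the tcpdump command is missing, install it, for example: apt install -y tcpdump

To listen on the port, run tcpdump -i eth1 port 4789 on the host.

Here eth1 is the host's network interface and 4789 is the VXLAN port.

6.2. Test

On sw5, run tcpdump -i eth1 port 4789 to listen.

On another host, sw6, run curl against port 9090.

The capture on sw5 then shows: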

IP 10.0.0.186.9090 > 10.0.0.7.52636: Flags [P.], seq 1:580, ack 253, win 501, options [nop,nop,TS val 387656974 ecr 3962031773], length 579

08:07:30.149258 IP sw5.50691 > sw4.4789: VXLAN, flags [I] (0x08), vni 4096

IP 10.0.0.7.52636 > 10.0.0.186.9090: Flags [.], ack 580, win 508, options [nop,nop,TS val 3962031803 ecr 387656974], length 0

08:07:30.150612 IP sw4.35369 > sw1.4789: VXLAN, flags [I] (0x08), vni 4097

IP 10.0.1.14.40734 > 10.0.1.224.27017: Flags [P.], seq 1957:2530, ack 6300, win 502, options [nop,nop,TS val 2546573019 ecr 2825432625], length 573

08:07:30.151267 IP sw1.42226 > sw4.4789: VXLAN, flags [I] (0x08), vni 4097

IP 10.0.1.224.27017 > 10.0.1.14.40734: Flags [P.], seq 6300:6530, ack 2530, win 501, options [nop,nop,TS val 2825432638 ecr 2546573019], length 230

08:07:30.152196 IP sw4.56484 > sw5.4789: VXLAN, flags [I] (0x08), vni 4096

IP 10.0.0.186.9090 > 10.0.0.7.52636: Flags [P.], seq 580:585, ack 253, win 501, options [nop,nop,TS val 387656977 ecr 3962031803], length 5

08:07:30.152471 IP sw5.50691 > sw4.4789: VXLAN, flags [I] (0x08), vni 4096

IP 10.0.0.7.52636 > 10.0.0.186.9090: Flags [.], ack 585, win 509, options [nop,nop,TS val 3962031806 ecr 387656977], length 0

08:07:30.152599 IP sw5.50691 > sw4.4789: VXLAN, flags [I] (0x08), vni 4096

IP 10.0.0.7.52636 > 10.0.0.186.9090: Flags [F.], seq 253, ack 585, win 512, options [nop,nop,TS val 3962031806 ecr 387656977], length 0

08:07:30.153530 IP sw4.56484 > sw5.4789: VXLAN, flags [I] (0x08), vni 4096

IP 10.0.0.186.9090 > 10.0.0.7.52636: Flags [F.], seq 585, ack 254, win 501, options [nop,nop,TS val 387656979 ecr 3962031806], length 0

08:07:30.153698 IP sw5.50691 > sw4.4789: VXLAN, flags [I] (0x08), vni 4096

The vni is the VXLAN ID listed under Options of the ingress network.

Run docker network inspect ingress on the host:

"Options": {

"com.docker.network.driver.overlay.vxlanid_list": "4096"

},
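
To pick out the ingress traffic from a capture like the one above, the outer header lines can be filtered by vni. A minimal sketch, assuming tcpdump prints the outer line in the format shown in 6.2; the regular expression and helper name are illustrative only.

```python
import re
from collections import Counter

# Outer line format: "<time> IP <src>.<port> > <dst>.<port>: VXLAN, flags [I] (0x08), vni <n>"
OUTER_RE = re.compile(
    r"^(\S+) IP (\S+)\.(\d+) > (\S+)\.(\d+): VXLAN, flags \[\w*\] \(0x[0-9a-f]+\), vni (\d+)"
)

CAPTURE = """\
08:07:30.149258 IP sw5.50691 > sw4.4789: VXLAN, flags [I] (0x08), vni 4096
IP 10.0.0.7.52636 > 10.0.0.186.9090: Flags [.], ack 580, win 508, length 0
08:07:30.150612 IP sw4.35369 > sw1.4789: VXLAN, flags [I] (0x08), vni 4097
08:07:30.151267 IP sw1.42226 > sw4.4789: VXLAN, flags [I] (0x08), vni 4097
08:07:30.152196 IP sw4.56484 > sw5.4789: VXLAN, flags [I] (0x08), vni 4096
"""


def parse_outer(line):
    """Parse one outer VXLAN line; return None for inner-packet lines."""
    m = OUTER_RE.match(line.strip())
    if not m:
        return None
    ts, src, _sport, dst, dport, vni = m.groups()
    return {"time": ts, "src": src, "dst": dst, "dport": int(dport), "vni": int(vni)}


INGRESS_VNI = 4096  # vxlanid_list of the ingress network
packets = [p for p in map(parse_outer, CAPTURE.splitlines()) if p]
print(Counter(p["vni"] for p in packets))
print([(p["src"], p["dst"]) for p in packets if p["vni"] == INGRESS_VNI])
```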

End of this chapter.
