1 Installation Notes
Deployment notes
System and component versions used in this deployment
| Item | Version |
|---|---|
| Operating system | Rocky Linux release 10.1 |
| Kernel version | 6.12.0 |
| Kubernetes | v1.35.0 |
| containerd | 2.2.1 |
| CNI plugins | v1.9.0 |
| crictl | 1.35.0 |
| etcd | 3.6.6-0 |
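2 Before You Begin
A Linux host: a Debian- or Red Hat-compatible distribution, or another distribution without a package manager.
Rocky Linux is installed here as a minimal install, so some common commands may be missing; install them as they are needed.
On systems other than Rocky Linux or its relatives, confirm that the kernel version is ≥ 5.13 (see the official documentation).
At least 4 GB of RAM per machine; at least 4 CPUs are recommended for control-plane nodes.
Full network connectivity between all machines in the cluster.
A unique MAC address and product_uuid on every node.
Swap disabled.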
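The kernel-version and uniqueness requirements above can be checked with a short script. The following is a minimal sketch, assuming GNU `sort -V` is available; the function names `version_ge` and `show_node_identity` are illustrative and not part of any Kubernetes tool.

```bash
#!/usr/bin/env bash
# Sketch of a prerequisite check (hypothetical helper names).

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# show_node_identity: print the values that must be unique on every node.
show_node_identity() {
    echo "kernel:       $(uname -r)"
    echo "product_uuid: $(cat /sys/class/dmi/id/product_uuid 2>/dev/null || echo unreadable)"
    echo "MAC addresses:"
    cat /sys/class/net/*/address 2>/dev/null | sort -u
}

# Usage (run on each node, as root so product_uuid is readable):
#   version_ge "$(uname -r | cut -d- -f1)" 5.13 && echo "kernel OK"
#   show_node_identity
```

Comparing the `show_node_identity` output across the three masters is enough to spot a cloned VM that kept its product_uuid or MAC address.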
3 Cluster Installation
3.1 Basic network, hostname, and static IP configuration
| Node role | Hostname | IP address | Components |
|---|---|---|---|
| Master1 | master01 | 192.168.0.24 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd, kubelet, kube-proxy, container runtime |
| Master2 | master02 | 192.168.0.26 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd, kubelet, kube-proxy, container runtime |
| Master3 | master03 | 192.168.0.27 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd, kubelet, kube-proxy, container runtime |
| VIP | - | 192.168.0.100 | Keepalived VIP |
Kubernetes Service subnet: 10.96.0.0/12
Pod subnet: 10.244.0.0/16
3.2 System environment and basic configuration (all nodes)
3.2.1 Confirm the OS version
cat /etc/redhat-release
# Expected: Rocky Linux release 10.1 (Red Quartz)
3.2.2 Edit /etc/hosts
On all nodes:
echo '192.168.0.24 master01
192.168.0.26 master02
192.168.0.27 master03
' >> /etc/hosts
3.2.3 Disable the firewall and SELinux
systemctl disable --now firewalld
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
3.2.4 Disable swap
swapoff -a
sed -i.bak '/swap/s/^/#/' /etc/fstab
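To confirm that swap is off now and stays off after a reboot, the kernel's swap table and /etc/fstab can both be inspected. This is a minimal sketch; the helper names `swap_entries` and `fstab_swap_entries` are made up for illustration, and both accept an optional file argument so they can be tried on a copy.

```bash
#!/usr/bin/env bash
# Sketch: verify that no swap is active and none will be mounted at boot.

# swap_entries [FILE]: list active swap devices from a /proc/swaps-style file
# (the first line is a header and is skipped).
swap_entries() {
    awk 'NR > 1 && NF > 0 {print $1}' "${1:-/proc/swaps}"
}

# fstab_swap_entries [FILE]: list uncommented fstab entries of type swap.
fstab_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap" {print $1}' "${1:-/etc/fstab}"
}

# Usage:
#   [ -z "$(swap_entries)" ] && echo "no active swap"
#   [ -z "$(fstab_swap_entries)" ] && echo "no swap entry left in /etc/fstab"
```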
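3.2.5 Time synchronization
Rocky Linux uses chrony as its NTP client. Install it if the minimal install lacks it, then make sure chronyd is running:
dnf install -y chrony
systemctl enable --now chronyd
systemctl status chronyd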
3.2.6 System limits (limits)
bash
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
3.2.7 Passwordless SSH login (Master01 -> all nodes)
On the control host Master01:
bash
ssh-keygen -t rsa # press Enter to accept all defaults
for i in master01 master02 master03; do
ssh-copy-id -i ~/.ssh/id_rsa.pub $i
done
3.3 Kernel and IPVS configuration
3.3.1 Install ipvsadm and related modules
bash
dnf install -y ipvsadm ipset sysstat conntrack libseccomp
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack
3.3.2 Load the IPVS modules at boot
bash
cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
EOF
systemctl enable --now systemd-modules-load.service
Check that the modules are loaded:
bash
lsmod | grep -e ip_vs -e nf_conntrack
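3.3.3 Configure kernel parameters
Create /etc/sysctl.d/k8s.conf on all nodes. The net.bridge.* keys exist only once the br_netfilter module is loaded (section 3.4), so sysctl may report them as missing until then.
bash
cat <<EOF > /etc/sysctl.d/k8s.conf
## Network tuning. IPv4 forwarding is required by CNI plugins such as Calico and Cilium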
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_reuse = 2
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.conf.all.route_localnet = 1
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65536
# Increase the SYN (half-open connection) queue length
net.ipv4.tcp_max_syn_backlog = 65536
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.netfilter.nf_conntrack_max = 1048576
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 10
# File system
fs.file-max = 2097152
fs.nr_open = 52706963
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288
# Memory management
vm.swappiness = 0
vm.max_map_count = 262144
vm.overcommit_memory = 1
vm.panic_on_oom = 0
kernel.panic = 10
# Container support
kernel.pid_max = 4194304
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-arptables = 1
# Required by Kubernetes
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
kernel.softlockup_panic = 1
EOF
sysctl --system
Reboot the machine to make sure the changes take effect:
bash
reboot
After the reboot, confirm that the kernel modules are still loaded:
bash
lsmod | grep --color=auto -e ip_vs -e nf_conntrack
3.4 Install containerd and the CRI tools
Configure kernel parameters
Forward IPv4 and let iptables see bridged traffic
bash
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Apply the sysctl parameters without rebooting
sudo sysctl --system
Confirm that the br_netfilter and overlay modules are loaded:
bash
lsmod | grep br_netfilter
lsmod | grep overlay
Check that the following kernel parameters are set to 1:
bash
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
3.4.1 Download and install containerd
bash
wget https://github.com/containerd/containerd/releases/download/v2.2.1/containerd-2.2.1-linux-amd64.tar.gz
tar xvf containerd-2.2.1-linux-amd64.tar.gz
mv bin/* /usr/local/bin/
mkdir /etc/containerd
containerd config default > /etc/containerd/config.toml
3.4.2 containerd systemd unit file
bash
cat > /usr/lib/systemd/system/containerd.service <<EOF
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target dbus.service
[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now containerd
3.4.3 Install runc
Download: https://github.com/opencontainers/runc/releases/download/v1.4.0/runc.amd64
bash
install -m 755 runc.amd64 /usr/local/sbin/runc
3.4.4 Install the CNI plugins
bash
mkdir -p /opt/cni/bin
tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.9.0.tgz
3.4.5 Install crictl
bash
tar -xf crictl-v1.35.0-linux-amd64.tar.gz -C /usr/local/bin
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 30
debug: false
pull-image-on-create: false
EOF
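3.4.6 Enable the systemd cgroup driver
See the official documentation for details on cgroup drivers.
Edit the corresponding section of /etc/containerd/config.toml (containerd 2.x, config version 3):
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
ShimCgroup = '' # add the next line below this one
SystemdCgroup = true # this line is absent by default
Restart containerd:
systemctl restart containerd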
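The manual edit above can also be scripted. The following is a minimal sketch, assuming GNU sed and a config in which `ShimCgroup` appears in the runc options table; `enable_systemd_cgroup` is a hypothetical helper name, not a containerd command. If the file defines several runtimes, each `ShimCgroup` line gets the new setting, so review the result before restarting.

```bash
#!/usr/bin/env bash
# Sketch: set SystemdCgroup = true in a containerd config file, idempotently.

# enable_systemd_cgroup FILE
enable_systemd_cgroup() {
    local cfg="$1"
    if grep -qE '^[[:space:]]*SystemdCgroup[[:space:]]*=' "$cfg"; then
        # The key already exists: force its value to true.
        sed -i -E 's/^([[:space:]]*)SystemdCgroup[[:space:]]*=.*/\1SystemdCgroup = true/' "$cfg"
    else
        # The key is absent: add it right after ShimCgroup, reusing its indentation.
        sed -i -E 's/^([[:space:]]*)ShimCgroup[[:space:]]*=.*/&\n\1SystemdCgroup = true/' "$cfg"
    fi
}

# Usage:
#   enable_systemd_cgroup /etc/containerd/config.toml
#   grep -n SystemdCgroup /etc/containerd/config.toml
#   systemctl restart containerd
```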
3.5 High-availability components: HAProxy + Keepalived
3.5.1 Installation
On all master nodes:
bash
dnf install -y haproxy keepalived
3.5.2 Configure HAProxy
All master nodes use the same configuration file, /etc/haproxy/haproxy.cfg:
bash
cat > /etc/haproxy/haproxy.cfg << EOF
global
maxconn 2000
ulimit-n 16384
log 127.0.0.1 local0 err
stats timeout 30s
defaults
log global
mode http
option httplog
timeout connect 5000
timeout client 50000
timeout server 50000
timeout http-request 15s
timeout http-keep-alive 15s
frontend k8s-master
bind 0.0.0.0:8443
mode tcp
option tcplog
tcp-request inspect-delay 5s
default_backend k8s-master
backend k8s-master
mode tcp
balance roundrobin
# TCP health check only; an HTTP check against /healthz would need TLS (check-ssl) on the server lines
option tcp-check
default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
server master01 192.168.0.24:6443 check
server master02 192.168.0.26:6443 check
server master03 192.168.0.27:6443 check
EOF
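3.5.3 Configure Keepalived (differs slightly per node)
The three files differ only in state, mcast_src_ip, and priority. Replace eth0 with the actual interface name on each node. Note that keepalived honors nopreempt only when the initial state is BACKUP.
Master01: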
bash
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state MASTER
interface eth0
mcast_src_ip 192.168.0.24
virtual_router_id 51
priority 100
nopreempt
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.0.100
}
track_script {
chk_apiserver
}
}
EOF
Master02:
bash
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
mcast_src_ip 192.168.0.26
virtual_router_id 51
priority 99
nopreempt
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.0.100
}
track_script {
chk_apiserver
}
}
EOF
Master03:
bash
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
mcast_src_ip 192.168.0.27
virtual_router_id 51
priority 98
nopreempt
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.0.100
}
track_script {
chk_apiserver
}
}
EOF
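Because the three Keepalived files above differ in only a few values, they can be rendered from one template. This is a sketch under that assumption; `render_keepalived` and its argument order are illustrative, and the defaults (eth0, 192.168.0.100) simply mirror the values used in this guide.

```bash
#!/usr/bin/env bash
# Sketch: render a per-node keepalived.conf from shared settings.

# render_keepalived STATE SRC_IP PRIORITY [INTERFACE] [VIP]
render_keepalived() {
    local state="$1" src_ip="$2" priority="$3"
    local iface="${4:-eth0}" vip="${5:-192.168.0.100}"
    cat <<EOF
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state ${state}
    interface ${iface}
    mcast_src_ip ${src_ip}
    virtual_router_id 51
    priority ${priority}
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        ${vip}
    }
    track_script {
        chk_apiserver
    }
}
EOF
}

# Usage on master02:
#   render_keepalived BACKUP 192.168.0.26 99 > /etc/keepalived/keepalived.conf
```

Here the heredoc delimiter is deliberately unquoted, since the `${...}` placeholders are meant to be expanded when the file is rendered.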
Health-check script /etc/keepalived/check_apiserver.sh:
bash
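# Quote the delimiter so that $(...) and $err are written literally instead of being expanded now
cat > /etc/keepalived/check_apiserver.sh << 'EOF'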
#!/bin/bash
err=0
for k in $(seq 1 3)
do
check_code=$(pgrep haproxy)
if [[ $check_code == "" ]]; then
err=$(expr $err + 1)
sleep 1
continue
else
err=0
break
fi
done
if [[ $err != "0" ]]; then
echo "systemctl stop keepalived"
/usr/bin/systemctl stop keepalived
exit 1
else
exit 0
fi
EOF
chmod +x /etc/keepalived/check_apiserver.sh
Start the services:
bash
systemctl daemon-reload
systemctl enable --now haproxy
systemctl enable --now keepalived
systemctl status keepalived haproxy
Check that the VIP answers ping:
bash
ping 192.168.0.100
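A ping only shows that some node answers for the VIP. To see which master currently holds it, the address list of each node can be inspected. A minimal sketch follows; `holds_vip` is a hypothetical helper that reads `ip -o -4 addr show` output on stdin.

```bash
#!/usr/bin/env bash
# Sketch: report whether this node currently holds the Keepalived VIP.

# holds_vip VIP: succeeds if VIP appears as an assigned IPv4 address.
# Field 4 of `ip -o -4 addr show` is ADDRESS/PREFIX.
holds_vip() {
    awk -v vip="$1" '{ split($4, a, "/"); if (a[1] == vip) found = 1 }
                     END { exit !found }'
}

# Usage (run on each master):
#   ip -o -4 addr show | holds_vip 192.168.0.100 && echo "VIP is on $(hostname)"
```

Stopping haproxy on the node that holds the VIP and re-running the check on the other masters is a simple way to exercise failover.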
3.6 Install the core Kubernetes components: kubeadm, kubelet, kubectl
3.6.1 Configure the yum repository
bash
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.35/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.35/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
3.6.2 Install and enable the services
bash
dnf install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
systemctl enable --now kubelet
3.7 Initialize Master01 (the first control-plane node)
3.7.1 List the required images and pre-pull them
bash
kubeadm config images list
Required images (version v1.35.0):
registry.k8s.io/kube-apiserver:v1.35.0
registry.k8s.io/kube-controller-manager:v1.35.0
registry.k8s.io/kube-scheduler:v1.35.0
registry.k8s.io/kube-proxy:v1.35.0
registry.k8s.io/coredns/coredns:v1.13.1
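registry.k8s.io/etcd:3.6.6-0
Note: in this deployment the command printed the etcd tag as 3.6.6, and pulling that tag failed; appending -0 fixed it. This will probably not be an issue in later releases.
Example of importing images:
bash
ctr -n k8s.io image import <image-file> # or push the images to your own registry and pull them from there
# After the import, list the images with crictl; ctr can list them too, but its output is harder to read
crictl images
# containerd separates images by namespace, and the CRI uses k8s.io; a Docker-compatible CLI such as nerdctl is a friendlier way to manage containerd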
ctr -n k8s.io images ls
3.7.2 Generate and edit the init configuration file
bash
kubeadm config print init-defaults > kubeadm-init.yaml
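Edit the generated kubeadm-init.yaml; for example:
The configuration below uses stacked etcd, that is, etcd runs on the control-plane nodes and is managed by kubeadm. It can be changed to an external etcd cluster, for example one installed from binaries.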
bash
cat > ./kubeadm-init.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta4
# Bootstrap token (the defaults are fine)
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
# Local API endpoint: the IP address of master01
localAPIEndpoint:
  advertiseAddress: 192.168.0.24
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  imagePullSerial: true
  name: master01
  taints: null
# Timeouts (the defaults are fine)
timeouts:
  controlPlaneComponentHealthCheck: 4m0s
  discovery: 5m0s
  etcdAPICall: 2m0s
  kubeletHealthCheck: 4m0s
  kubernetesAPICall: 1m0s
  tlsBootstrap: 5m0s
  upgradeManifests: 5m0s
---
apiServer: {}
apiVersion: kubeadm.k8s.io/v1beta4
caCertificateValidityPeriod: 87600h0m0s
certificateValidityPeriod: 8760h0m0s
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
encryptionAlgorithm: RSA-2048
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.35.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
# For a non-HA cluster, delete the next line
controlPlaneEndpoint: "192.168.0.100:8443"
proxy: {}
scheduler: {}
EOF
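External etcd configuration. Replace the local block:
bash
etcd:
  local:
    dataDir: /var/lib/etcd
with an external block, filling in the endpoints and certificate paths of your etcd cluster:
bash
etcd:
  external:
    endpoints:
    - https://etcd-node1.example.com:2379
    - https://etcd-node2.example.com:2379
    - https://etcd-node3.example.com:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
Notes:
endpoints: the client URLs of the external etcd cluster members (run at least three members for HA; a two-member cluster tolerates no failure).
caFile: the etcd CA certificate, used to verify the etcd servers over TLS.
certFile / keyFile: the client certificate and key that the apiserver uses to talk to etcd.
local and external are mutually exclusive. When using external, remove the local etcd block from the same configuration file.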
3.7.3 Run the initialization
- Initialization generates the certificates and configuration files under /etc/kubernetes; the other master nodes then join Master01.
- Append --v=5 to the command to see detailed logs during initialization.
bash
kubeadm init --config kubeadm-init.yaml --upload-certs
If initialization fails, reset and try again:
bash
kubeadm reset -f ; ipvsadm --clear ; rm -rf ~/.kube
After a successful initialization, configure kubeconfig:
bash
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
# Or, as the root user:
export KUBECONFIG=/etc/kubernetes/admin.conf
A successful initialization prints output like the following:
bash
[addons] Applied essential addon: CoreDNS
W0531 11:03:13.052324 10273 endpoint.go:56] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
  export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes running the following command on each as root:
  kubeadm join 192.168.0.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ae1d135fcfee8652680f6bdfd5c3b465af4f4079012432ee6e16c085e3480f88 \
    --control-plane --certificate-key f0df24280384c7aae5385cf735172938fcdc3624df2035084c06211f97d029fc
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.0.100:8443 --token abcdef.0123456789abcdef \
  --discovery-token-ca-cert-hash sha256:ae1d135fcfee8652680f6bdfd5c3b465af4f4079012432ee6e16c085e3480f88
[root@master01 ~]#
Next, deploy a Pod network to the cluster. Choose one of the options listed at the following link and run kubectl apply -f podnetwork.yaml:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
To add further control-plane nodes, run the following command as root on each of them:
bash
kubeadm join 192.168.0.100:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ae1d135fcfee8652680f6bdfd5c3b465af4f4079012432ee6e16c085e3480f88 \
--control-plane --certificate-key f0df24280384c7aae5385cf735172938fcdc3624df2035084c06211f97d029fc
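Note that the certificate-key gives access to sensitive cluster data, so keep it secret. As a safeguard, the uploaded certificates are deleted after two hours; if needed, run "kubeadm init phase upload-certs --upload-certs" to upload them again.
To add worker nodes, run the following command as root on each of them: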
bash
kubeadm join 192.168.0.100:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:ae1d135fcfee8652680f6bdfd5c3b465af4f4079012432ee6e16c085e3480f88
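The bootstrap token above expires after 24 hours (the ttl in kubeadm-init.yaml) and the uploaded certificates after two hours, so the join commands eventually need to be regenerated. The sketch below assembles a fresh control-plane join command; `control_plane_join` is a hypothetical helper, and taking the last output line of `kubeadm init phase upload-certs --upload-certs` as the certificate key is an assumption about that command's output layout.

```bash
#!/usr/bin/env bash
# Sketch: rebuild the control-plane join command after the token or certs expire.

# control_plane_join WORKER_JOIN_CMD CERT_KEY
# Appends the control-plane flags to a worker join command.
control_plane_join() {
    printf '%s --control-plane --certificate-key %s\n' "$1" "$2"
}

# Usage on master01, as root (requires a running cluster):
#   join_cmd="$(kubeadm token create --print-join-command)"
#   cert_key="$(kubeadm init phase upload-certs --upload-certs | tail -n 1)"
#   echo "$join_cmd"                              # for worker nodes
#   control_plane_join "$join_cmd" "$cert_key"    # for control-plane nodes
```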
3.8 Deploy the network plugin (Calico)
Download: https://github.com/projectcalico/calico/blob/v3.31.3/manifests/calico-etcd.yaml
After downloading, edit the manifest:
bash
# Set the etcd endpoints (the three masters, since etcd is stacked)
sed -i 's#etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"#etcd_endpoints: "https://192.168.0.24:2379,https://192.168.0.26:2379,https://192.168.0.27:2379"#g' calico-etcd.yaml
# Add the certificates
ETCD_CA=`cat /etc/kubernetes/pki/etcd/ca.crt | base64 | tr -d '\n'`
ETCD_CERT=`cat /etc/kubernetes/pki/etcd/server.crt | base64 | tr -d '\n'`
ETCD_KEY=`cat /etc/kubernetes/pki/etcd/server.key | base64 | tr -d '\n'`
sed -i "s@# etcd-key: null@etcd-key: ${ETCD_KEY}@g; s@# etcd-cert: null@etcd-cert: ${ETCD_CERT}@g; s@# etcd-ca: null@etcd-ca: ${ETCD_CA}@g" calico-etcd.yaml
# Set the certificate paths
sed -i 's#etcd_ca: ""#etcd_ca: "/calico-secrets/etcd-ca"#g; s#etcd_cert: ""#etcd_cert: "/calico-secrets/etcd-cert"#g; s#etcd_key: "" #etcd_key: "/calico-secrets/etcd-key" #g' calico-etcd.yaml
# Set the Pod subnet
POD_SUBNET="10.244.0.0/16"
sed -i 's@# - name: CALICO_IPV4POOL_CIDR@- name: CALICO_IPV4POOL_CIDR@g; s@# value: "192.168.0.0/16"@ value: '"${POD_SUBNET}"'@g' calico-etcd.yaml
Once everything is edited, review the file and deploy it:
bash
kubectl create -f calico-etcd.yaml
After a successful deployment, the nodes report Ready:
bash
[root@master01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
master01 Ready control-plane 25m v1.35.0
master02 Ready control-plane 24m v1.35.0
master03 Ready control-plane 23m v1.35.0
3.9 Deploy Metrics Server
Remove the control-plane taint before installing:
bash
kubectl taint node --all node-role.kubernetes.io/control-plane:NoSchedule-
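The official manifest is at https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml; applied unmodified, it fails with a missing-certificate error.
The version below adds the certificate path and the matching volume mount. The certificate is /etc/kubernetes/pki/front-proxy-ca.crt, which is generated automatically when the cluster is deployed.
Then install Metrics Server: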
bash
cat > ./components.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - appProtocol: https
    name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
        - --requestheader-username-headers=X-Remote-User
        - --requestheader-group-headers=X-Remote-Group
        - --requestheader-extra-headers-prefix=X-Remote-Extra-
        image: registry.k8s.io/metrics-server/metrics-server:v0.8.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 10250
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
          seccompProfile:
            type: RuntimeDefault
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
        - mountPath: /etc/kubernetes/pki
          name: k8s-certs
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
      - hostPath:
          path: /etc/kubernetes/pki
        name: k8s-certs
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
EOF
kubectl create -f components.yaml
3.10 Switch kube-proxy to IPVS mode
bash
kubectl edit cm kube-proxy -n kube-system
# Change mode to "ipvs"
Restart the kube-proxy Pods:
bash
kubectl patch daemonset kube-proxy -n kube-system -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"$(date +'%s')\"}}}}}"
Verify the mode:
curl 127.0.0.1:10249/proxyMode
# Should print ipvs
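The interactive `kubectl edit` step can be replaced by a pipeline when the change has to be scripted. This is a sketch that assumes the ConfigMap still carries the default empty value `mode: ""`; `set_proxy_mode_ipvs` is an illustrative helper name.

```bash
#!/usr/bin/env bash
# Sketch: switch the kube-proxy mode without opening an editor.

# set_proxy_mode_ipvs: filter stdin, rewriting only a line of the form `mode: ""`.
# Keys such as detectLocalMode are left untouched because the match is anchored
# at the start of the (indented) line.
set_proxy_mode_ipvs() {
    sed -E 's/^([[:space:]]*)mode:[[:space:]]*""[[:space:]]*$/\1mode: "ipvs"/'
}

# Usage (requires cluster access):
#   kubectl -n kube-system get cm kube-proxy -o yaml \
#     | set_proxy_mode_ipvs \
#     | kubectl apply -f -
```

The kube-proxy Pods still have to be restarted afterwards, as described above, before the new mode takes effect.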
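4 Notes
Certificates issued by kubeadm are valid for one year by default; in production, consider a longer validity period or automated renewal.
The control-plane components (kube-apiserver, controller-manager, scheduler, etcd) run as static Pods whose manifests live in /etc/kubernetes/manifests; after a manifest changes, kubelet restarts the corresponding Pod automatically.
kubelet configuration lives in /etc/sysconfig/kubelet and /var/lib/kubelet/config.yaml.
By default, control-plane/master nodes carry a taint that keeps ordinary Pods from being scheduled on them; to run Pods on a master, remove the taint: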
bash
## Show the taints
kubectl describe node | grep Taint
## Remove the taint
kubectl taint node --all node-role.kubernetes.io/control-plane:NoSchedule-
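Regarding the one-year certificate lifetime noted at the start of this section, the remaining validity can be checked ahead of time. The sketch below assumes openssl and GNU date are installed; `days_left` and `cert_days_left` are hypothetical helper names. kubeadm itself offers `kubeadm certs check-expiration` and `kubeadm certs renew all` for the same purpose.

```bash
#!/usr/bin/env bash
# Sketch: compute how many days remain before a certificate expires.

# days_left END_EPOCH [NOW_EPOCH]: whole days between now and the expiry time.
days_left() {
    local end="$1" now="${2:-$(date +%s)}"
    echo $(( (end - now) / 86400 ))
}

# cert_days_left CERT_FILE: days until a PEM certificate's notAfter date.
cert_days_left() {
    local not_after
    not_after="$(openssl x509 -noout -enddate -in "$1" | cut -d= -f2)"
    days_left "$(date -d "$not_after" +%s)"
}

# Usage on a control-plane node:
#   cert_days_left /etc/kubernetes/pki/apiserver.crt
#   kubeadm certs check-expiration
```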