Deploying a Kubernetes Cluster with kubeadm
Server environment (2 GB RAM or more, 2 CPUs or more per node):
Kubernetes Master1 node: 172.20.26.34
Kubernetes Master2 node: 172.20.26.36
Kubernetes Node1 node: 172.20.26.37
Kubernetes Node2 node: 172.20.26.38
Operating system: CentOS 7.9
Install the usual utilities on every server before starting the deployment:
yum install vim net-tools lrzsz epel-release -y
yum update
1. K8S node hosts and firewall settings
Apply the following configuration on the Master1, Master2, Node1, and Node2 nodes:
# Add hosts entries
cat >/etc/hosts<<EOF
127.0.0.1 localhost localhost.localdomain
172.20.26.34 master1
172.20.26.36 master2
172.20.26.37 node1
172.20.26.38 node2
EOF
# Disable SELinux (permanently via the config file, immediately via setenforce) and stop the firewall
sed -i '/SELINUX/s/enforcing/disabled/g' /etc/sysconfig/selinux
setenforce 0
systemctl stop firewalld.service
systemctl disable firewalld.service
# Sync node time
yum install ntpdate -y
ntpdate pool.ntp.org
# Set each node's hostname (look up the local IP in /etc/hosts)
hostname $(grep "$(ifconfig | grep broadcast | awk '{print $2}' | head -1)" /etc/hosts | awk '{print $2}'); su
# Disable swap (swap is far slower than RAM; Kubernetes expects it disabled for predictable performance)
swapoff -a  # disable swap for the current boot
sed -ri 's/.*swap.*/#&/' /etc/fstab  # comment out the swap entry so it stays disabled after reboot
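To double-check that swap is really off on each node, a quick verification (not part of the original steps; both are standard util-linux/procps commands):
free -m | grep -i swap    # the Swap line should show 0 total / 0 used
swapon --show             # prints nothing when no swap device is active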
2. Linux kernel parameter settings and tuning
Run the following on the Master1, Master2, Node1, and Node2 nodes.
Load the IPVS modules so Kubernetes can use IPVS-based load balancing:
cat > /etc/modules-load.d/ipvs.conf <<EOF
# Load IPVS at boot
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF
systemctl enable --now systemd-modules-load.service  # load the modules now and on every boot
# Confirm the kernel modules loaded successfully
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
Expected output:
nf_conntrack_ipv4 15053 0
nf_defrag_ipv4 12729 1 nf_conntrack_ipv4
ip_vs_sh 12688 0
ip_vs_wrr 12697 0
ip_vs_rr 12600 0
ip_vs 145497 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 133095 2 ip_vs,nf_conntrack_ipv4
libcrc32c 12644 3 xfs,ip_vs,nf_conntrack
If nothing is shown, try rebooting the machine:
init 6  # reboot the system
# Install ipset and ipvsadm
yum install -y ipset ipvsadm
# Configure kernel parameters (enable bridge netfilter so bridged container traffic is visible to iptables)
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# Reload every sysctl configuration file (including /etc/sysctl.d/k8s.conf)
sysctl --system
# Optionally reload /etc/sysctl.conf only (sysctl --system above already covers the k8s.conf file)
sysctl -p
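The two bridge sysctls only exist once the br_netfilter module is loaded, so if sysctl --system complains about a missing key, load the module first. A minimal verification (standard commands, not from the original text):
modprobe br_netfilter                        # required for the net.bridge.* keys to exist
sysctl net.bridge.bridge-nf-call-iptables    # expected: net.bridge.bridge-nf-call-iptables = 1
sysctl net.bridge.bridge-nf-call-ip6tables   # expected: net.bridge.bridge-nf-call-ip6tables = 1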
Install Docker, kubeadm, and kubelet on all nodes.
3. Installing Docker
Install the dependency packages:
yum install -y yum-utils device-mapper-persistent-data lvm2
Add the Docker repository (using the Aliyun mirror here):
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Install docker-ce (the latest version is installed here):
yum install -y docker-ce
# Create the Docker daemon configuration file
mkdir /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "registry-mirrors": ["https://uyah70su.mirror.aliyuncs.com"]
}
EOF
Note: registry-mirrors is added at the end of the configuration because pulling images directly from Docker Hub can be slow.
mkdir -p /etc/systemd/system/docker.service.d
Start the Docker service:
systemctl daemon-reload
systemctl enable docker.service
systemctl start docker.service
ps -ef|grep -aiE docker
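Since the kubelet in this setup talks to Docker through dockershim, it is worth confirming that Docker actually picked up the systemd cgroup driver from daemon.json; a mismatch here is a common cause of the init failures shown later. A quick check (not part of the original steps):
docker info -f '{{.CgroupDriver}}'             # should print: systemd
docker info | grep -iA1 'registry mirrors'     # confirms the Aliyun mirror is active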
4. Adding the Kubernetes package repository
Docker, etcd, Kubernetes, and the Flannel network will run on the Master1, Master2, Node1, and Node2 nodes.
Add the Kubernetes yum repository on Master1, Master2, Node1, and Node2 with the following commands:
cat >/etc/yum.repos.d/kubernetes.repo <<EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
EOF
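Before installing, it may help to confirm yum can actually see the new repository (a quick check, not part of the original steps):
yum clean all && yum makecache
yum repolist | grep -i kubernetes    # the kubernetes repo should be listed with a non-zero package count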
5. K8S kubeadm
1) Install the kubeadm tools
# Install kubeadm on the Master1, Master2, Node1, and Node2 nodes
# List the available Kubernetes versions
yum list kubelet --showduplicates
yum install -y kubeadm-1.23.1 kubelet-1.23.1 kubectl-1.23.1
# Enable the kubelet service (do this on all nodes; kubeadm starts it properly during init/join)
systemctl enable kubelet.service && systemctl restart kubelet.service
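To confirm all three components landed at the pinned version before continuing, a quick check (assumed commands; the short-output flags are available in v1.23):
kubeadm version -o short           # expected: v1.23.1
kubelet --version                  # expected: Kubernetes v1.23.1
kubectl version --client --short   # expected: Client Version: v1.23.1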
2) Common kubeadm commands
kubeadm init      Bootstrap a Kubernetes control-plane node
kubeadm join      Bootstrap a worker node and join it to the cluster
kubeadm upgrade   Upgrade a Kubernetes cluster to a newer version
kubeadm config    If the cluster was initialized with kubeadm v1.7.x or earlier, some reconfiguration is needed before kubeadm upgrade can be used
kubeadm token     Manage the tokens used by kubeadm join
kubeadm reset     Revert any changes kubeadm init or kubeadm join made to the host
kubeadm version   Print the kubeadm version
kubeadm alpha     Preview a set of new features in order to gather feedback from the community
6. Initializing the K8S Master1 node and joining Master2 to the cluster
1) On the Master1 node, run kubeadm init to install the control-plane components (at least 2 CPUs and 2 GB RAM required):
echo "1" > /proc/sys/net/ipv4/ip_forward  # enable IP forwarding (only until the next reboot; add net.ipv4.ip_forward = 1 to /etc/sysctl.d/k8s.conf to make it persistent)
init 6  # reboot the system if required; re-apply the IP forwarding setting afterwards
kubeadm init --control-plane-endpoint=172.20.26.34:6443 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.23.1 --service-cidr=10.10.0.0/16 --pod-network-cidr=10.244.0.0/16 --upload-certs
The init reported the following errors:
[init] Using Kubernetes version: v1.20.4
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.12. Latest validated version: 19.03
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Fix:
swapoff -a  # disable swap temporarily
sed -ri 's/.*swap.*/#&/' /etc/fstab  # then permanently disable it
Additionally, set a parameter in /etc/sysconfig/kubelet (older versions strictly required swap to be off; newer versions can instead be told to ignore the swap check with a flag):
sed -i 's/KUBELET_EXTRA_ARGS=/KUBELET_EXTRA_ARGS="--fail-swap-on=false"/' /etc/sysconfig/kubelet
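After editing /etc/sysconfig/kubelet, kubelet has to be restarted for the new flag to take effect; and if swap is deliberately left enabled, kubeadm itself also has to be told to skip its swap check. A short sketch (the --ignore-preflight-errors flag is a standard kubeadm option):
systemctl restart kubelet
# Only needed when swap stays enabled on purpose:
# kubeadm init ... --ignore-preflight-errors=Swap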
If a later re-initialization fails like this:
[root@master1 ~]# kubeadm init --control-plane-endpoint=172.20.26.34:6443 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.23.1 --service-cidr=10.10.0.0/16 --pod-network-cidr=10.244.0.0/16 --upload-certs
[init] Using Kubernetes version: v1.23.1
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10259]: Port 10259 is in use
[ERROR Port-10257]: Port 10257 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR Port-2379]: Port 2379 is in use
[ERROR Port-2380]: Port 2380 is in use
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
the ports and manifest files are still occupied by a previous attempt. Run:
kubeadm reset  # reset the state left behind by the earlier kubeadm init/join
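kubeadm reset itself warns that it does not clean up CNI configuration, iptables/IPVS rules, or the old kubeconfig, so on a stubborn node it can help to clear those manually before re-running init (a hedged cleanup sketch; adjust to what is actually present on the host):
rm -rf /etc/cni/net.d $HOME/.kube/config      # stale CNI config and old admin kubeconfig
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm --clear                               # only relevant if IPVS rules were created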
Another possible init failure:
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
Unfortunately, an error has occurred:
        timed out waiting for the condition
This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
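A frequent cause of this particular failure is a cgroup-driver mismatch between Docker and the kubelet, which is exactly what the native.cgroupdriver=systemd setting in daemon.json is meant to prevent. Two checks worth running first (standard Docker/systemd commands, not from the original text):
docker info -f '{{.CgroupDriver}}'                # should print: systemd
journalctl -u kubelet --no-pager | tail -n 30     # look for cgroup or "failed to run Kubelet" errors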
**2) With the fixes above applied, the initialization completes successfully**
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f \[podnetwork\].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 172.20.26.34:6443 --token 5u8maw.7u73q0zq1b0z623d \
        --discovery-token-ca-cert-hash sha256:754abbc6dd80d5da1f56c67f31cddd89c04c5cc39b039e8376f21225d8dec4dcbb \
        --control-plane --certificate-key f26b6b75628f095a5cb4e4c12f3469b8aaced8ccd08674cfded0c36e087ca92e3
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.20.26.34:6443 --token 5u8maw.7u73q0zq1b0z623d \
        --discovery-token-ca-cert-hash sha256:754abbc6dd80d5da1f56c67f31cd89c04c5cc39b039e8376f21225d8dec4dcbb
**3) As shown in the output above, the following commands now need to be run manually on the Master node to copy the admin kubeconfig:**
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf
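At this point kubectl should already reach the API server, even though the node stays NotReady until a pod network is installed. A quick sanity check (not part of the original steps):
kubectl get nodes      # master1 is listed, typically NotReady until the CNI plugin is deployed
kubectl cluster-info   # confirms the control-plane endpoint is reachable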
On the Master1 node, upload kube-flannel.yml to the current directory and run the following command:
[root@master1 ~]# kubectl apply -f kube-flannel.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
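To check that Flannel actually started, it helps to watch its pods and the node status; depending on the manifest version the DaemonSet lives either in kube-system (as in the output above) or in its own kube-flannel namespace. A verification sketch, not from the original text:
kubectl get pods -n kube-system -o wide | grep flannel   # one kube-flannel-ds pod per node, Running
kubectl get nodes                                        # nodes switch to Ready once Flannel is up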
**4) Join the Master2 node to the cluster; the parameters and commands used to join nodes to the K8S cluster are shown below (the token is valid for one day)**
# If the token has expired, a new one can be generated on master1
kubeadm token generate
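kubeadm token generate only produces a token string; for a control-plane join the discovery hash and certificate key are also needed. A sketch of the fuller regeneration workflow (both are standard kubeadm commands):
kubeadm token create --print-join-command         # prints a complete, fresh worker join command
kubeadm init phase upload-certs --upload-certs    # re-uploads the control-plane certs and prints a new --certificate-key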
# Following the success message printed by kubeadm init above, run the command below on Master2 to join it to the cluster as a control-plane node
kubeadm join 172.20.26.34:6443 --token 5u8maw.7u73q0zq1b0z623d \
        --discovery-token-ca-cert-hash sha256:754abbc6dd80d5da1f56c67f31cd89c04c5cc39b039e8376f21225d8dec4dcbb \
        --control-plane --certificate-key f26b6b75628f05a5cb4e4c12f3469b8aaced8ccd08674cfded0c36e087ca92e3
Wait for it to complete; the message confirming that Master2 joined the cluster looks like this:
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
**7. Joining the Node1 node to the cluster**
# Start the Docker engine service on Node1
systemctl start docker.service
# Join Node1 to the K8S cluster
kubeadm join 172.20.26.34:6443 --token 5u8maw.7u73q0zq1b0z623d \
        --discovery-token-ca-cert-hash sha256:754abbc6dd802l5da1f56c67f31cd89c04c5cc39b039e8376f21225d8dec4dcbb
The following message confirms the node joined successfully:
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
# If the join command printed by kubeadm init was not saved, it can be regenerated on the Master node with:
kubeadm token create --print-join-command
# Verify the node list from a K8S Master node
Run the node query command on a Master node:
kubectl get nodes
[root@master1 ~]# kubectl get nodes
NAME      STATUS   ROLES                  AGE     VERSION
master1   Ready    control-plane,master   15m     v1.23.1
master2   Ready    control-plane,master   4m23s   v1.23.1
node1     Ready    <none>