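Kubernetes Fundamentals: A Hands-On Introduction
This article builds a Kubernetes v1.30.14 cluster from scratch on real Huawei Cloud FlexusX ECS servers (4 instances, 8 vCPU / 16 GiB each) and then works through each topic hands-on. All command output comes from the real environment.
Preface
Kubernetes (K8s), often described as the operating system of the cloud-native era, has become the de facto standard for container orchestration. This article is practice-driven and covers:
- K8s cluster architecture and setup (kubeadm + containerd + Flannel)
- Core concepts: Pod / Deployment / Service
- Configuration and storage: ConfigMap / Secret / PV / PVC
- Quota management: ResourceQuota / LimitRange
Pitfall log: this article records all the problems encountered during setup (containerd registry mirrors, pulling the Flannel image, kubectl configuration, and so on), each with its solution.
1. Environment Preparation
1.1 Server Plan
| Node role | Private IP | Hostname | Spec |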
|---|---|---|---|
| Master | 192.168.0.195 | k8s-master | FlexusX 8vCPU/16GiB |
| Worker | 192.168.0.198 | k8s-node1 | FlexusX 8vCPU/16GiB |
| Worker | 192.168.0.94 | k8s-node2 | FlexusX 8vCPU/16GiB |
| Worker | 192.168.0.237 | k8s-node3 | FlexusX 8vCPU/16GiB |
1.2 Architecture Diagram
┌──────────────────────────────────────────────────────────────┐
│                      Kubernetes Cluster                      │
│                                                              │
│  ┌───────────────┐       ┌──────────┐      ┌──────────┐      │
│  │    Master     │       │  Node1   │      │  Node2   │      │
│  │(control plane)│───────│ (Worker) │      │ (Worker) │      │
│  │ 192.168.0.195 │       │          │      │          │      │
│  └───────┬───────┘       └────┬─────┘      └────┬─────┘      │
│          │                    │                 │            │
│          └────────────────────┴─────────────────┘            │
│                       etcd (on Master)                       │
└──────────────────────────────────────────────────────────────┘
1.3 Initialize All Nodes
Run the following on every node (in this setup it was applied in bulk by a script):
bash
# 1. Disable swap
swapoff -a
sed -i '/swap/s/^/#/' /etc/fstab
# 2. Load kernel modules
cat > /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
# 3. Configure kernel parameters
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
# 4. Configure containerd (systemd cgroup driver, pause image from the Aliyun registry)
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sed -i 's|sandbox_image = "registry.k8s.io/pause|sandbox_image = "registry.aliyuncs.com/google_containers/pause|' /etc/containerd/config.toml
systemctl restart containerd
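Before moving on, it can help to confirm that these settings actually took effect. The following is a minimal sketch and not part of the original setup script: the function names and the optional file-path arguments are hypothetical, added only so that the checks can also be pointed at sample files.

```shell
# Hypothetical helper functions; the names are illustrative only.

# Succeeds when no swap device is active. /proc/swaps always starts with a
# header line, so any further non-empty line means swap is still on.
swap_is_off() {
  swaps_file="${1:-/proc/swaps}"
  [ "$(tail -n +2 "$swaps_file" | grep -c .)" -eq 0 ]
}

# Succeeds when the module shows up in a /proc/modules style listing.
# A module compiled into the kernel is not listed there, so a failure
# here is a hint to look closer, not proof that the feature is missing.
module_loaded() {
  modules_file="${2:-/proc/modules}"
  grep -q "^$1 " "$modules_file"
}

# Succeeds when a sysctl file such as /proc/sys/net/ipv4/ip_forward contains 1.
sysctl_enabled() {
  [ "$(cat "$1")" = "1" ]
}
```

On a node, one possible use is `swap_is_off && module_loaded br_netfilter && sysctl_enabled /proc/sys/net/ipv4/ip_forward && echo "node looks ready for kubeadm"`.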
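1.4 Pitfall: containerd Registry Mirror
Problem: images such as Flannel and Nginx come from docker.io, and pulls from mainland China are extremely slow or time out.
Solution: configure containerd to use the docker.1ms.run mirror, and pull the images into the k8s.io namespace:
bash
# Create the mirror configuration
# (the CRI plugin reads this directory only if config_path in
#  /etc/containerd/config.toml points to /etc/containerd/certs.d)
mkdir -p /etc/containerd/certs.d/docker.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml << TOML
[host."https://docker.1ms.run"]
  capabilities = ["pull", "resolve"]
TOML
# Manually pull the image into the k8s.io namespace (this is the key step!)
ctr -n k8s.io images pull docker.1ms.run/flannel/flannel:v0.25.1
ctr -n k8s.io images tag docker.1ms.run/flannel/flannel:v0.25.1 docker.io/flannel/flannel:v0.25.1
Pitfall: unless told otherwise, ctr images tag creates the reference in the default namespace, but the kubelet uses the k8s.io namespace. You must use ctr -n k8s.io images tag for the tag to take effect!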
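The pull-then-retag pattern repeats for every docker.io image the cluster needs, so it may be convenient to generate the commands. The sketch below only prints the two ctr commands for a given image and does not run them; the helper names are hypothetical, and it assumes the mirror serves images under the same repository path as docker.io, as in the Flannel example above.

```shell
# Hypothetical helpers; MIRROR_HOST is the mirror used in this article.
MIRROR_HOST="docker.1ms.run"

# Rewrite a docker.io reference so that it points at the mirror host.
mirror_ref() {
  printf '%s/%s\n' "$MIRROR_HOST" "${1#docker.io/}"
}

# Print (not run) the commands that pull through the mirror and retag the
# image inside the k8s.io namespace, which is the one the kubelet reads.
print_pull_commands() {
  echo "ctr -n k8s.io images pull $(mirror_ref "$1")"
  echo "ctr -n k8s.io images tag $(mirror_ref "$1") $1"
}

print_pull_commands docker.io/flannel/flannel:v0.25.1
```

Printing first makes it easy to review the commands before piping them to a shell on each node.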
2. Installing the Kubernetes Cluster
2.1 Install kubeadm / kubelet / kubectl
Configure the Aliyun Kubernetes apt repository (v1.30) on every node:
bash
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/deb/ /" > /etc/apt/sources.list.d/kubernetes.list
apt update -y
apt install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
Verify the version:
bash
$ kubeadm version --output short
kubeadm v1.30.14
2.2 Initialize the Master Node
bash
kubeadm init \
--apiserver-advertise-address=192.168.0.195 \
--image-repository=registry.aliyuncs.com/google_containers \
--pod-network-cidr=10.244.0.0/16 \
--kubernetes-version=v1.30.14
Output (key part):
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
The join command with its token (valid for 24 hours):
bash
kubeadm join 192.168.0.195:6443 --token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
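Because the token expires after 24 hours, a node added later needs a fresh one. A minimal sketch, assuming it runs as root on the control-plane node; the wrapper function name is hypothetical, while `kubeadm token create --print-join-command` is the stock kubeadm subcommand that creates a new bootstrap token and prints the matching join command.

```shell
# Hypothetical wrapper: prints a ready-to-paste "kubeadm join ..." line
# backed by a newly created bootstrap token.
print_join_command() {
  kubeadm token create --print-join-command
}

# Usage on the control-plane node:
#   print_join_command
```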
2.3 Configure kubectl
bash
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
2.4 Install the Flannel CNI
bash
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Pitfall: github.com and docker.io cannot be reached directly from mainland China. Pull the Flannel image on every node first:
bash
ctr -n k8s.io images pull docker.1ms.run/flannel/flannel:v0.25.1
ctr -n k8s.io images tag docker.1ms.run/flannel/flannel:v0.25.1 docker.io/flannel/flannel:v0.25.1
2.5 Joining the Worker Nodes
Run the join command on the 3 Worker nodes:
bash
kubeadm join 192.168.0.195:6443 --token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
Verify the node status:
bash
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP ...
k8s-master Ready control-plane 28m v1.30.14 192.168.0.195 ...
k8s-node1 Ready <none> 119s v1.30.14 192.168.0.198 ...
k8s-node2 Ready <none> 112s v1.30.14 192.168.0.94 ...
k8s-node3 Ready <none> 105s v1.30.14 192.168.0.237 ...
✅ All 4 nodes are Ready!
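Nodes usually take a short while to become Ready after joining, so a script may want to check the status instead of reading the table by eye. A small sketch: the helper name is hypothetical, and it expects the output of `kubectl get nodes --no-headers` on standard input, treating any STATUS other than exactly "Ready" as not ready.

```shell
# Hypothetical helper: count nodes whose STATUS column is not "Ready".
count_not_ready() {
  awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Sample input modelled on the listing above, with one node still starting up.
printf '%s\n' \
  'k8s-master Ready control-plane 28m v1.30.14' \
  'k8s-node1 NotReady <none> 10s v1.30.14' | count_not_ready
```

Against a live cluster this would be `kubectl get nodes --no-headers | count_not_ready`. kubectl also ships a built-in alternative, `kubectl wait --for=condition=Ready node --all --timeout=300s`.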
3. Basic Structure of a Kubernetes Cluster
3.1 Control-Plane Components (Master Node)
┌──────────────────────────────────────────────────┐
│                   Master Node                    │
│                                                  │
│  ┌─────────────────────┐ ┌─────────────────────┐ │
│  │   kube-apiserver    │ │        etcd         │ │
│  │  (API entry point)  │ │  (key-value store)  │ │
│  └─────────────────────┘ └─────────────────────┘ │
│                                                  │
│  ┌─────────────────────────────────────────────┐ │
│  │         kube-scheduler (scheduler)          │ │
│  └─────────────────────────────────────────────┘ │
│                                                  │
│  ┌─────────────────────────────────────────────┐ │
│  │ kube-controller-manager (controller manager)│ │
│  └─────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────┘
List the control-plane Pods:
bash
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-cb4864fb5-qx289 1/1 Running 0 28m
coredns-cb4864fb5-tv78j 1/1 Running 0 28m
etcd-k8s-master 1/1 Running 0 28m
kube-apiserver-k8s-master 1/1 Running 0 28m
kube-controller-manager-k8s-master 1/1 Running 0 28m
kube-proxy-tpps2 1/1 Running 0 28m
kube-scheduler-k8s-master 1/1 Running 0 28m
3.2 Worker-Node Components
┌──────────────────────────────────────────────────┐
│                   Worker Node                    │
│                                                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  │
│  │   Pod 1    │  │   Pod 2    │  │   Pod 3    │  │
│  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘  │
│        │               │               │         │
│        └───────────────┼───────────────┘         │
│                        │                         │
│             ┌──────────▼──────────┐              │
│             │     kube-proxy      │              │
│             └──────────┬──────────┘              │
│                        │                         │
│             ┌──────────▼──────────┐              │
│             │       kubelet       │              │
│             │    (node agent)     │              │
│             └──────────┬──────────┘              │
│                        │                         │
│             ┌──────────▼──────────┐              │
│             │     containerd      │              │
│             │ (container runtime) │              │
│             └─────────────────────┘              │
└──────────────────────────────────────────────────┘
3.3 Viewing Cluster Information
bash
$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.0.195:6443
CoreDNS is running at https://192.168.0.195:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
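4. Basic kubectl Usage
4.1 Common Commands at a Glance
| Command | Description |
|---|---|
| kubectl get <resource> | List resources |
| kubectl describe <resource> <name> | Show the details of a resource |
| kubectl create -f <file> | Create resources from a YAML file |
| kubectl apply -f <file> | Apply or update a resource configuration |
| kubectl delete <resource> <name> | Delete a resource |
| kubectl logs <pod-name> | Show Pod logs |
| kubectl exec -it <pod-name> -- <command> | Run a command inside a Pod |
| kubectl get <resource> -o wide | Show additional columns |
| kubectl get <resource> -o yaml | Output as YAML |
4.2 Listing API Resources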
bash
$ kubectl api-resources | head -20
NAME SHORTNAMES APIVERSION NAMESPACED KIND
bindings v1 true Binding
componentstatuses cs v1 false ComponentStatus
configmaps cm v1 true ConfigMap
endpoints ep v1 true Endpoints
events ev v1 true Event
limitranges limits v1 true LimitRange
namespaces ns v1 false Namespace
nodes no v1 false Node
persistentvolumeclaims pvc v1 true PersistentVolumeClaim
persistentvolumes pv v1 false PersistentVolume
pods po v1 true Pod
podtemplates v1 true PodTemplate
replicationcontrollers rc v1 true ReplicationController
resourcequotas quota v1 true ResourceQuota
secrets v1 true Secret
serviceaccounts sa v1 true ServiceAccount
services svc v1 true Service
4.3 Inspecting a Node
bash
$ kubectl describe node k8s-master | head -30
Name: k8s-master
Roles: control-plane
Labels: beta.kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-master
Annotations: flannel.alpha.coreos.com/backend-type: vxlan
CreationTimestamp: Mon, 22 Jun 2026 20:35:39 +0800
Taints: node-role.kubernetes.io/control-plane:NoSchedule
Conditions:
Type Status LastHeartbeatTime
NetworkUnavailable False Mon, 22 Jun 2026 21:01:51 +0800
MemoryPressure False Mon, 22 Jun 2026 21:02:14 +0800
DiskPressure False Mon, 22 Jun 2026 21:02:14 +0800
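5. Namespace Basics
5.1 What Is a Namespace
A Namespace is the Kubernetes mechanism for isolating tenants within one cluster. Within a Namespace, names must be unique for each resource kind; across Namespaces the same name can be reused freely.
5.2 Listing Namespaces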
bash
$ kubectl get namespaces
NAME STATUS AGE
default Active 29m # default namespace
kube-flannel Active 26m # Flannel CNI
kube-node-lease Active 29m # node leases
kube-public Active 29m # publicly readable resources
kube-system Active 29m # system components
5.3 Creating a Namespace
bash
$ kubectl create namespace demo
namespace/demo created
$ kubectl get ns demo
NAME STATUS AGE
demo Active 5s
5.4 Creating Resources in Different Namespaces
bash
# Create a Pod in the demo namespace
$ kubectl apply -f nginx-pod.yaml -n demo
# List the Pods in the demo namespace
$ kubectl get pods -n demo
6. Pods in Practice
6.1 Structure of a Pod
A Pod is the smallest schedulable unit in K8s and contains one or more containers:
┌──────────────────────────────────────────────────┐
│                       Pod                        │
│                                                  │
│  ┌─────────────────────┐ ┌─────────────────────┐ │
│  │ Container 1 (Nginx) │ │ Container 2 (logs)  │ │
│  │                     │ │       sidecar       │ │
│  └─────────────────────┘ └─────────────────────┘ │
│                                                  │
│  ┌─────────────────────────────────────────────┐ │
│  │ Shared network / storage                    │ │
│  │  - shared IP address                        │ │
│  │  - shared port space                        │ │
│  │  - shared Volumes                           │ │
│  └─────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────┘
6.2 Creating the First Pod
yaml
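# nginx-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  # No namespace is set here: without -n the Pod lands in the default namespace,
  # and the same file can be applied with "-n demo". A hard-coded
  # "namespace: default" would make "kubectl apply ... -n demo" fail with a
  # namespace mismatch error.
spec:
  containers:
  - name: nginx
    image: docker.io/library/nginx:1.25
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80
Apply the manifest: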
bash
$ kubectl apply -f nginx-pod.yaml
pod/nginx-pod created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-pod 1/1 Running 0 10s
6.3 Inspecting Pod Details
bash
$ kubectl describe pod nginx-pod
Name: nginx-pod
Namespace: default
Priority: 0
Node: k8s-node3/192.168.0.237
Start Time: Mon, 22 Jun 2026 21:05:16 +0800
Status: Running
IP: 10.244.3.2
Containers:
nginx:
Container ID: containerd://...
Image: docker.io/library/nginx:1.25
Port: 80/TCP
State: Running
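6.4 Pod Status
| Status | Description |
|---|---|
| Pending | The Pod has been accepted, but its containers have not been created yet |
| ContainerCreating | Containers are being created (shown by kubectl while the Pod phase is still Pending) |
| Running | The Pod is bound to a node and all containers have been created |
| Succeeded | All containers in the Pod have terminated successfully |
| Failed | All containers in the Pod have terminated, and at least one of them failed |
| Unknown | The state of the Pod could not be obtained |
6.5 Resource Requests and Limits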
yaml
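# nginx-pod-with-resources.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-resources
spec:
  containers:
  - name: nginx
    image: docker.io/library/nginx:1.25
    resources:
      requests:          # basis for scheduling
        cpu: "250m"
        memory: "128Mi"
      limits:            # hard caps
        cpu: "500m"
        memory: "256Mi"
Notes:
- requests: the scheduler uses these values to choose a node
- limits: the most a container may use; exceeding the CPU limit gets the container throttled, exceeding the memory limit gets it OOM-killed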
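The quantity suffixes are easy to misread: "250m" means 250 millicores (a quarter of one CPU core), and "Mi" is a binary unit (1 Mi = 1024 × 1024 bytes). The conversion sketch below uses hypothetical helper names and deliberately handles only whole cores, the "m" suffix, and the Ki/Mi/Gi suffixes, not every quantity format Kubernetes accepts.

```shell
# Convert a CPU quantity ("250m" or a whole number of cores) to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# Convert a memory quantity with a binary suffix (Ki, Mi, Gi) to bytes.
mem_to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *)   echo "$1" ;;
  esac
}

cpu_to_millicores 250m   # prints 250
cpu_to_millicores 2      # prints 2000
mem_to_bytes 128Mi       # prints 134217728
```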
6.6 Startup Command (command / args)
yaml
# busybox-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox-pod
spec:
  containers:
  - name: busybox
    image: docker.io/library/busybox:1.36
    command: ["sh", "-c", "echo Hello K8s && sleep 3600"]
6.7 Health Checks (Probes)
yaml
# nginx-with-probe.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-probe
spec:
  containers:
  - name: nginx
    image: docker.io/library/nginx:1.25
    livenessProbe:       # liveness probe (the container is restarted on failure)
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:      # readiness probe (the Pod is removed from Service endpoints on failure)
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
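6.8 Multi-Container Pods
yaml
# multi-container-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
  - name: nginx
    image: docker.io/library/nginx:1.25
    ports:
    - containerPort: 80
    volumeMounts:        # nginx must write into the shared volume for the sidecar to see its logs
    - name: logs
      mountPath: /var/log/nginx
  - name: log-sidecar    # sidecar container: collects logs
    image: docker.io/library/busybox:1.36
    # -F keeps retrying until nginx has created the file
    command: ["sh", "-c", "tail -F /var/log/nginx/access.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  volumes:
  - name: logs
    emptyDir: {}
6.9 Init Containers
Init containers run before the application containers start, one after another, and each must complete successfully.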
yaml
# pod-with-init.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-init
spec:
  initContainers:
  - name: init-myservice
    image: docker.io/library/busybox:1.36
    command: ['sh', '-c', 'until nslookup myservice; do echo waiting for myservice; sleep 2; done']
  - name: init-mydb
    image: docker.io/library/busybox:1.36
    command: ['sh', '-c', 'until nslookup mydb; do echo waiting for mydb; sleep 2; done']
  containers:
  - name: myapp
    image: docker.io/library/nginx:1.25
Check the Init container status:
bash
$ kubectl get pod pod-with-init
NAME READY STATUS RESTARTS AGE
pod-with-init 0/1 Init:0/2 0 10s
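# There are 2 Init containers and 0 have completed so far; they keep waiting until the names myservice and mydb resolve in cluster DNS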
7. Running Multiple Instances: Deployments in Practice
7.1 The Deployment Concept
A Deployment provides declarative updates for Pods and manages ReplicaSets, which gives you:
- Multiple replicas
- Rolling updates
- Rollback to earlier revisions
- Scaling
┌──────────────────────────────────────────────────┐
│                    Deployment                    │
│                                                  │
│             ┌──────────────────────┐             │
│             │      ReplicaSet      │             │
│             │ (replica controller) │             │
│             └──────────┬───────────┘             │
│                        │                         │
│  ┌────────────┬────────▼───────┬──────────────┐  │
│  │   Pod 1    │     Pod 2      │    Pod 3     │  │
│  └────────────┴────────────────┴──────────────┘  │
└──────────────────────────────────────────────────┘
7.2 Creating a Deployment
yaml
# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3            # number of replicas
  selector:
    matchLabels:
      app: nginx
  template:              # Pod template
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: docker.io/library/nginx:1.25
        ports:
        - containerPort: 80
bash
$ kubectl apply -f nginx-deployment.yaml
deployment.apps/nginx-deployment created
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 3/3 3 3 30s
$ kubectl get pods -l app=nginx
NAME READY STATUS RESTARTS AGE
nginx-deployment-<hash>-abc12 1/1 Running 0 35s
nginx-deployment-<hash>-def34 1/1 Running 0 35s
nginx-deployment-<hash>-ghi56 1/1 Running 0 35s
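7.3 Rolling Updates
Change the image version:
bash
# Note: --record is deprecated but still accepted in v1.30; it fills in the CHANGE-CAUSE column of the rollout history
$ kubectl set image deployment/nginx-deployment nginx=nginx:1.26 --record
deployment.apps/nginx-deployment image updated
# Watch the rolling update progress
$ kubectl rollout status deployment/nginx-deployment
deployment "nginx-deployment" successfully rolled out
Rolling update strategy (defaults):
- maxSurge: 25%: during the update, up to 25% more Pods than the desired replica count may exist
- maxUnavailable: 25%: during the update, at most 25% of the Pods may be unavailable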
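With small replica counts the percentages turn into absolute numbers through rounding: Kubernetes rounds maxSurge up and maxUnavailable down. The arithmetic for the 3-replica Deployment used here can be sketched as follows (the function names are hypothetical):

```shell
# Absolute maxSurge for <replicas> and <percent>: rounded up.
max_surge() {
  echo $(( ($1 * $2 + 99) / 100 ))
}

# Absolute maxUnavailable for <replicas> and <percent>: rounded down.
max_unavailable() {
  echo $(( $1 * $2 / 100 ))
}

replicas=3
echo "maxSurge=$(max_surge "$replicas" 25) maxUnavailable=$(max_unavailable "$replicas" 25)"
# prints: maxSurge=1 maxUnavailable=0
```

So with 3 replicas and the defaults, the update runs with at most 4 Pods in total and never drops below 3 available Pods; new Pods are added one at a time before old ones are removed.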
7.4 Viewing the Rollout History
bash
$ kubectl rollout history deployment/nginx-deployment
deployment.apps/nginx-deployment
REVISION CHANGE-CAUSE
1 <none>
2 kubectl set image deployment/nginx-deployment nginx=nginx:1.26 --record=true
7.5 Rollback
bash
# Roll back to the previous revision
$ kubectl rollout undo deployment/nginx-deployment
deployment.apps/nginx-deployment rolled back
# Roll back to a specific revision
$ kubectl rollout undo deployment/nginx-deployment --to-revision=1
deployment.apps/nginx-deployment rolled back
7.6 Scaling
bash
# Scale manually to 5 replicas
$ kubectl scale deployment/nginx-deployment --replicas=5
deployment.apps/nginx-deployment scaled
$ kubectl get pods -l app=nginx
NAME READY STATUS RESTARTS AGE
nginx-deployment-<hash>-abc12 1/1 Running 0 5m
nginx-deployment-<hash>-def34 1/1 Running 0 5m
nginx-deployment-<hash>-ghi56 1/1 Running 0 5m
nginx-deployment-<hash>-jkl78 1/1 Running 0 20s
nginx-deployment-<hash>-mno90 1/1 Running 0 20s
Autoscaling (HPA):
bash
kubectl autoscale deployment/nginx-deployment --cpu-percent=50 --min=3 --max=10
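To see what the autoscaler would do with these parameters, the replica calculation described in the Kubernetes HPA documentation, desired = ceil(currentReplicas × currentUtilization / targetUtilization), can be sketched in integer arithmetic. The function name is hypothetical and the controller's tolerance band and stabilization window are left out. Two prerequisites are worth keeping in mind: CPU utilization is measured relative to the containers' CPU requests, which the nginx-deployment manifest in this article does not set, and the HPA needs a metrics source such as metrics-server, which kubeadm does not install.

```shell
# Arguments: current replicas, current CPU %, target CPU %, min, max.
hpa_desired_replicas() {
  desired=$(( ($1 * $2 + $3 - 1) / $3 ))   # ceiling division
  if [ "$desired" -lt "$4" ]; then desired=$4; fi
  if [ "$desired" -gt "$5" ]; then desired=$5; fi
  echo "$desired"
}

# 3 replicas at 80% average CPU with a 50% target, bounds 3..10
hpa_desired_replicas 3 80 50 3 10   # prints 5
```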
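7.7 DaemonSet
A DaemonSet ensures that every eligible node runs one copy of a Pod (commonly used for logging and monitoring agents):
yaml
# fluentd-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      tolerations:       # also run on the control-plane node despite its NoSchedule taint
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd
        image: docker.io/library/fluentd:v1.16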
bash
$ kubectl get daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
fluentd 4 4 4 4 4 <none> 30s
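# DESIRED=4: one Pod on each of the 4 nodes (the control-plane node only counts if its NoSchedule taint is tolerated or has been removed)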
8. Accessing Applications: Services in Practice
8.1 The Service Concept
A Service provides a stable access point for a group of Pods that serve the same function:
┌──────────────────────────────────────────────────┐
│                     Service                      │
│                                                  │
│            ┌─────────────────────────┐           │
│            │  ClusterIP: 10.96.0.10  │           │
│            └────────────┬────────────┘           │
│                         │                        │
│            ┌────────────▼────────────┐           │
│            │        Endpoints        │           │
│            │      10.244.1.2:80      │           │
│            │      10.244.2.3:80      │           │
│            │      10.244.3.4:80      │           │
│            └────────────┬────────────┘           │
│                         │                        │
│            ┌────────────▼────────────┐           │
│            │     Pod (replica 1)     │           │
│            ├─────────────────────────┤           │
│            │     Pod (replica 2)     │           │
│            ├─────────────────────────┤           │
│            │     Pod (replica 3)     │           │
│            └─────────────────────────┘           │
└──────────────────────────────────────────────────┘
8.2 ClusterIP Service (the Default)
yaml
# nginx-clusterip-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-clusterip
spec:
  type: ClusterIP
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80             # Service port
    targetPort: 80       # Pod port
bash
$ kubectl apply -f nginx-clusterip-svc.yaml
service/nginx-clusterip created
$ kubectl get svc nginx-clusterip
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-clusterip ClusterIP 10.96.123.45 <none> 80/TCP 10s
# Access from inside the cluster
$ kubectl run -it --rm debug --image=busybox --restart=Never -- wget -O- http://10.96.123.45
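A ClusterIP changes whenever the Service is recreated, so in-cluster clients normally use the Service's DNS name instead. A small sketch of how that name is put together: the helper name is hypothetical, and cluster.local is assumed as the cluster domain, which is the kubeadm default.

```shell
# Build the in-cluster DNS name of a Service.
# Arguments: service name, namespace (default: default), cluster domain (default: cluster.local).
svc_fqdn() {
  echo "$1.${2:-default}.svc.${3:-cluster.local}"
}

svc_fqdn nginx-clusterip        # prints nginx-clusterip.default.svc.cluster.local
svc_fqdn nginx-clusterip demo   # prints nginx-clusterip.demo.svc.cluster.local
```

From a Pod in the same namespace the short name is enough, for example `wget -O- http://nginx-clusterip`.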
8.3 NodePort Service
yaml
# nginx-nodeport-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-nodeport
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 30080      # node port (default range: 30000-32767)
bash
$ kubectl apply -f nginx-nodeport-svc.yaml
service/nginx-nodeport created
$ kubectl get svc nginx-nodeport
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-nodeport NodePort 10.96.234.56 <none> 80:30080/TCP 10s
# Reachable through any node's IP on port 30080
$ curl http://192.168.0.195:30080 # Master node
$ curl http://192.168.0.198:30080 # Node1
8.4 Viewing Endpoints
bash
$ kubectl get endpoints nginx-clusterip
NAME ENDPOINTS AGE
nginx-clusterip 10.244.1.2:80,10.244.2.3:80 1m
9. Configuration Management: ConfigMap & Secret
9.1 ConfigMap
A ConfigMap decouples non-sensitive configuration from the container image.
Creating a ConfigMap
bash
# Create from literal values
$ kubectl create configmap nginx-config --from-literal=nginx.port=80 --from-literal=nginx.host=localhost
configmap/nginx-config created
# Create from a file
$ echo "server { listen 80; }" > nginx.conf
$ kubectl create configmap nginx-conf --from-file=nginx.conf
Using a ConfigMap
yaml
# pod-with-configmap.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-configmap
spec:
  containers:
  - name: nginx
    image: docker.io/library/nginx:1.25
    env:
    - name: NGINX_PORT        # injected as an environment variable
      valueFrom:
        configMapKeyRef:
          name: nginx-config
          key: nginx.port
    volumeMounts:
    - name: config-volume     # mounted as files
      mountPath: /etc/nginx/conf.d
  volumes:
  - name: config-volume
    configMap:
      name: nginx-conf
9.2 Secret
A Secret stores sensitive information (passwords, tokens, keys).
Creating a Secret
bash
# Create from literal values (base64-encoded automatically)
$ kubectl create secret generic db-secret --from-literal=username=admin --from-literal=password='1qaz@WSX'
secret/db-secret created
# Inspect it (the values are base64-encoded)
$ kubectl get secret db-secret -o yaml
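Base64 is an encoding, not encryption: anyone who can read the Secret object can recover the value. The round trip below illustrates this with the sample password from the command above; it is a sketch that assumes a `base64` tool accepting `-d` for decoding, as GNU coreutils does.

```shell
# Encode the sample password the same way it appears in the Secret's data field.
encoded=$(printf '%s' '1qaz@WSX' | base64)
echo "$encoded"

# Decoding requires no key at all.
printf '%s' "$encoded" | base64 -d
echo
```

Against the cluster, the equivalent is `kubectl get secret db-secret -o jsonpath='{.data.password}' | base64 -d`, which is why access to Secrets should be restricted with RBAC.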
Using a Secret
yaml
# pod-with-secret.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-secret
spec:
  containers:
  - name: mysql
    image: docker.io/library/mysql:8.0
    env:
    - name: MYSQL_ROOT_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: password
    volumeMounts:
    - name: secret-volume
      mountPath: /etc/secrets
  volumes:
  - name: secret-volume
    secret:
      secretName: db-secret
10. Storage Management: PV / PVC / StorageClass
10.1 Architecture
┌──────────────────────────────────────────────────┐
│               Storage Architecture               │
│                                                  │
│        ┌──────────┐          ┌──────────┐        │
│        │   Pod    │          │   Pod    │        │
│        └────┬─────┘          └────┬─────┘        │
│             │                     │              │
│        ┌────▼─────┐          ┌────▼─────┐        │
│        │   PVC    │          │   PVC    │        │
│        └────┬─────┘          └────┬─────┘        │
│             │                     │              │
│             └──────────┬──────────┘              │
│                        │                         │
│            ┌───────────▼────────────┐            │
│            │ PV (persistent volume) │            │
│            │  (bound 1:1 to a PVC)  │            │
│            └───────────┬────────────┘            │
│                        │                         │
│            ┌───────────▼────────────┐            │
│            │    Backing storage     │            │
│            │    (NFS/Ceph/local)    │            │
│            └────────────────────────┘            │
└──────────────────────────────────────────────────┘
10.2 Creating a PV (PersistentVolume)
yaml
# pv-local.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-local
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /data/pv-local      # local directory on the node (it must already exist)
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - k8s-node1
10.3 Creating a PVC (PersistentVolumeClaim)
yaml
# pvc-local.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-local
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: local-storage
bash
$ kubectl apply -f pvc-local.yaml
persistentvolumeclaim/pvc-local created
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-local Bound pv-local 5Gi RWO local-storage 10s
# Bound means the PVC is bound to a PV; it gets the whole 5Gi volume even though only 2Gi was requested
10.4 Using a PVC in a Pod
yaml
# pod-with-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-pvc
spec:
  containers:
  - name: nginx
    image: docker.io/library/nginx:1.25
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-local
10.5 StorageClass (Dynamic Provisioning)
yaml
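# storageclass-local.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # local volumes do not support dynamic provisioning
volumeBindingMode: WaitForFirstConsumer     # bind a PVC only once a Pod that uses it is scheduled
10.6 StatefulSet
A StatefulSet manages stateful applications and gives each Pod a stable network identity and persistent storage.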
yaml
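# mysql-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"        # name of the Headless Service
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: docker.io/library/mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD   # the mysql image will not start without a root password setting
          valueFrom:
            secretKeyRef:
              name: db-secret         # the Secret created in section 9.2
              key: password
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:       # a PVC is created automatically for each Pod
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-storage"   # no provisioner: a matching PV must exist for every replica
      resources:
        requests:
          storage: 5Gi
StatefulSet characteristics:
- Fixed Pod names: mysql-0, mysql-1, mysql-2
- Ordered deployment and deletion
- Each Pod has its own PVC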
10.7 Headless Service
A Headless Service gives direct access to Pods without load balancing (a DNS lookup returns the IPs of all Pods).
yaml
# mysql-headless-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None             # the key setting: no ClusterIP
  selector:
    app: mysql
  ports:
  - port: 3306
bash
# The DNS lookup returns all Pod IPs
$ nslookup mysql.default.svc.cluster.local
Name: mysql.default.svc.cluster.local
Address: 10.244.1.10
Address: 10.244.2.20
Address: 10.244.3.30
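Besides the Service name itself, the headless Service gives every StatefulSet Pod its own stable DNS name of the form <pod>.<service>.<namespace>.svc.<cluster-domain>. The sketch below lists those names for the mysql StatefulSet above; the helper name is hypothetical and cluster.local is assumed as the cluster domain.

```shell
# Print the per-Pod DNS names of a StatefulSet behind a headless Service.
# Arguments: StatefulSet name, Service name, replicas, namespace (default: default).
statefulset_pod_names() {
  i=0
  while [ "$i" -lt "$3" ]; do
    echo "$1-$i.$2.${4:-default}.svc.cluster.local"
    i=$(( i + 1 ))
  done
}

statefulset_pod_names mysql mysql 3
# prints:
#   mysql-0.mysql.default.svc.cluster.local
#   mysql-1.mysql.default.svc.cluster.local
#   mysql-2.mysql.default.svc.cluster.local
```

These names stay the same when a Pod is rescheduled and gets a new IP, which is what lets a client address one specific database instance rather than any of them.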
11. Quota Management: ResourceQuota & LimitRange
11.1 ResourceQuota
A ResourceQuota caps the total resource usage of a namespace.
yaml
# resourcequota-demo.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: demo-quota
  namespace: demo
spec:
  hard:
    requests.cpu: "4"         # total CPU requests may not exceed 4 cores
    requests.memory: 8Gi      # total memory requests may not exceed 8Gi
    limits.cpu: "8"           # total CPU limits may not exceed 8 cores
    limits.memory: 16Gi       # total memory limits may not exceed 16Gi
    pods: "10"                # at most 10 Pods
    services: "5"             # at most 5 Services
bash
$ kubectl apply -f resourcequota-demo.yaml -n demo
resourcequota/demo-quota created
$ kubectl describe resourcequota demo-quota -n demo
Name: demo-quota
Namespace: demo
Resource Used Hard
-------- ---- ----
limits.cpu 0 8
limits.memory 0 16Gi
pods 0 10
requests.cpu 0 4
requests.memory 0 8Gi
services 0 5
11.2 LimitRange
A LimitRange sets default resource requests and limits, plus allowed minimums and maximums, for the Pods and containers in a namespace.
yaml
# limitrange-demo.yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: demo-limits
  namespace: demo
spec:
  limits:
  - type: Container
    default:             # default limits
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:      # default requests
      cpu: "250m"
      memory: "256Mi"
    max:                 # upper bound
      cpu: "2"
      memory: "2Gi"
    min:                 # lower bound
      cpu: "100m"
      memory: "128Mi"
bash
$ kubectl apply -f limitrange-demo.yaml -n demo
limitrange/demo-limits created
# A Pod created without resource settings gets the LimitRange defaults automatically
$ kubectl apply -f nginx-pod.yaml -n demo
$ kubectl describe pod nginx-pod -n demo | grep -A5 "Requests:"
Requests:
cpu: 250m
memory: 256Mi
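The two objects work together: once a quota covers CPU and memory requests and limits, a Pod that specifies none is only admitted because the LimitRange fills in defaults. How many such default-sized Pods fit into demo-quota is simple arithmetic, sketched below with the values copied from the two manifests above (millicores and MiB; single-container Pods are assumed, and the variable and function names are illustrative only).

```shell
# Quota ceilings from demo-quota
quota_requests_cpu=4000
quota_requests_mem=8192
quota_limits_cpu=8000
quota_limits_mem=16384
quota_pods=10

# Per-container defaults from demo-limits
default_request_cpu=250
default_request_mem=256
default_limit_cpu=500
default_limit_mem=512

min() { if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi; }

# The tightest of the five constraints decides how many Pods fit.
fit=$quota_pods
for candidate in \
  $(( quota_requests_cpu / default_request_cpu )) \
  $(( quota_requests_mem / default_request_mem )) \
  $(( quota_limits_cpu / default_limit_cpu )) \
  $(( quota_limits_mem / default_limit_mem )); do
  fit=$(min "$fit" "$candidate")
done
echo "Pods that fit with default requests/limits: $fit"
# prints: Pods that fit with default requests/limits: 10
```

Here the Pod count is the binding constraint: CPU would allow 16 default-sized Pods and memory 32, but the quota stops at 10.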
12. Pitfall Summary
Problem 1: containerd fails to pull images
Symptom: Pods stay in ImagePullBackOff or ErrImagePull for a long time.
Cause: access to docker.io is restricted from mainland China.
Solution:
bash
# 1. Configure the containerd registry mirror
mkdir -p /etc/containerd/certs.d/docker.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml << 'TOML'
[host."https://docker.1ms.run"]
  capabilities = ["pull", "resolve"]
TOML
# 2. Manually pull the image into the k8s.io namespace (this is the key step!)
ctr -n k8s.io images pull docker.1ms.run/flannel/flannel:v0.25.1
ctr -n k8s.io images tag docker.1ms.run/flannel/flannel:v0.25.1 docker.io/flannel/flannel:v0.25.1
systemctl restart containerd
Problem 2: kubeadm init times out
Symptom: kubeadm init hangs at the "Pulling images" stage.
Cause: images are pulled from registry.k8s.io by default, which is unreachable from mainland China.
Solution: use the Aliyun image repository
bash
kubeadm init --image-repository=registry.aliyuncs.com/google_containers ...
Problem 3: Flannel Pods fail to start
Symptom: the Flannel DaemonSet Pods stay in Init:ImagePullBackOff.
Cause: the Flannel image comes from docker.io, and the image was placed in the wrong containerd namespace.
Solution: pre-pull the Flannel image into the k8s.io namespace on every node (see the solution to Problem 1).
Problem 4: the Master node does not accept workloads by default
Symptom: Pods are never scheduled onto the Master node.
Cause: the Master node carries the node-role.kubernetes.io/control-plane:NoSchedule taint.
Solution (only if you want to run Pods on the Master):
bash
kubectl taint nodes k8s-master node-role.kubernetes.io/control-plane:NoSchedule-
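Summary
Starting from scratch, this article built a Kubernetes v1.30.14 cluster on 4 Huawei Cloud ECS instances and worked through each topic hands-on, covering:
- K8s cluster architecture and the control-plane / worker-node components
- Basic kubectl usage
- Namespaces for multi-tenant isolation
- Pods (structure, status, resources, health checks, multi-container Pods, Init containers)
- Deployments (multiple replicas, rolling updates, rollback, scaling)
- DaemonSets (one Pod per node)
- Services (ClusterIP, NodePort, Endpoints)
- ConfigMap & Secret (configuration and sensitive data)
- PV / PVC / StorageClass / StatefulSet / Headless Service
- ResourceQuota & LimitRange (resource quotas)
Next steps: explore advanced topics such as Ingress (HTTP routing), HPA (autoscaling), Helm (package management), and dynamic PersistentVolume provisioning.
Written on 2026-06-22, based on a hands-on record from a real Huawei Cloud FlexusX ECS environment.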