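An In-Depth Analysis of the Kubelet Component
A line-by-line analysis based on the Kubernetes source code
Source paths:
cmd/kubelet/, pkg/kubelet/, staging/src/k8s.io/kubelet/
I. Module Positioning
1.1 Responsibilities
Kubelet is the core node agent that runs on every Node in a Kubernetes cluster. Its fundamental job can be summarized as:
"Turn the desired state into the actual state."
Its specific responsibilities are: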
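| Area | Description |
|---|---|
| Pod lifecycle management | Watches Pod configuration changes and creates, updates, or deletes Pods and their containers |
| Container runtime interaction | Talks to the container runtime through the CRI (Container Runtime Interface) |
| Node status reporting | Periodically reports Node status (Ready, MemoryPressure, DiskPressure, and so on) to the API Server |
| Volume management | Mounts and unmounts Volumes; also attaches and detaches them when EnableControllerAttachDetach=false |
| Eviction | Evicts Pods by priority when node resources run low |
| Health checking | Runs Liveness, Readiness, and Startup probes |
| Image management | Pulls images and garbage-collects unused ones |
| Container garbage collection | Cleans up exited containers |
| Device management | Manages extended resources such as GPUs and FPGAs through the Device Plugin framework |
| Certificate management | Supports TLS Bootstrap and automatic certificate rotation |
| Resource topology alignment | CPU Manager / Memory Manager / Topology Manager provide NUMA-aware resource allocation |
| Node shutdown | Handles node shutdown events gracefully |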
The package comment in cmd/kubelet/kubelet.go states this explicitly:
```go
// The kubelet binary is responsible for maintaining a set of containers on a
// particular host VM. It syncs data from both configuration file(s) as well as
// from a quorum of etcd servers. It then communicates with the container runtime
// (or a CRI shim for the runtime) to see what is currently running. It synchronizes
// the configuration data, with the running set of containers by starting or stopping containers.
```
1.2 Position in the System
```
┌───────────────────────────────────────────────────────────────┐
│                  API Server (Control Plane)                   │
│                             ↑   ↓                             │
│                        List/Watch Pod                         │
│                      Update Node Status                       │
│                      Create/Update Lease                      │
└───────────────────────────┬───────────────────────────────────┘
                            │
                 ┌──────────▼──────────┐
                 │       Kubelet       │  ← focus of this analysis
                 │    (Node Agent)     │
                 └──┬───────┬───────┬──┘
                    │       │       │
          ┌─────────▼┐  ┌───▼────┐ ┌▼──────────┐
          │ CRI      │  │cAdvisor│ │Volume     │
          │ Runtime  │  │(Stats) │ │Plugin     │
          │containerd│  │        │ │(CSI /     │
          │ / cri-o  │  │        │ │ in-tree)  │
          └────┬─────┘  └────────┘ └───────────┘
               │
          ┌────▼─────┐
          │Container │
          │Engine    │
          │(runc/etc)│
          └──────────┘
```
Kubelet sits between the Control Plane and the container runtime, and it is the guardian of every Kubernetes workload on the node.
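II. Overall Module Structure
2.1 Core Data Structures
The Kubelet struct (pkg/kubelet/kubelet.go)
The Kubelet struct is the heart of the component. It has more than 70 fields and owns every sub-module: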
```go
type Kubelet struct {
	kubeletConfiguration kubeletconfiginternal.KubeletConfiguration
	hostname             string
	hostnameOverridden   bool
	nodeName             types.NodeName
	runtimeCache         kubecontainer.RuntimeCache
	kubeClient           clientset.Interface
	heartbeatClient      clientset.Interface
	rootDirectory        string

	// Core sub-modules
	podWorkers       PodWorkers                      // asynchronous per-Pod sync workers
	podManager       kubepod.Manager                 // Pod state manager
	statusManager    status.Manager                  // syncs Pod status to the API Server
	volumeManager    volumemanager.VolumeManager     // Volume mount management
	evictionManager  eviction.Manager                // eviction management
	containerManager cm.ContainerManager             // cgroup/resource management
	probeManager     prober.Manager                  // health probe management
	imageManager     images.ImageGCManager           // image management
	containerGC      kubecontainer.GC                // container garbage collection
	pleg             pleg.PodLifecycleEventGenerator // Pod lifecycle event generator

	// Probe result managers
	livenessManager  proberesults.Manager
	readinessManager proberesults.Manager
	startupManager   proberesults.Manager

	// Runtime
	containerRuntime kubecontainer.Runtime
	streamingRuntime kubecontainer.StreamingRuntime
	runtimeService   internalapi.RuntimeService

	// Admission chain
	admitHandlers       lifecycle.PodAdmitHandlers
	softAdmitHandlers   lifecycle.PodAdmitHandlers
	PodSyncLoopHandlers lifecycle.PodSyncLoopHandlers
	PodSyncHandlers     lifecycle.PodSyncHandlers

	// Other key fields
	cadvisor                 cadvisor.Interface
	oomWatcher               oomwatcher.Watcher
	pluginManager            pluginmanager.PluginManager
	secretManager            secret.Manager
	configMapManager         configmap.Manager
	serverCertificateManager certificate.Manager
	nodeLeaseController      lease.Controller
	shutdownManager          *nodeshutdown.Manager

	// Synchronization and state
	workQueue    queue.WorkQueue
	backOff      *flowcontrol.Backoff
	podKiller    PodKiller
	podCache     kubecontainer.Cache
	reasonCache  *ReasonCache
	runtimeState *runtimeState
	sourcesReady config.SourcesReady

	// Node status
	nodeStatusUpdateFrequency time.Duration
	nodeStatusReportFrequency time.Duration
	syncNodeStatusMux         sync.Mutex
	nodeIPs                   []net.IP
	// ...
}
```
Dependencies: the dependency-injection container (pkg/kubelet/kubelet.go)
```go
type Dependencies struct {
	Options                 []Option
	Auth                    server.AuthInterface
	CAdvisorInterface       cadvisor.Interface
	Cloud                   cloudprovider.Interface
	ContainerManager        cm.ContainerManager
	DockerOptions           *DockerOptions
	EventClient             v1core.EventsGetter
	HeartbeatClient         clientset.Interface
	OnHeartbeatFailure      func()
	KubeClient              clientset.Interface
	Mounter                 mount.Interface
	HostUtil                hostutil.HostUtils
	OOMAdjuster             *oom.OOMAdjuster
	OSInterface             kubecontainer.OSInterface
	PodConfig               *config.PodConfig
	Recorder                record.EventRecorder
	Subpather               subpath.Interface
	VolumePlugins           []volume.VolumePlugin
	DynamicPluginProber     volume.DynamicPluginProber
	TLSOptions              *server.TLSOptions
	KubeletConfigController *kubeletconfig.Controller
	RemoteRuntimeService    internalapi.RuntimeService
	RemoteImageService      internalapi.ImageManagerService
	dockerLegacyService     legacy.DockerLegacyService
	useLegacyCadvisorStats  bool
}
```
Dependencies is a dependency-injection container that is assembled at runtime. It gathers every external dependency in one place, which makes the Kubelet easier to test and keeps it decoupled from concrete implementations.
The SyncHandler interface
```go
type SyncHandler interface {
	HandlePodAdditions(pods []*v1.Pod)
	HandlePodUpdates(pods []*v1.Pod)
	HandlePodRemoves(pods []*v1.Pod)
	HandlePodReconcile(pods []*v1.Pod)
	HandlePodSyncs(pods []*v1.Pod)
	HandlePodCleanups() error
}
```
Kubelet implements SyncHandler and serves as the event handler for syncLoop.
The Bootstrap interface
```go
type Bootstrap interface {
	GetConfiguration() kubeletconfiginternal.KubeletConfiguration
	BirthCry()
	StartGarbageCollection()
	ListenAndServe(...)         // parameters elided in this excerpt
	ListenAndServeReadOnly(...) // parameters elided in this excerpt
	ListenAndServePodResources()
	Run(<-chan kubetypes.PodUpdate)
	RunOnce(<-chan kubetypes.PodUpdate) ([]RunPodResult, error)
}
```
2.2 Core Interfaces
| Interface | Location | Responsibility |
|---|---|---|
| container.Runtime | container/runtime.go | Container runtime abstraction (SyncPod, KillPod, GetPods, and so on) |
| internalapi.RuntimeService | CRI API | gRPC client for the CRI Runtime Service |
| internalapi.ImageManagerService | CRI API | gRPC client for the CRI Image Service |
| pleg.PodLifecycleEventGenerator | pleg/pleg.go | Pod lifecycle event generator |
| status.Manager | status/status_manager.go | Pod status management and sync to the API Server |
| volumemanager.VolumeManager | volumemanager/volume_manager.go | Volume attach and mount management |
| eviction.Manager | eviction/eviction_manager.go | Resource eviction management |
| prober.Manager | prober/prober_manager.go | Health probe management |
| cm.ContainerManager | cm/container_manager.go | cgroup and resource management |
| pluginmanager.PluginManager | pluginmanager/plugin_manager.go | Plugin registration management |
| pod.Manager | pod/pod_manager.go | Pod metadata management |
| secret.Manager | secret/secret_manager.go | Secret cache management |
| configmap.Manager | configmap/configmap_manager.go | ConfigMap cache management |
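2.3 Core Methods
Kubelet methods
| Method | Purpose |
|---|---|
| NewMainKubelet() | Constructs the Kubelet instance and all of its sub-modules |
| Run(updates) | Main entry point of the Kubelet; starts every module |
| syncLoop(updates, handler) | Core sync loop; does not return during normal operation |
| syncLoopIteration(...) | A single iteration; reads from five event sources and dispatches the event |
| syncPod(o) | Transaction script that syncs a single Pod |
| HandlePodAdditions/Updates/Removes/Reconcile/Syncs | SyncHandler implementation |
| StartGarbageCollection() | Starts container and image GC |
| updateRuntimeUp() | Checks whether the runtime is ready |
| syncNodeStatus() | Syncs node status to the API Server |
| canAdmitPod() | Pod admission check |
| deletePod() | Deletes a Pod asynchronously |
| dispatchWork() | Dispatches a Pod to a PodWorker |
| initializeModules() | Initializes modules that do not depend on the runtime |
| initializeRuntimeDependentModules() | Initializes modules that depend on the runtime |
kubeGenericRuntimeManager methods (CRI adapter layer)
| Method | Purpose |
|---|---|
| SyncPod() | Syncs a Pod to the runtime (the core method) |
| KillPod() | Terminates a Pod |
| startContainer() | Starts one container (pull → create → start → postStart hook) |
| killContainer() | Terminates one container |
| createPodSandbox() | Creates the Pod sandbox |
| generatePodSandboxConfig() | Generates the sandbox configuration |
| generateContainerConfig() | Generates the container configuration |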
2.4 Internal Call Graph
```
main()
└─ app.NewKubeletCommand()
   └─ Run()
      └─ run()
         ├─ UnsecuredDependencies()      // build dependencies
         ├─ buildKubeletClientConfig()   // build the API Server client
         ├─ cm.NewContainerManager()     // build the container manager
         ├─ PreInitRuntimeService()      // initialize the CRI connection
         └─ RunKubelet()
            ├─ createAndInitKubelet()
            │  ├─ kubelet.NewMainKubelet()       // construct the Kubelet
            │  ├─ BirthCry()
            │  └─ StartGarbageCollection()
            └─ startKubelet()
               ├─ k.Run(podCfg.Updates())        // start the main loop
               ├─ k.ListenAndServe()             // HTTP server
               └─ k.ListenAndServePodResources() // PodResources gRPC
```
2.5 Data Flowing In and Out
Inbound:
- PodConfig channel (<-chan kubetypes.PodUpdate), which merges three sources:
  - API Server (ListWatch) → config.NewSourceApiserver()
  - Static file directory → config.NewSourceFile()
  - HTTP URL → config.NewSourceURL()
- PLEG event channel (<-chan *pleg.PodLifecycleEvent): container state change events
- Probe result channels: Updates() of livenessManager, readinessManager, and startupManager
- Timer channels: syncTicker (1s) and housekeepingTicker (2s)
Outbound:
- Pod status updates: statusManager → API Server PATCH
- Node status updates: syncNodeStatus → API Server PATCH/UPDATE
- Node Lease updates: nodeLeaseController → API Server
- Events: recorder → API Server Events
- CRI gRPC calls: runtimeService/imageService → container runtime
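The merging of several Pod sources into one update channel can be illustrated with a plain fan-in. This is a minimal sketch, not the PodConfig implementation: the podUpdate type and the merge and source helpers are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// podUpdate is a hypothetical, simplified stand-in for kubetypes.PodUpdate.
type podUpdate struct {
	Source string
	Pods   []string
}

// merge fans several source channels into a single channel and closes the
// output once every source has been drained.
func merge(sources ...<-chan podUpdate) <-chan podUpdate {
	out := make(chan podUpdate)
	var wg sync.WaitGroup
	for _, src := range sources {
		wg.Add(1)
		go func(c <-chan podUpdate) {
			defer wg.Done()
			for u := range c {
				out <- u
			}
		}(src)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// source returns a channel that delivers one update and is then closed.
func source(name string, pods ...string) <-chan podUpdate {
	c := make(chan podUpdate, 1)
	c <- podUpdate{Source: name, Pods: pods}
	close(c)
	return c
}

func main() {
	merged := merge(source("api", "web-1"), source("file", "etcd-node1"), source("http", "static-1"))
	for u := range merged {
		fmt.Println(u.Source, u.Pods) // arrival order is not deterministic
	}
}
```

The consumer only ever sees one channel, which is why syncLoop can treat all Pod configuration changes uniformly regardless of where they originated.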
III. Core Logic in Depth
3.1 Kubelet Startup Flow
3.1.1 Entry Point and Command Construction
The main() function in cmd/kubelet/kubelet.go is very short:
```go
func main() {
	rand.Seed(time.Now().UnixNano())
	command := app.NewKubeletCommand()
	logs.InitLogs()
	defer logs.FlushLogs()
	if err := command.Execute(); err != nil {
		os.Exit(1)
	}
}
```
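NewKubeletCommand() builds the cobra.Command. The key design decisions are:
- DisableFlagParsing: true disables Cobra's built-in flag parsing. Kubelet has its own precedence rule (command-line flag > config file > default), so it parses flags by hand.
- Two sets of configuration are built: KubeletFlags (command-line flags) and KubeletConfiguration (config-file parameters).
- After the config file is loaded, kubeletConfigFlagPrecedence() re-parses the command line so that flags keep their precedence.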
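The precedence rule (command-line flag > config file > default) amounts to layering three sources in order. The sketch below illustrates only that idea with plain maps; the resolve helper and the sample keys are hypothetical and do not mirror how kubeletConfigFlagPrecedence() is implemented.

```go
package main

import "fmt"

// resolve layers three hypothetical configuration sources so that a later
// layer overrides an earlier one: defaults < config file < command-line flags.
func resolve(defaults, fileCfg, flags map[string]string) map[string]string {
	out := make(map[string]string, len(defaults))
	for _, layer := range []map[string]string{defaults, fileCfg, flags} {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

func main() {
	defaults := map[string]string{"maxPods": "110", "syncFrequency": "1m"}
	fileCfg := map[string]string{"maxPods": "200"}
	flags := map[string]string{"maxPods": "250"}

	eff := resolve(defaults, fileCfg, flags)
	fmt.Println(eff["maxPods"], eff["syncFrequency"]) // prints "250 1m"
}
```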
3.1.2 The Complete Run() Flow
```go
func Run(ctx context.Context, s *options.KubeletServer, kubeDeps *kubelet.Dependencies, ...) error {
	// 1. Configure the log format
	logOption := logs.NewOptions()
	logOption.LogFormat = s.Logging.Format
	logOption.Apply()
	// 2. Log the version
	klog.InfoS("Kubelet version", "kubeletVersion", version.Get())
	// 3. OS-specific initialization (Windows Service and so on)
	initForOS(...)
	// 4. Enter the main run function
	return run(ctx, s, kubeDeps, featureGate)
}
```
run() holds the real startup logic. It proceeds as follows:
Step 1: Set feature gates
```go
err = utilfeature.DefaultMutableFeatureGate.SetFromMap(s.KubeletConfiguration.FeatureGates)
```
Step 2: Validate the configuration
```go
options.ValidateKubeletServer(s)
```
Step 3: Acquire the file lock (prevents multiple instances)
```go
flock.Acquire(s.LockFilePath)
```
Step 4: Register the /configz endpoint
```go
initConfigz(&s.KubeletConfiguration)
```
Step 5: Determine standalone mode
```go
standaloneMode := len(s.KubeConfig) == 0 // no kubeconfig means standalone mode
```
Step 6: Build the dependencies (if none were supplied)
```go
kubeDeps, err = UnsecuredDependencies(s, featureGate)
```
Step 7: Initialize the cloud provider
```go
cloud, err := cloudprovider.InitCloudProvider(s.CloudProvider, s.CloudConfigFile)
kubeDeps.Cloud = cloud
```
Step 8: Determine the Node name
```go
hostName, err := nodeutil.GetHostname(s.HostnameOverride)
nodeName, err := getNodeName(kubeDeps.Cloud, hostName) // the cloud provider may override it
```
Step 9: Build the API Server clients (with TLS Bootstrap support)
```go
clientConfig, closeAllConns, err := buildKubeletClientConfig(ctx, s, nodeName)
kubeDeps.KubeClient = clientset.NewForConfig(clientConfig)
kubeDeps.EventClient = ...     // separate event client (QPS-limited)
kubeDeps.HeartbeatClient = ... // separate heartbeat client (no QPS limit, with a timeout)
```
Step 10: Build Auth
```go
auth, runAuthenticatorCAReload, err := BuildAuth(nodeName, kubeDeps.KubeClient, ...)
```
Step 11: Initialize cAdvisor
```go
kubeDeps.CAdvisorInterface, err = cadvisor.New(...)
```
Step 12: Build the ContainerManager (includes the CPU, Memory, Topology, and Device Managers)
```go
kubeDeps.ContainerManager, err = cm.NewContainerManager(...)
```
Step 13: Pre-initialize the runtime service (establish the CRI gRPC connection)
```go
kubelet.PreInitRuntimeService(...)
// For Docker: start dockershim
// For Remote: no-op
// Create the gRPC clients:
kubeDeps.RemoteRuntimeService = remote.NewRemoteRuntimeService(endpoint, timeout)
kubeDeps.RemoteImageService = remote.NewRemoteImageService(endpoint, timeout)
```
Step 14: RunKubelet (create and start the Kubelet instance)
```go
RunKubelet(s, kubeDeps, s.RunOnce)
// → createAndInitKubelet()
//     → NewMainKubelet()          // construct the full Kubelet
//     → BirthCry()                // emit the startup event
//     → StartGarbageCollection()  // start GC
// → startKubelet()
//     → go k.Run(podCfg.Updates())           // main sync loop
//     → go k.ListenAndServe(...)             // HTTP/HTTPS server
//     → go k.ListenAndServePodResources()    // PodResources gRPC
```
3.1.3 NewMainKubelet() in Detail
This is the most complex constructor in the Kubelet. It creates about 30 sub-modules:
- Parameter validation: rootDirectory, SyncFrequency, IPTables parameters
- Node Informer: starts a SharedInformer for the Node
- PodConfig: makePodSourceConfig() creates the three-source Pod configuration
- Secret/ConfigMap Manager: picks Watch, Cache (TTL-based), or Get according to the change detection strategy
- Prober result managers: livenessManager, readinessManager, startupManager
- PodCache: kubecontainer.NewCache()
- PodManager: kubepod.NewBasicPodManager()
- StatusManager: status.NewManager()
- ContainerRuntime: kuberuntime.NewKubeGenericRuntimeManager()
- RuntimeCache: kubecontainer.NewRuntimeCache()
- PLEG: pleg.NewGenericPLEG()
- ContainerGC: kubecontainer.NewContainerGC()
- ImageManager: images.NewImageGCManager()
- ProbeManager: prober.NewManager()
- VolumePluginMgr: NewInitializedVolumePluginMgr()
- PluginManager: pluginmanager.NewPluginManager()
- VolumeManager: volumemanager.NewVolumeManager()
- PodWorkers: newPodWorkers(klet.syncPod, ...)
- WorkQueue: queue.NewBasicWorkQueue()
- EvictionManager: eviction.NewManager()
- Admission chain: admitHandlers, softAdmitHandlers
- NodeLeaseController: lease.NewController()
- ShutdownManager: nodeshutdown.NewManager()
- OOMWatcher: oomwatcher.NewWatcher()
- DNS Configurer: dns.NewConfigurer()
- StatsProvider: picks the CRI or cAdvisor stats provider according to the runtime type
- ContainerLogManager: logs.NewContainerLogManager()
- RuntimeClassManager: runtimeclass.NewManager()
- ServerCertificateManager: created only when ServerTLSBootstrap is enabled
- NodeStatusFuncs: defaultNodeStatusFuncs()
3.2 Kubelet.Run(): Starting the Main Loop
```go
func (kl *Kubelet) Run(updates <-chan kubetypes.PodUpdate) {
	// 1. Set up the log server
	if kl.logServer == nil {
		kl.logServer = http.StripPrefix("/logs/", http.FileServer(http.Dir("/var/log/")))
	}
	// 2. Start cloud resource sync
	go kl.cloudResourceSyncManager.Run(wait.NeverStop)
	// 3. Initialize modules that do not depend on the runtime
	kl.initializeModules()
	// → metrics.Register()                    register Prometheus metrics
	// → kl.setupDataDirs()                    create the directory layout
	// → kl.imageManager.Start()               start image management
	// → kl.serverCertificateManager.Start()
	// → kl.oomWatcher.Start()                 OOM monitoring
	// → kl.resourceAnalyzer.Start()
	// 4. Start the Volume Manager
	go kl.volumeManager.Run(kl.sourcesReady, wait.NeverStop)
	// 5. Node status sync
	go wait.Until(kl.syncNodeStatus, kl.nodeStatusUpdateFrequency, wait.NeverStop)
	go kl.fastStatusUpdateOnce() // fast start: update CIDR, runtime, and node status right away
	// 6. Node Lease
	go kl.nodeLeaseController.Run(wait.NeverStop)
	// 7. Runtime status check (every 5 seconds)
	go wait.Until(kl.updateRuntimeUp, 5*time.Second, wait.NeverStop)
	// 8. Set up iptables rules
	kl.initNetworkUtil()
	// 9. Pod killer goroutine
	go wait.Until(kl.podKiller.PerformPodKillingWork, 1*time.Second, wait.NeverStop)
	// 10. Start the StatusManager
	kl.statusManager.Start()
	// 11. RuntimeClass sync
	kl.runtimeClassManager.Start(wait.NeverStop)
	// 12. Start PLEG
	kl.pleg.Start()
	// 13. Enter the main sync loop (does not return during normal operation)
	kl.syncLoop(updates, kl)
}
```
3.3 syncLoop: The Core Sync Loop
```go
func (kl *Kubelet) syncLoop(updates <-chan kubetypes.PodUpdate, handler SyncHandler) {
	klog.InfoS("Starting kubelet main sync loop")
	syncTicker := time.NewTicker(time.Second)                // 1 second
	housekeepingTicker := time.NewTicker(housekeepingPeriod) // 2 seconds
	plegCh := kl.pleg.Watch()
	// Exponential backoff parameters (used while the runtime reports errors)
	const (
		base   = 100 * time.Millisecond
		max    = 5 * time.Second
		factor = 2
	)
	duration := base
	// Check the resolv.conf limits
	kl.dnsConfigurer.CheckLimitsForResolvConf()
	for {
		// Back off exponentially while the runtime reports errors
		if err := kl.runtimeState.runtimeErrors(); err != nil {
			time.Sleep(duration)
			duration = time.Duration(math.Min(float64(max), factor*float64(duration)))
			continue
		}
		duration = base // reset after success
		kl.syncLoopMonitor.Store(kl.clock.Now())
		if !kl.syncLoopIteration(updates, handler, syncTicker.C, housekeepingTicker.C, plegCh) {
			break // exits only when configCh is closed
		}
		kl.syncLoopMonitor.Store(kl.clock.Now())
	}
}
```
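The backoff arithmetic in the loop above doubles the delay on every consecutive runtime error and caps it at five seconds. The following sketch isolates that calculation so the resulting sequence is easy to see; nextBackoff and maxDelay are hypothetical names, while the three constants take the values quoted above.

```go
package main

import (
	"fmt"
	"time"
)

const (
	base     = 100 * time.Millisecond
	maxDelay = 5 * time.Second
	factor   = 2
)

// nextBackoff doubles the current delay and clamps it to maxDelay.
func nextBackoff(d time.Duration) time.Duration {
	d *= factor
	if d > maxDelay {
		return maxDelay
	}
	return d
}

func main() {
	d := base
	for i := 0; i < 8; i++ {
		fmt.Println(d) // 100ms, 200ms, 400ms, 800ms, 1.6s, 3.2s, 5s, 5s
		d = nextBackoff(d)
	}
}
```

With these values the loop reaches the cap after six consecutive failures, so a broken runtime is polled at most once every five seconds.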
syncLoopIteration in Detail: Five-Way Event Dispatch
This is the most important event-dispatch logic in the Kubelet. A single select reads from five event sources (seven channels in total, because the three probe result managers each expose their own channel):
```go
func (kl *Kubelet) syncLoopIteration(
	configCh <-chan kubetypes.PodUpdate, // Pod configuration changes
	handler SyncHandler,
	syncCh <-chan time.Time, // periodic sync (1s)
	housekeepingCh <-chan time.Time, // cleanup period (2s)
	plegCh <-chan *pleg.PodLifecycleEvent, // PLEG events
) bool {
	select {
	case u, open := <-configCh:
		// A Pod configuration source changed
		if !open {
			return false // channel closed: exit
		}
		switch u.Op {
		case kubetypes.ADD:
			handler.HandlePodAdditions(u.Pods)
		case kubetypes.UPDATE:
			handler.HandlePodUpdates(u.Pods)
		case kubetypes.REMOVE:
			handler.HandlePodRemoves(u.Pods)
		case kubetypes.RECONCILE:
			handler.HandlePodReconcile(u.Pods)
		case kubetypes.DELETE:
			handler.HandlePodUpdates(u.Pods) // DELETE is handled as UPDATE (graceful deletion)
		}
		kl.sourcesReady.AddSource(u.Source)
	case e := <-plegCh:
		// PLEG event: a container changed state
		if e.Type == pleg.ContainerStarted {
			kl.lastContainerStartedTime.Add(e.ID, time.Now())
		}
		if isSyncPodWorthy(e) {
			if pod, ok := kl.podManager.GetPodByUID(e.ID); ok {
				handler.HandlePodSyncs([]*v1.Pod{pod})
			}
		}
		if e.Type == pleg.ContainerDied {
			// The container ID travels in the event's Data field
			if containerID, ok := e.Data.(string); ok {
				kl.cleanUpContainersInPod(e.ID, containerID)
			}
		}
	case <-syncCh:
		// Periodic sync (driven by workQueue)
		podsToSync := kl.getPodsToSync()
		handler.HandlePodSyncs(podsToSync)
	case update := <-kl.livenessManager.Updates():
		// Liveness probe failed → restart the container
		if update.Result == proberesults.Failure {
			handleProbeSync(kl, update, handler, "liveness", "unhealthy")
		}
	case update := <-kl.readinessManager.Updates():
		// Readiness probe changed → update the ready state
		ready := update.Result == proberesults.Success
		kl.statusManager.SetContainerReadiness(...)
		handleProbeSync(kl, update, handler, "readiness", ...)
	case update := <-kl.startupManager.Updates():
		// Startup probe changed → update the started state
		kl.statusManager.SetContainerStartup(...)
		handleProbeSync(kl, update, handler, "startup", ...)
	case <-housekeepingCh:
		// Cleanup work
		handler.HandlePodCleanups()
	}
	return true
}
```
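Key design points:
- When several cases of the select are ready, Go picks one pseudo-randomly, so no source has priority over another.
- PLEG event filtering: isSyncPodWorthy() excludes ContainerRemoved, because it does not affect Pod state.
- DELETE is treated as UPDATE, because the Pod has to go through the graceful deletion path.
- While the runtime reports errors the whole loop backs off and no events are processed.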
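The shape of one loop iteration (handle exactly one event from whichever channel is ready, and stop only when the configuration channel closes) can be reduced to a few lines. This is a simplified sketch with two channels instead of seven; the event type and the iterate function are hypothetical.

```go
package main

import "fmt"

// event is a hypothetical, simplified event carrying a kind and a Pod name.
type event struct{ kind, pod string }

// iterate handles exactly one event from whichever channel is ready.
// It returns false only when configCh has been closed.
func iterate(configCh, plegCh <-chan event, handle func(event)) bool {
	select {
	case e, open := <-configCh:
		if !open {
			return false
		}
		handle(e)
	case e := <-plegCh:
		// Mirror the filtering idea: removals do not trigger a sync.
		if e.kind != "ContainerRemoved" {
			handle(e)
		}
	}
	return true
}

func main() {
	configCh := make(chan event, 1)
	plegCh := make(chan event, 2)
	configCh <- event{"ADD", "web-1"}
	plegCh <- event{"ContainerDied", "web-0"}
	plegCh <- event{"ContainerRemoved", "web-0"}

	handle := func(e event) { fmt.Println("sync", e.kind, e.pod) }
	for i := 0; i < 3; i++ {
		iterate(configCh, plegCh, handle) // order across channels is not deterministic
	}
	close(configCh)
	fmt.Println("continue:", iterate(configCh, plegCh, handle))
}
```

Because select chooses among ready cases at random, the two "sync" lines may appear in either order from run to run, which is the behavior the first design point describes.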
3.4 The syncPod Transaction Script
syncPod is the complete transaction script for syncing one Pod. It is about 200 lines long and its steps run in a strict order. The listing below is abridged; helper variables such as podFullName and firstSync are computed in the elided code.
```go
func (kl *Kubelet) syncPod(o syncPodOptions) error {
	pod := o.pod
	mirrorPod := o.mirrorPod
	podStatus := o.podStatus
	updateType := o.updateType
	// ═══ Step 0: handle the Kill update type directly ═══
	if updateType == kubetypes.SyncPodKill {
		killPodOptions := o.killPodOptions
		apiPodStatus := killPodOptions.PodStatusFunc(pod, podStatus)
		kl.statusManager.SetPodStatus(pod, apiPodStatus)
		return kl.killPod(pod, nil, podStatus, ...)
	}
	// ═══ Step 1: check whether the Pod is terminating gracefully ═══
	if kl.podKiller.IsPodPendingTerminationByPodName(podFullName) {
		return fmt.Errorf("pod %q is pending termination", podFullName)
	}
	// ═══ Step 2: record the Pod worker start latency ═══
	if updateType == kubetypes.SyncPodCreate {
		metrics.PodWorkerStartDuration.Observe(...)
	}
	// ═══ Step 3: generate the API Pod status ═══
	apiPodStatus := kl.generateAPIPodStatus(pod, podStatus)
	// ═══ Step 4: record the Pod start latency ═══
	// (on the Pending → Running transition)
	// ═══ Step 5: admission check ═══
	runnable := kl.canRunPod(pod)
	if !runnable.Admit {
		apiPodStatus.Reason = runnable.Reason
		// Set the Waiting reason of every container to "Blocked"
	}
	// ═══ Step 6: update the StatusManager ═══
	kl.statusManager.SetPodStatus(pod, apiPodStatus)
	// ═══ Step 7: kill the Pod if it must not run ═══
	if !runnable.Admit || pod.DeletionTimestamp != nil || apiPodStatus.Phase == v1.PodFailed {
		return kl.killPod(pod, nil, podStatus, nil)
	}
	// ═══ Step 8: network readiness check ═══
	if err := kl.runtimeState.networkErrors(); err != nil && !kubecontainer.IsHostNetworkPod(pod) {
		return fmt.Errorf("%s: %v", NetworkNotReadyErrorMsg, err)
	}
	// ═══ Step 9: create or update the Pod cgroup ═══
	pcm := kl.containerManager.NewPodContainerManager()
	if !kl.podIsTerminated(pod) {
		// If the cgroup is missing and this is not the first sync, kill first and then recreate
		if !pcm.Exists(pod) && !firstSync {
			kl.killPod(pod, nil, podStatus, nil)
		}
		pcm.EnsureExists(pod)
		kl.containerManager.UpdateQOSCgroups()
	}
	// ═══ Step 10: mirror Pod handling for static Pods ═══
	if kubetypes.IsStaticPod(pod) {
		// Delete the mirror Pod if it is stale; create it if it does not exist
	}
	// ═══ Step 11: create the Pod data directories ═══
	kl.makePodDataDirs(pod)
	// ═══ Step 12: wait for Volumes to be attached and mounted ═══
	if !kl.podIsTerminated(pod) {
		kl.volumeManager.WaitForAttachAndMount(pod) // waits at most 2m3s
	}
	// ═══ Step 13: fetch the pull secrets ═══
	pullSecrets := kl.getPullSecretsForPod(pod)
	// ═══ Step 14: call SyncPod on the container runtime ═══
	result := kl.containerRuntime.SyncPod(pod, podStatus, pullSecrets, kl.backOff)
	return result.Error() // error handling simplified in this excerpt
}
```
3.5 CRI Interaction: kubeGenericRuntimeManager.SyncPod
SyncPod in kuberuntime/kuberuntime_manager.go is the core method for talking to the container runtime:
```go
func (m *kubeGenericRuntimeManager) SyncPod(pod *v1.Pod, podStatus *kubecontainer.PodStatus,
	pullSecrets []v1.Secret, backOff *flowcontrol.Backoff) (result kubecontainer.PodSyncResult) {
	// Step 1: compute the sandbox and container changes
	podContainerChanges := m.computePodActions(pod, podStatus)
	// Step 2: kill the sandbox if it should no longer exist
	if podContainerChanges.KillPod {
		killResult := m.killPodWithSyncResult(pod, ...)
		// ...
	}
	// Step 3: kill containers that should no longer run (for example, the spec hash changed or a probe failed)
	for _, containerToKill := range podContainerChanges.ContainersToKill {
		m.killContainer(pod, containerToKill.ID, containerToKill.Name, ...)
	}
	// Step 4: create a new sandbox if needed
	if podContainerChanges.CreateSandbox {
		podSandboxID, msg, err = m.createPodSandbox(pod, podContainerChanges.Attempt)
		// Generates the sandbox configuration and calls the CRI:
		// m.runtimeService.RunPodSandbox(podSandboxConfig, runtimeHandler)
	}
	// Step 5: start the next init container (init containers run one at a time)
	if initContainer := podContainerChanges.NextInitContainerToStart; initContainer != nil {
		m.startContainer(podSandboxID, podSandboxConfig, containerStartSpec(initContainer), ...)
	}
	// Step 6: start the application containers
	for _, idx := range podContainerChanges.ContainersToStart {
		m.startContainer(podSandboxID, podSandboxConfig, containerStartSpec(&pod.Spec.Containers[idx]), ...)
	}
	return
}
```
The four steps of startContainer
```go
func (m *kubeGenericRuntimeManager) startContainer(...) (string, error) {
	// Step 1: pull the image
	imageRef, msg, err := m.imagePuller.EnsureImageExists(pod, container, pullSecrets, ...)
	// Step 2: create the container
	containerConfig, cleanup, err := m.generateContainerConfig(container, pod, ...)
	m.internalLifecycle.PreCreateContainer(pod, container, containerConfig)
	containerID, err := m.runtimeService.CreateContainer(podSandboxID, containerConfig, podSandboxConfig)
	// Step 3: start the container
	m.internalLifecycle.PreStartContainer(pod, container, containerID)
	err = m.runtimeService.StartContainer(containerID)
	// Step 4: PostStart hook (kubeContainerID wraps containerID; its construction is elided)
	if container.Lifecycle != nil && container.Lifecycle.PostStart != nil {
		msg, handlerErr := m.runner.Run(kubeContainerID, pod, container, container.Lifecycle.PostStart)
		if handlerErr != nil {
			// A failed hook kills the container that was just started
			m.killContainer(pod, kubeContainerID, container.Name, "FailedPostStartHook", ...)
		}
	}
	// Error checks and return statements are elided in this excerpt
}
```
3.6 PLEG (Pod Lifecycle Event Generator)
Architecture
PLEG is the state bridge between the Kubelet and the container runtime. It answers the question "how does the Kubelet notice that a container changed state?":
```go
type PodLifecycleEventGenerator interface {
	Start()
	Watch() chan *PodLifecycleEvent
	Healthy() (bool, error)
}

type PodLifecycleEvent struct {
	ID   types.UID
	Type PodLifeCycleEventType // ContainerStarted/ContainerDied/ContainerRemoved/ContainerChanged/PodSync
	Data interface{}
}
```
The GenericPLEG implementation
GenericPLEG uses a **periodic relist** strategy rather than an event-driven one:
```go
type GenericPLEG struct {
	relistPeriod time.Duration // relist period (1 second by default)
	runtime      kubecontainer.Runtime
	eventChannel chan *PodLifecycleEvent
	podRecords   podRecords   // old and current state, kept for comparison
	relistTime   atomic.Value // time of the last relist
	cache        kubecontainer.Cache
}
```
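The relist flow:
- Call runtime.GetPods(true) to list all containers, including exited ones
- Compare the result with the podRecords saved by the previous relist
- Generate an event for every container whose state changed:
  - non-existent → running: ContainerStarted
  - running → exited: ContainerDied
  - exited → non-existent: ContainerRemoved
  - any other change: ContainerChanged
- Update podCache
- Send the events to eventChannel
Health check: if no relist has succeeded for more than 3 minutes, PLEG reports itself unhealthy.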
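The comparison at the heart of relist is a diff between two snapshots of container state. The sketch below shows that diff in isolation. It is a simplified illustration: the state and event types and the relist and lookup helpers are hypothetical, containers are keyed by a plain string ID, and only the three named transitions are modeled.

```go
package main

import (
	"fmt"
	"sort"
)

type state string

const (
	running     state = "running"
	exited      state = "exited"
	nonExistent state = "non-existent"
)

// event is a hypothetical, simplified lifecycle event.
type event struct {
	Container string
	Type      string
}

// lookup treats a container that is missing from a snapshot as non-existent.
func lookup(snapshot map[string]state, id string) state {
	if s, ok := snapshot[id]; ok {
		return s
	}
	return nonExistent
}

// relist diffs the previous and current snapshots and returns one or more
// events per container whose state changed, ordered by container ID.
func relist(old, cur map[string]state) []event {
	seen := map[string]struct{}{}
	for id := range old {
		seen[id] = struct{}{}
	}
	for id := range cur {
		seen[id] = struct{}{}
	}
	ids := make([]string, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Strings(ids)

	var events []event
	for _, id := range ids {
		o, n := lookup(old, id), lookup(cur, id)
		if o == n {
			continue
		}
		switch n {
		case running:
			events = append(events, event{id, "ContainerStarted"})
		case exited:
			events = append(events, event{id, "ContainerDied"})
		case nonExistent:
			if o != exited {
				// The container vanished without an observed exit.
				events = append(events, event{id, "ContainerDied"})
			}
			events = append(events, event{id, "ContainerRemoved"})
		}
	}
	return events
}

func main() {
	old := map[string]state{"a": running, "b": exited}
	cur := map[string]state{"a": exited, "c": running}
	for _, e := range relist(old, cur) {
		fmt.Println(e.Container, e.Type)
	}
}
```

Because the diff only sees snapshots, a container that starts and exits between two relists is never observed as running, which is the usual trade-off of polling compared with an event stream.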
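3.7 The Eviction Mechanism
Eviction signals and thresholds
The Eviction Manager monitors the following signals:
| Signal | Description |
|---|---|
| memory.available | Available memory on the node |
| nodefs.available | Available space on the node's root filesystem |
| nodefs.inodesFree | Free inodes on the node's root filesystem |
| imagefs.available | Available space on the image filesystem |
| imagefs.inodesFree | Free inodes on the image filesystem |
| pid.available | Available PIDs |
Thresholds are either hard (evict immediately) or soft (evict only after the threshold has been exceeded for a grace period).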
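The difference between hard and soft thresholds comes down to whether a grace period must elapse before the threshold counts as met. The following is a minimal sketch under that reading; the threshold type, the shouldEvict helper, and the soft values are hypothetical, and the real Eviction Manager tracks considerably more state.

```go
package main

import (
	"fmt"
	"time"
)

// threshold is a hypothetical, simplified eviction threshold. A zero
// GracePeriod makes it a hard threshold.
type threshold struct {
	Signal      string
	MinValue    int64         // trigger when the observed value drops below this (bytes)
	GracePeriod time.Duration // how long the signal must stay below MinValue
}

var (
	hard = threshold{Signal: "memory.available", MinValue: 100 << 20}
	soft = threshold{Signal: "memory.available", MinValue: 300 << 20, GracePeriod: 90 * time.Second}
)

// shouldEvict reports whether the threshold is met, given the observed value
// and how long the value has continuously been below MinValue.
func shouldEvict(t threshold, observed int64, below time.Duration) bool {
	if observed >= t.MinValue {
		return false
	}
	return below >= t.GracePeriod
}

func main() {
	observed := int64(200 << 20) // 200 MiB available
	fmt.Println("hard:", shouldEvict(hard, observed, 0))
	fmt.Println("soft after 30s:", shouldEvict(soft, observed, 30*time.Second))
	fmt.Println("soft after 90s:", shouldEvict(soft, observed, 90*time.Second))
}
```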
Eviction flow
```go
func (m *managerImpl) synchronize(diskInfoProvider DiskInfoProvider, podFunc ActivePodsFunc) []*v1.Pod {
	// 1. Fetch the node stats summary
	summary, err := m.summaryProvider.Get()
	// 2. Observe the current value of every signal
	observations = signalsToObservations(m.signalToRankFunc, summary)
	// 3. Determine which thresholds have been met
	thresholdsMet = m.thresholdsMet(observations, m.config.Thresholds)
	// 4. Update the node conditions (MemoryPressure/DiskPressure/PIDPressure)
	m.updateNodeConditions(thresholdsMet)
	// 5. If any threshold is newly met (after its grace period), start evicting
	if len(localThresholds) > 0 {
		// 6. Rank Pods for eviction (in effect BestEffort first, Guaranteed last)
		// 7. Select the Pods to evict
		// 8. Reclaim non-critical Pods first
	}
	// 9. Try node-level reclaim (for example, image GC)
	for _, nodeReclaimFuncs := range m.signalToNodeReclaimFuncs {
		nodeReclaimFuncs(...)
	}
	// 10. Evict the selected Pods
	for _, pod := range podsToEvict {
		m.killPodFunc(pod)
	}
	return podsToEvict
}
```
Admission control
The Eviction Manager also implements the PodAdmitHandler interface:
- Node under MemoryPressure → reject BestEffort Pods (critical Pods excepted)
- Node under DiskPressure or PIDPressure → reject every non-critical Pod
3.8 Volume Mount Flow
VolumeManager uses a two-cache model of desired state and actual state:
```go
type volumeManager struct {
	desiredStateOfWorld          cache.DesiredStateOfWorld // desired state
	actualStateOfWorld           cache.ActualStateOfWorld  // actual state
	reconciler                   reconciler.Reconciler     // reconciler
	desiredStateOfWorldPopulator populator.DesiredStateOfWorldPopulator
	operationExecutor            operationexecutor.OperationExecutor
}
```
How it runs:
- DesiredStateOfWorldPopulator: periodically fetches the Pod list from the PodManager and adds the Volumes they need to the DesiredStateOfWorld
- Reconciler: loops over the difference between desired and actual state and acts on it:
  - In desired but not in actual → Attach + Mount
  - In actual but not in desired → Unmount + Detach
- OperationExecutor: runs the Attach/Detach/Mount/Unmount operations asynchronously
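The reconciler's job reduces to a set difference in both directions. A minimal sketch, assuming volumes are identified by plain strings; the reconcile function is a hypothetical name and the real reconciler also handles attach state, retries, and per-Pod mounts.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile compares the desired and actual volume sets and returns what has
// to be mounted and what has to be unmounted, each sorted by name.
func reconcile(desired, actual map[string]bool) (toMount, toUnmount []string) {
	for v := range desired {
		if !actual[v] {
			toMount = append(toMount, v)
		}
	}
	for v := range actual {
		if !desired[v] {
			toUnmount = append(toUnmount, v)
		}
	}
	sort.Strings(toMount)
	sort.Strings(toUnmount)
	return toMount, toUnmount
}

func main() {
	desired := map[string]bool{"pvc-a": true, "pvc-b": true}
	actual := map[string]bool{"pvc-b": true, "pvc-c": true}

	toMount, toUnmount := reconcile(desired, actual)
	fmt.Println("mount:", toMount)     // mount: [pvc-a]
	fmt.Println("unmount:", toUnmount) // unmount: [pvc-c]
}
```

Running the comparison in a loop makes the process self-healing: an operation that fails simply leaves the difference in place, and the next pass tries again.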
3.9 Probe Health Checks
Prober architecture
```go
type prober struct {
	exec          execprobe.Prober
	readinessHTTP httpprobe.Prober
	livenessHTTP  httpprobe.Prober
	startupHTTP   httpprobe.Prober
	tcp           tcpprobe.Prober
	runner        kubecontainer.CommandRunner
}
```
Three probe types are supported, and each can use HTTP, TCP, or Exec:
| Probe type | Initial value | On failure |
|---|---|---|
| Liveness | Success | Restart the container |
| Readiness | Failure | Remove the Pod from Service Endpoints |
| Startup | Unknown | Block Liveness and Readiness probes |
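Worker execution
Every probe of every container gets its own worker goroutine:
```go
func (w *worker) run() {
	// Random initial delay, so that probes do not all fire at once after a kubelet restart
	time.Sleep(time.Duration(rand.Float64() * float64(probeTickerPeriod)))
	probeTicker := time.NewTicker(probeTickerPeriod)
	defer probeTicker.Stop()
	for {
		select {
		case <-probeTicker.C:
		case <-w.stopCh:
			return
		case <-w.manualTriggerCh:
		}
		// Skip while InitialDelaySeconds has not elapsed
		// Skip while the Startup probe has not succeeded yet
		result := w.probeManager.probe(w.probeType, w.pod, ...)
		// Consistency check: count how many times in a row the same result was seen
		if result == w.lastResult {
			w.resultRun++
		} else {
			w.lastResult = result
			w.resultRun = 1
		}
		// Publish a result only after it has repeated often enough:
		// FailureThreshold times for failures, SuccessThreshold times for successes.
		// (maxProbeRetries = 3 is a separate setting: retries when a single probe attempt errors.)
		if (result == results.Failure && w.resultRun < int(w.spec.FailureThreshold)) ||
			(result == results.Success && w.resultRun < int(w.spec.SuccessThreshold)) {
			continue
		}
		w.resultsManager.Set(w.containerID, result)
	}
}
```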
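The threshold logic is easiest to follow as a small state machine: a new result is published only after it has been observed enough times in a row, where "enough" depends on whether the result is a success or a failure. The sketch below models that rule with hypothetical names (result, worker, observe) and leaves out timers, the Unknown result, and the results manager.

```go
package main

import "fmt"

type result int

const (
	success result = iota
	failure
)

// worker is a hypothetical, simplified probe worker that tracks consecutive
// results and the last published result.
type worker struct {
	successThreshold int
	failureThreshold int
	lastResult       result
	resultRun        int
	published        result
}

// observe records one probe result and returns the published result, which
// changes only when the run of identical results reaches its threshold.
func (w *worker) observe(r result) result {
	if r == w.lastResult {
		w.resultRun++
	} else {
		w.lastResult = r
		w.resultRun = 1
	}
	if (r == failure && w.resultRun < w.failureThreshold) ||
		(r == success && w.resultRun < w.successThreshold) {
		return w.published // below threshold: keep the previous state
	}
	w.published = r
	return w.published
}

func main() {
	// A liveness-style worker: starts healthy, needs 3 failures to flip.
	w := &worker{successThreshold: 1, failureThreshold: 3, published: success}
	for i, r := range []result{failure, failure, success, failure, failure, failure} {
		fmt.Printf("probe %d -> published failure: %v\n", i+1, w.observe(r) == failure)
	}
}
```

In the example run the two early failures are forgotten when a success arrives, so the published state flips only on the sixth probe, after three failures in a row.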
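3.10 Image Management
ImageGCManager
```go
type realImageGCManager struct {
	runtime       container.Runtime
	imageRecords  map[string]*imageRecord
	policy        ImageGCPolicy
	statsProvider StatsProvider
	imageCache    imageCache
	sandboxImage  string // the sandbox image is exempt from GC
}
```
GC policy:
```
HighThresholdPercent ──── GC is triggered
        │
        │  GC goal: bring usage down to LowThresholdPercent
        │
LowThresholdPercent  ──── GC stops
```
GC flow:
- Read the usage of the image filesystem
- If usage exceeds HighThresholdPercent, trigger GC
- Sort images by the time they were last used
- Delete the least recently used images one by one until usage falls to LowThresholdPercent
- Never delete the sandbox image or images that are in use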
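The GC flow above is a least-recently-used sweep bounded by two thresholds. A minimal sketch, assuming sizes and capacity are in the same unit and that the caller has already marked images in use (which would include the sandbox image); the image type and selectForGC function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// image is a hypothetical, simplified image record.
type image struct {
	ID       string
	Size     int64
	LastUsed int64 // larger means more recently used
	InUse    bool  // true for images used by containers and for the sandbox image
}

// selectForGC returns the IDs of the images to delete, oldest first. Nothing
// is selected unless usage has reached highPct of capacity; deletion stops
// once usage is at or below lowPct of capacity.
func selectForGC(images []image, used, capacity int64, highPct, lowPct int) []string {
	if used*100/capacity < int64(highPct) {
		return nil
	}
	target := capacity * int64(lowPct) / 100

	candidates := make([]image, 0, len(images))
	for _, img := range images {
		if !img.InUse {
			candidates = append(candidates, img)
		}
	}
	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].LastUsed < candidates[j].LastUsed
	})

	var deleted []string
	for _, img := range candidates {
		if used <= target {
			break
		}
		used -= img.Size
		deleted = append(deleted, img.ID)
	}
	return deleted
}

func main() {
	images := []image{
		{ID: "app:v2", Size: 5, LastUsed: 10},
		{ID: "app:v1", Size: 10, LastUsed: 5},
		{ID: "pause", Size: 20, LastUsed: 1, InUse: true},
	}
	fmt.Println(selectForGC(images, 90, 100, 85, 80)) // [app:v1]
}
```

Using two thresholds instead of one gives the sweep hysteresis: once triggered it frees enough space that the next check does not trigger it again immediately.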
IV. Mermaid Diagrams
Figure 1: Kubelet component architecture
[Diagram: a flowchart with five groups: API Server, Kubelet Core, Sub-Modules, CRI Layer, and Container Runtime. Kubelet Core contains main, NewKubeletCommand, Run, NewMainKubelet, the Kubelet struct, syncLoop, and syncLoopIteration, which dispatches configCh, plegCh, syncCh, probeCh, and housekeepingCh events to HandlePodAdditions, HandlePodUpdates, HandlePodRemoves, HandlePodSyncs, and HandlePodCleanups. Sub-Modules contains PodWorkers, PodManager, StatusManager, VolumeManager, EvictionManager, ProbeManager, ContainerManager, ImageManager, ContainerGC, PLEG, and PluginManager. The CRI Layer contains kubeGenericRuntimeManager and the RemoteRuntimeService and RemoteImageService gRPC clients, which talk to Containerd / CRI-O / Docker.]
Figure 2: Kubelet startup flowchart
No
Yes
No
main
NewKubeletCommand
Parse Flags & Config
Config File?
loadConfigFile
Use Default Config
kubeletConfigFlagPrecedence
ValidateKubeletConfiguration
Dynamic Kubelet Config?
BootstrapKubeletConfigController
Skip
UnsecuredDependencies
Build KubeClient
TLS Bootstrap?
Init CloudProvider
Init cAdvisor
NewContainerManager
PreInitRuntimeService
Connect CRI
RunKubelet
NewMainKubelet
30+ sub-modules
BirthCry
StartGarbageCollection
startKubelet
go k.Run
Main Loop
go ListenAndServe
HTTPS Server
go ListenAndServePodResources
gRPC Server
initializeModules
go volumeManager.Run
go syncNodeStatus
go nodeLeaseController.Run
go updateRuntimeUp
5s interval
initializeRuntimeDependentModules
PLEG.Start
syncLoop
NEVER RETURNS
Figure 3: Pod lifecycle management flowchart
[Diagram not rendered. Recoverable nodes, in source order: Pod Update arrives via configCh; Op Type? (ADD, UPDATE, REMOVE, DELETE, RECONCILE); HandlePodAdditions; HandlePodUpdates; HandlePodRemoves; HandlePodReconcile; PodManager.AddPod; canAdmitPod?; rejectPod (Phase=Failed); dispatchWork (SyncPodCreate); PodManager.UpdatePod; dispatchWork (SyncPodUpdate); PodManager.DeletePod; deletePod → podKiller; probeManager.RemovePod; PodWorkers.UpdatePod; syncPod (Transaction Script); SyncPodKill?; killPod; generateAPIPodStatus; canRunPod?; killPod (Status=Blocked); Network Ready?; Return Error; Create/Update Cgroups; Static Pod?; Create/Update Mirror Pod; makePodDataDirs; WaitForAttachAndMount; containerRuntime.SyncPod; Return Result.]
Figure 4: CRI (Container Runtime Interface) interaction diagram
[Sequence diagram not rendered. Participants: Kubelet, kubeGenericRuntimeManager, RemoteRuntimeService (gRPC), RemoteImageService (gRPC), Container Runtime. Recoverable messages of the Pod SyncFlow, in source order:]
- SyncPod(pod, podStatus, pullSecrets, backOff)
- computePodActions (compare desired vs. actual)
- alt "Need Kill Pod": StopPodSandbox(sandboxID), gRPC StopPodSandbox; RemovePodSandbox(sandboxID), gRPC RemovePodSandbox
- alt "Need Create Sandbox": generatePodSandboxConfig(pod); RunPodSandbox(config, runtimeHandler), gRPC RunPodSandbox, returns sandboxID
- loop "For each container to start": PullImage(imageSpec, authConfig, podSandboxConfig), gRPC PullImage, returns imageRef; generateContainerConfig(container, pod); CreateContainer(sandboxID, containerConfig, sandboxConfig), gRPC CreateContainer, returns containerID; StartContainer(containerID), gRPC StartContainer, returns OK/Error
- alt "PostStart Hook exists": runner.Run(postStartHook)
- PodSyncResult
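The call order in Figure 4 (sandbox first, then pull, create and start for each container) can be illustrated with a small sketch. The `criRuntime` interface and `fakeRuntime` type below are hypothetical, heavily reduced stand-ins; the real CRI services take richer request types, and this is not the kubeGenericRuntimeManager implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// criRuntime is a hypothetical, reduced stand-in for the CRI runtime and
// image services; the real gRPC APIs take richer config types.
type criRuntime interface {
	RunPodSandbox(pod string) (sandboxID string, err error)
	PullImage(image string) (imageRef string, err error)
	CreateContainer(sandboxID, name, imageRef string) (containerID string, err error)
	StartContainer(containerID string) error
}

// fakeRuntime records the order of calls instead of talking to a runtime.
type fakeRuntime struct{ calls []string }

func (f *fakeRuntime) RunPodSandbox(pod string) (string, error) {
	f.calls = append(f.calls, "RunPodSandbox")
	return "sandbox-" + pod, nil
}

func (f *fakeRuntime) PullImage(image string) (string, error) {
	f.calls = append(f.calls, "PullImage")
	return image + "@resolved", nil
}

func (f *fakeRuntime) CreateContainer(sandboxID, name, imageRef string) (string, error) {
	f.calls = append(f.calls, "CreateContainer")
	return sandboxID + "/" + name, nil
}

func (f *fakeRuntime) StartContainer(containerID string) error {
	f.calls = append(f.calls, "StartContainer")
	return nil
}

type container struct{ Name, Image string }

// syncPod creates the sandbox when none exists, then pulls, creates and
// starts each container in order, stopping at the first error.
func syncPod(rt criRuntime, pod, sandboxID string, containers []container) error {
	if sandboxID == "" {
		id, err := rt.RunPodSandbox(pod)
		if err != nil {
			return fmt.Errorf("run sandbox: %w", err)
		}
		sandboxID = id
	}
	for _, c := range containers {
		ref, err := rt.PullImage(c.Image)
		if err != nil {
			return fmt.Errorf("pull %s: %w", c.Image, err)
		}
		id, err := rt.CreateContainer(sandboxID, c.Name, ref)
		if err != nil {
			return fmt.Errorf("create %s: %w", c.Name, err)
		}
		if err := rt.StartContainer(id); err != nil {
			return fmt.Errorf("start %s: %w", c.Name, err)
		}
	}
	return nil
}

// demo syncs a two-container pod and returns the recorded call order.
func demo() string {
	rt := &fakeRuntime{}
	pod := []container{{"app", "nginx"}, {"sidecar", "envoy"}}
	if err := syncPod(rt, "web", "", pod); err != nil {
		return "error: " + err.Error()
	}
	return strings.Join(rt.calls, ",")
}

func main() {
	fmt.Println(demo())
}
```

Hiding the runtime behind an interface is what lets the ordering be exercised without a real container runtime; the sketch leaves out the kill path and the PostStart hook.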
Figure 5: Volume mount flowchart
[Diagram not rendered. Groups: Kubelet syncPod; VolumeManager; Volume Plugin Types. Recoverable nodes: DesiredStateOfWorldPopulator (100ms loop); DesiredStateOfWorld Cache; ActualStateOfWorld Cache; Reconciler (100ms loop); OperationExecutor; Diff?; Attach Volume; Mount Volume; Unmount Volume; Detach Volume; VolumePlugin (Attach/Mount/Unmount/Detach); CSI Plugin; In-Tree Plugin; NFS Plugin; HostPath Plugin; syncPod; WaitForAttachAndMount (Timeout: 2m3s). Edge labels: Add Pod Volumes; Get Pod Status; Compare DSW vs ASW; "DSW has, ASW missing" (twice); "ASW has, DSW missing" (twice); Execute; Update.]
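The comparison at the heart of Figure 5, where the reconciler diffs the desired state of world (DSW) against the actual state of world (ASW), can be sketched as a set difference. This is a minimal illustration with hypothetical names (`reconcile`, `apply`), and it collapses attach/mount and unmount/detach into single steps; the real VolumeManager tracks far more state per volume.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile compares the desired set of mounted volumes with the actual set.
// It returns the volumes to unmount (actual but no longer desired) and the
// volumes to mount (desired but not yet actual), each sorted for stable output.
func reconcile(desired, actual map[string]bool) (toUnmount, toMount []string) {
	for v := range actual {
		if !desired[v] {
			toUnmount = append(toUnmount, v)
		}
	}
	for v := range desired {
		if !actual[v] {
			toMount = append(toMount, v)
		}
	}
	sort.Strings(toUnmount)
	sort.Strings(toMount)
	return toUnmount, toMount
}

// apply performs one reconciliation pass and updates actual in place,
// the way an executor would record each successful operation.
func apply(desired, actual map[string]bool) {
	toUnmount, toMount := reconcile(desired, actual)
	for _, v := range toUnmount {
		delete(actual, v)
	}
	for _, v := range toMount {
		actual[v] = true
	}
}

func main() {
	desired := map[string]bool{"config": true, "data": true}
	actual := map[string]bool{"data": true, "scratch": true}

	toUnmount, toMount := reconcile(desired, actual)
	fmt.Println("unmount:", toUnmount, "mount:", toMount)

	apply(desired, actual)
	toUnmount, toMount = reconcile(desired, actual)
	fmt.Println("pending after one pass:", len(toUnmount)+len(toMount))
}
```

Because each pass only acts on the difference, running it repeatedly on a timer is safe: once the two sets agree, a pass does nothing.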
Figure 6: Pod status synchronization flowchart
[Diagram not rendered: the Mermaid source failed to parse.]
Figure 7: PLEG (Pod Lifecycle Event Generator) architecture diagram
[Diagram not rendered: the Mermaid source failed to parse.]
Figure 8: Probe health-check flowchart
[Diagram not rendered. Groups: ProbeManager; Worker Per Probe; Result Processing; Result Managers. Recoverable nodes: ProbeManager; AddPod; RemovePod; prober.Manager; Register workers (per container, per probe type); Stop & Remove workers; worker goroutine (Ticker: PeriodSeconds); Skip until InitialDelaySeconds; Probe Type?; HTTP Get (followNonLocalRedirects=false); TCP Dial; Exec in Container; Probe Result; Consecutive Same Result?; Continue Waiting; Set Failure; Set Success; liveness? (Kill Container → PLEG detects ContainerDied → syncPod restarts); readiness? (Set Container Not Ready → Remove from Service Endpoints); startup? (Block Liveness/Readiness); Container Ready; livenessManager (Initial: Success); readinessManager (Initial: Failure); startupManager (Initial: Unknown); syncLoopIteration. Edge labels: Initial Delay; HTTP; TCP; Exec; count < threshold; count >= FailureThreshold; count >= SuccessThreshold; Updates Channel (three times).]
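The threshold logic in Figure 8 (count consecutive identical results, and only publish a new state once the count reaches FailureThreshold or SuccessThreshold) can be sketched as follows. The `probeWorker` type and its field names are illustrative assumptions, not the prober package's actual definitions.

```go
package main

import "fmt"

type result string

const (
	success result = "Success"
	failure result = "Failure"
)

// probeWorker tracks consecutive identical probe results for one container.
type probeWorker struct {
	successThreshold int
	failureThreshold int
	lastResult       result // most recent raw probe result
	resultRun        int    // how many times lastResult has repeated in a row
	state            result // published state, changed only at a threshold
}

// observe records one raw probe result and returns the published state.
func (w *probeWorker) observe(r result) result {
	if r == w.lastResult {
		w.resultRun++
	} else {
		w.lastResult = r
		w.resultRun = 1
	}
	belowFailure := r == failure && w.resultRun < w.failureThreshold
	belowSuccess := r == success && w.resultRun < w.successThreshold
	if belowFailure || belowSuccess {
		return w.state // below threshold: keep the current state and keep waiting
	}
	w.state = r
	return w.state
}

func main() {
	// A readiness-style worker: starts as Failure, flips after 1 success or 3 failures.
	w := &probeWorker{successThreshold: 1, failureThreshold: 3, state: failure}
	for _, r := range []result{success, failure, failure, failure, success} {
		fmt.Printf("probe=%s published=%s\n", r, w.observe(r))
	}
}
```

With these settings a single failed probe does not flip the published state; only the third consecutive failure does, which is what keeps a brief blip from removing a container from service.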
Figure 9: Image management flowchart
[Diagram not rendered. Groups: Image Pull; Image GC; Image Record Tracking. Recoverable nodes: EnsureImageExists; Registry Auth; credentialProvider (Keyring + Plugin); imagePuller; Serialize Pulls?; QPS/Burst Limited Queue; Parallel Pull; Pull via CRI; RemoteImageService PullImage gRPC; Container Runtime; ImageGCManager; Start goroutine (5min period); GarbageCollect; Get ImageFs Usage %; Usage > High Threshold?; No GC needed; Sort images by lastUsed time, oldest first; For each unused image; Usage < Low Threshold?; GC Complete; Delete Image via CRI; Sandbox Image?; Skip - never delete sandbox; RemoveImage gRPC; imageRecord (firstDetected time, lastUsed time, size bytes); Add Record; Update lastUsed; Delete Record. Edge labels: Auth; Detect new image; Image in use; Image removed.]
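The image GC policy in Figure 9 (do nothing until usage exceeds the high threshold, then delete unused images oldest-first until usage drops below the low threshold, never touching the sandbox image) can be sketched like this. The `image` struct and `garbageCollect` function are hypothetical and work on plain numbers; the real ImageGCManager reads filesystem stats and deletes through the CRI.

```go
package main

import (
	"fmt"
	"sort"
)

// image is a simplified record of one image on the node.
type image struct {
	ID       string
	Size     int64 // bytes
	LastUsed int64 // Unix seconds; smaller means older
	InUse    bool
	Sandbox  bool
}

// garbageCollect returns the IDs it would delete, oldest first, and the
// resulting usage. Nothing is deleted unless usage exceeds highPct of capacity.
func garbageCollect(images []image, used, capacity, highPct, lowPct int64) ([]string, int64) {
	if capacity <= 0 || used*100/capacity <= highPct {
		return nil, used // at or below the high threshold: no GC needed
	}
	// Only images that are unused and are not the sandbox image are candidates.
	candidates := make([]image, 0, len(images))
	for _, img := range images {
		if !img.InUse && !img.Sandbox {
			candidates = append(candidates, img)
		}
	}
	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].LastUsed < candidates[j].LastUsed
	})
	var deleted []string
	for _, img := range candidates {
		if used*100/capacity < lowPct {
			break // usage is below the low threshold: GC complete
		}
		deleted = append(deleted, img.ID)
		used -= img.Size
	}
	return deleted, used
}

func main() {
	images := []image{
		{ID: "a", Size: 20, LastUsed: 3},
		{ID: "b", Size: 30, LastUsed: 1},
		{ID: "c", Size: 10, LastUsed: 2, InUse: true},
		{ID: "pause", Size: 5, LastUsed: 0, Sandbox: true},
		{ID: "d", Size: 25, LastUsed: 4},
	}
	deleted, used := garbageCollect(images, 90, 100, 85, 50)
	fmt.Println("deleted:", deleted, "usage after GC:", used)
}
```

Using two thresholds rather than one gives hysteresis: a GC run frees enough space that the next run is not triggered immediately.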
Figure 10: Eviction mechanism flowchart
[Diagram not rendered: the Mermaid source failed to parse.]
5. Line-by-Line Analysis of Key Code
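5.1 PodConfig: Merging Multiple Sources
go
// pkg/kubelet/config/config.go (simplified; signatures vary across releases)
func NewPodConfig(mode PodConfigNotificationMode, recorder record.EventRecorder) *PodConfig {
	updates := make(chan kubetypes.PodUpdate, 50)     // merged update channel, buffer of 50
	storage := newPodStorage(updates, mode, recorder) // per-source pod lists plus the merge logic
	return &PodConfig{
		pods:    storage,
		mux:     config.NewMux(storage), // multiplexer that feeds every source into storage
		updates: updates,
		sources: sets.String{}, // sources seen so far
	}
}

// Channel registers a source and returns the channel that source writes to.
func (c *PodConfig) Channel(source string) chan<- interface{} {
	c.sourcesLock.Lock()
	defer c.sourcesLock.Unlock()
	c.sources.Insert(source)
	return c.mux.Channel(source)
}

// pkg/util/config/config.go (simplified)
// Each source gets its own listener goroutine; every update it receives is
// handed to the merger, which emits incremental changes on the updates channel.
func (m *Mux) listen(source string, listenChannel <-chan interface{}) {
	for update := range listenChannel {
		m.merger.Merge(source, update)
	}
}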
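One common way to merge several source channels into a single update stream is to run a listener goroutine per source. The following is a minimal, self-contained sketch of that fan-in pattern; `podUpdate`, `mux` and the source names are hypothetical stand-ins, and the real PodConfig additionally diffs each source's pod list before emitting updates.

```go
package main

import (
	"fmt"
	"sync"
)

// podUpdate is a hypothetical stand-in for kubetypes.PodUpdate.
type podUpdate struct {
	Source string
	Pods   []string
}

// mux fans updates from any number of per-source channels into one channel.
type mux struct {
	mu      sync.Mutex
	sources map[string]chan podUpdate
	updates chan podUpdate
	wg      sync.WaitGroup
}

func newMux(buffer int) *mux {
	return &mux{
		sources: map[string]chan podUpdate{},
		updates: make(chan podUpdate, buffer),
	}
}

// channel returns the input channel for a source, creating the channel and
// its listener goroutine on first use.
func (m *mux) channel(source string) chan<- podUpdate {
	m.mu.Lock()
	defer m.mu.Unlock()
	if ch, ok := m.sources[source]; ok {
		return ch
	}
	ch := make(chan podUpdate)
	m.sources[source] = ch
	m.wg.Add(1)
	go func() {
		defer m.wg.Done()
		for u := range ch {
			u.Source = source // tag the update with where it came from
			m.updates <- u
		}
	}()
	return ch
}

// closeAll closes every source channel, waits for the listeners to drain,
// then closes the merged channel.
func (m *mux) closeAll() {
	m.mu.Lock()
	for _, ch := range m.sources {
		close(ch)
	}
	m.mu.Unlock()
	m.wg.Wait()
	close(m.updates)
}

// demo sends one update from each of three sources and counts pods per source.
func demo() map[string]int {
	m := newMux(50)
	for _, s := range []string{"file", "http", "api"} {
		m.channel(s) <- podUpdate{Pods: []string{s + "-pod"}}
	}
	m.closeAll()
	seen := map[string]int{}
	for u := range m.updates {
		seen[u.Source] += len(u.Pods)
	}
	return seen
}

func main() {
	fmt.Println(demo())
}
```

The consumer reads only the merged channel, so adding a new source does not change the consumer's code.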
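5.2 PodWorkers: Asynchronous Sync
go
// pkg/kubelet/pod_workers.go (simplified; modeled on the older per-pod worker implementation)
func (p *podWorkers) UpdatePod(options *UpdatePodOptions) {
	podUID := options.Pod.UID

	p.podLock.Lock()
	defer p.podLock.Unlock()

	// Start a dedicated goroutine for this pod if it does not have one yet.
	podUpdates, exists := p.podUpdates[podUID]
	if !exists {
		podUpdates = make(chan UpdatePodOptions, 1)
		p.podUpdates[podUID] = podUpdates
		go p.managePodLoop(podUID, podUpdates) // one goroutine per pod
	}

	// While the worker is busy, keep only the newest update; intermediate states are dropped.
	if p.isWorking[podUID] {
		p.lastUndeliveredWorkUpdate[podUID] = *options
		return
	}

	// The worker is idle: mark it busy under the lock, then hand over the update.
	p.isWorking[podUID] = true
	podUpdates <- *options
}

func (p *podWorkers) managePodLoop(podUID types.UID, updates <-chan UpdatePodOptions) {
	var lastSyncTime time.Time
	for update := range updates {
		// Wait for a cached status newer than the previous sync, so the sync never acts on stale data.
		status, err := p.podCache.GetNewerThan(podUID, lastSyncTime)
		if err == nil {
			// Run syncPodFn (that is, Kubelet.syncPod).
			err = p.syncPodFn(syncPodOptions{
				mirrorPod:      update.MirrorPod,
				pod:            update.Pod,
				podStatus:      status,
				updateType:     update.UpdateType,
				killPodOptions: update.KillPodOptions,
			})
			lastSyncTime = time.Now()
		}
		if err != nil {
			klog.Errorf("Error syncing pod %s: %v", podUID, err)
		}
		p.checkForUpdates(podUID)
	}
}

// checkForUpdates delivers the pending update, if there is one; otherwise it marks the worker idle.
func (p *podWorkers) checkForUpdates(podUID types.UID) {
	p.podLock.Lock()
	defer p.podLock.Unlock()
	if undelivered, ok := p.lastUndeliveredWorkUpdate[podUID]; ok {
		p.podUpdates[podUID] <- undelivered
		delete(p.lastUndeliveredWorkUpdate, podUID)
	} else {
		p.isWorking[podUID] = false
	}
}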
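Key design points:
- One goroutine per pod: a slow sync blocks only its own pod, and podLock is held only briefly to update the bookkeeping maps.
- Intermediate states are dropped: while a pod is syncing, only the newest update is kept (in lastUndeliveredWorkUpdate).
- Backoff handling: a failed sync is requeued after a backoff delay (backOffPeriod plus jitter) instead of being retried immediately.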
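The first two design points (one worker goroutine per key, newest update wins while a sync is in flight) can be reproduced in isolation. The sketch below is illustrative only: `coalescer`, `update`, and `demo` are hypothetical names, updates are plain integers standing in for pod versions, and backoff is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// coalescer runs at most one sync per key at a time and keeps only the newest
// update that arrives while a sync is in flight.
type coalescer struct {
	mu      sync.Mutex
	working map[string]bool
	pending map[string]int
	queues  map[string]chan int
	syncFn  func(key string, version int)
}

func newCoalescer(syncFn func(string, int)) *coalescer {
	return &coalescer{
		working: map[string]bool{},
		pending: map[string]int{},
		queues:  map[string]chan int{},
		syncFn:  syncFn,
	}
}

// update hands the version to an idle worker, or parks it as the single pending update.
func (c *coalescer) update(key string, version int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	q, ok := c.queues[key]
	if !ok {
		q = make(chan int, 1)
		c.queues[key] = q
		go c.loop(key, q) // one goroutine per key
	}
	if !c.working[key] {
		c.working[key] = true
		q <- version
		return
	}
	c.pending[key] = version // overwrite: intermediate versions are dropped
}

// loop syncs one version at a time, then picks up the pending one, if any.
func (c *coalescer) loop(key string, q chan int) {
	for version := range q {
		c.syncFn(key, version)
		c.mu.Lock()
		if next, ok := c.pending[key]; ok {
			delete(c.pending, key)
			q <- next // the buffer slot is free because we just received from q
		} else {
			c.working[key] = false
		}
		c.mu.Unlock()
	}
}

// demo sends versions 2, 3, and 4 while the sync of version 1 is blocked,
// and returns the versions that were actually synced.
func demo() []int {
	var synced []int
	started, release, done := make(chan struct{}), make(chan struct{}), make(chan struct{})
	c := newCoalescer(func(_ string, v int) {
		synced = append(synced, v)
		if v == 1 {
			close(started)
			<-release
		}
		if v == 4 {
			close(done)
		}
	})
	c.update("pod-a", 1)
	<-started // the first sync is now in flight
	c.update("pod-a", 2)
	c.update("pod-a", 3)
	c.update("pod-a", 4) // only this one survives
	close(release)
	<-done
	return synced
}

func main() {
	fmt.Println(demo()) // versions 2 and 3 never reach syncFn
}
```

Marking the worker busy inside `update`, under the lock, is what guarantees that at most one value is ever sent into the size-1 channel at a time.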
5.3 Remote Connection to the Container Runtime
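go
// pkg/kubelet/cri/remote/remote_runtime.go (condensed; logging omitted)
func NewRemoteRuntimeService(endpoint string, connectionTimeout time.Duration) (internalapi.RuntimeService, error) {
	// Resolve the endpoint into a dial address and dialer
	// (a Unix socket on Linux; TCP or a named pipe on Windows).
	addr, dialer, err := util.GetAddressAndDialer(endpoint)
	if err != nil {
		return nil, err
	}
	// Bound the dial by the connection timeout.
	ctx, cancel := context.WithTimeout(context.Background(), connectionTimeout)
	defer cancel()

	conn, err := grpc.DialContext(ctx, addr,
		grpc.WithInsecure(), // plaintext over the local socket; newer grpc-go uses insecure.NewCredentials() instead
		grpc.WithContextDialer(dialer),
		grpc.WithDefaultCallOptions(grpc.MaxCallRecvMsgSize(maxMsgSize)),
	)
	if err != nil {
		return nil, err
	}
	return &remoteRuntimeService{
		timeout:       connectionTimeout,
		runtimeClient: runtimeapi.NewRuntimeServiceClient(conn), // gRPC client stub
	}, nil
}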
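Before gRPC is involved, the endpoint string has to be turned into a network and an address to dial. The following standard-library sketch shows that step under stated assumptions: `parseEndpoint` and `dial` are hypothetical helpers (they are not `util.GetAddressAndDialer`), and only the `unix` and `tcp` schemes are handled.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"time"
)

// parseEndpoint splits an endpoint such as
// "unix:///run/containerd/containerd.sock" into a network and a dial address.
func parseEndpoint(endpoint string) (network, addr string, err error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", "", err
	}
	switch u.Scheme {
	case "unix":
		return "unix", u.Path, nil
	case "tcp":
		return "tcp", u.Host, nil
	default:
		return "", "", fmt.Errorf("unsupported endpoint scheme %q", u.Scheme)
	}
}

// dial connects to the endpoint, giving up after timeout.
func dial(endpoint string, timeout time.Duration) (net.Conn, error) {
	network, addr, err := parseEndpoint(endpoint)
	if err != nil {
		return nil, err
	}
	return net.DialTimeout(network, addr, timeout)
}

func main() {
	for _, ep := range []string{
		"unix:///run/containerd/containerd.sock",
		"tcp://127.0.0.1:3735",
		"/var/run/crio/crio.sock", // no scheme: rejected by this sketch
	} {
		network, addr, err := parseEndpoint(ep)
		fmt.Println(network, addr, err)
	}
}
```

A function with the same shape as `dial` is what a gRPC client would receive through `grpc.WithContextDialer`, so the transport choice stays outside the RPC code.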
5.4 Node Status Reporting
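go
// pkg/kubelet/kubelet_node_status.go (condensed; logging omitted)
func (kl *Kubelet) registerWithAPIServer() {
	if kl.registrationCompleted {
		return
	}
	step := 100 * time.Millisecond
	for {
		time.Sleep(step)
		// Exponential backoff, capped at 7s.
		step = step * 2
		if step >= 7*time.Second {
			step = 7 * time.Second
		}
		node, err := kl.initialNode(context.TODO()) // build the Node object
		if err != nil {
			continue
		}
		if registered := kl.tryRegisterWithAPIServer(node); registered {
			kl.registrationCompleted = true
			return
		}
	}
}

func (kl *Kubelet) syncNodeStatus() {
	kl.syncNodeStatusMux.Lock()
	defer kl.syncNodeStatusMux.Unlock()

	if kl.kubeClient == nil || kl.heartbeatClient == nil {
		return
	}
	// First-time registration; returns immediately once registration has completed.
	if kl.registerNode {
		kl.registerWithAPIServer()
	}
	// Periodic update. updateNodeStatus retries tryUpdateNodeStatus, which skips
	// the API write when nothing changed and nodeStatusReportFrequency has not
	// elapsed. fastStatusUpdateOnce is not called from here; it runs as its own
	// goroutine during startup.
	if err := kl.updateNodeStatus(); err != nil {
		klog.ErrorS(err, "Unable to update node status")
	}
}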
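The registration loop's retry timing is a doubling delay clamped at 7s. A minimal sketch of that schedule follows; the 7s cap comes from the code above, while the 100ms initial step, the names `nextStep` and `registrationDelays`, and the choice to return the schedule instead of sleeping are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

const (
	initialStep = 100 * time.Millisecond // assumed starting delay
	maxStep     = 7 * time.Second        // cap stated in the registration loop
)

// nextStep doubles the delay and clamps it at maxStep.
func nextStep(current time.Duration) time.Duration {
	next := current * 2
	if next > maxStep {
		return maxStep
	}
	return next
}

// registrationDelays returns the sleep that precedes each of the first n attempts.
func registrationDelays(n int) []time.Duration {
	delays := make([]time.Duration, 0, n)
	step := initialStep
	for i := 0; i < n; i++ {
		delays = append(delays, step)
		step = nextStep(step)
	}
	return delays
}

func main() {
	// The delay doubles from 100ms and stays at 7s from the eighth attempt on.
	fmt.Println(registrationDelays(9))
}
```

Clamping rather than growing without bound keeps a node that boots before the API server from waiting minutes between attempts once the control plane comes up.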
5.5 Cgroup Management and Resource Allocation
The container manager holds the CPU Manager, Memory Manager, Device Manager, and Topology Manager as fields:
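go
// pkg/kubelet/cm/container_manager_linux.go
// Outline only: error handling, feature-gate checks, and most arguments are omitted.
type containerManagerImpl struct {
	cgroupManager       CgroupManager
	qosContainerManager QOSContainerManager
	cpuManager          cpumanager.Manager
	memoryManager       memorymanager.Manager
	deviceManager       devicemanager.Manager
	topologyManager     topologymanager.Manager
}

func (cm *containerManagerImpl) Start(node *v1.Node, activePods ActivePodsFunc, sourcesReady config.SourcesReady,
	podStatusProvider status.PodStatusProvider, runtimeService internalapi.RuntimeService) error {
	// 1. Start the CPU Manager.
	cm.cpuManager.Start( /* activePods, sourcesReady, podStatusProvider, runtimeService, containerMap */ )
	// 2. Start the Memory Manager.
	cm.memoryManager.Start( /* ... */ )
	// 3. Set up the node: create the node-allocatable and QoS cgroup hierarchy
	//    and start the periodic QoS cgroup updates.
	cm.setupNode(activePods)
	// 4. Start the Device Manager.
	cm.deviceManager.Start( /* activePods, sourcesReady */ )
	// The Topology Manager has no Start step: hint providers register with it
	// when the container manager is built, and it is consulted at pod admission.
	return nil
}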
CPU Manager policies:
- none: the default; no special CPU assignment.
- static: gives containers in Guaranteed pods with integer CPU requests exclusive CPU cores.
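The eligibility rule of the static policy fits in one function. This is a simplified sketch, assuming hypothetical names (`qosClass`, `exclusiveCPUs`) and a CPU request given in millicores; the real policy also has to find free cores and update the shared pool.

```go
package main

import "fmt"

type qosClass string

const (
	guaranteed qosClass = "Guaranteed"
	burstable  qosClass = "Burstable"
	bestEffort qosClass = "BestEffort"
)

// exclusiveCPUs returns how many whole CPUs a container would get pinned under
// the static policy, or 0 if it stays in the shared pool.
func exclusiveCPUs(qos qosClass, cpuRequestMilli int64) int64 {
	if qos != guaranteed {
		return 0 // only Guaranteed pods qualify
	}
	if cpuRequestMilli <= 0 || cpuRequestMilli%1000 != 0 {
		return 0 // fractional requests stay in the shared pool
	}
	return cpuRequestMilli / 1000
}

func main() {
	fmt.Println(exclusiveCPUs(guaranteed, 2000)) // 2
	fmt.Println(exclusiveCPUs(guaranteed, 1500)) // 0
	fmt.Println(exclusiveCPUs(burstable, 2000))  // 0
	fmt.Println(exclusiveCPUs(bestEffort, 0))    // 0
}
```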
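Topology Manager policies:
- none: no NUMA awareness.
- best-effort: tries to satisfy NUMA alignment, but still admits the pod if it cannot.
- restricted: alignment must be satisfied, otherwise the pod is rejected.
- single-numa-node: all resources must come from a single NUMA node.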
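The difference between the four policies shows up at admission time. The sketch below is a deliberate simplification: the `hint` type and `admit` function are hypothetical, and a merged hint is reduced to the NUMA nodes an allocation would span plus a flag saying whether the resource managers consider that placement preferred.

```go
package main

import "fmt"

// hint is a simplified merged topology hint.
type hint struct {
	numaNodes []int // NUMA nodes the allocation would span
	preferred bool  // true if the placement satisfies the preferred alignment
}

// admit reports whether a pod with the given merged hint is admitted under a policy.
func admit(policy string, h hint) (bool, error) {
	switch policy {
	case "none", "best-effort":
		return true, nil // never rejects; best-effort only tries to align
	case "restricted":
		return h.preferred, nil
	case "single-numa-node":
		return h.preferred && len(h.numaNodes) == 1, nil
	default:
		return false, fmt.Errorf("unknown policy %q", policy)
	}
}

func main() {
	spread := hint{numaNodes: []int{0, 1}, preferred: false}
	for _, policy := range []string{"none", "best-effort", "restricted", "single-numa-node"} {
		ok, _ := admit(policy, spread)
		fmt.Printf("%s: admitted=%v\n", policy, ok)
	}
}
```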
6. Summary
6.1 Core Design Patterns
| Pattern | Where it is applied |
|---|---|
| Controller pattern | syncLoop → syncPod (Observe → Diff → Act) |
| Producer-Consumer | PodConfig → syncLoop; PLEG → syncLoop; ProbeManager → syncLoop |
| Desired vs Actual State | VolumeManager (dual DSW/ASW caches) |
| Per-Worker Goroutine | PodWorkers (one goroutine per pod) |
| Chain of Responsibility | admitHandlers (Eviction → Sysctl → ActiveDeadline → ResourceAllocate) |
| Event Sourcing | PLEG reflects container state changes as a sequence of events |
| Exponential Backoff | Runtime errors, container restarts, image pulls |
| Dependency Injection | The Dependencies struct carries all injected external dependencies |
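6.2 Key Constants and Configuration
| Constant | Value | Meaning |
|---|---|---|
| plegRelistPeriod | 1s | PLEG relist period |
| plegChannelCapacity | 1000 | PLEG event channel capacity |
| relistThreshold | 3min | PLEG health-check threshold |
| housekeepingPeriod | 2s | Housekeeping (cleanup) loop period |
| backOffPeriod | 10s | Base backoff period |
| MaxContainerBackOff | 300s | Maximum backoff period |
| ContainerGCPeriod | 60s | Container GC period |
| ImageGCPeriod | 300s | Image GC period |
| evictionMonitoringPeriod | 10s | Eviction monitoring period |
| podAttachAndMountTimeout | 2m3s | Timeout waiting for volume attach and mount |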
6.3 Threading Model
Once started, the Kubelet runs the following key goroutines:
- syncLoop (main loop): never exits
- PodWorkers (one per pod): follows the pod's lifecycle
- PLEG relist (1s period)
- syncNodeStatus (every nodeStatusUpdateFrequency)
- fastStatusUpdateOnce (fast path at startup)
- nodeLeaseController (at the lease renewal interval)
- updateRuntimeUp (5s period)
- containerGC (1min period)
- imageGC (5min period)
- evictionManager (10s period)
- volumeManager reconciler (100ms period)
- DSW populator (100ms period)
- probeManager workers (at each container probe's PeriodSeconds)
- cloudResourceSyncManager
- pluginManager.Run
- containerLogManager
- podKiller (1s period)
- shutdownManager
In total, at least 18 kinds of long-running goroutines, plus one probe worker per probed container, make up the Kubelet's concurrency model.
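This document is based on a close reading of the Kubernetes source; the code excerpts come from Go source files under cmd/kubelet/ and pkg/kubelet/ and are condensed for readability.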