[Kubernetes v1.21] (Part 5) An In-Depth Analysis of the Kubelet Component

A line-by-line analysis based on the Kubernetes source code

Source paths: cmd/kubelet/ + pkg/kubelet/ + staging/src/k8s.io/kubelet/


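I. Module Role

1.1 Responsibilities

The Kubelet is the core node agent that runs on every Node in a Kubernetes cluster. Its fundamental responsibility can be summed up as:

"Turn the Desired State into the Actual State"

Specifically, its responsibilities cover:

Area | Description
Pod lifecycle management | Watches for Pod configuration changes; creates, updates, and deletes Pods and their containers
Container runtime interaction | Talks to the container runtime through the CRI (Container Runtime Interface)
Node status reporting | Periodically reports Node status to the API Server (Ready, MemoryPressure, DiskPressure, etc.)
Volume management | Mounts and unmounts Volumes; also performs Attach/Detach itself when EnableControllerAttachDetach=false
Resource eviction | Evicts Pods in ranked order when node resources run low
Health checks | Runs Liveness/Readiness/Startup probes
Image management | Pulls images and garbage-collects unused ones
Container garbage collection | Cleans up exited containers
Device management | Manages extended resources such as GPUs and FPGAs through the Device Plugin framework
Certificate management | Supports TLS Bootstrap and automatic certificate rotation
Resource topology alignment | CPU Manager / Memory Manager / Topology Manager provide NUMA-aware resource allocation
Node shutdown | Handles node shutdown events gracefully

The source comment in cmd/kubelet/kubelet.go states explicitly:
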
// The kubelet binary is responsible for maintaining a set of containers on a
// particular host VM. It syncs data from both configuration file(s) as well as
// from a quorum of etcd servers. It then communicates with the container runtime
// (or a CRI shim for the runtime) to see what is currently running. It synchronizes
// the configuration data, with the running set of containers by starting or stopping containers.

1.2 Position in the System

┌───────────────────────────────────────────────────────────────┐
│                     API Server (Control Plane)                 │
│                         ↑ ↓                                    │
│                    List/Watch Pod                               │
│                    Update Node Status                           │
│                    Create/Update Lease                          │
└───────────────────────────┬───────────────────────────────────┘
                            │
                 ┌──────────▼──────────┐
                 │      Kubelet        │  ← focus of this analysis
                 │  (Node Agent)       │
                 └──┬───────┬───────┬──┘
                    │       │       │
           ┌────────▼┐ ┌───▼───┐ ┌▼──────────┐
           │  CRI     │ │cAdvisor│ │Volume     │
           │  Runtime │ │(Stats) │ │Plugin     │
           │(container│ │        │ │(CSI/In-   │
           │d,cri-o)  │ │        │ │  Tree)    │
           └────┬─────┘ └────────┘ └───────────┘
                │
           ┌────▼─────┐
           │ Container │
           │ Engine    │
           │(runc/etc) │
           └───────────┘

The Kubelet sits between the Control Plane and the container runtime, and it is the guardian of every Kubernetes workload on the node.


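II. Overall Module Structure

2.1 Core Data Structures

The Kubelet struct (pkg/kubelet/kubelet.go)

The Kubelet struct is the heart of the component. It has more than 70 fields and owns every sub-module; an abridged view:
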
type Kubelet struct {
    kubeletConfiguration kubeletconfiginternal.KubeletConfiguration

    hostname        string
    hostnameOverridden bool
    nodeName        types.NodeName
    runtimeCache    kubecontainer.RuntimeCache
    kubeClient      clientset.Interface
    heartbeatClient clientset.Interface
    rootDirectory   string

    // Core sub-modules
    podWorkers      PodWorkers           // asynchronous per-Pod sync workers
    podManager      kubepod.Manager      // Pod state manager
    statusManager   status.Manager       // syncs Pod status to the API Server
    volumeManager   volumemanager.VolumeManager  // Volume mount management
    evictionManager eviction.Manager     // eviction management
    containerManager cm.ContainerManager // cgroup/resource management
    probeManager    prober.Manager       // health-probe management
    imageManager    images.ImageGCManager // image management
    containerGC     kubecontainer.GC     // container garbage collection
    pleg            pleg.PodLifecycleEventGenerator  // Pod lifecycle event generator

    // Probe result managers
    livenessManager  proberesults.Manager
    readinessManager proberesults.Manager
    startupManager   proberesults.Manager

    // Runtime
    containerRuntime kubecontainer.Runtime
    streamingRuntime kubecontainer.StreamingRuntime
    runtimeService   internalapi.RuntimeService

    // Admission chain
    admitHandlers    lifecycle.PodAdmitHandlers
    softAdmitHandlers lifecycle.PodAdmitHandlers
    PodSyncLoopHandlers lifecycle.PodSyncLoopHandlers
    PodSyncHandlers     lifecycle.PodSyncHandlers

    // Other key fields
    cadvisor         cadvisor.Interface
    oomWatcher       oomwatcher.Watcher
    pluginManager    pluginmanager.PluginManager
    secretManager    secret.Manager
    configMapManager configmap.Manager
    serverCertificateManager certificate.Manager
    nodeLeaseController lease.Controller
    shutdownManager  *nodeshutdown.Manager

    // Sync and state
    workQueue        queue.WorkQueue
    backOff          *flowcontrol.Backoff
    podKiller        PodKiller
    podCache         kubecontainer.Cache
    reasonCache      *ReasonCache
    runtimeState     *runtimeState
    sourcesReady     config.SourcesReady

    // Node status
    nodeStatusUpdateFrequency time.Duration
    nodeStatusReportFrequency time.Duration
    syncNodeStatusMux         sync.Mutex
    nodeIPs                   []net.IP

    // ...
}

Dependencies: the dependency-injection container (pkg/kubelet/kubelet.go)

type Dependencies struct {
    Options            []Option
    Auth               server.AuthInterface
    CAdvisorInterface  cadvisor.Interface
    Cloud              cloudprovider.Interface
    ContainerManager   cm.ContainerManager
    DockerOptions      *DockerOptions
    EventClient        v1core.EventsGetter
    HeartbeatClient    clientset.Interface
    OnHeartbeatFailure func()
    KubeClient         clientset.Interface
    Mounter            mount.Interface
    HostUtil           hostutil.HostUtils
    OOMAdjuster        *oom.OOMAdjuster
    OSInterface        kubecontainer.OSInterface
    PodConfig          *config.PodConfig
    Recorder           record.EventRecorder
    Subpather          subpath.Interface
    VolumePlugins      []volume.VolumePlugin
    DynamicPluginProber volume.DynamicPluginProber
    TLSOptions         *server.TLSOptions
    KubeletConfigController *kubeletconfig.Controller
    RemoteRuntimeService internalapi.RuntimeService
    RemoteImageService   internalapi.ImageManagerService
    dockerLegacyService  legacy.DockerLegacyService
    useLegacyCadvisorStats bool
}

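Dependencies is a dependency-injection container assembled at run time. It gathers all external dependencies in one place, which simplifies testing and decoupling.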

The SyncHandler interface

type SyncHandler interface {
    HandlePodAdditions(pods []*v1.Pod)
    HandlePodUpdates(pods []*v1.Pod)
    HandlePodRemoves(pods []*v1.Pod)
    HandlePodReconcile(pods []*v1.Pod)
    HandlePodSyncs(pods []*v1.Pod)
    HandlePodCleanups() error
}

The Kubelet implements SyncHandler and acts as the event handler for syncLoop.

The Bootstrap interface

type Bootstrap interface {
    GetConfiguration() kubeletconfiginternal.KubeletConfiguration
    BirthCry()
    StartGarbageCollection()
    ListenAndServe(...)
    ListenAndServeReadOnly(...)
    ListenAndServePodResources()
    Run(<-chan kubetypes.PodUpdate)
    RunOnce(<-chan kubetypes.PodUpdate) ([]RunPodResult, error)
}

2.2 Core Interfaces

Interface | Location | Responsibility
container.Runtime | container/runtime.go | Container runtime abstraction (SyncPod, KillPod, GetPods, etc.)
internalapi.RuntimeService | CRI API | gRPC client for the CRI Runtime Service
internalapi.ImageManagerService | CRI API | gRPC client for the CRI Image Service
pleg.PodLifecycleEventGenerator | pleg/pleg.go | Pod lifecycle event generator
status.Manager | status/status_manager.go | Pod status management and sync to the API Server
volumemanager.VolumeManager | volumemanager/volume_manager.go | Volume Attach/Mount management
eviction.Manager | eviction/eviction_manager.go | Resource eviction management
prober.Manager | prober/prober_manager.go | Health-probe management
cm.ContainerManager | cm/container_manager.go | cgroup/resource management
pluginmanager.PluginManager | pluginmanager/plugin_manager.go | Plugin registration management
pod.Manager | pod/pod_manager.go | Pod metadata management
secret.Manager | secret/secret_manager.go | Secret cache management
configmap.Manager | configmap/configmap_manager.go | ConfigMap cache management

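2.3 Core Methods

Main Kubelet methods

Method | Purpose
NewMainKubelet() | Constructs the Kubelet instance and all sub-modules
Run(updates) | Main Kubelet entry point; starts every module
syncLoop(updates, handler) | Core sync loop; never returns
syncLoopIteration(...) | One iteration: selects over five event sources (config, PLEG, sync tick, probe results, housekeeping tick) and dispatches the event
syncPod(o) | Transaction script that syncs a single Pod
HandlePodAdditions/Updates/Removes/Reconcile/Syncs | SyncHandler implementation
StartGarbageCollection() | Starts container and image GC
updateRuntimeUp() | Checks whether the runtime is ready
syncNodeStatus() | Syncs node status to the API Server
canAdmitPod() | Pod admission check
deletePod() | Deletes a Pod asynchronously
dispatchWork() | Hands a Pod to a PodWorker
initializeModules() | Initializes modules that do not depend on the runtime
initializeRuntimeDependentModules() | Initializes modules that depend on the runtime

kubeGenericRuntimeManager methods (CRI adaptation layer)

Method | Purpose
SyncPod() | Syncs a Pod to the runtime (core method)
KillPod() | Terminates a Pod
startContainer() | Starts one container (pull → create → start → postStart hook)
killContainer() | Terminates one container
createPodSandbox() | Creates the Pod sandbox
generatePodSandboxConfig() | Generates the sandbox config
generateContainerConfig() | Generates the container config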

2.4 Internal Call Graph

main()
  └─ app.NewKubeletCommand()
       └─ Run()
            └─ run()
                 ├─ UnsecuredDependencies()       // build dependencies
                 ├─ buildKubeletClientConfig()     // build the API Server client config
                 ├─ cm.NewContainerManager()       // build the container manager
                 ├─ PreInitRuntimeService()        // set up the CRI connection
                 └─ RunKubelet()
                      ├─ createAndInitKubelet()
                      │    ├─ kubelet.NewMainKubelet()   // construct the Kubelet
                      │    ├─ BirthCry()
                      │    └─ StartGarbageCollection()
                      └─ startKubelet()
                           ├─ k.Run(podCfg.Updates())    // start the main loop
                           ├─ k.ListenAndServe()         // HTTP server
                           └─ k.ListenAndServePodResources() // PodResources gRPC

2.5 Data Flow In and Out

Inbound:

  1. PodConfig channel (<-chan kubetypes.PodUpdate), which merges three sources:

    • API Server (ListWatch) → config.NewSourceApiserver()
    • Static file directory → config.NewSourceFile()
    • HTTP URL → config.NewSourceURL()
  2. PLEG event channel (<-chan *pleg.PodLifecycleEvent): container state-change events

  3. Probe result channels: Updates() of livenessManager/readinessManager/startupManager

  4. Ticker channels: syncTicker (1s), housekeepingTicker (2s)

Outbound:

  1. Pod status updates: statusManager → API Server PATCH
  2. Node status updates: syncNodeStatus → API Server PATCH/UPDATE
  3. Node Lease updates: nodeLeaseController → API Server
  4. Event recording: recorder → API Server Events
  5. CRI gRPC calls: runtimeService/imageService → container runtime

III. Core Logic in Depth

3.1 Kubelet Startup

3.1.1 Entry Point and Command Construction

The main() function in cmd/kubelet/kubelet.go is extremely short:

func main() {
    rand.Seed(time.Now().UnixNano())
    command := app.NewKubeletCommand()
    logs.InitLogs()
    defer logs.FlushLogs()
    if err := command.Execute(); err != nil {
        os.Exit(1)
    }
}

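NewKubeletCommand() builds the cobra.Command. Key design decisions:

  • DisableFlagParsing: true: Cobra's built-in flag parsing is disabled because the Kubelet enforces its own precedence (command-line flag > config file > default) and has to parse flags by hand
  • Two sets of configuration are built: KubeletFlags (command-line flags) and KubeletConfiguration (config-file parameters)
  • After the config file is loaded, kubeletConfigFlagPrecedence() re-parses the command line so that flags keep precedence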
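
The precedence rule "flag > config file > default" can be sketched with the standard flag package alone. This is not how the Kubelet implements it (the Kubelet re-parses the command line on top of the loaded config); the sketch reaches the same ordering by a different route, and resolve, fileCfg, and the single max-pods setting are illustrative assumptions.

```go
package main

import (
	"flag"
	"fmt"
)

// resolve applies "flag > config file > default" to one hypothetical
// setting, max-pods. fileCfg stands in for values decoded from a config file.
func resolve(args []string, fileCfg map[string]string) (int, error) {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	maxPods := fs.Int("max-pods", 110, "maximum number of pods") // default
	if err := fs.Parse(args); err != nil {
		return 0, err
	}

	// Visit walks only the flags that were explicitly set on the command line.
	explicit := map[string]bool{}
	fs.Visit(func(f *flag.Flag) { explicit[f.Name] = true })

	// Config-file values override defaults, but never an explicit flag.
	for name, value := range fileCfg {
		if explicit[name] {
			continue
		}
		if err := fs.Set(name, value); err != nil {
			return 0, err
		}
	}
	return *maxPods, nil
}

func main() {
	cfg := map[string]string{"max-pods": "200"}
	fromFile, _ := resolve(nil, cfg)
	fromFlag, _ := resolve([]string{"--max-pods=50"}, cfg)
	fromDefault, _ := resolve(nil, nil)
	fmt.Println(fromFile, fromFlag, fromDefault) // prints 200 50 110
}
```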

3.1.2 The Complete Run() Flow

func Run(ctx context.Context, s *options.KubeletServer, kubeDeps *kubelet.Dependencies, featureGate featuregate.FeatureGate) error {
    // 1. Set the log format
    logOption := logs.NewOptions()
    logOption.LogFormat = s.Logging.Format
    logOption.Apply()

    // 2. Log the version
    klog.InfoS("Kubelet version", "kubeletVersion", version.Get())

    // 3. OS-specific initialization (Windows Service, etc.)
    initForOS(...)

    // 4. Enter the main run function and propagate its error
    return run(ctx, s, kubeDeps, featureGate)
}

run() contains the actual startup logic. It proceeds as follows:

Step 1: Set the feature gates

err = utilfeature.DefaultMutableFeatureGate.SetFromMap(s.KubeletConfiguration.FeatureGates)

Step 2: Validate the configuration

options.ValidateKubeletServer(s)

Step 3: Acquire the file lock (prevents multiple instances)

flock.Acquire(s.LockFilePath)

Step 4: Register the /configz endpoint

initConfigz(&s.KubeletConfiguration)

Step 5: Detect standalone mode

standaloneMode := len(s.KubeConfig) == 0  // no kubeconfig = standalone

Step 6: Build the dependencies (if none were supplied)

kubeDeps, err = UnsecuredDependencies(s, featureGate)

Step 7: Initialize the cloud provider

cloud, err := cloudprovider.InitCloudProvider(s.CloudProvider, s.CloudConfigFile)
kubeDeps.Cloud = cloud

Step 8: Determine the Node name

hostName, err := nodeutil.GetHostname(s.HostnameOverride)
nodeName, err := getNodeName(kubeDeps.Cloud, hostName)  // the cloud provider may override it

Step 9: Build the API Server clients (with TLS Bootstrap support)

clientConfig, closeAllConns, err := buildKubeletClientConfig(ctx, s, nodeName)
kubeDeps.KubeClient, err = clientset.NewForConfig(clientConfig)
kubeDeps.EventClient = ...       // separate event client (QPS-limited)
kubeDeps.HeartbeatClient = ...   // separate heartbeat client (no QPS limit, with a timeout)

Step 10: Build Auth

auth, runAuthenticatorCAReload, err := BuildAuth(nodeName, kubeDeps.KubeClient, ...)

Step 11: Initialize cAdvisor

kubeDeps.CAdvisorInterface, err = cadvisor.New(...)

Step 12: Build the ContainerManager (includes the CPU/Memory/Topology/Device Managers)

kubeDeps.ContainerManager, err = cm.NewContainerManager(...)

Step 13: Pre-initialize the runtime service (establishes the CRI gRPC connection)

kubelet.PreInitRuntimeService(...)
  // Docker: start dockershim
  // Remote: nothing to start
  // Then create the gRPC clients:
  kubeDeps.RemoteRuntimeService = remote.NewRemoteRuntimeService(endpoint, timeout)
  kubeDeps.RemoteImageService = remote.NewRemoteImageService(endpoint, timeout)

Step 14: RunKubelet (creates and starts the Kubelet instance)

RunKubelet(s, kubeDeps, s.RunOnce)
  // → createAndInitKubelet()
  //   → NewMainKubelet()     // construct the full Kubelet
  //   → BirthCry()           // emit the startup event
  //   → StartGarbageCollection()  // start GC
  // → startKubelet()
  //   → go k.Run(podCfg.Updates())   // main sync loop
  //   → go k.ListenAndServe(...)     // HTTP/HTTPS server
  //   → go k.ListenAndServePodResources()  // PodResources gRPC

3.1.3 NewMainKubelet() Construction in Detail

This is the most complex constructor in the Kubelet; it creates roughly 30 sub-modules:

  1. Parameter validation: rootDirectory, SyncFrequency, IPTables parameters
  2. Node Informer: starts the SharedInformer for Node objects
  3. PodConfig: makePodSourceConfig() builds the three-source Pod config
  4. Secret/ConfigMap Manager: picks Watch/TTLCache/Get according to the change-detection strategy
  5. Probe result managers: livenessManager, readinessManager, startupManager
  6. PodCache: kubecontainer.NewCache()
  7. PodManager: kubepod.NewBasicPodManager()
  8. StatusManager: status.NewManager()
  9. ContainerRuntime: kuberuntime.NewKubeGenericRuntimeManager()
  10. RuntimeCache: kubecontainer.NewRuntimeCache()
  11. PLEG: pleg.NewGenericPLEG()
  12. ContainerGC: kubecontainer.NewContainerGC()
  13. ImageManager: images.NewImageGCManager()
  14. ProbeManager: prober.NewManager()
  15. VolumePluginMgr: NewInitializedVolumePluginMgr()
  16. PluginManager: pluginmanager.NewPluginManager()
  17. VolumeManager: volumemanager.NewVolumeManager()
  18. PodWorkers: newPodWorkers(klet.syncPod, ...)
  19. WorkQueue: queue.NewBasicWorkQueue()
  20. EvictionManager: eviction.NewManager()
  21. Admission chain: admitHandlers, softAdmitHandlers
  22. NodeLeaseController: lease.NewController()
  23. ShutdownManager: nodeshutdown.NewManager()
  24. OOMWatcher: oomwatcher.NewWatcher()
  25. DNS Configurer: dns.NewConfigurer()
  26. StatsProvider: chooses the CRI or cAdvisor stats provider depending on the runtime
  27. ContainerLogManager: logs.NewContainerLogManager()
  28. RuntimeClassManager: runtimeclass.NewManager()
  29. ServerCertificateManager: only when ServerTLSBootstrap is enabled
  30. NodeStatusFuncs: defaultNodeStatusFuncs()

3.2 Kubelet.Run(): Starting the Main Loop

func (kl *Kubelet) Run(updates <-chan kubetypes.PodUpdate) {
    // 1. Set up the log server
    if kl.logServer == nil {
        kl.logServer = http.StripPrefix("/logs/", http.FileServer(http.Dir("/var/log/")))
    }

    // 2. Start cloud resource sync
    go kl.cloudResourceSyncManager.Run(wait.NeverStop)

    // 3. Initialize modules that do not depend on the runtime
    kl.initializeModules()
    //   → metrics.Register()        register Prometheus metrics
    //   → kl.setupDataDirs()        create the directory layout
    //   → kl.imageManager.Start()   start image management
    //   → kl.serverCertificateManager.Start()
    //   → kl.oomWatcher.Start()     OOM monitoring
    //   → kl.resourceAnalyzer.Start()

    // 4. Start the Volume Manager
    go kl.volumeManager.Run(kl.sourcesReady, wait.NeverStop)

    // 5. Node status sync
    go wait.Until(kl.syncNodeStatus, kl.nodeStatusUpdateFrequency, wait.NeverStop)
    go kl.fastStatusUpdateOnce()  // startup fast path: updates pod CIDR, runtime state, and node status as early as possible

    // 6. Node Lease
    go kl.nodeLeaseController.Run(wait.NeverStop)

    // 7. Runtime health check (every 5 seconds)
    go wait.Until(kl.updateRuntimeUp, 5*time.Second, wait.NeverStop)

    // 8. Set up iptables rules
    kl.initNetworkUtil()

    // 9. Pod killer goroutine
    go wait.Until(kl.podKiller.PerformPodKillingWork, 1*time.Second, wait.NeverStop)

    // 10. Start the StatusManager
    kl.statusManager.Start()

    // 11. RuntimeClass sync
    kl.runtimeClassManager.Start(wait.NeverStop)

    // 12. Start PLEG
    kl.pleg.Start()

    // 13. Enter the main sync loop (never returns)
    kl.syncLoop(updates, kl)
}

3.3 syncLoop: the Core Sync Loop

func (kl *Kubelet) syncLoop(updates <-chan kubetypes.PodUpdate, handler SyncHandler) {
    klog.InfoS("Starting kubelet main sync loop")

    syncTicker := time.NewTicker(time.Second)          // 1 second
    housekeepingTicker := time.NewTicker(housekeepingPeriod) // 2 seconds
    plegCh := kl.pleg.Watch()

    // Exponential backoff parameters (used while the runtime reports errors)
    const (base = 100*time.Millisecond; max = 5*time.Second; factor = 2)
    duration := base

    // Check the resolv.conf limits
    kl.dnsConfigurer.CheckLimitsForResolvConf()

    for {
        // Back off exponentially while the runtime reports errors
        if err := kl.runtimeState.runtimeErrors(); err != nil {
            time.Sleep(duration)
            duration = time.Duration(math.Min(float64(max), factor*float64(duration)))
            continue
        }
        duration = base  // reset on success

        kl.syncLoopMonitor.Store(kl.clock.Now())
        if !kl.syncLoopIteration(updates, handler, syncTicker.C, housekeepingTicker.C, plegCh) {
            break  // exits only when configCh is closed
        }
        kl.syncLoopMonitor.Store(kl.clock.Now())
    }
}

syncLoopIteration in Detail: Five-Way Event Dispatch

This is the Kubelet's most critical event-dispatch logic. It uses select to read from five event sources; the probe source contributes three channels, so the select has seven cases:

func (kl *Kubelet) syncLoopIteration(
    configCh       <-chan kubetypes.PodUpdate,   // Pod configuration changes
    handler        SyncHandler,
    syncCh         <-chan time.Time,              // periodic sync (1s)
    housekeepingCh <-chan time.Time,              // housekeeping period (2s)
    plegCh         <-chan *pleg.PodLifecycleEvent, // PLEG events
) bool {
    select {
    case u, open := <-configCh:
        // A Pod config source reported a change
        if !open { return false }  // channel closed = exit
        switch u.Op {
        case kubetypes.ADD:
            handler.HandlePodAdditions(u.Pods)
        case kubetypes.UPDATE:
            handler.HandlePodUpdates(u.Pods)
        case kubetypes.REMOVE:
            handler.HandlePodRemoves(u.Pods)
        case kubetypes.RECONCILE:
            handler.HandlePodReconcile(u.Pods)
        case kubetypes.DELETE:
            handler.HandlePodUpdates(u.Pods)  // DELETE = UPDATE (graceful deletion)
        }
        kl.sourcesReady.AddSource(u.Source)

    case e := <-plegCh:
        // PLEG event: a container changed state
        if e.Type == pleg.ContainerStarted {
            kl.lastContainerStartedTime.Add(e.ID, time.Now())
        }
        if isSyncPodWorthy(e) {
            if pod, ok := kl.podManager.GetPodByUID(e.ID); ok {
                handler.HandlePodSyncs([]*v1.Pod{pod})
            }
        }
        if e.Type == pleg.ContainerDied {
            // The event payload carries the ID of the container that died
            if containerID, ok := e.Data.(string); ok {
                kl.cleanUpContainersInPod(e.ID, containerID)
            }
        }

    case <-syncCh:
        // Periodic sync (driven by workQueue)
        podsToSync := kl.getPodsToSync()
        handler.HandlePodSyncs(podsToSync)

    case update := <-kl.livenessManager.Updates():
        // Liveness probe failed → restart the container
        if update.Result == proberesults.Failure {
            handleProbeSync(kl, update, handler, "liveness", "unhealthy")
        }

    case update := <-kl.readinessManager.Updates():
        // Readiness probe changed → update the ready state
        ready := update.Result == proberesults.Success
        kl.statusManager.SetContainerReadiness(...)
        handleProbeSync(kl, update, handler, "readiness", ...)

    case update := <-kl.startupManager.Updates():
        // Startup probe changed → update the started state
        kl.statusManager.SetContainerStartup(...)
        handleProbeSync(kl, update, handler, "startup", ...)

    case <-housekeepingCh:
        // Housekeeping
        handler.HandlePodCleanups()
    }
    return true
}

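Key design points

  • When several select cases are ready at once, Go picks one pseudo-randomly, so there is no priority among the channels
  • PLEG event filtering: isSyncPodWorthy() drops ContainerRemoved, which does not affect Pod state
  • DELETE is treated as UPDATE because the Pod has to go through graceful deletion
  • While the runtime reports errors, the whole loop backs off and handles no events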

3.4 The syncPod Transaction Script

syncPod is the complete transaction script for syncing a single Pod. It runs to roughly 200 lines and follows a strict order of steps:

func (kl *Kubelet) syncPod(o syncPodOptions) error {
    pod := o.pod
    mirrorPod := o.mirrorPod
    podStatus := o.podStatus
    updateType := o.updateType

    // ═══ Step 0: Handle the Kill update type directly ═══
    if updateType == kubetypes.SyncPodKill {
        apiPodStatus := o.killPodOptions.PodStatusFunc(pod, podStatus)
        kl.statusManager.SetPodStatus(pod, apiPodStatus)
        return kl.killPod(pod, nil, podStatus, ...)
    }

    // ═══ Step 1: Check whether the Pod is still terminating gracefully ═══
    if kl.podKiller.IsPodPendingTerminationByPodName(podFullName) {
        return fmt.Errorf("pod %q is pending termination", podFullName)
    }

    // ═══ Step 2: Record the Pod worker start latency ═══
    if updateType == kubetypes.SyncPodCreate {
        metrics.PodWorkerStartDuration.Observe(...)
    }

    // ═══ Step 3: Generate the API Pod status ═══
    apiPodStatus := kl.generateAPIPodStatus(pod, podStatus)

    // ═══ Step 4: Record the Pod start latency ═══
    // (on the Pending → Running transition)

    // ═══ Step 5: Admission check ═══
    runnable := kl.canRunPod(pod)
    if !runnable.Admit {
        apiPodStatus.Reason = runnable.Reason
        // set the Waiting reason of every container to "Blocked"
    }

    // ═══ Step 6: Update the StatusManager ═══
    kl.statusManager.SetPodStatus(pod, apiPodStatus)

    // ═══ Step 7: Kill the Pod if it must not run ═══
    if !runnable.Admit || pod.DeletionTimestamp != nil || apiPodStatus.Phase == v1.PodFailed {
        return kl.killPod(pod, nil, podStatus, nil)
    }

    // ═══ Step 8: Network readiness check ═══
    if err := kl.runtimeState.networkErrors(); err != nil && !kubecontainer.IsHostNetworkPod(pod) {
        return fmt.Errorf("%s: %v", NetworkNotReadyErrorMsg, err)
    }

    // ═══ Step 9: Create or update the Pod cgroup ═══
    pcm := kl.containerManager.NewPodContainerManager()
    if !kl.podIsTerminated(pod) {
        // If the cgroup is missing and this is not the first sync → kill first, then recreate
        if !pcm.Exists(pod) && !firstSync {
            kl.killPod(pod, nil, podStatus, nil)
        }
        pcm.EnsureExists(pod)
        kl.containerManager.UpdateQOSCgroups()
    }

    // ═══ Step 10: Mirror Pod handling for static Pods ═══
    if kubetypes.IsStaticPod(pod) {
        // delete the mirror Pod if it is outdated; create it if it does not exist
    }

    // ═══ Step 11: Create the Pod data directories ═══
    kl.makePodDataDirs(pod)

    // ═══ Step 12: Wait for Volumes to be attached and mounted ═══
    if !kl.podIsTerminated(pod) {
        kl.volumeManager.WaitForAttachAndMount(pod)  // waits at most 2m3s
    }

    // ═══ Step 13: Fetch the pull secrets ═══
    pullSecrets := kl.getPullSecretsForPod(pod)

    // ═══ Step 14: Call the container runtime's SyncPod ═══
    result := kl.containerRuntime.SyncPod(pod, podStatus, pullSecrets, kl.backOff)
    kl.reasonCache.Update(pod.UID, result)
    if err := result.Error(); err != nil {
        // The real code returns nil here when the only failures are back-off errors
        return err
    }

    return nil
}

3.5 CRI Interaction: kubeGenericRuntimeManager.SyncPod

SyncPod in kuberuntime/kuberuntime_manager.go is the core method for talking to the container runtime:

func (m *kubeGenericRuntimeManager) SyncPod(pod *v1.Pod, podStatus *kubecontainer.PodStatus,
    pullSecrets []v1.Secret, backOff *flowcontrol.Backoff) (result kubecontainer.PodSyncResult) {

    // Step 1: Compute sandbox and container changes
    podContainerChanges := m.computePodActions(pod, podStatus)

    // Step 2: Kill the whole Pod if the sandbox must be replaced
    if podContainerChanges.KillPod {
        killResult := m.killPodWithSyncResult(pod, ...)
        // ...
    }

    // Step 3: Kill containers that must not keep running
    // (for example, the container spec changed or a liveness probe failed)
    for containerID, containerInfo := range podContainerChanges.ContainersToKill {
        m.killContainer(pod, containerID, containerInfo.name, ...)
    }

    // Step 4: Create a new sandbox if one is needed
    if podContainerChanges.CreateSandbox {
        podSandboxID, msg, err = m.createPodSandbox(pod, podContainerChanges.Attempt)
        // generates the sandbox config and calls the CRI:
        // m.runtimeService.RunPodSandbox(podSandboxConfig, runtimeHandler)
    }

    // Step 5: Start the next init container (init containers run one at a time, in order)
    if container := podContainerChanges.NextInitContainerToStart; container != nil {
        m.startContainer(podSandboxID, podSandboxConfig, containerStartSpec(container), ...)
    }

    // Step 6: Start the regular containers
    for _, idx := range podContainerChanges.ContainersToStart {
        m.startContainer(podSandboxID, podSandboxConfig, containerStartSpec(&pod.Spec.Containers[idx]), ...)
    }

    return
}

The Four Steps of startContainer

func (m *kubeGenericRuntimeManager) startContainer(...) (string, error) {
    // Step 1: Pull the image
    imageRef, msg, err := m.imagePuller.EnsureImageExists(pod, container, pullSecrets, ...)

    // Step 2: Create the container
    containerConfig, cleanup, err := m.generateContainerConfig(container, pod, ...)
    m.internalLifecycle.PreCreateContainer(pod, container, containerConfig)
    containerID, err := m.runtimeService.CreateContainer(podSandboxID, containerConfig, podSandboxConfig)

    // Step 3: Start the container
    m.internalLifecycle.PreStartContainer(pod, container, containerID)
    err = m.runtimeService.StartContainer(containerID)

    // Step 4: PostStart hook; a failing hook gets the container killed
    if container.Lifecycle != nil && container.Lifecycle.PostStart != nil {
        msg, handlerErr := m.runner.Run(kubeContainerID, pod, container, container.Lifecycle.PostStart)
        if handlerErr != nil {
            m.killContainer(pod, kubeContainerID, container.Name, "FailedPostStartHook", ...)
        }
    }

    return "", nil
}

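3.6 PLEG (Pod Lifecycle Event Generator)

Architecture

PLEG is the state bridge between the Kubelet and the container runtime. It answers the question "how does the Kubelet learn that a container's state has changed?":
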
type PodLifecycleEventGenerator interface {
    Start()
    Watch() chan *PodLifecycleEvent
    Healthy() (bool, error)
}

type PodLifecycleEvent struct {
    ID   types.UID
    Type PodLifeCycleEventType  // ContainerStarted/ContainerDied/ContainerRemoved/ContainerChanged/PodSync
    Data interface{}
}
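
The GenericPLEG implementation

GenericPLEG relies on periodic relisting rather than on events pushed by the runtime:

type GenericPLEG struct {
    relistPeriod time.Duration        // relist period (1 second by default)
    runtime      kubecontainer.Runtime
    eventChannel chan *PodLifecycleEvent
    podRecords   podRecords           // old and new state per Pod, used for diffing
    relistTime   atomic.Value         // time of the last relist
    cache        kubecontainer.Cache
}

The relist flow

  1. Call runtime.GetPods(true) to list all containers, including exited ones
  2. Compare the result against the previously recorded podRecords
  3. Generate an event for every container whose state changed:
    • non-existent → running: ContainerStarted
    • running → exited: ContainerDied
    • exited → non-existent: ContainerRemoved
    • any other change: ContainerChanged
  4. Update podCache
  5. Send the events to eventChannel

Health check: PLEG reports unhealthy when no relist has completed for more than 3 minutes.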

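3.7 Eviction

Eviction signals and thresholds

The Eviction Manager monitors the following signals:

Signal | Description
memory.available | Available memory on the node
nodefs.available | Available space on the node's root filesystem
nodefs.inodesFree | Free inodes on the node's root filesystem
imagefs.available | Available space on the image filesystem
imagefs.inodesFree | Free inodes on the image filesystem
pid.available | Available PIDs

Thresholds are either Hard (evict immediately) or Soft (evict only after a grace period).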

Eviction flow

func (m *managerImpl) synchronize(diskInfoProvider, podFunc) []*v1.Pod {
    // 1. Fetch the node stats summary
    summary, err := m.summaryProvider.Get()

    // 2. Derive the current observation for each signal
    observations, statsFunc := makeSignalObservations(summary)

    // 3. Determine which thresholds are met
    thresholds = thresholdsMet(m.config.Thresholds, observations, false)

    // 4. Update the node conditions (MemoryPressure/DiskPressure/PIDPressure)

    // 5. Nothing to do unless a threshold has been met past its grace period
    if len(thresholds) == 0 {
        return nil
    }

    // 6. Try node-level reclaim first (e.g. image GC); stop if that relieves the pressure
    if m.reclaimNodeLevelResources(...) {
        return nil
    }

    // 7. Rank the active Pods for the triggered signal: Pods whose usage exceeds
    //    their requests come first, then lower Pod priority, then larger usage
    // 8. Evict at most one Pod per pass; Critical Pods are skipped
    for _, pod := range activePods {
        if m.evictPod(pod, ...) {
            return []*v1.Pod{pod}
        }
    }
    return nil
}

Admission control

The Eviction Manager also implements the PodAdmitHandler interface:

  • Node under MemoryPressure → reject BestEffort Pods (Critical Pods excepted)
  • Node under DiskPressure/PIDPressure → reject all non-Critical Pods

3.8 Volume Mounting

The VolumeManager uses a two-cache model, Desired State plus Actual State:

type volumeManager struct {
    desiredStateOfWorld cache.DesiredStateOfWorld  // desired state
    actualStateOfWorld  cache.ActualStateOfWorld    // actual state
    reconciler          reconciler.Reconciler        // reconciler
    desiredStateOfWorldPopulator populator.DesiredStateOfWorldPopulator
    operationExecutor   operationexecutor.OperationExecutor
}

How it runs

  1. DesiredStateOfWorldPopulator: periodically reads the Pod list from the PodManager and adds the Volumes they need to the DesiredStateOfWorld
  2. Reconciler: loops, compares Desired with Actual, and acts on the difference:
    • In Desired but not in Actual → Attach + Mount
    • In Actual but not in Desired → Unmount + Detach
  3. OperationExecutor: runs the Attach/Detach/Mount/Unmount operations asynchronously

3.9 Probes (Health Checks)

Prober architecture

type prober struct {
    exec          execprobe.Prober
    readinessHTTP httpprobe.Prober
    livenessHTTP  httpprobe.Prober
    startupHTTP   httpprobe.Prober
    tcp           tcpprobe.Prober
    runner        kubecontainer.CommandRunner
}

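Three probe types are supported, and each can use an HTTP, TCP, or Exec handler:

Probe type | Initial value | On failure
Liveness | Success | The container is restarted
Readiness | Failure | The Pod is removed from Service Endpoints
Startup | Unknown | The container is restarted; Liveness/Readiness probes stay disabled until it succeeds

Running the worker

Each probe of each container gets its own worker goroutine:

func (w *worker) run() {
    // Random initial delay so that probes do not all fire at once after a Kubelet restart
    time.Sleep(time.Duration(rand.Float64() * float64(probeTickerPeriod)))

    probeTicker := time.NewTicker(probeTickerPeriod)
    defer probeTicker.Stop()

    for {
        select {
        case <-probeTicker.C:
        case <-w.stopCh:
            return
        case <-w.manualTriggerCh:
        }

        // Skipped during InitialDelaySeconds
        // Skipped while the Startup probe has not yet succeeded
        result := w.probeManager.probe(w.probeType, w.pod, ...)

        // Count consecutive identical results
        if result == w.lastResult {
            w.resultRun++
        } else {
            w.lastResult = result
            w.resultRun = 1
        }

        // Leave the published state unchanged until the matching threshold is reached
        if (result == results.Failure && w.resultRun < int(w.spec.FailureThreshold)) ||
            (result == results.Success && w.resultRun < int(w.spec.SuccessThreshold)) {
            continue
        }

        w.resultsManager.Set(w.containerID, result, w.pod)
    }
}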

3.10 Image Management

ImageGCManager

type realImageGCManager struct {
    runtime     container.Runtime
    imageRecords map[string]*imageRecord
    policy      ImageGCPolicy
    statsProvider StatsProvider
    imageCache  imageCache
    sandboxImage string  // the sandbox image is exempt from GC
}

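GC policy

HighThresholdPercent ────  GC is triggered
        │
        │  GC target: bring usage back down to LowThresholdPercent
        │
LowThresholdPercent  ────  GC stops

GC flow

  1. Read the usage of the image filesystem
  2. If usage exceeds HighThresholdPercent → trigger GC
  3. Sort images by last-used time
  4. Delete the least recently used images one by one until usage drops below LowThresholdPercent
  5. Never delete the sandbox image or images that are in use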

IV. Mermaid Diagrams

Figure 1: Kubelet component architecture

Diagram (Mermaid flowchart; only the labels survive):

Groups: API Server, Kubelet Core, Sub-Modules, CRI Layer, Container Runtime
Kubelet Core nodes, in diagram order: main, NewKubeletCommand, Run, NewMainKubelet, Kubelet Struct, syncLoop, syncLoopIteration
syncLoopIteration channels and handlers: configCh (HandlePodAdditions, HandlePodUpdates, HandlePodRemoves), plegCh (PLEG Events), syncCh (HandlePodSyncs), probeCh (Probe Updates), housekeepingCh (HandlePodCleanups)
Sub-Modules: PodWorkers, PodManager, StatusManager, VolumeManager, EvictionManager, ProbeManager, ContainerManager, ImageManager, ContainerGC, PLEG, PluginManager
CRI Layer: kubeGenericRuntimeManager, RemoteRuntimeService (gRPC Client), RemoteImageService (gRPC Client)
Container Runtime: Containerd / CRI-O / Docker
Other edge labels: ListWatch, Update Status, syncPod, gRPC
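
The channel fan-in at the center of the diagram can be sketched with a single select statement. This is a minimal sketch under stated assumptions, not the kubelet's code: the `event` type, the `channels` struct, and the `iterate` function are hypothetical names, and every channel carries the same payload type here purely for brevity.

```go
package main

import "fmt"

// event is a hypothetical payload; it only carries a pod name.
type event struct{ pod string }

// channels groups the five event sources named in the diagram.
type channels struct {
	configCh, plegCh, syncCh, probeCh, housekeepingCh chan event
}

// newChannels returns buffered channels so a sender never blocks in this demo.
func newChannels() channels {
	mk := func() chan event { return make(chan event, 1) }
	return channels{
		configCh:       mk(),
		plegCh:         mk(),
		syncCh:         mk(),
		probeCh:        mk(),
		housekeepingCh: mk(),
	}
}

// iterate blocks until one source has an event, then reports which
// handler group would process it. One call models one loop iteration.
func iterate(c channels) string {
	select {
	case e := <-c.configCh:
		return "config update: " + e.pod
	case e := <-c.plegCh:
		return "PLEG event: " + e.pod
	case e := <-c.syncCh:
		return "periodic sync: " + e.pod
	case e := <-c.probeCh:
		return "probe update: " + e.pod
	case e := <-c.housekeepingCh:
		return "housekeeping: " + e.pod
	}
}

func main() {
	c := newChannels()
	c.plegCh <- event{pod: "pod-a"}
	fmt.Println(iterate(c))
}
```

Because only one case runs per call, a slow handler delays every other source; that is one reason the real work is handed off to per-pod workers rather than done inside the loop.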

Figure 2: Kubelet startup flow

Diagram (Mermaid flowchart; only the labels survive). Nodes in diagram order:

main
NewKubeletCommand
Parse Flags & Config
Config File? (Yes / No): loadConfigFile, Use Default Config
kubeletConfigFlagPrecedence
ValidateKubeletConfiguration
Dynamic Kubelet Config? (Yes / No): BootstrapKubeletConfigController, Skip
UnsecuredDependencies
Build KubeClient (TLS Bootstrap?)
Init CloudProvider
Init cAdvisor
NewContainerManager
PreInitRuntimeService (Connect CRI)
RunKubelet
NewMainKubelet (30+ sub-modules)
BirthCry
StartGarbageCollection
startKubelet
go k.Run (Main Loop)
go ListenAndServe (HTTPS Server)
go ListenAndServePodResources (gRPC Server)
initializeModules
go volumeManager.Run
go syncNodeStatus
go nodeLeaseController.Run
go updateRuntimeUp (5s interval)
initializeRuntimeDependentModules
PLEG.Start
syncLoop (NEVER RETURNS)

Figure 3: Pod lifecycle management flow

Diagram (Mermaid flowchart; only the labels survive):

Entry: Pod Update arrives (via configCh), then Op Type? with edge labels ADD, UPDATE, REMOVE, DELETE, RECONCILE
Handlers: HandlePodAdditions, HandlePodUpdates, HandlePodRemoves, HandlePodReconcile
Add path: PodManager.AddPod, canAdmitPod?, rejectPod (Phase=Failed), dispatchWork (SyncPodCreate)
Update path: PodManager.UpdatePod, dispatchWork (SyncPodUpdate)
Remove path: PodManager.DeletePod, deletePod (→ podKiller), probeManager.RemovePod
Worker: PodWorkers.UpdatePod, syncPod (Transaction Script)
syncPod steps, in diagram order: SyncPodKill?, killPod, generateAPIPodStatus, canRunPod?, killPod (Status=Blocked), Network Ready?, Return Error, Create/Update Cgroups, Static Pod?, Create/Update Mirror Pod, makePodDataDirs, WaitForAttachAndMount, containerRuntime.SyncPod, Return Result
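
The "Op Type?" branch and the admission check on the ADD path can be sketched as a switch. This is a minimal sketch, not the kubelet's code: the `node` type, its pod-count admission rule, and the returned strings are hypothetical. DELETE is assumed to share the UPDATE path, which matches the diagram showing five operation labels but only four handlers.

```go
package main

import "fmt"

// op enumerates the operation labels that appear in the diagram.
type op int

const (
	opAdd op = iota
	opUpdate
	opRemove
	opDelete
	opReconcile
)

// node is a hypothetical stand-in for the kubelet's pod bookkeeping.
// Admission is reduced to a simple pod-count limit.
type node struct {
	maxPods  int
	admitted map[string]bool
}

func newNode(maxPods int) *node {
	return &node{maxPods: maxPods, admitted: map[string]bool{}}
}

// handle routes one pod update by operation type and returns a short
// description of the next step named in the diagram.
func (n *node) handle(o op, pod string) string {
	switch o {
	case opAdd:
		if len(n.admitted) >= n.maxPods {
			return "rejectPod: Phase=Failed"
		}
		n.admitted[pod] = true
		return "dispatchWork: SyncPodCreate"
	case opUpdate, opDelete:
		// Assumption: a graceful DELETE is processed like an UPDATE.
		return "dispatchWork: SyncPodUpdate"
	case opRemove:
		delete(n.admitted, pod)
		return "deletePod: hand off to podKiller"
	case opReconcile:
		return "HandlePodReconcile"
	default:
		return "unknown op"
	}
}

func main() {
	n := newNode(1)
	fmt.Println(n.handle(opAdd, "web"))
	fmt.Println(n.handle(opAdd, "db")) // over the limit, so rejected
	fmt.Println(n.handle(opRemove, "web"))
	fmt.Println(n.handle(opAdd, "db")) // admitted after the removal
}
```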

Figure 4: CRI (Container Runtime Interface) interaction diagram

Sequence diagram "Pod SyncFlow" (only the labels survive)

Participants: Kubelet, kubeGenericRuntimeManager, RemoteRuntimeService (gRPC), RemoteImageService (gRPC), Container Runtime

Messages, in diagram order:
1. SyncPod(pod, podStatus, pullSecrets, backOff)
2. computePodActions (compare desired vs. actual)
3. alt "Need Kill Pod": StopPodSandbox(sandboxID) → gRPC StopPodSandbox; RemovePodSandbox(sandboxID) → gRPC RemovePodSandbox
4. alt "Need Create Sandbox": generatePodSandboxConfig(pod); RunPodSandbox(config, runtimeHandler) → gRPC RunPodSandbox → sandboxID
5. loop "For each container to start": PullImage(imageSpec, authConfig, podSandboxConfig) → gRPC PullImage → imageRef; generateContainerConfig(container, pod); CreateContainer(sandboxID, containerConfig, sandboxConfig) → gRPC CreateContainer → containerID; StartContainer(containerID) → gRPC StartContainer → OK/Error
6. alt "PostStart Hook exists": runner.Run(postStartHook)
7. PodSyncResult
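
The ordering of the create path (sandbox first, then pull, create, and start for each container) can be sketched against a fake runtime that records the calls it receives. This is a minimal sketch under stated assumptions: `runtimeService`, `imageService`, `fakeRuntime`, and `startPod` are hypothetical, heavily trimmed stand-ins, not the real CRI client interfaces, and the kill-pod and PostStart branches are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// runtimeService and imageService are hypothetical, trimmed-down
// interfaces that keep only the calls named in the diagram.
type runtimeService interface {
	RunPodSandbox(podName string) (string, error)
	CreateContainer(sandboxID, name, imageRef string) (string, error)
	StartContainer(containerID string) error
}

type imageService interface {
	PullImage(image string) (string, error)
}

// fakeRuntime implements both interfaces and records the call order.
type fakeRuntime struct{ calls []string }

func (f *fakeRuntime) RunPodSandbox(podName string) (string, error) {
	f.calls = append(f.calls, "RunPodSandbox")
	return "sandbox-" + podName, nil
}

func (f *fakeRuntime) PullImage(image string) (string, error) {
	f.calls = append(f.calls, "PullImage")
	return "ref-" + image, nil
}

func (f *fakeRuntime) CreateContainer(sandboxID, name, imageRef string) (string, error) {
	f.calls = append(f.calls, "CreateContainer")
	return sandboxID + "/" + name, nil
}

func (f *fakeRuntime) StartContainer(containerID string) error {
	f.calls = append(f.calls, "StartContainer")
	return nil
}

type container struct{ name, image string }

// startPod creates the sandbox once, then pulls, creates, and starts
// each container in order, stopping at the first error.
func startPod(rs runtimeService, is imageService, podName string, cs []container) error {
	sandboxID, err := rs.RunPodSandbox(podName)
	if err != nil {
		return fmt.Errorf("run sandbox: %w", err)
	}
	for _, c := range cs {
		ref, err := is.PullImage(c.image)
		if err != nil {
			return fmt.Errorf("pull %s: %w", c.image, err)
		}
		id, err := rs.CreateContainer(sandboxID, c.name, ref)
		if err != nil {
			return fmt.Errorf("create %s: %w", c.name, err)
		}
		if err := rs.StartContainer(id); err != nil {
			return fmt.Errorf("start %s: %w", c.name, err)
		}
	}
	return nil
}

func main() {
	f := &fakeRuntime{}
	if err := startPod(f, f, "web", []container{{name: "app", image: "nginx"}}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Join(f.calls, " -> "))
}
```

Splitting the runtime and image calls into two interfaces mirrors the two gRPC participants in the diagram and makes each side easy to fake in isolation.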

Figure 5: Volume mount flow

Diagram (Mermaid flowchart; only the labels survive):

Groups: Kubelet syncPod, VolumeManager, Volume Plugin Types
VolumeManager nodes: DesiredStateOfWorldPopulator (100ms loop), DesiredStateOfWorld Cache, ActualStateOfWorld Cache, Reconciler (100ms loop), OperationExecutor, Diff?, Attach Volume, Mount Volume, Unmount Volume, Detach Volume, VolumePlugin (Attach/Mount/Unmount/Detach)
Volume Plugin Types: CSI Plugin, In-Tree Plugin, NFS Plugin, HostPath Plugin
Kubelet syncPod: syncPod, WaitForAttachAndMount (Timeout: 2m3s)
Edge labels: Add Pod Volumes, Get Pod Status, Compare DSW vs ASW, "DSW has, ASW missing" (twice), "ASW has, DSW missing" (twice), Execute, Update
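
The "Compare DSW vs ASW" step is a set difference in both directions. The following is a minimal sketch of that comparison only, assuming volumes can be identified by a plain string name; the `reconcile` function and its map-based state are hypothetical and ignore attach/detach, plugins, and concurrency.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile compares the desired state of world (DSW) with the actual
// state of world (ASW). Volumes present only in DSW need mounting;
// volumes present only in ASW need unmounting.
func reconcile(desired, actual map[string]bool) (toMount, toUnmount []string) {
	for v := range desired {
		if !actual[v] {
			toMount = append(toMount, v)
		}
	}
	for v := range actual {
		if !desired[v] {
			toUnmount = append(toUnmount, v)
		}
	}
	// Sort so the result does not depend on map iteration order.
	sort.Strings(toMount)
	sort.Strings(toUnmount)
	return toMount, toUnmount
}

func main() {
	desired := map[string]bool{"data": true, "logs": true}
	actual := map[string]bool{"logs": true, "tmp": true}
	mount, unmount := reconcile(desired, actual)
	fmt.Println("mount:", mount)
	fmt.Println("unmount:", unmount)
}
```

Running this comparison on a short fixed interval, as the 100ms loop labels suggest, lets the two caches converge without either side having to notify the other.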

Figure 6: Pod status synchronization flow

[Diagram unavailable: the Mermaid source failed to render on the original page (parse error on line 23). The only surviving fragment is the truncated path "/api/v1/namespaces/{ns}/pods/{name}/sta".]

Figure 7: PLEG (Pod Lifecycle Event Generator) architecture

[Diagram unavailable: the Mermaid source failed to render on the original page (parse error on line 5). The only surviving fragment is a node label ending in "...odRecords" with the mapping "UID → {old, current}".]
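
The surviving "UID → {old, current}" fragment points at the core idea of a relist: keep the previous and current container states and emit an event for every difference. The sketch below illustrates that idea only. The `state` type, the `get` and `relist` helpers, and the string event format are hypothetical, and the rule that a container which vanishes without first being seen as exited yields both a died and a removed event is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// state is a hypothetical, simplified container state.
type state string

const (
	running     state = "running"
	exited      state = "exited"
	nonExistent state = "non-existent"
)

// get returns the recorded state, treating a missing container as non-existent.
func get(m map[string]state, id string) state {
	if s, ok := m[id]; ok {
		return s
	}
	return nonExistent
}

// relist compares the old and current snapshots (container ID → state)
// and returns one "id:Event" string per detected transition.
func relist(old, current map[string]state) []string {
	ids := map[string]bool{}
	for id := range old {
		ids[id] = true
	}
	for id := range current {
		ids[id] = true
	}
	sorted := make([]string, 0, len(ids))
	for id := range ids {
		sorted = append(sorted, id)
	}
	sort.Strings(sorted) // deterministic output order

	var events []string
	for _, id := range sorted {
		o, n := get(old, id), get(current, id)
		if o == n {
			continue
		}
		switch n {
		case running:
			events = append(events, id+":ContainerStarted")
		case exited:
			events = append(events, id+":ContainerDied")
		case nonExistent:
			if o != exited {
				// The exit itself was never observed, so report it now.
				events = append(events, id+":ContainerDied")
			}
			events = append(events, id+":ContainerRemoved")
		}
	}
	return events
}

func main() {
	old := map[string]state{"c1": running, "c2": exited}
	current := map[string]state{"c1": exited, "c3": running}
	for _, e := range relist(old, current) {
		fmt.Println(e)
	}
}
```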

Figure 8: Probe health-check flow

Diagram (Mermaid flowchart; only the labels survive):

Groups: ProbeManager, Worker Per Probe, Result Processing, Result Managers
ProbeManager: prober.Manager; AddPod → Register workers (per container per probe type); RemovePod → Stop & Remove workers
Worker Per Probe: worker goroutine; Ticker: PeriodSeconds; Initial Delay → Skip until InitialDelaySeconds; Probe Type? with branches HTTP → HTTP Get (followNonLocalRedirects=false), TCP → TCP Dial, Exec → Exec in Container
Result Processing: Probe Result; Consecutive Same Result?; count < threshold → Continue Waiting; count >= FailureThreshold → Set Failure; count >= SuccessThreshold → Set Success
On failure: liveness? → Kill Container → PLEG detects ContainerDied → syncPod restarts; readiness? → Set Container Not Ready → Remove from Service Endpoints; startup? → Block Liveness/Readiness
Other node: Container Ready
Result Managers: livenessManager (Initial: Success), readinessManager (Initial: Failure), startupManager (Initial: Unknown); each feeds syncLoopIteration through an Updates Channel
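
The "Consecutive Same Result?" logic is a small state machine: a result is published only after it has been seen enough times in a row. The following is a minimal sketch of that rule, assuming one counter per worker; the `worker` struct, its fields, and the `observe` method are hypothetical names, and no actual probing is performed.

```go
package main

import "fmt"

type result int

const (
	unknown result = iota
	success
	failure
)

func (r result) String() string {
	switch r {
	case success:
		return "Success"
	case failure:
		return "Failure"
	default:
		return "Unknown"
	}
}

// worker tracks consecutive identical results for one probe.
type worker struct {
	successThreshold int
	failureThreshold int
	lastResult       result
	resultRun        int    // length of the current run of identical results
	published        result // state visible to the rest of the system
}

// observe records one probe outcome and returns the published state.
// The published state changes only once the run length reaches the
// threshold for that outcome.
func (w *worker) observe(r result) result {
	if r == w.lastResult {
		w.resultRun++
	} else {
		w.lastResult = r
		w.resultRun = 1
	}
	if (r == failure && w.resultRun < w.failureThreshold) ||
		(r == success && w.resultRun < w.successThreshold) {
		return w.published // count < threshold: keep waiting
	}
	w.published = r
	return w.published
}

func main() {
	// A readiness-style worker: it starts as Failure, as in the diagram.
	w := &worker{successThreshold: 2, failureThreshold: 3, published: failure}
	for _, r := range []result{success, success, failure, failure, failure} {
		fmt.Println(r, "->", w.observe(r))
	}
}
```

The differing initial values in the diagram follow from what each probe gates: a liveness probe that started as Failure would kill a container before it was ever checked, while a readiness probe that started as Success would route traffic too early.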

Figure 9: Image management flow

[Figure: image management flowchart, three parts.
Image Pull: EnsureImageExists → registry auth (credentialProvider: keyring + plugin) → imagePuller → "Serialize Pulls?" (QPS/burst-limited queue, or parallel pull) → RemoteImageService issues the PullImage gRPC to the container runtime.
Image GC: ImageGCManager starts a goroutine with a 5min period → GarbageCollect → get ImageFs usage % → if usage is above the high threshold, sort images by lastUsed time (oldest first) and delete each unused image via the RemoveImage gRPC until usage falls below the low threshold; the sandbox image is never deleted.
Image Record Tracking: imageRecord (firstDetected time, lastUsed time, size in bytes); a record is added when a new image is detected, lastUsed is updated while the image is in use, and the record is deleted when the image is removed.]
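
The GC branch of the flowchart above can be sketched as follows. This is a minimal illustration, not the kubelet's code: `imageRecord`, `selectForGC`, and `exampleSelection` are hypothetical names, and the sketch frees a byte budget directly, whereas the flowchart works with usage percentages of the image filesystem.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// imageRecord is a simplified, hypothetical stand-in for the per-image
// record kept by the image GC manager.
type imageRecord struct {
	id       string
	lastUsed time.Time
	size     int64 // bytes
	inUse    bool  // still referenced by a container
	sandbox  bool  // the sandbox image is never deleted
}

// selectForGC returns the IDs to delete, oldest lastUsed first, until at
// least bytesToFree bytes would be reclaimed or no candidates remain.
func selectForGC(images []imageRecord, bytesToFree int64) []string {
	candidates := make([]imageRecord, 0, len(images))
	for _, img := range images {
		if img.inUse || img.sandbox {
			continue
		}
		candidates = append(candidates, img)
	}
	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].lastUsed.Before(candidates[j].lastUsed)
	})

	var freed int64
	var victims []string
	for _, img := range candidates {
		if freed >= bytesToFree {
			break
		}
		victims = append(victims, img.id)
		freed += img.size
	}
	return victims
}

// exampleSelection builds a small image set and asks for 150 bytes back.
func exampleSelection() []string {
	now := time.Now()
	images := []imageRecord{
		{id: "pause", lastUsed: now.Add(-72 * time.Hour), size: 1, sandbox: true},
		{id: "new", lastUsed: now.Add(-1 * time.Hour), size: 100},
		{id: "busy", lastUsed: now.Add(-24 * time.Hour), size: 100, inUse: true},
		{id: "old", lastUsed: now.Add(-48 * time.Hour), size: 100},
		{id: "mid", lastUsed: now.Add(-12 * time.Hour), size: 100},
	}
	return selectForGC(images, 150)
}

func main() {
	fmt.Println(exampleSelection()) // [old mid]
}
```

The sandbox image and the in-use image are skipped even though they are older than some of the images that get deleted.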

Figure 10: Eviction mechanism flowchart

[Diagram unavailable: the Mermaid source failed to render. The surviving fragment shows an eviction ranking that begins with BestEffort.]

5. Line-by-Line Walkthrough of Key Code

5.1 PodConfig: Merging Multiple Sources

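go
// pkg/kubelet/config/config.go (abridged)
func NewPodConfig(mode PodConfigNotificationMode, recorder record.EventRecorder) *PodConfig {
    updates := make(chan kubetypes.PodUpdate, 50)      // merged update channel, buffer of 50
    storage := newPodStorage(updates, mode, recorder)  // per-source Pod cache; acts as the merger
    return &PodConfig{
        pods:    storage,
        mux:     config.NewMux(storage), // multiplexer from pkg/util/config
        updates: updates,
        sources: sets.String{},          // sources seen so far
    }
}

// Channel registers the source and returns its dedicated channel.
func (c *PodConfig) Channel(source string) chan<- interface{} {
    c.sourcesLock.Lock()
    defer c.sourcesLock.Unlock()
    c.sources.Insert(source)
    return c.mux.Channel(source)
}

// pkg/util/config/config.go (abridged)
// Mux.Channel starts one listener goroutine per source on first use. listen
// passes every update to the merger (podStorage.Merge), which computes the
// incremental changes and writes them to updates.
func (m *Mux) listen(source string, listenChannel <-chan interface{}) {
    for update := range listenChannel {
        m.merger.Merge(source, update)
    }
}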
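
The fan-in idea behind PodConfig can be tried out in isolation. The following is a minimal sketch under stated assumptions, not the kubelet's implementation: `mux`, `update`, `drain`, and `exampleMerge` are hypothetical names, Pods are reduced to plain strings, and the merge step only tags each item with its source instead of computing incremental changes.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// update is a simplified, hypothetical stand-in for kubetypes.PodUpdate.
type update struct {
	Source string
	Pod    string
}

// mux fans several per-source channels into one updates channel.
type mux struct {
	mu      sync.Mutex
	sources map[string]chan string
	updates chan update
	wg      sync.WaitGroup
}

func newMux() *mux {
	return &mux{sources: map[string]chan string{}, updates: make(chan update, 50)}
}

// channel returns the channel for source, creating it and its listener
// goroutine on first use.
func (m *mux) channel(source string) chan<- string {
	m.mu.Lock()
	defer m.mu.Unlock()
	if ch, ok := m.sources[source]; ok {
		return ch
	}
	ch := make(chan string)
	m.sources[source] = ch
	m.wg.Add(1)
	go func() {
		defer m.wg.Done()
		for pod := range ch {
			m.updates <- update{Source: source, Pod: pod} // tag with its origin
		}
	}()
	return ch
}

// drain closes all sources, waits for the listeners, and returns the merged
// updates sorted for stable output.
func (m *mux) drain() []string {
	m.mu.Lock()
	for _, ch := range m.sources {
		close(ch)
	}
	m.mu.Unlock()
	m.wg.Wait()
	close(m.updates)
	var got []string
	for u := range m.updates {
		got = append(got, u.Source+"/"+u.Pod)
	}
	sort.Strings(got)
	return got
}

// exampleMerge feeds two sources and returns what arrived on the merged channel.
func exampleMerge() []string {
	m := newMux()
	m.channel("file") <- "static-web"
	m.channel("api") <- "nginx"
	m.channel("api") <- "redis"
	return m.drain()
}

func main() {
	fmt.Println(exampleMerge()) // [api/nginx api/redis file/static-web]
}
```

Each producer writes only to its own channel, so sources never contend with each other, and the consumer reads from a single place.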

5.2 PodWorkers: Asynchronous Sync

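go
// pkg/kubelet/pod_workers.go (abridged)
func (p *podWorkers) UpdatePod(options *UpdatePodOptions) {
    pod := options.Pod
    uid := pod.UID

    p.podLock.Lock()
    defer p.podLock.Unlock()

    // Start a dedicated goroutine the first time this Pod is seen.
    podUpdates, exists := p.podUpdates[uid]
    if !exists {
        // Buffer of 1: checkForUpdates re-queues from the worker goroutine itself.
        podUpdates = make(chan UpdatePodOptions, 1)
        p.podUpdates[uid] = podUpdates
        go func() {
            defer runtime.HandleCrash()
            p.managePodLoop(podUpdates)
        }()
    }

    if !p.isWorking[uid] {
        // Worker is idle: mark it busy and hand over the update directly.
        p.isWorking[uid] = true
        podUpdates <- *options
        return
    }
    // Worker is busy: keep only the newest update, but never overwrite a pending kill.
    if update, found := p.lastUndeliveredWorkUpdate[uid]; !found || update.UpdateType != kubetypes.SyncPodKill {
        p.lastUndeliveredWorkUpdate[uid] = *options
    }
}

func (p *podWorkers) managePodLoop(podUpdates <-chan UpdatePodOptions) {
    var lastSyncTime time.Time
    for update := range podUpdates {
        err := func() error {
            // Block until the Pod cache is newer than the previous sync.
            status, err := p.podCache.GetNewerThan(update.Pod.UID, lastSyncTime)
            if err != nil {
                return err
            }
            // syncPodFn is Kubelet.syncPod.
            err = p.syncPodFn(syncPodOptions{
                mirrorPod:      update.MirrorPod,
                pod:            update.Pod,
                podStatus:      status,
                killPodOptions: update.KillPodOptions,
                updateType:     update.UpdateType,
            })
            lastSyncTime = time.Now()
            return err
        }()
        // Re-enqueue the Pod for its next sync (sooner on error), then call checkForUpdates.
        p.wrapUp(update.Pod.UID, err)
    }
}

// checkForUpdates delivers the pending update, if any; otherwise the worker goes idle.
func (p *podWorkers) checkForUpdates(uid types.UID) {
    p.podLock.Lock()
    defer p.podLock.Unlock()
    if workUpdate, exists := p.lastUndeliveredWorkUpdate[uid]; exists {
        p.podUpdates[uid] <- workUpdate
        delete(p.lastUndeliveredWorkUpdate, uid)
    } else {
        p.isWorking[uid] = false
    }
}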

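Key design points

  • One goroutine per Pod: a slow sync of one Pod does not block the others; podLock guards only the bookkeeping maps, not the sync itself
  • Intermediate states are dropped: while a Pod is syncing, only the newest update is kept (lastUndeliveredWorkUpdate)
  • Back-off handling: after a failed sync the Pod is re-enqueued after backOffPeriod (10s) plus jitter; exponential back-off applies to container restarts and image pulls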
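
The first two design points can be reproduced in a few dozen lines. The sketch below is an illustration under stated assumptions rather than the kubelet's code: `workers`, `update`, `loop`, and `exampleRun` are hypothetical names, and an update is reduced to an integer version number.

```go
package main

import (
	"fmt"
	"sync"
)

// workers runs one goroutine per key and remembers only the newest update
// that arrives while that goroutine is busy.
type workers struct {
	mu      sync.Mutex
	queues  map[string]chan int
	working map[string]bool
	pending map[string]int
	syncFn  func(key string, version int)
}

func newWorkers(syncFn func(string, int)) *workers {
	return &workers{
		queues:  map[string]chan int{},
		working: map[string]bool{},
		pending: map[string]int{},
		syncFn:  syncFn,
	}
}

// update hands version to the worker for key, or parks it if the worker is busy.
func (w *workers) update(key string, version int) {
	w.mu.Lock()
	defer w.mu.Unlock()
	ch, ok := w.queues[key]
	if !ok {
		ch = make(chan int, 1) // buffer of 1 so the worker can re-queue for itself
		w.queues[key] = ch
		go w.loop(key, ch)
	}
	if !w.working[key] {
		w.working[key] = true
		ch <- version
		return
	}
	w.pending[key] = version // overwrites any older parked version
}

// loop syncs one version at a time and then picks up the parked one, if any.
func (w *workers) loop(key string, ch chan int) {
	for v := range ch {
		w.syncFn(key, v)
		w.mu.Lock()
		if next, ok := w.pending[key]; ok {
			delete(w.pending, key)
			ch <- next
		} else {
			w.working[key] = false
		}
		w.mu.Unlock()
	}
}

// exampleRun sends versions 1..4; versions 2 to 4 arrive while 1 is still syncing.
func exampleRun() []int {
	started, release, done := make(chan struct{}), make(chan struct{}), make(chan struct{})
	var synced []int
	w := newWorkers(func(_ string, v int) {
		synced = append(synced, v)
		switch v {
		case 1:
			close(started)
			<-release // keep the worker busy until all updates are queued
		case 4:
			close(done)
		}
	})
	w.update("pod-a", 1)
	<-started
	w.update("pod-a", 2)
	w.update("pod-a", 3)
	w.update("pod-a", 4)
	close(release)
	<-done
	return synced
}

func main() {
	fmt.Println(exampleRun()) // [1 4]
}
```

Versions 2 and 3 are never synced: by the time the worker is free, only the newest desired state matters.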

5.3 Remote Connection to the Container Runtime

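go
// pkg/kubelet/cri/remote/remote_runtime.go (abridged)
func NewRemoteRuntimeService(endpoint string, connectionTimeout time.Duration) (internalapi.RuntimeService, error) {
    // Resolve the endpoint: Unix socket on Linux; named pipe or TCP on Windows.
    addr, dialer, err := util.GetAddressAndDialer(endpoint)
    if err != nil {
        return nil, err
    }
    // Bound the dial by connectionTimeout.
    ctx, cancel := context.WithTimeout(context.Background(), connectionTimeout)
    defer cancel()

    conn, err := grpc.DialContext(ctx, addr,
        grpc.WithInsecure(), // no transport security on the local endpoint
        grpc.WithContextDialer(dialer),
        grpc.WithDefaultCallOptions(grpc.MaxCallRecvMsgSize(maxMsgSize)),
    )
    if err != nil {
        return nil, err
    }

    return &remoteRuntimeService{
        timeout:       connectionTimeout,
        runtimeClient: runtimeapi.NewRuntimeServiceClient(conn), // gRPC stub
    }, nil
}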
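
The first step, turning an endpoint string into a protocol and an address, can be sketched with the standard library alone. This is an illustration, not the kubelet's helper: `parseEndpoint` here is a local function written for this example, and the endpoint values in `main` are sample inputs only.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseEndpoint splits an endpoint such as
// "unix:///run/containerd/containerd.sock" into a protocol and an address.
func parseEndpoint(endpoint string) (protocol, addr string, err error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", "", err
	}
	switch u.Scheme {
	case "unix":
		return "unix", u.Path, nil // socket path
	case "tcp":
		return "tcp", u.Host, nil // host:port
	case "":
		return "", "", fmt.Errorf("endpoint %q has no scheme", endpoint)
	default:
		return "", "", fmt.Errorf("protocol %q not supported", u.Scheme)
	}
}

func main() {
	for _, ep := range []string{
		"unix:///run/containerd/containerd.sock",
		"tcp://127.0.0.1:3735",
		"/var/run/example.sock", // bare path: rejected by this sketch
	} {
		protocol, addr, err := parseEndpoint(ep)
		fmt.Printf("%q -> protocol=%q addr=%q err=%v\n", ep, protocol, addr, err)
	}
}
```

The resulting pair can be passed to `net.Dial(protocol, addr)`, which is roughly what a custom gRPC dialer does for a Unix socket.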

5.4 Node Status Reporting

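go
// pkg/kubelet/kubelet_node_status.go (abridged)
func (kl *Kubelet) registerWithAPIServer() {
    if kl.registrationCompleted {
        return
    }
    step := 100 * time.Millisecond

    for {
        time.Sleep(step) // exponential back-off, capped at 7s
        step = step * 2
        if step >= 7*time.Second {
            step = 7 * time.Second
        }

        node, err := kl.initialNode(context.TODO()) // build the Node object
        if err != nil {
            continue
        }
        if registered := kl.tryRegisterWithAPIServer(node); registered {
            kl.registrationCompleted = true
            return
        }
    }
}

// syncNodeStatus is called periodically, and once by the startup fast path
// (fastStatusUpdateOnce) as soon as the runtime is up and the Pod CIDR is set.
func (kl *Kubelet) syncNodeStatus() {
    kl.syncNodeStatusMux.Lock()
    defer kl.syncNodeStatusMux.Unlock()

    if kl.kubeClient == nil || kl.heartbeatClient == nil {
        return
    }
    // Register first if needed; returns immediately once registration has completed.
    if kl.registerNode {
        kl.registerWithAPIServer()
    }
    // updateNodeStatus retries tryUpdateNodeStatus up to nodeStatusUpdateRetry times.
    // tryUpdateNodeStatus skips the update when nothing has changed and less than
    // nodeStatusReportFrequency has passed since lastStatusReportTime.
    if err := kl.updateNodeStatus(); err != nil {
        klog.ErrorS(err, "Unable to update node status")
    }
}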
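
The retry timing of the registration loop is easier to see as a list of sleep intervals. The sketch below assumes an initial step of 100ms that doubles up to the 7s cap; `nextStep` and `registrationSchedule` are hypothetical helper names for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// nextStep doubles the sleep interval and caps it at max.
func nextStep(step, max time.Duration) time.Duration {
	step *= 2
	if step > max {
		return max
	}
	return step
}

// registrationSchedule lists the first n sleep intervals, assuming a 100ms
// initial step and a 7s cap.
func registrationSchedule(n int) []time.Duration {
	out := make([]time.Duration, 0, n)
	step := 100 * time.Millisecond
	for i := 0; i < n; i++ {
		out = append(out, step)
		step = nextStep(step, 7*time.Second)
	}
	return out
}

func main() {
	fmt.Println(registrationSchedule(9))
	// [100ms 200ms 400ms 800ms 1.6s 3.2s 6.4s 7s 7s]
}
```

Under these assumptions the loop reaches the cap at the eighth attempt, so a node whose API Server is unreachable retries every 7 seconds from then on.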

5.5 Cgroup Management and Resource Allocation

The ContainerManager implementation holds the CPU Manager, Memory Manager, Device Manager, and Topology Manager as fields:

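go
// pkg/kubelet/cm/container_manager_linux.go (abridged)
type containerManagerImpl struct {
    cgroupManager       CgroupManager           // cgroup operations
    qosContainerManager QOSContainerManager     // QoS-level cgroups
    deviceManager       devicemanager.Manager   // device plugin resources
    cpuManager          cpumanager.Manager      // CPU affinity
    memoryManager       memorymanager.Manager   // memory affinity
    topologyManager     topologymanager.Manager // NUMA alignment across the managers above
}

func (cm *containerManagerImpl) Start(node *v1.Node, activePods ActivePodsFunc,
    sourcesReady config.SourcesReady, podStatusProvider status.PodStatusProvider,
    runtimeService internalapi.RuntimeService) error {
    // 1. Start the CPU Manager (behind the CPUManager feature gate)
    cm.cpuManager.Start(...)

    // 2. Start the Memory Manager (behind the MemoryManager feature gate)
    cm.memoryManager.Start(...)

    // 3. Validate Node Allocatable, then set up the node: setupNode creates the
    //    Node Allocatable cgroup and starts the QoS cgroup manager
    cm.setupNode(activePods)

    // 4. Start the Device Manager
    cm.deviceManager.Start(...)

    // The Topology Manager has no Start method; it acts at Pod admission time
    // by merging hints from the CPU, Memory, and Device Managers.
    return nil
}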

CPU Manager policies

  • none: the default; no special CPU assignment
  • static: allocates exclusive CPU cores to containers in Guaranteed Pods that request an integer number of CPUs
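
The eligibility rule of the static policy can be written as a small function. This is a simplified sketch: `exclusiveCPUs` is a hypothetical name, the QoS class is passed as a plain string, and the CPU request is given in millicores.

```go
package main

import "fmt"

// exclusiveCPUs returns how many CPUs a container would receive exclusively
// under the static policy, or 0 if it stays in the shared pool.
func exclusiveCPUs(qosClass string, cpuRequestMilli int64) int64 {
	if qosClass != "Guaranteed" {
		return 0 // only Guaranteed Pods qualify
	}
	if cpuRequestMilli <= 0 || cpuRequestMilli%1000 != 0 {
		return 0 // fractional requests such as 1500m stay shared
	}
	return cpuRequestMilli / 1000
}

func main() {
	fmt.Println(exclusiveCPUs("Guaranteed", 2000)) // 2
	fmt.Println(exclusiveCPUs("Guaranteed", 1500)) // 0
	fmt.Println(exclusiveCPUs("Burstable", 2000))  // 0
}
```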

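Topology Manager policies

  • none: no NUMA-aware alignment
  • best-effort: prefers a NUMA-aligned placement but admits the Pod even when none exists
  • restricted: requires a NUMA-aligned placement; otherwise the Pod is rejected
  • single-numa-node: all requested resources must come from a single NUMA node; otherwise the Pod is rejected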
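
The difference between the four policies comes down to the final admit decision. The sketch below models only that decision and assumes the best placement hint has already been computed; `hint` and `admit` are hypothetical names, and the real Topology Manager also merges hints from several providers before reaching this point.

```go
package main

import "fmt"

// hint is a simplified placement result: the NUMA nodes an allocation would
// span, and whether that placement is the preferred (narrowest possible) one.
type hint struct {
	numaNodes []int
	preferred bool
}

// admit applies the policy to an already computed best hint.
func admit(policy string, h hint) (bool, error) {
	switch policy {
	case "none", "best-effort":
		return true, nil // never rejects a Pod
	case "restricted":
		return h.preferred, nil
	case "single-numa-node":
		return h.preferred && len(h.numaNodes) == 1, nil
	default:
		return false, fmt.Errorf("unknown policy %q", policy)
	}
}

func main() {
	spread := hint{numaNodes: []int{0, 1}, preferred: false}
	for _, policy := range []string{"none", "best-effort", "restricted", "single-numa-node"} {
		ok, _ := admit(policy, spread)
		fmt.Printf("%s: admit=%v\n", policy, ok)
	}
}
```

With a placement that spans two NUMA nodes and is not preferred, the first two policies admit the Pod and the last two reject it.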

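6. Summary

6.1 Core Design Patterns

| Pattern | Where it is used |
| --- | --- |
| Controller pattern | syncLoop → syncPod (Observe → Diff → Act) |
| Producer-Consumer | PodConfig → syncLoop; PLEG → syncLoop; ProbeManager → syncLoop |
| Desired vs Actual State | VolumeManager (DSW/ASW dual caches) |
| Per-Worker Goroutine | PodWorkers (one goroutine per Pod) |
| Chain of Responsibility | admitHandlers (Eviction → Sysctl → AllocateResources → Predicate) |
| Event Sourcing | PLEG reflects container state changes as a sequence of events |
| Exponential Backoff | Runtime errors, container restarts, image pulls |
| Dependency Injection | The Dependencies struct injects all external dependencies |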
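
As an illustration of the Chain of Responsibility entry, the sketch below runs a Pod through an ordered list of admit handlers and stops at the first rejection. All names (`pod`, `admitResult`, `admitHandler`, `rejectUnderPressure`, `requireFreeCPU`) are hypothetical and far simpler than the kubelet's own handler types.

```go
package main

import "fmt"

// pod carries only the fields the example handlers look at.
type pod struct {
	name     string
	cpuMilli int64
}

// admitResult is an admit decision with a reason for rejections.
type admitResult struct {
	admit  bool
	reason string
}

// admitHandler is one link in the chain.
type admitHandler func(p pod) admitResult

// runAdmitHandlers asks each handler in order and stops at the first rejection.
func runAdmitHandlers(p pod, handlers []admitHandler) admitResult {
	for _, h := range handlers {
		if r := h(p); !r.admit {
			return r
		}
	}
	return admitResult{admit: true}
}

// rejectUnderPressure rejects every Pod while the node is under pressure.
func rejectUnderPressure(underPressure bool) admitHandler {
	return func(pod) admitResult {
		if underPressure {
			return admitResult{reason: "node is under resource pressure"}
		}
		return admitResult{admit: true}
	}
}

// requireFreeCPU rejects Pods that request more CPU than is free.
func requireFreeCPU(freeMilli int64) admitHandler {
	return func(p pod) admitResult {
		if p.cpuMilli > freeMilli {
			return admitResult{reason: "insufficient cpu"}
		}
		return admitResult{admit: true}
	}
}

func main() {
	chain := []admitHandler{rejectUnderPressure(false), requireFreeCPU(1000)}
	fmt.Println(runAdmitHandlers(pod{name: "small", cpuMilli: 500}, chain))
	fmt.Println(runAdmitHandlers(pod{name: "big", cpuMilli: 4000}, chain))
}
```

Because the chain is just a slice, a new admission check is added by appending one more handler.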

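6.2 Key Constants and Configuration

| Constant | Value | Meaning |
| --- | --- | --- |
| plegRelistPeriod | 1s | PLEG relist period |
| plegChannelCapacity | 1000 | PLEG event channel capacity |
| relistThreshold | 3min | PLEG health-check threshold |
| housekeepingPeriod | 2s | Housekeeping (cleanup) loop period |
| backOffPeriod | 10s | Base back-off period |
| MaxContainerBackOff | 300s | Maximum back-off period |
| ContainerGCPeriod | 60s | Container GC period |
| ImageGCPeriod | 300s | Image GC period |
| evictionMonitoringPeriod | 10s | Eviction monitoring period |
| podAttachAndMountTimeout | 2m3s | Timeout waiting for volumes to attach and mount |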

6.3 Threading Model

After startup, the Kubelet runs the following key goroutines:

  1. syncLoop (main loop): never exits
  2. PodWorkers (one per Pod): lives as long as the Pod
  3. PLEG relist (1s period)
  4. syncNodeStatus (every nodeStatusUpdateFrequency)
  5. fastStatusUpdateOnce (fast path at startup)
  6. nodeLeaseController (at the lease renew interval)
  7. updateRuntimeUp (5s period)
  8. containerGC (1min period)
  9. imageGC (5min period)
  10. evictionManager (10s period)
  11. volumeManager reconciler (100ms period)
  12. DSW populator (100ms period)
  13. probeManager workers (at each container probe's PeriodSeconds)
  14. cloudResourceSyncManager
  15. pluginManager.Run
  16. containerLogManager
  17. podKiller (1s period)
  18. shutdownManager

In total there are about 18 kinds of long-running goroutines; together with the per-Pod workers and the per-container probe workers, they make up the Kubelet's concurrency model.
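
Most entries in this list share one shape: call a function, wait one period, repeat until told to stop. A minimal stand-alone version of that loop might look as follows; `runEvery` and `countRuns` are hypothetical names, and the sketch uses a context for cancellation where the kubelet uses its own wait helpers and stop channels.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runEvery calls fn, waits one period, and repeats until ctx is cancelled.
// It blocks, so callers typically start it with the go keyword.
func runEvery(ctx context.Context, period time.Duration, fn func()) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		if ctx.Err() != nil {
			return // stop requested: do not start another round
		}
		fn()
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}

// countRuns lets the loop run n rounds (n must be positive) with a 1ms
// period, cancels it from inside the callback, and reports the number of calls.
func countRuns(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	runs := 0
	runEvery(ctx, time.Millisecond, func() {
		runs++
		if runs == n {
			cancel()
		}
	})
	return runs
}

func main() {
	fmt.Println(countRuns(3)) // 3
}
```

Checking the context before each round ensures that no extra round runs after cancellation, even when a tick is already waiting.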


This document is based on an analysis of the Kubernetes source code; the code excerpts are abridged from the Go source files under cmd/kubelet/ and pkg/kubelet/.
