【星海出品】K8s Scheduler Leader Election

My K8s notes have grown long, so this post covers the kube-scheduler leader on its own.

The scheduler uses the watch mechanism to discover Pods in the cluster that are newly created and not yet scheduled (unscheduled) onto a node.

Because the containers in a Pod, and the Pod itself, can have different requirements, the scheduler filters out any node that fails to meet the Pod's specific scheduling needs.

It finds all feasible nodes for the Pod in the cluster, scores those feasible nodes with a series of functions, and selects the highest-scoring node to run the Pod.
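The filter-then-score flow above can be sketched in plain Go. This is a self-contained illustration, not the real scheduler framework API: the `Node` struct, `filterNodes`, and `pickBest` are simplified stand-ins, and the "score" here is just remaining CPU.

```go
package main

import "fmt"

// Node is a simplified stand-in for a cluster node.
type Node struct {
	Name           string
	AllocatableCPU int64 // millicores
}

// filterNodes keeps only nodes whose allocatable CPU covers the request.
func filterNodes(nodes []Node, cpuRequest int64) []Node {
	var feasible []Node
	for _, n := range nodes {
		if n.AllocatableCPU >= cpuRequest {
			feasible = append(feasible, n)
		}
	}
	return feasible
}

// pickBest scores each feasible node (here: most CPU left over wins)
// and returns the highest-scoring one.
func pickBest(nodes []Node, cpuRequest int64) (Node, bool) {
	if len(nodes) == 0 {
		return Node{}, false
	}
	best := nodes[0]
	for _, n := range nodes[1:] {
		if n.AllocatableCPU-cpuRequest > best.AllocatableCPU-cpuRequest {
			best = n
		}
	}
	return best, true
}

func main() {
	nodes := []Node{{"node-a", 500}, {"node-b", 2000}, {"node-c", 1000}}
	feasible := filterNodes(nodes, 800) // node-a is filtered out
	if best, ok := pickBest(feasible, 800); ok {
		fmt.Println(best.Name) // node-b has the most free CPU
	}
}
```

The real scheduler runs many filter and score plugins and normalizes scores, but the shape is the same: filter to a feasible set, score, take the maximum.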

The scheduler then notifies kube-apiserver of this decision; that step is called binding.

Check that the scheduler is running:

```bash
kubectl get pods -n kube-system | grep kube-scheduler
```

If scheduling fails, inspect the events with kubectl describe pod.

The Pod's record is already stored in etcd, so it has to be removed with a delete:

```bash
kubectl delete pod <pod-name>
```

Running as a static Pod

kube-scheduler runs as a static Pod from the manifests under /etc/kubernetes/manifests, e.g. kube-scheduler.yaml.

```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/kubeconfig
  qps: 100
  burst: 150
profiles:
  - schedulerName: default-scheduler
    plugins:
      postFilter:
        disabled:
          - name: DefaultPreemption
      preFilter:
        enabled:
          - name: CheckCSIStorageCapacity
      filter:
        enabled:
          - name: CheckPodCountLimit
          - name: CheckPodLimitResources
          - name: CheckCSIStorageCapacity
          - name: LvmVolumeCapacity
    pluginConfig:
    - name: CheckPodCountLimit
      args:
        podCountLimit: 2
    - name: CheckPodLimitResources
      args:
        limitRatio:
          cpu: 0.7
          memory: 0.7
```
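CheckPodLimitResources above is a custom plugin from this setup, not an upstream one, so its exact semantics are not documented here. A plausible reading of `limitRatio: 0.7` is that a Pod's requests must stay under 70% of the node's allocatable capacity; a hedged sketch of that core check (function name and units are assumptions):

```go
package main

import "fmt"

// limitRatioCheck mirrors the limitRatio idea from the config above
// (an assumption about the custom plugin's behavior): a Pod's CPU and
// memory requests must not exceed the given fraction of the node's
// allocatable capacity. CPU is in millicores, memory in bytes.
func limitRatioCheck(requestCPU, requestMem, allocCPU, allocMem int64, ratio float64) bool {
	return float64(requestCPU) <= ratio*float64(allocCPU) &&
		float64(requestMem) <= ratio*float64(allocMem)
}

func main() {
	// Node with 4000m CPU and 8Gi memory, ratio 0.7:
	// up to 2800m CPU and ~5.6Gi memory may be requested.
	fmt.Println(limitRatioCheck(2500, 5<<30, 4000, 8<<30, 0.7)) // true
	fmt.Println(limitRatioCheck(3000, 5<<30, 4000, 8<<30, 0.7)) // false: 3000m > 2800m
}
```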

The scheduler does not enable high availability by default; leader election has to be configured manually:

```bash
# Example startup command
kube-scheduler \
  --leader-elect=true \
  --leader-elect-lease-duration=15s \
  --leader-elect-renew-deadline=10s \
  --leader-elect-retry-period=2s
```

Writing your own K8s extension scheduler

Upstream scheduler source for reference:

https://github.com/kubernetes/kubernetes/tree/master/plugin/pkg/scheduler

```go
package main

import (
	"context"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/klog/v2"
)

// schedulerName must match the schedulerName Pods set in their spec.
const schedulerName = "custom-scheduler"

func main() {
	// 1. Configure the Kubernetes client (in-cluster mode).
	// For out-of-cluster mode, use clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	// from k8s.io/client-go/tools/clientcmd instead.
	config, err := rest.InClusterConfig()
	if err != nil {
		klog.Fatal(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	// 2. Run the scheduling logic (example: schedule by CPU resources).
	scheduler := NewCustomScheduler(clientset)
	scheduler.Schedule()
}

type CustomScheduler struct {
	clientset *kubernetes.Clientset
}

func NewCustomScheduler(clientset *kubernetes.Clientset) *CustomScheduler {
	return &CustomScheduler{clientset: clientset}
}

func (s *CustomScheduler) Schedule() {
	// 3. List pending Pods: an empty spec.nodeName means the Pod is unscheduled.
	podList, err := s.clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{
		FieldSelector: "spec.nodeName=",
	})
	if err != nil {
		klog.Fatal(err)
	}

	for _, pod := range podList.Items {
		// Only handle Pods that explicitly request this scheduler.
		if pod.Spec.SchedulerName != schedulerName {
			continue
		}

		// 4. Filter feasible nodes.
		nodes, err := s.clientset.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
		if err != nil {
			klog.Fatal(err)
		}

		var suitableNodes []v1.Node
		for _, node := range nodes.Items {
			if s.isNodeSuitable(pod, node) {
				suitableNodes = append(suitableNodes, node)
			}
		}

		// 5. Pick a node (example: first match; a real scheduler should score).
		if len(suitableNodes) > 0 {
			selectedNode := suitableNodes[0]
			s.bindPodToNode(pod, selectedNode)
		}
	}
}

func (s *CustomScheduler) isNodeSuitable(pod v1.Pod, node v1.Node) bool {
	// Example: check whether the node's allocatable CPU covers the Pod's request.
	if len(pod.Spec.Containers) == 0 {
		return true
	}
	cpuRequest := pod.Spec.Containers[0].Resources.Requests.Cpu()
	if cpuRequest.IsZero() {
		return true // no CPU request; allow by default
	}

	nodeCPU := node.Status.Allocatable.Cpu()
	return nodeCPU.Cmp(*cpuRequest) >= 0
}

func (s *CustomScheduler) bindPodToNode(pod v1.Pod, node v1.Node) {
	// 6. Bind the Pod to the selected node via the Binding subresource.
	binding := &v1.Binding{
		ObjectMeta: metav1.ObjectMeta{
			Name:      pod.Name,
			Namespace: pod.Namespace,
		},
		Target: v1.ObjectReference{
			APIVersion: "v1",
			Kind:       "Node",
			Name:       node.Name,
		},
	}

	err := s.clientset.CoreV1().Pods(pod.Namespace).Bind(context.TODO(), binding, metav1.CreateOptions{})
	if err != nil {
		klog.Errorf("Failed to bind Pod %s to Node %s: %v", pod.Name, node.Name, err)
	} else {
		klog.Infof("Pod %s bound to Node %s", pod.Name, node.Name)
	}
}
```
```dockerfile
# Dockerfile
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN go mod tidy
RUN CGO_ENABLED=0 GOOS=linux go build -o custom-scheduler

FROM alpine:latest
COPY --from=builder /app/custom-scheduler /usr/local/bin/
ENTRYPOINT ["custom-scheduler"]
```

Reference: https://www.qikqiak.com/k8strain/scheduler/overview/

Build the image and push it to an image registry:

```bash
docker build -t your-dockerhub-id/custom-scheduler:v1 .
docker push your-dockerhub-id/custom-scheduler:v1
```

Deploying multiple instances

```yaml
# custom-scheduler-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-scheduler
  namespace: kube-system
spec:
  replicas: 3 # deploy 3 instances
  selector:
    matchLabels:
      app: custom-scheduler
  template:
    metadata:
      labels:
        app: custom-scheduler
    spec:
      serviceAccountName: custom-scheduler # the ServiceAccount from the RBAC manifest below
      containers:
      - name: scheduler
        image: your-dockerhub-id/custom-scheduler:v1
        args:
        # these flags assume the custom binary wires up leader election
        - --leader-elect=true
        - --leader-elect-lease-duration=15s
        - --leader-elect-renew-deadline=10s
        - --leader-elect-retry-period=2s
        - --v=2 # log verbosity
        resources:
          limits:
            cpu: "1"
            memory: 512Mi
          requests:
            cpu: "0.5"
            memory: 256Mi
```

Create a ServiceAccount for the scheduler and bind it to a ClusterRole:

```yaml
# custom-scheduler-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: custom-scheduler
  namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-scheduler-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: custom-scheduler
  namespace: kube-system
```

Register the scheduler as a K8s component

```yaml
# custom-scheduler-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-scheduler-config
  namespace: kube-system
data:
  scheduler-config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1beta3
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: true
      resourceLock: leases
      resourceName: custom-scheduler
      resourceNamespace: kube-system
```
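A ConfigMap like this only takes effect if the scheduler Pod mounts it and is pointed at the file. One way to wire it into the Deployment is sketched below; the `--config` flag, volume name, and mount path are assumptions about how the binary reads its configuration, not values from this setup.

```yaml
# Sketch: mount custom-scheduler-config into the Deployment's Pod template
# and pass the file to the scheduler. Names/paths here are assumptions.
spec:
  template:
    spec:
      containers:
      - name: scheduler
        args:
        - --config=/etc/kubernetes/scheduler-config.yaml
        volumeMounts:
        - name: scheduler-config
          mountPath: /etc/kubernetes
          readOnly: true
      volumes:
      - name: scheduler-config
        configMap:
          name: custom-scheduler-config
```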
```bash
kubectl get pods -n kube-system | grep custom-scheduler
```

Create a Pod that uses the custom scheduler

```yaml
# custom-scheduled-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: custom-scheduled-pod
spec:
  schedulerName: custom-scheduler # use the custom scheduler
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: "500m"
```

Check which node the Pod landed on:

```bash
kubectl get pod custom-scheduled-pod -o wide
```

Monitoring integration
