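Cluster scheduling in k8s

scheduler: responsible for scheduling, that is, assigning each pod to a node. It works in two stages:

Predicate policy (filtering)

Priority policy (scoring)

List-watch

Components in a k8s cluster cooperate through the list-watch mechanism, which keeps their data in sync while keeping the components decoupled from one another.

kubectl reads its configuration file (kubeconfig) and sends the command to the apiserver; the apiserver then notifies the other components through their watches.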

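Scheduling process and policies:

The scheduler assigns pods to the nodes of the k8s cluster, and it has to address the following concerns:

1. Fairness: every node gets a chance to be allocated resources.

2. Efficient resource use: the cluster's resources should be used as fully as possible.

3. Efficiency: scheduling must perform well and finish large batches of pods quickly.

4. Flexibility: users can control and change the scheduling logic to suit their own needs.

The scheduler runs as a separate program. Once started, it keeps watching the apiserver for pods whose spec.nodeName field is still empty.

For each such pod it creates a binding, which records the node the pod should be deployed to.

Choosing a node involves two policies applied in order: the predicate policy first, then the priority policy. A node must get through both steps to be chosen; if no node passes the predicates, the pod cannot be scheduled.

In other words, the node a pod is deployed to must satisfy both the predicate policy and the priority policy.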

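Predicate policy

Predicates are a set of built-in algorithms that select candidate nodes (they ship with the scheduler and need no manual intervention).

1. PodFitsResources: checks whether the node's remaining resources satisfy what the pod requests, mainly CPU and memory.

2. PodFitsHost: if the pod specifies a node name (for example, nginx1 must be deployed on node1), checks that the node's name matches the name the pod specifies. Only then can the pod be scheduled there.

3. PodSelectorMatches: when a pod is created with a node selector, checks whether the node carries the specified labels and whether their values match.

4. NoDiskConflict: makes sure the volumes already mounted on the node do not conflict with the pod's volumes, unless both are mounted read-only.

If no node satisfies the predicates, the pod stays in the Pending state and scheduling is retried until some node qualifies.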
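The filtering step can be pictured with a small simulation. This is only a sketch under simplifying assumptions, not kube-scheduler's code: the `Node` and `Pod` classes, their fields, and the three check functions are hypothetical stand-ins for PodFitsResources, PodFitsHost, and PodSelectorMatches.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:  # hypothetical model of a node
    name: str
    free_cpu_m: int   # CPU not yet requested, in millicores
    free_mem_mi: int  # memory not yet requested, in MiB
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class Pod:  # hypothetical model of a pod
    name: str
    cpu_m: int        # requested CPU, in millicores
    mem_mi: int       # requested memory, in MiB
    node_name: Optional[str] = None
    node_selector: Dict[str, str] = field(default_factory=dict)


def fits_resources(pod: Pod, node: Node) -> bool:
    # Stand-in for PodFitsResources.
    return pod.cpu_m <= node.free_cpu_m and pod.mem_mi <= node.free_mem_mi


def fits_host(pod: Pod, node: Node) -> bool:
    # Stand-in for PodFitsHost: only relevant when the pod names a node.
    return pod.node_name is None or pod.node_name == node.name


def selector_matches(pod: Pod, node: Node) -> bool:
    # Stand-in for PodSelectorMatches: every selector label must be present and equal.
    return all(node.labels.get(k) == v for k, v in pod.node_selector.items())


def feasible_nodes(pod: Pod, nodes: List[Node]) -> List[str]:
    checks = (fits_resources, fits_host, selector_matches)
    return [n.name for n in nodes if all(check(pod, n) for check in checks)]


nodes = [
    Node("node1", free_cpu_m=500, free_mem_mi=512, labels={"test": "c"}),
    Node("node2", free_cpu_m=2000, free_mem_mi=4096, labels={"test": "c"}),
    Node("node3", free_cpu_m=2000, free_mem_mi=4096),
]
pod = Pod("nginx1", cpu_m=1000, mem_mi=1024, node_selector={"test": "c"})
print(feasible_nodes(pod, nodes))  # ['node2']
```

node1 is too small and node3 lacks the label, so only node2 survives. An empty result corresponds to the pod staying Pending.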

node1 node2 node3

Suppose all three nodes above pass the predicate policy. What happens then? The priority policy decides.

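Priority policy

1. LeastRequestedPriority: computes each node's CPU and memory utilization to determine the node's weight. In the scheduler this is based on the resources that pods have requested, not on live usage.

The lower a node's utilization, the higher its weight. Scheduling therefore favors nodes with lower utilization, which spreads the load sensibly.

2. BalancedResourceAllocation: weights nodes by how close their CPU utilization and memory utilization are to each other. The closer the two rates, the higher the weight.

It is used together with LeastRequestedPriority above.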

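node1 CPU and memory utilization: 20:60

node2 CPU and memory utilization: 50:50

Under this rule node2 is preferred the next time a pod is scheduled, because its CPU and memory utilization are equal.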
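To make the node1/node2 comparison concrete, here is a simplified scoring sketch. The formulas are assumptions chosen for illustration (scores from 0 to 100, both priorities weighted equally); the real scheduler's formulas differ between versions.

```python
def least_requested_score(cpu_pct: int, mem_pct: int) -> int:
    """More unrequested CPU and memory gives a higher score (0-100)."""
    return ((100 - cpu_pct) + (100 - mem_pct)) // 2


def balanced_allocation_score(cpu_pct: int, mem_pct: int) -> int:
    """The closer CPU and memory utilization are, the higher the score (0-100)."""
    return 100 - abs(cpu_pct - mem_pct)


# Utilization figures from the example above: (CPU %, memory %).
utilization = {"node1": (20, 60), "node2": (50, 50)}

totals = {
    name: least_requested_score(cpu, mem) + balanced_allocation_score(cpu, mem)
    for name, (cpu, mem) in utilization.items()
}
best = max(totals, key=totals.get)

print(totals)  # {'node1': 120, 'node2': 150}
print(best)    # node2
```

In this sketch node1 actually scores higher on the least-requested rule alone (60 against 50), but node2's perfectly balanced utilization gives it the higher combined score.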

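3. ImageLocalityPriority: checks whether the node already has the images the pod needs. The more of the required images are already present on the node (the scheduler weighs them by size), the higher the weight.

All of these policies are algorithms built into the scheduler: the predicate policy selects the nodes a pod can be deployed to, and the priority policy then picks the best of them.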
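Put together, the two stages form a filter-then-score pipeline. The sketch below is a generic illustration, not the scheduler's implementation: `schedule`, `enough_cpu`, `has_image`, and the dictionary fields are all hypothetical names.

```python
from typing import Callable, List, Optional, Tuple

Predicate = Callable[[dict, dict], bool]  # (pod, node) -> can the pod run here?
Priority = Callable[[dict, dict], int]    # (pod, node) -> score, higher is better


def schedule(pod: dict, nodes: List[dict], predicates: List[Predicate],
             priorities: List[Tuple[Priority, int]]) -> Optional[str]:
    # Stage 1: keep only the nodes that pass every predicate.
    feasible = [n for n in nodes if all(check(pod, n) for check in predicates)]
    if not feasible:
        return None  # nothing fits: the pod would stay Pending

    # Stage 2: rank the survivors by their weighted priority scores.
    def total(node: dict) -> int:
        return sum(weight * score(pod, node) for score, weight in priorities)

    return max(feasible, key=total)["name"]


def enough_cpu(pod: dict, node: dict) -> bool:
    return pod["cpu"] <= node["free_cpu"]


def has_image(pod: dict, node: dict) -> int:
    # Crude stand-in for image locality: 1 if the image is already cached.
    return 1 if pod["image"] in node["images"] else 0


nodes = [
    {"name": "node1", "free_cpu": 4, "images": {"nginx:1.22"}},
    {"name": "node2", "free_cpu": 4, "images": set()},
    {"name": "node3", "free_cpu": 1, "images": {"nginx:1.22"}},
]
pod = {"name": "nginx", "cpu": 2, "image": "nginx:1.22"}

print(schedule(pod, nodes, [enough_cpu], [(has_image, 1)]))  # node1
```

node3 is filtered out for lack of CPU, and node1 beats node2 because it already holds the image.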

Specifying a node

When a node is specified by name, the scheduler's policies are skipped. This is a forced match.


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.22
        name: nginx
      nodeName: node01
# nodeName pins the pod to the node with this name

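Specifying a label:

Selecting a node by label does go through the scheduler. If no node satisfies the condition, the pod enters the Pending state until a node does.

View labels:

kubectl get nodes --show-labels

Add a custom label:

kubectl label nodes node02 test=c

Delete a label:

kubectl label nodes master01 test-

Overwrite a label:

kubectl label nodes node02 test=6 --overwrite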


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        test: c
      containers:
      - image: nginx:1.22
        name: nginx

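Affinity

Affinity comes in two kinds: node affinity and pod affinity.

Each has a soft policy and a hard policy.

Node affinity:

preferredDuringSchedulingIgnoredDuringExecution (soft policy):

When choosing a node, the pod declares that it would prefer to run on node01. The soft policy tries to satisfy this, but the pod is not guaranteed to land on node01.

requiredDuringSchedulingIgnoredDuringExecution (hard policy):

When choosing a node, the pod declares node01 as a hard policy. The condition must be satisfied, so the pod must be deployed on node01. This is a mandatory requirement.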

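Pod affinity:

preferredDuringSchedulingIgnoredDuringExecution (soft policy):

Asks the scheduler to place the pod on a node that matches its affinity with other pods. The match may or may not happen; the scheduler does its best.

requiredDuringSchedulingIgnoredDuringExecution (hard policy):

Requires the scheduler to place the pod on a node that matches its affinity with other pods. For example:

pod nginx1 runs on node01

pod nginx2 must match the affinity with nginx1, so it can only go to the node that satisfies the affinity, which is node01.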
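The nginx1/nginx2 case can be sketched as follows. This is a simplified model that assumes each node is its own topology domain; the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List


def nodes_allowed_by_pod_affinity(required_labels: Dict[str, str],
                                  running_pods: List[dict]) -> List[str]:
    """Return the nodes that already run a pod carrying all required labels."""
    return sorted({
        p["node"]
        for p in running_pods
        if all(p["labels"].get(k) == v for k, v in required_labels.items())
    })


running_pods = [
    {"name": "nginx1", "node": "node01", "labels": {"app": "nginx1"}},
    {"name": "redis", "node": "node02", "labels": {"app": "redis"}},
]

# nginx2 requires affinity with pods labelled app=nginx1.
print(nodes_allowed_by_pod_affinity({"app": "nginx1"}, running_pods))  # ['node01']

# No running pod carries app=mysql, so under a hard policy the new pod would stay Pending.
print(nodes_allowed_by_pod_affinity({"app": "mysql"}, running_pods))   # []
```

Under a soft policy the same match would only raise node01's score rather than exclude the other nodes.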

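Operators for keys and values:

Labels: affinity is always selected on the basis of labels.

In: the node's label value is one of the listed values.

NotIn: the node's label value is not one of the listed values.

Gt: greater than; the node's label value is greater than the given value.

Lt: less than; the node's label value is less than the given value.

Exists: selects objects that have the label key; the value is not considered.

DoesNotExist: selects objects that do not have the specified label; the value is not considered.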
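The operators can be summarized as a small matching function. This is an illustrative sketch, not the Kubernetes implementation; `matches` is a hypothetical helper. Treating a missing key as a match for NotIn is an assumption based on how Kubernetes label selectors are generally described.

```python
from typing import Dict, List, Optional


def matches(labels: Dict[str, str], key: str, operator: str,
            values: Optional[List[str]] = None) -> bool:
    values = values or []
    present = key in labels
    if operator == "Exists":
        return present
    if operator == "DoesNotExist":
        return not present
    if operator == "In":
        return present and labels[key] in values
    if operator == "NotIn":
        # Assumption: a node without the key also counts as "not in".
        return not present or labels[key] not in values
    if operator in ("Gt", "Lt"):
        if not present:
            return False
        try:
            actual, bound = int(labels[key]), int(values[0])
        except (ValueError, IndexError):
            return False  # Gt and Lt only make sense for integer values
        return actual > bound if operator == "Gt" else actual < bound
    raise ValueError(f"unknown operator: {operator}")


print(matches({"test": "c"}, "test", "In", ["c"]))        # True
print(matches({"test": "1000"}, "test", "Gt", ["900"]))   # True
print(matches({"test": "1000"}, "test", "Lt", ["1201"]))  # True
print(matches({"test": "c"}, "test", "Gt", ["900"]))      # False
print(matches({}, "test", "DoesNotExist"))                # True
```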

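Node affinity:

Soft policy: when several soft policies are given, each has a weight, and nodes matching the higher-weighted ones are favored.

If the hard policy cannot be satisfied, the pod stays in the Pending state.

Hard policy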


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.22
        name: nginx
      affinity:
        # choose affinity-based placement
        nodeAffinity:
          # this is node affinity
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            # the terms of the hard policy: which node labels must match
            - matchExpressions:
              # one expression describing the nodes to select
              - key: test
                operator: In
                # the operator applied to the label; with Exists or DoesNotExist the values field must not be used
                values:
                - c

NotIn


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.22
        name: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: test
                operator: NotIn
                values:
                - c


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.22
        name: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: test
                operator: Gt
                values:
                - "900"

Gt and Lt can only compare integer values.


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.22
        name: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: test
                operator: Lt
                values:
                - "1201"

Exists


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.22
        name: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: test
                operator: Exists

DoesNotExist


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.22
        name: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: test
                operator: DoesNotExist

Soft policy


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.22
        name: nginx
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: memory
                operator: DoesNotExist

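weight: when there are several soft policies, the one with the larger weight counts for more (valid values are 1 to 100). For each node the scheduler adds up the weights of the preferences that node matches and favors the node with the highest total.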
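How weights combine can be shown with a short sketch. It assumes the scheduler sums the weights of every soft preference a node matches, which is how the field is generally documented. The label keys and weights are invented for illustration, and only simple equality matches are modelled.

```python
from typing import Dict, List, Tuple

# (weight, label key, label value): a simplified soft preference.
Preference = Tuple[int, str, str]


def preferred_score(node_labels: Dict[str, str],
                    preferences: List[Preference]) -> int:
    """Sum the weights of all preferences that the node's labels satisfy."""
    return sum(w for w, k, v in preferences if node_labels.get(k) == v)


preferences = [(50, "disk", "ssd"), (30, "zone", "a"), (30, "net", "10g")]
nodes = {
    "node01": {"disk": "ssd"},
    "node02": {"zone": "a", "net": "10g"},
}

scores = {name: preferred_score(labels, preferences)
          for name, labels in nodes.items()}
print(scores)  # {'node01': 50, 'node02': 60}
```

node02 comes out ahead even though node01 matches the single heaviest preference, because two matched preferences add up to more. The affinity total is also only one input to the node's final score.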

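Interview question:

What policy do you choose when deploying pods?

Node affinity: when node performance is uneven, use the soft policy so that more pods land on the higher-performance nodes.

When a node has failed or is under maintenance, only the hard policy will do, because it excludes that node outright.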