9. Redis Operator (2): Sentinel Deployment

0. Introduction

In the previous post we used a standalone Redis deployment to learn the basics of writing an Operator. Today we build on that and deploy Redis in Sentinel mode.

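Sentinel exists to make a Redis deployment highly available. Broadly, there are three ways to achieve high availability:

  • Redis Sentinel: an odd number (at least three) of Sentinel nodes monitor the master. When the master fails, they promote one of the remaining replicas to master, so the deployment keeps running. Operational complexity is low, which suits small and medium deployments;
  • Redis Cluster: data is sharded across multiple masters by hash slot, each shard has its own master and replicas, and the masters share the write load. It is a decentralized design suited to medium and large deployments. Replication inside a shard is asynchronous, so it does not guarantee strong consistency;
  • Third-party VIP schemes: the deployment is exposed through a virtual IP, and an internal monitor continuously decides which node serves as master (reads and writes) and which nodes serve as replicas (reads).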
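
The "odd number, at least three" rule for Sentinel follows from majority arithmetic. Below is a minimal sketch, assuming that authorizing a failover needs agreement from more than half of the Sentinels; the function names are illustrative and not part of any Redis API.

```go
package main

// majority returns how many Sentinels must agree before a failover
// can be authorized: strictly more than half of all Sentinels.
func majority(sentinels int) int {
	return sentinels/2 + 1
}

// toleratedFailures returns how many Sentinels may be down while the
// remaining ones can still form a majority.
func toleratedFailures(sentinels int) int {
	return sentinels - majority(sentinels)
}
```

With 3 Sentinels the majority is 2, so one Sentinel may fail. A fourth Sentinel raises the majority to 3 and still tolerates only one failure, which is why odd counts are preferred.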

Of these, Redis Sentinel fits our learning cluster best (which, admittedly, runs on a single host). The overall design looks like this:

graph LR
    %% Client layer
    Client1([Client<br/>App1])
    Client2([Client<br/>App2])
    ClientN([Client group<br/>AppN])

    %% Sentinel layer
    subgraph Sentinel Cluster
        direction LR
        S1[Sentinel node 1]
        S2[Sentinel node 2]
        S3[Sentinel node 3]
        S1<-->|Gossip protocol<br/>PING/PONG|S2
        S2<-->|Gossip protocol<br/>PING/PONG|S3
        S1<-->|Gossip protocol<br/>PING/PONG|S3
    end

    %% Redis data layer
    subgraph Redis Data Nodes
        direction BT
        Master([Master node<br/>Master])
        Slave1([Replica 1<br/>Slave])
        Slave2([Replica 2<br/>Slave])
        Master==Master-replica replication<br/>SYNC command==>Slave1
        Master==Master-replica replication<br/>SYNC command==>Slave2
    end

    %% Monitoring
    S1-.-|Heartbeat<br/>PING every 1s|Master
    S2-.-|Heartbeat<br/>PING every 1s|Master
    S3-.-|Heartbeat<br/>PING every 1s|Master

    %% Client access paths
    Client1-->|1. Query master address|S1
    S1-->|2. Return Master address|Client1
    Client1==>|3. Direct read/write|Master
    Client2==>|1. Query replica address|S2
    S2-->|2. Return Slave address|Client2
    Client2==>|3. Read-only access|Slave1

    %% Failover path
    S1===|Elect leader Sentinel|S2
    S2===|Perform failover|Slave1
    Slave1-.->|Promoted to new Master|Master

Our example implements the design in the diagram above: three Sentinel nodes and three Redis nodes.
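
The client access path in the diagram (ask a Sentinel for the master, then connect to it) corresponds to one Sentinel command, SENTINEL get-master-addr-by-name. The sketch below uses only the standard library to show how that command is framed in RESP and how the two-element reply is parsed. The master name "mymaster" and the sample address are assumptions for illustration, and the function names are hypothetical. A real client would send the request over a TCP connection to a Sentinel (port 26379 here) and would normally rely on a Redis client library rather than hand-written parsing.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encodeCommand serializes a command as a RESP array of bulk strings.
func encodeCommand(args ...string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "*%d\r\n", len(args))
	for _, a := range args {
		fmt.Fprintf(&b, "$%d\r\n%s\r\n", len(a), a)
	}
	return b.String()
}

// parseMasterAddr parses the reply to SENTINEL get-master-addr-by-name:
// a two-element array of bulk strings holding the master's host and port.
// Any other reply (for example the null array "*-1" for an unknown
// master name) is reported as an error.
func parseMasterAddr(reply string) (string, error) {
	parts := strings.Split(reply, "\r\n")
	// Expected pieces: "*2", "$<n>", host, "$<m>", port, "" (after the last CRLF).
	if len(parts) != 6 || parts[0] != "*2" || parts[5] != "" {
		return "", fmt.Errorf("unexpected reply %q", reply)
	}
	for i := 1; i <= 3; i += 2 {
		if !strings.HasPrefix(parts[i], "$") {
			return "", fmt.Errorf("expected bulk string header, got %q", parts[i])
		}
		n, err := strconv.Atoi(parts[i][1:])
		if err != nil || n != len(parts[i+1]) {
			return "", fmt.Errorf("bad bulk string length in %q", reply)
		}
	}
	return parts[2] + ":" + parts[4], nil
}
```

Calling encodeCommand("SENTINEL", "get-master-addr-by-name", "mymaster") yields the bytes to send, and parseMasterAddr turns the Sentinel's answer into a host:port string that the client then connects to for reads and writes.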

1. Development Environment

The development environment is the same as in the previous post, except that the kind cluster configuration changes as follows:

yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  # forward port 80 on the host to NodePort 30950 on this node
  extraPortMappings:
  - containerPort: 30950
    hostPort: 80
    # optional: set the bind address on the host
    # 0.0.0.0 is the current default
    listenAddress: "127.0.0.1"
    # optional: set the protocol to one of TCP, UDP, SCTP.
    # TCP is the default
    protocol: TCP
  - containerPort: 30999
    hostPort: 6378
    protocol: TCP
  - containerPort: 31000
    hostPort: 6379
    protocol: TCP
  - containerPort: 31001
    hostPort: 26379
    protocol: TCP
- role: worker
  extraMounts:
  # map a host directory into the node container
  - hostPath: /Users/chenyiguo/workspace/k8s/kind/multi_data/worker1
    containerPath: /data
  labels:
    iguochan.io/redis-node: redis1
- role: worker
  extraMounts:
  # map a host directory into the node container
  - hostPath: /Users/chenyiguo/workspace/k8s/kind/multi_data/worker2
    containerPath: /data
  labels:
    iguochan.io/redis-node: redis2
- role: worker
  extraMounts:
  # map a host directory into the node container
  - hostPath: /Users/chenyiguo/workspace/k8s/kind/multi_data/worker3
    containerPath: /data
  labels:
    iguochan.io/redis-node: redis3

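Once the Sentinel deployment is up we will verify it with redis-cli, so we also expose a port on which the master node accepts writes. (This is not the proper way to do it: a client should ask the Sentinels for the master's address and then connect to that address.) Host port 6378 therefore serves as the master access port.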

In addition, host port 6379 is the Redis read port, and host port 26379 is the Sentinel port.
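
To keep the three ports straight, the following sketch tabulates each host port from the kind configuration next to the NodePort it forwards to (the same values used as defaults in the CRD later in this post). The type and function names are illustrative only.

```go
package main

// portMapping ties a host port from the kind config to the NodePort it
// forwards to and the purpose described in the text.
type portMapping struct {
	HostPort int
	NodePort int
	Purpose  string
}

// mappings lists the three Redis-related extraPortMappings.
var mappings = []portMapping{
	{HostPort: 6378, NodePort: 30999, Purpose: "redis master (read/write)"},
	{HostPort: 6379, NodePort: 31000, Purpose: "redis read"},
	{HostPort: 26379, NodePort: 31001, Purpose: "sentinel"},
}

// nodePortFor returns the NodePort behind a host port, or false when
// the host port is not mapped.
func nodePortFor(hostPort int) (int, bool) {
	for _, m := range mappings {
		if m.HostPort == hostPort {
			return m.NodePort, true
		}
	}
	return 0, false
}
```

For example, a redis-cli session started with -p 6378 on the host reaches NodePort 30999, which is the master-only Service.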

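Besides that, each node gets a distinct label so that the pods of the StatefulSets we create later are scheduled onto their corresponding nodes. Each node has its own storage and configuration, and Redis is not a stateless service, so as far as I understand it, each pod should land on its matching machine. (I have not thought this through very deeply; if you have a better approach, please share it in the comments.)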
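
As a small illustration of how a pod ordinal can be tied to one of these node labels, the sketch below mirrors the expression used for PV node affinity in the controller code later in this post (`i%3+1`). The function name is hypothetical, and the modulus 3 assumes exactly the three labeled workers from the kind configuration above.

```go
package main

import "fmt"

// redisNodeLabelValue returns the value of the iguochan.io/redis-node
// label that the volume for the pod with the given ordinal is pinned to.
// Ordinals wrap around after three workers: 0 -> redis1, 1 -> redis2,
// 2 -> redis3, 3 -> redis1, and so on.
func redisNodeLabelValue(ordinal int) string {
	return fmt.Sprintf("redis%d", ordinal%3+1)
}
```

Because the mapping wraps around, running more than three replicas places several volumes on the same worker.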

2. Operator Development

2.1 Creating the API

We create the API on top of the existing project:

bash
$ kubebuilder create api --group cache --version v1 --kind RedisSentinel

2.2 Implementing the Controller

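First we need to settle on a design, shown in the diagram below. A RedisSentinel CR manages the whole deployment: three separate Services expose the master port, the read port, and the Sentinel port mentioned above, and StatefulSets manage the Redis and Sentinel pods.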

graph TD
    subgraph Kubernetes
        Operator[Operator controller] -->|manages| RedisSentinelCR[RedisSentinel CR]
        RedisSentinelCR -->|creates| RedisCluster[Redis nodes]
        RedisSentinelCR -->|creates| SentinelCluster[Sentinel nodes]

        subgraph Redis Nodes
            RedisMaster[Redis Master]
            RedisSlave1[Redis Slave 1]
            RedisSlave2[Redis Slave 2]
            RedisMaster -->|replication| RedisSlave1
            RedisMaster -->|replication| RedisSlave2
        end

        subgraph Sentinel Nodes
            Sentinel1[Sentinel 1]
            Sentinel2[Sentinel 2]
            Sentinel3[Sentinel 3]
            Sentinel1 -->|monitors| RedisMaster
            Sentinel2 -->|monitors| RedisMaster
            Sentinel3 -->|monitors| RedisMaster
        end

        Services[Exposed Services]
        Services --> RedisMasterService[Master Service:6378]
        Services --> RedisSlaveService[Slave Service:6379]
        Services --> SentinelService[Sentinel Service:26379]
        RedisMasterService --> RedisMaster
        RedisSlaveService --> RedisSlave1 & RedisSlave2
        SentinelService --> Sentinel1 & Sentinel2 & Sentinel3
    end

    Client[Client application] -->|write requests| RedisMasterService
    Client -->|read requests| RedisSlaveService
    Client -->|query master| SentinelService
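
Every object in this design is named after the CR. The sketch below collects the naming scheme that the controller code in this post uses (for example `<name>-redis-master` for the master-only Service) together with the DNS name of the initial master pod. The struct and function names are illustrative and do not appear in the operator itself.

```go
package main

import "fmt"

// resourceNames lists the objects derived from one RedisSentinel CR.
type resourceNames struct {
	RedisStatefulSet        string
	RedisHeadlessService    string
	RedisService            string // NodePort Service selecting all Redis pods (reads)
	RedisMasterService      string // NodePort Service selecting only the master (writes)
	SentinelStatefulSet     string
	SentinelHeadlessService string
	SentinelService         string
}

// namesFor derives all object names from the CR name.
func namesFor(cr string) resourceNames {
	return resourceNames{
		RedisStatefulSet:        cr + "-redis",
		RedisHeadlessService:    cr + "-redis-headless",
		RedisService:            cr + "-redis",
		RedisMasterService:      cr + "-redis-master",
		SentinelStatefulSet:     cr + "-sentinel",
		SentinelHeadlessService: cr + "-sentinel-headless",
		SentinelService:         cr + "-sentinel",
	}
}

// initialMasterHost returns the stable DNS name of pod 0 of the Redis
// StatefulSet, which the generated configuration treats as the first master.
func initialMasterHost(cr, namespace string) string {
	return fmt.Sprintf("%s-redis-0.%s-redis-headless.%s.svc.cluster.local", cr, cr, namespace)
}
```

The headless Services are what give each StatefulSet pod a stable DNS name, which is why the initial master can be addressed before any failover has happened.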

2.2.1 Defining the CRD

go
/*
Copyright 2025.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1

import (
    "k8s.io/apimachinery/pkg/api/resource"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EDIT THIS FILE!  THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required.  Any new fields you add must have json tags for the fields to be serialized.

// RedisSentinelSpec defines the desired state of RedisSentinel
type RedisSentinelSpec struct {
    // Image: Redis Image
    // +kubebuilder:default="redis:7.0"
    Image string `json:"image,omitempty"`

    // SentinelImage: Sentinel Image
    // +kubebuilder:default="redis:7.0"
    SentinelImage string `json:"sentinelImage,omitempty"`

    // RedisReplicas: Number of Redis instances
    // +kubebuilder:default=3
    RedisReplicas int32 `json:"redisReplicas,omitempty"`

    // SentinelReplicas: Number of Sentinel instances
    // +kubebuilder:default=3
    SentinelReplicas int32 `json:"sentinelReplicas,omitempty"`

    // MasterNodePort: NodePort of the master-only Redis Service (writes)
    // +kubebuilder:validation:Minimum=30000
    // +kubebuilder:validation:Maximum=32767
    // +kubebuilder:default=30999
    MasterNodePort int32 `json:"masterNodePort,omitempty"`

    // NodePort: NodePort of the Redis Service that selects all Redis pods (reads)
    // +kubebuilder:validation:Minimum=30000
    // +kubebuilder:validation:Maximum=32767
    // +kubebuilder:default=31000
    NodePort int32 `json:"nodePort,omitempty"`

    // SentinelNodePort: Sentinel NodePort for external access
    // +kubebuilder:validation:Minimum=30000
    // +kubebuilder:validation:Maximum=32767
    // +kubebuilder:default=31001
    SentinelNodePort int32 `json:"sentinelNodePort,omitempty"`

    // Storage configuration
    Storage RedisSentinelStorageSpec `json:"storage,omitempty"`
}

// RedisSentinelStorageSpec defines storage configuration for RedisSentinel
type RedisSentinelStorageSpec struct {
    // Storage size
    // +kubebuilder:default="1Gi"
    Size resource.Quantity `json:"size,omitempty"`

    // Host path directory
    // +kubebuilder:default="/data"
    HostPath string `json:"hostPath,omitempty"`
}

// RedisSentinelStatus defines the observed state of RedisSentinel
type RedisSentinelStatus struct {
    // Deployment phase
    Phase RedisPhase `json:"phase,omitempty"`

    // Redis endpoint
    Endpoint string `json:"endpoint,omitempty"`

    // Sentinel endpoint
    SentinelEndpoint string `json:"sentinelEndpoint,omitempty"`

    // Master node name
    Master string `json:"master,omitempty"`

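    // LastRoleUpdateTime: time of the most recent role update (presumably set when the pod role labels are refreshed)
    LastRoleUpdateTime metav1.Time `json:"lastRoleUpdateTime,omitempty"`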
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
//+kubebuilder:resource:path=redissentinels,scope=Namespaced,shortName=rss
//+kubebuilder:printcolumn:JSONPath=".status.phase",name=phase,type=string,description="Current phase"
//+kubebuilder:printcolumn:name="RedisEndpoint",type="string",JSONPath=".status.endpoint",description="Redis endpoint"
//+kubebuilder:printcolumn:name="SentinelEndpoint",type="string",JSONPath=".status.sentinelEndpoint",description="Sentinel endpoint"
//+kubebuilder:printcolumn:name="Image",type="string",JSONPath=".spec.image",description="Redis image"
//+kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp",description="Creation time"

// RedisSentinel is the Schema for the redissentinels API
type RedisSentinel struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   RedisSentinelSpec   `json:"spec,omitempty"`
    Status RedisSentinelStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// RedisSentinelList contains a list of RedisSentinel
type RedisSentinelList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []RedisSentinel `json:"items"`
}

func init() {
    SchemeBuilder.Register(&RedisSentinel{}, &RedisSentinelList{})
}

The code above defines the RedisSentinel CRD. Compared with RedisStandalone it has quite a few more fields, including the Sentinel image and the three ports mentioned above.
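
To make the effect of the kubebuilder markers concrete, here is a standalone sketch that applies the same defaults and the same 30000-32767 range check to a simplified copy of the spec. The `spec` type and both functions are hypothetical stand-ins: in the real operator the API server applies the defaults and enforces the range from the generated CRD schema. The duplicate-port check is an extra assumption that the markers do not express.

```go
package main

import "fmt"

// spec mirrors a few RedisSentinelSpec fields with plain Go types.
type spec struct {
	RedisReplicas    int32
	SentinelReplicas int32
	MasterNodePort   int32
	NodePort         int32
	SentinelNodePort int32
}

// withDefaults fills zero-valued fields with the defaults declared by
// the +kubebuilder:default markers.
func withDefaults(s spec) spec {
	if s.RedisReplicas == 0 {
		s.RedisReplicas = 3
	}
	if s.SentinelReplicas == 0 {
		s.SentinelReplicas = 3
	}
	if s.MasterNodePort == 0 {
		s.MasterNodePort = 30999
	}
	if s.NodePort == 0 {
		s.NodePort = 31000
	}
	if s.SentinelNodePort == 0 {
		s.SentinelNodePort = 31001
	}
	return s
}

// validate checks that each NodePort lies in 30000-32767 and that no
// two Services ask for the same NodePort.
func validate(s spec) error {
	ports := []struct {
		name string
		port int32
	}{
		{"masterNodePort", s.MasterNodePort},
		{"nodePort", s.NodePort},
		{"sentinelNodePort", s.SentinelNodePort},
	}
	seen := map[int32]string{}
	for _, p := range ports {
		if p.port < 30000 || p.port > 32767 {
			return fmt.Errorf("%s=%d is outside 30000-32767", p.name, p.port)
		}
		if other, dup := seen[p.port]; dup {
			return fmt.Errorf("%s and %s both use port %d", other, p.name, p.port)
		}
		seen[p.port] = p.name
	}
	return nil
}
```

An empty spec therefore ends up with three Redis replicas, three Sentinels, and NodePorts 30999, 31000, and 31001, which line up with the kind port mappings from section 1.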

2.2.2 Implementing the controller

go
/*
Copyright 2025.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package controller

import (
    "context"
    "fmt"
    "strings"
    "time"

    cachev1 "github.com/IguoChan/redis-operator/api/v1"
    "github.com/go-redis/redis"
    appsv1 "k8s.io/api/apps/v1"
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/errors"
    "k8s.io/apimachinery/pkg/api/resource"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/apimachinery/pkg/util/intstr"
    "k8s.io/client-go/tools/record"
    "k8s.io/utils/pointer"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/log"
)

// RedisSentinelReconciler reconciles a RedisSentinel object
type RedisSentinelReconciler struct {
    client.Client
    Scheme   *runtime.Scheme
    Recorder record.EventRecorder
}

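const (
    // MasterPort is the port of the master-only Service; SentinelPort is the Sentinel port.
    // RedisPort, used below, is not declared in this file; it is presumably the constant
    // defined in this package by the standalone controller from the previous post.
    MasterPort   = 6378
    SentinelPort = 26379
)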

//+kubebuilder:rbac:groups=cache.iguochan.io,resources=redissentinels,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=cache.iguochan.io,resources=redissentinels/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=cache.iguochan.io,resources=redissentinels/finalizers,verbs=update
//+kubebuilder:rbac:groups=apps,resources=statefulsets,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;update;patch
//+kubebuilder:rbac:groups=core,resources=services,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=persistentvolumeclaims,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=persistentvolumes,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=configmaps,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=endpoints,verbs=create;get;list;update;watch;patch
//+kubebuilder:rbac:groups="",resources=events,verbs=create;patch

func (r *RedisSentinelReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    logger := log.FromContext(ctx)
    logger.Info("Reconciling RedisSentinel", "request", req.NamespacedName)

    // Fetch the RedisSentinel instance
    redisSentinel := &cachev1.RedisSentinel{}
    if err := r.Get(ctx, req.NamespacedName, redisSentinel); err != nil {
       if errors.IsNotFound(err) {
          // The CR was deleted; nothing is left to reconcile, so do not requeue.
          return ctrl.Result{}, nil
       }
       // Returning an error already triggers a rate-limited retry.
       return ctrl.Result{}, err
    }

    // Check whether any pod has been deleted
    if r.checkPodDeletion(ctx, redisSentinel) {
       logger.Info("Detected pod deletion, waiting for failover")
       return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
    }

    // Reconcile Redis StatefulSet
    if err := r.reconcileRedisStatefulSet(ctx, redisSentinel); err != nil {
       logger.Error(err, "Failed to reconcile Redis StatefulSet")
       return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
    }

    // Reconcile Sentinel StatefulSet
    if err := r.reconcileSentinelStatefulSet(ctx, redisSentinel); err != nil {
       logger.Error(err, "Failed to reconcile Sentinel StatefulSet")
       return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
    }

    // Reconcile Redis Service
    if err := r.reconcileRedisService(ctx, redisSentinel); err != nil {
       logger.Error(err, "Failed to reconcile Redis Service")
       return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
    }

    // Reconcile Sentinel Service
    if err := r.reconcileSentinelService(ctx, redisSentinel); err != nil {
       logger.Error(err, "Failed to reconcile Sentinel Service")
       return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
    }

    // Reconcile ConfigMaps
    if err := r.reconcileConfigMaps(ctx, redisSentinel); err != nil {
       logger.Error(err, "Failed to reconcile ConfigMaps")
       return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
    }

    // Reconcile Persistent Volumes
    if err := r.reconcilePersistentVolumes(ctx, redisSentinel); err != nil {
       logger.Error(err, "Failed to reconcile Persistent Volumes")
       return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
    }

    // Update the role labels
    if err := r.updateRedisRoleLabels(ctx, redisSentinel); err != nil {
       logger.Error(err, "Failed to update Redis role labels")
       r.Recorder.Eventf(redisSentinel, corev1.EventTypeWarning, "LabelUpdateFailed",
          "Failed to update Redis role labels: %v", err)
       return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
    }

    // Create or update the master node Endpoints
    if err := r.reconcileRedisMasterEndpoints(ctx, redisSentinel); err != nil {
       logger.Error(err, "Failed to reconcile Redis master Endpoints")
       return ctrl.Result{RequeueAfter: 30 * time.Second}, err
    }

    // Update status
    return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseReady, nil)
}

func (r *RedisSentinelReconciler) reconcileRedisStatefulSet(ctx context.Context, rs *cachev1.RedisSentinel) error {
    logger := log.FromContext(ctx)
    name := rs.Name + "-redis"

    // Get replicas
    replicas := rs.Spec.RedisReplicas
    if replicas < 1 {
       replicas = 3
    }

    sts := &appsv1.StatefulSet{
       ObjectMeta: metav1.ObjectMeta{
          Name:      name,
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "redis"),
       },
       Spec: appsv1.StatefulSetSpec{
          ServiceName: name + "-headless",
          Replicas:    pointer.Int32(replicas),
          Selector: &metav1.LabelSelector{
             MatchLabels: labelsForRedisSentinel(rs.Name, "redis"),
          },
          Template: corev1.PodTemplateSpec{
             ObjectMeta: metav1.ObjectMeta{
                Labels: labelsForRedisSentinel(rs.Name, "redis"),
             },
             Spec: corev1.PodSpec{
                Containers: []corev1.Container{
                   {
                      Name:            "redis",
                      Image:           rs.Spec.Image,
                      ImagePullPolicy: corev1.PullIfNotPresent,
                      Ports: []corev1.ContainerPort{
                         {
                            Name:          "redis",
                            ContainerPort: RedisPort,
                         },
                      },
                      VolumeMounts: []corev1.VolumeMount{
                         {
                            Name:      "redis-config",
                            MountPath: "/redis-config",
                         },
                         {
                            Name:      "redis-data",
                            MountPath: "/data",
                         },
                      },
                      Command: []string{
                         "sh", "-c",
                         "sh /redis-config/init.sh",
                      },
                   },
                },
                Volumes: []corev1.Volume{
                   {
                      Name: "redis-config",
                      VolumeSource: corev1.VolumeSource{
                         ConfigMap: &corev1.ConfigMapVolumeSource{
                            LocalObjectReference: corev1.LocalObjectReference{
                               Name: rs.Name + "-redis-config",
                            },
                         },
                      },
                   },
                },
             },
          },
          VolumeClaimTemplates: []corev1.PersistentVolumeClaim{
             {
                ObjectMeta: metav1.ObjectMeta{
                   Name: "redis-data",
                },
                Spec: corev1.PersistentVolumeClaimSpec{
                   AccessModes: []corev1.PersistentVolumeAccessMode{
                      corev1.ReadWriteOnce,
                   },
                   Resources: corev1.ResourceRequirements{
                      Requests: corev1.ResourceList{
                         corev1.ResourceStorage: rs.Spec.Storage.Size,
                      },
                   },
                   StorageClassName: pointer.String("redis-storage"),
                },
             },
          },
       },
    }

    // Set controller reference
    if err := ctrl.SetControllerReference(rs, sts, r.Scheme); err != nil {
       return err
    }

    // Create or update StatefulSet
    foundSts := &appsv1.StatefulSet{}
    err := r.Get(ctx, types.NamespacedName{Name: sts.Name, Namespace: sts.Namespace}, foundSts)
    if err != nil && errors.IsNotFound(err) {
       logger.Info("Creating Redis StatefulSet", "name", sts.Name)
       if err := r.Create(ctx, sts); err != nil {
          return err
       }
    } else if err != nil {
       return err
    } else {
       logger.Info("Updating Redis StatefulSet", "name", sts.Name)
       sts.Spec.DeepCopyInto(&foundSts.Spec)
       if err := r.Update(ctx, foundSts); err != nil {
          return err
       }
    }

    return nil
}

func (r *RedisSentinelReconciler) reconcileSentinelStatefulSet(ctx context.Context, rs *cachev1.RedisSentinel) error {
    logger := log.FromContext(ctx)
    name := rs.Name + "-sentinel"

    // Get replicas
    replicas := rs.Spec.SentinelReplicas
    if replicas < 1 {
       replicas = 3
    }

    sts := &appsv1.StatefulSet{
       ObjectMeta: metav1.ObjectMeta{
          Name:      name,
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "sentinel"),
       },
       Spec: appsv1.StatefulSetSpec{
          ServiceName: name + "-headless",
          Replicas:    pointer.Int32(replicas),
          Selector: &metav1.LabelSelector{
             MatchLabels: labelsForRedisSentinel(rs.Name, "sentinel"),
          },
          Template: corev1.PodTemplateSpec{
             ObjectMeta: metav1.ObjectMeta{
                Labels: labelsForRedisSentinel(rs.Name, "sentinel"),
             },
             Spec: corev1.PodSpec{
                Containers: []corev1.Container{
                   {
                      Name:            "sentinel",
                      Image:           rs.Spec.SentinelImage,
                      ImagePullPolicy: corev1.PullIfNotPresent,
                      Ports: []corev1.ContainerPort{
                         {
                            Name:          "sentinel",
                            ContainerPort: SentinelPort,
                         },
                      },
                      Command: []string{"sh", "-c",
                         "sh /sentinel-config/init.sh",
                      },
                      VolumeMounts: []corev1.VolumeMount{
                         {
                            Name:      "sentinel-config",
                            MountPath: "/sentinel-config",
                         },
                         {
                            Name:      "sentinel-config-dir",
                            MountPath: "/tmp", // writable directory
                         },
                      },
                   },
                },
                Volumes: []corev1.Volume{
                   {
                      Name: "sentinel-config",
                      VolumeSource: corev1.VolumeSource{
                         ConfigMap: &corev1.ConfigMapVolumeSource{
                            LocalObjectReference: corev1.LocalObjectReference{
                               Name: fmt.Sprintf("%s-sentinel-config", rs.Name),
                            },
                         },
                      },
                   },
                   {
                      Name: "sentinel-config-dir",
                      VolumeSource: corev1.VolumeSource{
                         EmptyDir: &corev1.EmptyDirVolumeSource{},
                      },
                   },
                },
             },
          },
       },
    }

    // Set controller reference
    if err := ctrl.SetControllerReference(rs, sts, r.Scheme); err != nil {
       return err
    }

    // Create or update StatefulSet
    foundSts := &appsv1.StatefulSet{}
    err := r.Get(ctx, types.NamespacedName{Name: sts.Name, Namespace: sts.Namespace}, foundSts)
    if err != nil && errors.IsNotFound(err) {
       logger.Info("Creating Sentinel StatefulSet", "name", sts.Name)
       if err := r.Create(ctx, sts); err != nil {
          return err
       }
    } else if err != nil {
       return err
    } else {
       logger.Info("Updating Sentinel StatefulSet", "name", sts.Name)
       sts.Spec.DeepCopyInto(&foundSts.Spec)
       if err := r.Update(ctx, foundSts); err != nil {
          return err
       }
    }

    return nil
}

func (r *RedisSentinelReconciler) reconcileRedisService(ctx context.Context, rs *cachev1.RedisSentinel) error {
    // Headless Service for StatefulSet
    headlessSvc := &corev1.Service{
       ObjectMeta: metav1.ObjectMeta{
          Name:      rs.Name + "-redis-headless",
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "redis"),
       },
       Spec: corev1.ServiceSpec{
          ClusterIP: corev1.ClusterIPNone,
          Selector:  labelsForRedisSentinel(rs.Name, "redis"),
          Ports: []corev1.ServicePort{
             {
                Name:       "redis",
                Port:       RedisPort,
                TargetPort: intstr.FromInt(int(RedisPort)),
             },
          },
       },
    }

    // NodePort Service for external access
    nodePortSvc := &corev1.Service{
       ObjectMeta: metav1.ObjectMeta{
          Name:      rs.Name + "-redis",
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "redis"),
       },
       Spec: corev1.ServiceSpec{
          Type:     corev1.ServiceTypeNodePort,
          Selector: labelsForRedisSentinel(rs.Name, "redis"),
          Ports: []corev1.ServicePort{
             {
                Name:       "redis",
                Port:       RedisPort,
                TargetPort: intstr.FromInt(int(RedisPort)),
                NodePort:   rs.Spec.NodePort,
             },
          },
       },
    }

    // NodePort Master Service
    selector := labelsForRedisSentinel(rs.Name, "redis")
    selector["redis-role"] = "master"
    masterNodePortSvc := &corev1.Service{
       ObjectMeta: metav1.ObjectMeta{
          Name:      rs.Name + "-redis-master",
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "redis"),
       },
       Spec: corev1.ServiceSpec{
          Type:     corev1.ServiceTypeNodePort,
          Selector: selector,
          Ports: []corev1.ServicePort{
             {
                Name:       "redis",
                Port:       MasterPort,
                TargetPort: intstr.FromInt(int(RedisPort)),
                NodePort:   rs.Spec.MasterNodePort,
             },
          },
       },
    }

    // Set controller references
    if err := ctrl.SetControllerReference(rs, headlessSvc, r.Scheme); err != nil {
       return err
    }
    if err := ctrl.SetControllerReference(rs, nodePortSvc, r.Scheme); err != nil {
       return err
    }
    if err := ctrl.SetControllerReference(rs, masterNodePortSvc, r.Scheme); err != nil {
       return err
    }

    // Create or update services
    if err := r.createOrUpdateService(ctx, headlessSvc); err != nil {
       return err
    }
    if err := r.createOrUpdateService(ctx, nodePortSvc); err != nil {
       return err
    }
    return r.createOrUpdateService(ctx, masterNodePortSvc)
}

func (r *RedisSentinelReconciler) reconcileSentinelService(ctx context.Context, rs *cachev1.RedisSentinel) error {
    // Headless Service for StatefulSet
    headlessSvc := &corev1.Service{
       ObjectMeta: metav1.ObjectMeta{
          Name:      rs.Name + "-sentinel-headless",
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "sentinel"),
       },
       Spec: corev1.ServiceSpec{
          ClusterIP: corev1.ClusterIPNone,
          Selector:  labelsForRedisSentinel(rs.Name, "sentinel"),
          Ports: []corev1.ServicePort{
             {
                Name:       "sentinel",
                Port:       SentinelPort,
                TargetPort: intstr.FromInt(int(SentinelPort)),
             },
          },
       },
    }

    // NodePort Service for external access
    nodePortSvc := &corev1.Service{
       ObjectMeta: metav1.ObjectMeta{
          Name:      rs.Name + "-sentinel",
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "sentinel"),
       },
       Spec: corev1.ServiceSpec{
          Type:     corev1.ServiceTypeNodePort,
          Selector: labelsForRedisSentinel(rs.Name, "sentinel"),
          Ports: []corev1.ServicePort{
             {
                Name:       "sentinel",
                Port:       SentinelPort,
                TargetPort: intstr.FromInt(int(SentinelPort)),
                NodePort:   rs.Spec.SentinelNodePort,
             },
          },
       },
    }

    // Set controller references
    if err := ctrl.SetControllerReference(rs, headlessSvc, r.Scheme); err != nil {
       return err
    }
    if err := ctrl.SetControllerReference(rs, nodePortSvc, r.Scheme); err != nil {
       return err
    }

    // Create or update services
    if err := r.createOrUpdateService(ctx, headlessSvc); err != nil {
       return err
    }
    return r.createOrUpdateService(ctx, nodePortSvc)
}

func (r *RedisSentinelReconciler) reconcileConfigMaps(ctx context.Context, rs *cachev1.RedisSentinel) error {
    masterHost := fmt.Sprintf("%s-redis-0.%s-redis-headless.%s.svc.cluster.local", rs.Name, rs.Name, rs.Namespace)

    // Redis ConfigMap
    redisCM := &corev1.ConfigMap{
       ObjectMeta: metav1.ObjectMeta{
          Name:      rs.Name + "-redis-config",
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "redis"),
       },
       Data: map[string]string{
          "redis-master.conf":  redisMasterConfig,
          "redis-replica.conf": redisReplicaConfig,
          "init.sh":            redisInitSh(masterHost),
       },
    }

    sentinelCM := &corev1.ConfigMap{
       ObjectMeta: metav1.ObjectMeta{
          Name:      rs.Name + "-sentinel-config",
          Namespace: rs.Namespace,
          Labels:    labelsForRedisSentinel(rs.Name, "sentinel"),
       },
       Data: map[string]string{
          "sentinel.conf": sentinelConfig(masterHost, rs.Spec.SentinelReplicas),
          "init.sh":       sentinelInitConfig,
       },
    }

    // Set controller references
    if err := ctrl.SetControllerReference(rs, redisCM, r.Scheme); err != nil {
       return err
    }
    if err := ctrl.SetControllerReference(rs, sentinelCM, r.Scheme); err != nil {
       return err
    }

    // Create or update ConfigMaps
    if err := r.createOrUpdateConfigMap(ctx, redisCM); err != nil {
       return err
    }
    return r.createOrUpdateConfigMap(ctx, sentinelCM)
}

func (r *RedisSentinelReconciler) reconcilePersistentVolumes(ctx context.Context, rs *cachev1.RedisSentinel) error {
    logger := log.FromContext(ctx)

    replicas := rs.Spec.RedisReplicas
    if replicas < 1 {
       replicas = 3
    }

    // Create PVs for each Redis instance
    for i := 0; i < int(replicas); i++ {
       pvName := fmt.Sprintf("%s-redis-pv-%d", rs.Name, i)
       pvPath := fmt.Sprintf("%s/%s/redis-%d", rs.Spec.Storage.HostPath, rs.Name, i)

       pv := &corev1.PersistentVolume{
          ObjectMeta: metav1.ObjectMeta{
             Name: pvName,
          },
          Spec: corev1.PersistentVolumeSpec{
             Capacity: corev1.ResourceList{
                corev1.ResourceStorage: rs.Spec.Storage.Size,
             },
             AccessModes: []corev1.PersistentVolumeAccessMode{
                corev1.ReadWriteOnce,
             },
             PersistentVolumeReclaimPolicy: corev1.PersistentVolumeReclaimRetain,
             StorageClassName:              "redis-storage",
             PersistentVolumeSource: corev1.PersistentVolumeSource{
                HostPath: &corev1.HostPathVolumeSource{
                   Path: pvPath,
                },
             },
             NodeAffinity: &corev1.VolumeNodeAffinity{
                Required: &corev1.NodeSelector{
                   NodeSelectorTerms: []corev1.NodeSelectorTerm{
                      {
                         MatchExpressions: []corev1.NodeSelectorRequirement{
                            {
                               Key:      "iguochan.io/redis-node",
                               Operator: corev1.NodeSelectorOpIn,
                               Values:   []string{fmt.Sprintf("redis%d", i%3+1)},
                            },
                         },
                      },
                   },
                },
             },
          },
       }

       // Create PV
       if err := r.Create(ctx, pv); err != nil {
          if !errors.IsAlreadyExists(err) {
             logger.Error(err, "Failed to create PV", "name", pv.Name)
             return err
          }
          logger.Info("PV already exists", "name", pv.Name)
       }
    }

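    // Create PVs for each Sentinel instance (if needed). The Sentinel StatefulSet above
    // mounts an emptyDir and declares no volumeClaimTemplates, so these PVs stay unbound for now.
    sentinelReplicas := rs.Spec.SentinelReplicas
    if sentinelReplicas < 1 {
       sentinelReplicas = 3 // same fallback as reconcileSentinelStatefulSet
    }
    for i := 0; i < int(sentinelReplicas); i++ {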
       pvName := fmt.Sprintf("%s-sentinel-pv-%d", rs.Name, i)
       pvPath := fmt.Sprintf("%s/%s/sentinel-%d", rs.Spec.Storage.HostPath, rs.Name, i)

       pv := &corev1.PersistentVolume{
          ObjectMeta: metav1.ObjectMeta{
             Name: pvName,
          },
          Spec: corev1.PersistentVolumeSpec{
             Capacity: corev1.ResourceList{
                corev1.ResourceStorage: resource.MustParse("100Mi"),
             },
             AccessModes: []corev1.PersistentVolumeAccessMode{
                corev1.ReadWriteOnce,
             },
             PersistentVolumeReclaimPolicy: corev1.PersistentVolumeReclaimRetain,
             StorageClassName:              "redis-storage",
             PersistentVolumeSource: corev1.PersistentVolumeSource{
                HostPath: &corev1.HostPathVolumeSource{
                   Path: pvPath,
                },
             },
             NodeAffinity: &corev1.VolumeNodeAffinity{
                Required: &corev1.NodeSelector{
                   NodeSelectorTerms: []corev1.NodeSelectorTerm{
                      {
                         MatchExpressions: []corev1.NodeSelectorRequirement{
                            {
                               Key:      "iguochan.io/redis-node",
                               Operator: corev1.NodeSelectorOpIn,
                               Values:   []string{fmt.Sprintf("redis%d", i%3+1)},
                            },
                         },
                      },
                   },
                },
             },
          },
       }

       // Create PV
       if err := r.Create(ctx, pv); err != nil {
          if !errors.IsAlreadyExists(err) {
             logger.Error(err, "Failed to create PV", "name", pv.Name)
             return err
          }
          logger.Info("PV already exists", "name", pv.Name)
       }
    }

    return nil
}

func (r *RedisSentinelReconciler) updateStatus(ctx context.Context, rs *cachev1.RedisSentinel, phase cachev1.RedisPhase, err error) error {
    if err != nil && phase != cachev1.RedisPhaseReady {
       rs.Status.Phase = phase
       _ = r.Status().Update(ctx, rs)
       return fmt.Errorf("err: %+v or phase: %s", err, phase)
    }

    // 1. Make sure all Sentinel Pods are ready
    sentinelReady, sentinelErr := r.checkPodsReady(ctx, rs, "sentinel")
    if !sentinelReady {
       return sentinelErr
    }

    // 2. Make sure all Redis Pods are ready (the master must be ready)
    redisReady, redisErr := r.checkPodsReady(ctx, rs, "redis")
    if !redisReady {
       return redisErr
    }

    // 3. Make sure the Sentinel Service is available
    sentinelSvc, svcErr := r.validateService(ctx, rs, rs.Name+"-sentinel")
    if svcErr != nil {
       return svcErr
    }

    // 4. Make sure the Redis Service is available (it points at the master)
    redisSvc, svcErr := r.validateService(ctx, rs, rs.Name+"-redis")
    if svcErr != nil {
       return svcErr
    }

    // 5. Update the status
    rs.Status.SentinelEndpoint = fmt.Sprintf("%s:%d", sentinelSvc.Spec.ClusterIP, sentinelSvc.Spec.Ports[0].Port)
    rs.Status.Endpoint = fmt.Sprintf("%s:%d", redisSvc.Spec.ClusterIP, redisSvc.Spec.Ports[0].Port)
    rs.Status.Phase = cachev1.RedisPhaseReady

    return r.Status().Update(ctx, rs)
}

// Helper: check whether all Pods with the given role are ready
func (r *RedisSentinelReconciler) checkPodsReady(ctx context.Context, rs *cachev1.RedisSentinel, role string) (bool, error) {
    podList := &corev1.PodList{}
    labels := client.MatchingLabels{
       "app":       "redis-sentinel",
       "name":      rs.Name,
       "component": role, // "redis" or "sentinel"
    }

    if err := r.List(ctx, podList, labels); err != nil {
       r.Recorder.Eventf(rs, corev1.EventTypeWarning, RecordReasonFailed,
          "list %s pods failed: %s", role, err.Error())
       return false, err
    }

    if len(podList.Items) == 0 {
       msg := fmt.Sprintf("no %s pods available", role)
       r.Recorder.Event(rs, corev1.EventTypeNormal, RecordReasonWaiting, msg)
       rs.Status.Phase = cachev1.RedisPhasePending
       return false, r.Status().Update(ctx, rs)
    }

    allReady := true
    for _, pod := range podList.Items {
       if !isPodReady(pod) {
          allReady = false
          break
       }
    }

    if !allReady {
       msg := fmt.Sprintf("not all %s pods are ready", role)
       r.Recorder.Event(rs, corev1.EventTypeNormal, RecordReasonWaiting, msg)
       rs.Status.Phase = cachev1.RedisPhasePending
       return false, r.Status().Update(ctx, rs)
    }

    return true, nil
}

// Helper: report whether a Pod's Ready condition is true
func isPodReady(pod corev1.Pod) bool {
    for _, cond := range pod.Status.Conditions {
       if cond.Type == corev1.PodReady {
          return cond.Status == corev1.ConditionTrue
       }
    }
    return false
}

// Helper: verify that a Service exists and has at least one endpoint
func (r *RedisSentinelReconciler) validateService(ctx context.Context, rs *cachev1.RedisSentinel, svcName string) (*corev1.Service, error) {
    svc := &corev1.Service{}
    key := types.NamespacedName{Namespace: rs.Namespace, Name: svcName}

    // Fetch the Service object
    if err := r.Get(ctx, key, svc); err != nil {
       r.Recorder.Eventf(rs, corev1.EventTypeWarning, RecordReasonFailed,
          "get %s service failed: %s", svcName, err.Error())
       rs.Status.Phase = cachev1.RedisPhaseError
       _ = r.Status().Update(ctx, rs) // best effort; always report the original error
       return nil, err
    }

    // Fetch the matching Endpoints
    endpoints := &corev1.Endpoints{}
    if err := r.Get(ctx, key, endpoints); err != nil {
       r.Recorder.Eventf(rs, corev1.EventTypeWarning, RecordReasonFailed,
          "get %s endpoints failed: %s", svcName, err.Error())
       rs.Status.Phase = cachev1.RedisPhaseError
       _ = r.Status().Update(ctx, rs)
       return nil, err
    }

    // Require at least one available endpoint. Returning a non-nil error here
    // keeps the caller from dereferencing a nil Service.
    if len(endpoints.Subsets) == 0 || len(endpoints.Subsets[0].Addresses) == 0 {
       r.Recorder.Eventf(rs, corev1.EventTypeWarning, RecordReasonFailed,
          "%s service has no endpoints", svcName)
       rs.Status.Phase = cachev1.RedisPhaseError
       _ = r.Status().Update(ctx, rs)
       return nil, fmt.Errorf("%s service has no endpoints", svcName)
    }

    return svc, nil
}

// Helper functions
func labelsForRedisSentinel(name, role string) map[string]string {
    return map[string]string{
       "app":       "redis-sentinel",
       "name":      name,
       "component": role,
    }
}

func (r *RedisSentinelReconciler) createOrUpdateService(ctx context.Context, svc *corev1.Service) error {
    foundSvc := &corev1.Service{}
    err := r.Get(ctx, types.NamespacedName{Name: svc.Name, Namespace: svc.Namespace}, foundSvc)
    if err != nil && errors.IsNotFound(err) {
       return r.Create(ctx, svc)
    } else if err != nil {
       return err
    }

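    // Sync the desired ports onto the existing NodePort Service
    if svc.Spec.Type == corev1.ServiceTypeNodePort {
       for i, p := range svc.Spec.Ports {
          if i >= len(foundSvc.Spec.Ports) {
             break // never index past the ports the existing Service has
          }
          foundSvc.Spec.Ports[i].Port = p.Port
          foundSvc.Spec.Ports[i].TargetPort = p.TargetPort
          // Preserve the existing NodePort when the desired spec leaves it unset
          if p.NodePort != 0 {
             foundSvc.Spec.Ports[i].NodePort = p.NodePort
          }
       }
    }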

    return r.Update(ctx, foundSvc)
}

func (r *RedisSentinelReconciler) createOrUpdateConfigMap(ctx context.Context, cm *corev1.ConfigMap) error {
    foundCM := &corev1.ConfigMap{}
    err := r.Get(ctx, types.NamespacedName{Name: cm.Name, Namespace: cm.Namespace}, foundCM)
    if err != nil && errors.IsNotFound(err) {
       return r.Create(ctx, cm)
    } else if err != nil {
       return err
    }

    foundCM.Data = cm.Data
    return r.Update(ctx, foundCM)
}

func (r *RedisSentinelReconciler) updateRedisRoleLabels(ctx context.Context, rs *cachev1.RedisSentinel) error {
    // List all Redis Pods
    podList := &corev1.PodList{}
    if err := r.List(ctx, podList, client.MatchingLabels{
       "app":       "redis-sentinel",
       "name":      rs.Name,
       "component": "redis",
    }); err != nil {
       return err
    }

    // Ask Sentinel for the master address
    var ip string
    var err error
    sentinelPods := &corev1.PodList{}
    if err = r.List(ctx, sentinelPods, client.MatchingLabels{
       "app":       "redis-sentinel",
       "name":      rs.Name,
       "component": "sentinel",
    }); err != nil || len(sentinelPods.Items) == 0 {
       return fmt.Errorf("list sentinel pods err: %+v or len(sentinelPods.Items) == 0", err)
    }

    if ip, _, err = r.getSentinelMasterAddr(ctx, &sentinelPods.Items[0]); err != nil {
       return err
    }

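    // Update the Pod labels
    for _, pod := range podList.Items {
       newRole := "slave"
       // Sentinel may report an IP or a DNS name. The empty-hostname guard matters:
       // strings.Contains(ip, "") is always true and would mark every Pod as master.
       if pod.Status.PodIP == ip || (pod.Spec.Hostname != "" && strings.Contains(ip, pod.Spec.Hostname)) {
          newRole = "master"
       }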

       if pod.Labels["redis-role"] != newRole {
          patch := client.MergeFrom(pod.DeepCopy())
          if pod.Labels == nil {
             pod.Labels = make(map[string]string)
          }
          pod.Labels["redis-role"] = newRole
          if err := r.Patch(ctx, &pod, patch); err != nil {
             return err
          }
       }
    }
    return nil
}

func (r *RedisSentinelReconciler) reconcileRedisMasterEndpoints(ctx context.Context, rs *cachev1.RedisSentinel) error {
    // Find the master Pod
    podList := &corev1.PodList{}
    if err := r.List(ctx, podList, client.MatchingLabels{
       "app":        "redis-sentinel",
       "name":       rs.Name,
       "component":  "redis",
       "redis-role": "master",
    }); err != nil {
       return err
    }

    if len(podList.Items) == 0 {
       return nil // no master Pod is labeled yet
    }

    masterPod := podList.Items[0]
    if masterPod.Status.PodIP == "" {
       return fmt.Errorf("reconcileRedisMasterEndpoints: masterPod.Status.PodIP is empty")
    }

    endpoints := &corev1.Endpoints{
       ObjectMeta: metav1.ObjectMeta{
          Name:      rs.Name + "-redis-master",
          Namespace: rs.Namespace,
       },
       Subsets: []corev1.EndpointSubset{
          {
             Addresses: []corev1.EndpointAddress{
                {
                   IP: masterPod.Status.PodIP,
                   TargetRef: &corev1.ObjectReference{
                      Kind:      "Pod",
                      Name:      masterPod.Name,
                      Namespace: masterPod.Namespace,
                   },
                },
             },
             Ports: []corev1.EndpointPort{
                {
                   Port: RedisPort,
                },
             },
          },
       },
    }

    // Set the controller reference
    if err := ctrl.SetControllerReference(rs, endpoints, r.Scheme); err != nil {
       return err
    }

    // Create or update the Endpoints
    found := &corev1.Endpoints{}
    err := r.Get(ctx, types.NamespacedName{Name: endpoints.Name, Namespace: endpoints.Namespace}, found)
    if err != nil && errors.IsNotFound(err) {
       return r.Create(ctx, endpoints)
    } else if err != nil {
       return err
    }

    // Compare and update
    needsUpdate := false
    if len(found.Subsets) == 0 {
       needsUpdate = true
    } else if len(found.Subsets[0].Addresses) == 0 ||
       found.Subsets[0].Addresses[0].IP != masterPod.Status.PodIP {
       needsUpdate = true
    }

    if needsUpdate {
       found.Subsets = endpoints.Subsets
       return r.Update(ctx, found)
    }

    return nil
}

func (r *RedisSentinelReconciler) getSentinelMasterAddr(ctx context.Context, sentinelPod *corev1.Pod) (string, string, error) {
    logger := log.FromContext(ctx)

    sentinelAddr := fmt.Sprintf("%s:%d", sentinelPod.Status.PodIP, SentinelPort)
    sentinelClient := redis.NewSentinelClient(&redis.Options{
       Addr:     sentinelAddr,
       Password: "", // set this if Sentinel requires a password
       DB:       0,
    })
    defer sentinelClient.Close() // release the connection on every return path

    // Tolerate a failover that is still in progress
    var masterIP, masterPort string
    var lastErr error

    // Retry up to 5 times, 2 seconds apart
    for i := 0; i < 5; i++ {
       result, err := sentinelClient.GetMasterAddrByName("mymaster").Result()
       if err == nil && len(result) >= 2 {
          masterIP = result[0]
          masterPort = result[1]

          // Check that the reported master is backed by a live Pod
          if r.isPodAlive(ctx, masterIP) {
             logger.Info(fmt.Sprintf("getSentinelMasterAddr: %s, %s", masterIP, masterPort))
             return masterIP, masterPort, nil
          }
          logger.Info("Master IP reported but pod not alive", "ip", masterIP)
          lastErr = fmt.Errorf("master %s reported by sentinel has no live pod", masterIP)
       } else if err != nil {
          lastErr = err
       }

       // Wait 2 seconds before the next attempt
       time.Sleep(2 * time.Second)
    }

    return "", "", fmt.Errorf("failed to get valid master address after 5 attempts: %v", lastErr)
}

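// Check that a live Pod backs the given address (an IP or a DNS name)
func (r *RedisSentinelReconciler) isPodAlive(ctx context.Context, ip string) bool {
    pods := &corev1.PodList{}
    if err := r.List(ctx, pods); err != nil {
       return false
    }

    for _, pod := range pods.Items {
       // Skip the hostname match when Hostname is empty: strings.Contains(ip, "")
       // is always true and would match the first Pod in the list.
       if pod.Status.PodIP == ip || (pod.Spec.Hostname != "" && strings.Contains(ip, pod.Spec.Hostname)) {
          return pod.DeletionTimestamp == nil
       }
    }
    return false
}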

func (r *RedisSentinelReconciler) checkPodDeletion(ctx context.Context, rs *cachev1.RedisSentinel) bool {
    // List all Redis Pods
    podList := &corev1.PodList{}
    if err := r.List(ctx, podList, client.MatchingLabels{
       "app":       "redis-sentinel",
       "name":      rs.Name,
       "component": "redis",
    }); err != nil {
       return false
    }

    // Report whether any Pod is being deleted
    for _, pod := range podList.Items {
       if pod.DeletionTimestamp != nil {
          return true
       }
    }
    return false
}

// SetupWithManager sets up the controller with the Manager.
func (r *RedisSentinelReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
       For(&cachev1.RedisSentinel{}).
       Owns(&appsv1.StatefulSet{}).
       Owns(&corev1.Service{}).
       Owns(&corev1.ConfigMap{}).
       Owns(&corev1.Endpoints{}).
       Complete(r)
}

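A few points deserve attention. For example, the controller labels each Redis Pod with its current role (redis-role=master or redis-role=slave), and that label is what lets the xxx-redis-master Service find the master.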
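
The role decision behind that labeling can be isolated into a small pure function. The following is a minimal sketch, not the controller's actual code: `podInfo` and `roleFor` are hypothetical names introduced here, and the sketch assumes that Sentinel reports either a Pod IP or a StatefulSet DNS name that begins with the Pod's hostname followed by a dot.

```go
package main

import (
	"fmt"
	"strings"
)

// podInfo is a hypothetical, trimmed-down stand-in for corev1.Pod.
type podInfo struct {
	Name     string
	IP       string
	Hostname string
}

// roleFor returns "master" when masterAddr (an IP or a DNS name reported by
// Sentinel) refers to the given Pod, and "slave" otherwise.
func roleFor(p podInfo, masterAddr string) string {
	if masterAddr == "" {
		return "slave"
	}
	if p.IP == masterAddr {
		return "master"
	}
	// Match "<hostname>." as a prefix so that redis-1 never matches redis-10,
	// and an empty hostname never matches anything.
	if p.Hostname != "" && strings.HasPrefix(masterAddr, p.Hostname+".") {
		return "master"
	}
	return "slave"
}

func main() {
	master := "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
	pods := []podInfo{
		{Name: "redissentinel-sample-redis-0", IP: "10.244.2.6", Hostname: "redissentinel-sample-redis-0"},
		{Name: "redissentinel-sample-redis-1", IP: "10.244.3.2", Hostname: "redissentinel-sample-redis-1"},
		{Name: "redissentinel-sample-redis-2", IP: "10.244.1.5", Hostname: "redissentinel-sample-redis-2"},
	}
	for _, p := range pods {
		fmt.Printf("%s -> redis-role=%s\n", p.Name, roleFor(p, master))
	}
}
```

Keeping the decision free of Kubernetes types makes it easy to unit test. The prefix match is stricter than a substring match, which matters once Pod names share a common prefix.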

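2.2.3 Admission control

bash
kubebuilder create webhook --group cache --version v1 --kind RedisSentinel --defaulting --programmatic-validation

The command above scaffolds the admission webhooks for RedisSentinel. The defaulting and validation logic is listed below without further commentary:

go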
/*
Copyright 2025.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1

import (
    "fmt"

    "k8s.io/apimachinery/pkg/api/resource"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/util/validation/field"
    ctrl "sigs.k8s.io/controller-runtime"
    logf "sigs.k8s.io/controller-runtime/pkg/log"
    "sigs.k8s.io/controller-runtime/pkg/webhook"
)

const (
    MinRedisSize    = "1Gi"   // minimum Redis storage size
    MaxRedisSize    = "100Gi" // maximum Redis storage size
    MinSentinelSize = "100Mi" // minimum Sentinel storage size
    MaxSentinelSize = "1Gi"   // maximum Sentinel storage size
    MinReplicas     = 3       // minimum number of replicas
    MaxReplicas     = 7       // maximum number of replicas
    MinQuorum       = 2       // minimum Sentinel quorum
)

// log is for logging in this package.
var redissentinellog = logf.Log.WithName("redissentinel-resource")

func (r *RedisSentinel) SetupWebhookWithManager(mgr ctrl.Manager) error {
    return ctrl.NewWebhookManagedBy(mgr).
       For(r).
       Complete()
}

// TODO(user): EDIT THIS FILE!  THIS IS SCAFFOLDING FOR YOU TO OWN!

//+kubebuilder:webhook:path=/mutate-cache-iguochan-io-v1-redissentinel,mutating=true,failurePolicy=fail,sideEffects=None,groups=cache.iguochan.io,resources=redissentinels,verbs=create;update,versions=v1,name=mredissentinel.kb.io,admissionReviewVersions=v1

var _ webhook.Defaulter = &RedisSentinel{}

// Default implements webhook.Defaulter so a webhook will be registered for the type
func (r *RedisSentinel) Default() {
    redissentinellog.Info("default", "name", r.Name)

    // Default Redis image
    if r.Spec.Image == "" {
       r.Spec.Image = "redis:7.0"
       redissentinellog.Info("Setting default Redis image", "image", r.Spec.Image)
    }

    // Default Sentinel image
    if r.Spec.SentinelImage == "" {
       r.Spec.SentinelImage = "redis:7.0"
       redissentinellog.Info("Setting default Sentinel image", "image", r.Spec.SentinelImage)
    }

    // Default NodePort for the Redis master
    if r.Spec.MasterNodePort == 0 {
       r.Spec.MasterNodePort = 30999
       redissentinellog.Info("Setting default Redis Master NodePort", "nodePort", r.Spec.MasterNodePort)
    }

    // Default Redis NodePort
    if r.Spec.NodePort == 0 {
       r.Spec.NodePort = 31000
       redissentinellog.Info("Setting default Redis NodePort", "nodePort", r.Spec.NodePort)
    }

    // Default Sentinel NodePort
    if r.Spec.SentinelNodePort == 0 {
       r.Spec.SentinelNodePort = 31001
       redissentinellog.Info("Setting default Sentinel NodePort", "sentinelNodePort", r.Spec.SentinelNodePort)
    }

    // Default number of Redis replicas
    if r.Spec.RedisReplicas == 0 {
       r.Spec.RedisReplicas = 3
       redissentinellog.Info("Setting default Redis replicas", "redisReplicas", r.Spec.RedisReplicas)
    }

    // Default number of Sentinel replicas
    if r.Spec.SentinelReplicas == 0 {
       r.Spec.SentinelReplicas = 3
       redissentinellog.Info("Setting default Sentinel replicas", "sentinelReplicas", r.Spec.SentinelReplicas)
    }

    // Default storage path
    if r.Spec.Storage.HostPath == "" {
       r.Spec.Storage.HostPath = "/data"
       redissentinellog.Info("Setting default host path", "hostPath", r.Spec.Storage.HostPath)
    }

    // Default Redis storage size
    if r.Spec.Storage.Size.IsZero() {
       size := resource.MustParse("1Gi")
       r.Spec.Storage.Size = size
       redissentinellog.Info("Setting default Redis storage size", "size", size.String())
    }
}

// TODO(user): change verbs to "verbs=create;update;delete" if you want to enable deletion validation.
//+kubebuilder:webhook:path=/validate-cache-iguochan-io-v1-redissentinel,mutating=false,failurePolicy=fail,sideEffects=None,groups=cache.iguochan.io,resources=redissentinels,verbs=create;update,versions=v1,name=vredissentinel.kb.io,admissionReviewVersions=v1

var _ webhook.Validator = &RedisSentinel{}

// ValidateCreate implements webhook.Validator so a webhook will be registered for the type
func (r *RedisSentinel) ValidateCreate() error {
    redissentinellog.Info("validate create", "name", r.Name)

    return r.validateRedisSentinel()
}

// ValidateUpdate implements webhook.Validator so a webhook will be registered for the type
func (r *RedisSentinel) ValidateUpdate(old runtime.Object) error {
    redissentinellog.Info("validate update", "name", r.Name)

    oldSentinel, ok := old.(*RedisSentinel)
    if !ok {
       return fmt.Errorf("expected a RedisSentinel object but got %T", old)
    }

    if err := r.validateRedisSentinel(); err != nil {
       return err
    }

    // Reject changes to immutable fields
    if oldSentinel.Spec.Image != r.Spec.Image {
       return field.Forbidden(
          field.NewPath("spec", "image"),
          "Redis image cannot be changed after creation",
       )
    }

    if oldSentinel.Spec.SentinelImage != r.Spec.SentinelImage {
       return field.Forbidden(
          field.NewPath("spec", "sentinelImage"),
          "Sentinel image cannot be changed after creation",
       )
    }

    if oldSentinel.Spec.Storage.HostPath != r.Spec.Storage.HostPath {
       return field.Forbidden(
          field.NewPath("spec", "storage", "hostPath"),
          "hostPath cannot be changed after creation",
       )
    }

    return nil
}

// ValidateDelete implements webhook.Validator so a webhook will be registered for the type
func (r *RedisSentinel) ValidateDelete() error {
    redissentinellog.Info("validate delete", "name", r.Name)

    // TODO(user): fill in your validation logic upon object deletion.
    return nil
}

// validateRedisSentinel runs all validation logic
func (r *RedisSentinel) validateRedisSentinel() error {
    allErrs := field.ErrorList{}

    // Validate the Redis storage size range
    if err := validateStorageSize(
       r.Spec.Storage.Size,
       MinRedisSize,
       MaxRedisSize,
       field.NewPath("spec", "storage", "size"),
       "Redis"); err != nil {
       allErrs = append(allErrs, err)
    }

    // Validate the Sentinel replica count
    if r.Spec.SentinelReplicas < MinReplicas {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "sentinelReplicas"),
          r.Spec.SentinelReplicas,
          fmt.Sprintf("Sentinel replicas must be at least %d", MinReplicas),
       ))
    } else if r.Spec.SentinelReplicas > MaxReplicas {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "sentinelReplicas"),
          r.Spec.SentinelReplicas,
          fmt.Sprintf("Sentinel replicas must be no more than %d", MaxReplicas),
       ))
    }

    // Validate the Redis replica count
    if r.Spec.RedisReplicas < MinReplicas {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "redisReplicas"),
          r.Spec.RedisReplicas,
          fmt.Sprintf("Redis replicas must be at least %d", MinReplicas),
       ))
    } else if r.Spec.RedisReplicas > MaxReplicas {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "redisReplicas"),
          r.Spec.RedisReplicas,
          fmt.Sprintf("Redis replicas must be no more than %d", MaxReplicas),
       ))
    }

    // Validate the Sentinel quorum requirement
    if r.Spec.SentinelReplicas < MinQuorum*2-1 {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "sentinelReplicas"),
          r.Spec.SentinelReplicas,
          fmt.Sprintf("Sentinel replicas must be at least %d for a quorum of %d", MinQuorum*2-1, MinQuorum),
       ))
    }

    // Validate the Redis master NodePort range
    if r.Spec.MasterNodePort < 30000 || r.Spec.MasterNodePort > 32767 {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "masterNodePort"), // assumes the JSON field name is masterNodePort
          r.Spec.MasterNodePort,
          "Redis master nodePort must be between 30000 and 32767",
       ))
    }

    // Validate the Redis NodePort range
    if r.Spec.NodePort < 30000 || r.Spec.NodePort > 32767 {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "nodePort"),
          r.Spec.NodePort,
          "Redis nodePort must be between 30000 and 32767",
       ))
    }

    // Validate the Sentinel NodePort range
    if r.Spec.SentinelNodePort < 30000 || r.Spec.SentinelNodePort > 32767 {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "sentinelNodePort"),
          r.Spec.SentinelNodePort,
          "Sentinel nodePort must be between 30000 and 32767",
       ))
    }

    // Validate that the host path is safe
    if !isValidHostPath(r.Spec.Storage.HostPath) {
       allErrs = append(allErrs, field.Invalid(
          field.NewPath("spec", "storage", "hostPath"),
          r.Spec.Storage.HostPath,
          "invalid host path, only /data directory is allowed",
       ))
    }

    if len(allErrs) == 0 {
       return nil
    }

    return allErrs.ToAggregate()
}

// Helper: validate a storage size against a range
func validateStorageSize(size resource.Quantity, min, max string, path *field.Path, resourceType string) *field.Error {
    minSize := resource.MustParse(min)
    maxSize := resource.MustParse(max)

    if size.Cmp(minSize) < 0 {
       return field.Invalid(
          path,
          size.String(),
          fmt.Sprintf("%s storage size must be at least %s", resourceType, min),
       )
    }

    if size.Cmp(maxSize) > 0 {
       return field.Invalid(
          path,
          size.String(),
          fmt.Sprintf("%s storage size must be no more than %s", resourceType, max),
       )
    }

    return nil
}

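// Helper: validate the Sentinel replica count for quorum purposes
func validateSentinelQuorum(replicas int32) bool {
    // Sentinel needs a majority of its nodes online to run an election
    // Formula: majority = (replicas / 2) + 1
    return replicas >= MinQuorum && replicas%2 == 1
}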
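
The majority formula in the validateSentinelQuorum comment also explains why an odd Sentinel count is required. The following is a minimal sketch of that arithmetic. The helper names `majority` and `tolerated` are hypothetical and are not part of the webhook.

```go
package main

import "fmt"

// majority returns how many Sentinels must agree, using integer division:
// majority = (replicas / 2) + 1.
func majority(replicas int) int {
	return replicas/2 + 1
}

// tolerated returns how many Sentinels can fail while a majority remains.
func tolerated(replicas int) int {
	return replicas - majority(replicas)
}

func main() {
	for _, n := range []int{3, 4, 5, 7} {
		fmt.Printf("sentinels=%d majority=%d tolerated failures=%d\n",
			n, majority(n), tolerated(n))
	}
}
```

Going from 3 to 4 Sentinels raises the majority from 2 to 3 without tolerating any additional failure, so the extra even-numbered node adds cost but no resilience.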

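3. Verification

After publishing the CRD with the usual sequence of commands, we can start verifying. First, the basic flow:

Read and write through the master port
bash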
$ redis-cli -h 127.0.0.1 -p 6378
127.0.0.1:6378> get key2
(nil)
127.0.0.1:6378> set key2 hello
OK
Read through the replica port
bash
$ redis-cli -h 127.0.0.1 -p 6379
127.0.0.1:6379> get key2
"hello"
127.0.0.1:6379> set key2 hello1
(error) READONLY You can't write against a read only replica.

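As you can see, the replica accepts reads but rejects writes. This result is not guaranteed, though, because the long-lived connection on this port may well be attached to the master.

Sentinel port
bash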
$ redis-cli -h 127.0.0.1 -p 26379
127.0.0.1:26379> SENTINEL master mymaster # check the master
 1) "name"
 2) "mymaster"
 3) "ip"
 4) "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
 5) "port"
 6) "6379"
 7) "runid"
 8) "7792152f59bc4716a8d88a76cd39ed19c2bc0c92"
 9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "415"
19) "last-ping-reply"
20) "415"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "7391"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "25018208"
29) "config-epoch"
30) "0"
31) "num-slaves"
32) "2"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "10000"
39) "parallel-syncs"
40) "1"
127.0.0.1:26379> SENTINEL slaves mymaster # check the replicas
1)  1) "name"
    2) "10.244.1.5:6379"
    3) "ip"
    4) "10.244.1.5"
    5) "port"
    6) "6379"
    7) "runid"
    8) "099f9411d941e3bfe6888870afd260e9b5eea60e"
    9) "flags"
   10) "slave"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "247"
   19) "last-ping-reply"
   20) "247"
   21) "down-after-milliseconds"
   22) "5000"
   23) "info-refresh"
   24) "7115"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "25029569"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "ok"
   33) "master-host"
   34) "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
   35) "master-port"
   36) "6379"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "3549038"
   41) "replica-announced"
   42) "1"
2)  1) "name"
    2) "10.244.3.2:6379"
    3) "ip"
    4) "10.244.3.2"
    5) "port"
    6) "6379"
    7) "runid"
    8) "ae038757a97446ccc7325812d929b7c1e7a3fa0f"
    9) "flags"
   10) "slave"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "247"
   19) "last-ping-reply"
   20) "247"
   21) "down-after-milliseconds"
   22) "5000"
   23) "info-refresh"
   24) "7241"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "25029572"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "ok"
   33) "master-host"
   34) "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
   35) "master-port"
   36) "6379"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "3549038"
   41) "replica-announced"
   42) "1"
127.0.0.1:26379> SENTINEL get-master-addr-by-name mymaster
1) "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
2) "6379"
Failover verification

The master is currently redissentinel-sample-redis-0, so let's delete that Pod:

bash
$ k get pod --show-labels
NAME                              READY   STATUS    RESTARTS        AGE   LABELS
redissentinel-sample-redis-0      1/1     Running   2 (7h10m ago)   19d   app=redis-sentinel,component=redis,controller-revision-hash=redissentinel-sample-redis-9c894dbc9,name=redissentinel-sample,redis-role=master,statefulset.kubernetes.io/pod-name=redissentinel-sample-redis-0
redissentinel-sample-redis-1      1/1     Running   2 (7h10m ago)   19d   app=redis-sentinel,component=redis,controller-revision-hash=redissentinel-sample-redis-9c894dbc9,name=redissentinel-sample,redis-role=slave,statefulset.kubernetes.io/pod-name=redissentinel-sample-redis-1
redissentinel-sample-redis-2      1/1     Running   2 (7h10m ago)   19d   app=redis-sentinel,component=redis,controller-revision-hash=redissentinel-sample-redis-9c894dbc9,name=redissentinel-sample,redis-role=slave,statefulset.kubernetes.io/pod-name=redissentinel-sample-redis-2
$ k delete pod redissentinel-sample-redis-0
pod "redissentinel-sample-redis-0" deleted

Now go back to the master port:

bash
127.0.0.1:6378> get key2
"hello"
127.0.0.1:6378> set key2 hello1
OK
127.0.0.1:6378> get key2
"hello1"

The master port still serves both reads and writes. Next, check the replica port:

bash
127.0.0.1:6379> get key2
"hello1"
127.0.0.1:6379> set key2 hello
(error) READONLY You can't write against a read only replica.

Finally, ask the Sentinel port which node is the master now:

bash
127.0.0.1:26379> SENTINEL master mymaster
 ...
 3) "ip"
 4) "10.244.1.5"
 ...

That IP belongs to the third Redis Pod, redissentinel-sample-redis-2:

bash
$ k get pod -o wide
NAME                              READY   STATUS    RESTARTS        AGE     IP           NODE            NOMINATED NODE   READINESS GATES
redissentinel-sample-redis-0      1/1     Running   0               3m54s   10.244.2.6   multi-worker    <none>           <none>
redissentinel-sample-redis-1      1/1     Running   2 (7h21m ago)   19d     10.244.3.2   multi-worker2   <none>           <none>
redissentinel-sample-redis-2      1/1     Running   2 (7h21m ago)   19d     10.244.1.5   multi-worker3   <none>           <none>

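This approach still has significant problems, though. After repeated attempts I found:

  1. Existing redis-cli sessions have to reconnect afterwards, because they hold long-lived TCP connections;
  2. After a failover, it can take a while before the role moves over to the new master. More careful code could probably improve this, but since this is a demo whose real purpose is learning how to implement an Operator, I won't dig deeper here.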
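
Both problems can be softened on the client side by retrying a write when it hits a failover symptom. The following is a minimal sketch under stated assumptions, not part of the Operator: `writeFunc`, `flakyWriter`, `isFailoverError` and `writeWithRetry` are hypothetical names, and the real write is replaced by a fake that fails with the READONLY error seen above. A real client would open a fresh connection to the master port on each attempt.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// writeFunc is a hypothetical stand-in for "connect to the master port and
// run one write".
type writeFunc func() error

// flakyWriter simulates a failover window: the first `failures` calls hit a
// read-only replica, later calls succeed.
func flakyWriter(failures int) writeFunc {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errors.New("READONLY You can't write against a read only replica.")
		}
		return nil
	}
}

// isFailoverError reports whether err looks like a symptom of a failover in
// progress. The string checks are illustrative, not exhaustive.
func isFailoverError(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.HasPrefix(msg, "READONLY") || strings.Contains(msg, "connection refused")
}

// writeWithRetry retries write while it fails with a failover symptom,
// waiting delay between attempts. Other errors are returned immediately.
func writeWithRetry(write writeFunc, attempts int, delay time.Duration) error {
	var lastErr error
	for i := 0; i < attempts; i++ {
		lastErr = write()
		if lastErr == nil {
			return nil
		}
		if !isFailoverError(lastErr) {
			return lastErr
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("write failed after %d attempts: %w", attempts, lastErr)
}

func main() {
	// Two failed attempts simulate the window before the role label moves.
	err := writeWithRetry(flakyWriter(2), 5, 100*time.Millisecond)
	fmt.Println("error after retries:", err)
}
```

For Go clients, go-redis also provides a Sentinel-aware client (`redis.NewFailoverClient`) that looks up the current master through Sentinel. That client only helps when the Pod addresses Sentinel reports are reachable from the client.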