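0. Introduction
In the previous post we used a standalone Redis deployment to learn the basics of writing an Operator. In this post we build on that and deploy Redis in Sentinel mode.
Sentinel exists to make a Redis deployment highly available. Broadly, there are three ways to achieve high availability: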
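Redis Sentinel: an odd number (at least three) of Sentinel nodes monitor the master; when the master fails, they promote one of the remaining replicas to master, so the deployment keeps running. Operational complexity is low, which suits small and medium deployments.
Redis Cluster: data is split by hash-based sharding across multiple masters that share the write load, and each shard has its own master-replica replication for redundancy. Replication is asynchronous, so this does not give strong consistency. It is a decentralized scheme suited to medium and large deployments.
Third-party VIP schemes: these are easier to picture. The deployment exposes a virtual IP, and internal monitoring points it at the current master (read/write) and the replicas (read-only) in real time.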
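The "odd number, at least three" rule for Sentinels comes from majority voting: a failover has to be authorized by a majority of the Sentinels. Below is a minimal sketch of that arithmetic. The function names `majority` and `tolerableFailures` are made up for illustration and are not part of Redis or of this operator.

```go
package main

import "fmt"

// majority returns the smallest number of Sentinels that forms a strict
// majority of n.
func majority(n int) int {
	return n/2 + 1
}

// tolerableFailures returns how many Sentinels can be down while a
// majority of n is still reachable.
func tolerableFailures(n int) int {
	if n <= 0 {
		return 0
	}
	return n - majority(n)
}

func main() {
	for _, n := range []int{2, 3, 4, 5} {
		fmt.Printf("sentinels=%d majority=%d tolerable failures=%d\n",
			n, majority(n), tolerableFailures(n))
	}
}
```

Going from three Sentinels to four does not raise the number of failures the group can tolerate, which is why odd counts are the usual choice.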
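Of the three, Redis Sentinel is the best fit for our learning cluster (which, admittedly, runs on a single host). The overall architecture looks like this: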
[Figure: Redis Sentinel architecture. Clients (App1, App2, ..., AppN) first ask a Sentinel for an address and then connect directly: Client1 receives the master address and reads and writes there, Client2 receives a replica address for read-only access. Three Sentinel nodes exchange PING/PONG with one another (labelled "Gossip protocol" in the diagram), and each one PINGs the master every second. The master replicates to two replicas (SYNC). For failover, the Sentinels elect a leader Sentinel, which carries out the failover and promotes a replica to be the new master.]
Our example implements the architecture shown in the figure above: three Sentinel nodes and three Redis nodes.
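Step 1 of the client path is an ordinary Redis command sent to a Sentinel: `SENTINEL get-master-addr-by-name <master-name>`. As a rough illustration of what goes over the wire, the sketch below encodes that command in the Redis serialization protocol (RESP) as an array of bulk strings. The helper `encodeCommand` is hypothetical, and the master name `mymaster` is an assumption taken from the controller code later in this post; a real client library does this encoding for you.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeCommand serializes a command as a RESP array of bulk strings:
// "*<argc>\r\n" followed by "$<len>\r\n<arg>\r\n" for every argument.
func encodeCommand(args ...string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "*%d\r\n", len(args))
	for _, a := range args {
		fmt.Fprintf(&b, "$%d\r\n%s\r\n", len(a), a)
	}
	return b.String()
}

func main() {
	req := encodeCommand("SENTINEL", "get-master-addr-by-name", "mymaster")
	// %q makes the \r\n separators visible.
	fmt.Printf("%q\n", req)
}
```

The Sentinel answers with a two-element array, the master's host and port, and the client then connects to that address directly.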
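1. Development environment
The development environment is the same as in the previous post, except that the kind cluster configuration changes to: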
yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  # forward port 80 on the host to port 30950 on this node
  extraPortMappings:
  - containerPort: 30950
    hostPort: 80
    # optional: set the bind address on the host
    # 0.0.0.0 is the current default
    listenAddress: "127.0.0.1"
    # optional: set the protocol to one of TCP, UDP, SCTP.
    # TCP is the default
    protocol: TCP
  - containerPort: 30999
    hostPort: 6378
    protocol: TCP
  - containerPort: 31000
    hostPort: 6379
    protocol: TCP
  - containerPort: 31001
    hostPort: 26379
    protocol: TCP
- role: worker
  extraMounts:
  # map a host directory into the node container
  - hostPath: /Users/chenyiguo/workspace/k8s/kind/multi_data/worker1
    containerPath: /data
  labels:
    iguochan.io/redis-node: redis1
- role: worker
  extraMounts:
  # map a host directory into the node container
  - hostPath: /Users/chenyiguo/workspace/k8s/kind/multi_data/worker2
    containerPath: /data
  labels:
    iguochan.io/redis-node: redis2
- role: worker
  extraMounts:
  # map a host directory into the node container
  - hostPath: /Users/chenyiguo/workspace/k8s/kind/multi_data/worker3
    containerPath: /data
  labels:
    iguochan.io/redis-node: redis3
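Once the Sentinel deployment is up we will verify it with redis-cli, so we also expose a port for the master, used for writes. (Strictly speaking this is not the proper way: a client should ask the Sentinels for the master's address and then connect to that address.) That is what host port 6378 is for.
In addition, host port 6379 serves as the Redis read port, and host port 26379 as the Sentinel port.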
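To keep the three mappings straight, the sketch below restates them as a small lookup table: each entry pairs a NodePort from the kind config with the host port it is forwarded to. The type `portMapping` and the function `hostAddr` are illustrative names, not part of kind or of the operator, and the sketch assumes the forwarded ports are reachable on 127.0.0.1.

```go
package main

import "fmt"

// portMapping pairs a NodePort inside the kind cluster with the host port
// that kind forwards to it.
type portMapping struct {
	purpose  string
	nodePort int
	hostPort int
}

// Values copied from the kind configuration above.
var mappings = []portMapping{
	{purpose: "master", nodePort: 30999, hostPort: 6378},
	{purpose: "read", nodePort: 31000, hostPort: 6379},
	{purpose: "sentinel", nodePort: 31001, hostPort: 26379},
}

// hostAddr returns the address to use from the host for the given purpose.
// Assumption: the forwarded ports are reachable on 127.0.0.1.
func hostAddr(purpose string) (string, bool) {
	for _, m := range mappings {
		if m.purpose == purpose {
			return fmt.Sprintf("127.0.0.1:%d", m.hostPort), true
		}
	}
	return "", false
}

func main() {
	for _, m := range mappings {
		addr, _ := hostAddr(m.purpose)
		fmt.Printf("%-8s NodePort %d -> %s\n", m.purpose, m.nodePort, addr)
	}
}
```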
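Beyond that, each node carries a distinct label, so that each pod of the StatefulSets we create later is scheduled onto its matching node. The reasoning: every node has its own storage and configuration, and Redis is not a stateless service, so as far as I understand it, each pod should land on its corresponding machine. (I have not thought this through in depth; if you know a better approach, please share it in the comments.)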
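One way to make that pinning concrete, and the one the controller code later in this post uses for its PersistentVolumes, is to derive the node label value from the pod ordinal. A minimal sketch follows; `nodeLabelValue` is a hypothetical helper name, and the modulus assumes exactly the three labelled workers defined above.

```go
package main

import "fmt"

const (
	// nodeLabelKey is the label key set on the kind workers above.
	nodeLabelKey = "iguochan.io/redis-node"
	// labelledNodes is an assumption: the three workers redis1..redis3.
	labelledNodes = 3
)

// nodeLabelValue maps a StatefulSet pod ordinal (0, 1, 2, ...) to the label
// value of the node that should host its volume.
func nodeLabelValue(ordinal int) string {
	return fmt.Sprintf("redis%d", ordinal%labelledNodes+1)
}

func main() {
	for i := 0; i < 4; i++ {
		fmt.Printf("pod ordinal %d -> %s=%s\n", i, nodeLabelKey, nodeLabelValue(i))
	}
}
```

With more replicas than labelled nodes the ordinals wrap around, so two pods would end up sharing a node.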
2. Operator development
2.1 Creating the API
We create the new API on top of the existing project:
bash
$ kubebuilder create api --group cache --version v1 --kind RedisSentinel
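2.2 Implementing the Controller
First we need to settle on a design. The basic design is as follows: a RedisSentinel CR manages the whole deployment. Three separate Services expose the master port, the read port, and the Sentinel port described above, and StatefulSets manage the Redis and Sentinel pods.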
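All of these objects are named after the CR. The sketch below gathers the naming scheme used by the controller code in the next section into one place. The struct `resourceNames` and the function `namesFor` are illustrative only and do not exist in the operator; the CR name `demo` and namespace `default` are placeholders.

```go
package main

import "fmt"

// resourceNames lists the objects derived from one RedisSentinel CR.
type resourceNames struct {
	RedisStatefulSet    string
	SentinelStatefulSet string
	RedisHeadless       string
	SentinelHeadless    string
	MasterService       string // NodePort Service for writes
	ReadService         string // NodePort Service for reads
	SentinelService     string // NodePort Service for Sentinel
	InitialMasterHost   string // DNS name of Redis pod 0, the initial master
}

// namesFor derives every object name from the CR name and namespace.
func namesFor(name, namespace string) resourceNames {
	redis := name + "-redis"
	sentinel := name + "-sentinel"
	return resourceNames{
		RedisStatefulSet:    redis,
		SentinelStatefulSet: sentinel,
		RedisHeadless:       redis + "-headless",
		SentinelHeadless:    sentinel + "-headless",
		MasterService:       redis + "-master",
		ReadService:         redis,
		SentinelService:     sentinel,
		InitialMasterHost: fmt.Sprintf("%s-0.%s-headless.%s.svc.cluster.local",
			redis, redis, namespace),
	}
}

func main() {
	fmt.Printf("%+v\n", namesFor("demo", "default"))
}
```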
2.2.1 Defining the CRD
go
/*
Copyright 2025.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1
import (
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
// RedisSentinelSpec defines the desired state of RedisSentinel
type RedisSentinelSpec struct {
// Image: Redis Image
// +kubebuilder:default="redis:7.0"
Image string `json:"image,omitempty"`
// SentinelImage: Sentinel Image
// +kubebuilder:default="redis:7.0"
SentinelImage string `json:"sentinelImage,omitempty"`
// RedisReplicas: Number of Redis instances
// +kubebuilder:default=3
RedisReplicas int32 `json:"redisReplicas,omitempty"`
// SentinelReplicas: Number of Sentinel instances
// +kubebuilder:default=3
SentinelReplicas int32 `json:"sentinelReplicas,omitempty"`
// MasterNodePort: NodePort for external access to the Redis master (writes)
// +kubebuilder:validation:Minimum=30000
// +kubebuilder:validation:Maximum=32767
// +kubebuilder:default=30999
MasterNodePort int32 `json:"masterNodePort,omitempty"`
// NodePort: NodePort for external read access to Redis
// +kubebuilder:validation:Minimum=30000
// +kubebuilder:validation:Maximum=32767
// +kubebuilder:default=31000
NodePort int32 `json:"nodePort,omitempty"`
// SentinelNodePort: Sentinel NodePort for external access
// +kubebuilder:validation:Minimum=30000
// +kubebuilder:validation:Maximum=32767
// +kubebuilder:default=31001
SentinelNodePort int32 `json:"sentinelNodePort,omitempty"`
// Storage configuration
Storage RedisSentinelStorageSpec `json:"storage,omitempty"`
}
// RedisSentinelStorageSpec defines storage configuration for RedisSentinel
type RedisSentinelStorageSpec struct {
// Storage size
// +kubebuilder:default="1Gi"
Size resource.Quantity `json:"size,omitempty"`
// Host path directory
// +kubebuilder:default="/data"
HostPath string `json:"hostPath,omitempty"`
}
// RedisSentinelStatus defines the observed state of RedisSentinel
type RedisSentinelStatus struct {
// Deployment phase
Phase RedisPhase `json:"phase,omitempty"`
// Redis endpoint
Endpoint string `json:"endpoint,omitempty"`
// Sentinel endpoint
SentinelEndpoint string `json:"sentinelEndpoint,omitempty"`
// Master node name
Master string `json:"master,omitempty"`
LastRoleUpdateTime metav1.Time `json:"lastRoleUpdateTime,omitempty"`
}
//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
//+kubebuilder:resource:path=redissentinels,scope=Namespaced,shortName=rss
//+kubebuilder:printcolumn:JSONPath=".status.phase",name=phase,type=string,description="Current phase"
//+kubebuilder:printcolumn:name="RedisEndpoint",type="string",JSONPath=".status.endpoint",description="Redis endpoint"
//+kubebuilder:printcolumn:name="SentinelEndpoint",type="string",JSONPath=".status.sentinelEndpoint",description="Sentinel endpoint"
//+kubebuilder:printcolumn:name="Image",type="string",JSONPath=".spec.image",description="Redis image"
//+kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp",description="Creation time"
// RedisSentinel is the Schema for the redissentinels API
type RedisSentinel struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec RedisSentinelSpec `json:"spec,omitempty"`
Status RedisSentinelStatus `json:"status,omitempty"`
}
//+kubebuilder:object:root=true
// RedisSentinelList contains a list of RedisSentinel
type RedisSentinelList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []RedisSentinel `json:"items"`
}
func init() {
SchemeBuilder.Register(&RedisSentinel{}, &RedisSentinelList{})
}
The code above defines the RedisSentinel CRD. Compared with RedisStandalone it has quite a few more fields, including the Sentinel image and the three ports described earlier.
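The `+kubebuilder:default` and `+kubebuilder:validation` markers are applied by the API server through the generated CRD schema, not by Go code. To show what they amount to, here is a plain-Go sketch of the same rules: defaults for unset ports and the 30000-32767 range. It also rejects duplicate ports, which the markers above do not express; that check is an extra added here for illustration. The type `ports` and both functions are hypothetical stand-ins, not generated code.

```go
package main

import (
	"errors"
	"fmt"
)

// ports mirrors the three NodePort fields of RedisSentinelSpec.
type ports struct {
	MasterNodePort   int32
	NodePort         int32
	SentinelNodePort int32
}

// withDefaults fills unset (zero) ports with the defaults from the markers.
func withDefaults(p ports) ports {
	if p.MasterNodePort == 0 {
		p.MasterNodePort = 30999
	}
	if p.NodePort == 0 {
		p.NodePort = 31000
	}
	if p.SentinelNodePort == 0 {
		p.SentinelNodePort = 31001
	}
	return p
}

// validate checks the Minimum/Maximum bounds from the markers and, as an
// extra illustrative rule, that the three ports are distinct.
func validate(p ports) error {
	checks := []struct {
		name string
		port int32
	}{
		{"masterNodePort", p.MasterNodePort},
		{"nodePort", p.NodePort},
		{"sentinelNodePort", p.SentinelNodePort},
	}
	for _, c := range checks {
		if c.port < 30000 || c.port > 32767 {
			return fmt.Errorf("%s %d is outside 30000-32767", c.name, c.port)
		}
	}
	if p.MasterNodePort == p.NodePort ||
		p.MasterNodePort == p.SentinelNodePort ||
		p.NodePort == p.SentinelNodePort {
		return errors.New("node ports must be distinct")
	}
	return nil
}

func main() {
	p := withDefaults(ports{NodePort: 31500})
	fmt.Println(p, validate(p))
	fmt.Println(validate(ports{MasterNodePort: 80, NodePort: 31000, SentinelNodePort: 31001}))
}
```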
2.2.2 Implementing the controller
go
/*
Copyright 2025.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package controller
import (
"context"
"fmt"
"strings"
"time"
cachev1 "github.com/IguoChan/redis-operator/api/v1"
"github.com/go-redis/redis"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/intstr"
"k8s.io/client-go/tools/record"
"k8s.io/utils/pointer"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/log"
)
// RedisSentinelReconciler reconciles a RedisSentinel object
type RedisSentinelReconciler struct {
client.Client
Scheme *runtime.Scheme
Recorder record.EventRecorder
}
const (
MasterPort = 6378
SentinelPort = 26379
)
//+kubebuilder:rbac:groups=cache.iguochan.io,resources=redissentinels,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=cache.iguochan.io,resources=redissentinels/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=cache.iguochan.io,resources=redissentinels/finalizers,verbs=update
//+kubebuilder:rbac:groups=apps,resources=statefulsets,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;update;patch
//+kubebuilder:rbac:groups=core,resources=services,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=persistentvolumeclaims,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=persistentvolumes,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=configmaps,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=endpoints,verbs=create;get;list;update;watch;patch
//+kubebuilder:rbac:groups="",resources=events,verbs=create;patch
func (r *RedisSentinelReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
logger := log.FromContext(ctx)
logger.Info("Reconciling RedisSentinel", "request", req.NamespacedName)
// Fetch the RedisSentinel instance
redisSentinel := &cachev1.RedisSentinel{}
if err := r.Get(ctx, req.NamespacedName, redisSentinel); err != nil {
if errors.IsNotFound(err) {
// CR deleted: nothing left to reconcile, so do not requeue
return ctrl.Result{}, nil
}
// A returned error already triggers a requeue with backoff
return ctrl.Result{}, err
}
// Check whether any Pod has been deleted
if r.checkPodDeletion(ctx, redisSentinel) {
logger.Info("Detected pod deletion, waiting for failover")
return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
}
// Reconcile Redis StatefulSet
if err := r.reconcileRedisStatefulSet(ctx, redisSentinel); err != nil {
logger.Error(err, "Failed to reconcile Redis StatefulSet")
return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
}
// Reconcile Sentinel StatefulSet
if err := r.reconcileSentinelStatefulSet(ctx, redisSentinel); err != nil {
logger.Error(err, "Failed to reconcile Sentinel StatefulSet")
return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
}
// Reconcile Redis Service
if err := r.reconcileRedisService(ctx, redisSentinel); err != nil {
logger.Error(err, "Failed to reconcile Redis Service")
return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
}
// Reconcile Sentinel Service
if err := r.reconcileSentinelService(ctx, redisSentinel); err != nil {
logger.Error(err, "Failed to reconcile Sentinel Service")
return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
}
// Reconcile ConfigMaps
if err := r.reconcileConfigMaps(ctx, redisSentinel); err != nil {
logger.Error(err, "Failed to reconcile ConfigMaps")
return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
}
// Reconcile Persistent Volumes
if err := r.reconcilePersistentVolumes(ctx, redisSentinel); err != nil {
logger.Error(err, "Failed to reconcile Persistent Volumes")
return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
}
// Update the role labels
if err := r.updateRedisRoleLabels(ctx, redisSentinel); err != nil {
logger.Error(err, "Failed to update Redis role labels")
r.Recorder.Eventf(redisSentinel, corev1.EventTypeWarning, "LabelUpdateFailed",
"Failed to update Redis role labels: %v", err)
return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseError, err)
}
// Create/update the master Endpoints
if err := r.reconcileRedisMasterEndpoints(ctx, redisSentinel); err != nil {
logger.Error(err, "Failed to reconcile Redis master Endpoints")
return ctrl.Result{RequeueAfter: 30 * time.Second}, err
}
// Update status
return ctrl.Result{RequeueAfter: 30 * time.Second}, r.updateStatus(ctx, redisSentinel, cachev1.RedisPhaseReady, nil)
}
func (r *RedisSentinelReconciler) reconcileRedisStatefulSet(ctx context.Context, rs *cachev1.RedisSentinel) error {
logger := log.FromContext(ctx)
name := rs.Name + "-redis"
// Get replicas
replicas := rs.Spec.RedisReplicas
if replicas < 1 {
replicas = 3
}
sts := &appsv1.StatefulSet{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "redis"),
},
Spec: appsv1.StatefulSetSpec{
ServiceName: name + "-headless",
Replicas: pointer.Int32(replicas),
Selector: &metav1.LabelSelector{
MatchLabels: labelsForRedisSentinel(rs.Name, "redis"),
},
Template: corev1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: labelsForRedisSentinel(rs.Name, "redis"),
},
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{
Name: "redis",
Image: rs.Spec.Image,
ImagePullPolicy: corev1.PullIfNotPresent,
Ports: []corev1.ContainerPort{
{
Name: "redis",
ContainerPort: RedisPort,
},
},
VolumeMounts: []corev1.VolumeMount{
{
Name: "redis-config",
MountPath: "/redis-config",
},
{
Name: "redis-data",
MountPath: "/data",
},
},
Command: []string{
"sh", "-c",
"sh /redis-config/init.sh",
},
},
},
Volumes: []corev1.Volume{
{
Name: "redis-config",
VolumeSource: corev1.VolumeSource{
ConfigMap: &corev1.ConfigMapVolumeSource{
LocalObjectReference: corev1.LocalObjectReference{
Name: rs.Name + "-redis-config",
},
},
},
},
},
},
},
VolumeClaimTemplates: []corev1.PersistentVolumeClaim{
{
ObjectMeta: metav1.ObjectMeta{
Name: "redis-data",
},
Spec: corev1.PersistentVolumeClaimSpec{
AccessModes: []corev1.PersistentVolumeAccessMode{
corev1.ReadWriteOnce,
},
Resources: corev1.ResourceRequirements{
Requests: corev1.ResourceList{
corev1.ResourceStorage: rs.Spec.Storage.Size,
},
},
StorageClassName: pointer.String("redis-storage"),
},
},
},
},
}
// Set controller reference
if err := ctrl.SetControllerReference(rs, sts, r.Scheme); err != nil {
return err
}
// Create or update StatefulSet
foundSts := &appsv1.StatefulSet{}
err := r.Get(ctx, types.NamespacedName{Name: sts.Name, Namespace: sts.Namespace}, foundSts)
if err != nil && errors.IsNotFound(err) {
logger.Info("Creating Redis StatefulSet", "name", sts.Name)
if err := r.Create(ctx, sts); err != nil {
return err
}
} else if err != nil {
return err
} else {
logger.Info("Updating Redis StatefulSet", "name", sts.Name)
sts.Spec.DeepCopyInto(&foundSts.Spec)
if err := r.Update(ctx, foundSts); err != nil {
return err
}
}
return nil
}
func (r *RedisSentinelReconciler) reconcileSentinelStatefulSet(ctx context.Context, rs *cachev1.RedisSentinel) error {
logger := log.FromContext(ctx)
name := rs.Name + "-sentinel"
// Get replicas
replicas := rs.Spec.SentinelReplicas
if replicas < 1 {
replicas = 3
}
sts := &appsv1.StatefulSet{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "sentinel"),
},
Spec: appsv1.StatefulSetSpec{
ServiceName: name + "-headless",
Replicas: pointer.Int32(replicas),
Selector: &metav1.LabelSelector{
MatchLabels: labelsForRedisSentinel(rs.Name, "sentinel"),
},
Template: corev1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: labelsForRedisSentinel(rs.Name, "sentinel"),
},
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{
Name: "sentinel",
Image: rs.Spec.SentinelImage,
ImagePullPolicy: corev1.PullIfNotPresent,
Ports: []corev1.ContainerPort{
{
Name: "sentinel",
ContainerPort: SentinelPort,
},
},
Command: []string{"sh", "-c",
"sh /sentinel-config/init.sh",
},
VolumeMounts: []corev1.VolumeMount{
{
Name: "sentinel-config",
MountPath: "/sentinel-config",
},
{
Name: "sentinel-config-dir",
MountPath: "/tmp", // writable directory
},
},
},
},
Volumes: []corev1.Volume{
{
Name: "sentinel-config",
VolumeSource: corev1.VolumeSource{
ConfigMap: &corev1.ConfigMapVolumeSource{
LocalObjectReference: corev1.LocalObjectReference{
Name: fmt.Sprintf("%s-sentinel-config", rs.Name),
},
},
},
},
{
Name: "sentinel-config-dir",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{},
},
},
},
},
},
},
}
// Set controller reference
if err := ctrl.SetControllerReference(rs, sts, r.Scheme); err != nil {
return err
}
// Create or update StatefulSet
foundSts := &appsv1.StatefulSet{}
err := r.Get(ctx, types.NamespacedName{Name: sts.Name, Namespace: sts.Namespace}, foundSts)
if err != nil && errors.IsNotFound(err) {
logger.Info("Creating Sentinel StatefulSet", "name", sts.Name)
if err := r.Create(ctx, sts); err != nil {
return err
}
} else if err != nil {
return err
} else {
logger.Info("Updating Sentinel StatefulSet", "name", sts.Name)
sts.Spec.DeepCopyInto(&foundSts.Spec)
if err := r.Update(ctx, foundSts); err != nil {
return err
}
}
return nil
}
func (r *RedisSentinelReconciler) reconcileRedisService(ctx context.Context, rs *cachev1.RedisSentinel) error {
// Headless Service for StatefulSet
headlessSvc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: rs.Name + "-redis-headless",
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "redis"),
},
Spec: corev1.ServiceSpec{
ClusterIP: corev1.ClusterIPNone,
Selector: labelsForRedisSentinel(rs.Name, "redis"),
Ports: []corev1.ServicePort{
{
Name: "redis",
Port: RedisPort,
TargetPort: intstr.FromInt(int(RedisPort)),
},
},
},
}
// NodePort Service for external access
nodePortSvc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: rs.Name + "-redis",
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "redis"),
},
Spec: corev1.ServiceSpec{
Type: corev1.ServiceTypeNodePort,
Selector: labelsForRedisSentinel(rs.Name, "redis"),
Ports: []corev1.ServicePort{
{
Name: "redis",
Port: RedisPort,
TargetPort: intstr.FromInt(int(RedisPort)),
NodePort: rs.Spec.NodePort,
},
},
},
}
// NodePort Master Service
selector := labelsForRedisSentinel(rs.Name, "redis")
selector["redis-role"] = "master"
masterNodePortSvc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: rs.Name + "-redis-master",
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "redis"),
},
Spec: corev1.ServiceSpec{
Type: corev1.ServiceTypeNodePort,
Selector: selector,
Ports: []corev1.ServicePort{
{
Name: "redis",
Port: MasterPort,
TargetPort: intstr.FromInt(int(RedisPort)),
NodePort: rs.Spec.MasterNodePort,
},
},
},
}
// Set controller references
if err := ctrl.SetControllerReference(rs, headlessSvc, r.Scheme); err != nil {
return err
}
if err := ctrl.SetControllerReference(rs, nodePortSvc, r.Scheme); err != nil {
return err
}
if err := ctrl.SetControllerReference(rs, masterNodePortSvc, r.Scheme); err != nil {
return err
}
// Create or update services
if err := r.createOrUpdateService(ctx, headlessSvc); err != nil {
return err
}
if err := r.createOrUpdateService(ctx, nodePortSvc); err != nil {
return err
}
return r.createOrUpdateService(ctx, masterNodePortSvc)
}
func (r *RedisSentinelReconciler) reconcileSentinelService(ctx context.Context, rs *cachev1.RedisSentinel) error {
// Headless Service for StatefulSet
headlessSvc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: rs.Name + "-sentinel-headless",
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "sentinel"),
},
Spec: corev1.ServiceSpec{
ClusterIP: corev1.ClusterIPNone,
Selector: labelsForRedisSentinel(rs.Name, "sentinel"),
Ports: []corev1.ServicePort{
{
Name: "sentinel",
Port: SentinelPort,
TargetPort: intstr.FromInt(int(SentinelPort)),
},
},
},
}
// NodePort Service for external access
nodePortSvc := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: rs.Name + "-sentinel",
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "sentinel"),
},
Spec: corev1.ServiceSpec{
Type: corev1.ServiceTypeNodePort,
Selector: labelsForRedisSentinel(rs.Name, "sentinel"),
Ports: []corev1.ServicePort{
{
Name: "sentinel",
Port: SentinelPort,
TargetPort: intstr.FromInt(int(SentinelPort)),
NodePort: rs.Spec.SentinelNodePort,
},
},
},
}
// Set controller references
if err := ctrl.SetControllerReference(rs, headlessSvc, r.Scheme); err != nil {
return err
}
if err := ctrl.SetControllerReference(rs, nodePortSvc, r.Scheme); err != nil {
return err
}
// Create or update services
if err := r.createOrUpdateService(ctx, headlessSvc); err != nil {
return err
}
return r.createOrUpdateService(ctx, nodePortSvc)
}
func (r *RedisSentinelReconciler) reconcileConfigMaps(ctx context.Context, rs *cachev1.RedisSentinel) error {
masterHost := fmt.Sprintf("%s-redis-0.%s-redis-headless.%s.svc.cluster.local", rs.Name, rs.Name, rs.Namespace)
// Redis ConfigMap
redisCM := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: rs.Name + "-redis-config",
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "redis"),
},
Data: map[string]string{
"redis-master.conf": redisMasterConfig,
"redis-replica.conf": redisReplicaConfig,
"init.sh": redisInitSh(masterHost),
},
}
sentinelCM := &corev1.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: rs.Name + "-sentinel-config",
Namespace: rs.Namespace,
Labels: labelsForRedisSentinel(rs.Name, "sentinel"),
},
Data: map[string]string{
"sentinel.conf": sentinelConfig(masterHost, rs.Spec.SentinelReplicas),
"init.sh": sentinelInitConfig,
},
}
// Set controller references
if err := ctrl.SetControllerReference(rs, redisCM, r.Scheme); err != nil {
return err
}
if err := ctrl.SetControllerReference(rs, sentinelCM, r.Scheme); err != nil {
return err
}
// Create or update ConfigMaps
if err := r.createOrUpdateConfigMap(ctx, redisCM); err != nil {
return err
}
return r.createOrUpdateConfigMap(ctx, sentinelCM)
}
func (r *RedisSentinelReconciler) reconcilePersistentVolumes(ctx context.Context, rs *cachev1.RedisSentinel) error {
logger := log.FromContext(ctx)
replicas := rs.Spec.RedisReplicas
if replicas < 1 {
replicas = 3
}
// Create PVs for each Redis instance
for i := 0; i < int(replicas); i++ {
pvName := fmt.Sprintf("%s-redis-pv-%d", rs.Name, i)
pvPath := fmt.Sprintf("%s/%s/redis-%d", rs.Spec.Storage.HostPath, rs.Name, i)
pv := &corev1.PersistentVolume{
ObjectMeta: metav1.ObjectMeta{
Name: pvName,
},
Spec: corev1.PersistentVolumeSpec{
Capacity: corev1.ResourceList{
corev1.ResourceStorage: rs.Spec.Storage.Size,
},
AccessModes: []corev1.PersistentVolumeAccessMode{
corev1.ReadWriteOnce,
},
PersistentVolumeReclaimPolicy: corev1.PersistentVolumeReclaimRetain,
StorageClassName: "redis-storage",
PersistentVolumeSource: corev1.PersistentVolumeSource{
HostPath: &corev1.HostPathVolumeSource{
Path: pvPath,
},
},
NodeAffinity: &corev1.VolumeNodeAffinity{
Required: &corev1.NodeSelector{
NodeSelectorTerms: []corev1.NodeSelectorTerm{
{
MatchExpressions: []corev1.NodeSelectorRequirement{
{
Key: "iguochan.io/redis-node",
Operator: corev1.NodeSelectorOpIn,
Values: []string{fmt.Sprintf("redis%d", i%3+1)},
},
},
},
},
},
},
},
}
// Create PV
if err := r.Create(ctx, pv); err != nil {
if !errors.IsAlreadyExists(err) {
logger.Error(err, "Failed to create PV", "name", pv.Name)
return err
}
logger.Info("PV already exists", "name", pv.Name)
}
}
// Create PVs for each Sentinel instance (if needed)
for i := 0; i < int(rs.Spec.SentinelReplicas); i++ {
pvName := fmt.Sprintf("%s-sentinel-pv-%d", rs.Name, i)
pvPath := fmt.Sprintf("%s/%s/sentinel-%d", rs.Spec.Storage.HostPath, rs.Name, i)
pv := &corev1.PersistentVolume{
ObjectMeta: metav1.ObjectMeta{
Name: pvName,
},
Spec: corev1.PersistentVolumeSpec{
Capacity: corev1.ResourceList{
corev1.ResourceStorage: resource.MustParse("100Mi"),
},
AccessModes: []corev1.PersistentVolumeAccessMode{
corev1.ReadWriteOnce,
},
PersistentVolumeReclaimPolicy: corev1.PersistentVolumeReclaimRetain,
StorageClassName: "redis-storage",
PersistentVolumeSource: corev1.PersistentVolumeSource{
HostPath: &corev1.HostPathVolumeSource{
Path: pvPath,
},
},
NodeAffinity: &corev1.VolumeNodeAffinity{
Required: &corev1.NodeSelector{
NodeSelectorTerms: []corev1.NodeSelectorTerm{
{
MatchExpressions: []corev1.NodeSelectorRequirement{
{
Key: "iguochan.io/redis-node",
Operator: corev1.NodeSelectorOpIn,
Values: []string{fmt.Sprintf("redis%d", i%3+1)},
},
},
},
},
},
},
},
}
// Create PV
if err := r.Create(ctx, pv); err != nil {
if !errors.IsAlreadyExists(err) {
logger.Error(err, "Failed to create PV", "name", pv.Name)
return err
}
logger.Info("PV already exists", "name", pv.Name)
}
}
return nil
}
func (r *RedisSentinelReconciler) updateStatus(ctx context.Context, rs *cachev1.RedisSentinel, phase cachev1.RedisPhase, err error) error {
if err != nil && phase != cachev1.RedisPhaseReady {
rs.Status.Phase = phase
_ = r.Status().Update(ctx, rs)
return fmt.Errorf("reconcile failed, phase %s: %w", phase, err)
}
// 1. Make sure all Sentinel Pods are ready
sentinelReady, sentinelErr := r.checkPodsReady(ctx, rs, "sentinel")
if !sentinelReady {
return sentinelErr
}
// 2. Make sure all Redis Pods are ready (the master must be ready)
redisReady, redisErr := r.checkPodsReady(ctx, rs, "redis")
if !redisReady {
return redisErr
}
// 3. Make sure the Sentinel Service is available
sentinelSvc, svcErr := r.validateService(ctx, rs, rs.Name+"-sentinel")
if svcErr != nil {
return svcErr
}
// 4. Make sure the Redis Service is available (this one selects all Redis Pods, not only the master)
redisSvc, svcErr := r.validateService(ctx, rs, rs.Name+"-redis")
if svcErr != nil {
return svcErr
}
// 5. Update the status
rs.Status.SentinelEndpoint = fmt.Sprintf("%s:%d", sentinelSvc.Spec.ClusterIP, sentinelSvc.Spec.Ports[0].Port)
rs.Status.Endpoint = fmt.Sprintf("%s:%d", redisSvc.Spec.ClusterIP, redisSvc.Spec.Ports[0].Port)
rs.Status.Phase = cachev1.RedisPhaseReady
return r.Status().Update(ctx, rs)
}
// Helper: check whether all Pods of the given component are ready
func (r *RedisSentinelReconciler) checkPodsReady(ctx context.Context, rs *cachev1.RedisSentinel, role string) (bool, error) {
podList := &corev1.PodList{}
labels := client.MatchingLabels{
"app": "redis-sentinel",
"name": rs.Name,
"component": role, // "redis" or "sentinel"
}
// Restrict the query to the CR's namespace so same-named CRs elsewhere are not mixed in
if err := r.List(ctx, podList, client.InNamespace(rs.Namespace), labels); err != nil {
r.Recorder.Eventf(rs, corev1.EventTypeWarning, RecordReasonFailed,
"list %s pods failed: %s", role, err.Error())
return false, err
}
if len(podList.Items) == 0 {
msg := fmt.Sprintf("no %s pods available", role)
r.Recorder.Event(rs, corev1.EventTypeNormal, RecordReasonWaiting, msg)
rs.Status.Phase = cachev1.RedisPhasePending
return false, r.Status().Update(ctx, rs)
}
allReady := true
for _, pod := range podList.Items {
if !isPodReady(pod) {
allReady = false
break
}
}
if !allReady {
msg := fmt.Sprintf("not all %s pods are ready", role)
r.Recorder.Event(rs, corev1.EventTypeNormal, RecordReasonWaiting, msg)
rs.Status.Phase = cachev1.RedisPhasePending
return false, r.Status().Update(ctx, rs)
}
return true, nil
}
// Helper: check whether a Pod is ready
func isPodReady(pod corev1.Pod) bool {
for _, cond := range pod.Status.Conditions {
if cond.Type == corev1.PodReady {
return cond.Status == corev1.ConditionTrue
}
}
return false
}
// Helper: verify that a Service exists and has at least one endpoint
func (r *RedisSentinelReconciler) validateService(ctx context.Context, rs *cachev1.RedisSentinel, svcName string) (*corev1.Service, error) {
svc := &corev1.Service{}
key := types.NamespacedName{Namespace: rs.Namespace, Name: svcName}
// Fetch the Service object
if err := r.Get(ctx, key, svc); err != nil {
r.Recorder.Eventf(rs, corev1.EventTypeWarning, RecordReasonFailed,
"get %s service failed: %s", svcName, err.Error())
rs.Status.Phase = cachev1.RedisPhaseError
// Return the original error; returning only the status-update result
// could yield (nil, nil) and a nil dereference in the caller
_ = r.Status().Update(ctx, rs)
return nil, err
}
// Verify the matching Endpoints
endpoints := &corev1.Endpoints{}
if err := r.Get(ctx, key, endpoints); err != nil {
r.Recorder.Eventf(rs, corev1.EventTypeWarning, RecordReasonFailed,
"get %s endpoints failed: %s", svcName, err.Error())
rs.Status.Phase = cachev1.RedisPhaseError
_ = r.Status().Update(ctx, rs)
return nil, err
}
// Check that at least one endpoint address is available
if len(endpoints.Subsets) == 0 || len(endpoints.Subsets[0].Addresses) == 0 {
r.Recorder.Eventf(rs, corev1.EventTypeWarning, RecordReasonFailed,
"%s service has no endpoints", svcName)
rs.Status.Phase = cachev1.RedisPhaseError
_ = r.Status().Update(ctx, rs)
return nil, fmt.Errorf("%s service has no endpoints", svcName)
}
return svc, nil
}
// Helper functions
func labelsForRedisSentinel(name, role string) map[string]string {
return map[string]string{
"app": "redis-sentinel",
"name": name,
"component": role,
}
}
func (r *RedisSentinelReconciler) createOrUpdateService(ctx context.Context, svc *corev1.Service) error {
foundSvc := &corev1.Service{}
err := r.Get(ctx, types.NamespacedName{Name: svc.Name, Namespace: svc.Namespace}, foundSvc)
if err != nil && errors.IsNotFound(err) {
return r.Create(ctx, svc)
} else if err != nil {
return err
}
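// Sync Port, TargetPort and NodePort from the desired Service onto the existing one
if svc.Spec.Type == corev1.ServiceTypeNodePort {
for i, p := range svc.Spec.Ports {
if i >= len(foundSvc.Spec.Ports) {
break // the existing Service has fewer ports; avoid indexing out of range
}
foundSvc.Spec.Ports[i].Port = p.Port
foundSvc.Spec.Ports[i].TargetPort = p.TargetPort
foundSvc.Spec.Ports[i].NodePort = p.NodePort
}
}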
return r.Update(ctx, foundSvc)
}
func (r *RedisSentinelReconciler) createOrUpdateConfigMap(ctx context.Context, cm *corev1.ConfigMap) error {
foundCM := &corev1.ConfigMap{}
err := r.Get(ctx, types.NamespacedName{Name: cm.Name, Namespace: cm.Namespace}, foundCM)
if err != nil && errors.IsNotFound(err) {
return r.Create(ctx, cm)
} else if err != nil {
return err
}
foundCM.Data = cm.Data
return r.Update(ctx, foundCM)
}
func (r *RedisSentinelReconciler) updateRedisRoleLabels(ctx context.Context, rs *cachev1.RedisSentinel) error {
// List all Redis Pods of this CR
podList := &corev1.PodList{}
if err := r.List(ctx, podList, client.InNamespace(rs.Namespace), client.MatchingLabels{
"app": "redis-sentinel",
"name": rs.Name,
"component": "redis",
}); err != nil {
return err
}
// Ask a Sentinel for the current master address
var ip string
var err error
sentinelPods := &corev1.PodList{}
if err = r.List(ctx, sentinelPods, client.InNamespace(rs.Namespace), client.MatchingLabels{
"app": "redis-sentinel",
"name": rs.Name,
"component": "sentinel",
}); err != nil {
return fmt.Errorf("list sentinel pods: %w", err)
}
if len(sentinelPods.Items) == 0 {
return fmt.Errorf("no sentinel pods found for %s", rs.Name)
}
if ip, _, err = r.getSentinelMasterAddr(ctx, &sentinelPods.Items[0]); err != nil {
return err
}
// Update the Pod labels
for _, pod := range podList.Items {
newRole := "slave"
// Sentinel may report the master by IP or by hostname. Skip the hostname
// check when it is empty, because strings.Contains(ip, "") is always true.
if pod.Status.PodIP == ip || (pod.Spec.Hostname != "" && strings.Contains(ip, pod.Spec.Hostname)) {
newRole = "master"
}
if pod.Labels["redis-role"] != newRole {
patch := client.MergeFrom(pod.DeepCopy())
if pod.Labels == nil {
pod.Labels = make(map[string]string)
}
pod.Labels["redis-role"] = newRole
if err := r.Patch(ctx, &pod, patch); err != nil {
return err
}
}
}
return nil
}
func (r *RedisSentinelReconciler) reconcileRedisMasterEndpoints(ctx context.Context, rs *cachev1.RedisSentinel) error {
// Find the master Pod
podList := &corev1.PodList{}
if err := r.List(ctx, podList, client.InNamespace(rs.Namespace), client.MatchingLabels{
"app": "redis-sentinel",
"name": rs.Name,
"component": "redis",
"redis-role": "master",
}); err != nil {
return err
}
if len(podList.Items) == 0 {
return nil // no master yet
}
masterPod := podList.Items[0]
if masterPod.Status.PodIP == "" {
return fmt.Errorf("reconcileRedisMasterEndpoints: masterPod.Status.PodIP is empty")
}
endpoints := &corev1.Endpoints{
ObjectMeta: metav1.ObjectMeta{
Name: rs.Name + "-redis-master",
Namespace: rs.Namespace,
},
Subsets: []corev1.EndpointSubset{
{
Addresses: []corev1.EndpointAddress{
{
IP: masterPod.Status.PodIP,
TargetRef: &corev1.ObjectReference{
Kind: "Pod",
Name: masterPod.Name,
Namespace: masterPod.Namespace,
},
},
},
Ports: []corev1.EndpointPort{
{
Port: RedisPort,
},
},
},
},
}
// 设置控制器引用
if err := ctrl.SetControllerReference(rs, endpoints, r.Scheme); err != nil {
return err
}
// 创建或更新端点
found := &corev1.Endpoints{}
err := r.Get(ctx, types.NamespacedName{Name: endpoints.Name, Namespace: endpoints.Namespace}, found)
if err != nil && errors.IsNotFound(err) {
return r.Create(ctx, endpoints)
} else if err != nil {
return err
}
// Compare and update
needsUpdate := false
if len(found.Subsets) == 0 {
needsUpdate = true
} else if len(found.Subsets[0].Addresses) == 0 ||
found.Subsets[0].Addresses[0].IP != masterPod.Status.PodIP {
needsUpdate = true
}
if needsUpdate {
found.Subsets = endpoints.Subsets
return r.Update(ctx, found)
}
return nil
}
func (r *RedisSentinelReconciler) getSentinelMasterAddr(ctx context.Context, sentinelPod *corev1.Pod) (string, string, error) {
logger := log.FromContext(ctx)
sentinelAddr := fmt.Sprintf("%s:%d", sentinelPod.Status.PodIP, SentinelPort)
sentinelClient := redis.NewSentinelClient(&redis.Options{
Addr: sentinelAddr,
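Password: "", // set this if Sentinel requires a password
DB: 0,
})
defer sentinelClient.Close() // release the connection when this function returns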
// Retry to tolerate a failover that may be in progress
var masterIP, masterPort string
var lastErr error
// Retry up to 5 times, 2 seconds apart
for i := 0; i < 5; i++ {
result, err := sentinelClient.GetMasterAddrByName("mymaster").Result()
if err == nil && len(result) >= 2 {
masterIP = result[0]
masterPort = result[1]
// Verify that the reported master Pod actually exists
if r.isPodAlive(ctx, masterIP) {
logger.Info(fmt.Sprintf("getSentinelMasterAddr: %s, %s", masterIP, masterPort))
return masterIP, masterPort, nil
}
logger.Info("Master IP reported but pod not alive", "ip", masterIP)
} else if err != nil {
lastErr = err
}
// Wait 2 seconds before retrying
time.Sleep(2 * time.Second)
}
return "", "", fmt.Errorf("failed to get valid master address after 5 attempts: %v", lastErr)
}
// Check whether a Pod matching the given address actually exists
func (r *RedisSentinelReconciler) isPodAlive(ctx context.Context, ip string) bool {
pods := &corev1.PodList{}
if err := r.List(ctx, pods); err != nil {
return false
}
for _, pod := range pods.Items {
if pod.Status.PodIP == ip || (pod.Spec.Hostname != "" && strings.Contains(ip, pod.Spec.Hostname)) {
return pod.DeletionTimestamp == nil
}
}
return false
}
func (r *RedisSentinelReconciler) checkPodDeletion(ctx context.Context, rs *cachev1.RedisSentinel) bool {
// List all Redis Pods
podList := &corev1.PodList{}
if err := r.List(ctx, podList, client.MatchingLabels{
"app": "redis-sentinel",
"name": rs.Name,
"component": "redis",
}); err != nil {
return false
}
// Check whether any Pod is being deleted
for _, pod := range podList.Items {
if pod.DeletionTimestamp != nil {
return true
}
}
return false
}
// SetupWithManager sets up the controller with the Manager.
func (r *RedisSentinelReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&cachev1.RedisSentinel{}).
Owns(&appsv1.StatefulSet{}).
Owns(&corev1.Service{}).
Owns(&corev1.ConfigMap{}).
Owns(&corev1.Endpoints{}).
Complete(r)
}
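A few points are worth noting here. For example, the controller labels each Pod with its current role (master or slave), so that the xxx-redis-master Service can always find the master node.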
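The labeling decision itself can be isolated into a small pure function. The following is a minimal sketch, not the operator's actual code: `roleFor` is a hypothetical helper name, and it assumes that Sentinel reports either a Pod IP or a headless-service DNS name that starts with the Pod's hostname, as in the outputs shown later in this post.

```go
package main

import (
	"fmt"
	"strings"
)

// roleFor returns the value for the "redis-role" label of a single Pod.
// masterAddr is whatever Sentinel reports: a Pod IP or a DNS name.
// roleFor is a hypothetical helper used for illustration only.
func roleFor(masterAddr, podIP, hostname string) string {
	if podIP != "" && masterAddr == podIP {
		return "master"
	}
	// Match "<hostname>." as a prefix so that an empty hostname never matches
	// and "redis-1" is not confused with "redis-10".
	if hostname != "" && strings.HasPrefix(masterAddr, hostname+".") {
		return "master"
	}
	return "slave"
}

func main() {
	master := "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
	fmt.Println(roleFor(master, "10.244.2.6", "redissentinel-sample-redis-0"))
	fmt.Println(roleFor(master, "10.244.3.2", "redissentinel-sample-redis-1"))
	fmt.Println(roleFor("10.244.1.5", "10.244.1.5", "redissentinel-sample-redis-2"))
}
```

Compared with a plain substring check, the prefix match is a deliberate choice: an empty hostname would otherwise match every address.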
2.2.3 Admission control
bash
kubebuilder create webhook --group cache --version v1 --kind RedisSentinel --defaulting --programmatic-validation
The command above scaffolds the admission webhook for RedisSentinel. Its implementation is shown below without further commentary:
go
/*
Copyright 2025.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1
import (
"fmt"
"k8s.io/apimachinery/pkg/api/resource"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/util/validation/field"
ctrl "sigs.k8s.io/controller-runtime"
logf "sigs.k8s.io/controller-runtime/pkg/log"
"sigs.k8s.io/controller-runtime/pkg/webhook"
)
const (
MinRedisSize = "1Gi" // minimum Redis storage size
MaxRedisSize = "100Gi" // maximum Redis storage size
MinSentinelSize = "100Mi" // minimum Sentinel storage size
MaxSentinelSize = "1Gi" // maximum Sentinel storage size
MinReplicas = 3 // minimum number of replicas
MaxReplicas = 7 // maximum number of replicas
MinQuorum = 2 // minimum Sentinel quorum
)
// log is for logging in this package.
var redissentinellog = logf.Log.WithName("redissentinel-resource")
func (r *RedisSentinel) SetupWebhookWithManager(mgr ctrl.Manager) error {
return ctrl.NewWebhookManagedBy(mgr).
For(r).
Complete()
}
// TODO(user): EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
//+kubebuilder:webhook:path=/mutate-cache-iguochan-io-v1-redissentinel,mutating=true,failurePolicy=fail,sideEffects=None,groups=cache.iguochan.io,resources=redissentinels,verbs=create;update,versions=v1,name=mredissentinel.kb.io,admissionReviewVersions=v1
var _ webhook.Defaulter = &RedisSentinel{}
// Default implements webhook.Defaulter so a webhook will be registered for the type
func (r *RedisSentinel) Default() {
redissentinellog.Info("default", "name", r.Name)
// Default Redis image
if r.Spec.Image == "" {
r.Spec.Image = "redis:7.0"
redissentinellog.Info("Setting default Redis image", "image", r.Spec.Image)
}
// Default Sentinel image
if r.Spec.SentinelImage == "" {
r.Spec.SentinelImage = "redis:7.0"
redissentinellog.Info("Setting default Sentinel image", "image", r.Spec.SentinelImage)
}
// Default NodePort for the Redis master
if r.Spec.MasterNodePort == 0 {
r.Spec.MasterNodePort = 30999
redissentinellog.Info("Setting default Redis Master NodePort", "nodePort", r.Spec.MasterNodePort)
}
// Default NodePort for Redis
if r.Spec.NodePort == 0 {
r.Spec.NodePort = 31000
redissentinellog.Info("Setting default Redis NodePort", "nodePort", r.Spec.NodePort)
}
// Default NodePort for Sentinel
if r.Spec.SentinelNodePort == 0 {
r.Spec.SentinelNodePort = 31001
redissentinellog.Info("Setting default Sentinel NodePort", "sentinelNodePort", r.Spec.SentinelNodePort)
}
// Default number of Redis replicas
if r.Spec.RedisReplicas == 0 {
r.Spec.RedisReplicas = 3
redissentinellog.Info("Setting default Redis replicas", "redisReplicas", r.Spec.RedisReplicas)
}
// Default number of Sentinel replicas
if r.Spec.SentinelReplicas == 0 {
r.Spec.SentinelReplicas = 3
redissentinellog.Info("Setting default Sentinel replicas", "sentinelReplicas", r.Spec.SentinelReplicas)
}
// Default storage path
if r.Spec.Storage.HostPath == "" {
r.Spec.Storage.HostPath = "/data"
redissentinellog.Info("Setting default host path", "hostPath", r.Spec.Storage.HostPath)
}
// Default Redis storage size
if r.Spec.Storage.Size.IsZero() {
size := resource.MustParse("1Gi")
r.Spec.Storage.Size = size
redissentinellog.Info("Setting default Redis storage size", "size", size.String())
}
}
// TODO(user): change verbs to "verbs=create;update;delete" if you want to enable deletion validation.
//+kubebuilder:webhook:path=/validate-cache-iguochan-io-v1-redissentinel,mutating=false,failurePolicy=fail,sideEffects=None,groups=cache.iguochan.io,resources=redissentinels,verbs=create;update,versions=v1,name=vredissentinel.kb.io,admissionReviewVersions=v1
var _ webhook.Validator = &RedisSentinel{}
// ValidateCreate implements webhook.Validator so a webhook will be registered for the type
func (r *RedisSentinel) ValidateCreate() error {
redissentinellog.Info("validate create", "name", r.Name)
return r.validateRedisSentinel()
}
// ValidateUpdate implements webhook.Validator so a webhook will be registered for the type
func (r *RedisSentinel) ValidateUpdate(old runtime.Object) error {
redissentinellog.Info("validate update", "name", r.Name)
oldSentinel, ok := old.(*RedisSentinel)
if !ok {
return fmt.Errorf("expected a RedisSentinel object but got %T", old)
}
if err := r.validateRedisSentinel(); err != nil {
return err
}
// Reject changes to immutable fields
if oldSentinel.Spec.Image != r.Spec.Image {
return field.Forbidden(
field.NewPath("spec", "image"),
"Redis image cannot be changed after creation",
)
}
if oldSentinel.Spec.SentinelImage != r.Spec.SentinelImage {
return field.Forbidden(
field.NewPath("spec", "sentinelImage"),
"Sentinel image cannot be changed after creation",
)
}
if oldSentinel.Spec.Storage.HostPath != r.Spec.Storage.HostPath {
return field.Forbidden(
field.NewPath("spec", "storage", "hostPath"),
"hostPath cannot be changed after creation",
)
}
return nil
}
// ValidateDelete implements webhook.Validator so a webhook will be registered for the type
func (r *RedisSentinel) ValidateDelete() error {
redissentinellog.Info("validate delete", "name", r.Name)
// TODO(user): fill in your validation logic upon object deletion.
return nil
}
// validateRedisSentinel runs all validation logic
func (r *RedisSentinel) validateRedisSentinel() error {
allErrs := field.ErrorList{}
// Validate the Redis storage size range
if err := validateStorageSize(
r.Spec.Storage.Size,
MinRedisSize,
MaxRedisSize,
field.NewPath("spec", "storage", "size"),
"Redis"); err != nil {
allErrs = append(allErrs, err)
}
// Validate the number of Sentinel replicas
if r.Spec.SentinelReplicas < MinReplicas {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "sentinelReplicas"),
r.Spec.SentinelReplicas,
fmt.Sprintf("Sentinel replicas must be at least %d", MinReplicas),
))
} else if r.Spec.SentinelReplicas > MaxReplicas {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "sentinelReplicas"),
r.Spec.SentinelReplicas,
fmt.Sprintf("Sentinel replicas must be no more than %d", MaxReplicas),
))
}
// Validate the number of Redis replicas
if r.Spec.RedisReplicas < MinReplicas {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "redisReplicas"),
r.Spec.RedisReplicas,
fmt.Sprintf("Redis replicas must be at least %d", MinReplicas),
))
} else if r.Spec.RedisReplicas > MaxReplicas {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "redisReplicas"),
r.Spec.RedisReplicas,
fmt.Sprintf("Redis replicas must be no more than %d", MaxReplicas),
))
}
// Validate the Sentinel quorum requirement
if r.Spec.SentinelReplicas < MinQuorum*2-1 {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "sentinelReplicas"),
r.Spec.SentinelReplicas,
fmt.Sprintf("Sentinel replicas must be at least %d for a quorum of %d", MinQuorum*2-1, MinQuorum),
))
}
// Validate the Redis master NodePort range
if r.Spec.MasterNodePort < 30000 || r.Spec.MasterNodePort > 32767 {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "masterNodePort"), // assumes the JSON field name is masterNodePort
r.Spec.MasterNodePort,
"Redis master nodePort must be between 30000 and 32767",
))
}
// Validate the Redis NodePort range
if r.Spec.NodePort < 30000 || r.Spec.NodePort > 32767 {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "nodePort"),
r.Spec.NodePort,
"Redis nodePort must be between 30000 and 32767",
))
}
// Validate the Sentinel NodePort range
if r.Spec.SentinelNodePort < 30000 || r.Spec.SentinelNodePort > 32767 {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "sentinelNodePort"),
r.Spec.SentinelNodePort,
"Sentinel nodePort must be between 30000 and 32767",
))
}
// Validate that the host path is safe
if !isValidHostPath(r.Spec.Storage.HostPath) {
allErrs = append(allErrs, field.Invalid(
field.NewPath("spec", "storage", "hostPath"),
r.Spec.Storage.HostPath,
"invalid host path, only /data directory is allowed",
))
}
if len(allErrs) == 0 {
return nil
}
return allErrs.ToAggregate()
}
// Helper: validate the storage size
func validateStorageSize(size resource.Quantity, min, max string, path *field.Path, resourceType string) *field.Error {
minSize := resource.MustParse(min)
maxSize := resource.MustParse(max)
if size.Cmp(minSize) < 0 {
return field.Invalid(
path,
size.String(),
fmt.Sprintf("%s storage size must be at least %s", resourceType, min),
)
}
if size.Cmp(maxSize) > 0 {
return field.Invalid(
path,
size.String(),
fmt.Sprintf("%s storage size must be no more than %s", resourceType, max),
)
}
return nil
}
// Helper that validates the Sentinel replica count
func validateSentinelQuorum(replicas int32) bool {
// Sentinel needs a majority of nodes online to elect a leader
// Formula: majority = (replicas/2) + 1
return replicas >= MinQuorum && replicas%2 == 1
}
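The majority formula mentioned in the comment above is easy to check by hand. As a standalone illustration (not part of the webhook file; `majority` and `tolerable` are hypothetical helper names), the sketch below prints how many Sentinel failures each cluster size can survive. In Sentinel, the quorum is the number of Sentinels that must agree the master is down, while the failover itself has to be authorized by a majority of all Sentinels.

```go
package main

import "fmt"

// majority returns how many Sentinels must be reachable to authorize a failover.
func majority(replicas int) int {
	return replicas/2 + 1
}

// tolerable returns how many Sentinels may fail while a majority still remains.
func tolerable(replicas int) int {
	return replicas - majority(replicas)
}

func main() {
	for _, n := range []int{3, 4, 5, 7} {
		fmt.Printf("sentinels=%d majority=%d tolerated failures=%d\n",
			n, majority(n), tolerable(n))
	}
}
```

Four Sentinels tolerate no more failures than three, which is why odd counts are the usual recommendation.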
3. Verification
After publishing the CRD with the usual series of commands, we can start verifying. First, the basic flow:
Read and write through the master port
bash
$ redis-cli -h 127.0.0.1 -p 6378
127.0.0.1:6378> get key2
(nil)
127.0.0.1:6378> set key2 hello
OK
Read through the replica port
bash
$ redis-cli -h 127.0.0.1 -p 6379
127.0.0.1:6379> get key2
"hello"
127.0.0.1:6379> set key2 hello1
(error) READONLY You can't write against a read only replica.
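As shown, the replica accepts reads but rejects writes. This is not guaranteed, though: the long-lived connection may well have landed on the master node.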
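To find out which kind of node a connection actually landed on, one option is to run `INFO replication` on it and read the `role` field. The sketch below only parses that text and does not talk to Redis; `parseRole` is a hypothetical helper, and the sample input is a shortened, hand-written reply rather than output captured from this cluster.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseRole extracts the value of the "role" field from the text
// returned by INFO replication. The boolean reports whether it was found.
func parseRole(info string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(info))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "role:") {
			return strings.TrimPrefix(line, "role:"), true
		}
	}
	return "", false
}

func main() {
	// Hand-written sample in the INFO replication format.
	sample := "# Replication\r\nrole:slave\r\nmaster_port:6379\r\n"
	role, ok := parseRole(sample)
	fmt.Println(role, ok)
}
```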
Sentinel port
bash
$ redis-cli -h 127.0.0.1 -p 26379
127.0.0.1:26379> SENTINEL master mymaster # check the master
1) "name"
2) "mymaster"
3) "ip"
4) "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
5) "port"
6) "6379"
7) "runid"
8) "7792152f59bc4716a8d88a76cd39ed19c2bc0c92"
9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "415"
19) "last-ping-reply"
20) "415"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "7391"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "25018208"
29) "config-epoch"
30) "0"
31) "num-slaves"
32) "2"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "10000"
39) "parallel-syncs"
40) "1"
127.0.0.1:26379> SENTINEL slaves mymaster # check the replicas
1) 1) "name"
2) "10.244.1.5:6379"
3) "ip"
4) "10.244.1.5"
5) "port"
6) "6379"
7) "runid"
8) "099f9411d941e3bfe6888870afd260e9b5eea60e"
9) "flags"
10) "slave"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "247"
19) "last-ping-reply"
20) "247"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "7115"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "25029569"
29) "master-link-down-time"
30) "0"
31) "master-link-status"
32) "ok"
33) "master-host"
34) "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "3549038"
41) "replica-announced"
42) "1"
2) 1) "name"
2) "10.244.3.2:6379"
3) "ip"
4) "10.244.3.2"
5) "port"
6) "6379"
7) "runid"
8) "ae038757a97446ccc7325812d929b7c1e7a3fa0f"
9) "flags"
10) "slave"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "247"
19) "last-ping-reply"
20) "247"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "7241"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "25029572"
29) "master-link-down-time"
30) "0"
31) "master-link-status"
32) "ok"
33) "master-host"
34) "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "3549038"
41) "replica-announced"
42) "1"
127.0.0.1:26379> SENTINEL get-master-addr-by-name mymaster
1) "redissentinel-sample-redis-0.redissentinel-sample-redis-headless.default.svc.cluster.local"
2) "6379"
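The `SENTINEL master` and `SENTINEL slaves` replies above are flat arrays of alternating field names and values. When consuming them from code, it can be convenient to fold them into a map first. The following is a minimal sketch under that assumption; `pairsToMap` is a hypothetical helper, and the input is a hand-picked subset of the fields shown above.

```go
package main

import "fmt"

// pairsToMap folds a flat [key, value, key, value, ...] reply into a map.
// It returns an error if the reply has an odd number of elements.
func pairsToMap(reply []string) (map[string]string, error) {
	if len(reply)%2 != 0 {
		return nil, fmt.Errorf("odd number of elements: %d", len(reply))
	}
	m := make(map[string]string, len(reply)/2)
	for i := 0; i < len(reply); i += 2 {
		m[reply[i]] = reply[i+1]
	}
	return m, nil
}

func main() {
	// A subset of the SENTINEL master reply shown above.
	reply := []string{
		"name", "mymaster",
		"port", "6379",
		"flags", "master",
		"num-slaves", "2",
		"quorum", "2",
	}
	m, err := pairsToMap(reply)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m["name"], m["num-slaves"], m["quorum"])
}
```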
Failover verification
The master node is currently redissentinel-sample-redis-0, so let's delete that Pod:
bash
$ k get pod --show-labels
NAME READY STATUS RESTARTS AGE LABELS
redissentinel-sample-redis-0 1/1 Running 2 (7h10m ago) 19d app=redis-sentinel,component=redis,controller-revision-hash=redissentinel-sample-redis-9c894dbc9,name=redissentinel-sample,redis-role=master,statefulset.kubernetes.io/pod-name=redissentinel-sample-redis-0
redissentinel-sample-redis-1 1/1 Running 2 (7h10m ago) 19d app=redis-sentinel,component=redis,controller-revision-hash=redissentinel-sample-redis-9c894dbc9,name=redissentinel-sample,redis-role=slave,statefulset.kubernetes.io/pod-name=redissentinel-sample-redis-1
redissentinel-sample-redis-2 1/1 Running 2 (7h10m ago) 19d app=redis-sentinel,component=redis,controller-revision-hash=redissentinel-sample-redis-9c894dbc9,name=redissentinel-sample,redis-role=slave,statefulset.kubernetes.io/pod-name=redissentinel-sample-redis-2
$ k delete pod redissentinel-sample-redis-0
pod "redissentinel-sample-redis-0" deleted
Now go back to the master port:
bash
127.0.0.1:6378> get key2
"hello"
127.0.0.1:6378> set key2 hello1
OK
127.0.0.1:6378> get key2
"hello1"
The master port still accepts both reads and writes. Next, check the replica port:
bash
127.0.0.1:6379> get key2
"hello1"
127.0.0.1:6379> set key2 hello
(error) READONLY You can't write against a read only replica.
Finally, check the current master on the Sentinel port:
bash
127.0.0.1:26379> SENTINEL master mymaster
...
3) "ip"
4) "10.244.1.5"
...
This IP belongs to the third Redis node, redissentinel-sample-redis-2:
bash
$ k get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redissentinel-sample-redis-0 1/1 Running 0 3m54s 10.244.2.6 multi-worker <none> <none>
redissentinel-sample-redis-1 1/1 Running 2 (7h21m ago) 19d 10.244.3.2 multi-worker2 <none> <none>
redissentinel-sample-redis-2 1/1 Running 2 (7h21m ago) 19d 10.244.1.5 multi-worker3 <none> <none>
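This approach still has significant problems, though. After several attempts I found that:
- After a failover, redis-cli has to reconnect, because these are long-lived TCP connections;
- After a failover, it can take a while for the role label to move over to the new master. More careful code could probably fix this, but since this is a demo whose real purpose is to learn how an Operator is implemented, I won't dig deeper here.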
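One possible direction for the second point is to make the waiting itself better behaved. The retry loop in getSentinelMasterAddr sleeps unconditionally, so it keeps blocking even if the reconcile context has already been cancelled. The sketch below is an assumption about how that could be restructured, not the code used in this demo: `pollUntil` and `demo` are hypothetical names, and the polled function is a stand-in for the Sentinel query.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pollUntil calls fn until it succeeds, attempts run out, or ctx is cancelled.
// attempts should be at least 1.
func pollUntil(ctx context.Context, attempts int, interval time.Duration, fn func() error) error {
	var lastErr error
	for i := 0; i < attempts; i++ {
		if lastErr = fn(); lastErr == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err() // stop waiting as soon as the caller gives up
		case <-time.After(interval):
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, lastErr)
}

// demo simulates a master that becomes available on the third query.
func demo() (int, error) {
	calls := 0
	err := pollUntil(context.Background(), 5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errors.New("master not ready")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err)
}
```

In a real controller it is usually preferable to return a requeue request instead of waiting inside Reconcile at all; the helper above only shows the cancellable form of the same loop.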