Overall architecture
Informer
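An Informer is a tool for tracking and handling changes to objects in a Kubernetes cluster, such as Pods, Services, and Nodes. It offers a convenient way to watch for and react to resources being added, modified, and deleted. An Informer talks to the Kubernetes API server, receives resource changes as they happen, and triggers the matching event-handling logic. We can use Informers to build custom controllers that monitor and process cluster resources in real time.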
Cache
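The Cache in client-go is a local caching mechanism for storing and managing Kubernetes resource objects. It lets a client application cache and track the cluster's resource objects, which reduces the number of requests sent to the API server and improves performance.
When the application starts, the Cache loads its initial data from the API server and then stays in sync with it. It applies additions, modifications, and deletions of resource objects to the local copy automatically, so the application can read objects quickly without contacting the API server each time.
With the Cache, developers read cached resource objects locally instead of issuing an API request per operation, which improves the application's performance and responsiveness. The Cache also provides convenience methods for querying and filtering resource objects.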
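The idea of a local, thread-safe cache can be sketched with the standard library alone. This is a simplified model, not client-go's implementation: the `Object` and `Store` types and their methods are hypothetical names chosen for illustration, and the only thing borrowed from the text is the idea of keeping objects in memory and reading them without a round trip to the API server.

```go
package main

import (
	"fmt"
	"sync"
)

// Object is a hypothetical stand-in for a Kubernetes resource.
type Object struct {
	Namespace, Name string
	Replicas        int
}

// Store is a simplified local cache keyed by "namespace/name".
type Store struct {
	mu    sync.RWMutex
	items map[string]Object
}

func NewStore() *Store { return &Store{items: make(map[string]Object)} }

func storeKey(ns, name string) string { return ns + "/" + name }

// Upsert records an added or modified object.
func (s *Store) Upsert(o Object) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[storeKey(o.Namespace, o.Name)] = o
}

// Delete removes an object that no longer exists upstream.
func (s *Store) Delete(ns, name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.items, storeKey(ns, name))
}

// Get reads from memory only; no request leaves the process.
func (s *Store) Get(ns, name string) (Object, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	o, ok := s.items[storeKey(ns, name)]
	return o, ok
}

func main() {
	s := NewStore()
	s.Upsert(Object{Namespace: "default", Name: "nginx-deployment", Replicas: 3})
	o, ok := s.Get("default", "nginx-deployment")
	fmt.Println(o.Replicas, ok) // 3 true
	s.Delete("default", "nginx-deployment")
	_, ok = s.Get("default", "nginx-deployment")
	fmt.Println(ok) // false
}
```

The read/write mutex lets many readers look objects up concurrently while updates from the sync loop are applied one at a time.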
Queue
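Queue is the work-queue implementation in client-go. It manages the items waiting to be processed and hands them out in the order they were added.
Queue is mainly used for events or operations that need asynchronous handling: the application adds pending items to the queue, and workers take them off when they are ready to process them. It also supports delayed and rate-limited re-queuing of items.
A queue keeps concurrent processing under control: the same item is not handed to several workers at once, which avoids races and conflicts, and it helps the application process work asynchronously with better throughput and responsiveness.
In client-go, the Queue is combined with an Informer and a Controller to build efficient, reliable controllers or processing logic for events and operations in a Kubernetes cluster.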
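The queue behaviour described above (duplicates collapse, and an item that is still being processed is not handed out a second time) can be modelled in a few lines. This is a single-goroutine sketch under stated assumptions: the `Queue` type here is hypothetical, it does not block or lock, and it is not the client-go `workqueue` API.

```go
package main

import "fmt"

// Queue is a simplified, non-concurrent model of a de-duplicating work queue.
type Queue struct {
	order      []string
	dirty      map[string]bool // keys waiting to be processed
	processing map[string]bool // keys currently handed out
}

func NewQueue() *Queue {
	return &Queue{dirty: map[string]bool{}, processing: map[string]bool{}}
}

// Add enqueues a key unless it is already waiting.
func (q *Queue) Add(k string) {
	if q.dirty[k] {
		return
	}
	q.dirty[k] = true
	if q.processing[k] {
		return // Done re-queues it once the current run finishes
	}
	q.order = append(q.order, k)
}

// Get hands out the oldest waiting key; ok is false when nothing is waiting.
func (q *Queue) Get() (k string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	k, q.order = q.order[0], q.order[1:]
	delete(q.dirty, k)
	q.processing[k] = true
	return k, true
}

// Done marks a key finished and re-queues it if it was added meanwhile.
func (q *Queue) Done(k string) {
	delete(q.processing, k)
	if q.dirty[k] {
		q.order = append(q.order, k)
	}
}

// Len reports how many keys are ready to be handed out.
func (q *Queue) Len() int { return len(q.order) }

func main() {
	q := NewQueue()
	q.Add("default/nginx-deployment")
	q.Add("default/nginx-deployment") // duplicate, collapsed
	q.Add("kube-system/coredns")
	fmt.Println(q.Len()) // 2

	k, _ := q.Get()
	q.Add(k)             // changed again while being processed
	fmt.Println(q.Len()) // 1
	q.Done(k)
	fmt.Println(q.Len()) // 2
}
```

Forgetting to call `Done` in this model leaves the key in the processing set, so later additions of the same key never become ready again.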
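Source walkthrough of watch in client-go
In client-go, the Watch API monitors a specific resource: the API server sends the client an event each time a matching object changes, so the client learns about state changes promptly. Used directly, this adds load on the API server, because every caller holds its own long-lived request open and the retry loop in Request.Watch re-issues the request whenever an attempt fails.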
go
// https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/kubernetes/typed/core/v1/pod.go#L105
func (c *pods) Watch(ctx context.Context, opts metav1.ListOptions) (watch.Interface, error) {
	var timeout time.Duration
	if opts.TimeoutSeconds != nil {
		timeout = time.Duration(*opts.TimeoutSeconds) * time.Second
	}
	opts.Watch = true
	return c.client.Get().
		Namespace(c.ns).
		Resource("pods").
		VersionedParams(&opts, scheme.ParameterCodec).
		Timeout(timeout).
		Watch(ctx)
}

// <https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/rest/request.go#L703>
func (r *Request) Watch(ctx context.Context) (watch.Interface, error) {
	...
	for {
		if err := retry.Before(ctx, r); err != nil {
			return nil, retry.WrapPreviousError(err)
		}
		req, err := r.newHTTPRequest(ctx)
		if err != nil {
			return nil, err
		}
		resp, err := client.Do(req)
		retry.After(ctx, r, resp, err)
		if err == nil && resp.StatusCode == http.StatusOK {
			return r.newStreamWatcher(resp)
		}
		....
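In StreamWatcher's receive method, a loop decodes each received response into an Event carrying a Type and an Object. The client reads Events from the result channel. The loop exits when decoding fails (including a normal end of stream) or when the sw.done signal is received.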
go
// https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/watch/streamwatcher.go#L100
func (sw *StreamWatcher) receive() {
	defer utilruntime.HandleCrash()
	defer close(sw.result)
	defer sw.Stop()
	for {
		action, obj, err := sw.source.Decode()
		if err != nil {
			switch err {
			case io.EOF:
				// watch closed normally
			case io.ErrUnexpectedEOF:
				klog.V(1).Infof("Unexpected EOF during watch stream event decoding: %v", err)
			default:
				if net.IsProbableEOF(err) || net.IsTimeout(err) {
					klog.V(5).Infof("Unable to decode an event from the watch stream: %v", err)
				} else {
					select {
					case <-sw.done:
					case sw.result <- Event{
						Type:   Error,
						Object: sw.reporter.AsObject(fmt.Errorf("unable to decode an event from the watch stream: %v", err)),
					}:
					}
				}
			}
			return
		}
		select {
		case <-sw.done:
			return
		case sw.result <- Event{
			Type:   action,
			Object: obj,
		}:
		}
	}
}
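The receive loop can be approximated with the standard library to make the data flow concrete. This sketch assumes the watch stream is a sequence of JSON documents, each with a `type` and an `object` field; the `Event` type, `receive`, and `collect` below are illustrative names, not the apimachinery implementation, and error reporting is reduced to simply ending the loop.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Event models one watch event: a type plus the raw object it refers to.
type Event struct {
	Type   string          `json:"type"`
	Object json.RawMessage `json:"object"`
}

// receive decodes events from r and forwards them on result until the
// stream ends, decoding fails, or stop is closed.
func receive(r io.Reader, result chan<- Event, stop <-chan struct{}) {
	defer close(result)
	dec := json.NewDecoder(r)
	for {
		var ev Event
		if err := dec.Decode(&ev); err != nil {
			return // any decode error, including io.EOF, ends the loop
		}
		select {
		case <-stop:
			return
		case result <- ev:
		}
	}
}

// collect runs receive over an in-memory stream and gathers the event types.
func collect(stream string) []string {
	result := make(chan Event)
	stop := make(chan struct{})
	go receive(strings.NewReader(stream), result, stop)
	var types []string
	for ev := range result {
		types = append(types, ev.Type)
	}
	return types
}

func main() {
	stream := `{"type":"ADDED","object":{"metadata":{"name":"nginx-deployment"}}}
{"type":"DELETED","object":{"metadata":{"name":"nginx-deployment"}}}`
	fmt.Println(collect(stream)) // [ADDED DELETED]
}
```

Closing the result channel when the loop ends is what lets the consumer's `range` terminate, mirroring the `defer close(sw.result)` in the original.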
For every response it receives, the request checks whether a retry is needed; if so, the loop continues with another attempt.
go
// <https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/rest/request.go#L744>
if retry.IsNextRetry(ctx, r, req, resp, err, isErrRetryableFunc) {
	return false, nil
}
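For these reasons we generally do not use watch directly to track Kubernetes resources, and use an Informer instead.
Benefits of an Informer:
- Reduces the load on the API server
- Handles the whole watch lifecycle automatically, including error handling
- Keeps objects in an in-memory store in a thread-safe way
- Provides handler hooks that help us receive the objects
The Informer plays a key role in the whole process:
- It syncs the watch events coming from the Reflector
- It dispatches watch events to notify the custom controller to handle them
An Informer runs handler logic only for the event types that have a handler registered.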
go
// <https://github.com/kubernetes/client-go/blob/master/tools/cache/delta_fifo.go#L603>
err := process(item, isInInitialList)
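The "only registered events reach a handler" rule can be illustrated with a small dispatcher. This is a toy model: `HandlerFuncs` and `dispatch` are hypothetical names that loosely echo the shape of a struct of optional callbacks, and objects are plain strings to keep the sketch self-contained.

```go
package main

import "fmt"

// HandlerFuncs is a toy set of optional callbacks; a nil field means
// the caller did not register interest in that event type.
type HandlerFuncs struct {
	AddFunc    func(obj string)
	UpdateFunc func(oldObj, newObj string)
	DeleteFunc func(obj string)
}

// dispatch calls the callback registered for eventType, if there is one,
// and reports whether a callback ran.
func dispatch(h HandlerFuncs, eventType, oldObj, newObj string) bool {
	switch {
	case eventType == "add" && h.AddFunc != nil:
		h.AddFunc(newObj)
	case eventType == "update" && h.UpdateFunc != nil:
		h.UpdateFunc(oldObj, newObj)
	case eventType == "delete" && h.DeleteFunc != nil:
		h.DeleteFunc(oldObj)
	default:
		return false
	}
	return true
}

func main() {
	h := HandlerFuncs{
		AddFunc:    func(obj string) { fmt.Println("add item", obj) },
		DeleteFunc: func(obj string) { fmt.Println("delete item", obj) },
	}
	fmt.Println(dispatch(h, "add", "", "nginx-deployment"))                    // true
	fmt.Println(dispatch(h, "update", "nginx-deployment", "nginx-deployment")) // false
	fmt.Println(dispatch(h, "delete", "nginx-deployment", ""))                 // true
}
```

Because no update callback is set, the update event is dropped silently rather than causing an error.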
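On the custom controller side, we only need to take an item off the queue, look up the corresponding object, and then run our own business logic.
SharedInformerFactory
SharedInformerFactory is responsible for creating the various Informers, such as the Deployment Informer. The resync period passed to the factory controls how often an Informer redelivers the objects already in its cache to the registered handlers; keeping the cache itself current relies on list and watch against the API server.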
go
// <https://github1s.com/kubernetes/client-go/blob/master/informers/factory.go#L125>
// NewSharedInformerFactoryWithOptions constructs a new instance of a SharedInformerFactory with additional options.
func NewSharedInformerFactoryWithOptions(client kubernetes.Interface, defaultResync time.Duration, options ...SharedInformerOption) SharedInformerFactory {
	factory := &sharedInformerFactory{
		client:           client,
		namespace:        v1.NamespaceAll,
		defaultResync:    defaultResync,
		informers:        make(map[reflect.Type]cache.SharedIndexInformer),
		startedInformers: make(map[reflect.Type]bool),
		customResync:     make(map[reflect.Type]time.Duration),
	}

	// Apply all options
	for _, opt := range options {
		factory = opt(factory)
	}

	return factory
}
Watching Deployment creation and deletion with an Informer
Create a Deployment informer, register the event handlers on it, then start the informer.
go
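// v1 here is the k8s.io/client-go/informers/apps/v1 package.
func deployInformer(clientset *kubernetes.Clientset) v1.DeploymentInformer {
	sharedInformers := informers.NewSharedInformerFactory(clientset, 1*time.Second)
	depInformer := sharedInformers.Apps().V1().Deployments()
	// Requesting the informer registers it with the factory; Start only
	// runs informers that have been requested.
	depInformer.Informer()
	// ch is never closed, so the informer runs for the life of the process.
	ch := make(chan struct{})
	sharedInformers.Start(ch)
	sharedInformers.WaitForCacheSync(ch)
	return depInformer
}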
Once the informer has started, it creates a thread-safe store that holds the objects synced from the API server. The informer's lister reads objects from that store.
go
func listWith(depInformer v1.DeploymentInformer) {
	name := "nginx-deployment"
	// The lister reads from the informer's local store, not from the API server.
	deploy, err := depInformer.Lister().Deployments("default").Get(name)
	if apierrors.IsNotFound(err) {
		log.Println("not found")
		return
	}
	if err != nil {
		log.Println("get deployment:", err)
		return
	}
	log.Println("name: ", deploy.Name)
}
Register add and delete handlers, then start the informer.
go
func deployInformer(clientset *kubernetes.Clientset) v1.DeploymentInformer {
	sharedInformers := informers.NewSharedInformerFactory(clientset, 1*time.Second)
	depInformer := sharedInformers.Apps().V1().Deployments()
	depInformer.Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc: func(item interface{}) {
				log.Println("add item")
			},
			DeleteFunc: func(item interface{}) {
				log.Println("delete item")
			},
		},
	)
	ch := make(chan struct{})
	sharedInformers.Start(ch)
	sharedInformers.WaitForCacheSync(ch)
	return depInformer
}
Start everything from main:
go
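func main() {
	// NewClient is a helper not shown in this post; it presumably builds
	// the clientset from the local kubeconfig.
	clientset, err := NewClient()
	if err != nil {
		log.Fatal(err)
	}
	// At this stage deployInformer returns only the informer; the queue
	// is introduced later.
	depInformer := deployInformer(clientset)
	for {
		// list(clientset)
		listWith(depInformer)
		time.Sleep(10 * time.Second)
	}
}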
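Create a Deployment (ka, kg, and kd in the transcripts appear to be shell aliases for kubectl apply, kubectl get, and kubectl delete):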
sh
➜ client-go-app ka -f nginx-deployment.yaml
deployment.apps/nginx-deployment created
➜ client-go-app kg deploy
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 3/3 3 3 15s
We can see that the informer picked up the Deployment's creation and called the matching event handler, which printed the corresponding log lines:
sh
➜ client-go-app go run main.go
2024/03/24 13:21:28 kube config file path: /home/going/.kube/config
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 add item
Delete the Deployment:
sh
➜ client-go-app kd -f nginx-deployment.yaml
deployment.apps "nginx-deployment" deleted
The delete event is picked up as well:
sh
➜ client-go-app go run main.go
2024/03/24 13:21:28 kube config file path: /home/going/.kube/config
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 add item
2024/03/24 13:21:28 not found
2024/03/24 13:21:34 add item
2024/03/24 13:21:38 name: nginx-deployment
2024/03/24 13:21:48 name: nginx-deployment
2024/03/24 13:21:50 delete item
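The Queue is a key companion to the Informer: it schedules watch events between client-go and the custom controller, and signals the controller to fetch resource information from the lister. Inside the handler functions, we can push the watch events onto the queue.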
go
func deployInformer(clientset *kubernetes.Clientset) (v1.DeploymentInformer, workqueue.RateLimitingInterface) {
	sharedInformers := informers.NewSharedInformerFactory(clientset, 1*time.Second)
	depInformer := sharedInformers.Apps().V1().Deployments()
	queue := workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "test")
	depInformer.Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc: func(item interface{}) {
				log.Println("add item")
				queue.Add(item)
			},
			DeleteFunc: func(item interface{}) {
				log.Println("delete item")
				queue.Add(item)
			},
		},
	)
	ch := make(chan struct{})
	sharedInformers.Start(ch)
	sharedInformers.WaitForCacheSync(ch)
	return depInformer, queue
}
Pop an item off the Queue to get the event, derive its namespace and name, and then use the namespace and name to look the resource up through the lister.
go
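func main() {
	clientset, err := NewClient()
	if err != nil {
		log.Fatal(err)
	}
	depInformer, queue := deployInformer(clientset)
	for {
		// list(clientset)
		item, shutdown := queue.Get()
		if shutdown {
			return
		}
		key, err := cache.MetaNamespaceKeyFunc(item)
		if err != nil {
			queue.Done(item)
			continue
		}
		ns, name, _ := cache.SplitMetaNamespaceKey(key)
		// listWith is assumed here to be a variant of the earlier helper
		// that takes the namespace and name to look up.
		listWith(depInformer, ns, name)
		// Done tells the queue that processing of this item has finished.
		queue.Done(item)
		time.Sleep(10 * time.Second)
	}
}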
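The key round trip used in the loop (object to "namespace/name" string and back) is simple enough to sketch without client-go. The `makeKey` and `splitKey` functions below are illustrative stand-ins written for this example; they assume the key is the bare name for cluster-scoped objects and "namespace/name" otherwise.

```go
package main

import (
	"fmt"
	"strings"
)

// makeKey builds "namespace/name", or just "name" when the namespace is empty.
func makeKey(ns, name string) string {
	if ns == "" {
		return name
	}
	return ns + "/" + name
}

// splitKey reverses makeKey and rejects keys with more than one slash.
func splitKey(key string) (ns, name string, err error) {
	parts := strings.Split(key, "/")
	switch len(parts) {
	case 1:
		return "", parts[0], nil
	case 2:
		return parts[0], parts[1], nil
	}
	return "", "", fmt.Errorf("unexpected key format: %q", key)
}

func main() {
	key := makeKey("default", "nginx-deployment")
	ns, name, err := splitKey(key)
	fmt.Println(key, ns, name, err) // default/nginx-deployment default nginx-deployment <nil>

	_, _, err = splitKey("a/b/c")
	fmt.Println(err != nil) // true
}
```

Passing small string keys through a queue, rather than whole objects, means the worker always reads the latest state from the cache when it finally processes the key.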
Start the program:
sh
➜ client-go-app go run main.go
2024/03/24 14:35:55 kube config file path: /home/going/.kube/config
Create and delete a Deployment to exercise the whole flow:
sh
➜ client-go-app ka -f nginx-deployment.yaml
deployment.apps/nginx-deployment created
➜ client-go-app kg deploy
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 3/3 3 3 15s
➜ client-go-app kd -f nginx-deployment.yaml
deployment.apps "nginx-deployment" deleted
The corresponding events are picked up and the expected log lines are printed:
sh
➜ client-go-app go run main.go
2024/03/24 14:35:55 kube config file path: /home/going/.kube/config
2024/03/24 14:35:55 add item
2024/03/24 14:35:55 add item
2024/03/24 14:35:55 add item
2024/03/24 14:35:55 add item
2024/03/24 14:35:55 add item
2024/03/24 14:35:55 name: lister
2024/03/24 14:36:04 add item
2024/03/24 14:36:05 name: ingress-kong
2024/03/24 14:36:15 name: proxy-kong
2024/03/24 14:36:25 name: coredns
2024/03/24 14:36:35 name: local-path-provisioner
2024/03/24 14:36:40 delete item
2024/03/24 14:36:45 not found
2024/03/24 14:36:55 not found
ResourceVersion
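ResourceVersion is a fingerprint of a resource's current state: it uniquely identifies that state and changes whenever the resource is modified. For example, we can edit a Pod by adding the label lastResourceVersion: "874320".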
sh
➜ client-go-app kg pod
NAME READY STATUS RESTARTS AGE
lister-9f8bf577b-8pbht 1/1 Running 0 19h
➜ client-go-app k edit pod lister-9f8bf577b-8pbht
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2024-03-23T09:45:16Z"
  generateName: lister-9f8bf577b-
  labels:
    app: lister
    pod-template-hash: 9f8bf577b
    lastResourceVersion: "874320" # modified version
  name: lister-9f8bf577b-8pbht
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: lister-9f8bf577b
    uid: 408c0707-fbc6-41a5-92ab-9d3d3227503f
  resourceVersion: "874320" # current version
  uid: c46ec159-8996-42b7-9469-64971a05bad4
...
In the edit we recorded the current resourceVersion, "874320", as a label. After saving and checking again, we can see that resourceVersion has been updated to "1004329".
yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2024-03-23T09:45:16Z"
  generateName: lister-9f8bf577b-
  labels:
    app: lister
    lastResourceVersion: "874320" # recorded version
    pod-template-hash: 9f8bf577b
  name: lister-9f8bf577b-8pbht
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: lister-9f8bf577b
    uid: 408c0707-fbc6-41a5-92ab-9d3d3227503f
  resourceVersion: "1004329" # updated version
...
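The way a version changes on every write, and why a writer holding an old version should be turned away, can be shown with a toy in-memory server. This is a sketch under stated assumptions: real resourceVersions are opaque strings assigned by the API server, whereas the `Server` type here uses a simple per-object counter, and `ErrConflict` and `Update` are hypothetical names for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Pod is a toy object carrying only labels and a version.
type Pod struct {
	Labels          map[string]string
	ResourceVersion string
}

// ErrConflict signals that the caller's copy is out of date.
var ErrConflict = errors.New("conflict: object was modified")

// Server is a toy store that assigns a fresh version on every write.
// The numeric counter is an assumption made for this sketch.
type Server struct {
	rev int
	pod Pod
}

// Update accepts p only if it carries the currently stored version.
func (s *Server) Update(p Pod) (Pod, error) {
	if p.ResourceVersion != s.pod.ResourceVersion {
		return Pod{}, ErrConflict
	}
	s.rev++
	p.ResourceVersion = strconv.Itoa(s.rev)
	s.pod = p
	return p, nil
}

func main() {
	s := &Server{rev: 874320, pod: Pod{ResourceVersion: "874320"}}

	read := s.pod // the copy we edit
	read.Labels = map[string]string{"lastResourceVersion": read.ResourceVersion}

	updated, err := s.Update(read)
	fmt.Println(updated.ResourceVersion, err) // 874321 <nil>

	_, err = s.Update(read) // read still carries the old version
	fmt.Println(err)        // conflict: object was modified
}
```

In the toy the version grows by one per write; in the transcript above it jumps much further, so the actual values should not be assumed to be consecutive.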
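One point needs to be clear: do not modify the resources held in the Informer's cache. They are maintained by the Informer and kept in its Store, so to avoid inconsistency between the API server and the Informer they must be treated as read-only, as the comments on the lister interface state.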
go
// <https://github1s.com/kubernetes/client-go/blob/master/listers/apps/v1/deployment.go#L70>
// DeploymentNamespaceLister helps list and get Deployments.
// All objects returned here must be treated as read-only.
type DeploymentNamespaceLister interface {
	// List lists all Deployments in the indexer for a given namespace.
	// Objects returned here must be treated as read-only.
	List(selector labels.Selector) (ret []*v1.Deployment, err error)
	// Get retrieves the Deployment from the indexer for a given namespace and name.
	// Objects returned here must be treated as read-only.
	Get(name string) (*v1.Deployment, error)
	DeploymentNamespaceListerExpansion
}
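A short sketch shows why "read-only" matters: the cache hands out pointers to the objects it stores, so a change through that pointer changes the cached copy for every reader. The `Deployment` type and its `DeepCopy` method below are simplified stand-ins written for this example; the point is to copy first and modify only the copy.

```go
package main

import "fmt"

// Deployment is a toy object with a name and labels.
type Deployment struct {
	Name   string
	Labels map[string]string
}

// DeepCopy returns a copy that shares no memory with d.
func (d *Deployment) DeepCopy() *Deployment {
	out := &Deployment{Name: d.Name, Labels: make(map[string]string, len(d.Labels))}
	for k, v := range d.Labels {
		out.Labels[k] = v
	}
	return out
}

func main() {
	// cached stands for the pointer a lister would hand out.
	cached := &Deployment{
		Name:   "nginx-deployment",
		Labels: map[string]string{"app": "nginx"},
	}

	// Copy before changing anything, then work on the copy only.
	mine := cached.DeepCopy()
	mine.Labels["owner"] = "me"

	fmt.Println(len(cached.Labels), len(mine.Labels)) // 1 2
}
```

A shallow struct copy would not be enough here, because the labels map would still be shared between the two values.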
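To filter the resources fetched from the API server, we can use NewFilteredSharedInformerFactory. For example, we can restrict the informers to Deployment resources that carry the label app=test and live in the test namespace.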
go
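// The namespace argument and the label selector apply to the list and
// watch requests issued by informers created from this factory.
filteredSharedInformers := informers.NewFilteredSharedInformerFactory(
	clientset, 1*time.Minute, "test", internalinterfaces.TweakListOptionsFunc(
		func(o *metav1.ListOptions) {
			o.LabelSelector = "app=test"
		}),
)
// The resource kind is chosen by the informer requested from the factory,
// e.g. filteredSharedInformers.Apps().V1().Deployments(), not by ListOptions.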
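What an equality selector such as app=test means can be sketched locally. In a real cluster the API server evaluates the selector; the `matches` function below is a hypothetical helper written for this example, and it supports only comma-separated key=value terms.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether labels satisfy a selector made only of
// comma-separated key=value equality terms. An empty selector matches
// everything.
func matches(selector string, labels map[string]string) bool {
	if strings.TrimSpace(selector) == "" {
		return true
	}
	for _, term := range strings.Split(selector, ",") {
		kv := strings.SplitN(term, "=", 2)
		if len(kv) != 2 {
			return false // not a key=value term
		}
		got, ok := labels[strings.TrimSpace(kv[0])]
		if !ok || got != strings.TrimSpace(kv[1]) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(matches("app=test", map[string]string{"app": "test", "tier": "web"})) // true
	fmt.Println(matches("app=test", map[string]string{"app": "prod"}))                // false
	fmt.Println(matches("app=test,tier=web", map[string]string{"app": "test"}))       // false
}
```

Filtering on the server side keeps non-matching objects out of the informer's cache entirely, which is why the selector is set on the list options rather than applied after the fact.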