client-go (3): workqueue

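This time we look at workqueue, another important part of client-go. Let's start with the diagram from the official documentation.

As the diagram shows, workqueue sits in the lower half, inside the CustomController: Resource Event Handlers push events into the workqueue.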

workqueue

Let's first read how staging/src/k8s.io/client-go/util/workqueue/doc.go describes the features workqueue provides:


// Package workqueue provides a simple queue that supports the following
// features:
//  * Fair: items processed in the order in which they are added.
//  * Stingy: a single item will not be processed multiple times concurrently,
//      and if an item is added multiple times before it can be processed, it
//      will only be processed once.
//  * Multiple consumers and producers. In particular, it is allowed for an
//      item to be reenqueued while it is being processed.
//  * Shutdown notifications.
package workqueue

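In short:

Fair: items are processed in the order in which they were added.
Stingy: a single item is never processed by several workers concurrently, and an item that is added several times before it is processed is still processed only once.
Multiple consumers and producers are supported; in particular, an item may be re-enqueued while it is being processed.
Shutdown notifications are provided.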

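workqueue provides three kinds of queue, each suited to different scenarios:

Interface: the FIFO queue interface; first in, first out, with deduplication.
DelayingInterface: the delaying queue interface, built on top of Interface; it adds an element to the queue only after a delay.
RateLimitingInterface: the rate-limited queue interface, built on top of DelayingInterface; it rate-limits elements as they are added to the queue.

Their relationship is shown in the following diagram.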

FIFO queue

Next, the interface definition of the FIFO queue:


type Interface interface {
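	Add(item interface{}) // add an element to the queue; it can be of any type
	Len() int // return the current length of the queue
	Get() (item interface{}, shutdown bool) // take the element at the head of the queue
	Done(item interface{}) // mark the element as processed
	ShutDown() // shut the queue down
	ShuttingDown() bool // report whether the queue is shutting down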
}

The IDE shows that this interface has one implementation, with the following data structure:


type Type struct {
	// queue defines the order in which we will work on items. Every
	// element of queue should be in the dirty set and not in the
	// processing set.
	queue []t

	// dirty defines all of the items that need to be processed.
	dirty set

	// Things that are currently being processed are in the processing set.
	// These things may be simultaneously in the dirty set. When we finish
	// processing something and remove it from this set, we'll check if
	// it's in the dirty set, and if so, add it to the queue.
	processing set

	cond *sync.Cond

	shuttingDown bool

	metrics queueMetrics

	unfinishedWorkUpdatePeriod time.Duration
	clock                      clock.Clock
}

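The most important fields in this struct are queue, dirty and processing. queue is where elements are actually stored; it is a slice, which preserves their order. dirty holds every item that still needs processing and is the key to the Stingy behaviour described above. processing marks whether an element is currently being processed.

Let's look at how a few of the methods are implemented:

The Add method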


func (q *Type) Add(item interface{}) {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	if q.shuttingDown {
		return
	}
	if q.dirty.has(item) {
		return
	}

	q.metrics.add(item)

	q.dirty.insert(item)
	if q.processing.has(item) {
		return
	}

	q.queue = append(q.queue, item)
	q.cond.Signal()
}

The Get method


func (q *Type) Get() (item interface{}, shutdown bool) {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	for len(q.queue) == 0 && !q.shuttingDown {
		q.cond.Wait()
	}
	if len(q.queue) == 0 {
		// We must be shutting down.
		return nil, true
	}

	item, q.queue = q.queue[0], q.queue[1:]

	q.metrics.get(item)

	q.processing.insert(item)
	q.dirty.delete(item)

	return item, false
}

The Done method


func (q *Type) Done(item interface{}) {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()

	q.metrics.done(item)

	q.processing.delete(item)
	if q.dirty.has(item) {
		q.queue = append(q.queue, item)
		q.cond.Signal()
	}
}

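These three methods make it clear how, under high concurrency, an item that is added many times before it is processed still gets processed only once:

First, the lock ensures that only one goroutine is inside any of these methods at a time.
Before Get is called, if several goroutines call Add with the same item, Add checks dirty and returns early when the item is already there.
When a goroutine (call it g1) obtains an item through Get, the item is put into processing and removed from dirty. If another goroutine (g2) now adds the same item, it is not appended to queue: Add inserts it into dirty, sees that it is in processing, and returns. Later, when g1 calls Done, Done checks whether dirty contains the item and, if so, puts it back on the queue.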
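The three cases above can be replayed with a small stand-alone sketch. miniQueue below is a hypothetical, simplified type written only for this illustration: it uses string items, a plain mutex, and a non-blocking Get, whereas the real Type waits on a sync.Cond and also tracks metrics and shutdown.

```go
package main

import (
	"fmt"
	"sync"
)

// miniQueue is an illustrative stand-in for the queue/dirty/processing
// bookkeeping described above. It is not the client-go implementation.
type miniQueue struct {
	mu         sync.Mutex
	queue      []string
	dirty      map[string]struct{}
	processing map[string]struct{}
}

func newMiniQueue() *miniQueue {
	return &miniQueue{
		dirty:      map[string]struct{}{},
		processing: map[string]struct{}{},
	}
}

// Add marks the item dirty and enqueues it, unless it is already pending
// or is currently being processed.
func (q *miniQueue) Add(item string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if _, ok := q.dirty[item]; ok {
		return // already waiting to be processed
	}
	q.dirty[item] = struct{}{}
	if _, ok := q.processing[item]; ok {
		return // Done will re-enqueue it
	}
	q.queue = append(q.queue, item)
}

// Get pops the head of the queue without blocking; ok is false when empty.
func (q *miniQueue) Get() (item string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.queue) == 0 {
		return "", false
	}
	item, q.queue = q.queue[0], q.queue[1:]
	q.processing[item] = struct{}{}
	delete(q.dirty, item)
	return item, true
}

// Done clears the processing mark and re-enqueues the item if it was
// added again while it was being processed.
func (q *miniQueue) Done(item string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	delete(q.processing, item)
	if _, ok := q.dirty[item]; ok {
		q.queue = append(q.queue, item)
	}
}

// Len returns the number of items waiting in the queue.
func (q *miniQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.queue)
}

func main() {
	q := newMiniQueue()
	q.Add("a")
	q.Add("a")
	q.Add("a")
	fmt.Println(q.Len()) // 1: duplicates collapse before Get

	item, _ := q.Get()
	q.Add(item)
	fmt.Println(q.Len()) // 0: re-added while processing, only marked dirty

	q.Done(item)
	fmt.Println(q.Len()) // 1: Done puts the dirty item back on the queue
}
```

The point of keeping dirty and processing as separate sets is that an update arriving mid-processing is never lost, yet two workers never hold the same item at once.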

Delaying queue

Here is the interface definition of the delaying queue:


type DelayingInterface interface {
	Interface
	// AddAfter adds an item to the workqueue after the indicated duration has passed
	AddAfter(item interface{}, duration time.Duration)
}

Delaying queue implementation

Now the implementation of the delaying queue. Its struct is:


type delayingType struct {
	Interface

	// clock tracks time for delayed firing
	clock clock.Clock

	// stopCh lets us signal a shutdown to the waiting loop
	stopCh chan struct{}
	// stopOnce guarantees we only signal shutdown a single time
	stopOnce sync.Once

	// heartbeat ensures we wait no more than maxWait before firing
	heartbeat clock.Ticker

	// waitingForAddCh is a buffered channel that feeds waitingForAdd
	waitingForAddCh chan *waitFor

	// metrics counts the number of retries
	metrics           retryMetrics
	deprecatedMetrics retryMetrics
}

It is initialized as follows:


func newDelayingQueue(clock clock.Clock, name string) DelayingInterface {
	ret := &delayingType{
		Interface:         NewNamed(name),
		clock:             clock,
		heartbeat:         clock.NewTicker(maxWait),
		stopCh:            make(chan struct{}),
		waitingForAddCh:   make(chan *waitFor, 1000),
		metrics:           newRetryMetrics(name),
		deprecatedMetrics: newDeprecatedRetryMetrics(name),
	}

	go ret.waitingLoop()

	return ret
}

The AddAfter method


func (q *delayingType) AddAfter(item interface{}, duration time.Duration) {
	// don't add if we're already shutting down
	if q.ShuttingDown() {
		return
	}

	q.metrics.retry()
	q.deprecatedMetrics.retry()

	// immediately add things with no delay
	if duration <= 0 {
		q.Add(item)
		return
	}

	select {
	case <-q.stopCh:
		// unblock if ShutDown() is called
	case q.waitingForAddCh <- &waitFor{data: item, readyAt: q.clock.Now().Add(duration)}:
	}
}


func (q *delayingType) waitingLoop() {
    ...

	for {
        ...
		select {
		case <-q.stopCh:
			return

		case <-q.heartbeat.C():
			// continue the loop, which will add ready items

		case <-nextReadyAt:
			// continue the loop, which will add ready items

		case waitEntry := <-q.waitingForAddCh:
			if waitEntry.readyAt.After(q.clock.Now()) {
				insert(waitingForQueue, waitingEntryByData, waitEntry)
			} else {
				q.Add(waitEntry.data)
			}

			drained := false
			for !drained {
				select {
				case waitEntry := <-q.waitingForAddCh:
					if waitEntry.readyAt.After(q.clock.Now()) {
						insert(waitingForQueue, waitingEntryByData, waitEntry)
					} else {
						q.Add(waitEntry.data)
					}
				default:
					drained = true
				}
			}
		}
	}
}

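In the code above, the waitingForAddCh field is the heart of the delaying queue. It is created as a buffered channel with capacity 1000, so AddAfter normally does not block; it blocks only when 1000 entries are already sitting in the channel waiting for waitingLoop to drain them. The constructor also starts a goroutine running waitingLoop, which receives entries from waitingForAddCh: an entry whose readyAt is still in the future is inserted into the waiting queue, and an entry that is already due is added to the FIFO queue right away.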
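To make the buffering behaviour concrete, here is a minimal sketch with a capacity of 2 standing in for 1000. tryEnqueue is a hypothetical helper that uses a select with a default branch purely so that we can observe when a send would block; AddAfter itself does a blocking send guarded by stopCh.

```go
package main

import "fmt"

// tryEnqueue attempts a non-blocking send and reports whether it succeeded.
// A false result means a plain send on ch would have blocked.
func tryEnqueue(ch chan string, item string) bool {
	select {
	case ch <- item:
		return true
	default:
		return false // buffer is full
	}
}

func main() {
	ch := make(chan string, 2) // capacity 2 stands in for 1000

	fmt.Println(tryEnqueue(ch, "a")) // true
	fmt.Println(tryEnqueue(ch, "b")) // true
	fmt.Println(tryEnqueue(ch, "c")) // false: buffer full, a sender would block

	<-ch                             // a consumer drains one entry
	fmt.Println(tryEnqueue(ch, "c")) // true: there is room again
}
```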

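AddAfter takes an item and a duration; the duration specifies how long to wait before the item is inserted into the FIFO queue. If the duration is less than or equal to 0, the item is inserted into the FIFO queue immediately.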

Rate-limited queue

Here is the interface definition of the rate-limited queue:


type RateLimitingInterface interface {
	DelayingInterface

	// AddRateLimited adds an item to the workqueue after the rate limiter says it's ok
	AddRateLimited(item interface{})

	// Forget indicates that an item is finished being retried.  Doesn't matter whether it's for perm failing
	// or for success, we'll stop the rate limiter from tracking it.  This only clears the `rateLimiter`, you
	// still have to call `Done` on the queue.
	Forget(item interface{})

	// NumRequeues returns back how many times the item was requeued
	NumRequeues(item interface{}) int
}

Rate-limited queue implementation

Now the implementation of the rate-limited queue. Its struct is:


type rateLimitingType struct {
	DelayingInterface

	rateLimiter RateLimiter
}

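A rate-limited queue must be given a RateLimiter when it is created. workqueue provides four rate-limiting algorithms for different scenarios:

Token bucket (BucketRateLimiter)
Per-item exponential backoff (ItemExponentialFailureRateLimiter)
Counter algorithm (ItemFastSlowRateLimiter)
Mixed mode (MaxOfRateLimiter), which combines several rate-limiting algorithms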