Iterators and the iter package in Go

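Go 1.23 officially introduced the iter package, a new standard-library package dedicated to iterators, together with language support for ranging over functions.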

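An iterator is something you can traverse. Go's built-in slices, arrays, strings, maps, and channels can all be traversed with for-range. Starting with Go 1.23, user-defined types can be traversed with for-range too, as long as they expose a function that follows the iterator convention.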

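Any discussion of iterators has to mention yield. Programming languages implement yield differently; in Go it is not a keyword but an ordinary function parameter. A Go iterator is a function value, usually returned by another function or method, and it must have one of the following signatures: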

func(yield func(V) bool)
func(yield func(K, V) bool)

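K and V stand for arbitrary types here. (The language also accepts a third form, func(yield func() bool), which yields no values.) The first signature corresponds to ranging over single values: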

for V := range Iterator {}

The second signature corresponds to ranging over key-value pairs:

for K, V := range Iterator {}

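yield is a function that returns a bool, and that bool tells the iterator whether to keep going. A for-range loop ends either when the sequence is exhausted or when the loop body exits early with break or return. Because your iterator is driven by for-range, it must check the result of every yield call and stop as soon as yield returns false. Calling yield again after it has returned false causes a runtime panic.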

When implementing an iterator, we pass each output value to yield, then check yield's result to pick up the stop signal from the for-range loop.

import (
	"fmt"
	"strings"
)

// Member holds a list of member names.
type Member struct {
	members []string
}

func NewMember() *Member {
	m := &Member{}
	m.members = make([]string, 0)
	return m
}

func (m *Member) Add(name string) {
	m.members = append(m.members, name)
}

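// Iterator returns a push iterator that yields every member name
// in upper case, in insertion order.
func (m *Member) Iterator() func(yield func(string) bool) {
	return func(yield func(string) bool) {
		for _, v := range m.members {
			// yield returns false when the loop body exits early;
			// stop immediately and never call yield again.
			if !yield(strings.ToUpper(v)) {
				return
			}
		}
	}
}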

func ShowMembers() {
	m := NewMember()
	m.Add("zhangsan")
	m.Add("lisi")
	m.Add("wangwu")

	for v := range m.Iterator() {
		if v == "LISI" {
			break
		}
		fmt.Println("name:", v)
	}
}

The iter package (iter/iter.go) also defines two generic types, Seq and Seq2, which give these function signatures convenient names.

type Seq[V any] func(yield func(V) bool)
type Seq2[K, V any] func(yield func(K, V) bool)

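The goal of the iter package is to provide a uniform and efficient way to iterate. To that end, Go 1.23 added a set of iterator-based functions to the slices and maps packages (both already in the standard library since Go 1.21); these functions accept or return iter.Seq and iter.Seq2 values.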

// slices/iter.go
func All[Slice ~[]E, E any](s Slice) iter.Seq2[int, E]
func Values[Slice ~[]E, E any](s Slice) iter.Seq[E]
...

// maps/iter.go
func All[Map ~map[K]V, K comparable, V any](m Map) iter.Seq2[K, V]
func Keys[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[K]
func Values[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[V]
...

Iterators

The new iter package provides the basic definitions for working with user-defined iterators.

The slices package adds several functions that work with iterators:

All returns an iterator over slice indexes and values.
Values returns an iterator over slice elements.
Backward returns an iterator that loops over a slice backward.
Collect collects values from an iterator into a new slice.
AppendSeq appends values from an iterator to an existing slice.
Sorted collects values from an iterator into a new slice, and then sorts the slice.
SortedFunc is like Sorted but with a comparison function.
SortedStableFunc is like SortedFunc but uses a stable sort algorithm.
Chunk returns an iterator over consecutive sub-slices of up to n elements of a slice.

The maps package adds several functions that work with iterators:

All returns an iterator over key-value pairs from a map.
Keys returns an iterator over keys in a map.
Values returns an iterator over values in a map.
Insert adds the key-value pairs from an iterator to an existing map.
Collect collects key-value pairs from an iterator into a new map and returns it.
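
The maps functions compose in the same way. In the sketch below, sortedKeys and merge are hypothetical helpers; the sort is there only because map iteration order is not deterministic.

```go
package main

import (
	"fmt"
	"maps"
	"slices"
)

// sortedKeys is a hypothetical helper: it collects the keys of m
// into a sorted slice so that output is stable.
func sortedKeys(m map[string]int) []string {
	return slices.Sorted(maps.Keys(m))
}

// merge is a hypothetical helper: it copies dst with maps.Collect and
// then inserts every pair of src, overwriting keys that already exist.
func merge(dst, src map[string]int) map[string]int {
	out := maps.Collect(maps.All(dst))
	maps.Insert(out, maps.All(src))
	return out
}

func main() {
	ages := map[string]int{"zhangsan": 30, "lisi": 25}
	extra := map[string]int{"wangwu": 41, "lisi": 26}

	all := merge(ages, extra)
	for _, k := range sortedKeys(all) {
		fmt.Println(k, all[k])
	}
}
```

The inputs are left untouched: merge works on a copy, so ages still maps "lisi" to 25 afterwards.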

About pull iterators

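All the iterators discussed so far are push iterators, which are the kind we use day to day. But that is not the only way to consume a sequence. Sometimes, for example, we need to walk two containers in lockstep. That calls for a different kind of iterator: the pull iterator.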

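The difference between push iterators and pull iterators:

A push iterator pushes each value in the sequence to the yield function. Push iterators are the standard iterators in the Go standard library and are supported directly by the for/range statement.
A pull iterator works the other way around. Each time you call it, it pulls the next value from the sequence and returns it. The for/range statement does not support pull iterators directly, but a plain for loop can walk one easily.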

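You usually don't need to implement a pull iterator yourself: the iter.Pull and iter.Pull2 functions in the standard library convert a standard (push) iterator into a pull iterator.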

// Pull converts the "push-style" iterator sequence seq
// into a "pull-style" iterator accessed by the two functions
// next and stop.
//
// Next returns the next value in the sequence
// and a boolean indicating whether the value is valid.
// When the sequence is over, next returns the zero V and false.
// It is valid to call next after reaching the end of the sequence
// or after calling stop. These calls will continue
// to return the zero V and false.
//
// Stop ends the iteration. It must be called when the caller is
// no longer interested in next values and next has not yet
// signaled that the sequence is over (with a false boolean return).
// It is valid to call stop multiple times and when next has
// already returned false. Typically, callers should "defer stop()".
//
// It is an error to call next or stop from multiple goroutines
// simultaneously.
//
// If the iterator panics during a call to next (or stop),
// then next (or stop) itself panics with the same value.
func Pull[V any](seq Seq[V]) (next func() (V, bool), stop func())

func Pull2[K, V any](seq Seq2[K, V]) (next func() (K, V, bool), stop func())

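Pull takes an iter.Seq and Pull2 takes an iter.Seq2; each returns a next function and a stop function. The earlier ShowMembers example can be rewritten with iter.Pull (this version also needs the iter import):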

func ShowMembers() {
	m := NewMember()
	m.Add("zhangsan")
	m.Add("lisi")
	m.Add("wangwu")

	// for v := range m.Iterator() {
	// 	if v == "LISI" {
	// 		break
	// 	}
	// 	fmt.Println("name:", v)
	// }

	next, stop := iter.Pull(m.Iterator())
	defer stop()
	for {
		v, ok := next()
		if !ok {
			break
		}
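		if v == "LISI" {
			// The deferred stop() above already covers this early exit;
			// calling stop more than once is valid.
			stop()
			break
		}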
		fmt.Println("name:", v)
	}
}

Example 2: take each pair of consecutive values from an iterator as a key-value pair and return a new iterator over those pairs.

// Pairs returns an iterator over successive pairs of values from seq.
func Pairs[V any](seq iter.Seq[V]) iter.Seq2[V, V] {
	return func(yield func(V, V) bool) {
		next, stop := iter.Pull(seq)
		defer stop()
		for {
			v1, ok1 := next()
			if !ok1 {
				return
			}
			v2, ok2 := next()
			// If ok2 is false, v2 should be the
			// zero value; yield one last pair.
			if !yield(v1, v2) {
				return
			}
			if !ok2 {
				return
			}
		}
	}
}

Single-use iterators

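Iterators can usually be traversed more than once. In some cases, though, such as reading a byte stream from the network or a file, the values are gone once the iteration finishes, and a second pass yields nothing. A function or method that returns such a single-use iterator should say so in its doc comment.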

// Lines returns an iterator over the lines read from r.
// It returns a single-use iterator.
func (r *Reader) Lines() iter.Seq[string]

Passing a function directly to an iterator

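For simple cases, such as just printing each element, we don't even need a for/range statement. Because an iterator is an ordinary function, we can call it directly and pass our own function as yield. The member-printing example above can also be written like this: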

func ShowMembers2() {
	m := NewMember()
	m.Add("zhangsan")
	m.Add("lisi")
	m.Add("wangwu")

	m.Iterator()(func(v string) bool {
		fmt.Println("name:", v)
		return true
	})
}
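
When calling an iterator directly, the callback's return value takes over the role of break and continue. A minimal sketch, where upTo and firstOver are hypothetical helpers introduced only for illustration:

```go
package main

import "fmt"

// upTo is a hypothetical push iterator over 1..n.
func upTo(n int) func(yield func(int) bool) {
	return func(yield func(int) bool) {
		for i := 1; i <= n; i++ {
			if !yield(i) {
				return
			}
		}
	}
}

// firstOver calls the iterator directly and stops at the first value
// greater than limit by returning false from the callback.
func firstOver(seq func(yield func(int) bool), limit int) (found int, ok bool) {
	seq(func(v int) bool {
		if v > limit {
			found, ok = v, true
			return false // plays the role of break
		}
		return true // plays the role of continue
	})
	return found, ok
}

func main() {
	fmt.Println(firstOver(upTo(10), 3)) // 4 true
}
```

This is roughly what for/range over a function does on your behalf: the loop body becomes the yield callback, and break becomes a false return.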