12 How does go.sum keep dependencies secure? A source-code walkthrough of checksum verification

This article analyzes the Go 1.25.0 source code.

1. Overall flow

Download module zip
    │
    ▼
Compute zip hash (dirhash.HashZip)
    │
    ▼
checkModSum(mod, hash)
    │
    ├─► present in go.sum and matches ─► pass
    │
    ├─► present in go.sum but does not match ─► SECURITY ERROR
    │
    └─► not present in go.sum
            │
            ▼
        useSumDB(mod)?
            │
            ├─► No (GOSUMDB=off or GONOSUMDB matches)
            │       │
            │       ▼
            │   add to go.sum
            │
            └─► Yes
                    │
                    ▼
                checkSumDB(mod, hash)
                    │
                    ├─► sumdb returns a match ─► add to go.sum
                    │
                    └─► sumdb returns a mismatch ─► SECURITY ERROR

2. go.sum file format

Each line of go.sum contains three fields: module path, version, and hash.

golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=

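Each module version usually has two entries:

  • the hash of the module content (the zip file)
  • the hash of its go.mod file (the version field ends with /go.mod)

The parsing logic lives in cmd/go/internal/modfetch/fetch.go: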

// 577: src\cmd\go\internal\modfetch\fetch.go
func readGoSum(dst map[module.Version][]string, file string, data []byte) {
	lineno := 0
	for len(data) > 0 {
		var line []byte
		lineno++
		i := bytes.IndexByte(data, '\n')
		if i < 0 {
			line, data = data, nil
		} else {
			line, data = data[:i], data[i+1:]
		}
		f := strings.Fields(string(line))
		if len(f) == 0 {
			// blank line; skip it
			continue
		}
		if len(f) != 3 {
			if cfg.CmdName == "mod tidy" {
				// ignore malformed line so that go mod tidy can fix go.sum
				continue
			} else {
				base.Fatalf("malformed go.sum:\n%s:%d: wrong number of fields %v\n", file, lineno, len(f))
			}
		}
		if f[2] == emptyGoModHash {
			// Old bug; drop it.
			continue
		}
		mod := module.Version{Path: f[0], Version: f[1]}
		dst[mod] = append(dst[mod], f[2])
	}
}
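
To make the resulting data structure concrete, here is a minimal sketch that parses the two sample lines shown earlier into a map. The modVersion type and the parseGoSum function are hypothetical stand-ins written for this example; unlike readGoSum, the sketch silently skips malformed lines and does not handle emptyGoModHash.

```go
package main

import (
	"fmt"
	"strings"
)

// modVersion is a hypothetical stand-in for module.Version.
type modVersion struct {
	Path, Version string
}

// parseGoSum is a simplified sketch: it skips blank and malformed lines
// instead of reporting an error.
func parseGoSum(data string) map[modVersion][]string {
	sums := make(map[modVersion][]string)
	for _, line := range strings.Split(data, "\n") {
		f := strings.Fields(line)
		if len(f) != 3 {
			continue
		}
		mod := modVersion{Path: f[0], Version: f[1]}
		sums[mod] = append(sums[mod], f[2])
	}
	return sums
}

const sample = `golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
`

func main() {
	// The "/go.mod" suffix is part of the version field, so the zip hash
	// and the go.mod hash end up under two different map keys.
	for mod, hashes := range parseGoSum(sample) {
		fmt.Println(mod.Path, mod.Version, hashes)
	}
}
```

Because the key includes the /go.mod suffix, the two hashes of one module version never collide in the map.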

3. Hash algorithm: h1

Go identifies its hash algorithm by the h1: prefix. The core implementation is in the golang.org/x/mod/sumdb/dirhash package:

// 44: src\cmd\vendor\golang.org\x\mod\sumdb\dirhash\hash.go
func Hash1(files []string, open func(string) (io.ReadCloser, error)) (string, error) {
	h := sha256.New()
	files = append([]string(nil), files...)
	slices.Sort(files)
	for _, file := range files {
		if strings.Contains(file, "\n") {
			return "", errors.New("dirhash: filenames with newlines are not supported")
		}
		r, err := open(file)
		if err != nil {
			return "", err
		}
		hf := sha256.New()
		_, err = io.Copy(hf, r)
		r.Close()
		if err != nil {
			return "", err
		}
		fmt.Fprintf(h, "%x  %s\n", hf.Sum(nil), file)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(h.Sum(nil)), nil
}

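The algorithm:

  1. Sort the file names
  2. Compute the SHA-256 of each file
  3. Emit one line per file in the form <hex hash>  <filename>\n (two spaces between the fields)
  4. Compute the SHA-256 of all these lines concatenated
  5. Base64-encode the result and prepend h1: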
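
The five steps can be reproduced with the standard library alone. The sketch below is an illustration, not the dirhash API: the hashFiles function and its map argument (file name to content) are assumptions made for this example, and it skips the newline check that Hash1 performs on file names.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hashFiles is a hypothetical helper that applies the h1 steps to
// in-memory files (file name -> content).
func hashFiles(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // step 1: sort the file names

	summary := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256(files[name]) // step 2: SHA-256 of each file
		// step 3: "<hex hash>  <filename>\n", with two spaces
		fmt.Fprintf(summary, "%x  %s\n", sum, name)
	}
	// steps 4 and 5: hash the summary lines, Base64-encode, add the prefix
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil))
}

func main() {
	files := map[string][]byte{
		"example.com/m@v1.0.0/go.mod":  []byte("module example.com/m\n"),
		"example.com/m@v1.0.0/main.go": []byte("package main\n"),
	}
	fmt.Println(hashFiles(files))
}
```

Since the names are sorted first, the result does not depend on the order in which files are listed, only on their names and contents.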

For zip files, the HashZip function is used:

// 111: src\cmd\vendor\golang.org\x\mod\sumdb\dirhash\hash.go
func HashZip(zipfile string, hash Hash) (string, error) {
	z, err := zip.OpenReader(zipfile)
	if err != nil {
		return "", err
	}
	defer z.Close()
	var files []string
	zfiles := make(map[string]*zip.File)
	for _, file := range z.File {
		files = append(files, file.Name)
		zfiles[file.Name] = file
	}
	zipOpen := func(name string) (io.ReadCloser, error) {
		f := zfiles[name]
		if f == nil {
			return nil, fmt.Errorf("file %q not found in zip", name) // should never happen
		}
		return f.Open()
	}
	return hash(files, zipOpen)
}

4. Verification flow

4.1 Entry point: checkModSum

When a module is downloaded, checkModSum is called to verify it:

// 739: src\cmd\go\internal\modfetch\fetch.go
func checkModSum(mod module.Version, h string) error {
	// We lock goSum when manipulating it,
	// but we arrange to release the lock when calling checkSumDB,
	// so that parallel calls to checkModHash can execute parallel calls
	// to checkSumDB.

	// Check whether mod+h is listed in go.sum already. If so, we're done.
	goSum.mu.Lock()
	inited, err := initGoSum()
	if err != nil {
		goSum.mu.Unlock()
		return err
	}
	done := inited && haveModSumLocked(mod, h)
	if inited {
		st := goSum.status[modSum{mod, h}]
		st.used = true
		goSum.status[modSum{mod, h}] = st
	}
	goSum.mu.Unlock()

	if done {
		return nil
	}

	// Not listed, so we want to add them.
	// Consult checksum database if appropriate.
	if useSumDB(mod) {
		// Calls base.Fatalf if mismatch detected.
		if err := checkSumDB(mod, h); err != nil {
			return err
		}
	}

	// Add mod+h to go.sum, if it hasn't appeared already.
	if inited {
		goSum.mu.Lock()
		addModSumLocked(mod, h)
		st := goSum.status[modSum{mod, h}]
		st.dirty = true
		goSum.status[modSum{mod, h}] = st
		goSum.mu.Unlock()
	}
	return nil
}

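The flow:

  1. First check whether go.sum already contains a matching hash
  2. If not, consult the checksum database when useSumDB allows it (sumdb is a remote HTTP service run by the Go team)
  3. After verification passes, add the new hash to go.sum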
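
The three steps can be modeled in a few lines. This is a heavily reduced sketch under stated assumptions: the verifier type, its fields, and the errMismatch value are hypothetical, locking is omitted, and errors are returned where the real code may call base.Fatalf.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// verifier is a hypothetical model of the state behind checkModSum.
type verifier struct {
	goSum  map[string][]string              // "path version" -> recorded hashes
	lookup func(key string) (string, error) // stand-in for sumdb; nil means "do not consult"
}

var errMismatch = errors.New("checksum mismatch")

func (v *verifier) check(key, h string) error {
	// Step 1: a matching entry ends the check; a different h1: entry is an error.
	for _, recorded := range v.goSum[key] {
		if recorded == h {
			return nil
		}
		if strings.HasPrefix(recorded, "h1:") {
			return fmt.Errorf("%s: %w", key, errMismatch)
		}
	}
	// Step 2: consult the checksum database if one is configured.
	if v.lookup != nil {
		want, err := v.lookup(key)
		if err != nil {
			return err
		}
		if want != h {
			return fmt.Errorf("%s: %w", key, errMismatch)
		}
	}
	// Step 3: record the verified hash.
	v.goSum[key] = append(v.goSum[key], h)
	return nil
}

func main() {
	v := &verifier{
		goSum:  map[string][]string{},
		lookup: func(string) (string, error) { return "h1:aaa", nil },
	}
	fmt.Println(v.check("example.com/m v1.0.0", "h1:aaa")) // first download: verified and recorded
	fmt.Println(v.check("example.com/m v1.0.0", "h1:bbb")) // later download with different bits
}
```

The second call fails in step 1 without contacting the database, which matches the trust-on-first-use role of go.sum.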

4.2 Hash mismatch detection

haveModSumLocked handles the case where the downloaded hash does not match the one recorded in go.sum:

// 788: src\cmd\go\internal\modfetch\fetch.go
func haveModSumLocked(mod module.Version, h string) bool {
	sumFileName := "go.sum"
	if strings.HasSuffix(GoSumFile, "go.work.sum") {
		sumFileName = "go.work.sum"
	}
	for _, vh := range goSum.m[mod] {
		if h == vh {
			return true
		}
		if strings.HasPrefix(vh, "h1:") {
			base.Fatalf("verifying %s@%s: checksum mismatch\n\tdownloaded: %v\n\t%s:     %v"+goSumMismatch, mod.Path, mod.Version, h, sumFileName, vh)
		}
	}
	// Also check workspace sums.
	foundMatch := false
	// Check sums from all files in case there are conflicts between
	// the files.
	for goSumFile, goSums := range goSum.w {
		for _, vh := range goSums[mod] {
			if h == vh {
				foundMatch = true
			} else if strings.HasPrefix(vh, "h1:") {
				base.Fatalf("verifying %s@%s: checksum mismatch\n\tdownloaded: %v\n\t%s:     %v"+goSumMismatch, mod.Path, mod.Version, h, goSumFile, vh)
			}
		}
	}
	return foundMatch
}

On a hash mismatch, a security error is printed:

// 1045: src\cmd\go\internal\modfetch\fetch.go
const goSumMismatch = `

SECURITY ERROR
This download does NOT match an earlier download recorded in go.sum.
The bits may have been replaced on the origin server, or an attacker may
have intercepted the download attempt.

For more information, see 'go help module-auth'.
`

5. Checksum Database (sumdb)

5.1 Whether to use sumdb

The useSumDB function decides whether the checksum database should be consulted:

// 35: src\cmd\go\internal\modfetch\sumdb.go
func useSumDB(mod module.Version) bool {
	if mod.Path == "golang.org/toolchain" {
		must := true
		// Downloaded toolchains cannot be listed in go.sum,
		// so we require checksum database lookups even if
		// GOSUMDB=off or GONOSUMDB matches the pattern.
		// If GOSUMDB=off, then the eventual lookup will fail
		// with a good error message.

		// Exception #1: using GOPROXY=file:// to test a distpack.
		if strings.HasPrefix(cfg.GOPROXY, "file://") && !strings.ContainsAny(cfg.GOPROXY, ",|") {
			must = false
		}
		// Exception #2: the Go proxy+checksum database cannot check itself
		// while doing the initial download.
		if strings.Contains(os.Getenv("GIT_HTTP_USER_AGENT"), "proxy.golang.org") {
			must = false
		}

		// Another potential exception would be GOPROXY=direct,
		// but that would make toolchain downloads only as secure
		// as HTTPS, and in particular they'd be susceptible to MITM
		// attacks on systems with less-than-trustworthy root certificates.
		// The checksum database provides a stronger guarantee,
		// so we don't make that exception.

		// Otherwise, require the checksum database.
		if must {
			return true
		}
	}
	return cfg.GOSUMDB != "off" && !module.MatchPrefixPatterns(cfg.GONOSUMDB, mod.Path)
}

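Key points:

  • GOSUMDB=off: disables the checksum database
  • GONOSUMDB: module paths matching these patterns are not looked up in sumdb
  • GOPRIVATE: private modules are not looked up by default, because GOPRIVATE supplies the default value of GONOSUMDB
  • golang.org/toolchain: apart from the two exceptions in the code, a lookup is always required, even when GOSUMDB=off or GONOSUMDB matches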
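
To show how a GONOSUMDB pattern list is applied to a module path, here is a standard-library sketch. The matchPrefixPatterns function is a simplified re-implementation written for this example (the real one is module.MatchPrefixPatterns), and this useSumDB mirrors only the final return statement of the real function, leaving out the golang.org/toolchain special case. The host names are made up.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchPrefixPatterns reports whether any glob in the comma-separated list
// matches a leading run of path elements of target. Simplified sketch.
func matchPrefixPatterns(globs, target string) bool {
	elems := strings.Split(target, "/")
	for _, glob := range strings.Split(globs, ",") {
		if glob == "" {
			continue
		}
		// A glob with n path elements is compared against the first
		// n elements of target.
		n := strings.Count(glob, "/") + 1
		if len(elems) < n {
			continue
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

// useSumDB models only the last line of the real function.
func useSumDB(gosumdb, gonosumdb, modPath string) bool {
	return gosumdb != "off" && !matchPrefixPatterns(gonosumdb, modPath)
}

func main() {
	const patterns = "*.corp.example.com,github.com/acme"
	for _, p := range []string{
		"git.corp.example.com/team/repo",
		"github.com/acme/tool",
		"github.com/other/tool",
	} {
		fmt.Println(p, useSumDB("sum.golang.org", patterns, p))
	}
}
```

A pattern matches whole leading path elements, so github.com/acme covers github.com/acme/tool but not github.com/other/tool.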

5.2 Default sumdb public key

Go ships with the public key of sum.golang.org built in:

// 7: src\cmd\go\internal\modfetch\key.go
var knownGOSUMDB = map[string]string{
	"sum.golang.org": "sum.golang.org+033de0ae+Ac4zctda0e5eza+HJyk9SxEdh+s3Ux18htTTAD8OuAn8",
}

5.3 Connecting to sumdb

The dbDial function initializes the sumdb client:

// 89: src\cmd\go\internal\modfetch\sumdb.go
func dbDial() (dbName string, db *sumdb.Client, err error) {
	// $GOSUMDB can be "key" or "key url",
	// and the key can be a full verifier key
	// or a host on our list of known keys.

	// Special case: sum.golang.google.cn
	// is an alias, reachable inside mainland China,
	// for sum.golang.org. If there are more
	// of these we should add a map like knownGOSUMDB.
	gosumdb := cfg.GOSUMDB
	if gosumdb == "sum.golang.google.cn" {
		gosumdb = "sum.golang.org https://sum.golang.google.cn"
	}

	if gosumdb == "off" {
		return "", nil, fmt.Errorf("checksum database disabled by GOSUMDB=off")
	}

	key := strings.Fields(gosumdb)
	if len(key) >= 1 {
		if k := knownGOSUMDB[key[0]]; k != "" {
			key[0] = k
		}
	}
	if len(key) == 0 {
		return "", nil, fmt.Errorf("missing GOSUMDB")
	}
	if len(key) > 2 {
		return "", nil, fmt.Errorf("invalid GOSUMDB: too many fields")
	}
	vkey, err := note.NewVerifier(key[0])
	if err != nil {
		return "", nil, fmt.Errorf("invalid GOSUMDB: %v", err)
	}
	name := vkey.Name()

	// No funny business in the database name.
	direct, err := url.Parse("https://" + name)
	if err != nil || strings.HasSuffix(name, "/") || *direct != (url.URL{Scheme: "https", Host: direct.Host, Path: direct.Path, RawPath: direct.RawPath}) || direct.RawPath != "" || direct.Host == "" {
		return "", nil, fmt.Errorf("invalid sumdb name (must be host[/path]): %s %+v", name, *direct)
	}

	// Determine how to get to database.
	var base *url.URL
	if len(key) >= 2 {
		// Use explicit alternate URL listed in $GOSUMDB,
		// bypassing both the default URL derivation and any proxies.
		u, err := url.Parse(key[1])
		if err != nil {
			return "", nil, fmt.Errorf("invalid GOSUMDB URL: %v", err)
		}
		base = u
	}

	return name, sumdb.NewClient(&dbClient{key: key[0], name: name, direct: direct, base: base}), nil
}
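
The accepted shapes of $GOSUMDB ("key" or "key url") are easier to see in isolation. The splitGOSUMDB helper below is hypothetical and covers only the field splitting from dbDial: it takes the database name as the text before the first "+" of a verifier key, and it does not validate the key, the name, or the URL.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitGOSUMDB is a hypothetical helper that splits a GOSUMDB value into
// the database name and an optional explicit URL.
func splitGOSUMDB(value string) (name, url string, err error) {
	// Alias handled by dbDial.
	if value == "sum.golang.google.cn" {
		value = "sum.golang.org https://sum.golang.google.cn"
	}
	if value == "off" {
		return "", "", errors.New("checksum database disabled by GOSUMDB=off")
	}
	fields := strings.Fields(value)
	switch {
	case len(fields) == 0:
		return "", "", errors.New("missing GOSUMDB")
	case len(fields) > 2:
		return "", "", errors.New("invalid GOSUMDB: too many fields")
	}
	// In a full verifier key ("name+hash+key") the name precedes the first "+".
	name, _, _ = strings.Cut(fields[0], "+")
	if len(fields) == 2 {
		url = fields[1]
	}
	return name, url, nil
}

func main() {
	for _, v := range []string{
		"sum.golang.org",
		"sum.golang.google.cn",
		"off",
	} {
		name, url, err := splitGOSUMDB(v)
		fmt.Printf("%q -> name=%q url=%q err=%v\n", v, name, url, err)
	}
}
```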

5.4 Querying sumdb

checkSumDB queries sumdb and verifies the hash:

// 831: src\cmd\go\internal\modfetch\fetch.go
func checkSumDB(mod module.Version, h string) error {
	modWithoutSuffix := mod
	noun := "module"
	if before, found := strings.CutSuffix(mod.Version, "/go.mod"); found {
		noun = "go.mod"
		modWithoutSuffix.Version = before
	}

	db, lines, err := lookupSumDB(mod)
	if err != nil {
		return module.VersionError(modWithoutSuffix, fmt.Errorf("verifying %s: %v", noun, err))
	}

	have := mod.Path + " " + mod.Version + " " + h
	prefix := mod.Path + " " + mod.Version + " h1:"
	for _, line := range lines {
		if line == have {
			return nil
		}
		if strings.HasPrefix(line, prefix) {
			return module.VersionError(modWithoutSuffix, fmt.Errorf("verifying %s: checksum mismatch\n\tdownloaded: %v\n\t%s: %v"+sumdbMismatch, noun, h, db, line[len(prefix)-len("h1:"):]))
		}
	}
	return nil
}

lookupSumDB calls the Lookup method of the sumdb client:

// 69: src\cmd\go\internal\modfetch\sumdb.go
func lookupSumDB(mod module.Version) (dbname string, lines []string, err error) {
	dbOnce.Do(func() {
		dbName, db, dbErr = dbDial()
	})
	if dbErr != nil {
		return "", nil, dbErr
	}
	lines, err = db.Lookup(mod.Path, mod.Version)
	return dbName, lines, err
}

6. Core logic of sumdb.Client

6.1 The Lookup method

// 209: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) Lookup(path, vers string) (lines []string, err error) {
	atomic.StoreUint32(&c.didLookup, 1)

	if c.skip(path) {
		return nil, ErrGONOSUMDB
	}

	defer func() {
		if err != nil {
			err = fmt.Errorf("%s@%s: %v", path, vers, err)
		}
	}()

	if err := c.init(); err != nil {
		return nil, err
	}

	// Prepare encoded cache filename / URL.
	epath, err := module.EscapePath(path)
	if err != nil {
		return nil, err
	}
	evers, err := module.EscapeVersion(strings.TrimSuffix(vers, "/go.mod"))
	if err != nil {
		return nil, err
	}
	remotePath := "/lookup/" + epath + "@" + evers
	file := c.name + remotePath

	// Fetch the data.
	// The lookupCache avoids redundant ReadCache/GetURL operations
	// (especially since go.sum lines tend to come in pairs for a given
	// path and version) and also avoids having multiple of the same
	// request in flight at once.
	type cached struct {
		data []byte
		err  error
	}
	result := c.record.Do(file, func() interface{} {
		// Try the on-disk cache, or else get from web.
		writeCache := false
		data, err := c.ops.ReadCache(file)
		if err != nil {
			data, err = c.ops.ReadRemote(remotePath)
			if err != nil {
				return cached{nil, err}
			}
			writeCache = true
		}

		// Validate the record before using it for anything.
		id, text, treeMsg, err := tlog.ParseRecord(data)
		if err != nil {
			return cached{nil, err}
		}
		if err := c.mergeLatest(treeMsg); err != nil {
			return cached{nil, err}
		}
		if err := c.checkRecord(id, text); err != nil {
			return cached{nil, err}
		}

		// Now that we've validated the record,
		// save it to the on-disk cache (unless that's where it came from).
		if writeCache {
			c.ops.WriteCache(file, data)
		}

		return cached{data, nil}
	}).(cached)
	if result.err != nil {
		return nil, result.err
	}

	// Extract the lines for the specific version we want
	// (with or without /go.mod).
	prefix := path + " " + vers + " "
	var hashes []string
	for _, line := range strings.Split(string(result.data), "\n") {
		if strings.HasPrefix(line, prefix) {
			hashes = append(hashes, line)
		}
	}
	return hashes, nil
}

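The flow:

  1. Check the local cache first
  2. On a cache miss, request the record from the remote server
  3. Verify the record's signature and consistency
  4. Cache the record once it has been verified
  5. Extract the matching hash lines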
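
The ordering of these steps matters: data is written to the cache only after it has been validated. The following sketch models that ordering with an in-memory map. The recordStore type, its fields, and errInvalid are hypothetical, and the validate callback stands in for the signature and tree checks that the real client performs.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errInvalid = errors.New("invalid record")

// recordStore is a hypothetical in-memory model of the cache, the remote
// server, and the validation step.
type recordStore struct {
	cache    map[string][]byte
	remote   func(file string) ([]byte, error)
	validate func(data []byte) error
}

func (s *recordStore) lookup(file, prefix string) ([]string, error) {
	data, cached := s.cache[file] // step 1: local cache
	if !cached {
		var err error
		if data, err = s.remote(file); err != nil { // step 2: remote server
			return nil, err
		}
	}
	if err := s.validate(data); err != nil { // step 3: validate before use
		return nil, err
	}
	if !cached {
		s.cache[file] = data // step 4: cache only validated data
	}
	var lines []string
	for _, line := range strings.Split(string(data), "\n") { // step 5
		if strings.HasPrefix(line, prefix) {
			lines = append(lines, line)
		}
	}
	return lines, nil
}

func main() {
	calls := 0
	s := &recordStore{
		cache: map[string][]byte{},
		remote: func(string) ([]byte, error) {
			calls++
			return []byte("example.com/m v1.0.0 h1:aaa\nexample.com/m v1.0.0/go.mod h1:bbb\n"), nil
		},
		validate: func(data []byte) error {
			if len(data) == 0 {
				return errInvalid
			}
			return nil
		},
	}
	for i := 0; i < 2; i++ {
		lines, err := s.lookup("example.com/m@v1.0.0", "example.com/m v1.0.0 ")
		fmt.Println(lines, err, "remote calls:", calls)
	}
}
```

The trailing space in the prefix is what separates the zip line from the /go.mod line of the same version.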

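6.2 Transparent log verification

sumdb uses a Merkle tree to implement a transparent log, so that any tampering with past records by the server can be detected.

checkRecord verifies the record hash: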

// 478: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) checkRecord(id int64, data []byte) error {
	c.latestMu.Lock()
	latest := c.latest
	c.latestMu.Unlock()

	if id >= latest.N {
		return fmt.Errorf("cannot validate record %d in tree of size %d", id, latest.N)
	}
	hashes, err := tlog.TileHashReader(latest, &c.tileReader).ReadHashes([]int64{tlog.StoredHashIndex(0, id)})
	if err != nil {
		return err
	}
	if hashes[0] == tlog.RecordHash(data) {
		return nil
	}
	return fmt.Errorf("cannot authenticate record data in server response")
}
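
The idea behind this check can be illustrated on a tiny tree. The sketch below is not the tlog API: it hard-codes a complete tree of four records and uses hypothetical helpers (recordHash, nodeHash, rootOf4, verifyInclusion). It does follow the same hashing convention as tlog, where a record hash is SHA-256 over a 0x00 byte plus the data and an interior node is SHA-256 over a 0x01 byte plus both child hashes. The real client reads stored hashes through tiles rather than taking a sibling list.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hash [sha256.Size]byte

// recordHash hashes leaf data with a 0x00 prefix.
func recordHash(data []byte) hash {
	return sha256.Sum256(append([]byte{0x00}, data...))
}

// nodeHash hashes two child hashes with a 0x01 prefix.
func nodeHash(left, right hash) hash {
	buf := make([]byte, 0, 1+2*sha256.Size)
	buf = append(buf, 0x01)
	buf = append(buf, left[:]...)
	buf = append(buf, right[:]...)
	return sha256.Sum256(buf)
}

// rootOf4 computes the root of a tree with exactly four records.
func rootOf4(records [4][]byte) hash {
	return nodeHash(
		nodeHash(recordHash(records[0]), recordHash(records[1])),
		nodeHash(recordHash(records[2]), recordHash(records[3])),
	)
}

// verifyInclusion recomputes the root from one record and its sibling
// hashes, listed bottom-up. It assumes a complete binary tree.
func verifyInclusion(index int, data []byte, siblings []hash, root hash) bool {
	h := recordHash(data)
	for _, sib := range siblings {
		if index%2 == 0 {
			h = nodeHash(h, sib)
		} else {
			h = nodeHash(sib, h)
		}
		index /= 2
	}
	return h == root
}

func main() {
	records := [4][]byte{[]byte("r0"), []byte("r1"), []byte("r2"), []byte("r3")}
	root := rootOf4(records)
	proof := []hash{
		recordHash(records[3]),
		nodeHash(recordHash(records[0]), recordHash(records[1])),
	}
	fmt.Println(verifyInclusion(2, records[2], proof, root))
	fmt.Println(verifyInclusion(2, []byte("tampered"), proof, root))
}
```

Changing a single byte of a record changes its leaf hash and therefore the recomputed root, which is why a record that does not belong to the signed tree cannot be authenticated.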

checkTrees detects a fork in the tree timeline (server tampering):

// 426: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) checkTrees(older tlog.Tree, olderNote []byte, newer tlog.Tree, newerNote []byte) error {
	thr := tlog.TileHashReader(newer, &c.tileReader)
	h, err := tlog.TreeHash(older.N, thr)
	if err != nil {
		if older.N == newer.N {
			return fmt.Errorf("checking tree#%d: %v", older.N, err)
		}
		return fmt.Errorf("checking tree#%d against tree#%d: %v", older.N, newer.N, err)
	}
	if h == older.Hash {
		return nil
	}

	// Detected a fork in the tree timeline.
	// Start by reporting the inconsistent signed tree notes.
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "SECURITY ERROR\n")
	fmt.Fprintf(&buf, "go.sum database server misbehavior detected!\n\n")
	indent := func(b []byte) []byte {
		return bytes.Replace(b, []byte("\n"), []byte("\n\t"), -1)
	}
	fmt.Fprintf(&buf, "old database:\n\t%s\n", indent(olderNote))
	fmt.Fprintf(&buf, "new database:\n\t%s\n", indent(newerNote))

	// ... generate the inconsistency proof ...
	
	c.ops.SecurityError(buf.String())
	return ErrSecurity
}

7. Local cache

sumdb lookup results are cached under the $GOMODCACHE/cache/download/sumdb/ directory:

// 286: src\cmd\go\internal\modfetch\sumdb.go
func (*dbClient) ReadCache(file string) ([]byte, error) {
	targ := filepath.Join(cfg.GOMODCACHE, "cache/download/sumdb", file)
	data, err := lockedfile.Read(targ)
	// lockedfile.Write does not atomically create the file with contents.
	// There is a moment between file creation and locking the file for writing,
	// during which the empty file can be locked for reading.
	// Treat observing an empty file as file not found.
	if err == nil && len(data) == 0 {
		err = &fs.PathError{Op: "read", Path: targ, Err: fs.ErrNotExist}
	}
	return data, err
}

// WriteCache updates cached lookups or tiles.
func (*dbClient) WriteCache(file string, data []byte) {
	targ := filepath.Join(cfg.GOMODCACHE, "cache/download/sumdb", file)
	os.MkdirAll(filepath.Dir(targ), 0777)
	lockedfile.Write(targ, bytes.NewReader(data), 0666)
}
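
Putting Lookup and ReadCache together gives the on-disk location of a cached lookup. The sketch below rebuilds that name with hypothetical helpers: escapeUpper is a simplified stand-in for module.EscapePath and module.EscapeVersion (it only rewrites uppercase letters and performs no validation), and the GOMODCACHE value in main is a made-up example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// escapeUpper replaces each uppercase ASCII letter with "!" plus its
// lowercase form. Simplified stand-in for the module escaping functions.
func escapeUpper(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// lookupCacheFile builds the cache-relative name that Lookup passes to
// ReadCache: the database name followed by the /lookup/ path.
func lookupCacheFile(dbName, modPath, version string) string {
	version = strings.TrimSuffix(version, "/go.mod")
	return dbName + "/lookup/" + escapeUpper(modPath) + "@" + escapeUpper(version)
}

func main() {
	file := lookupCacheFile("sum.golang.org", "github.com/BurntSushi/toml", "v1.3.2")
	// Hypothetical GOMODCACHE value, for illustration only.
	fmt.Println(filepath.Join("/home/user/go/pkg/mod", "cache/download/sumdb", file))
}
```

The zip line and the /go.mod line of a module version share one cache file, because the /go.mod suffix is trimmed before the name is built.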

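8. Related environment variables

Variable     Purpose
GOSUMDB      Selects the checksum database; defaults to sum.golang.org, set to off to disable
GONOSUMDB    Module path patterns that are not looked up in sumdb
GOPRIVATE    Private modules; serves as the default value of GONOSUMDB
GOMODCACHE   Module cache directory; the sumdb cache lives inside it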

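9. Source file index

File                                      Responsibility
cmd/go/internal/modfetch/fetch.go         Reading and writing go.sum, verification entry point
cmd/go/internal/modfetch/sumdb.go         Adapter layer for the sumdb client
cmd/go/internal/modfetch/key.go           Built-in sumdb public key
golang.org/x/mod/sumdb/client.go          Core logic of the sumdb client
golang.org/x/mod/sumdb/dirhash/hash.go    Implementation of the h1 hash algorithm
golang.org/x/mod/sumdb/tlog/              Transparent log implementation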