12 How Does go.sum Keep Dependencies Safe? A Source-Level Look at the Verification Mechanism

This article is based on the Go 1.25.0 source code.

1. Overall flow

Download module zip
    │
    ▼
Compute zip hash (dirhash.HashZip)
    │
    ▼
checkModSum(mod, hash)
    │
    ├─► present in go.sum and matches ─► pass
    │
    ├─► present in go.sum but does not match ─► SECURITY ERROR
    │
    └─► not present in go.sum
            │
            ▼
        useSumDB(mod)?
            │
            ├─► No (GOSUMDB=off or GONOSUMDB matches)
            │       │
            │       ▼
            │   add to go.sum
            │
            └─► Yes
                    │
                    ▼
                checkSumDB(mod, hash)
                    │
                    ├─► sumdb returns a match ─► add to go.sum
                    │
                    └─► sumdb returns a mismatch ─► SECURITY ERROR

2. The go.sum file format

Each line of go.sum has three fields: module path, version, and hash.

golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=

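Each module version usually has two records:

  • the hash of the module content (the zip file)
  • the hash of the go.mod file alone (the version field ends in /go.mod)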
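
To make the two record types concrete, here is a minimal sketch that splits a go.sum line into its fields and flags the /go.mod variant. The sumEntry type and the parseSumLine function are hypothetical names made up for this illustration; cmd/go stores entries in a map keyed by module.Version instead.

```go
package main

import (
	"fmt"
	"strings"
)

// sumEntry is a hypothetical type used only in this sketch.
type sumEntry struct {
	Path    string
	Version string // version with any "/go.mod" suffix removed
	GoMod   bool   // true when the hash covers only the go.mod file
	Hash    string
}

// parseSumLine splits one go.sum line into its three fields.
func parseSumLine(line string) (sumEntry, error) {
	f := strings.Fields(line)
	if len(f) != 3 {
		return sumEntry{}, fmt.Errorf("wrong number of fields %d", len(f))
	}
	version, isGoMod := strings.CutSuffix(f[1], "/go.mod")
	return sumEntry{Path: f[0], Version: version, GoMod: isGoMod, Hash: f[2]}, nil
}

func main() {
	lines := []string{
		"golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg=",
		"golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=",
	}
	for _, l := range lines {
		e, err := parseSumLine(l)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s@%s gomod=%v %s\n", e.Path, e.Version, e.GoMod, e.Hash)
	}
}
```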

The parsing logic lives in cmd/go/internal/modfetch/fetch.go:

// 577: src\cmd\go\internal\modfetch\fetch.go
func readGoSum(dst map[module.Version][]string, file string, data []byte) {
	lineno := 0
	for len(data) > 0 {
		var line []byte
		lineno++
		i := bytes.IndexByte(data, '\n')
		if i < 0 {
			line, data = data, nil
		} else {
			line, data = data[:i], data[i+1:]
		}
		f := strings.Fields(string(line))
		if len(f) == 0 {
			// blank line; skip it
			continue
		}
		if len(f) != 3 {
			if cfg.CmdName == "mod tidy" {
				// ignore malformed line so that go mod tidy can fix go.sum
				continue
			} else {
				base.Fatalf("malformed go.sum:\n%s:%d: wrong number of fields %v\n", file, lineno, len(f))
			}
		}
		if f[2] == emptyGoModHash {
			// Old bug; drop it.
			continue
		}
		mod := module.Version{Path: f[0], Version: f[1]}
		dst[mod] = append(dst[mod], f[2])
	}
}

3. The hash algorithm: h1

Go marks its hash algorithm with the h1: prefix. The core implementation is in the golang.org/x/mod/sumdb/dirhash package:

// 44: src\cmd\vendor\golang.org\x\mod\sumdb\dirhash\hash.go
func Hash1(files []string, open func(string) (io.ReadCloser, error)) (string, error) {
	h := sha256.New()
	files = append([]string(nil), files...)
	slices.Sort(files)
	for _, file := range files {
		if strings.Contains(file, "\n") {
			return "", errors.New("dirhash: filenames with newlines are not supported")
		}
		r, err := open(file)
		if err != nil {
			return "", err
		}
		hf := sha256.New()
		_, err = io.Copy(hf, r)
		r.Close()
		if err != nil {
			return "", err
		}
		fmt.Fprintf(h, "%x  %s\n", hf.Sum(nil), file)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(h.Sum(nil)), nil
}

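How the algorithm works:

  1. Sort the file names.
  2. Compute the SHA-256 of each file's content.
  3. Emit one line per file in the form <hex hash>  <filename>\n (two spaces separate the fields).
  4. Compute the SHA-256 of the concatenation of all those lines.
  5. Base64-encode the result and prepend h1:.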
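
The five steps can be reproduced with the standard library alone. The sketch below is a simplified stand-in for dirhash.Hash1: hashFiles is a hypothetical name, and it takes an in-memory map instead of the open callback used by the real function. The file names in main are made-up examples of the path@version/ layout used inside module zips.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hashFiles is a simplified stand-in for dirhash.Hash1 that works on an
// in-memory map (file name -> content).
func hashFiles(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // step 1: sort the file names

	summary := sha256.New()
	for _, name := range names {
		fileSum := sha256.Sum256(files[name]) // step 2: hash each file
		// step 3: "<hex hash>  <filename>\n", with two spaces
		fmt.Fprintf(summary, "%x  %s\n", fileSum[:], name)
	}
	// steps 4 and 5: hash the summary lines, Base64-encode, add the prefix
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil))
}

func main() {
	files := map[string][]byte{
		"example.com/m@v1.0.0/go.mod":  []byte("module example.com/m\n"),
		"example.com/m@v1.0.0/main.go": []byte("package main\n"),
	}
	fmt.Println(hashFiles(files))
}
```

Because a SHA-256 digest is 32 bytes, the Base64 part is always 44 characters long, which is why every h1: value in go.sum has the same length.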

For zip files, the HashZip function is used:

// 111: src\cmd\vendor\golang.org\x\mod\sumdb\dirhash\hash.go
func HashZip(zipfile string, hash Hash) (string, error) {
	z, err := zip.OpenReader(zipfile)
	if err != nil {
		return "", err
	}
	defer z.Close()
	var files []string
	zfiles := make(map[string]*zip.File)
	for _, file := range z.File {
		files = append(files, file.Name)
		zfiles[file.Name] = file
	}
	zipOpen := func(name string) (io.ReadCloser, error) {
		f := zfiles[name]
		if f == nil {
			return nil, fmt.Errorf("file %q not found in zip", name) // should never happen
		}
		return f.Open()
	}
	return hash(files, zipOpen)
}
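
One consequence of hashing only file names and file contents is that zip-level details such as entry order or compression method do not affect the result. The following sketch illustrates this under stated assumptions: buildZip, hashZipBytes, demoHashes and the entry type are hypothetical helpers written for this example, and the zip is kept in memory rather than read from a path as HashZip does.

```go
package main

import (
	"archive/zip"
	"bytes"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"io"
	"sort"
)

type entry struct{ name, body string }

// buildZip writes the entries in the given order with the given method.
func buildZip(method uint16, entries []entry) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, e := range entries {
		w, err := zw.CreateHeader(&zip.FileHeader{Name: e.name, Method: method})
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(w, e.body); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// hashZipBytes mirrors the HashZip + Hash1 combination for an in-memory zip.
func hashZipBytes(data []byte) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	byName := make(map[string]*zip.File)
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
		byName[f.Name] = f
	}
	sort.Strings(names)
	summary := sha256.New()
	for _, name := range names {
		rc, err := byName[name].Open()
		if err != nil {
			return "", err
		}
		fileHash := sha256.New()
		_, err = io.Copy(fileHash, rc)
		rc.Close()
		if err != nil {
			return "", err
		}
		fmt.Fprintf(summary, "%x  %s\n", fileHash.Sum(nil), name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil)), nil
}

// demoHashes hashes two zips that differ only in entry order and compression.
func demoHashes() (string, string, error) {
	files := []entry{
		{"example.com/m@v1.0.0/go.mod", "module example.com/m\n"},
		{"example.com/m@v1.0.0/m.go", "package m\n"},
	}
	stored, err := buildZip(zip.Store, files)
	if err != nil {
		return "", "", err
	}
	deflated, err := buildZip(zip.Deflate, []entry{files[1], files[0]})
	if err != nil {
		return "", "", err
	}
	a, err := hashZipBytes(stored)
	if err != nil {
		return "", "", err
	}
	b, err := hashZipBytes(deflated)
	return a, b, err
}

func main() {
	a, b, err := demoHashes()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("same hash:", a == b)
}
```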

4. Verification flow

4.1 Entry point: checkModSum

When a module is downloaded, checkModSum is called to verify it:

// 739: src\cmd\go\internal\modfetch\fetch.go
func checkModSum(mod module.Version, h string) error {
	// We lock goSum when manipulating it,
	// but we arrange to release the lock when calling checkSumDB,
	// so that parallel calls to checkModHash can execute parallel calls
	// to checkSumDB.

	// Check whether mod+h is listed in go.sum already. If so, we're done.
	goSum.mu.Lock()
	inited, err := initGoSum()
	if err != nil {
		goSum.mu.Unlock()
		return err
	}
	done := inited && haveModSumLocked(mod, h)
	if inited {
		st := goSum.status[modSum{mod, h}]
		st.used = true
		goSum.status[modSum{mod, h}] = st
	}
	goSum.mu.Unlock()

	if done {
		return nil
	}

	// Not listed, so we want to add them.
	// Consult checksum database if appropriate.
	if useSumDB(mod) {
		// Calls base.Fatalf if mismatch detected.
		if err := checkSumDB(mod, h); err != nil {
			return err
		}
	}

	// Add mod+h to go.sum, if it hasn't appeared already.
	if inited {
		goSum.mu.Lock()
		addModSumLocked(mod, h)
		st := goSum.status[modSum{mod, h}]
		st.dirty = true
		goSum.status[modSum{mod, h}] = st
		goSum.mu.Unlock()
	}
	return nil
}

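Flow:

  1. First check whether go.sum already contains a matching hash.
  2. If it does not, and useSumDB(mod) allows it, query the checksum database (sumdb, a remote HTTP service provided officially by Go).
  3. Once verification passes, add the new hash to go.sum.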
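
The three steps can be modelled in a few lines. This is a single-goroutine sketch under stated assumptions, not the real implementation: the verifier type, its fields and errMismatch are hypothetical, locking is omitted, and a local mismatch is returned as an error here whereas cmd/go calls base.Fatalf.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errMismatch = errors.New("checksum mismatch")

// verifier is a hypothetical, simplified model of the go.sum check.
type verifier struct {
	recorded map[string][]string          // "path version" -> hashes in go.sum
	useSumDB func(key string) bool        // stands in for useSumDB
	checkDB  func(key, hash string) error // stands in for checkSumDB
}

func (v *verifier) check(key, hash string) error {
	// Step 1: is the hash already recorded?
	for _, h := range v.recorded[key] {
		if h == hash {
			return nil
		}
		if strings.HasPrefix(h, "h1:") {
			return errMismatch // a different h1: hash is recorded
		}
	}
	// Step 2: consult the checksum database if appropriate.
	if v.useSumDB(key) {
		if err := v.checkDB(key, hash); err != nil {
			return err
		}
	}
	// Step 3: record the new hash.
	v.recorded[key] = append(v.recorded[key], hash)
	return nil
}

func main() {
	v := &verifier{
		recorded: map[string][]string{"example.com/m v1.0.0": {"h1:aaa"}},
		useSumDB: func(string) bool { return true },
		checkDB:  func(string, string) error { return nil },
	}
	fmt.Println(v.check("example.com/m v1.0.0", "h1:aaa")) // already recorded
	fmt.Println(v.check("example.com/m v1.0.0", "h1:bbb")) // conflicts with go.sum
	fmt.Println(v.check("example.com/n v1.0.0", "h1:ccc")) // new, gets recorded
}
```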

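4.2 Detecting hash mismatches

haveModSumLocked handles the case where the downloaded hash does not match an h1: hash already recorded in go.sum (or in a workspace's go.work.sum):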

// 788: src\cmd\go\internal\modfetch\fetch.go
func haveModSumLocked(mod module.Version, h string) bool {
	sumFileName := "go.sum"
	if strings.HasSuffix(GoSumFile, "go.work.sum") {
		sumFileName = "go.work.sum"
	}
	for _, vh := range goSum.m[mod] {
		if h == vh {
			return true
		}
		if strings.HasPrefix(vh, "h1:") {
			base.Fatalf("verifying %s@%s: checksum mismatch\n\tdownloaded: %v\n\t%s:     %v"+goSumMismatch, mod.Path, mod.Version, h, sumFileName, vh)
		}
	}
	// Also check workspace sums.
	foundMatch := false
	// Check sums from all files in case there are conflicts between
	// the files.
	for goSumFile, goSums := range goSum.w {
		for _, vh := range goSums[mod] {
			if h == vh {
				foundMatch = true
			} else if strings.HasPrefix(vh, "h1:") {
				base.Fatalf("verifying %s@%s: checksum mismatch\n\tdownloaded: %v\n\t%s:     %v"+goSumMismatch, mod.Path, mod.Version, h, goSumFile, vh)
			}
		}
	}
	return foundMatch
}

When the hashes do not match, this security error is printed:

// 1045: src\cmd\go\internal\modfetch\fetch.go
const goSumMismatch = `

SECURITY ERROR
This download does NOT match an earlier download recorded in go.sum.
The bits may have been replaced on the origin server, or an attacker may
have intercepted the download attempt.

For more information, see 'go help module-auth'.
`

5. Checksum Database (sumdb)

5.1 Whether to use sumdb

The useSumDB function decides whether the checksum database needs to be queried:

// 35: src\cmd\go\internal\modfetch\sumdb.go
func useSumDB(mod module.Version) bool {
	if mod.Path == "golang.org/toolchain" {
		must := true
		// Downloaded toolchains cannot be listed in go.sum,
		// so we require checksum database lookups even if
		// GOSUMDB=off or GONOSUMDB matches the pattern.
		// If GOSUMDB=off, then the eventual lookup will fail
		// with a good error message.

		// Exception #1: using GOPROXY=file:// to test a distpack.
		if strings.HasPrefix(cfg.GOPROXY, "file://") && !strings.ContainsAny(cfg.GOPROXY, ",|") {
			must = false
		}
		// Exception #2: the Go proxy+checksum database cannot check itself
		// while doing the initial download.
		if strings.Contains(os.Getenv("GIT_HTTP_USER_AGENT"), "proxy.golang.org") {
			must = false
		}

		// Another potential exception would be GOPROXY=direct,
		// but that would make toolchain downloads only as secure
		// as HTTPS, and in particular they'd be susceptible to MITM
		// attacks on systems with less-than-trustworthy root certificates.
		// The checksum database provides a stronger guarantee,
		// so we don't make that exception.

		// Otherwise, require the checksum database.
		if must {
			return true
		}
	}
	return cfg.GOSUMDB != "off" && !module.MatchPrefixPatterns(cfg.GONOSUMDB, mod.Path)
}

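Key points:

  • GOSUMDB=off: disables the checksum database entirely (golang.org/toolchain still requires a lookup, which then fails with an error)
  • GONOSUMDB: module paths matching its patterns are not looked up in sumdb
  • GOPRIVATE: private modules are not looked up by default, because GONOSUMDB defaults to the value of GOPRIVATE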
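
The last line of useSumDB can be approximated without importing golang.org/x/mod. In the sketch below, shouldUseSumDB and matchPrefixPatterns are hypothetical names; the matcher is a simplified re-implementation of what module.MatchPrefixPatterns does (comma-separated globs matched against a leading prefix of the module path), and the golang.org/toolchain special case is left out. The host names are made up.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchPrefixPatterns reports whether any comma-separated glob matches a
// leading path prefix of target. Simplified re-implementation for this sketch.
func matchPrefixPatterns(globs, target string) bool {
	elems := strings.Split(target, "/")
	for _, glob := range strings.Split(globs, ",") {
		if glob == "" {
			continue
		}
		// A glob with n slashes is compared with the first n+1 path elements.
		n := strings.Count(glob, "/")
		if len(elems) <= n {
			continue
		}
		prefix := strings.Join(elems[:n+1], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

// shouldUseSumDB models the final return statement of useSumDB only.
func shouldUseSumDB(gosumdb, gonosumdb, modPath string) bool {
	return gosumdb != "off" && !matchPrefixPatterns(gonosumdb, modPath)
}

func main() {
	const patterns = "*.corp.example.com,github.com/acme/private"
	for _, p := range []string{
		"git.corp.example.com/team/lib",
		"github.com/acme/private/sub",
		"github.com/acme/public",
	} {
		fmt.Println(p, shouldUseSumDB("sum.golang.org", patterns, p))
	}
}
```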

5.2 The default sumdb public key

Go ships with the public key of sum.golang.org built in:

// 7: src\cmd\go\internal\modfetch\key.go
var knownGOSUMDB = map[string]string{
	"sum.golang.org": "sum.golang.org+033de0ae+Ac4zctda0e5eza+HJyk9SxEdh+s3Ux18htTTAD8OuAn8",
}

5.3 Connecting to sumdb

The dbDial function initializes the sumdb client:

// 89: src\cmd\go\internal\modfetch\sumdb.go
func dbDial() (dbName string, db *sumdb.Client, err error) {
	// $GOSUMDB can be "key" or "key url",
	// and the key can be a full verifier key
	// or a host on our list of known keys.

	// Special case: sum.golang.google.cn
	// is an alias, reachable inside mainland China,
	// for sum.golang.org. If there are more
	// of these we should add a map like knownGOSUMDB.
	gosumdb := cfg.GOSUMDB
	if gosumdb == "sum.golang.google.cn" {
		gosumdb = "sum.golang.org https://sum.golang.google.cn"
	}

	if gosumdb == "off" {
		return "", nil, fmt.Errorf("checksum database disabled by GOSUMDB=off")
	}

	key := strings.Fields(gosumdb)
	if len(key) >= 1 {
		if k := knownGOSUMDB[key[0]]; k != "" {
			key[0] = k
		}
	}
	if len(key) == 0 {
		return "", nil, fmt.Errorf("missing GOSUMDB")
	}
	if len(key) > 2 {
		return "", nil, fmt.Errorf("invalid GOSUMDB: too many fields")
	}
	vkey, err := note.NewVerifier(key[0])
	if err != nil {
		return "", nil, fmt.Errorf("invalid GOSUMDB: %v", err)
	}
	name := vkey.Name()

	// No funny business in the database name.
	direct, err := url.Parse("https://" + name)
	if err != nil || strings.HasSuffix(name, "/") || *direct != (url.URL{Scheme: "https", Host: direct.Host, Path: direct.Path, RawPath: direct.RawPath}) || direct.RawPath != "" || direct.Host == "" {
		return "", nil, fmt.Errorf("invalid sumdb name (must be host[/path]): %s %+v", name, *direct)
	}

	// Determine how to get to database.
	var base *url.URL
	if len(key) >= 2 {
		// Use explicit alternate URL listed in $GOSUMDB,
		// bypassing both the default URL derivation and any proxies.
		u, err := url.Parse(key[1])
		if err != nil {
			return "", nil, fmt.Errorf("invalid GOSUMDB URL: %v", err)
		}
		base = u
	}

	return name, sumdb.NewClient(&dbClient{key: key[0], name: name, direct: direct, base: base}), nil
}
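
The accepted shapes of $GOSUMDB ("key" or "key url") are easier to see in isolation. The sketch below is an assumption-laden simplification: parseGOSUMDB and knownKeys are hypothetical names, the database name is taken as the text before the first "+" of the verifier key instead of going through note.NewVerifier, and the returned URL is only the direct URL (the real client may also reach the database through a proxy).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// knownKeys mirrors the single entry of knownGOSUMDB shown earlier.
var knownKeys = map[string]string{
	"sum.golang.org": "sum.golang.org+033de0ae+Ac4zctda0e5eza+HJyk9SxEdh+s3Ux18htTTAD8OuAn8",
}

// parseGOSUMDB returns the database name and the URL used to reach it directly.
func parseGOSUMDB(value string) (string, string, error) {
	if value == "sum.golang.google.cn" {
		value = "sum.golang.org https://sum.golang.google.cn"
	}
	if value == "off" {
		return "", "", errors.New("checksum database disabled by GOSUMDB=off")
	}
	fields := strings.Fields(value)
	if len(fields) == 0 {
		return "", "", errors.New("missing GOSUMDB")
	}
	if len(fields) > 2 {
		return "", "", errors.New("invalid GOSUMDB: too many fields")
	}
	key := fields[0]
	if k, ok := knownKeys[key]; ok {
		key = k
	}
	// A verifier key looks like "<name>+<hash>+<base64 key>".
	// The real code validates all three parts; this sketch only reads the name.
	dbName, _, ok := strings.Cut(key, "+")
	if !ok || dbName == "" {
		return "", "", errors.New("invalid GOSUMDB: malformed verifier key")
	}
	baseURL := "https://" + dbName
	if len(fields) == 2 {
		baseURL = fields[1] // explicit alternate URL
	}
	return dbName, baseURL, nil
}

func main() {
	for _, v := range []string{"sum.golang.org", "sum.golang.google.cn", "off"} {
		name, url, err := parseGOSUMDB(v)
		fmt.Printf("%q -> name=%q url=%q err=%v\n", v, name, url, err)
	}
}
```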

5.4 Querying sumdb

checkSumDB queries sumdb and verifies the hash against the returned lines:

// 831: src\cmd\go\internal\modfetch\fetch.go
func checkSumDB(mod module.Version, h string) error {
	modWithoutSuffix := mod
	noun := "module"
	if before, found := strings.CutSuffix(mod.Version, "/go.mod"); found {
		noun = "go.mod"
		modWithoutSuffix.Version = before
	}

	db, lines, err := lookupSumDB(mod)
	if err != nil {
		return module.VersionError(modWithoutSuffix, fmt.Errorf("verifying %s: %v", noun, err))
	}

	have := mod.Path + " " + mod.Version + " " + h
	prefix := mod.Path + " " + mod.Version + " h1:"
	for _, line := range lines {
		if line == have {
			return nil
		}
		if strings.HasPrefix(line, prefix) {
			return module.VersionError(modWithoutSuffix, fmt.Errorf("verifying %s: checksum mismatch\n\tdownloaded: %v\n\t%s: %v"+sumdbMismatch, noun, h, db, line[len(prefix)-len("h1:"):]))
		}
	}
	return nil
}
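
The comparison loop in checkSumDB is worth isolating, because it has three outcomes rather than two: an exact match, a conflicting h1: line, and no relevant line at all (which also returns nil). The sketch below copies that loop into a hypothetical helper, compareWithSumDB, with a shortened error message; the hash values are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// compareWithSumDB mirrors the line-matching loop of checkSumDB.
// version may carry the "/go.mod" suffix, exactly as in go.sum.
func compareWithSumDB(path, version, hash string, lines []string) error {
	have := path + " " + version + " " + hash
	prefix := path + " " + version + " h1:"
	for _, line := range lines {
		if line == have {
			return nil // sumdb agrees with the download
		}
		if strings.HasPrefix(line, prefix) {
			// Same module and version, different h1: hash.
			return fmt.Errorf("checksum mismatch: downloaded %s, sumdb has %s",
				hash, line[len(prefix)-len("h1:"):])
		}
	}
	return nil // no h1: line for this module version
}

func main() {
	lines := []string{
		"example.com/m v1.0.0 h1:aaa=",
		"example.com/m v1.0.0/go.mod h1:bbb=",
	}
	fmt.Println(compareWithSumDB("example.com/m", "v1.0.0", "h1:aaa=", lines))
	fmt.Println(compareWithSumDB("example.com/m", "v1.0.0", "h1:zzz=", lines))
}
```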

lookupSumDB calls the Lookup method of the sumdb client:

// 69: src\cmd\go\internal\modfetch\sumdb.go
func lookupSumDB(mod module.Version) (dbname string, lines []string, err error) {
	dbOnce.Do(func() {
		dbName, db, dbErr = dbDial()
	})
	if dbErr != nil {
		return "", nil, dbErr
	}
	lines, err = db.Lookup(mod.Path, mod.Version)
	return dbName, lines, err
}

6. Core logic of sumdb.Client

6.1 The Lookup method

// 209: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) Lookup(path, vers string) (lines []string, err error) {
	atomic.StoreUint32(&c.didLookup, 1)

	if c.skip(path) {
		return nil, ErrGONOSUMDB
	}

	defer func() {
		if err != nil {
			err = fmt.Errorf("%s@%s: %v", path, vers, err)
		}
	}()

	if err := c.init(); err != nil {
		return nil, err
	}

	// Prepare encoded cache filename / URL.
	epath, err := module.EscapePath(path)
	if err != nil {
		return nil, err
	}
	evers, err := module.EscapeVersion(strings.TrimSuffix(vers, "/go.mod"))
	if err != nil {
		return nil, err
	}
	remotePath := "/lookup/" + epath + "@" + evers
	file := c.name + remotePath

	// Fetch the data.
	// The lookupCache avoids redundant ReadCache/GetURL operations
	// (especially since go.sum lines tend to come in pairs for a given
	// path and version) and also avoids having multiple of the same
	// request in flight at once.
	type cached struct {
		data []byte
		err  error
	}
	result := c.record.Do(file, func() interface{} {
		// Try the on-disk cache, or else get from web.
		writeCache := false
		data, err := c.ops.ReadCache(file)
		if err != nil {
			data, err = c.ops.ReadRemote(remotePath)
			if err != nil {
				return cached{nil, err}
			}
			writeCache = true
		}

		// Validate the record before using it for anything.
		id, text, treeMsg, err := tlog.ParseRecord(data)
		if err != nil {
			return cached{nil, err}
		}
		if err := c.mergeLatest(treeMsg); err != nil {
			return cached{nil, err}
		}
		if err := c.checkRecord(id, text); err != nil {
			return cached{nil, err}
		}

		// Now that we've validated the record,
		// save it to the on-disk cache (unless that's where it came from).
		if writeCache {
			c.ops.WriteCache(file, data)
		}

		return cached{data, nil}
	}).(cached)
	if result.err != nil {
		return nil, result.err
	}

	// Extract the lines for the specific version we want
	// (with or without /go.mod).
	prefix := path + " " + vers + " "
	var hashes []string
	for _, line := range strings.Split(string(result.data), "\n") {
		if strings.HasPrefix(line, prefix) {
			hashes = append(hashes, line)
		}
	}
	return hashes, nil
}

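Flow:

  1. Check the local on-disk cache first.
  2. On a cache miss, request the record from the remote server.
  3. Validate the record: merge its signed tree head (mergeLatest) and check that the record is included in the tree (checkRecord).
  4. Write the validated record to the cache, unless it came from the cache.
  5. Extract the hash lines that match the requested path and version.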
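
The important ordering here is "validate before caching": data is written to disk only after it has passed validation, and cached data is validated again on every read. A minimal sketch of that pattern follows, using a hypothetical store type with an in-memory map in place of the disk cache; the request de-duplication that Lookup gets from c.record.Do is left out.

```go
package main

import (
	"errors"
	"fmt"
)

var errInvalid = errors.New("invalid record")

// store is a hypothetical model of the cache / remote / validate pipeline.
type store struct {
	cache       map[string][]byte
	remote      func(file string) ([]byte, error)
	validate    func(data []byte) error
	remoteCalls int
}

func (s *store) lookup(file string) ([]byte, error) {
	fromRemote := false
	data, ok := s.cache[file]
	if !ok {
		var err error
		s.remoteCalls++
		if data, err = s.remote(file); err != nil {
			return nil, err
		}
		fromRemote = true
	}
	// Validate before using the data for anything, cached or not.
	if err := s.validate(data); err != nil {
		return nil, err
	}
	// Only validated data that came from the network is written back.
	if fromRemote {
		s.cache[file] = data
	}
	return data, nil
}

func main() {
	s := &store{
		cache:  map[string][]byte{},
		remote: func(string) ([]byte, error) { return []byte("record"), nil },
		validate: func(data []byte) error {
			if len(data) == 0 {
				return errInvalid
			}
			return nil
		},
	}
	for i := 0; i < 2; i++ {
		if _, err := s.lookup("sum.golang.org/lookup/example.com/m@v1.0.0"); err != nil {
			fmt.Println("lookup failed:", err)
		}
	}
	fmt.Println("remote calls:", s.remoteCalls) // prints "remote calls: 1"
}
```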

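6.2 Transparent log verification

sumdb implements a transparent log on top of a Merkle tree, so that the server cannot alter records it has already published without being detected.

checkRecord verifies the hash of a record against the tree: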

// 478: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) checkRecord(id int64, data []byte) error {
	c.latestMu.Lock()
	latest := c.latest
	c.latestMu.Unlock()

	if id >= latest.N {
		return fmt.Errorf("cannot validate record %d in tree of size %d", id, latest.N)
	}
	hashes, err := tlog.TileHashReader(latest, &c.tileReader).ReadHashes([]int64{tlog.StoredHashIndex(0, id)})
	if err != nil {
		return err
	}
	if hashes[0] == tlog.RecordHash(data) {
		return nil
	}
	return fmt.Errorf("cannot authenticate record data in server response")
}

checkTrees detects a fork in the tree timeline (that is, tampering by the server):

// 426: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) checkTrees(older tlog.Tree, olderNote []byte, newer tlog.Tree, newerNote []byte) error {
	thr := tlog.TileHashReader(newer, &c.tileReader)
	h, err := tlog.TreeHash(older.N, thr)
	if err != nil {
		if older.N == newer.N {
			return fmt.Errorf("checking tree#%d: %v", older.N, err)
		}
		return fmt.Errorf("checking tree#%d against tree#%d: %v", older.N, newer.N, err)
	}
	if h == older.Hash {
		return nil
	}

	// Detected a fork in the tree timeline.
	// Start by reporting the inconsistent signed tree notes.
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "SECURITY ERROR\n")
	fmt.Fprintf(&buf, "go.sum database server misbehavior detected!\n\n")
	indent := func(b []byte) []byte {
		return bytes.Replace(b, []byte("\n"), []byte("\n\t"), -1)
	}
	fmt.Fprintf(&buf, "old database:\n\t%s\n", indent(olderNote))
	fmt.Fprintf(&buf, "new database:\n\t%s\n", indent(newerNote))

	// ... build the inconsistency proof (elided) ...
	
	c.ops.SecurityError(buf.String())
	return ErrSecurity
}
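
To see why a fork is detectable, it helps to compute a small Merkle tree by hand. The sketch below uses the RFC 6962 style hashing that the tlog package follows (a 0x00 prefix for record hashes and a 0x01 prefix for interior nodes). It is a toy under stated assumptions: leafHash, nodeHash, treeHash and extends are hypothetical names, and the whole log is held in memory, whereas the real client reads only the tile hashes it needs.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// leafHash is SHA-256(0x00 || record).
func leafHash(data []byte) [32]byte {
	return sha256.Sum256(append([]byte{0x00}, data...))
}

// nodeHash is SHA-256(0x01 || left || right).
func nodeHash(left, right [32]byte) [32]byte {
	buf := make([]byte, 0, 65)
	buf = append(buf, 0x01)
	buf = append(buf, left[:]...)
	buf = append(buf, right[:]...)
	return sha256.Sum256(buf)
}

// treeHash returns the root hash of a tree over records.
// It requires at least one record.
func treeHash(records [][]byte) [32]byte {
	if len(records) == 1 {
		return leafHash(records[0])
	}
	// Split at the largest power of two strictly smaller than len(records).
	k := 1
	for k*2 < len(records) {
		k *= 2
	}
	return nodeHash(treeHash(records[:k]), treeHash(records[k:]))
}

// extends reports whether the newer log still contains the older tree
// (olderN records with root olderHash) as its prefix.
func extends(olderN int, olderHash [32]byte, newer [][]byte) bool {
	if olderN < 1 || olderN > len(newer) {
		return false
	}
	return treeHash(newer[:olderN]) == olderHash
}

func main() {
	log := [][]byte{[]byte("rec0"), []byte("rec1")}
	oldRoot := treeHash(log)

	honest := append(append([][]byte{}, log...), []byte("rec2"))
	forked := [][]byte{[]byte("rec0"), []byte("tampered"), []byte("rec2")}

	fmt.Println("honest log extends old tree:", extends(2, oldRoot, honest))
	fmt.Println("forked log extends old tree:", extends(2, oldRoot, forked))
}
```

The client applies the same idea when it compares a previously seen signed tree with a newer one: if the older root cannot be reproduced from the newer tree, the server has presented two incompatible histories.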

7. Local cache

sumdb lookup results are cached under the $GOMODCACHE/cache/download/sumdb/ directory:

// 286: src\cmd\go\internal\modfetch\sumdb.go
func (*dbClient) ReadCache(file string) ([]byte, error) {
	targ := filepath.Join(cfg.GOMODCACHE, "cache/download/sumdb", file)
	data, err := lockedfile.Read(targ)
	// lockedfile.Write does not atomically create the file with contents.
	// There is a moment between file creation and locking the file for writing,
	// during which the empty file can be locked for reading.
	// Treat observing an empty file as file not found.
	if err == nil && len(data) == 0 {
		err = &fs.PathError{Op: "read", Path: targ, Err: fs.ErrNotExist}
	}
	return data, err
}

// WriteCache updates cached lookups or tiles.
func (*dbClient) WriteCache(file string, data []byte) {
	targ := filepath.Join(cfg.GOMODCACHE, "cache/download/sumdb", file)
	os.MkdirAll(filepath.Dir(targ), 0777)
	lockedfile.Write(targ, bytes.NewReader(data), 0666)
}
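
Combining the file name built in Lookup (c.name + "/lookup/" + path@version) with the directory used by ReadCache shows where a given record ends up on disk. The sketch below makes two simplifying assumptions: escape is a hypothetical stand-in for module.EscapePath and module.EscapeVersion that only performs the uppercase-to-"!lowercase" rewrite without validation, and the result is a slash-separated path relative to $GOMODCACHE. The github.com/Azure/mymod module path is a made-up example.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// escape replaces each uppercase ASCII letter with '!' followed by its
// lowercase form, so that paths stay distinct on case-insensitive file systems.
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

// lookupCacheFile returns the cache location of a lookup record,
// relative to $GOMODCACHE.
func lookupCacheFile(dbName, modPath, version string) string {
	version = strings.TrimSuffix(version, "/go.mod")
	file := dbName + "/lookup/" + escape(modPath) + "@" + escape(version)
	return path.Join("cache/download/sumdb", file)
}

func main() {
	fmt.Println(lookupCacheFile("sum.golang.org", "golang.org/x/text", "v0.3.0/go.mod"))
	fmt.Println(lookupCacheFile("sum.golang.org", "github.com/Azure/mymod", "v1.0.0"))
}
```

Because the /go.mod suffix is trimmed before the name is built, the zip hash and the go.mod hash of one module version share a single cached record.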

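8. Related environment variables

Variable      Purpose
GOSUMDB       Selects the checksum database; defaults to sum.golang.org, set to off to disable it
GONOSUMDB     Module path patterns that are not looked up in sumdb
GOPRIVATE     Marks private modules; serves as the default value of GONOSUMDB
GOMODCACHE    Module cache directory; the sumdb cache lives inside it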

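9. Source file index

File                                      Responsibility
cmd/go/internal/modfetch/fetch.go         Reading and writing go.sum, verification entry point
cmd/go/internal/modfetch/sumdb.go         Adapter layer for the sumdb client
cmd/go/internal/modfetch/key.go           Built-in sumdb public key
golang.org/x/mod/sumdb/client.go          Core logic of the sumdb client
golang.org/x/mod/sumdb/dirhash/hash.go    Implementation of the h1 hash algorithm
golang.org/x/mod/sumdb/tlog/              Transparent log implementation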