
This article is based on an analysis of the Go 1.25.0 source code.
1. Overall flow

    Download module zip
        │
        ▼
    Compute zip hash (dirhash.HashZip)
        │
        ▼
    checkModSum(mod, hash)
        │
        ├─► present in go.sum and matches ─► pass
        │
        ├─► present in go.sum but does not match ─► SECURITY ERROR
        │
        └─► not present in go.sum
                │
                ▼
            useSumDB(mod)?
                │
                ├─► No (GOSUMDB=off or GONOSUMDB matches)
                │       │
                │       ▼
                │   add to go.sum
                │
                └─► Yes
                        │
                        ▼
                    checkSumDB(mod, hash)
                        │
                        ├─► sumdb returns a match ─► add to go.sum
                        │
                        └─► sumdb returns a mismatch ─► SECURITY ERROR
2. go.sum file format
Each line of go.sum contains three fields: module path, version, and hash.
golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
Each module version usually has two entries:
- the hash of the module content (the zip file)
- the hash of its go.mod file (the version field ends in `/go.mod`)

The parsing logic lives in cmd/go/internal/modfetch/fetch.go:
```go
// 577: src\cmd\go\internal\modfetch\fetch.go
func readGoSum(dst map[module.Version][]string, file string, data []byte) {
	lineno := 0
	for len(data) > 0 {
		var line []byte
		lineno++
		i := bytes.IndexByte(data, '\n')
		if i < 0 {
			line, data = data, nil
		} else {
			line, data = data[:i], data[i+1:]
		}
		f := strings.Fields(string(line))
		if len(f) == 0 {
			// blank line; skip it
			continue
		}
		if len(f) != 3 {
			if cfg.CmdName == "mod tidy" {
				// ignore malformed line so that go mod tidy can fix go.sum
				continue
			} else {
				base.Fatalf("malformed go.sum:\n%s:%d: wrong number of fields %v\n", file, lineno, len(f))
			}
		}
		if f[2] == emptyGoModHash {
			// Old bug; drop it.
			continue
		}
		mod := module.Version{Path: f[0], Version: f[1]}
		dst[mod] = append(dst[mod], f[2])
	}
}
```
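To make the three-field format concrete, here is a minimal standalone sketch of the same parsing idea. It is not the cmd/go implementation: `sumEntry` and `parseGoSum` are hypothetical names, it returns an error instead of calling `base.Fatalf`, and it leaves out the `emptyGoModHash` and `go mod tidy` special cases.

```go
package main

import (
	"fmt"
	"strings"
)

// sumEntry is a hypothetical key type for this sketch;
// cmd/go uses module.Version instead.
type sumEntry struct {
	Path, Version string
}

// parseGoSum splits go.sum text into a map from (path, version) to hashes.
// Blank lines are skipped; any other line must have exactly three fields.
func parseGoSum(data string) (map[sumEntry][]string, error) {
	sums := make(map[sumEntry][]string)
	for i, line := range strings.Split(data, "\n") {
		f := strings.Fields(line)
		if len(f) == 0 {
			continue
		}
		if len(f) != 3 {
			return nil, fmt.Errorf("line %d: wrong number of fields %d", i+1, len(f))
		}
		key := sumEntry{Path: f[0], Version: f[1]}
		sums[key] = append(sums[key], f[2])
	}
	return sums, nil
}

func main() {
	data := "golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg=\n" +
		"golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=\n"
	sums, err := parseGoSum(data)
	if err != nil {
		panic(err)
	}
	for key, hashes := range sums {
		fmt.Println(key.Path, key.Version, hashes)
	}
}
```

Note that the zip entry and the go.mod entry end up under different keys, because the `/go.mod` suffix is part of the version field.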
3. Hash algorithm: h1
Go identifies its hash algorithm with the `h1:` prefix. The core implementation is in the golang.org/x/mod/sumdb/dirhash package:
```go
// 44: src\cmd\vendor\golang.org\x\mod\sumdb\dirhash\hash.go
func Hash1(files []string, open func(string) (io.ReadCloser, error)) (string, error) {
	h := sha256.New()
	files = append([]string(nil), files...)
	slices.Sort(files)
	for _, file := range files {
		if strings.Contains(file, "\n") {
			return "", errors.New("dirhash: filenames with newlines are not supported")
		}
		r, err := open(file)
		if err != nil {
			return "", err
		}
		hf := sha256.New()
		_, err = io.Copy(hf, r)
		r.Close()
		if err != nil {
			return "", err
		}
		// Two spaces between hash and name, matching sha256sum output.
		fmt.Fprintf(h, "%x  %s\n", hf.Sum(nil), file)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(h.Sum(nil)), nil
}
```
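Algorithm:
- Sort the file names.
- Compute the SHA-256 of each file's content.
- Emit one line per file in the form `<hex hash>  <filename>\n` (two spaces, the same layout `sha256sum` prints).
- Compute the SHA-256 of all those lines concatenated.
- Base64-encode the result and prepend `h1:`.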
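The steps above can be reproduced with the standard library alone. The following is a sketch, not the dirhash package itself: `hashH1` is a hypothetical helper that takes file contents from an in-memory map, and the `m@v1.0.0/` prefix on the names is illustrative.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hashH1 computes an h1-style hash over in-memory files:
// sorted names, one "<hex sha256>  <name>\n" line per file,
// then SHA-256 of the summary, base64-encoded with an "h1:" prefix.
func hashH1(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	summary := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256(files[name])
		fmt.Fprintf(summary, "%x  %s\n", sum[:], name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil))
}

func main() {
	files := map[string][]byte{
		"m@v1.0.0/go.mod": []byte("module m\n"),
		"m@v1.0.0/a.go":   []byte("package a\n"),
	}
	fmt.Println(hashH1(files))
}
```

Because the names are sorted before hashing, the result does not depend on the order in which files are discovered. A SHA-256 digest is 32 bytes, so the base64 part is always 44 characters long.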
For zip files, the HashZip function is used:
```go
// 111: src\cmd\vendor\golang.org\x\mod\sumdb\dirhash\hash.go
func HashZip(zipfile string, hash Hash) (string, error) {
	z, err := zip.OpenReader(zipfile)
	if err != nil {
		return "", err
	}
	defer z.Close()
	var files []string
	zfiles := make(map[string]*zip.File)
	for _, file := range z.File {
		files = append(files, file.Name)
		zfiles[file.Name] = file
	}
	zipOpen := func(name string) (io.ReadCloser, error) {
		f := zfiles[name]
		if f == nil {
			return nil, fmt.Errorf("file %q not found in zip", name) // should never happen
		}
		return f.Open()
	}
	return hash(files, zipOpen)
}
```
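One consequence of HashZip hashing entry names and contents, rather than the archive bytes, is that two archives with the same entries hash identically even if the entries are stored in a different order. The sketch below illustrates this with the standard library only; `buildZip` and `hashZipBytes` are hypothetical helpers that work on in-memory archives instead of a file on disk.

```go
package main

import (
	"archive/zip"
	"bytes"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"io"
	"sort"
)

// buildZip writes the given files into an in-memory zip archive.
// Map iteration order is random, so entry order may differ between calls.
func buildZip(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, content := range files {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(w, content); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// hashZipBytes applies the h1 summary to the entries of a zip archive.
func hashZipBytes(data []byte) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	entries := make(map[string]*zip.File)
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
		entries[f.Name] = f
	}
	sort.Strings(names)
	summary := sha256.New()
	for _, name := range names {
		rc, err := entries[name].Open()
		if err != nil {
			return "", err
		}
		hf := sha256.New()
		_, err = io.Copy(hf, rc)
		rc.Close()
		if err != nil {
			return "", err
		}
		fmt.Fprintf(summary, "%x  %s\n", hf.Sum(nil), name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil)), nil
}

func main() {
	data, err := buildZip(map[string]string{
		"m@v1.0.0/go.mod": "module m\n",
		"m@v1.0.0/a.go":   "package a\n",
	})
	if err != nil {
		panic(err)
	}
	sum, err := hashZipBytes(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(sum)
}
```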
4. Verification flow
4.1 Entry point: checkModSum
When a module is downloaded, checkModSum is called to verify it:
```go
// 739: src\cmd\go\internal\modfetch\fetch.go
func checkModSum(mod module.Version, h string) error {
	// We lock goSum when manipulating it,
	// but we arrange to release the lock when calling checkSumDB,
	// so that parallel calls to checkModHash can execute parallel calls
	// to checkSumDB.
	// Check whether mod+h is listed in go.sum already. If so, we're done.
	goSum.mu.Lock()
	inited, err := initGoSum()
	if err != nil {
		goSum.mu.Unlock()
		return err
	}
	done := inited && haveModSumLocked(mod, h)
	if inited {
		st := goSum.status[modSum{mod, h}]
		st.used = true
		goSum.status[modSum{mod, h}] = st
	}
	goSum.mu.Unlock()
	if done {
		return nil
	}
	// Not listed, so we want to add them.
	// Consult checksum database if appropriate.
	if useSumDB(mod) {
		// Calls base.Fatalf if mismatch detected.
		if err := checkSumDB(mod, h); err != nil {
			return err
		}
	}
	// Add mod+h to go.sum, if it hasn't appeared already.
	if inited {
		goSum.mu.Lock()
		addModSumLocked(mod, h)
		st := goSum.status[modSum{mod, h}]
		st.dirty = true
		goSum.status[modSum{mod, h}] = st
		goSum.mu.Unlock()
	}
	return nil
}
```
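Flow:
- First check whether go.sum already contains a matching hash.
- If not, and useSumDB(mod) allows it, query the checksum database (sumdb is a remote HTTP service operated by the Go team).
- Once verification passes, add the new hash to go.sum.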
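The three steps can be modeled without any of the cmd/go machinery. The sketch below is a simplification under stated assumptions: `modKey` and `verifier` are hypothetical types, the sumdb query is a plain function that returns only hash strings, locking is omitted, and mismatches are returned as errors rather than reported through `base.Fatalf`.

```go
package main

import (
	"fmt"
	"strings"
)

// modKey and verifier are hypothetical types for this sketch only.
type modKey struct{ Path, Version string }

type verifier struct {
	goSum map[modKey][]string
	// lookup stands in for the sumdb query; nil models useSumDB(mod) == false.
	lookup func(k modKey) ([]string, error)
}

// check runs the go.sum check, the optional sumdb check, and the final add.
func (v *verifier) check(k modKey, h string) error {
	// Step 1: already recorded? A different h1: entry is a mismatch.
	for _, known := range v.goSum[k] {
		if known == h {
			return nil
		}
		if strings.HasPrefix(known, "h1:") {
			return fmt.Errorf("%s@%s: checksum mismatch with go.sum", k.Path, k.Version)
		}
	}
	// Step 2: consult the checksum database when it is enabled.
	if v.lookup != nil {
		hashes, err := v.lookup(k)
		if err != nil {
			return err
		}
		for _, known := range hashes {
			if known == h {
				break
			}
			if strings.HasPrefix(known, "h1:") {
				return fmt.Errorf("%s@%s: checksum mismatch with sumdb", k.Path, k.Version)
			}
		}
	}
	// Step 3: record the verified hash.
	v.goSum[k] = append(v.goSum[k], h)
	return nil
}

func main() {
	k := modKey{"example.com/m", "v1.0.0"}
	v := &verifier{
		goSum:  map[modKey][]string{},
		lookup: func(modKey) ([]string, error) { return []string{"h1:aaa"}, nil },
	}
	fmt.Println(v.check(k, "h1:bbb")) // rejected by the fake sumdb
	fmt.Println(v.check(k, "h1:aaa")) // accepted and recorded
	fmt.Println(v.goSum[k])
}
```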
4.2 Hash mismatch detection
haveModSumLocked reports whether go.sum (or a workspace go.work.sum) already records the downloaded hash. If an `h1:` entry for the same module version records a different hash, it aborts through base.Fatalf:
```go
// 788: src\cmd\go\internal\modfetch\fetch.go
func haveModSumLocked(mod module.Version, h string) bool {
	sumFileName := "go.sum"
	if strings.HasSuffix(GoSumFile, "go.work.sum") {
		sumFileName = "go.work.sum"
	}
	for _, vh := range goSum.m[mod] {
		if h == vh {
			return true
		}
		if strings.HasPrefix(vh, "h1:") {
			base.Fatalf("verifying %s@%s: checksum mismatch\n\tdownloaded: %v\n\t%s: %v"+goSumMismatch, mod.Path, mod.Version, h, sumFileName, vh)
		}
	}
	// Also check workspace sums.
	foundMatch := false
	// Check sums from all files in case there are conflicts between
	// the files.
	for goSumFile, goSums := range goSum.w {
		for _, vh := range goSums[mod] {
			if h == vh {
				foundMatch = true
			} else if strings.HasPrefix(vh, "h1:") {
				base.Fatalf("verifying %s@%s: checksum mismatch\n\tdownloaded: %v\n\t%s: %v"+goSumMismatch, mod.Path, mod.Version, h, goSumFile, vh)
			}
		}
	}
	return foundMatch
}
```
When the hashes do not match, the following security error is printed:
```go
// 1045: src\cmd\go\internal\modfetch\fetch.go
const goSumMismatch = `
SECURITY ERROR
This download does NOT match an earlier download recorded in go.sum.
The bits may have been replaced on the origin server, or an attacker may
have intercepted the download attempt.
For more information, see 'go help module-auth'.
`
```
5. Checksum Database (sumdb)
5.1 Whether to use sumdb
The useSumDB function decides whether the checksum database must be queried:
```go
// 35: src\cmd\go\internal\modfetch\sumdb.go
func useSumDB(mod module.Version) bool {
	if mod.Path == "golang.org/toolchain" {
		must := true
		// Downloaded toolchains cannot be listed in go.sum,
		// so we require checksum database lookups even if
		// GOSUMDB=off or GONOSUMDB matches the pattern.
		// If GOSUMDB=off, then the eventual lookup will fail
		// with a good error message.
		// Exception #1: using GOPROXY=file:// to test a distpack.
		if strings.HasPrefix(cfg.GOPROXY, "file://") && !strings.ContainsAny(cfg.GOPROXY, ",|") {
			must = false
		}
		// Exception #2: the Go proxy+checksum database cannot check itself
		// while doing the initial download.
		if strings.Contains(os.Getenv("GIT_HTTP_USER_AGENT"), "proxy.golang.org") {
			must = false
		}
		// Another potential exception would be GOPROXY=direct,
		// but that would make toolchain downloads only as secure
		// as HTTPS, and in particular they'd be susceptible to MITM
		// attacks on systems with less-than-trustworthy root certificates.
		// The checksum database provides a stronger guarantee,
		// so we don't make that exception.
		// Otherwise, require the checksum database.
		if must {
			return true
		}
	}
	return cfg.GOSUMDB != "off" && !module.MatchPrefixPatterns(cfg.GONOSUMDB, mod.Path)
}
```
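Key points:
- `GOSUMDB=off`: disables sumdb lookups entirely, except for `golang.org/toolchain`, where the lookup is still required and then fails with an error.
- `GONOSUMDB`: module paths matching these patterns are not looked up in sumdb.
- `GOPRIVATE`: private modules are not looked up by default.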
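The last line of useSumDB relies on prefix-pattern matching. Below is a simplified re-implementation of that idea for illustration. It is a sketch rather than the code of `module.MatchPrefixPatterns`: each comma-separated glob is matched with `path.Match` against a prefix of the module path that has the same number of path elements. The function names and sample patterns are made up, and the `golang.org/toolchain` special case is left out.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchPrefixPatterns reports whether any comma-separated glob matches
// a leading run of path elements of target.
func matchPrefixPatterns(globs, target string) bool {
	elems := strings.Split(target, "/")
	for _, glob := range strings.Split(globs, ",") {
		if glob == "" {
			continue
		}
		// A glob with n path elements is compared with the first n elements of target.
		n := strings.Count(glob, "/") + 1
		if n > len(elems) {
			continue
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

// useSumDB mirrors only the final return statement of the real function.
func useSumDB(gosumdb, gonosumdb, modPath string) bool {
	return gosumdb != "off" && !matchPrefixPatterns(gonosumdb, modPath)
}

func main() {
	patterns := "*.corp.example.com,rsc.io/private"
	for _, p := range []string{
		"git.corp.example.com/xyzzy",
		"rsc.io/private/quux",
		"rsc.io/quote",
	} {
		fmt.Println(p, useSumDB("sum.golang.org", patterns, p))
	}
}
```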
5.2 Default sumdb public key
Go ships with the public key of sum.golang.org built in:
```go
// 7: src\cmd\go\internal\modfetch\key.go
var knownGOSUMDB = map[string]string{
	"sum.golang.org": "sum.golang.org+033de0ae+Ac4zctda0e5eza+HJyk9SxEdh+s3Ux18htTTAD8OuAn8",
}
```
5.3 Connecting to sumdb
The dbDial function initializes the sumdb client:
```go
// 89: src\cmd\go\internal\modfetch\sumdb.go
func dbDial() (dbName string, db *sumdb.Client, err error) {
	// $GOSUMDB can be "key" or "key url",
	// and the key can be a full verifier key
	// or a host on our list of known keys.
	// Special case: sum.golang.google.cn
	// is an alias, reachable inside mainland China,
	// for sum.golang.org. If there are more
	// of these we should add a map like knownGOSUMDB.
	gosumdb := cfg.GOSUMDB
	if gosumdb == "sum.golang.google.cn" {
		gosumdb = "sum.golang.org https://sum.golang.google.cn"
	}
	if gosumdb == "off" {
		return "", nil, fmt.Errorf("checksum database disabled by GOSUMDB=off")
	}
	key := strings.Fields(gosumdb)
	if len(key) >= 1 {
		if k := knownGOSUMDB[key[0]]; k != "" {
			key[0] = k
		}
	}
	if len(key) == 0 {
		return "", nil, fmt.Errorf("missing GOSUMDB")
	}
	if len(key) > 2 {
		return "", nil, fmt.Errorf("invalid GOSUMDB: too many fields")
	}
	vkey, err := note.NewVerifier(key[0])
	if err != nil {
		return "", nil, fmt.Errorf("invalid GOSUMDB: %v", err)
	}
	name := vkey.Name()
	// No funny business in the database name.
	direct, err := url.Parse("https://" + name)
	if err != nil || strings.HasSuffix(name, "/") || *direct != (url.URL{Scheme: "https", Host: direct.Host, Path: direct.Path, RawPath: direct.RawPath}) || direct.RawPath != "" || direct.Host == "" {
		return "", nil, fmt.Errorf("invalid sumdb name (must be host[/path]): %s %+v", name, *direct)
	}
	// Determine how to get to database.
	var base *url.URL
	if len(key) >= 2 {
		// Use explicit alternate URL listed in $GOSUMDB,
		// bypassing both the default URL derivation and any proxies.
		u, err := url.Parse(key[1])
		if err != nil {
			return "", nil, fmt.Errorf("invalid GOSUMDB URL: %v", err)
		}
		base = u
	}
	return name, sumdb.NewClient(&dbClient{key: key[0], name: name, direct: direct, base: base}), nil
}
```
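The field handling in dbDial can be tried out in isolation. The sketch below only splits the GOSUMDB value and the verifier key (`<name>+<hash>+<key data>`) into their parts. It is an approximation: `dbConfig` and `parseGOSUMDB` are hypothetical names, and the real code delegates key validation to `note.NewVerifier`, which also decodes the key and checks the key hash, none of which is done here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// knownKeys copies the single knownGOSUMDB entry quoted above.
var knownKeys = map[string]string{
	"sum.golang.org": "sum.golang.org+033de0ae+Ac4zctda0e5eza+HJyk9SxEdh+s3Ux18htTTAD8OuAn8",
}

// dbConfig is a hypothetical result type for this sketch.
type dbConfig struct {
	Name    string // database name, the first field of the verifier key
	KeyHash string // hex key hash, the second field of the verifier key
	URL     string // explicit alternate URL, empty when not given
}

// parseGOSUMDB splits a GOSUMDB value of the form "key" or "key url".
func parseGOSUMDB(value string) (dbConfig, error) {
	if value == "sum.golang.google.cn" {
		value = "sum.golang.org https://sum.golang.google.cn"
	}
	if value == "off" {
		return dbConfig{}, errors.New("checksum database disabled by GOSUMDB=off")
	}
	f := strings.Fields(value)
	if len(f) == 0 {
		return dbConfig{}, errors.New("missing GOSUMDB")
	}
	if len(f) > 2 {
		return dbConfig{}, errors.New("invalid GOSUMDB: too many fields")
	}
	key := f[0]
	if k, ok := knownKeys[key]; ok {
		key = k // a bare host name expands to its built-in verifier key
	}
	// The key data is base64 and may itself contain '+', so cut only twice.
	name, rest, ok := strings.Cut(key, "+")
	if !ok {
		return dbConfig{}, fmt.Errorf("invalid GOSUMDB: malformed verifier key %q", key)
	}
	keyHash, _, ok := strings.Cut(rest, "+")
	if !ok {
		return dbConfig{}, fmt.Errorf("invalid GOSUMDB: malformed verifier key %q", key)
	}
	cfg := dbConfig{Name: name, KeyHash: keyHash}
	if len(f) == 2 {
		cfg.URL = f[1]
	}
	return cfg, nil
}

func main() {
	for _, v := range []string{"sum.golang.org", "sum.golang.google.cn", "off"} {
		cfg, err := parseGOSUMDB(v)
		fmt.Printf("%q -> %+v, err=%v\n", v, cfg, err)
	}
}
```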
5.4 Querying sumdb
checkSumDB queries sumdb and verifies the hash:
```go
// 831: src\cmd\go\internal\modfetch\fetch.go
func checkSumDB(mod module.Version, h string) error {
	modWithoutSuffix := mod
	noun := "module"
	if before, found := strings.CutSuffix(mod.Version, "/go.mod"); found {
		noun = "go.mod"
		modWithoutSuffix.Version = before
	}
	db, lines, err := lookupSumDB(mod)
	if err != nil {
		return module.VersionError(modWithoutSuffix, fmt.Errorf("verifying %s: %v", noun, err))
	}
	have := mod.Path + " " + mod.Version + " " + h
	prefix := mod.Path + " " + mod.Version + " h1:"
	for _, line := range lines {
		if line == have {
			return nil
		}
		if strings.HasPrefix(line, prefix) {
			return module.VersionError(modWithoutSuffix, fmt.Errorf("verifying %s: checksum mismatch\n\tdownloaded: %v\n\t%s: %v"+sumdbMismatch, noun, h, db, line[len(prefix)-len("h1:"):]))
		}
	}
	return nil
}
```
lookupSumDB calls the Lookup method of the sumdb client:
```go
// 69: src\cmd\go\internal\modfetch\sumdb.go
func lookupSumDB(mod module.Version) (dbname string, lines []string, err error) {
	dbOnce.Do(func() {
		dbName, db, dbErr = dbDial()
	})
	if dbErr != nil {
		return "", nil, dbErr
	}
	lines, err = db.Lookup(mod.Path, mod.Version)
	return dbName, lines, err
}
```
6. Core logic of sumdb.Client
6.1 The Lookup method
```go
// 209: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) Lookup(path, vers string) (lines []string, err error) {
	atomic.StoreUint32(&c.didLookup, 1)
	if c.skip(path) {
		return nil, ErrGONOSUMDB
	}
	defer func() {
		if err != nil {
			err = fmt.Errorf("%s@%s: %v", path, vers, err)
		}
	}()
	if err := c.init(); err != nil {
		return nil, err
	}
	// Prepare encoded cache filename / URL.
	epath, err := module.EscapePath(path)
	if err != nil {
		return nil, err
	}
	evers, err := module.EscapeVersion(strings.TrimSuffix(vers, "/go.mod"))
	if err != nil {
		return nil, err
	}
	remotePath := "/lookup/" + epath + "@" + evers
	file := c.name + remotePath
	// Fetch the data.
	// The lookupCache avoids redundant ReadCache/GetURL operations
	// (especially since go.sum lines tend to come in pairs for a given
	// path and version) and also avoids having multiple of the same
	// request in flight at once.
	type cached struct {
		data []byte
		err  error
	}
	result := c.record.Do(file, func() interface{} {
		// Try the on-disk cache, or else get from web.
		writeCache := false
		data, err := c.ops.ReadCache(file)
		if err != nil {
			data, err = c.ops.ReadRemote(remotePath)
			if err != nil {
				return cached{nil, err}
			}
			writeCache = true
		}
		// Validate the record before using it for anything.
		id, text, treeMsg, err := tlog.ParseRecord(data)
		if err != nil {
			return cached{nil, err}
		}
		if err := c.mergeLatest(treeMsg); err != nil {
			return cached{nil, err}
		}
		if err := c.checkRecord(id, text); err != nil {
			return cached{nil, err}
		}
		// Now that we've validated the record,
		// save it to the on-disk cache (unless that's where it came from).
		if writeCache {
			c.ops.WriteCache(file, data)
		}
		return cached{data, nil}
	}).(cached)
	if result.err != nil {
		return nil, result.err
	}
	// Extract the lines for the specific version we want
	// (with or without /go.mod).
	prefix := path + " " + vers + " "
	var hashes []string
	for _, line := range strings.Split(string(result.data), "\n") {
		if strings.HasPrefix(line, prefix) {
			hashes = append(hashes, line)
		}
	}
	return hashes, nil
}
```
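Flow:
- Check the local cache first.
- On a cache miss, request the record from the remote server.
- Verify the record's signature and consistency.
- Cache the record once it has been verified.
- Extract the matching hash lines.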
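The request path and the cache file name are both derived from the escaped module path and version. The sketch below illustrates that derivation under a simplifying assumption: `escape` only applies the rule that an upper-case ASCII letter becomes `!` followed by its lower-case form, whereas the real `module.EscapePath` and `module.EscapeVersion` also validate their input. `lookupPaths` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// escape replaces every upper-case ASCII letter with '!' plus its
// lower-case form, so names stay distinct on case-insensitive file systems.
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

// lookupPaths returns the remote request path and the cache file name
// for a module version, following the string layout used by Lookup.
func lookupPaths(dbName, modPath, version string) (remotePath, cacheFile string) {
	// A "/go.mod" version shares one record with the zip version.
	version = strings.TrimSuffix(version, "/go.mod")
	remotePath = "/lookup/" + escape(modPath) + "@" + escape(version)
	return remotePath, dbName + remotePath
}

func main() {
	remote, file := lookupPaths("sum.golang.org", "github.com/Azure/azure-sdk-for-go", "v1.0.0/go.mod")
	fmt.Println(remote)
	fmt.Println(file)
}
```

Because the `/go.mod` suffix is trimmed, the zip hash and the go.mod hash of one version are served by a single record, which is why Lookup filters the returned lines by prefix afterwards.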
6.2 Transparent log verification
sumdb uses a Merkle tree to implement a transparent log, which ensures that the server cannot tamper with history undetected.
checkRecord verifies the record hash:
```go
// 478: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) checkRecord(id int64, data []byte) error {
	c.latestMu.Lock()
	latest := c.latest
	c.latestMu.Unlock()
	if id >= latest.N {
		return fmt.Errorf("cannot validate record %d in tree of size %d", id, latest.N)
	}
	hashes, err := tlog.TileHashReader(latest, &c.tileReader).ReadHashes([]int64{tlog.StoredHashIndex(0, id)})
	if err != nil {
		return err
	}
	if hashes[0] == tlog.RecordHash(data) {
		return nil
	}
	return fmt.Errorf("cannot authenticate record data in server response")
}
```
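To show why a record hash can be trusted once the tree hash is trusted, here is a small Merkle inclusion check. It is a sketch of the general technique, not of the tile-based reader used by the client. It assumes the RFC 6962 style hashing that the tlog package follows (leaf hash = SHA-256 of a 0x00 byte plus the record, interior hash = SHA-256 of a 0x01 byte plus both children) and, for brevity, a tree whose size is a power of two. All function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hash [32]byte

// recordHash is the leaf hash: SHA-256(0x00 || data).
func recordHash(data []byte) hash {
	return hash(sha256.Sum256(append([]byte{0x00}, data...)))
}

// nodeHash is the interior hash: SHA-256(0x01 || left || right).
func nodeHash(left, right hash) hash {
	var buf [1 + 32 + 32]byte
	buf[0] = 0x01
	copy(buf[1:], left[:])
	copy(buf[33:], right[:])
	return hash(sha256.Sum256(buf[:]))
}

// treeHash computes the root for a power-of-two number of leaves.
func treeHash(leaves []hash) hash {
	if len(leaves) == 1 {
		return leaves[0]
	}
	mid := len(leaves) / 2
	return nodeHash(treeHash(leaves[:mid]), treeHash(leaves[mid:]))
}

// verifyInclusion recomputes the root from a leaf, its index, and the
// sibling hashes on the path to the root (listed bottom-up).
func verifyInclusion(id int, leaf hash, proof []hash, root hash) bool {
	h := leaf
	for _, sibling := range proof {
		if id%2 == 0 {
			h = nodeHash(h, sibling)
		} else {
			h = nodeHash(sibling, h)
		}
		id /= 2
	}
	return h == root
}

func main() {
	records := [][]byte{[]byte("r0"), []byte("r1"), []byte("r2"), []byte("r3")}
	leaves := make([]hash, len(records))
	for i, r := range records {
		leaves[i] = recordHash(r)
	}
	root := treeHash(leaves)
	// Proof for record 2: its sibling leaf 3, then the hash of leaves 0 and 1.
	proof := []hash{leaves[3], nodeHash(leaves[0], leaves[1])}
	fmt.Println(verifyInclusion(2, recordHash([]byte("r2")), proof, root))
	fmt.Println(verifyInclusion(2, recordHash([]byte("forged")), proof, root))
}
```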
checkTrees detects a fork in the tree timeline (server tampering):
```go
// 426: src\cmd\vendor\golang.org\x\mod\sumdb\client.go
func (c *Client) checkTrees(older tlog.Tree, olderNote []byte, newer tlog.Tree, newerNote []byte) error {
	thr := tlog.TileHashReader(newer, &c.tileReader)
	h, err := tlog.TreeHash(older.N, thr)
	if err != nil {
		if older.N == newer.N {
			return fmt.Errorf("checking tree#%d: %v", older.N, err)
		}
		return fmt.Errorf("checking tree#%d against tree#%d: %v", older.N, newer.N, err)
	}
	if h == older.Hash {
		return nil
	}
	// Detected a fork in the tree timeline.
	// Start by reporting the inconsistent signed tree notes.
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "SECURITY ERROR\n")
	fmt.Fprintf(&buf, "go.sum database server misbehavior detected!\n\n")
	indent := func(b []byte) []byte {
		return bytes.Replace(b, []byte("\n"), []byte("\n\t"), -1)
	}
	fmt.Fprintf(&buf, "old database:\n\t%s\n", indent(olderNote))
	fmt.Fprintf(&buf, "new database:\n\t%s\n", indent(newerNote))
	// ... generate the inconsistency proof ...
	c.ops.SecurityError(buf.String())
	return ErrSecurity
}
```
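The idea behind checkTrees is that the hash of an older tree must be reproducible from the newer tree. The sketch below demonstrates that property in the most naive way: it recomputes the hash of the first N records of the newer log and compares it with the remembered older hash. This is an illustration only. The real client never needs all records, because it reads a handful of hashes from tiles. The function names are hypothetical, and the tree shape assumes the RFC 6962 split rule (largest power of two smaller than the size).

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

type hash [32]byte

func recordHash(data []byte) hash {
	return hash(sha256.Sum256(append([]byte{0x00}, data...)))
}

func nodeHash(left, right hash) hash {
	buf := append([]byte{0x01}, left[:]...)
	buf = append(buf, right[:]...)
	return hash(sha256.Sum256(buf))
}

// treeHash computes the root of a tree with any size >= 1,
// splitting at the largest power of two smaller than the size.
func treeHash(leaves []hash) hash {
	if len(leaves) == 1 {
		return leaves[0]
	}
	k := 1
	for k*2 < len(leaves) {
		k *= 2
	}
	return nodeHash(treeHash(leaves[:k]), treeHash(leaves[k:]))
}

// logHash hashes a whole log of at least one record.
func logHash(records [][]byte) hash {
	leaves := make([]hash, len(records))
	for i, r := range records {
		leaves[i] = recordHash(r)
	}
	return treeHash(leaves)
}

// checkFork returns an error when the first olderN records of the newer
// log do not reproduce the previously observed tree hash.
func checkFork(olderN int, olderHash hash, newer [][]byte) error {
	if olderN < 1 || olderN > len(newer) {
		return fmt.Errorf("cannot check tree#%d against tree#%d", olderN, len(newer))
	}
	if logHash(newer[:olderN]) != olderHash {
		return errors.New("SECURITY ERROR: fork detected in tree timeline")
	}
	return nil
}

func main() {
	old := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	oldHash := logHash(old)
	honest := [][]byte{[]byte("a"), []byte("b"), []byte("c"), []byte("d"), []byte("e")}
	forked := [][]byte{[]byte("a"), []byte("X"), []byte("c"), []byte("d"), []byte("e")}
	fmt.Println(checkFork(len(old), oldHash, honest))
	fmt.Println(checkFork(len(old), oldHash, forked))
}
```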
7. Local cache
sumdb lookup results are cached under the $GOMODCACHE/cache/download/sumdb/ directory:
```go
// 286: src\cmd\go\internal\modfetch\sumdb.go
func (*dbClient) ReadCache(file string) ([]byte, error) {
	targ := filepath.Join(cfg.GOMODCACHE, "cache/download/sumdb", file)
	data, err := lockedfile.Read(targ)
	// lockedfile.Write does not atomically create the file with contents.
	// There is a moment between file creation and locking the file for writing,
	// during which the empty file can be locked for reading.
	// Treat observing an empty file as file not found.
	if err == nil && len(data) == 0 {
		err = &fs.PathError{Op: "read", Path: targ, Err: fs.ErrNotExist}
	}
	return data, err
}

// WriteCache updates cached lookups or tiles.
func (*dbClient) WriteCache(file string, data []byte) {
	targ := filepath.Join(cfg.GOMODCACHE, "cache/download/sumdb", file)
	os.MkdirAll(filepath.Dir(targ), 0777)
	lockedfile.Write(targ, bytes.NewReader(data), 0666)
}
```
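The "empty file means not found" rule is easy to miss, so here is a standalone sketch of the same read/write pair. It makes one important substitution: `lockedfile` is internal to cmd/go, so plain `os.ReadFile` and `os.WriteFile` are used instead, which means this version has no file locking. The function names and the sample file name are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// cachePath maps a cache file name to its location under the module cache.
func cachePath(gomodcache, file string) string {
	return filepath.Join(gomodcache, "cache/download/sumdb", file)
}

// readCache treats an empty file the same as a missing one.
func readCache(gomodcache, file string) ([]byte, error) {
	targ := cachePath(gomodcache, file)
	data, err := os.ReadFile(targ)
	if err == nil && len(data) == 0 {
		err = &fs.PathError{Op: "read", Path: targ, Err: fs.ErrNotExist}
	}
	return data, err
}

// writeCache creates parent directories as needed and stores the data.
func writeCache(gomodcache, file string, data []byte) error {
	targ := cachePath(gomodcache, file)
	if err := os.MkdirAll(filepath.Dir(targ), 0o777); err != nil {
		return err
	}
	return os.WriteFile(targ, data, 0o666)
}

// demo exercises a miss, an empty entry, and a hit in a throwaway directory.
func demo() error {
	root, err := os.MkdirTemp("", "gomodcache")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)
	file := "sum.golang.org/lookup/example.com/m@v1.0.0"
	if _, err := readCache(root, file); !errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("missing entry: got %v", err)
	}
	if err := writeCache(root, file, nil); err != nil {
		return err
	}
	if _, err := readCache(root, file); !errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("empty entry: got %v", err)
	}
	if err := writeCache(root, file, []byte("record")); err != nil {
		return err
	}
	data, err := readCache(root, file)
	if err != nil || string(data) != "record" {
		return fmt.Errorf("cached entry: got %q, %v", data, err)
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		panic(err)
	}
	fmt.Println("cache round trip ok")
}
```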
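8. Related environment variables
| Variable | Purpose |
|---|---|
| `GOSUMDB` | Selects the checksum database; defaults to `sum.golang.org`; set to `off` to disable it |
| `GONOSUMDB` | Module path patterns that are not looked up in sumdb |
| `GOPRIVATE` | Private modules; implicitly sets `GONOSUMDB` |
| `GOMODCACHE` | Module cache directory; the sumdb cache also lives in it |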
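9. Source file index
| File | Responsibility |
|---|---|
| `cmd/go/internal/modfetch/fetch.go` | Reading and writing go.sum; verification entry point |
| `cmd/go/internal/modfetch/sumdb.go` | sumdb client adapter layer |
| `cmd/go/internal/modfetch/key.go` | Built-in sumdb public key |
| `golang.org/x/mod/sumdb/client.go` | Core sumdb client logic |
| `golang.org/x/mod/sumdb/dirhash/hash.go` | h1 hash algorithm implementation |
| `golang.org/x/mod/sumdb/tlog/` | Transparent log implementation |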