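This document describes the design, architecture, and full code of three parts of the golibs project: the Makefile (Protobuf compilation), protocol (the gRPC protocol layer), and registry (etcd-based service registration and discovery).
📑 Contents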
- 1. Overall architecture
- 2. Makefile: Protobuf compilation
- 3. protocol module
  - 3.1 types: shared Protobuf type definitions
  - 3.2 ip: IP geolocation service (gRPC)
  - 3.3 interceptor: gRPC interceptors
- 4. registry module
  - 4.1 Core interfaces and data structures
  - 4.2 etcd registry implementation
  - 4.3 Watcher: service watching
  - 4.4 discover: gRPC resolver
- 5. End-to-end call flow
- 6. Quick start
1. Overall architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ golibs project │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────┐ protoc compile ┌──────────────────────────────────┐ │
│ │ Makefile │ ───────────────▶ │ protocol/ │ │
│ └───────────┘ │ ┌──────────┐ ┌──────────────┐ │ │
│ │ │ types/ │ │ ip/ │ │ │
│ │ │ .proto │ │ .proto + Go │ │ │
│ │ │ .pb.go │ │ .pb.go │ │ │
│ │ └──────────┘ │ _grpc.pb.go │ │ │
│ │ │ client.go │ │ │
│ │ └──────────────┘ │ │
│ │ ┌──────────────────────────┐ │ │
│ │ │ interceptor/ │ │ │
│ │ │ logger / metadata / │ │ │
│ │ │ recovery │ │ │
│ │ └──────────────────────────┘ │ │
│ └──────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ registry/ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────────┐ │ │
│ │ │ Registry │ │ Watcher │ │ Service │ │ discover/ │ │ │
│ │ │ (etcd) │ │ (watch) │ │ (unmarshal)│ │ resolver │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────┐ │
│ │ etcd cluster │ │
│ └───────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
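Module responsibilities:
| Module | Responsibility |
|---|---|
| Makefile | Compiles .proto files to Go code with protoc |
| protocol/types | Defines Protobuf types shared across services (Error, Wrappers, Timestamp) |
| protocol/ip | gRPC service definition and client for IP geolocation |
| protocol/interceptor | gRPC unary interceptors (logging, metadata propagation, panic recovery) |
| registry | etcd-based service registration, deregistration, lease keep-alive, and discovery |
| registry/discover | Implements gRPC resolver.Builder / resolver.Resolver on top of etcd for client-side load balancing |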
2. Makefile: Protobuf compilation
The Makefile defines three targets: types and ip compile the Protobuf files, and tidy cleans up the module dependencies.
GOPATH:=$(shell go env GOPATH)
# Quote the pattern so the shell does not expand it before find sees it.
API_PROTO_FILES=$(shell find src -name '*.proto')

.PHONY: types
types:
	@protoc --proto_path=. \
		--proto_path=./protocol/types \
		--go_out=paths=source_relative:. \
		--go-errors_out=paths=source_relative:. \
		./protocol/types/error.proto
	@protoc --proto_path=. \
		--proto_path=./protocol/types \
		--go_out=paths=source_relative:. \
		--go-errors_out=paths=source_relative:. \
		./protocol/types/wrappers.proto
	@protoc --proto_path=. \
		--proto_path=./protocol/types \
		--go_out=paths=source_relative:. \
		--go-errors_out=paths=source_relative:. \
		./protocol/types/timestamp.proto

.PHONY: ip
ip:
	@protoc --proto_path=. \
		--go_out=paths=source_relative:. \
		./protocol/ip/ip_message.proto
	@protoc --proto_path=. \
		--proto_path=./protocol/ip \
		--go-grpc_out=. \
		./protocol/ip/ip_service.proto

.PHONY: tidy
tidy:
	@go mod tidy
Compilation flow
make types make ip
│ │
▼ ▼
┌───────────────┐ ┌─────────────────┐
│ error.proto │──▶ error.pb.go │ ip_message.proto│──▶ ip_message.pb.go
│ wrappers.proto│──▶ wrappers.pb.go │ ip_service.proto│──▶ ip_service_grpc.pb.go
│ timestamp.proto──▶ timestamp.pb.go └─────────────────┘
└───────────────┘
│ │
│ protoc plugins │
├── --go_out (generates message types) ├── --go_out (generates message types)
└── --go-errors_out (generates error codes) └── --go-grpc_out (generates gRPC service stubs)
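Key parameters:
| Parameter | Description |
|---|---|
| --proto_path=. | Uses the project root as a proto search path |
| --proto_path=./protocol/types | Lets files inside the types package import each other |
| --go_out=paths=source_relative:. | Writes each generated .pb.go next to its .proto file |
| --go-errors_out=paths=source_relative:. | Generates custom error codes (used only for types) |
| --go-grpc_out=. | Generates the gRPC server and client stubs |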
3. protocol module
3.1 types: shared Protobuf type definitions
protocol/types/ defines three general-purpose .proto files that every gRPC service can share.
3.1.1 Error (unified error type)
syntax = "proto3";
package types;
option go_package = "gitee.com/ha666/golibs/protocol/types";
message Error {
// Error code, used to identify the kind of error
string code = 1;
// Detailed error message
string message = 2;
}
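Because this Error message travels inside the reply body, a caller ends up checking two things after each call: the transport-level error and the embedded business error. The following is a minimal sketch of that check, not code from golibs: `Reply` and `checkReply` are hypothetical stand-ins for a generated reply type and a helper.

```go
package main

import "fmt"

// Error mirrors the two fields of the types.Error message.
type Error struct {
	Code    string
	Message string
}

// Reply is a hypothetical stand-in for any generated reply that embeds Error.
type Reply struct {
	Error *Error
	City  string
}

// checkReply folds the transport error and the in-body business error
// into a single Go error so callers have one place to branch on.
func checkReply(r *Reply, rpcErr error) error {
	if rpcErr != nil {
		return fmt.Errorf("transport: %w", rpcErr)
	}
	if r != nil && r.Error != nil {
		return fmt.Errorf("business %s: %s", r.Error.Code, r.Error.Message)
	}
	return nil
}

func main() {
	failed := &Reply{Error: &Error{Code: "USER_NOT_FOUND", Message: "no such user"}}
	fmt.Println(checkReply(failed, nil))                   // business USER_NOT_FOUND: no such user
	fmt.Println(checkReply(&Reply{City: "Shanghai"}, nil)) // <nil>
}
```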
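📌 Design note: business error codes (for example "USER_NOT_FOUND") and their descriptions are carried inside the gRPC response body instead of being mapped onto gRPC status codes, so clients can handle business errors in one uniform way.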
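3.1.2 Wrappers (wrappers for primitive types)
These messages wrap the primitive types (double, float, int64, uint64, int32, uint32, bool, string, bytes) so that a field that was never set can be told apart from a field set to its zero value: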
message Int64Value {
int64 value = 1;
}
message StringValue {
string value = 1;
}
message BoolValue {
bool value = 1;
}
// ... plus DoubleValue, FloatValue, UInt64Value, Int32Value, UInt32Value, BytesValue
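To illustrate why a wrapper helps, here is a small sketch with plain Go structs. In generated code a message-typed field becomes a pointer, so nil means "not set". `Filter` and `describe` are hypothetical names used only for this example.

```go
package main

import "fmt"

// Int64Value mirrors the wrapper message above.
type Int64Value struct{ Value int64 }

// Filter is a hypothetical request; MinAge is optional, so it is a pointer.
type Filter struct {
	MinAge *Int64Value
}

// describe distinguishes "not set" (nil) from "set to zero".
func describe(f Filter) string {
	if f.MinAge == nil {
		return "min_age not set"
	}
	return fmt.Sprintf("min_age = %d", f.MinAge.Value)
}

func main() {
	fmt.Println(describe(Filter{}))                              // min_age not set
	fmt.Println(describe(Filter{MinAge: &Int64Value{Value: 0}})) // min_age = 0
}
```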
3.1.3 Timestamp
message Timestamp {
// Seconds since the Unix epoch (1970-01-01T00:00:00Z)
int64 seconds = 1;
// Non-negative fraction of a second, at nanosecond resolution
int32 nanos = 2;
}
This is similar to Google's google.protobuf.Timestamp, but it lives in the project's own types package so it can be paired with custom serialization logic.
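As a sketch of how the two fields map to Go's time.Time, assuming a plain struct with the same fields (the helpers `fromTime` and `toTime` are hypothetical and not part of golibs):

```go
package main

import (
	"fmt"
	"time"
)

// Timestamp mirrors the message above: whole seconds plus a nanosecond fraction.
type Timestamp struct {
	Seconds int64
	Nanos   int32
}

// fromTime splits a time.Time into seconds and nanoseconds.
func fromTime(t time.Time) Timestamp {
	return Timestamp{Seconds: t.Unix(), Nanos: int32(t.Nanosecond())}
}

// toTime rebuilds a UTC time.Time from the two fields.
func (ts Timestamp) toTime() time.Time {
	return time.Unix(ts.Seconds, int64(ts.Nanos)).UTC()
}

func main() {
	t := time.Date(2024, 1, 2, 3, 4, 5, 600, time.UTC)
	ts := fromTime(t)
	fmt.Println(ts.Nanos, ts.toTime().Equal(t)) // 600 true
}
```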
3.2 ip: IP geolocation service (gRPC)
3.2.1 Message definitions
syntax = "proto3";
package ip;
import "protocol/types/error.proto";
option go_package = "./protocol/ip";
message GetLocateByIPReq {
string ip = 1;
}
message GetLocateByIPReply {
types.Error error = 1;
Locate locate = 2;
}
message Locate {
string full_address = 1;
string country = 2;
string province = 3;
string city = 4;
string district = 5;
string street = 6;
}
3.2.2 Service definition
syntax = "proto3";
package ip;
import "protocol/ip/ip_message.proto";
option go_package = "./protocol/ip";
service IPService {
rpc GetLocateByIP(GetLocateByIPReq) returns (GetLocateByIPReply);
}
Request/response exchange:
Client Server
│ │
│ GetLocateByIPReq { ip: "1.2.3.4" } │
│ ──────────────────────────────────────▶ │
│ │
│ GetLocateByIPReply { │
│ error: nil, │
│ locate: { │
│ full_address: "中国上海市浦东新区", │
│ country: "中国", │
│ province: "上海市", │
│ city: "上海市", │
│ district: "浦东新区", │
│ street: "..." │
│ } │
│ } │
│ ◀────────────────────────────────────── │
3.2.3 Client implementation (client.go)
The client supports two connection modes: a direct connection, and service discovery through etcd.
package ip
import (
"fmt"
"time"
"gitee.com/ha666/golibs/protocol/interceptor"
"gitee.com/ha666/golibs/registry/discover"
clientv3 "go.etcd.io/etcd/client/v3"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/resolver"
)
// Client is the IP service client.
type Client struct {
conn *grpc.ClientConn
IPServiceClient
}
// NewClient creates an IP client that connects directly to addr.
func NewClient(addr string) (*Client, error) {
conn, err := grpc.NewClient(addr,
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithChainUnaryInterceptor(
interceptor.ClientLogger,
interceptor.ClientMetadata,
),
grpc.WithDefaultCallOptions(
grpc.MaxCallRecvMsgSize(1024*1024*1), // 1MB
grpc.MaxCallSendMsgSize(1024*1024*1),
),
)
if err != nil {
return nil, fmt.Errorf("failed to connect: %w", err)
}
return &Client{
conn: conn,
IPServiceClient: NewIPServiceClient(conn),
}, nil
}
// NewClientWithEtcd creates an IP client that discovers servers through etcd.
func NewClientWithEtcd() (*Client, error) {
// 1. Create the etcd client.
etcdClient, err := clientv3.New(clientv3.Config{
Endpoints: []string{"localhost:2379"},
DialTimeout: 1 * time.Second,
})
if err != nil {
return nil, err
}
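// 2. Build the custom resolver and register it globally.
// Note: grpc documents resolver.Register as init-time only and not thread-safe.
// The builder is also passed through grpc.WithResolvers below, which scopes it
// to this connection, so the global registration is redundant here.
builder := discover.NewBuilder(etcdClient)
resolver.Register(builder)
// 3. Dial the gRPC service through the etcd scheme.
// Target format: etcd:///service-name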
conn, err := grpc.NewClient("etcd:///ip-service",
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithChainUnaryInterceptor(
interceptor.ClientLogger,
interceptor.ClientMetadata,
),
grpc.WithDefaultCallOptions(
grpc.MaxCallRecvMsgSize(1024*1024*1), // 1MB
grpc.MaxCallSendMsgSize(1024*1024*1),
),
grpc.WithDefaultServiceConfig(`{"loadBalancingPolicy":"round_robin"}`),
grpc.WithResolvers(builder),
)
if err != nil {
return nil, err
}
return &Client{
conn: conn,
IPServiceClient: NewIPServiceClient(conn),
}, nil
}
// Close closes the gRPC connection. The etcd client created in NewClientWithEtcd is not closed here.
func (c *Client) Close() error {
return c.conn.Close()
}
Comparison of the two connection modes:
┌─────────────────────────────────────────────┐
│ Direct mode (NewClient) │
│ │
│ Client ──────────────────▶ Server │
│ grpc.NewClient("ip:port") │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ etcd discovery mode (NewClientWithEtcd) │
│ │
│ Client ──▶ etcd resolver ──▶ etcd ──▶ [Server1, Server2] │
│ │ │
│ └── round_robin load balancing │
│ │
│ Target: etcd:///ip-service │
└─────────────────────────────────────────────────────────────┘
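The target string etcd:///ip-service is an ordinary URI: the scheme selects the resolver and the path carries the service name. The sketch below reproduces that split with the standard library only; `parseTarget` is a hypothetical helper, and gRPC does its own parsing internally.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseTarget splits a dial target into its scheme and service name,
// using the same "trim the leading slash from the path" rule as a resolver would.
func parseTarget(target string) (scheme, service string, err error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", "", err
	}
	service = strings.TrimPrefix(u.Path, "/")
	if service == "" {
		return "", "", fmt.Errorf("missing service name in %q", target)
	}
	return u.Scheme, service, nil
}

func main() {
	scheme, service, err := parseTarget("etcd:///ip-service")
	fmt.Println(scheme, service, err) // etcd ip-service <nil>
}
```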
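3.3 interceptor: gRPC interceptors
protocol/interceptor/ contains three interceptors, covering logging, metadata propagation, and panic recovery.
3.3.1 Logger (logging interceptor)
There is one interceptor for the server and one for the client. Each records the method name, the elapsed time, and the request and response bodies.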
package interceptor
import (
"context"
"time"
"gitee.com/ha666/golibs"
"gitee.com/ha666/golibs/logs"
"google.golang.org/grpc"
"google.golang.org/grpc/peer"
)
// ServerLogger is the server-side logging interceptor.
func ServerLogger(ctx context.Context, req any, info *grpc.UnaryServerInfo,
handler grpc.UnaryHandler) (any, error) {
var clientIPPort string
if p, ok := peer.FromContext(ctx); ok {
clientIPPort = p.Addr.String()
} else {
clientIPPort = "unknown"
}
start := time.Now()
logs.Info(ctx, "[server] method=%s, client=%s, req=%+v",
info.FullMethod, clientIPPort, req)
reply, err := handler(ctx, req)
consume := golibs.Since(start)
if err != nil {
logs.Error(ctx, "[server] method=%s, client=%s, req=%+v, consume:%dms, err:%+v",
info.FullMethod, clientIPPort, req, consume, err)
} else {
logs.Info(ctx, "[server] method=%s, client=%s, req=%+v, consume:%dms, reply=%+v",
info.FullMethod, clientIPPort, req, consume, reply)
}
return reply, err
}
// ClientLogger is the client-side logging interceptor.
func ClientLogger(ctx context.Context, method string, req, reply any,
cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
start := time.Now()
var p peer.Peer
logs.Info(ctx, "[client] method=%s, req=%+v", method, req)
err := invoker(ctx, method, req, reply, cc, append(opts, grpc.Peer(&p))...)
serverAddr := cc.Target()
if p.Addr != nil {
serverAddr = p.Addr.String()
}
consume := golibs.Since(start)
if err != nil {
logs.Error(ctx, "[client] method=%s, server=%s, req=%+v, consume:%dms, err:%+v",
method, serverAddr, req, consume, err)
} else {
logs.Info(ctx, "[client] method=%s, server=%s, req=%+v, consume:%dms, reply=%+v",
method, serverAddr, req, consume, reply)
}
return err
}
Execution order of the logging interceptors:
Client Server
│ │
│ ┌─ ClientLogger ─┐ │
│ │ log req │ │
│ │ │ │
│ │ invoker() ────────────▶ │ ┌─ ServerLogger ─┐
│ │ │ │ │ log client IP │
│ │ │ │ │ log req │
│ │ │ │ │ │
│ │ │ │ │ handler() │
│ │ │ │ │ │
│ │ │ │ │ log consume │
│ │ ◀──── reply ────────── │ └─────────────────┘
│ │ log consume │ │
│ └─────────────────┘ │
3.3.2 Metadata (metadata propagation interceptor)
This pair of interceptors passes context values such as trace_id transparently from the gRPC client to the server.
package interceptor
import (
"context"
"fmt"
"gitee.com/ha666/golibs"
"google.golang.org/grpc"
"google.golang.org/grpc/metadata"
)
// grpcTransmitKeys holds every key that must be propagated over gRPC.
var grpcTransmitKeys = map[string]string{
golibs.CtxTraceId: golibs.CtxTraceId,
}
// GetTransmitKeys returns every key that must be propagated.
func GetTransmitKeys() map[string]string {
return grpcTransmitKeys
}
// ClientMetadata is the client interceptor: it copies ctx.Value entries into the outgoing gRPC metadata.
func ClientMetadata(ctx context.Context, method string, req, reply any,
cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
outgoingCtx := attachContextValuesToMetadata(ctx)
return invoker(outgoingCtx, method, req, reply, cc, opts...)
}
// attachContextValuesToMetadata copies the selected ctx.Value entries into the metadata.
func attachContextValuesToMetadata(ctx context.Context) context.Context {
md, ok := metadata.FromOutgoingContext(ctx)
if !ok {
md = metadata.MD{}
}
for _, mdKey := range GetTransmitKeys() {
if val := ctx.Value(mdKey); val != nil {
var strValue string
switch v := val.(type) {
case string:
strValue = v
case fmt.Stringer:
strValue = v.String()
default:
strValue = fmt.Sprintf("%v", v)
}
md.Set(mdKey, strValue)
}
}
return metadata.NewOutgoingContext(ctx, md)
}
// ServerMetadata is the server interceptor: it copies values from the incoming gRPC metadata into the context.
func ServerMetadata(ctx context.Context, req any, info *grpc.UnaryServerInfo,
handler grpc.UnaryHandler) (any, error) {
ctx = extractMetadataToContext(ctx)
return handler(ctx, req)
}
// extractMetadataToContext copies the selected metadata values into the context.
func extractMetadataToContext(ctx context.Context) context.Context {
md, ok := metadata.FromIncomingContext(ctx)
if !ok {
return ctx
}
mdToContextKey := make(map[string]struct{})
for _, mdKey := range GetTransmitKeys() {
mdToContextKey[mdKey] = struct{}{}
}
for mdKey, values := range md {
if _, exists := mdToContextKey[mdKey]; exists && len(values) > 0 {
ctx = context.WithValue(ctx, mdKey, values[0])
}
}
return ctx
}
// GetStringValue returns the string stored under key, or "" if it is absent or not a string.
func GetStringValue(ctx context.Context, key string) string {
if value := ctx.Value(key); value != nil {
if str, ok := value.(string); ok {
return str
}
}
return ""
}
Metadata propagation flow:
┌────────────── Client process ──────────────┐ ┌────────────── Server process ──────────────┐
│ │ │ │
│ ctx.Value("trace_id") = "abc-123" │ │ │
│ │ │ │ │
│ ▼ │ │ │
│ ClientMetadata interceptor │ │ │
│ ┌─────────────────────────────────┐ │ │ │
│ │ iterate over grpcTransmitKeys │ │ │ │
│ │ ctx.Value("trace_id") → "abc-123" │ │ │ │
│ │ md.Set("trace_id", "abc-123") │ │ │ │
│ └─────────────────────────────────┘ │ │ │
│ │ │ │ │
│ gRPC metadata (HTTP/2 Headers) │ │ │
│ ═══════════════════════════════════▶ │ │ ServerMetadata interceptor │
│ │ │ ┌──────────────────────────────────┐ │
│ │ │ │ md["trace_id"] → "abc-123" │ │
│ │ │ │ ctx = WithValue(ctx, "trace_id", │ │
│ │ │ │ "abc-123") │ │
│ │ │ └──────────────────────────────────┘ │
│ │ │ │ │
│ │ │ ▼ │
│ │ │ handler(ctx, req): business code can read │
│ │ │ ctx.Value("trace_id") == "abc-123" │
└─────────────────────────────────────────┘ └─────────────────────────────────────────┘
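The round trip can be simulated without gRPC. The sketch below is a simplification under stated assumptions: `header` stands in for gRPC metadata, the key is assumed to be the literal string "trace_id", and `attach`, `extract`, and `roundTrip` are hypothetical names. It keeps the string context keys used by the interceptors above; new code would usually prefer an unexported key type to avoid collisions.

```go
package main

import (
	"context"
	"fmt"
)

// header stands in for gRPC metadata: key -> list of values.
type header map[string][]string

// transmitKeys lists the context keys that cross the wire (assumed value).
var transmitKeys = []string{"trace_id"}

// attach is the client half: copy selected context values into the header.
func attach(ctx context.Context) header {
	md := header{}
	for _, k := range transmitKeys {
		if v := ctx.Value(k); v != nil {
			md[k] = []string{fmt.Sprint(v)}
		}
	}
	return md
}

// extract is the server half: copy selected header values into the context.
func extract(ctx context.Context, md header) context.Context {
	for _, k := range transmitKeys {
		if vs := md[k]; len(vs) > 0 {
			ctx = context.WithValue(ctx, k, vs[0])
		}
	}
	return ctx
}

// roundTrip sends id through attach and extract and reads it back.
func roundTrip(id string) string {
	clientCtx := context.WithValue(context.Background(), "trace_id", id)
	serverCtx := extract(context.Background(), attach(clientCtx))
	got, _ := serverCtx.Value("trace_id").(string)
	return got
}

func main() {
	fmt.Println(roundTrip("abc-123")) // abc-123
}
```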
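3.3.3 Recovery (panic recovery interceptor)
This interceptor keeps a panic inside a server handler from crashing the whole gRPC service: it converts the panic into a codes.Internal error and logs the stack trace.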
package interceptor
import (
"context"
"runtime/debug"
"gitee.com/ha666/golibs/logs"
"google.golang.org/grpc"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
)
// Recovery is a recovery interceptor for unary calls on the gRPC server.
// It catches a panic raised by the handler, logs the stack trace, and returns an Internal error.
func Recovery(ctx context.Context, req any, info *grpc.UnaryServerInfo,
handler grpc.UnaryHandler) (resp any, err error) {
defer func() {
if r := recover(); r != nil {
stack := debug.Stack()
logs.Error(ctx, "panic: %v,%s", r, stack)
err = status.Errorf(codes.Internal, "internal server error: %v", r)
}
}()
return handler(ctx, req)
}
Unit tests for the Recovery interceptor:
func TestRecovery_NoPanic(t *testing.T) {
handler := func(ctx context.Context, req any) (any, error) {
return "ok", nil
}
resp, err := Recovery(context.Background(), nil, &grpc.UnaryServerInfo{}, handler)
if err != nil {
t.Fatalf("expected no error, got: %v", err)
}
if resp != "ok" {
t.Fatalf("expected resp 'ok', got: %v", resp)
}
}
func TestRecovery_WithPanic(t *testing.T) {
handler := func(ctx context.Context, req any) (any, error) {
panic("something went wrong")
}
_, err := Recovery(context.Background(), nil, &grpc.UnaryServerInfo{}, handler)
if err == nil {
t.Fatal("expected error after panic, got nil")
}
st, ok := status.FromError(err)
if !ok {
t.Fatal("expected grpc status error")
}
if st.Code() != codes.Internal {
t.Fatalf("expected codes.Internal, got: %v", st.Code())
}
}
Interceptor chain assembly
When a gRPC client or server is created, the interceptors are registered as a chain:
┌─────────────────────── Client interceptor chain ─────────────────────┐
│ │
│ request ──▶ ClientLogger ──▶ ClientMetadata ──▶ invoker() │
│ │
└────────────────────────────────────────────────────────────┘
┌─────────────────────── Server interceptor chain ─────────────────────┐
│ │
│ request ──▶ Recovery ──▶ ServerLogger ──▶ ServerMetadata │
│ ──▶ handler() │
│ │
└────────────────────────────────────────────────────────────┘
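In a gRPC interceptor chain the first registered interceptor is the outermost one, so it sees the request first and the response last. The following is a stdlib-only sketch of that wrapping order; the `handler`, `interceptor`, `chain`, and `tag` names are simplified stand-ins, not the grpc types.

```go
package main

import (
	"fmt"
	"strings"
)

type handler func(req string) string
type interceptor func(req string, next handler) string

// chain wraps h so that ics[0] runs outermost and ics[len-1] runs innermost.
func chain(h handler, ics ...interceptor) handler {
	for i := len(ics) - 1; i >= 0; i-- {
		ic, next := ics[i], h
		h = func(req string) string { return ic(req, next) }
	}
	return h
}

// tag returns an interceptor that records when it is entered and left.
func tag(name string, trace *[]string) interceptor {
	return func(req string, next handler) string {
		*trace = append(*trace, name+":in")
		resp := next(req)
		*trace = append(*trace, name+":out")
		return resp
	}
}

// run executes the server chain from the diagram and returns the visit order.
func run() string {
	var trace []string
	h := chain(func(req string) string {
		trace = append(trace, "handler")
		return "ok"
	}, tag("Recovery", &trace), tag("ServerLogger", &trace), tag("ServerMetadata", &trace))
	h("req")
	return strings.Join(trace, " ")
}

func main() {
	fmt.Println(run())
}
```

One consequence of this order is that ServerLogger runs before ServerMetadata has copied trace_id into the context. If the logging package reads the trace ID from ctx.Value (an assumption about golibs/logs), placing ServerMetadata ahead of ServerLogger may be worth considering.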
4. registry module
4.1 Core interfaces and data structures
Watcher interface
package registry
// Watcher is service watcher.
type Watcher interface {
// Next returns services in the following two cases:
// 1.the first time to watch and the service instance list is not empty.
// 2.any service instance changes found.
// if the above two conditions are not met, it will block until context deadline exceeded or canceled
Next() ([]string, error)
// Stop close the watcher.
Stop() error
}
Helper function (service.go)
package registry
import "fmt"
func unmarshal(data []byte) (string, error) {
if len(data) == 0 {
return "", fmt.Errorf("not found data")
}
return string(data), nil
}
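4.2 etcd registry implementation
registry/etcd.go is the core of service registration and discovery. It provides registration, deregistration, lookup, and lease keep-alive.
etcd key/value layout
etcd key format: /{env}/{serviceName}/{endpoint}
etcd value: the endpoint string (for example "127.0.0.1:1234")
Example: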
/local/helloworld/127.0.0.1:1234 → "127.0.0.1:1234"
/local/helloworld/127.0.0.1:5678 → "127.0.0.1:5678"
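The key layout can be sketched with a few string helpers. One thing to watch with prefix queries on this layout is whether the prefix ends with a slash: without it, a query for one service also matches services whose names merely start with the same characters. `nodeKey`, `servicePrefix`, and `matches` are hypothetical helpers written for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// nodeKey builds the per-instance key: /{env}/{serviceName}/{endpoint}.
func nodeKey(env, service, endpoint string) string {
	return fmt.Sprintf("/%s/%s/%s", env, service, endpoint)
}

// servicePrefix ends with "/" so that a prefix match on "helloworld"
// does not also match keys of a service named "helloworld2".
func servicePrefix(env, service string) string {
	return fmt.Sprintf("/%s/%s/", env, service)
}

// matches reports whether key falls under prefix, as a prefix query would.
func matches(prefix, key string) bool {
	return strings.HasPrefix(key, prefix)
}

func main() {
	key := nodeKey("local", "helloworld", "127.0.0.1:1234")
	other := nodeKey("local", "helloworld2", "127.0.0.1:9999")
	fmt.Println(key)                                                  // /local/helloworld/127.0.0.1:1234
	fmt.Println(matches(servicePrefix("local", "helloworld"), key))   // true
	fmt.Println(matches(servicePrefix("local", "helloworld"), other)) // false
	fmt.Println(matches("/local/helloworld", other))                  // true: the slash-less prefix over-matches
}
```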
Full code
package registry
import (
"context"
"fmt"
"math/rand/v2"
"time"
clientv3 "go.etcd.io/etcd/client/v3"
)
// Option is etcd registry option.
type Option func(o *options)
type options struct {
ctx context.Context
env string
ttl time.Duration
interval time.Duration
maxRetry int
}
// Context with registry context.
func Context(ctx context.Context) Option {
return func(o *options) { o.ctx = ctx }
}
func WithEnv(env string) Option {
return func(o *options) { o.env = env }
}
// WithTTL with register ttl.
func WithTTL(ttl time.Duration) Option {
return func(o *options) { o.ttl = ttl }
}
func WithInterval(interval time.Duration) Option {
return func(o *options) { o.interval = interval }
}
func MaxRetry(num int) Option {
return func(o *options) { o.maxRetry = num }
}
// Registry is etcd registry.
type Registry struct {
name string
opts *options
client *clientv3.Client
kv clientv3.KV
lease clientv3.Lease
ctxMap map[string]context.CancelFunc
}
// New creates etcd registry
func New(client *clientv3.Client, name string, opts ...Option) (r *Registry) {
if name == "" {
panic("missing name argument")
}
op := &options{
ctx: context.Background(),
env: "abc",
ttl: time.Second * 15,
interval: time.Second * 5,
maxRetry: 5,
}
for _, o := range opts {
o(op)
}
return &Registry{
name: name,
opts: op,
client: client,
kv: clientv3.NewKV(client),
ctxMap: make(map[string]context.CancelFunc),
}
}
// Register the registration.
func (r *Registry) Register(ctx context.Context, endpoint string) error {
key := r.getNodeKey(ctx, endpoint)
if r.lease != nil {
r.lease.Close()
}
r.lease = clientv3.NewLease(r.client)
leaseID, err := r.registerWithKV(ctx, key, endpoint)
if err != nil {
return err
}
hctx, cancel := context.WithCancel(r.opts.ctx)
r.ctxMap[endpoint] = cancel
go r.heartBeat(hctx, leaseID, key, endpoint)
return nil
}
// Deregister the registration.
func (r *Registry) Deregister(ctx context.Context, endpoint string) error {
defer func() {
if r.lease != nil {
r.lease.Close()
}
}()
if cancel, ok := r.ctxMap[endpoint]; ok {
cancel()
delete(r.ctxMap, endpoint)
}
key := r.getNodeKey(ctx, endpoint)
_, err := r.client.Delete(ctx, key)
return err
}
// GetService queries etcd for the instances currently registered under the service name.
func (r *Registry) GetService(ctx context.Context) ([]string, error) {
key := r.getServiceKey(ctx)
resp, err := r.kv.Get(ctx, key, clientv3.WithPrefix())
if err != nil {
return nil, err
}
items := make([]string, 0, len(resp.Kvs))
for _, kv := range resp.Kvs {
si, err := unmarshal(kv.Value)
if err != nil {
return nil, err
}
if si == "" {
continue
}
items = append(items, si)
}
return items, nil
}
// Watch creates a watcher according to the service name.
func (r *Registry) Watch(ctx context.Context) (Watcher, error) {
key := r.getServiceKey(ctx)
return newWatcher(ctx, key, r.name, r.client)
}
// registerWithKV create a new lease, return current leaseID
func (r *Registry) registerWithKV(ctx context.Context, key string,
value string) (clientv3.LeaseID, error) {
grant, err := r.lease.Grant(ctx, int64(r.opts.ttl.Seconds()))
if err != nil {
return 0, err
}
_, err = r.client.Put(ctx, key, value, clientv3.WithLease(grant.ID))
if err != nil {
return 0, err
}
return grant.ID, nil
}
func (r *Registry) heartBeat(ctx context.Context, leaseID clientv3.LeaseID,
key string, value string) {
curLeaseID := leaseID
kac, err := r.client.KeepAlive(ctx, leaseID)
if err != nil {
curLeaseID = 0
}
for {
if curLeaseID == 0 {
var retreat []int
for retryCnt := 0; retryCnt < r.opts.maxRetry; retryCnt++ {
if ctx.Err() != nil {
return
}
idChan := make(chan clientv3.LeaseID, 1)
errChan := make(chan error, 1)
cancelCtx, cancel := context.WithCancel(ctx)
go func() {
defer cancel()
id, registerErr := r.registerWithKV(cancelCtx, key, value)
if registerErr != nil {
errChan <- registerErr
} else {
idChan <- id
}
}()
select {
case <-time.After(3 * time.Second):
cancel()
continue
case <-errChan:
continue
case curLeaseID = <-idChan:
}
kac, err = r.client.KeepAlive(ctx, curLeaseID)
if err == nil {
break
}
retreat = append(retreat, 1<<retryCnt)
time.Sleep(time.Duration(retreat[rand.IntN(len(retreat))]) * time.Second)
}
if _, ok := <-kac; !ok {
return
}
}
select {
case _, ok := <-kac:
if !ok {
if ctx.Err() != nil {
return
}
curLeaseID = 0
continue
}
case <-r.opts.ctx.Done():
return
}
}
}
func (r *Registry) getNodeKey(ctx context.Context, endpoint string) string {
return fmt.Sprintf("/%s/%s/%s", r.opts.env, r.name, endpoint)
}
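// getServiceKey returns the prefix used by GetService and Watch.
// Note: because it has no trailing "/", a WithPrefix query on "/env/foo" also
// matches keys of a service named "foo2". The resolver in discover uses "/env/foo/".
func (r *Registry) getServiceKey(ctx context.Context) string {
return fmt.Sprintf("/%s/%s", r.opts.env, r.name)
}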
Registration and keep-alive flow
Registry.Register()
│
▼
┌─────────────────────┐
│ 1. Create Lease │
│ 2. Grant(TTL=15s) │
│ 3. Put(key, value, │
│ WithLease) │
└────────┬────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ heartBeat goroutine (runs in the background) │
│ │
│ KeepAlive(leaseID) → keeps renewing the lease │
│ │ │
│ ├── renewal OK → wait for the next KeepAlive response │
│ │ │
│ └── renewal failed (channel closed) → re-register │
│ │ │
│ ├── registerWithKV (up to maxRetry attempts) │
│ │ ├── 3s timeout per attempt │
│ │ └── backoff drawn at random from powers of two │
│ │ │
│ └── all retries failed → goroutine exits │
└─────────────────────────────────────────────────────┘
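The backoff in heartBeat is not a fixed doubling: each time the backoff step is reached, 1<<retryCnt is appended to a list and the sleep is drawn at random from that list. The sketch below is a simplification that assumes every attempt reaches the backoff step; `candidates` and `pickDelay` are hypothetical names, and it uses math/rand rather than the math/rand/v2 package imported by the source.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// candidates lists the delays (in seconds) available after `failed` attempts:
// 1, 2, 4, ... one power of two per attempt.
func candidates(failed int) []int {
	out := make([]int, 0, failed)
	for i := 0; i < failed; i++ {
		out = append(out, 1<<i)
	}
	return out
}

// pickDelay draws one delay at random from the candidates.
func pickDelay(failed int) time.Duration {
	c := candidates(failed)
	if len(c) == 0 {
		return 0
	}
	return time.Duration(c[rand.Intn(len(c))]) * time.Second
}

func main() {
	fmt.Println(candidates(5)) // [1 2 4 8 16]
	fmt.Println(pickDelay(3))  // one of 1s, 2s, 4s
}
```

Drawing from the whole list, instead of always using the largest value, spreads retries from many instances over time at the cost of a less predictable wait.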
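Configuration options:
| Option | Default | Description |
|---|---|---|
| WithEnv(env) | "abc" | Environment name, used as the key prefix to isolate environments (for example local/dev/prod) |
| WithTTL(ttl) | 15s | Lifetime of the etcd lease |
| WithInterval(interval) | 5s | Reserved heartbeat interval; not used by the code shown, since renewal is driven by KeepAlive |
| MaxRetry(num) | 5 | Maximum number of re-registration attempts after the keep-alive breaks |
| Context(ctx) | context.Background() | Context that bounds the lifetime of the registry's background work |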
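One detail behind WithTTL: registerWithKV passes int64(ttl.Seconds()) to Grant, so any fractional part of the TTL is dropped and a sub-second TTL truncates to 0. A small sketch of that conversion follows; the clamp to one second is an assumption of this example and is not present in the source.

```go
package main

import (
	"fmt"
	"time"
)

// ttlSeconds converts a TTL to the whole seconds handed to Grant.
// The clamp to 1 is an assumption of this sketch, not something the source does.
func ttlSeconds(ttl time.Duration) int64 {
	s := int64(ttl.Seconds())
	if s < 1 {
		return 1
	}
	return s
}

func main() {
	fmt.Println(ttlSeconds(15 * time.Second))        // 15
	fmt.Println(ttlSeconds(1500 * time.Millisecond)) // 1
	fmt.Println(ttlSeconds(200 * time.Millisecond))  // 1 (0 without the clamp)
}
```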
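4.3 Watcher: service watching
registry/watcher.go implements the Watcher interface on top of the etcd Watch mechanism, so changes to the set of service instances are observed as they happen.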
package registry
import (
"context"
"time"
clientv3 "go.etcd.io/etcd/client/v3"
)
var _ Watcher = (*watcher)(nil)
type watcher struct {
key string
ctx context.Context
cancel context.CancelFunc
client *clientv3.Client
watchChan clientv3.WatchChan
watcher clientv3.Watcher
kv clientv3.KV
first bool
serviceName string
}
func newWatcher(ctx context.Context, key, name string,
client *clientv3.Client) (*watcher, error) {
w := &watcher{
key: key,
client: client,
watcher: clientv3.NewWatcher(client),
kv: clientv3.NewKV(client),
first: true,
serviceName: name,
}
w.ctx, w.cancel = context.WithCancel(ctx)
w.watchChan = w.watcher.Watch(w.ctx, key,
clientv3.WithPrefix(), clientv3.WithRev(0), clientv3.WithKeysOnly())
err := w.watcher.RequestProgress(w.ctx)
if err != nil {
return nil, err
}
return w, nil
}
func (w *watcher) Next() ([]string, error) {
if w.first {
item, err := w.getInstance()
w.first = false
return item, err
}
select {
case <-w.ctx.Done():
return nil, w.ctx.Err()
case watchResp, ok := <-w.watchChan:
if !ok || watchResp.Err() != nil {
time.Sleep(time.Second)
err := w.reWatch()
if err != nil {
return nil, err
}
}
return w.getInstance()
}
}
func (w *watcher) Stop() error {
w.cancel()
return w.watcher.Close()
}
func (w *watcher) getInstance() ([]string, error) {
resp, err := w.kv.Get(w.ctx, w.key, clientv3.WithPrefix())
if err != nil {
return nil, err
}
items := make([]string, 0, len(resp.Kvs))
for _, kv := range resp.Kvs {
si, err := unmarshal(kv.Value)
if err != nil {
return nil, err
}
if si == "" {
continue
}
items = append(items, si)
}
return items, nil
}
func (w *watcher) reWatch() error {
w.watcher.Close()
w.watcher = clientv3.NewWatcher(w.client)
w.watchChan = w.watcher.Watch(w.ctx, w.key,
clientv3.WithPrefix(), clientv3.WithRev(0), clientv3.WithKeysOnly())
return w.watcher.RequestProgress(w.ctx)
}
Watcher workflow:
w.Next()
│
├── first call?
│ │
│ YES ──▶ getInstance() ──▶ return all current instances
│ │
│ NO ──▶ block on watchChan
│ │
│ ├── event received ──▶ getInstance() ──▶ return the latest instance list
│ │
│ ├── channel closed / error ──▶ reWatch() ──▶ rebuild the watch
│ │
│ └── ctx.Done() ──▶ return the error
│
w.Stop()
│
└── cancel() + watcher.Close()
4.4 discover: gRPC resolver
registry/discover/resolver.go implements the gRPC resolver.Builder and resolver.Resolver interfaces, so a gRPC client can discover servers automatically from a target of the form etcd:///service-name.
package discover
import (
"context"
"fmt"
"strings"
"sync"
"time"
"gitee.com/ha666/golibs"
"gitee.com/ha666/golibs/logs"
"go.etcd.io/etcd/api/v3/mvccpb"
clientv3 "go.etcd.io/etcd/client/v3"
"google.golang.org/grpc/resolver"
)
// etcdBuilder implements resolver.Builder.
type etcdBuilder struct {
client *clientv3.Client
serviceTTL int64
}
// NewBuilder creates an etcd resolver builder.
func NewBuilder(client *clientv3.Client) resolver.Builder {
return &etcdBuilder{client: client}
}
// Build creates a new resolver for the given target.
func (b *etcdBuilder) Build(target resolver.Target, cc resolver.ClientConn,
opts resolver.BuildOptions) (resolver.Resolver, error) {
serviceName := strings.TrimPrefix(target.URL.Path, "/")
if serviceName == "" {
return nil, fmt.Errorf("etcd resolver: missing service name in target URL")
}
ctx, cancel := context.WithCancel(context.Background())
r := &etcdResolver{
client: b.client,
serviceName: serviceName,
ctx: ctx,
cancel: cancel,
cc: cc,
addrs: make(map[string]bool),
rn: make(chan struct{}, 1),
}
go r.watchService()
return r, nil
}
// Scheme returns the scheme handled by this resolver.
func (b *etcdBuilder) Scheme() string {
return "etcd"
}
// etcdResolver implements resolver.Resolver.
type etcdResolver struct {
client *clientv3.Client
serviceName string
ctx context.Context
cancel context.CancelFunc
cc resolver.ClientConn
addrs map[string]bool
mu sync.Mutex
rn chan struct{}
env string
}
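// ResolveNow is called by gRPC as a hint that the resolver may re-resolve.
// In the code shown here the rn channel is only written and never read, so the hint has no effect.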
func (r *etcdResolver) ResolveNow(o resolver.ResolveNowOptions) {
select {
case r.rn <- struct{}{}:
default:
}
}
// Close shuts down the resolver.
func (r *etcdResolver) Close() {
r.cancel()
}
// watchService watches etcd for changes to the service's addresses.
func (r *etcdResolver) watchService() {
if err := r.sync(); err != nil {
logs.Error(nil, "etcd resolver: initial sync failed: %v", err)
}
keyPrefix := fmt.Sprintf("/%s/%s/", golibs.Env, r.serviceName)
r.watchWithRetry(keyPrefix)
}
// sync fetches all current addresses from etcd and pushes them to gRPC.
func (r *etcdResolver) sync() error {
ctx, cancel := context.WithTimeout(r.ctx, 5*time.Second)
defer cancel()
keyPrefix := fmt.Sprintf("/%s/%s/", golibs.Env, r.serviceName)
resp, err := r.client.Get(ctx, keyPrefix, clientv3.WithPrefix())
if err != nil {
return err
}
newAddrs := make(map[string]bool)
for _, kv := range resp.Kvs {
addr := string(kv.Value)
if addr != "" {
newAddrs[addr] = true
}
}
return r.updateState(newAddrs)
}
// updateState converts the address set into a resolver.State and applies it.
func (r *etcdResolver) updateState(newAddrs map[string]bool) error {
r.mu.Lock()
defer r.mu.Unlock()
if mapsEqual(r.addrs, newAddrs) {
return nil
}
var addresses []resolver.Address
for addr := range newAddrs {
addresses = append(addresses, resolver.Address{Addr: addr})
}
state := resolver.State{Addresses: addresses}
if err := r.cc.UpdateState(state); err != nil {
return err
}
r.addrs = newAddrs
logs.Info(nil, "etcd resolver: updated addresses for %s: %v",
r.serviceName, addresses)
return nil
}
// watchWithRetry starts the watch and retries with exponential backoff on error.
func (r *etcdResolver) watchWithRetry(keyPrefix string) {
retryDelay := time.Second
maxRetryDelay := 30 * time.Second
for {
select {
case <-r.ctx.Done():
return
default:
}
watchChan := r.client.Watch(r.ctx, keyPrefix, clientv3.WithPrefix())
if err := r.handleWatch(watchChan); err != nil {
logs.Warn(nil, "etcd resolver: watch error: %v, retrying in %v",
err, retryDelay)
select {
case <-r.ctx.Done():
return
case <-time.After(retryDelay):
}
retryDelay *= 2
if retryDelay > maxRetryDelay {
retryDelay = maxRetryDelay
}
} else {
retryDelay = time.Second
}
}
}
// handleWatch consumes watch events until the channel fails or the context ends.
func (r *etcdResolver) handleWatch(watchChan clientv3.WatchChan) error {
for {
select {
case <-r.ctx.Done():
return nil
case wresp, ok := <-watchChan:
if !ok {
return fmt.Errorf("watch channel closed")
}
if wresp.Err() != nil {
return wresp.Err()
}
if err := r.processWatchResponse(wresp); err != nil {
logs.Error(nil,
"etcd resolver: process watch response error: %v", err)
}
}
}
}
// processWatchResponse handles a single watch response and updates the addresses.
func (r *etcdResolver) processWatchResponse(wresp clientv3.WatchResponse) error {
needFullSync := false
for _, ev := range wresp.Events {
if ev.Type == mvccpb.DELETE {
needFullSync = true
break
}
}
// A DELETE event carries no value, so fall back to a full sync.
if needFullSync {
if err := r.sync(); err != nil {
logs.Error(nil,
"etcd resolver: full sync after delete failed: %v", err)
}
return nil
}
// PUT events: apply an incremental update.
r.mu.Lock()
defer r.mu.Unlock()
newAddrs := make(map[string]bool)
for k := range r.addrs {
newAddrs[k] = true
}
for _, ev := range wresp.Events {
if ev.Type == mvccpb.PUT {
addr := string(ev.Kv.Value)
if addr != "" {
newAddrs[addr] = true
}
}
}
return r.updateStateNoLock(newAddrs)
}
// updateStateNoLock is for internal use; the caller must already hold the lock.
func (r *etcdResolver) updateStateNoLock(newAddrs map[string]bool) error {
if mapsEqual(r.addrs, newAddrs) {
return nil
}
var addresses []resolver.Address
for addr := range newAddrs {
addresses = append(addresses, resolver.Address{Addr: addr})
}
state := resolver.State{Addresses: addresses}
if err := r.cc.UpdateState(state); err != nil {
return err
}
r.addrs = newAddrs
logs.Info(nil, "etcd resolver: updated addresses for %s: %v",
r.serviceName, addresses)
return nil
}
// mapsEqual reports whether the two address sets are identical.
func mapsEqual(a, b map[string]bool) bool {
if len(a) != len(b) {
return false
}
for k := range a {
if !b[k] {
return false
}
}
return true
}
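The resolver falls back to a full sync on DELETE because the event has no value. Since the key layout ends with the endpoint and the stored value is that same endpoint, an alternative would be to recover the address from the deleted key. The following is a sketch of that idea under that assumption, not what the code above does; `addrFromKey` and `applyDelete` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// addrFromKey recovers the endpoint from a key of the form prefix + endpoint.
func addrFromKey(prefix, key string) (string, bool) {
	if !strings.HasPrefix(key, prefix) {
		return "", false
	}
	addr := strings.TrimPrefix(key, prefix)
	return addr, addr != ""
}

// applyDelete returns a copy of addrs with the endpoint named by key removed.
func applyDelete(addrs map[string]bool, prefix, key string) map[string]bool {
	out := make(map[string]bool, len(addrs))
	for a := range addrs {
		out[a] = true
	}
	if addr, ok := addrFromKey(prefix, key); ok {
		delete(out, addr)
	}
	return out
}

func main() {
	addrs := map[string]bool{"192.168.1.10:8080": true, "192.168.1.11:8080": true}
	next := applyDelete(addrs, "/prod/ip-service/", "/prod/ip-service/192.168.1.10:8080")
	fmt.Println(len(next), next["192.168.1.11:8080"]) // 1 true
}
```

A full sync is simpler and self-correcting, so the trade-off is one extra range read per deletion against slightly more bookkeeping.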
gRPC resolver flow:
gRPC client dials "etcd:///ip-service"
│
▼
┌─────────────────────────────────┐
│ etcdBuilder.Build() │
│ serviceName = "ip-service" │
│ start the watchService goroutine │
└────────────────┬────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ watchService() │
│ │
│ 1. sync(): full fetch │
│ GET /{env}/ip-service/ (prefix) │
│ ┌──────────────────────────────────┐ │
│ │ kv1: 192.168.1.10:8080 │ │
│ │ kv2: 192.168.1.11:8080 │ │
│ └──────────────────────────────────┘ │
│ updateState → cc.UpdateState([addr1, addr2]) │
│ │
│ 2. watchWithRetry(keyPrefix): incremental watch │
│ ┌──────────────────────────────────┐ │
│ │ PUT → add the address (incremental) │ │
│ │ DELETE → trigger a full sync() │ │
│ └──────────────────────────────────┘ │
│ │
│ 3. On error, retry with exponential backoff (1s → 2s → 4s → ... → 30s) │
└─────────────────────────────────────────────────────┘
│
▼
The gRPC ClientConn receives the latest address list,
and the round_robin policy balances load across it
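The retry delays follow a capped doubling schedule. As a sketch of the arithmetic in watchWithRetry (the helper names `nextDelay` and `schedule` are hypothetical):

```go
package main

import (
	"fmt"
	"time"
)

// nextDelay doubles cur and caps the result at max.
func nextDelay(cur, max time.Duration) time.Duration {
	if next := cur * 2; next < max {
		return next
	}
	return max
}

// schedule returns the waits used for n consecutive failures,
// starting at 1s with a 30s cap.
func schedule(n int) []time.Duration {
	out := make([]time.Duration, 0, n)
	d := time.Second
	for i := 0; i < n; i++ {
		out = append(out, d)
		d = nextDelay(d, 30*time.Second)
	}
	return out
}

func main() {
	fmt.Println(schedule(7)) // [1s 2s 4s 8s 16s 30s 30s]
}
```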
5. End-to-end call flow
The following walks through a complete call to the IP geolocation service using etcd service discovery:
┌─────────────── Caller ──────────────────────────────────────────────────┐
│ │
│ 1. ctx = context.WithValue(ctx, "trace_id", "test-123456") │
│ │
│ 2. client, _ := ip.NewClientWithEtcd() │
│ ├── create the etcd client → connect to localhost:2379 │
│ ├── register etcdBuilder (scheme="etcd") │
│ ├── grpc.NewClient("etcd:///ip-service") │
│ │ ├── etcdBuilder.Build() → etcdResolver │
│ │ │ └── sync() → GET /prod/ip-service/ (prefix) │
│ │ │ └── yields [192.168.1.10:8080, 192.168.1.11:8080] │
│ │ ├── round_robin load balancing │
│ │ └── interceptor chain: [ClientLogger, ClientMetadata] │
│ └── return the Client │
│ │
│ 3. reply, _ := client.GetLocateByIP(ctx, &GetLocateByIPReq{ │
│ Ip: "223.161.208.123", │
│ }) │
│ ├── ClientLogger: log [client] method=..., req=... │
│ ├── ClientMetadata: md.Set("trace_id", "test-123456") │
│ ├── invoker() → gRPC call (round_robin picks a backend) │
│ │ │
│ │ ┌───────── Server ──────────────────────────────────┐ │
│ │ │ Recovery: defer recover() │ │
│ │ │ ServerLogger: log client IP and req │ │
│ │ │ ServerMetadata: │ │
│ │ │ md["trace_id"] → ctx = WithValue("trace_id", │ │
│ │ │ "test-123456") │ │
│ │ │ handler(ctx, req) → business logic │ │
│ │ │ └── look up the IP → return Locate{...} │ │
│ │ │ ServerLogger: log consume, reply │ │
│ │ └───────────────────────────────────────────────────┘ │
│ │ │
│ ├── ClientLogger: log consume, reply │
│ └── return reply │
│ │
│ 4. reply.GetLocate().GetCity() → "上海市" │
│ │
│ 5. client.Close() │
└─────────────────────────────────────────────────────────────────────────┘
6. Quick start
6.1 Compile the Protobuf files
# Compile the shared types
make types
# Compile the IP service
make ip
# Tidy dependencies
make tidy
6.2 Service registration
import (
	"context"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"gitee.com/ha666/golibs/registry"
)

// Create the etcd client (error handling omitted for brevity).
client, _ := clientv3.New(clientv3.Config{
	Endpoints:   []string{"127.0.0.1:2379"},
	DialTimeout: time.Second,
})
// Create the registry.
r := registry.New(client, "ip-service",
	registry.WithEnv("prod"),
	registry.WithTTL(15*time.Second),
	registry.MaxRetry(5),
)
// Register the service instance.
_ = r.Register(context.Background(), "192.168.1.10:8080")
// Deregister when the program exits.
defer r.Deregister(context.Background(), "192.168.1.10:8080")
6.3 Service discovery
// Option 1: query once.
services, _ := r.GetService(context.Background())
for _, svc := range services {
	fmt.Println("endpoint:", svc)
}
// Option 2: watch continuously.
w, _ := r.Watch(context.Background())
defer w.Stop()
for {
	services, err := w.Next() // the first call returns at once; later calls block until something changes
	if err != nil {
		break
	}
	fmt.Println("current instances:", services)
}
6.4 Calling through the gRPC client
import (
	"context"
	"fmt"
	"log"

	"gitee.com/ha666/golibs/protocol/ip"
)

// Option 1: direct connection.
client, err := ip.NewClient("127.0.0.1:9123")
// Option 2: etcd discovery plus load balancing; use this line instead of option 1.
// client, err := ip.NewClientWithEtcd()
if err != nil {
	log.Fatal(err)
}
defer client.Close()

ctx := context.WithValue(context.Background(), "trace_id", "req-001")
reply, err := client.GetLocateByIP(ctx, &ip.GetLocateByIPReq{
	Ip: "223.161.208.123",
})
if err != nil {
	log.Fatal(err)
}
if reply.GetError() != nil {
	log.Fatalf("business error: %s - %s", reply.Error.Code, reply.Error.Message)
}
fmt.Printf("location: %s\n", reply.GetLocate().GetFullAddress())
📝 Document version: v1.0 | Go version: 1.25.0 | Module path: gitee.com/ha666/golibs