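Overview of Plugin Fundamentals

A protoc plugin is essentially an executable program. The protoc process talks to it through inter-process communication.

The plugin process is launched as a child of the main protoc process, and the two communicate over pipes.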

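How It Works

The diagram below shows how a plugin works:

The flow in the diagram can be summarized as follows:

The main protoc process forks the plugin child process, creates the pipes used for parent-child communication, and redirects the child's standard input (stdin) and standard output (stdout) to them;
The main protoc process serializes the proto file data with the protobuf protocol and sends it to the child through the pipe;
The child process reads the request from stdin, processes it, serializes the response with the protobuf protocol, and writes it directly to stdout;
The main protoc process reads the response from the pipe, processes it, and finally writes the results to the specified location.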
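
The parent side of this flow can be sketched in Go. This is only an illustration of the pattern: protoc itself is written in C++, and the function name invokePlugin and the plugin name protoc-gen-demo are hypothetical. The sketch starts a child process, feeds a request to its stdin through a pipe, and collects whatever the child writes to stdout.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// invokePlugin starts the named executable as a child process, feeds request
// to its stdin through a pipe, and returns everything the child wrote to
// stdout. In the real protoc, request would be a serialized
// CodeGeneratorRequest and the result a serialized CodeGeneratorResponse.
func invokePlugin(executable string, request []byte) ([]byte, error) {
	cmd := exec.Command(executable)
	cmd.Stdin = bytes.NewReader(request) // os/exec sets up the stdin pipe
	var stdout bytes.Buffer
	cmd.Stdout = &stdout // the child's stdout is captured here
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("running %s: %w", executable, err)
	}
	return stdout.Bytes(), nil
}

func main() {
	// protoc-gen-demo is a hypothetical plugin name used only for illustration.
	resp, err := invokePlugin("protoc-gen-demo", []byte("serialized request bytes"))
	if err != nil {
		fmt.Println("plugin failed:", err)
		return
	}
	fmt.Printf("plugin returned %d bytes\n", len(resp))
}
```

Because the executable name contains no path separator, os/exec resolves it through PATH, which matches the default lookup behavior described in the naming rules.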

Interested readers can study the protocolbuffers source code for a more detailed picture of this flow.

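Plugin Naming Rules

Because the main process forks the plugin child process, it needs to know which executable to run as that child.

protoc selects the plugin as follows: when protoc is run with --NAME_out=OUT_DIR as an argument, the main protoc process invokes the executable named protoc-gen-NAME as the plugin child process. For example: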

protoc test.proto --plugin_out=/path/of/out
# When protoc is run with the command above, the executable protoc-gen-plugin is invoked.

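Notes:

By default, protoc looks for the specified plugin in the directories listed in the PATH environment variable;

You can also give the plugin's path explicitly by passing --plugin=protoc-gen-NAME=/path/to/yourplugin to protoc.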
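
The naming rule can be made concrete with a small Go sketch. It is not protoc's actual argument parser: the function names are hypothetical, it only handles the two flag shapes shown above, and it ignores the fact that generators built into protoc also use flags ending in _out.

```go
package main

import (
	"fmt"
	"strings"
)

// pluginForOutFlag maps a flag of the form --NAME_out=OUT_DIR to the plugin
// executable name protoc-gen-NAME and the output directory. It reports
// ok=false for arguments that do not have that shape.
func pluginForOutFlag(arg string) (executable, outDir string, ok bool) {
	if !strings.HasPrefix(arg, "--") {
		return "", "", false
	}
	flag, value := strings.TrimPrefix(arg, "--"), ""
	if i := strings.Index(flag, "="); i >= 0 {
		flag, value = flag[:i], flag[i+1:]
	}
	name := strings.TrimSuffix(flag, "_out")
	if name == flag || name == "" {
		return "", "", false // not a *_out flag, or NAME is empty
	}
	return "protoc-gen-" + name, value, true
}

// explicitPluginPath parses --plugin=protoc-gen-NAME=/path/to/plugin into the
// executable name and the path that should be run for it.
func explicitPluginPath(arg string) (executable, path string, ok bool) {
	const prefix = "--plugin="
	if !strings.HasPrefix(arg, prefix) {
		return "", "", false
	}
	spec := strings.TrimPrefix(arg, prefix)
	i := strings.Index(spec, "=")
	if i <= 0 {
		return "", "", false
	}
	return spec[:i], spec[i+1:], true
}

func main() {
	fmt.Println(pluginForOutFlag("--plugin_out=/path/of/out"))
	// prints: protoc-gen-plugin /path/of/out true
	fmt.Println(explicitPluginPath("--plugin=protoc-gen-NAME=/path/to/yourplugin"))
	// prints: protoc-gen-NAME /path/to/yourplugin true
}
```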

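Communication Protocol

As described above, the main protoc process and the plugin exchange data encoded with protobuf.

The messages involved are CodeGeneratorRequest and CodeGeneratorResponse. Their definitions are shown below (taken from the protobuf source; most comments are removed from the figure for brevity):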

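Summary

Putting the workflow and the communication protocol together, developing a protoc plugin comes down to the following steps:

Write a program in the language of your choice that does the following:
- Read the request data from stdin and deserialize it into a CodeGeneratorRequest;
- Implement the plugin's own logic for generating file contents;
- Wrap the output in a CodeGeneratorResponse and serialize it;
- Write the serialized bytes to stdout.
Compile the program into an executable binary and name it protoc-gen-NAME according to the naming rule, where NAME is the name of your plugin.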
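
The I/O skeleton of these steps can be sketched with the Go standard library alone. The protobuf decoding and encoding are deliberately left out: the handler type below is a hypothetical placeholder for "unmarshal a CodeGeneratorRequest, generate, marshal a CodeGeneratorResponse", and servePlugin and runInMemory are illustrative names, not part of any library.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"strings"
)

// handler turns the raw request bytes into raw response bytes. A real plugin
// would decode a CodeGeneratorRequest and encode a CodeGeneratorResponse here.
type handler func(request []byte) (response []byte, err error)

// servePlugin reads the whole request from in, passes it to handle, and
// writes the response to out. A plugin calls it with os.Stdin and os.Stdout.
func servePlugin(in io.Reader, out io.Writer, handle handler) error {
	request, err := io.ReadAll(in) // returns once the writer closes the pipe
	if err != nil {
		return fmt.Errorf("reading request: %w", err)
	}
	response, err := handle(request)
	if err != nil {
		return fmt.Errorf("handling request: %w", err)
	}
	if _, err := out.Write(response); err != nil {
		return fmt.Errorf("writing response: %w", err)
	}
	return nil
}

// runInMemory drives servePlugin with in-memory buffers instead of real
// pipes, which is handy for exercising plugin logic without launching protoc.
func runInMemory(request string, handle handler) (string, error) {
	var out bytes.Buffer
	err := servePlugin(strings.NewReader(request), &out, handle)
	return out.String(), err
}

func main() {
	// Placeholder handler that echoes the request back unchanged.
	echo := func(request []byte) ([]byte, error) { return request, nil }
	if err := servePlugin(os.Stdin, os.Stdout, echo); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Keeping the reader and writer as parameters, rather than hard-coding stdin and stdout, is a design choice made here so the same function can be tested without a parent process.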

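Plugin Development in Practice

Now that we understand how protoc plugins work, we use Go as the example language to develop one: a plugin that generates a Verify method from the comments on message fields.

Go's protobuf library provides structs and methods that make plugin development convenient, and developers can build their plugins on top of them.

The protogen package (google.golang.org/protobuf/compiler/protogen) offers a high-level view of proto files, so a plugin can be written without digging into low-level details.

Parsing the Input

The plugin obtains the information it needs by inspecting the following key structs.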

File

type File struct {
  Desc  protoreflect.FileDescriptor
  Proto *descriptorpb.FileDescriptorProto

  GoDescriptorIdent GoIdent       // name of Go variable for the file descriptor
  GoPackageName     GoPackageName // name of this file's Go package
  GoImportPath      GoImportPath  // import path of this file's Go package

  Enums      []*Enum      // top-level enum declarations
  Messages   []*Message   // top-level message declarations
  Extensions []*Extension // top-level extension declarations
  Services   []*Service   // top-level service declarations

  Generate bool // true if we should generate code for this file

  // GeneratedFilenamePrefix is used to construct filenames for generated
  // files associated with this source file.
  //
  // For example, the source file "dir/foo.proto" might have a filename prefix
  // of "dir/foo". Appending ".pb.go" produces an output file of "dir/foo.pb.go".
  GeneratedFilenamePrefix string
  // contains filtered or unexported fields
}

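File represents a single proto file.

Messages: all top-level messages defined in the proto file
Enums: all top-level enums defined in the proto file
Services: all services defined in the proto file
GoPackageName: the name of the Go package generated for this proto file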

Message

type Message struct {
  Desc protoreflect.MessageDescriptor

  GoIdent GoIdent // name of the generated Go type

  Fields []*Field // message field declarations
  Oneofs []*Oneof // message oneof declarations

  Enums      []*Enum      // nested enum declarations
  Messages   []*Message   // nested message declarations
  Extensions []*Extension // nested extension declarations

  Location Location   // location of this message
  Comments CommentSet // comments associated with this message
}

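Message represents a message definition in a proto file.

Fields: all fields of the message
Messages: message definitions nested inside this message
Comments: the comments attached to the message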

Field

type Field struct {
  Desc protoreflect.FieldDescriptor

  // GoName is the base name of this field's Go field and methods.
  // For code generated by protoc-gen-go, this means a field named
  // '{{GoName}}' and a getter method named 'Get{{GoName}}'.
  GoName string // e.g., "FieldName"

  // GoIdent is the base name of a top-level declaration for this field.
  // For code generated by protoc-gen-go, this means a wrapper type named
  // '{{GoIdent}}' for members fields of a oneof, and a variable named
  // 'E_{{GoIdent}}' for extension fields.
  GoIdent GoIdent // e.g., "MessageName_FieldName"

  Parent   *Message // message in which this field is declared; nil if top-level extension
  Oneof    *Oneof   // containing oneof; nil if not part of a oneof
  Extendee *Message // extended message for extension fields; nil otherwise

  Enum    *Enum    // type for enum fields; nil otherwise
  Message *Message // type for message or group fields; nil otherwise

  Location Location   // location of this field
  Comments CommentSet // comments associated with this field
}

Field represents a single field of a Message.

CommentSet

type CommentSet struct {
    LeadingDetached []Comments
    Leading         Comments
    Trailing        Comments
}

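The three fields hold the following kinds of comments:

LeadingDetached: comment blocks that appear before the element but are separated from it by a blank line
Leading: the comment immediately above the element
Trailing: the comment that follows the element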

Producing the Output

GeneratedFile

func (g *GeneratedFile) P(v ...interface{})
// This is the method we care about most: each call writes one line to the output file.
// With it, we can emit the content we want line by line.
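
To show what line-by-line generation looks like, here is a sketch that uses a simplified stand-in for GeneratedFile. The lineWriter type is not protogen's implementation; it only mimics the behavior of printing the arguments back to back and ending the line. The emitVerifyStub helper and the Verify() error signature are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineWriter is a simplified stand-in for protogen.GeneratedFile that keeps
// the generated text in memory.
type lineWriter struct {
	buf bytes.Buffer
}

// P prints its arguments back to back and then ends the line, so each call
// produces exactly one line of output.
func (w *lineWriter) P(v ...interface{}) {
	for _, x := range v {
		fmt.Fprint(&w.buf, x)
	}
	fmt.Fprintln(&w.buf)
}

// emitVerifyStub writes a skeleton Verify method for the given Go type name.
func emitVerifyStub(w *lineWriter, typeName string) {
	w.P("// Verify checks the fields of ", typeName, ".")
	w.P("func (m *", typeName, ") Verify() error {")
	w.P("\treturn nil")
	w.P("}")
}

func main() {
	var w lineWriter
	emitVerifyStub(&w, "User") // "User" is a made-up message name
	fmt.Print(w.buf.String())
}
```

With the real protogen API, the same calls would be made on the value returned by plugin.NewGeneratedFile instead of on lineWriter.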

Writing the Plugin Logic

Thanks to this encapsulation, the plugin workflow described above takes very little code to implement.

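package main

import "google.golang.org/protobuf/compiler/protogen"

func main() {
    protogen.Options{
        // ParamFunc is called for each plugin parameter that protogen does
        // not handle itself; this placeholder accepts all of them.
        ParamFunc: func(name, value string) error {
            return nil
        },
    }.Run(func(plugin *protogen.Plugin) error {
        for _, file := range plugin.Files {
            // Skip files that were only pulled in as imports.
            if file.Generate {
                handleFile(plugin, file) // per-file generation logic, not shown in this section
            }
        }
        return nil
    })
}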

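The main logic of the plugin is:

Create a protogen.Options value and configure how plugin parameters are handled;
Call the Run method of that protogen.Options value, passing in the custom code generation logic.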

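Case Study

protoc-gen-verifier

Requirements

We develop a plugin that generates a simple validation method for message fields. The method checks the fields of a message against the validation rules specified for them.

CommentSet gives us the comments above each field, so to keep things simple we require validation rules to be written in a field's leading comments.

The rule syntax is: // @verify: tag1,tag2=param2,tag3=param3 (case-sensitive; rules are separated by a comma; some tags take no parameter; multiple parameters of one tag are separated by |)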
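
A minimal sketch of a parser for this syntax follows, using only the standard library. The names rule and parseVerifyRules are hypothetical, and the tags required and len in the example are invented for illustration; they are not taken from the plugin's built-in rule list. The parser accepts comment text with or without the leading //.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is one parsed validation rule: a tag plus zero or more parameters.
type rule struct {
	Tag    string
	Params []string
}

const marker = "@verify:"

// parseVerifyRules scans comment text for lines of the form
// "@verify: tag1,tag2=param2,tag3=a|b" and returns the rules they declare.
// Lines without the marker are ignored. Tags are kept case-sensitive.
func parseVerifyRules(comment string) ([]rule, error) {
	var rules []rule
	for _, line := range strings.Split(comment, "\n") {
		line = strings.TrimSpace(strings.TrimPrefix(strings.TrimSpace(line), "//"))
		if !strings.HasPrefix(line, marker) {
			continue
		}
		for _, item := range strings.Split(strings.TrimPrefix(line, marker), ",") {
			item = strings.TrimSpace(item)
			if item == "" {
				return nil, fmt.Errorf("empty rule in %q", line)
			}
			r := rule{Tag: item}
			if i := strings.Index(item, "="); i >= 0 {
				r.Tag = strings.TrimSpace(item[:i])
				r.Params = strings.Split(item[i+1:], "|") // multiple params use |
			}
			if r.Tag == "" {
				return nil, fmt.Errorf("rule %q has no tag", item)
			}
			rules = append(rules, r)
		}
	}
	return rules, nil
}

func main() {
	// "required" and "len" are hypothetical tags used only for illustration.
	rules, err := parseVerifyRules("// user name\n// @verify: required,len=1|32")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, r := range rules {
		fmt.Printf("tag=%s params=%q\n", r.Tag, r.Params)
	}
}
```

In a protogen-based plugin, the input would typically be string(field.Comments.Leading).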

We provide the following built-in validation rules:

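Beyond the requirements above, we implement one more feature to demonstrate how to pass parameters to a plugin: an optional startup parameter that controls how the plugin behaves when it hits an error while parsing comments.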
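
One way such a parameter could be handled is sketched below. The parameter name on_error and its values fail and skip are assumptions made for illustration; the text above does not name them. The set method has the same signature as protogen.Options.ParamFunc, and applyParams is a simplified approximation of how a comma-separated parameter string is split into name/value pairs before the handler is called.

```go
package main

import (
	"fmt"
	"strings"
)

// config holds the plugin's settings. FailOnError decides whether a comment
// that cannot be parsed aborts generation or is skipped.
type config struct {
	FailOnError bool
}

// set handles one name/value pair. The parameter "on_error" and its values
// are hypothetical.
func (c *config) set(name, value string) error {
	switch name {
	case "on_error":
		switch value {
		case "fail":
			c.FailOnError = true
		case "skip":
			c.FailOnError = false
		default:
			return fmt.Errorf("on_error: unknown value %q", value)
		}
		return nil
	default:
		return fmt.Errorf("unknown parameter %q", name)
	}
}

// applyParams splits a raw parameter string such as "on_error=fail,x=y" into
// name/value pairs and passes each pair to fn, stopping at the first error.
func applyParams(raw string, fn func(name, value string) error) error {
	for _, kv := range strings.Split(raw, ",") {
		if kv == "" {
			continue // ignore empty entries
		}
		name, value := kv, ""
		if i := strings.Index(kv, "="); i >= 0 {
			name, value = kv[:i], kv[i+1:]
		}
		if err := fn(name, value); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	var c config
	if err := applyParams("on_error=fail", c.set); err != nil {
		fmt.Println("bad parameter:", err)
		return
	}
	fmt.Println("fail on error:", c.FailOnError)
}
```

Under these assumptions, the parameter could be supplied on the command line with protoc's --NAME_opt flag, for example --verifier_opt=on_error=fail.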
