MongoDB source code analysis: how the createCollection command goes from create.idl to create_gen.cpp

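The MongoDB shell method db.createCollection(name, options) creates a new collection. Because MongoDB creates a collection implicitly the first time a command references it, this method is mainly used to create collections with specific options.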

For example, you use db.createCollection() to create capped collections, clustered collections, and new collections that use schema validation.

The db.createCollection() method has the following prototype form:

db.createCollection( <name>,
    {
      capped: <boolean>,
      timeseries: {                  // Added in MongoDB 5.0
         timeField: <string>,        // required for time series collections
         metaField: <string>,
         granularity: <string>,
         bucketMaxSpanSeconds: <number>,  // Added in MongoDB 6.3
         bucketRoundingSeconds: <number>  // Added in MongoDB 6.3
      },
      expireAfterSeconds: <number>,
      clusteredIndex: <document>,  // Added in MongoDB 5.3
      changeStreamPreAndPostImages: <document>,  // Added in MongoDB 6.0
      size: <number>,
      max: <number>,
      storageEngine: <document>,
      validator: <document>,
      validationLevel: <string>,
      validationAction: <string>,
      indexOptionDefaults: <document>,
      viewOn: <string>,
      pipeline: <pipeline>,
      collation: <document>,
      writeConcern: <document>
    }
  )

Parameters of db.createCollection():

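Parameter            Type       Description
capped               Boolean    Whether the collection is capped (default: false).
size                 Number     Maximum size of a capped collection in bytes; only applies when capped is true.
max                  Number     Maximum number of documents in a capped collection.
validator            Document   Validation rules (for example a $jsonSchema document) that documents must satisfy.
storageEngine        Document   Storage-engine-specific configuration (for example WiredTiger options).
indexes              Array      Indexes predefined when the collection is created (not part of the prototype above).
writeConcern         Document   Write concern for the create operation.
readConcern          Document   Default read concern level (not part of the prototype above).
autoIndexId          Boolean    Whether to create an index on the _id field automatically (default: true).
viewOn               String     Source collection or view when creating a view.
pipeline             Array      Aggregation pipeline of the view.
collation            Document   Default collation (for example, case sensitivity).
timeseries           Document   Time series collection configuration.
expireAfterSeconds   Number     Number of seconds after which documents expire automatically (TTL).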
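
The shell method is a thin wrapper around the `create` database command: the collection name becomes the value of the first field, `create`, and the options follow it. The Python sketch below illustrates that mapping under this assumption. The name `build_create_command` is hypothetical and this is not the shell's actual implementation.

```python
def build_create_command(name, options=None):
    """Build the document for the `create` database command.

    Hypothetical helper for illustration only. The command name must be
    the first key, so it is inserted before any option.
    """
    if not isinstance(name, str) or not name:
        raise ValueError("collection name must be a non-empty string")
    cmd = {"create": name}  # dicts keep insertion order in Python 3.7+
    for key, value in (options or {}).items():
        if key == "create":
            raise ValueError("'create' is reserved for the collection name")
        cmd[key] = value
    return cmd


capped_cmd = build_create_command("log", {"capped": True, "size": 1048576})
print(capped_cmd)  # {'create': 'log', 'capped': True, 'size': 1048576}
```

With a driver such as PyMongo, a document of this shape could be passed to `Database.command`; the server then sees the same `create` command that the rest of this article traces.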

In the MongoDB source tree, the command implementations live in the src\mongo\db\commands folder:

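count_cmd.cpp implements the count command, distinct.cpp implements the distinct command, and dbcommands.cpp contains CmdCreate, CmdDrop, CmdDatasize, and others. CmdCreate implements collection creation and uses CreateCommand to parse the create command. The key question is where CreateCommand comes from. Following the symbol with an IDE leads to create_gen.cpp, which is generated from create.idl.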

/* create collection */
class CmdCreate : public BasicCommand {
public:
    CmdCreate() : BasicCommand("create") {}

    virtual bool run(OperationContext* opCtx,
                     const string& dbname,
                     const BSONObj& cmdObj,
                     BSONObjBuilder& result) {
        IDLParserErrorContext ctx("create");
        CreateCommand cmd = CreateCommand::parse(ctx, cmdObj);

        ...
    }
} cmdCreate;

What is the create.idl file?

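MongoDB generates C++ code from IDL (Interface Definition Language) files, which is a common engineering practice. It reduces boilerplate, speeds up development, avoids hand-writing repetitive logic (field extraction, type checking, error handling), and keeps the code consistent, because all commands follow the same validation rules.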

The content of mongo\db\commands\create.idl is:

global:
    cpp_namespace: "mongo"

imports:
    - "mongo/idl/basic_types.idl"

commands:
    create:
        description: "Parser for the 'create' Command"
        namespace: concatenate_with_db
        cpp_name: CreateCommand
        strict: true
        fields:
            capped:
                description: "Specify true to create a capped collection. If you specify true, you
                              must also set a maximum size in the 'size' field."
                type: safeBool
                default: false
            autoIndexId:
                description: "Specify false to disable the automatic creation of an index on the
                              _id field."
                type: safeBool
                optional: true
            idIndex:
                description: "Specify the default _id index specification."
                type: object
                optional: true
            size:
              ...
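
To make the effect of `strict`, `default`, and `optional` concrete, here is a minimal Python sketch of a spec-driven parser. It is an illustration of the idea only: `CREATE_SPEC` and `parse_command` are hypothetical names, only the three fields shown above are modelled, and `safeBool` is simplified to a plain boolean check.

```python
CREATE_SPEC = {
    "strict": True,
    "fields": {
        "capped": {"type": bool, "default": False},
        "autoIndexId": {"type": bool, "optional": True},
        "idIndex": {"type": dict, "optional": True},
    },
}


def parse_command(spec, command_name, doc):
    """Validate doc against spec and return the parsed field values."""
    parsed = {}
    for key, value in doc.items():
        if key == command_name:
            continue  # the first element carries the collection name
        field = spec["fields"].get(key)
        if field is None:
            if spec["strict"]:
                raise ValueError(f"unknown field '{key}'")
            continue
        if not isinstance(value, field["type"]):
            raise TypeError(f"field '{key}' has the wrong type")
        parsed[key] = value
    for key, field in spec["fields"].items():
        if key in parsed:
            continue
        if "default" in field:
            parsed[key] = field["default"]
        elif field.get("optional"):
            parsed[key] = None  # stands in for an empty boost::optional
        else:
            raise ValueError(f"missing required field '{key}'")
    return parsed


print(parse_command(CREATE_SPEC, "create", {"create": "users", "autoIndexId": False}))
```

Every command described this way gets the same unknown-field, type, and default handling, which is the consistency argument made above.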

How is create.idl converted into create_gen.cpp?

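The buildscripts directory contains an idl directory, which is responsible for generating files from the .idl files under src. The main file to read is buildscripts/idl/idl/generator.py, which generates the C++ files according to cpp_name. The following excerpt writes the class declaration: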

def generate(self, spec):
        # type: (ast.IDLAST) -> None
       ...
            spec_and_structs = spec.structs
            spec_and_structs += spec.commands

            for struct in spec_and_structs:
                self.gen_description_comment(struct.description)
                with self.gen_class_declaration_block(struct.cpp_name):
                    self.write_unindented_line('public:')

                    # Generate a sorted list of string constants
                    self.gen_string_constants_declarations(struct)
                    self.write_empty_line()

                    # Write constructor
                    self.gen_class_constructors(struct)
                    self.write_empty_line()

                    # Write serialization
                    self.gen_serializer_methods(struct)

                    if isinstance(struct, ast.Command):
                        self.gen_op_msg_request_methods(struct)

                    # Write getters & setters
                    for field in struct.fields:
                        if not field.ignore:
                            if field.description:
                                self.gen_description_comment(field.description)
                            self.gen_getter(struct, field)
                            if not struct.immutable and not field.chained_struct_field:
                                self.gen_setter(field)

                    if struct.generate_comparison_operators:
                        self.gen_comparison_operators_declarations(struct)

                    self.write_unindented_line('protected:')
                    self.gen_protected_serializer_methods(struct)

                    # Write private validators
                    if [field for field in struct.fields if field.validator]:
                        self.write_unindented_line('private:')
                        for field in struct.fields:
                            if not field.ignore and not struct.immutable and \
                                not field.chained_struct_field and field.validator:
                                self.gen_validators(field)

                    self.write_unindented_line('private:')

                    # Write command member variables
                    if isinstance(struct, ast.Command):
                        self.gen_known_fields_declaration()
                        self.write_empty_line()

                        self.gen_op_msg_request_member(struct)

                    # Write member variables
                    for field in struct.fields:
                        if not field.ignore and not field.chained_struct_field:
                            self.gen_member(field)

                    # Write serializer member variables
                    # Note: we write these out second to ensure the bit fields can be packed by
                    # the compiler.
                    for field in struct.fields:
                        if _is_required_serializer_field(field):
                            self.gen_serializer_member(field)

                self.write_empty_line()

            for scp in spec.server_parameters:
                if scp.cpp_class is None:
                    self._gen_exported_constexpr(scp.name, 'Default', scp.default, scp.condition)
                self._gen_extern_declaration(scp.cpp_vartype, scp.cpp_varname, scp.condition)
                self.gen_server_parameter_class(scp)

            if spec.configs:
                for opt in spec.configs:
                    self._gen_exported_constexpr(opt.name, 'Default', opt.default, opt.condition)
                    self._gen_extern_declaration(opt.cpp_vartype, opt.cpp_varname, opt.condition)
                self._gen_config_function_declaration(spec)
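
The getter and setter step can be pictured with a much smaller toy generator. The Python sketch below is not the real `gen_getter`/`gen_setter` from generator.py; `title_case` and `gen_accessors` are hypothetical helpers that only reproduce the shape of the accessors for a plain field and an optional field.

```python
def title_case(name):
    """capped -> Capped, autoIndexId -> AutoIndexId."""
    return name[0].upper() + name[1:]


def gen_accessors(field_name, cpp_type, optional=False):
    """Return C++ getter/setter lines for one field (toy version)."""
    member = f"_{field_name}"
    suffix = title_case(field_name)
    if optional:
        opt_type = f"boost::optional<{cpp_type}>"
        return [
            f"const {opt_type} get{suffix}() const& {{ return {member}; }}",
            f"void get{suffix}() && = delete;",
            f"void set{suffix}({opt_type} value) & {{  {member} = std::move(value);  }}",
        ]
    return [
        f"{cpp_type} get{suffix}() const {{ return {member}; }}",
        f"void set{suffix}({cpp_type} value) & {{  {member} = std::move(value);  }}",
    ]


for line in gen_accessors("capped", "bool") + gen_accessors("autoIndexId", "bool", optional=True):
    print(line)
```

The real generator works from a parsed and bound IDL syntax tree rather than from bare strings, but the principle is the same: one field description in, a fixed set of C++ lines out.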

After buildscripts/idl/idl/generator.py has run, the Python output is written to the corresponding folder, \build\opt\mongo\db\commands.

create.idl produces create_gen.h and create_gen.cpp, and compiling the C++ code yields create_gen.obj.
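
The naming rule can be written down as a small Python sketch. It assumes the layout described in this article (sources under src, output under build/opt) and uses forward slashes for portability. `generated_outputs` is a hypothetical helper, not part of the build scripts.

```python
from pathlib import PurePosixPath


def generated_outputs(idl_path, variant_dir="build/opt"):
    """Map an .idl path under src/ to its generated header and source paths."""
    idl = PurePosixPath(idl_path)
    if idl.suffix != ".idl":
        raise ValueError("expected a .idl file")
    rel = idl.relative_to("src")  # raises ValueError if not under src/
    base = PurePosixPath(variant_dir) / rel.parent / (rel.stem + "_gen")
    return str(base) + ".h", str(base) + ".cpp"


header, source = generated_outputs("src/mongo/db/commands/create.idl")
print(header)  # build/opt/mongo/db/commands/create_gen.h
print(source)  # build/opt/mongo/db/commands/create_gen.cpp
```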

\build\opt\mongo\db\commands\create_gen.h declares every parameter of the createCollection command, together with its getter and setter:

namespace mongo {

/**
 * Parser for the 'create' Command
 */
class CreateCommand {
public:
    ...

    explicit CreateCommand(const NamespaceString nss);

    static CreateCommand parse(const IDLParserErrorContext& ctxt, const BSONObj& bsonObject);
    static CreateCommand parse(const IDLParserErrorContext& ctxt, const OpMsgRequest& request);
    void serialize(const BSONObj& commandPassthroughFields, BSONObjBuilder* builder) const;
    OpMsgRequest serialize(const BSONObj& commandPassthroughFields) const;
    BSONObj toBSON(const BSONObj& commandPassthroughFields) const;

    const NamespaceString& getNamespace() const { return _nss; }

    bool getCapped() const { return _capped; }
    void setCapped(bool value) & {  _capped = std::move(value);  }

    const boost::optional<bool> getAutoIndexId() const& { return _autoIndexId; }
    void getAutoIndexId() && = delete;
    void setAutoIndexId(boost::optional<bool> value) & {  _autoIndexId = std::move(value);  }
...

\build\opt\mongo\db\commands\create_gen.cpp contains the CreateCommand parse methods, which turn a createCollection command into a CreateCommand object:

namespace mongo {

...
CreateCommand::CreateCommand(const NamespaceString nss) : _nss(std::move(nss)), _dbName(nss.db().toString()), _hasDbName(true) {
    // Used for initialization only
}


CreateCommand CreateCommand::parse(const IDLParserErrorContext& ctxt, const BSONObj& bsonObject) {
    NamespaceString localNS;
    CreateCommand object(localNS);
    object.parseProtected(ctxt, bsonObject);
    return object;
}

CreateCommand CreateCommand::parse(const IDLParserErrorContext& ctxt, const OpMsgRequest& request) {
    NamespaceString localNS;
    CreateCommand object(localNS);
    object.parseProtected(ctxt, request);
    return object;
}
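
Both overloads follow the same two-step pattern: construct the object with a placeholder namespace, then let parseProtected fill in the members. The body of parseProtected is not shown here, so the Python sketch below rests on assumptions: that the database name arrives in a `$db` field, and that `namespace: concatenate_with_db` joins it with the collection name taken from the first element. The class is a toy stand-in that models only two members.

```python
class CreateCommand:
    """Toy stand-in for the generated class; only nss and capped are modelled."""

    def __init__(self, nss=""):
        self.nss = nss        # placeholder until parsing fills it in
        self.capped = False   # default taken from the IDL

    def _parse_protected(self, doc):
        # Assumption: the database name is carried in the "$db" field.
        db_name = doc.get("$db")
        if db_name is None:
            raise ValueError("missing '$db' field")
        first_key = next(iter(doc))
        if first_key != "create" or not isinstance(doc[first_key], str):
            raise ValueError("first field must be 'create' with a collection name")
        # namespace: concatenate_with_db -> "<db>.<collection>"
        self.nss = f"{db_name}.{doc[first_key]}"
        if "capped" in doc:
            self.capped = bool(doc["capped"])

    @classmethod
    def parse(cls, doc):
        obj = cls("")         # same shape as `CreateCommand object(localNS);`
        obj._parse_protected(doc)
        return obj


cmd = CreateCommand.parse({"create": "users", "capped": True, "$db": "test"})
print(cmd.nss, cmd.capped)  # test.users True
```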

Summary: buildscripts/idl/idl/generator.py turns create.idl into create_gen.cpp and create_gen.h, which are then compiled into create_gen.obj. The resulting CreateCommand object encapsulates the createCollection command.
