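In day-to-day MongoDB development, $in is one of the most commonly used query operators; it matches documents whose field value appears in a given array. Many developers wonder how many values the array passed to $in can actually hold, and whether a hard upper limit exists. This article answers the question from four angles: MongoDB's built-in limits, an experiment, clues in the driver source code, and best practices.
1. The $in argument limit is not a fixed number
In MongoDB 4.x there is no explicit hard limit on the length of the query argument, and MongoDB does not define a separate "maximum number of array elements" for the $in operator. The $in argument is, however, subject to two global constraints that together determine how many values can actually be passed:
- BSON document size limit: 16MB by default (16 * 1024 * 1024 bytes). MongoDB stores and transmits data in BSON (Binary JSON), and by design a single BSON document can be at most 16MB. So no matter how complex the query conditions are, the BSON representation of the whole query must stay within this limit (section 1.2 shows the exact value the Java driver checks).
- Memory used by the query conditions: MongoDB loads the query conditions into memory, so a very large array consumes a lot of memory and can cause performance problems or even query failure.
1.1 Where in the source code is the 16MB limit defined?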
com.mongodb.internal.connection.MessageSettings
java
@Immutable
public final class MessageSettings {
    private static final int DEFAULT_MAX_DOCUMENT_SIZE = 0x1000000; // 16MB
    private static final int DEFAULT_MAX_MESSAGE_SIZE = 0x2000000; // 32MB
    private static final int DEFAULT_MAX_BATCH_COUNT = 1000;

    private final int maxDocumentSize;
    private final int maxMessageSize;
    private final int maxBatchCount;
    private final int maxWireVersion;
    private final ServerType serverType;

    /**
     * Gets the builder
     *
     * @return the builder
     */
    public static Builder builder() {
        return new Builder();
    }

    /**
     * A MessageSettings builder.
     */
    @NotThreadSafe
    public static final class Builder {
        private int maxDocumentSize = DEFAULT_MAX_DOCUMENT_SIZE;
        private int maxMessageSize = DEFAULT_MAX_MESSAGE_SIZE;
        private int maxBatchCount = DEFAULT_MAX_BATCH_COUNT;
        private int maxWireVersion;
        private ServerType serverType;
        ...
    }
1.2 The size validation in the source code
In com.mongodb.internal.connection.RequestMessage, the size that is actually validated is not 16MB but 16MB + 16KB; the extra 16KB is headroom.
java
abstract class RequestMessage {
    static final AtomicInteger REQUEST_ID = new AtomicInteger(1);
    static final int MESSAGE_PROLOGUE_LENGTH = 16;
    // Allow an extra 16K to the maximum allowed size of a query or command document, so that, for example,
    // a 16M document can be upserted via findAndModify
    private static final int DOCUMENT_HEADROOM = 16 * 1024;
    ...
    protected void addDocument(final BsonDocument document, final BsonOutput bsonOutput,
                               final FieldNameValidator validator) {
        addDocument(document, getCodec(document), EncoderContext.builder().build(), bsonOutput, validator,
                settings.getMaxDocumentSize() + DOCUMENT_HEADROOM, null);
    }

    protected void addDocument(final BsonDocument document, final BsonOutput bsonOutput,
                               final FieldNameValidator validator, final List<BsonElement> extraElements) {
        addDocument(document, getCodec(document), EncoderContext.builder().build(), bsonOutput, validator,
                settings.getMaxDocumentSize() + DOCUMENT_HEADROOM, extraElements);
    }
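As a quick check on the numbers, the sketch below simply adds the two constants quoted above. The class and method names are illustrative, and the constants are copied by hand rather than read from the driver.

```java
// Sketch only: reproduces the arithmetic of the driver's size check
// with hand-copied constants; EffectiveLimit is a hypothetical name.
final class EffectiveLimit {
    static final int DEFAULT_MAX_DOCUMENT_SIZE = 0x1000000; // 16 MB = 16,777,216 bytes
    static final int DOCUMENT_HEADROOM = 16 * 1024;         // 16 KB = 16,384 bytes

    // Largest encoded size accepted for a query or command document.
    static int effectiveLimit() {
        return DEFAULT_MAX_DOCUMENT_SIZE + DOCUMENT_HEADROOM;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimit()); // prints 16793600
    }
}
```

The result, 16,793,600, is the same maximum that appears in the BsonMaximumSizeExceededException message in section 3.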
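2. A naive estimate of the $in upper limit
To keep behavior consistent across platforms, Java fixes the size of its primitive types: int is always 4 bytes, long is always 8 bytes, and char is always 2 bytes (to support Unicode).
The BSON document size limit is 16MB, which is 16 * 1024 * 1024 = 16,777,216 bytes.
At 4 bytes per int: 16,777,216 / 4 = 4,194,304 values.
At 8 bytes per long: 16,777,216 / 8 = 2,097,152 values.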
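The same arithmetic as a minimal sketch. The names are illustrative, and the estimate deliberately assumes that a value costs nothing beyond its raw primitive size.

```java
// Sketch only: the naive estimate, assuming each value costs exactly
// its primitive size. NaiveEstimate is a hypothetical name.
final class NaiveEstimate {
    static final int MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // 16,777,216

    // How many values of the given raw size fit into 16 MB.
    static int naiveCount(int bytesPerValue) {
        return MAX_DOCUMENT_BYTES / bytesPerValue;
    }

    public static void main(String[] args) {
        System.out.println(naiveCount(Integer.BYTES)); // prints 4194304
        System.out.println(naiveCount(Long.BYTES));    // prints 2097152
    }
}
```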
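3. Verifying the upper limit for int values
Test code:
java
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.mongodb.BasicDBObject;

import java.util.ArrayList;
import java.util.List;

// MongoDao and TransactionModule are classes from the author's own project (not shown here).
public class Test {
    public static void main(String[] args) throws Exception {
        Injector injector = Guice.createInjector(new TransactionModule());
        MongoDao mongoDao = injector.getInstance(MongoDao.class);
        BasicDBObject query = new BasicDBObject();
        // Start just below the expected limit.
        List<Integer> value = new ArrayList<Integer>();
        for (int i = 0; i < 1377250; i++) {
            value.add(i + 1);
        }
        int size = value.size();
        // Grow the list by 10 values per round until the driver rejects the query.
        while (true) {
            for (int i = 0; i < 10; i++) {
                value.add(++size);
            }
            System.out.println(value.size());
            // append() overwrites the previous "id" entry, so the query always holds a single $in.
            query.append("id", new BasicDBObject("$in", value));
            mongoDao.query("tianlong8bu", "users", query, null);
        }
    }
}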
Result: the query fails with a size-limit error once the list reaches 1,377,280 values.
java
1377260
1377270
1377280
Exception in thread "main" org.bson.BsonMaximumSizeExceededException: Document size of 16793640 is larger than maximum of 16793600.
at org.bson.BsonBinaryWriter.validateSize(BsonBinaryWriter.java:418)
at org.bson.BsonBinaryWriter.backpatchSize(BsonBinaryWriter.java:412)
at org.bson.BsonBinaryWriter.doWriteEndDocument(BsonBinaryWriter.java:133)
at org.bson.AbstractBsonWriter.writeEndDocument(AbstractBsonWriter.java:307)
at com.mongodb.internal.connection.BsonWriterDecorator.writeEndDocument(BsonWriterDecorator.java:53)
at com.mongodb.internal.connection.LevelCountingBsonWriter.writeEndDocument(LevelCountingBsonWriter.java:48)
at com.mongodb.internal.connection.ElementExtendingBsonWriter.writeEndDocument(ElementExtendingBsonWriter.java:43)
at org.bson.codecs.BsonDocumentCodec.encode(BsonDocumentCodec.java:121)
at org.bson.codecs.BsonDocumentCodec.encode(BsonDocumentCodec.java:42)
at com.mongodb.internal.connection.RequestMessage.addDocument(RequestMessage.java:238)
at com.mongodb.internal.connection.RequestMessage.addDocument(RequestMessage.java:188)
at com.mongodb.internal.connection.CommandMessage.encodeMessageBodyWithMetadata(CommandMessage.java:161)
at com.mongodb.internal.connection.RequestMessage.encode(RequestMessage.java:138)
at com.mongodb.internal.connection.CommandMessage.encode(CommandMessage.java:62)
at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:308)
at com.mongodb.internal.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:114)
at com.mongodb.internal.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:601)
at com.mongodb.internal.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:81)
at com.mongodb.internal.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:249)
at com.mongodb.internal.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:214)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:123)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:113)
at com.mongodb.internal.operation.CommandOperationHelper.executeCommand(CommandOperationHelper.java:328)
at com.mongodb.internal.operation.CommandOperationHelper.executeCommand(CommandOperationHelper.java:318)
at com.mongodb.internal.operation.CommandOperationHelper.executeCommandWithConnection(CommandOperationHelper.java:201)
at com.mongodb.internal.operation.FindOperation$1.call(FindOperation.java:659)
at com.mongodb.internal.operation.FindOperation$1.call(FindOperation.java:653)
at com.mongodb.internal.operation.OperationHelper.withReadConnectionSource(OperationHelper.java:583)
at com.mongodb.internal.operation.FindOperation.execute(FindOperation.java:653)
at com.mongodb.internal.operation.FindOperation.execute(FindOperation.java:81)
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:184)
at com.mongodb.client.internal.MongoIterableImpl.execute(MongoIterableImpl.java:135)
at com.mongodb.client.internal.MongoIterableImpl.iterator(MongoIterableImpl.java:92)
at com.mongodb.client.internal.MongoIterableImpl.cursor(MongoIterableImpl.java:97)
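The limit the driver enforces is 16MB plus 16KB of headroom, that is 16 * 1024 * 1025 = 16,793,600 bytes, and 16,793,600 / 1,377,280 ≈ 12 bytes. So each int takes about 12 bytes on average, far from the 4 bytes assumed above: a gap of 8 bytes. Where do those 8 bytes go?
In a second experiment I replaced int with Long and found that each Long takes about 16 bytes on average, although a long is supposed to be 8 bytes. Again the gap is 8 bytes. Where do they go?
4. Analyzing the IntegerCodec encoding logic

The MongoDB Java driver provides a corresponding codec for each basic type; for int it is IntegerCodec.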
java
public class IntegerCodec implements Codec<Integer> {
    @Override
    public void encode(final BsonWriter writer, final Integer value, final EncoderContext encoderContext) {
        writer.writeInt32(value);
    }

    @Override
    public Integer decode(final BsonReader reader, final DecoderContext decoderContext) {
        return decodeInt(reader);
    }

    @Override
    public Class<Integer> getEncoderClass() {
        return Integer.class;
    }
}
BsonWriterDecorator
java
@Override
public void writeInt32(final int value) {
    bsonWriter.writeInt32(value);
}
AbstractBsonWriter
java
@Override
public void writeInt32(final int value) {
    checkPreconditions("writeInt32", State.VALUE);
    doWriteInt32(value);
    setState(getNextState());
}
BsonType
java
/**
* A BSON 32-bit integer.
*/
INT32(0x10),
/**
* A BSON timestamp.
*/
TIMESTAMP(0x11),
/**
* A BSON 64-bit integer.
*/
INT64(0x12),
/**
* A BSON Decimal128.
*
* @since 3.4
*/
DECIMAL128(0x13),
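The key is doWriteInt32 in BsonBinaryWriter: it writes more than just the int value, which is most likely where the extra 8 bytes come from.
5. The $in array is not a plain array but a document disguised as one
A BSON array is not a plain list of values but a document disguised as an array: each element is stored as a key-value pair whose key is the element's index written as a string (for example "0", "1", ..., "1377279"). Writing one int element takes three steps:
Step 1: write the type tag of the value; for int this is 0x10 (1 byte).
Step 2: writeCurrentName() writes the array index as the key name. The 16th element has the index string "15" and the 1,377,280th has "1377279". The string is followed by a terminating 0 byte.
Step 3: bsonOutput.writeInt32 writes the value itself (4 bytes).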
java
@Override
protected void doWriteInt32(final int value) {
    bsonOutput.writeByte(BsonType.INT32.getValue());
    writeCurrentName();
    bsonOutput.writeInt32(value);
}

private void writeCurrentName() {
    if (getContext().getContextType() == BsonContextType.ARRAY) {
        bsonOutput.writeCString(Integer.toString(getContext().index++));
    } else {
        bsonOutput.writeCString(getName());
    }
}
OutputBuffer
java
@Override
public void writeInt32(final int value) {
    write(value >> 0);
    write(value >> 8);
    write(value >> 16);
    write(value >> 24);
}

private int writeCharacters(final String str, final boolean checkForNullCharacters) {
    int len = str.length();
    int total = 0;
    for (int i = 0; i < len;) {
        int c = Character.codePointAt(str, i);
        if (checkForNullCharacters && c == 0x0) {
            throw new BsonSerializationException(format("BSON cstring '%s' is not valid because it contains a null character "
                    + "at index %d", str, i));
        }
        if (c < 0x80) {
            write((byte) c);
            total += 1;
        } else if (c < 0x800) {
            write((byte) (0xc0 + (c >> 6)));
            write((byte) (0x80 + (c & 0x3f)));
            total += 2;
        } else if (c < 0x10000) {
            write((byte) (0xe0 + (c >> 12)));
            write((byte) (0x80 + ((c >> 6) & 0x3f)));
            write((byte) (0x80 + (c & 0x3f)));
            total += 3;
        } else {
            write((byte) (0xf0 + (c >> 18)));
            write((byte) (0x80 + ((c >> 12) & 0x3f)));
            write((byte) (0x80 + ((c >> 6) & 0x3f)));
            write((byte) (0x80 + (c & 0x3f)));
            total += 4;
        }
        i += Character.charCount(c);
    }
    write((byte) 0);
    total++;
    return total;
}
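To make the layout concrete, here is a minimal sketch that hand-encodes a tiny int array in the same shape: type tag, index key as a null-terminated string, then the value in little-endian order. It uses only the JDK and is not the driver's code; TinyBsonArray and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Sketch only: hand-rolled encoder for a BSON array of int32 values.
final class TinyBsonArray {
    // Layout: int32 total length, then per element
    // [0x10 type tag][index digits][0x00][int32 little-endian], then a final 0x00.
    static byte[] encode(List<Integer> values) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        int index = 0;
        for (int v : values) {
            body.write(0x10); // type tag for int32
            byte[] key = Integer.toString(index++).getBytes(StandardCharsets.US_ASCII);
            body.write(key, 0, key.length); // index string, e.g. "0"
            body.write(0);                  // cstring terminator
            writeInt32(body, v);            // the 4-byte value
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeInt32(out, 4 + body.size() + 1); // length counts itself and the terminator
        byte[] b = body.toByteArray();
        out.write(b, 0, b.length);
        out.write(0); // document terminator
        return out.toByteArray();
    }

    // Least significant byte first; write(int) keeps only the low 8 bits.
    static void writeInt32(ByteArrayOutputStream out, int value) {
        out.write(value);
        out.write(value >> 8);
        out.write(value >> 16);
        out.write(value >> 24);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encode(List.of(7, 8))));
    }
}
```

For the two values 7 and 8 the result is 19 bytes: 4 for the length, 7 per element (1 tag + 1 digit + 1 terminator + 4 value), and 1 for the closing byte. Only 8 of those 19 bytes are the int values themselves.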
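An int value itself takes 4 bytes. On top of that, every element carries a type tag (0x10 for int, 1 byte) and the array index key (the digits of the index plus a terminating byte):
| Element ordinal range | Index string | Overhead (type tag + index digits + terminator) | Elements in range |
|---|---|---|---|
| 1-10 | "0"-"9" | 1+1+1=3 bytes | 10 |
| 11-100 | "10"-"99" | 1+2+1=4 bytes | 90 |
| 101-1000 | "100"-"999" | 1+3+1=5 bytes | 900 |
| 1001-10000 | "1000"-"9999" | 1+4+1=6 bytes | 9000 |
| 10001-100000 | "10000"-"99999" | 1+5+1=7 bytes | 90000 |
| 100001-1000000 | "100000"-"999999" | 1+6+1=8 bytes | 900000 |
| 1000001-1377280 | "1000000"-"1377279" | 1+7+1=9 bytes | 377280 |

Weighted over all 1,377,280 elements, the overhead averages about 8.2 bytes per element, which accounts for the 8-byte gap observed in section 3.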
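The per-range accounting in the table can be turned into a small estimator. This is a sketch under assumptions: it models only the array itself (length prefix, elements, closing byte) and ignores the enclosing filter document and the other fields of the find command, so the real ceiling is somewhat lower. InArraySizeEstimator and its methods are hypothetical names.

```java
// Sketch only: estimates the encoded size of a BSON array of fixed-width values.
final class InArraySizeEstimator {
    // Encoded size in bytes of an array with n values
    // (valueBytes = 4 for int32, 8 for int64).
    static long arraySize(long n, int valueBytes) {
        long total = 4 + 1;   // int32 length prefix + closing 0x00
        long rangeStart = 0;  // first index with the current digit count
        long rangeEnd = 10;   // exclusive end of that range
        int digits = 1;
        while (rangeStart < n) {
            long count = Math.min(n, rangeEnd) - rangeStart;
            // type tag + index digits + terminator + value
            total += count * (1 + digits + 1 + valueBytes);
            rangeStart = rangeEnd;
            rangeEnd *= 10;
            digits++;
        }
        return total;
    }

    // Largest n whose array still fits into limit bytes (binary search).
    static long maxElements(long limit, int valueBytes) {
        long lo = 0;
        long hi = limit; // every element needs more than one byte
        while (lo < hi) {
            long mid = lo + (hi - lo + 1) / 2;
            if (arraySize(mid, valueBytes) <= limit) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        long limit = 16 * 1024 * 1024 + 16 * 1024; // 16,793,600
        System.out.println(maxElements(limit, 4));
        System.out.println(maxElements(limit, 8));
    }
}
```

With the 16,793,600-byte limit, the estimate for 4-byte values comes out a few elements above the point where the experiment failed, which is expected because the bytes of the surrounding command are not counted here.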
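6. Summary
MongoDB sets no fixed limit on the number of $in arguments; the real constraint is the 16MB default BSON document limit (defined in the driver's MessageSettings, with the actual check allowing an extra 16KB of headroom). A naive estimate based on Java primitive sizes (4 bytes per int, 8 bytes per Long) suggests room for about 4.19 million ints or 2.09 million Longs, but in the experiment an int took about 12 bytes and a Long about 16 bytes on average, 8 bytes more in both cases. The reason is that every BSON array element consists of a type tag (1 byte), the index key written as a string (its digits plus a terminating byte, about 7 bytes on average at the million-element scale), and the value itself. When estimating the limit, take this encoding overhead into account so that an oversized list does not make the query fail.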
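One way to act on this, offered here as an assumption rather than something tested in the experiment, is to split a long value list into batches and run one $in query per batch. The helper below is plain Java. InBatches is a hypothetical name, the batch size is left to the caller, and issuing the queries and merging the results is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: splits a value list into consecutive batches of bounded size.
final class InBatches {
    static <T> List<List<T>> partition(List<T> values, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        int start = 0;
        while (start < values.size()) {
            int end = start + Math.min(batchSize, values.size() - start);
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(values.subList(start, end)));
            start = end;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            ids.add(i);
        }
        // 25 values in batches of 10 give batches of 10, 10 and 5 values.
        for (List<Integer> batch : partition(ids, 10)) {
            System.out.println(batch.size());
        }
    }
}
```

Each batch would then be passed as the $in array of its own query, keeping every request well below the encoded size limit.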