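Preface
One thing that sets MongoDB apart from MySQL is its support for nested documents. A recent project of mine had a dialogue feature built on top of audio transcription results: each audio file corresponds to multiple rounds of dialogue, and both the audio data and the dialogue records are stored together in a single MongoDB document. The documents in the collection look roughly like this: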
{
    "_id": 23424234234324234,
    "audioId": 2689944,
    "contextId": "cht000d24ab@dx187d1168a449a4b540",
    "dialogues": [{
        "ask": "Is today Sunday?",
        "answer": "Yes",
        "createTime": 1697356990966
    }, {
        "ask": "You keep it up too",
        "answer": "Let's go!",
        "createTime": 1697378011483
    }, {
        "ask": "See you next week",
        "answer": "Bye-bye!",
        "createTime": 1697378072063
    }]
}
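The Java snippets in this post map such a document onto two classes, AITextResult and AIDialoguesResult, whose source is not shown. As a reading aid, here is a minimal sketch of what they might look like. The fields are an assumption inferred from the getters and setters the later code calls, so treat the shapes as hypothetical; the real classes may carry more fields (for example _id) or mapping annotations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape of one element of the "dialogues" array.
class AIDialoguesResult {
    private String ask;
    private String answer;
    private Long createTime; // epoch milliseconds

    public String getAsk() { return ask; }
    public void setAsk(String ask) { this.ask = ask; }
    public String getAnswer() { return answer; }
    public void setAnswer(String answer) { this.answer = answer; }
    public Long getCreateTime() { return createTime; }
    public void setCreateTime(Long createTime) { this.createTime = createTime; }
}

// Hypothetical shape of the top-level document (the _id field is omitted here).
class AITextResult {
    private Long audioId;
    private String contextId;
    private List<AIDialoguesResult> dialogues = new ArrayList<>();

    public Long getAudioId() { return audioId; }
    public void setAudioId(Long audioId) { this.audioId = audioId; }
    public String getContextId() { return contextId; }
    public void setContextId(String contextId) { this.contextId = contextId; }
    public List<AIDialoguesResult> getDialogues() { return dialogues; }
    public void setDialogues(List<AIDialoguesResult> dialogues) { this.dialogues = dialogues; }
}
```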
The following sections walk through a few simple operations used in that project.
Querying the length of a nested list
java
public Integer getDialoguesSize(Long audioId) {
    List<Document> pipeline = Arrays.asList(
            // Select the document for this audio, and only if it has a dialogues field:
            // $size raises an error when its argument is missing.
            new Document("$match",
                    new Document("audioId", audioId)
                            .append("dialogues", new Document("$exists", true))),
            // Project the array length into a field named datasSize.
            new Document("$project",
                    new Document("datasSize", new Document("$size", "$dialogues")))
    );
    Document document = generalCollection.aggregate(pipeline).first();
    // Fall back to 0 when no document matches.
    if (document == null) {
        return 0;
    }
    return document.getInteger("datasSize");
}
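Querying by a property of the nested list
The code below looks up the document for a given audioId and returns the entries of its dialogues array whose createTime is earlier than the supplied value, at most limit entries per page. It uses the Aggregates builders of the MongoDB Java driver, with unwind turning each array element into a document of its own. See the official MongoDB documentation for the details.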
java
public AIDialoguesResultDTO queryAiResult(Long audioId, Long createTime, Integer limit) {
    List<Bson> pipeline = Arrays.asList(
            Aggregates.match(Filters.eq("audioId", audioId)),
            // Emit one document per element of the dialogues array.
            Aggregates.unwind("$dialogues"),
            Aggregates.match(Filters.lt("dialogues.createTime", createTime)),
            // Newest first, so that limit keeps the entries closest to createTime.
            Aggregates.sort(Sorts.descending("dialogues.createTime")),
            Aggregates.limit(limit)
    );
    List<AIDialoguesResult> aiDialoguesResultList = new ArrayList<>();
    String contextId = Constant.EMPTY_STR;
    for (Document document : generalCollection.aggregate(pipeline)) {
        // After $unwind, dialogues holds a single embedded document.
        Document dialogue = document.get("dialogues", Document.class);
        AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
        aiDialoguesResult.setAnswer(dialogue.getString("answer"));
        aiDialoguesResult.setAsk(dialogue.getString("ask"));
        aiDialoguesResult.setCreateTime(dialogue.getLong("createTime"));
        aiDialoguesResultList.add(aiDialoguesResult);
        contextId = document.getString("contextId");
    }
    // Return the page in chronological order.
    aiDialoguesResultList.sort(Comparator.comparingLong(AIDialoguesResult::getCreateTime));
    AIDialoguesResultDTO aiDialoguesResultDTO = new AIDialoguesResultDTO();
    aiDialoguesResultDTO.setCount(aiDialoguesResultList.size());
    aiDialoguesResultDTO.setContextId(contextId);
    aiDialoguesResultDTO.setResult(aiDialoguesResultList);
    return aiDialoguesResultDTO;
}
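To make the stage order concrete, here is a minimal in-memory sketch of what the pipeline computes, using only the JDK. It does not call MongoDB at all: the list stands in for the unwound dialogues array, and the DialoguePageSketch and Dialogue names are hypothetical, introduced only for this illustration. It also shows how a caller might page backwards, which is an assumption about how createTime is meant to be used: pass the createTime of the oldest entry already shown as the next upper bound.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical in-memory model of the pipeline; no MongoDB driver involved.
class DialoguePageSketch {

    // Stand-in for one element of the "dialogues" array.
    static final class Dialogue {
        final String ask;
        final long createTime;

        Dialogue(String ask, long createTime) {
            this.ask = ask;
            this.createTime = createTime;
        }
    }

    // Mirrors match(lt) -> sort(descending) -> limit, then restores ascending order.
    static List<Dialogue> page(List<Dialogue> dialogues, long before, int limit) {
        List<Dialogue> newestFirst = dialogues.stream()
                .filter(d -> d.createTime < before)
                .sorted(Comparator.comparingLong((Dialogue d) -> d.createTime).reversed())
                .limit(limit)
                .collect(Collectors.toList());
        List<Dialogue> oldestFirst = new ArrayList<>(newestFirst);
        Collections.reverse(oldestFirst);
        return oldestFirst;
    }

    public static void main(String[] args) {
        List<Dialogue> all = Arrays.asList(
                new Dialogue("Is today Sunday?", 1697356990966L),
                new Dialogue("You keep it up too", 1697378011483L),
                new Dialogue("See you next week", 1697378072063L));

        // First page: the two newest entries.
        List<Dialogue> first = page(all, Long.MAX_VALUE, 2);
        // Next page: everything older than the oldest entry already shown.
        List<Dialogue> second = page(all, first.get(0).createTime, 2);

        first.forEach(d -> System.out.println(d.createTime + " " + d.ask));
        second.forEach(d -> System.out.println(d.createTime + " " + d.ask));
    }
}
```

Sorting newest first before applying the limit is what makes each page contain the entries closest to the bound. Sorting oldest first and then limiting would always return the start of the conversation instead.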
Of course, there is also a simpler way to write this:
java
public AIDialoguesResultDTO queryAiResultBackupVersion(Long audioId, Long createTime, Integer limit) {
    // Start from an empty result so that every early return is fully populated.
    AIDialoguesResultDTO aiDialoguesResultDTO = new AIDialoguesResultDTO();
    aiDialoguesResultDTO.setResult(Collections.emptyList());
    aiDialoguesResultDTO.setCount(0);
    aiDialoguesResultDTO.setContextId("");
    AITextResult aiTextResult = mongoDao.findSingle(eq("audioId", audioId), AITextResult.class);
    if (Objects.isNull(aiTextResult)) {
        // Return here; falling through would dereference null below.
        return aiDialoguesResultDTO;
    }
    aiDialoguesResultDTO.setContextId(aiTextResult.getContextId());
    List<AIDialoguesResult> aiDialoguesResultList = aiTextResult.getDialogues();
    if (CollectionUtils.isEmpty(aiDialoguesResultList)) {
        return aiDialoguesResultDTO;
    }
    List<AIDialoguesResult> page = aiDialoguesResultList.stream()
            .filter(t -> t.getCreateTime() < createTime)
            // Newest first, so that limit keeps the entries closest to createTime.
            .sorted(Comparator.comparingLong(AIDialoguesResult::getCreateTime).reversed())
            .limit(limit)
            // Back to chronological order for the caller.
            .sorted(Comparator.comparingLong(AIDialoguesResult::getCreateTime))
            .collect(Collectors.toList());
    aiDialoguesResultDTO.setCount(page.size());
    aiDialoguesResultDTO.setResult(page);
    return aiDialoguesResultDTO;
}
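This version is more direct: it fetches the document matching audioId, loads its entire dialogues array into memory, and then filters, sorts, and paginates in memory. Clearly, if the dialogues array is long, memory usage will be high.
Incrementally appending to a nested list
To append an element to the dialogues array, we can read the whole dialogues array for the given audioId, add one element to the end of the list, and save the document back. The code looks roughly like this: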
java
public void saveAiResult(SaveAIResultDTO saveAIResultDTO) {
    Long audioId = saveAIResultDTO.getAudioId();
    Bson filter = Filters.eq("audioId", audioId);
    AITextResult aiTextResult = mongoDao.findSingle(filter, AITextResult.class);
    if (Objects.isNull(aiTextResult)) {
        // First dialogue for this audio: create the document.
        aiTextResult = AITextResult.buildAiTextResult(saveAIResultDTO);
        mongoDao.saveOrUpdate(aiTextResult);
        return;
    }
    // Load the whole list, append one entry, and write the document back.
    List<AIDialoguesResult> aiDialoguesResults = aiTextResult.getDialogues();
    AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
    aiDialoguesResult.setCreateTime(System.currentTimeMillis());
    aiDialoguesResult.setAsk(saveAIResultDTO.getAsk());
    aiDialoguesResult.setAnswer(saveAIResultDTO.getAnswer());
    aiDialoguesResults.add(aiDialoguesResult);
    aiTextResult.setDialogues(aiDialoguesResults);
    mongoDao.saveOrUpdate(aiTextResult);
}
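There is nothing wrong with this approach as such, but if the dialogues array is large, reading the whole array for every append can use a lot of memory. MongoDB's $push operator lets us append directly instead: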
java
public void saveAiResultIncremental(SaveAIResultDTO saveAIResultDTO) {
    Long audioId = saveAIResultDTO.getAudioId();
    Bson filter = Filters.eq("audioId", audioId);
    // Existence check only: project one small field instead of loading dialogues.
    Bson projection = Projections.fields(Projections.include("contextId"), Projections.excludeId());
    if (generalCollection.find(filter).projection(projection).first() == null) {
        // First dialogue for this audio: create the document.
        // Note that this check-then-insert is not atomic.
        mongoDao.saveOrUpdate(AITextResult.buildAiTextResult(saveAIResultDTO));
        return;
    }
    AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
    aiDialoguesResult.setCreateTime(System.currentTimeMillis());
    aiDialoguesResult.setAsk(saveAIResultDTO.getAsk());
    aiDialoguesResult.setAnswer(saveAIResultDTO.getAnswer());
    // $push appends on the server side. Encoding the POJO requires a codec for
    // AIDialoguesResult to be registered on the collection.
    generalCollection.updateOne(filter, Updates.push("dialogues", aiDialoguesResult));
}
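To get a feel for the difference between the two approaches, here is a rough in-memory sketch that counts how many array elements cross the boundary between application and store in each case. It is a toy model using only the JDK, not a measurement of MongoDB: the AppendCostSketch class and its methods are hypothetical, and it assumes that saveOrUpdate writes the whole document back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one stored document; not the MongoDB driver.
class AppendCostSketch {

    private final List<String> stored = new ArrayList<>();
    private long elementsTransferred = 0;

    // Read the whole array, as the read-modify-write version does.
    List<String> readAll() {
        elementsTransferred += stored.size();
        return new ArrayList<>(stored);
    }

    // Write the whole array back (assumed behavior of a full-document save).
    void replaceAll(List<String> dialogues) {
        elementsTransferred += dialogues.size();
        stored.clear();
        stored.addAll(dialogues);
    }

    // Append a single element in place, modelling what $push does on the server.
    void push(String dialogue) {
        elementsTransferred += 1;
        stored.add(dialogue);
    }

    // Total elements moved when every append reads and rewrites the array.
    static long readModifyWrite(int appends) {
        AppendCostSketch store = new AppendCostSketch();
        for (int i = 0; i < appends; i++) {
            List<String> all = store.readAll();
            all.add("dialogue-" + i);
            store.replaceAll(all);
        }
        return store.elementsTransferred;
    }

    // Total elements moved when every append is a single push.
    static long pushOnly(int appends) {
        AppendCostSketch store = new AppendCostSketch();
        for (int i = 0; i < appends; i++) {
            store.push("dialogue-" + i);
        }
        return store.elementsTransferred;
    }

    public static void main(String[] args) {
        System.out.println(readModifyWrite(1000)); // prints 1000000
        System.out.println(pushOnly(1000));        // prints 1000
    }
}
```

In this model the read-modify-write cost grows with the square of the number of appends, while the push cost grows linearly, which is why the gap widens as the dialogues array gets longer.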
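Summary
Having chosen MongoDB, you should not keep writing queries in the MySQL style. Learn to take advantage of MongoDB's own features; otherwise the results often fall short of expectations.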