Working with Nested Lists in MongoDB

Preface

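One way MongoDB differs from MySQL is that it supports nested documents. A recent project of ours had a dialogue feature built on audio transcription results: one audio file corresponds to multiple rounds of dialogue, and both the audio data and the dialogue records are stored in a single MongoDB document. The documents in the collection look roughly like this: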

{
    "_id": 23424234234324234,
    "audioId": 2689944,
    "contextId": "cht000d24ab@dx187d1168a449a4b540",
    "dialogues": [{
        "ask": "Is today Sunday?",
        "answer": "Yes",
        "createTime": 1697356990966
    }, {
        "ask": "You keep it up too",
        "answer": "Let's go!",
        "createTime": 1697378011483
    }, {
        "ask": "See you next week",
        "answer": "Bye!",
        "createTime": 1697378072063
    }]
}
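
The Java snippets in this article refer to two mapping classes, AITextResult and AIDialoguesResult, whose source is not shown. As a rough guide, they might look something like the sketch below. This assumes plain POJOs whose fields are named after the document keys; the class bodies are hypothetical and are not the project's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mapping for one element of the "dialogues" array.
class AIDialoguesResult {
    private String ask;
    private String answer;
    private Long createTime; // epoch milliseconds

    public String getAsk() { return ask; }
    public void setAsk(String ask) { this.ask = ask; }
    public String getAnswer() { return answer; }
    public void setAnswer(String answer) { this.answer = answer; }
    public Long getCreateTime() { return createTime; }
    public void setCreateTime(Long createTime) { this.createTime = createTime; }
}

// Hypothetical mapping for the whole document.
class AITextResult {
    private Long audioId;
    private String contextId;
    // Starts empty so callers can append without a null check.
    private List<AIDialoguesResult> dialogues = new ArrayList<>();

    public Long getAudioId() { return audioId; }
    public void setAudioId(Long audioId) { this.audioId = audioId; }
    public String getContextId() { return contextId; }
    public void setContextId(String contextId) { this.contextId = contextId; }
    public List<AIDialoguesResult> getDialogues() { return dialogues; }
    public void setDialogues(List<AIDialoguesResult> dialogues) { this.dialogues = dialogues; }
}
```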

Below are a few simple operations we used in this project.

Querying the Length of a Nested List

    // Imports: org.bson.Document; java.util.Arrays; java.util.List
    // Assumes generalCollection is a MongoCollection<Document>.
    public Integer getDialoguesSize(Long audioId) {
        List<Document> pipeline = Arrays.asList(
                // Match the target audio and require dialogues to exist,
                // because $size raises an error when its argument is missing.
                new Document("$match",
                        new Document("audioId", audioId)
                                .append("dialogues", new Document("$exists", true))),
                // Project only the array length.
                new Document("$project",
                        new Document("datasSize", new Document("$size", "$dialogues")))
        );
        Document document = generalCollection.aggregate(pipeline).first();
        // No matching document means there are no dialogues to count.
        return document == null ? 0 : document.getInteger("datasSize", 0);
    }

Querying by a Field Inside the Nested List

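The code below returns the entries of dialogues for a given audioId whose createTime is less than the supplied value, paged with limit. It uses MongoDB's Aggregates builders and the $unwind stage to build the aggregation; for usage details, see the official MongoDB documentation.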

    // Imports: com.mongodb.client.model.Aggregates, Filters, Sorts;
    //          org.bson.Document; org.bson.conversions.Bson; java.util.*
    public AIDialoguesResultDTO queryAiResult(Long audioId, Long createTime, Integer limit) {
        List<Bson> pipeline = Arrays.asList(
                Aggregates.match(Filters.eq("audioId", audioId)),
                // Emit one document per element of dialogues.
                Aggregates.unwind("$dialogues"),
                Aggregates.match(Filters.lt("dialogues.createTime", createTime)),
                // Newest first, so that limit keeps the entries closest to createTime.
                Aggregates.sort(Sorts.descending("dialogues.createTime")),
                Aggregates.limit(limit)
        );

        List<AIDialoguesResult> aiDialoguesResultList = new ArrayList<>();
        String contextId = Constant.EMPTY_STR;
        for (Document document : generalCollection.aggregate(pipeline)) {
            // After $unwind, dialogues holds a single sub-document.
            Document dialogue = document.get("dialogues", Document.class);
            AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
            aiDialoguesResult.setAnswer(dialogue.getString("answer"));
            aiDialoguesResult.setAsk(dialogue.getString("ask"));
            aiDialoguesResult.setCreateTime(dialogue.getLong("createTime"));
            aiDialoguesResultList.add(aiDialoguesResult);
            contextId = document.getString("contextId");
        }

        // Return the page in chronological order.
        aiDialoguesResultList.sort(Comparator.comparingLong(AIDialoguesResult::getCreateTime));

        AIDialoguesResultDTO aiDialoguesResultDTO = new AIDialoguesResultDTO();
        aiDialoguesResultDTO.setCount(aiDialoguesResultList.size());
        aiDialoguesResultDTO.setContextId(contextId);
        aiDialoguesResultDTO.setResult(aiDialoguesResultList);
        return aiDialoguesResultDTO;
    }
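
To make the order of the stages concrete, here is a small sketch that mimics the same steps (filter by createTime, sort newest first, limit, then re-sort oldest first) on an in-memory list using only the JDK. It does not talk to MongoDB; Dialogue, page, and sample are names made up for this illustration, and the timestamps are the createTime values from the sample document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class DialoguePagingSketch {

    // Illustrative stand-in for one unwound "dialogues" element.
    static class Dialogue {
        final String ask;
        final long createTime;

        Dialogue(String ask, long createTime) {
            this.ask = ask;
            this.createTime = createTime;
        }
    }

    // Mirrors the pipeline: createTime < cursor, sort descending, limit,
    // then re-sort ascending so the page reads in chronological order.
    static List<Dialogue> page(List<Dialogue> dialogues, long cursor, int limit) {
        List<Dialogue> newestFirst = dialogues.stream()
                .filter(d -> d.createTime < cursor)
                .sorted(Comparator.comparingLong((Dialogue d) -> d.createTime).reversed())
                .limit(limit)
                .collect(Collectors.toList());
        List<Dialogue> result = new ArrayList<>(newestFirst);
        result.sort(Comparator.comparingLong(d -> d.createTime));
        return result;
    }

    // The three createTime values from the sample document.
    static List<Dialogue> sample() {
        return Arrays.asList(
                new Dialogue("Is today Sunday?", 1697356990966L),
                new Dialogue("You keep it up too", 1697378011483L),
                new Dialogue("See you next week", 1697378072063L));
    }

    public static void main(String[] args) {
        // Ask for up to two dialogues strictly older than the last one.
        for (Dialogue d : page(sample(), 1697378072063L, 2)) {
            System.out.println(d.createTime + " " + d.ask);
        }
    }
}
```

With the last timestamp as the cursor and a limit of 2, the sketch returns the first two dialogues in chronological order. A caller could then pass the smallest createTime of a page as the next cursor to walk further back in the history, assuming createTime values are unique within one audio.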

Of course, there is also a simpler way to write this:

    // Imports: com.mongodb.client.model.Filters; java.util.*; java.util.stream.Collectors
    public AIDialoguesResultDTO queryAiResultBackupVersion(Long audioId, Long createTime, Integer limit) {
        AITextResult aiTextResult = mongoDao.findSingle(Filters.eq("audioId", audioId), AITextResult.class);
        AIDialoguesResultDTO aiDialoguesResultDTO = new AIDialoguesResultDTO();
        if (Objects.isNull(aiTextResult)) {
            aiDialoguesResultDTO.setResult(Collections.emptyList());
            aiDialoguesResultDTO.setCount(0);
            aiDialoguesResultDTO.setContextId("");
            // Return here; otherwise the getDialogues() call below throws a NullPointerException.
            return aiDialoguesResultDTO;
        }
        List<AIDialoguesResult> aiDialoguesResultList = aiTextResult.getDialogues();
        if (CollectionUtils.isEmpty(aiDialoguesResultList)) {
            return aiDialoguesResultDTO;
        }

        // Keep entries older than createTime, newest first, and take one page.
        List<AIDialoguesResult> afterFilterAiDialoguesResultList = aiDialoguesResultList.stream()
                .filter(t -> t.getCreateTime() < createTime)
                .sorted(Comparator.comparingLong(AIDialoguesResult::getCreateTime).reversed())
                .limit(limit)
                .collect(Collectors.toList());

        // Return the page in chronological order.
        afterFilterAiDialoguesResultList = afterFilterAiDialoguesResultList.stream()
                .sorted(Comparator.comparingLong(AIDialoguesResult::getCreateTime))
                .collect(Collectors.toList());

        aiDialoguesResultDTO.setCount(afterFilterAiDialoguesResultList.size());
        aiDialoguesResultDTO.setResult(afterFilterAiDialoguesResultList);
        aiDialoguesResultDTO.setContextId(aiTextResult.getContextId());
        return aiDialoguesResultDTO;
    }

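This version is more direct: it looks up the document by audioId, loads its entire dialogues array into memory, then sorts and pages the entries in memory. Clearly, if the dialogues array is long, memory usage will be high.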

Appending to a Nested List Incrementally

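To append an element to the dialogues array, we could fetch all the dialogues for the audioId, add an element to the end of the list, and save the document back. The code looks roughly like this: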

    // Imports: com.mongodb.client.model.Filters; org.bson.conversions.Bson; java.util.*
    public void saveAiResult(SaveAIResultDTO saveAIResultDTO) {
        Long audioId = saveAIResultDTO.getAudioId();
        Bson filter = Filters.eq("audioId", audioId);
        AITextResult aiTextResult = mongoDao.findSingle(filter, AITextResult.class);
        if (Objects.isNull(aiTextResult)) {
            // First dialogue for this audio: create the document.
            aiTextResult = AITextResult.buildAiTextResult(saveAIResultDTO);
            mongoDao.saveOrUpdate(aiTextResult);
            return;
        }
        // Read the whole array, append one element, and write the document back.
        List<AIDialoguesResult> aiDialoguesResults = aiTextResult.getDialogues();
        AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
        aiDialoguesResult.setCreateTime(System.currentTimeMillis());
        aiDialoguesResult.setAsk(saveAIResultDTO.getAsk());
        aiDialoguesResult.setAnswer(saveAIResultDTO.getAnswer());
        aiDialoguesResults.add(aiDialoguesResult);
        aiTextResult.setDialogues(aiDialoguesResults);
        mongoDao.saveOrUpdate(aiTextResult);
    }

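There is nothing wrong with this approach in itself, but when the dialogues array is large, loading all of it for every append can use a lot of memory. Instead, we can use MongoDB's $push operator to append directly: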

    // Imports: com.mongodb.client.model.Filters, Projections, Updates;
    //          org.bson.Document; org.bson.conversions.Bson
    public void saveAiResultIncremental(SaveAIResultDTO saveAIResultDTO) {
        Long audioId = saveAIResultDTO.getAudioId();
        Bson filter = Filters.eq("audioId", audioId);
        // Existence check only: project one small field instead of loading dialogues.
        Bson projection = Projections.fields(Projections.include("contextId"), Projections.excludeId());
        Document existing = generalCollection.find(filter).projection(projection).first();

        if (existing == null) {
            // First dialogue for this audio: create the document.
            mongoDao.saveOrUpdate(AITextResult.buildAiTextResult(saveAIResultDTO));
            return;
        }

        AIDialoguesResult aiDialoguesResult = new AIDialoguesResult();
        aiDialoguesResult.setCreateTime(System.currentTimeMillis());
        aiDialoguesResult.setAsk(saveAIResultDTO.getAsk());
        aiDialoguesResult.setAnswer(saveAIResultDTO.getAnswer());
        // $push appends on the server without reading the array back.
        // Assumes the collection's codec registry can encode AIDialoguesResult.
        generalCollection.updateOne(filter, Updates.push("dialogues", aiDialoguesResult));
    }

Summary

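Having chosen MongoDB, we should not keep writing queries in the MySQL style. Learn to use MongoDB's own features; otherwise the results will often fall short of expectations.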
