How to Count a Kafka Topic's Messages Over One Day

Background

Sometimes we want to know how many messages a topic received over one day. When monitoring is incomplete, how can we get that number?

Java Implementation

We can implement this ourselves on top of the official Kafka client.

First, add the client dependency:

xml
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>3.5.0</version>
        </dependency>

The implementation:

java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeTopicsResult;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.TopicPartitionInfo;

public class TopicDailyMessageCount {

    public static void main(String[] args) {

        String bootstrapServers = "kafka-小奏技术-001.com:9092,kafka-小奏技术-002.com:9092,kafka-小奏技术-003.com:9092";
        String topicName = "小奏技术-topic";

        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);

        try (AdminClient adminClient = AdminClient.create(props)) {
            long endTime = System.currentTimeMillis();
            // 24 hours ago
            long startTime = endTime - 24 * 60 * 60 * 1000L;

            // Get topic partitions
            List<TopicPartition> partitions = getTopicPartitions(adminClient, topicName);

            // Get offsets at the start time
            Map<TopicPartition, Long> startOffsets =
                    getOffsets(adminClient, partitions, OffsetSpec.forTimestamp(startTime));

            // Get the current end offsets. Note: OffsetSpec.latest() is used here rather than
            // forTimestamp(endTime), because a timestamp query returns -1 when no record has a
            // timestamp at or after the requested time -- which is always the case for "now".
            Map<TopicPartition, Long> endOffsets =
                    getOffsets(adminClient, partitions, OffsetSpec.latest());

            // Calculate total message count
            long totalMessages = calculateMessageCount(startOffsets, endOffsets);

            System.out.println("Total messages in the last 24 hours for topic '" + topicName + "': " + totalMessages);

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static List<TopicPartition> getTopicPartitions(AdminClient adminClient, String topicName) throws ExecutionException, InterruptedException {
        DescribeTopicsResult describeTopicsResult = adminClient.describeTopics(Collections.singletonList(topicName));
        Map<String, TopicDescription> topicDescriptionMap = describeTopicsResult.all().get();
        TopicDescription topicDescription = topicDescriptionMap.get(topicName);

        List<TopicPartition> partitions = new ArrayList<>();
        for (TopicPartitionInfo partitionInfo : topicDescription.partitions()) {
            partitions.add(new TopicPartition(topicName, partitionInfo.partition()));
        }
        return partitions;
    }

    private static Map<TopicPartition, Long> getOffsets(AdminClient adminClient, List<TopicPartition> partitions, OffsetSpec spec) throws ExecutionException, InterruptedException {
        Map<TopicPartition, OffsetSpec> offsetSpecs = new HashMap<>();
        for (TopicPartition partition : partitions) {
            offsetSpecs.put(partition, spec);
        }

        ListOffsetsResult offsetsResult = adminClient.listOffsets(offsetSpecs);
        Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> offsetsResultMap = offsetsResult.all().get();

        Map<TopicPartition, Long> resultOffsets = new HashMap<>();
        for (Map.Entry<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> entry : offsetsResultMap.entrySet()) {
            resultOffsets.put(entry.getKey(), entry.getValue().offset());
        }

        return resultOffsets;
    }

    private static long calculateMessageCount(Map<TopicPartition, Long> startOffsets, Map<TopicPartition, Long> endOffsets) {
        long totalMessages = 0;
        for (TopicPartition partition : startOffsets.keySet()) {
            Long startOffset = startOffsets.get(partition);
            Long endOffset = endOffsets.get(partition);

            // A start offset of -1 means the partition has no record at or after startTime,
            // i.e. zero messages in the window -- skip it rather than producing a bogus delta.
            if (startOffset != null && endOffset != null && startOffset >= 0) {
                totalMessages += endOffset - startOffset;
            }
        }
        return totalMessages;
    }
}

The logic is roughly:

  1. Get all partitions of the topic
  2. Get each partition's offset at the start time
  3. Get each partition's offset at the end time
  4. Sum the per-partition offset differences; the total is the message volume for the period
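Step 4 boils down to a per-partition subtraction and a sum. A minimal standalone sketch of just that arithmetic, using made-up offsets (plain partition numbers stand in for `TopicPartition` so it runs without a broker):

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetDiffDemo {

    // Sum end - start over all partitions; a start offset of -1
    // (no record in the window) contributes zero.
    static long countMessages(Map<Integer, Long> startOffsets, Map<Integer, Long> endOffsets) {
        long total = 0;
        for (Map.Entry<Integer, Long> entry : startOffsets.entrySet()) {
            Long endOffset = endOffsets.get(entry.getKey());
            if (endOffset != null && entry.getValue() >= 0) {
                total += endOffset - entry.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Long> start = new HashMap<>();
        Map<Integer, Long> end = new HashMap<>();
        start.put(0, 100L); end.put(0, 250L); // partition 0: 150 messages
        start.put(1, 40L);  end.put(1, 90L);  // partition 1: 50 messages
        System.out.println(countMessages(start, end)); // prints 200
    }
}
```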

Summary

The implementation is fairly simple: fetch the start and end offsets of every partition of the topic, then sum the differences.

Alternatively, we can derive the number from the JMX metric Kafka exposes: kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=([-.\w]+)
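A sketch of that JMX approach, assuming a broker with remote JMX enabled (the JMX URL and port below are placeholder assumptions). The metric's cumulative `Count` attribute is sampled at the start and end of the day, and the daily volume is the difference. Note the metric is per broker, so the reads must be summed across all brokers, and `Count` resets when a broker restarts:

```java
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxTopicMessageCount {

    // Read the cumulative message count for one topic from one broker's
    // BrokerTopicMetrics MBean. Example URL (placeholder):
    // "service:jmx:rmi:///jndi/rmi://kafka-broker:9999/jmxrmi"
    static long readCount(String jmxUrl, String topic) throws Exception {
        JMXServiceURL url = new JMXServiceURL(jmxUrl);
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            ObjectName name = new ObjectName(
                    "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=" + topic);
            return (Long) connector.getMBeanServerConnection().getAttribute(name, "Count");
        }
    }

    // Count is cumulative since broker start, so the day's volume is the
    // difference between two samples taken 24 hours apart.
    static long dailyCount(long countAtDayStart, long countAtDayEnd) {
        return countAtDayEnd - countAtDayStart;
    }

    public static void main(String[] args) {
        // Without a live broker, demonstrate the arithmetic with made-up samples:
        long atStart = 1_000_000L;
        long atEnd = 1_860_000L;
        System.out.println(dailyCount(atStart, atEnd)); // prints 860000
    }
}
```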
