Using commitAsync in Kafka: A Worked Example

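When processing messages with Apache Kafka, managing offsets correctly is essential for data consistency and reliability. Kafka offers several ways to commit offsets, and commitAsync() is an efficient, flexible one. This article walks through a complete example of using commitAsync() to commit offsets asynchronously.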

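  1. Why commit offsets asynchronously?
    In Kafka, an offset records how far a consumer has read in a partition. By default the consumer commits offsets automatically at a fixed interval. Because that commit is not tied to the moment processing actually finishes, it can lead to lost or duplicated messages. Setting enable.auto.commit to false and calling commitAsync() ourselves lets us decide exactly when offsets are committed, which improves reliability. Since commitAsync() does not block the polling thread while waiting for the broker's response, it also costs less throughput than a synchronous commit.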
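    To make "deciding when offsets are committed" concrete: the value Kafka expects in a commit is the offset of the next record to read, that is, the last processed offset plus one. The sketch below models only that bookkeeping with the standard library. The class OffsetTracker and its methods are hypothetical and not part of the Kafka client, and partitions are plain ints rather than TopicPartition objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not a Kafka class): tracks the next offset to commit per partition. */
public class OffsetTracker {

    // partition id -> offset of the next record to read (last processed offset + 1)
    private final Map<Integer, Long> nextOffsets = new HashMap<>();

    /** Call once a record has been fully processed. */
    public void markProcessed(int partition, long offset) {
        // Keep the highest value seen so a late call cannot move the commit point backwards.
        nextOffsets.merge(partition, offset + 1, Long::max);
    }

    /** Snapshot of the offsets that would be handed to a commit call. */
    public Map<Integer, Long> offsetsToCommit() {
        return new HashMap<>(nextOffsets);
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker();
        // Simulate processing four records (offsets 0..3) from partition 0.
        for (long offset = 0; offset < 4; offset++) {
            tracker.markProcessed(0, offset);
        }
        System.out.println(tracker.offsetsToCommit()); // prints {0=4}
    }
}
```

    With the real client, the equivalent map would be a Map<TopicPartition, OffsetAndMetadata> passed to commitAsync(offsets, callback).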
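  2. Example project configuration
    Before starting, we need to configure the Kafka producer and consumer properties. The ExampleConfig class below provides basic configuration parameters for both.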
    package com.logicbig.example;

    import java.util.Properties;

    public class ExampleConfig {

        public static final String BROKERS = "localhost:9092";

        public static Properties getProducerProps() {
            Properties props = new Properties();
            props.put("bootstrap.servers", BROKERS);
            props.put("acks", "all");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            return props;
        }

        public static Properties getConsumerProps() {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", BROKERS);
            props.setProperty("group.id", "testGroup");
            // Disable auto-commit so offsets are committed only by explicit calls.
            props.setProperty("enable.auto.commit", "false");
            props.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            return props;
        }
    }
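    Everything that follows depends on enable.auto.commit being false and on a group.id being present, since offsets are committed on behalf of a consumer group. As a rough illustration, a small guard could check both before the consumer is created. ManualCommitGuard is a hypothetical helper written for this article, not a Kafka API, and it uses only java.util.Properties.

```java
import java.util.Properties;

/** Hypothetical guard (not part of Kafka): checks that properties suit manual offset commits. */
public class ManualCommitGuard {

    public static void requireManualCommit(Properties props) {
        // Kafka's default for enable.auto.commit is "true", so a missing key means auto-commit is on.
        String autoCommit = props.getProperty("enable.auto.commit", "true");
        if (!"false".equalsIgnoreCase(autoCommit.trim())) {
            throw new IllegalStateException("enable.auto.commit must be false for manual commits");
        }
        // Committing offsets requires a consumer group.
        String groupId = props.getProperty("group.id");
        if (groupId == null || groupId.trim().isEmpty()) {
            throw new IllegalStateException("group.id must be set to commit offsets");
        }
    }

    public static void main(String[] args) {
        // Same two settings as ExampleConfig.getConsumerProps().
        Properties props = new Properties();
        props.setProperty("group.id", "testGroup");
        props.setProperty("enable.auto.commit", "false");
        requireManualCommit(props); // passes without throwing
        System.out.println("consumer properties are set up for manual commits");
    }
}
```

    In the example project this would be called as ManualCommitGuard.requireManualCommit(ExampleConfig.getConsumerProps()) before constructing the KafkaConsumer.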

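  3. Creating the Kafka topic

Before running the consumer and producer, we need to create a Kafka topic. The following code uses AdminClient to create a topic named example-topic-2020-5-28 with a single partition.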

    package com.logicbig.example;

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    import java.util.Collections;
    import java.util.Properties;
    import java.util.stream.Collectors;

    public class TopicCreator {

        public static void main(String[] args) throws Exception {
            createTopic("example-topic-2020-5-28", 1);
        }

        private static void createTopic(String topicName, int numPartitions) throws Exception {
            Properties config = new Properties();
            config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, ExampleConfig.BROKERS);

            // try-with-resources closes the client even if one of the calls below throws
            try (AdminClient admin = AdminClient.create(config)) {
                boolean alreadyExists = admin.listTopics().names().get().stream()
                        .anyMatch(existingTopicName -> existingTopicName.equals(topicName));

                if (alreadyExists) {
                    System.out.printf("topic already exists: %s%n", topicName);
                } else {
                    System.out.printf("creating topic: %s%n", topicName);
                    // replication factor 1 is enough for a single local broker
                    NewTopic newTopic = new NewTopic(topicName, numPartitions, (short) 1);
                    admin.createTopics(Collections.singleton(newTopic)).all().get();
                }

                System.out.println("-- describing topic --");
                admin.describeTopics(Collections.singleton(topicName)).all().get()
                        .forEach((topic, desc) -> {
                            System.out.println("Topic: " + topic);
                            System.out.printf("Partitions: %s, partition ids: %s%n", desc.partitions().size(),
                                    desc.partitions()
                                            .stream()
                                            .map(p -> Integer.toString(p.partition()))
                                            .collect(Collectors.joining(",")));
                        });
            }
        }
    }

Running this code creates a topic named example-topic-2020-5-28 with one partition.

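  4. Committing offsets with commitAsync()

Next, a complete consumer and producer example shows how to commit offsets asynchronously with commitAsync().

Producer code

The producer sends 4 messages to the example-topic-2020-5-28 topic. The same sendMessages() method appears again in the full CommitAsyncExample class in the consumer section.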

    package com.logicbig.example;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.util.Properties;

    public class CommitAsyncExample {

        private static final String TOPIC_NAME = "example-topic-2020-5-28";

        private static void sendMessages() {
            Properties producerProps = ExampleConfig.getProducerProps();
            KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);

            for (int i = 0; i < 4; i++) {
                String value = "message-" + i;
                System.out.printf("Sending message topic: %s, value: %s%n", TOPIC_NAME, value);
                // no key is set, so the consumer will see key = null
                producer.send(new ProducerRecord<>(TOPIC_NAME, value));
            }

            // make sure all buffered records reach the broker before closing
            producer.flush();
            producer.close();
        }

        public static void main(String[] args) throws Exception {
            sendMessages();
        }
    }

Consumer code

The consumer is assigned partition 0 of the example-topic-2020-5-28 topic and commits offsets asynchronously with commitAsync().

    package com.logicbig.example;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;

    import java.time.Duration;
    import java.util.*;

    public class CommitAsyncExample {

        private static final String TOPIC_NAME = "example-topic-2020-5-28";
        private static KafkaConsumer<String, String> consumer;
        private static TopicPartition topicPartition;

        public static void main(String[] args) throws Exception {
            Properties consumerProps = ExampleConfig.getConsumerProps();
            consumer = new KafkaConsumer<>(consumerProps);
            // assign the single partition directly instead of subscribing to the topic
            topicPartition = new TopicPartition(TOPIC_NAME, 0);
            consumer.assign(Collections.singleton(topicPartition));

            printOffsets("before consumer loop", consumer, topicPartition);
            sendMessages();
            startConsumer();
        }

        private static void startConsumer() {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("consumed: key = %s, value = %s, partition id= %s, offset = %s%n",
                            record.key(), record.value(), record.partition(), record.offset());
                }

                // stop once a poll returns nothing within the 5 second timeout
                if (records.isEmpty()) {
                    System.out.println("-- terminating consumer --");
                    break;
                }

                printOffsets("before commitAsync() call", consumer, topicPartition);
                // returns immediately; the commit request completes in the background
                consumer.commitAsync();
                printOffsets("after commitAsync() call", consumer, topicPartition);
            }

            printOffsets("after consumer loop", consumer, topicPartition);
        }

        // prints the last committed offset and the consumer's current position for the partition
        private static void printOffsets(String message, KafkaConsumer<String, String> consumer, TopicPartition topicPartition) {
            Map<TopicPartition, OffsetAndMetadata> committed = consumer.committed(new HashSet<>(Arrays.asList(topicPartition)));
            OffsetAndMetadata offsetAndMetadata = committed.get(topicPartition);
            long position = consumer.position(topicPartition);
            System.out.printf("Offset info %s, Committed: %s, current position %s%n", message,
                    offsetAndMetadata == null ? null : offsetAndMetadata.offset(), position);
        }

        private static void sendMessages() {
            Properties producerProps = ExampleConfig.getProducerProps();
            KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);

            for (int i = 0; i < 4; i++) {
                String value = "message-" + i;
                System.out.printf("Sending message topic: %s, value: %s%n", TOPIC_NAME, value);
                producer.send(new ProducerRecord<>(TOPIC_NAME, value));
            }

            producer.flush();
            producer.close();
        }
    }
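The loop above calls commitAsync() with no arguments, so the application never learns whether a commit failed. A common variation, sketched here under the assumption of the same broker address, group and topic used in this article, passes an OffsetCommitCallback (written as a lambda) to report failures and ends with a blocking commitSync() so the final position is stored before the consumer closes. The class name CommitAsyncWithCallback is made up for this illustration. Running it needs the kafka-clients library and a live broker, and what it prints depends on the state of the topic.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

// Illustrative class name; the settings mirror ExampleConfig.getConsumerProps().
public class CommitAsyncWithCallback {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "testGroup");
        props.setProperty("enable.auto.commit", "false");
        props.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        try {
            consumer.assign(Collections.singleton(new TopicPartition("example-topic-2020-5-28", 0)));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                if (records.isEmpty()) {
                    break; // same termination rule as the example above
                }
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("consumed: value = %s, offset = %s%n", record.value(), record.offset());
                }
                // Non-blocking commit of the positions reached by the last poll().
                // The callback is invoked later on this same thread, during a subsequent consumer call.
                consumer.commitAsync((offsets, exception) -> {
                    if (exception != null) {
                        System.err.printf("async commit failed for %s: %s%n", offsets, exception);
                    }
                });
            }
        } finally {
            try {
                // Blocking commit so the final position is stored before shutdown.
                consumer.commitSync();
            } finally {
                consumer.close();
            }
        }
    }
}
```

Pairing asynchronous commits in the loop with one synchronous commit at shutdown keeps the hot path non-blocking while still making a best effort to persist the last position.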

  5. Analyzing the output

Running the consumer code above produces the following output:

    Offset info before consumer loop, Committed: null, current position 0
    Sending message topic: example-topic-2020-5-28, value: message-0
    Sending message topic: example-topic-2020-5-28, value: message-1
    Sending message topic: example-topic-2020-5-28, value: message-2
    Sending message topic: example-topic-2020-5-28, value: message-3
    consumed: key = null, value = message-0, partition id= 0, offset = 0
    consumed: key = null, value = message-1, partition id= 0, offset = 1
    consumed: key = null, value = message-2, partition id= 0, offset = 2
    consumed: key = null, value = message-3, partition id= 0, offset = 3
    Offset info before commitAsync() call, Committed
