Kafka 核心原理与实战
一、知识概述
Apache Kafka 是一个分布式流处理平台,最初由 LinkedIn 开发,后来贡献给 Apache。Kafka 以高吞吐量、低延迟、高可用性和可扩展性著称,被广泛应用于日志收集、流处理、事件驱动架构等场景。
本文将深入讲解 Kafka 的核心概念、架构设计、消息存储机制、消费者模型,并通过实战代码演示 Kafka 的生产者和消费者开发。
二、核心概念
2.1 基础架构
+------------------+ +------------------+ +------------------+
| Producer 1 | | Producer 2 | | Producer 3 |
+--------+---------+ +--------+---------+ +--------+---------+
| | |
+-----------+---------------+---------------------------+
|
v
+------------------------------------------------------------+
| Kafka Cluster |
| +------------------------+ +------------------------+ |
| | Broker 1 | | Broker 2 | |
| | +------------------+ | | +------------------+ | |
| | | Topic A (P0, P2) | | | | Topic A (P1) | | |
| | | Topic B (P0) | | | | Topic B (P1, P2) | | |
| | +------------------+ | | +------------------+ | |
| +------------------------+ +------------------------+ |
+------------------------------------------------------------+
|
+-----------+---------------+---------------------------+
| | |
v v v
+--------+---------+ +--------+---------+ +--------+---------+
| Consumer Group A | | Consumer Group B | | Consumer Group C |
| (C1, C2, C3) | | (C1, C2) | | (C1) |
+------------------+ +------------------+ +------------------+
2.2 核心组件
Producer(生产者)
负责将消息发送到 Kafka 集群。生产者决定消息被发送到哪个 Topic 的哪个分区。
Broker(代理服务器)
Kafka 集群中的服务节点,负责消息的存储和转发。每个 Broker 都有唯一的 ID。
Topic(主题)
消息的逻辑分类,类似于数据库中的表。生产者将消息发送到特定 Topic,消费者订阅 Topic 消费消息。
Partition(分区)
Topic 的物理分片,每个分区是一个有序的、不可变的消息序列。分区是 Kafka 实现高吞吐量和水平扩展的关键。
Topic: orders
├── Partition 0: [msg0, msg1, msg2, msg3, ...] ← Leader on Broker 1
├── Partition 1: [msg0, msg1, msg2, msg3, ...] ← Leader on Broker 2
└── Partition 2: [msg0, msg1, msg2, msg3, ...] ← Leader on Broker 3
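上图中,带 Key 的消息总是被路由到同一分区,这是"分区内有序"的基础。下面用一段极简的示意代码说明这一映射关系(真实客户端默认分区器使用 murmur2 哈希,这里用 hashCode 代替,仅作原理演示):

```java
public class KeyPartitioningSketch {
    /**
     * 简化示意:对 Key 做哈希后取模,相同 Key 永远落在同一分区,
     * 因此同一 Key 的消息在该分区内保持发送顺序。
     * 注意:这不是 Kafka 的真实实现,默认分区器使用 murmur2 哈希。
     */
    public static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // 同一订单号的多条消息会进入同一分区
        System.out.println(partitionFor("order-1001", 3) == partitionFor("order-1001", 3));
    }
}
```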
Replica(副本)
分区的备份,用于实现数据冗余和高可用。每个分区有一个 Leader 和多个 Follower。
Partition 0:
├── Leader (Broker 1) ← 处理读写请求
├── Follower (Broker 2) ← 同步数据,不处理客户端请求
└── Follower (Broker 3) ← 同步数据,不处理客户端请求
Consumer(消费者)
从 Kafka 拉取消息进行处理。消费者以消费者组(Consumer Group)的形式工作。
Consumer Group(消费者组)
消费者组是 Kafka 实现单播和广播消息模式的核心机制。
场景1:单播(同一消费者组内,每条消息只被一个消费者消费)
Topic: orders (3 partitions)
├── Partition 0 → Consumer 1 (Group A)
├── Partition 1 → Consumer 2 (Group A)
└── Partition 2 → Consumer 3 (Group A)
场景2:广播(不同消费者组可以消费同一消息)
Topic: orders (3 partitions)
├── Partition 0 → Consumer 1 (Group A), Consumer 4 (Group B)
├── Partition 1 → Consumer 2 (Group A), Consumer 5 (Group B)
└── Partition 2 → Consumer 3 (Group A), Consumer 6 (Group B)
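上面"一个分区同一时刻只分配给组内一个消费者"的规则,可以用 RangeAssignor 的思路做一个简化模拟(仅演示分配逻辑,并非客户端的真实实现):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeAssignmentSketch {
    /**
     * 简化模拟 RangeAssignor:分区按连续区间分配给组内消费者,
     * 前 numPartitions % consumers 个消费者各多分到一个分区。
     */
    public static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int base = numPartitions / consumers.size();
        int extra = numPartitions % consumers.size();
        int next = 0;
        for (int i = 0; i < consumers.size(); i++) {
            int count = base + (i < extra ? 1 : 0);
            List<Integer> parts = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                parts.add(next++);
            }
            result.put(consumers.get(i), parts);
        }
        return result;
    }
}
```

例如 3 个分区、组内 4 个消费者时,第 4 个消费者会分到空列表而处于闲置状态,这也是"组内消费者数不应超过分区数"的原因。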
2.3 关键概念
Offset(偏移量)
消息在分区中的唯一标识,是一个递增的整数。消费者通过 Offset 记录消费位置。
Partition 0:
+----+----+----+----+----+----+----+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | ... (Offset)
+----+----+----+----+----+----+----+
ISR(In-Sync Replicas)
与 Leader 保持同步的副本集合。只有 ISR 中的副本才能被选举为新的 Leader。
ACK机制
生产者发送消息后,等待多少个副本确认才算发送成功:
// acks=0:生产者不等待任何确认,吞吐最高,但可能丢消息
props.put(ProducerConfig.ACKS_CONFIG, "0");
// acks=1:等待 Leader 写入即返回(Leader 未同步给 Follower 就宕机仍可能丢消息)
props.put(ProducerConfig.ACKS_CONFIG, "1");
// acks=all 或 -1:等待所有 ISR 副本确认,最安全(Kafka 3.0+ 客户端的默认值)
// 需配合 Topic 级的 min.insync.replicas 才能真正保证多副本写入
props.put(ProducerConfig.ACKS_CONFIG, "all");
三、消息存储机制
3.1 日志段文件
Kafka 使用日志段(Log Segment)文件存储消息:
/kafka-logs/
└── topic-name-partition-0/
├── 00000000000000000000.log ← 较早的段(已写满,等待清理)
├── 00000000000000000000.index ← 偏移量索引
├── 00000000000000000000.timeindex ← 时间戳索引
├── 00000000000000123456.log ← 活跃段(当前写入,文件名即该段起始 offset)
├── 00000000000000123456.index
├── 00000000000000123456.timeindex
└── ...
3.2 索引机制
Kafka 使用稀疏索引加速消息查找:
/**
* 索引文件结构示例
*
* 偏移量索引(.index):
* +-------------+-----------------+
* | Offset | Position |
* +-------------+-----------------+
* | 0 | 0 |
* | 23 | 1024 |
* | 46 | 2048 |
* | 69 | 3072 |
* +-------------+-----------------+
*
* 查找 offset=50 的消息:
* 1. 二分查找索引,找到 offset=46,position=2048
* 2. 从 position=2048 开始顺序扫描
* 3. 找到 offset=50 的消息
*/
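上面注释描述的"二分查找 + 顺序扫描"过程可以写成一个可运行的小示例(索引数据沿用注释中的假设值):

```java
public class SparseIndexSketch {
    /**
     * 稀疏索引查找示意:二分找到"不大于目标 offset 的最大索引项",
     * 返回其物理位置 position,之后从该位置顺序扫描日志文件即可定位消息。
     * index[i] = {offset, position}
     */
    public static long lookupPosition(long[][] index, long targetOffset) {
        int lo = 0, hi = index.length - 1, ans = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (index[mid][0] <= targetOffset) {
                ans = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return index[ans][1];
    }

    public static void main(String[] args) {
        long[][] index = {{0, 0}, {23, 1024}, {46, 2048}, {69, 3072}};
        // 查找 offset=50:命中索引项 (46, 2048),从 2048 处开始顺序扫描
        System.out.println(lookupPosition(index, 50)); // 2048
    }
}
```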
3.3 消息格式
Kafka 消息格式(v2 版本):
Record:
+-------------------+
| Length | 4 bytes
+-------------------+
| Attributes | 1 byte
+-------------------+
| Timestamp Delta | varint
+-------------------+
| Offset Delta | varint
+-------------------+
| Key Length | varint
+-------------------+
| Key | bytes
+-------------------+
| Value Length | varint
+-------------------+
| Value | bytes
+-------------------+
| Headers | array
+-------------------+
3.4 日志清理策略
日志清理策略是 Topic/Broker 级配置(不是生产者客户端参数),常用配置项如下:

cleanup.policy=delete          # 删除策略(默认),也可设为 compact
retention.ms=604800000         # 基于时间的清理,默认保留 7 天
retention.bytes=1073741824     # 基于大小的清理,如 1GB(默认 -1 表示不限制)
cleanup.policy=compact         # 日志压缩:保留每个 Key 的最新值
                               # 适用于 changelog 类主题,如 __consumer_offsets
四、生产者原理
4.1 发送流程
Producer
|
| 1. 序列化
v
+----------------+
| Serializer |
+----------------+
|
| 2. 分区选择
v
+----------------+
| Partitioner |
+----------------+
|
| 3. 消息累积
v
+--------------------+
| RecordAccumulator  |
|  (批量发送缓冲区)  |
+--------------------+
|
| 4. 网络发送
v
+----------------+
| Sender Thread |
+----------------+
|
v
Broker
4.2 分区策略
/**
* 自定义分区器
*/
public class CustomPartitioner implements Partitioner {
@Override
public int partition(String topic, Object key, byte[] keyBytes,
Object value, byte[] valueBytes, Cluster cluster) {
List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
int numPartitions = partitions.size();
if (keyBytes == null) {
// 无 Key:这里演示随机分配(Kafka 2.4+ 的默认分区器对无 Key 消息采用粘性分区,
// 即同一批次固定写一个分区,以提升批量发送效率)
return ThreadLocalRandom.current().nextInt(numPartitions);
}
// 有 Key:Hash 分区
// 特殊业务逻辑:VIP 用户发送到特定分区
if (key instanceof String && ((String) key).startsWith("VIP-")) {
return 0; // VIP 分区
}
// 普通 Key:Hash 取模
return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
}
@Override
public void close() {}
@Override
public void configure(Map<String, ?> configs) {}
}
// 使用自定义分区器
props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, CustomPartitioner.class);
4.3 批量发送与压缩
/**
* 生产者配置优化
*/
public Properties getOptimizedProducerConfig() {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
// 批量发送配置
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384); // 16KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 5); // 等待 5ms 或 batch 满后发送
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432); // 32MB
// 压缩配置(推荐 LZ4 或 ZSTD)
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
// 或 props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");
// 重试配置
props.put(ProducerConfig.RETRIES_CONFIG, 3);
props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 100);
// 幂等生产者(防止重复)
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
return props;
}
4.4 生产者示例
/**
* 完整的生产者示例
*/
public class OrderProducer {
private final KafkaProducer<String, String> producer;
private final String topic;
public OrderProducer(String brokers, String topic) {
this.topic = topic;
this.producer = new KafkaProducer<>(getProducerProps(brokers));
}
private Properties getProducerProps(String brokers) {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32768);
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
return props;
}
/**
* 同步发送
*/
public void sendSync(String orderId, String orderJson) throws Exception {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, orderId, orderJson);
try {
RecordMetadata metadata = producer.send(record).get();
System.out.printf("消息发送成功: partition=%d, offset=%d%n",
metadata.partition(), metadata.offset());
} catch (Exception e) {
System.err.println("消息发送失败: " + e.getMessage());
throw e;
}
}
/**
* 异步发送(带回调)
*/
public void sendAsync(String orderId, String orderJson) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, orderId, orderJson);
producer.send(record, (metadata, exception) -> {
if (exception != null) {
System.err.println("消息发送失败: " + exception.getMessage());
// 可以在这里实现重试逻辑
} else {
System.out.printf("消息发送成功: partition=%d, offset=%d%n",
metadata.partition(), metadata.offset());
}
});
}
/**
* 发送带头的消息
*/
public void sendWithHeaders(String orderId, String orderJson,
Map<String, String> headers) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, null, orderId, orderJson);
// 添加自定义头
headers.forEach((key, value) ->
record.headers().add(key, value.getBytes(StandardCharsets.UTF_8)));
producer.send(record);
}
/**
* 发送带时间戳的消息
*/
public void sendWithTimestamp(String orderId, String orderJson, long timestamp) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, null, timestamp, orderId, orderJson);
producer.send(record);
}
/**
* 事务发送(保证原子性)
* 前提:生产者配置中必须设置 transactional.id,例如
* props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "order-producer-tx");
* 否则 initTransactions() 会直接抛出异常
*/
public void sendInTransaction(List<Order> orders) {
// 初始化事务(整个生产者生命周期只需调用一次,更合理的位置是放在构造方法中)
producer.initTransactions();
try {
// 开启事务
producer.beginTransaction();
// 发送多条消息
for (Order order : orders) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, order.getId(), order.toJson());
producer.send(record);
}
// 提交事务
producer.commitTransaction();
} catch (Exception e) {
// 回滚事务(注意:ProducerFencedException 等不可恢复异常应直接关闭生产者)
producer.abortTransaction();
System.err.println("事务发送失败,已回滚: " + e.getMessage());
}
}
public void close() {
producer.close();
}
// 使用示例
public static void main(String[] args) throws Exception {
OrderProducer orderProducer = new OrderProducer(
"localhost:9092", "orders");
// 异步发送
orderProducer.sendAsync("order-001",
"{\"id\":\"order-001\",\"amount\":100.0}");
// 同步发送
orderProducer.sendSync("order-002",
"{\"id\":\"order-002\",\"amount\":200.0}");
// 发送带头部的消息
Map<String, String> headers = new HashMap<>();
headers.put("source", "mobile-app");
headers.put("version", "1.0");
orderProducer.sendWithHeaders("order-003",
"{\"id\":\"order-003\",\"amount\":300.0}", headers);
// 事务发送
List<Order> batchOrders = Arrays.asList(
new Order("order-004", 400.0),
new Order("order-005", 500.0)
);
orderProducer.sendInTransaction(batchOrders);
// 确保所有消息发送完成
orderProducer.producer.flush();
orderProducer.close();
}
}
class Order {
private String id;
private double amount;
public Order(String id, double amount) {
this.id = id;
this.amount = amount;
}
public String getId() { return id; }
public String toJson() {
return String.format("{\"id\":\"%s\",\"amount\":%.1f}", id, amount);
}
}
五、消费者原理
5.1 消费者组协调
Consumer Group 协调流程:
1. 消费者启动,向 GroupCoordinator 发送 JoinGroupRequest
2. GroupCoordinator 选择一个消费者作为 Leader
3. Leader 制定分区分配方案
4. 所有消费者向 GroupCoordinator 发送 SyncGroupRequest
5. GroupCoordinator 将分配方案下发给每个消费者
6. 消费者开始拉取分配给自己的分区
重新平衡(Rebalance)触发条件:
- 新消费者加入组
- 消费者离开组(崩溃或主动离开)
- Topic 分区数变化
- 订阅的 Topic 数量变化
5.2 消费者配置
/**
* 消费者配置优化
*/
public Properties getOptimizedConsumerConfig(String groupId) {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
// 消费者组配置
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
// 自动提交 offset(不推荐,可能丢失消息)
// props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
// props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 5000);
// 手动提交 offset(推荐)
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
// offset 重置策略
// earliest: 从最早的消息开始消费
// latest: 从最新的消息开始消费(默认)
// none: 抛出异常
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
// 拉取配置
props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1);
props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);
props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1048576);
// 心跳与会话超时
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 10000);
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 3000);
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 300000);
// 隔离级别(用于事务消费)
props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
return props;
}
5.3 消费者示例
/**
* 完整的消费者示例
*/
public class OrderConsumer {
private final KafkaConsumer<String, String> consumer;
private final String topic;
private volatile boolean running = true;
public OrderConsumer(String brokers, String topic, String groupId) {
this.topic = topic;
this.consumer = new KafkaConsumer<>(getConsumerProps(brokers, groupId));
}
private Properties getConsumerProps(String brokers, String groupId) {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 100);
return props;
}
/**
* 简单消费循环
*/
public void consume() {
consumer.subscribe(Collections.singletonList(topic));
try {
while (running) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
for (ConsumerRecord<String, String> record : records) {
System.out.printf(
"收到消息: partition=%d, offset=%d, key=%s, value=%s%n",
record.partition(), record.offset(),
record.key(), record.value());
// 处理消息
processMessage(record);
}
// 手动提交 offset
consumer.commitSync();
}
} finally {
consumer.close();
}
}
/**
* 异步提交 offset
*/
public void consumeWithAsyncCommit() {
consumer.subscribe(Collections.singletonList(topic));
try {
while (running) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
for (ConsumerRecord<String, String> record : records) {
processMessage(record);
}
// 异步提交(不阻塞;提交失败不会自动重试,可能导致重启后重复消费)
consumer.commitAsync((offsets, exception) -> {
if (exception != null) {
System.err.println("Offset 提交失败: " + exception.getMessage());
} else {
System.out.println("Offset 提交成功: " + offsets);
}
});
}
} finally {
// 最后一次同步提交,确保 offset 不丢失
consumer.commitSync();
consumer.close();
}
}
/**
* 精确控制 offset 提交
*/
public void consumeWithManualCommit() {
consumer.subscribe(Collections.singletonList(topic));
try {
while (running) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
for (TopicPartition partition : records.partitions()) {
List<ConsumerRecord<String, String>> partitionRecords =
records.records(partition);
for (ConsumerRecord<String, String> record : partitionRecords) {
try {
processMessage(record);
// 逐条提交 offset(精确控制,但性能较低)
consumer.commitSync(Collections.singletonMap(
partition,
new OffsetAndMetadata(record.offset() + 1)
));
} catch (Exception e) {
System.err.println("处理消息失败: " + e.getMessage());
// 可以选择跳过或重试
break;
}
}
}
}
} finally {
consumer.close();
}
}
/**
* 批量处理与提交
*/
public void consumeBatch() {
consumer.subscribe(Collections.singletonList(topic));
try {
while (running) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
if (records.isEmpty()) continue;
// 批量处理
List<Order> orders = new ArrayList<>();
Map<TopicPartition, OffsetAndMetadata> commitOffsets = new HashMap<>();
for (ConsumerRecord<String, String> record : records) {
Order order = parseOrder(record.value());
orders.add(order);
// 记录最后一条消息的 offset
commitOffsets.put(
new TopicPartition(record.topic(), record.partition()),
new OffsetAndMetadata(record.offset() + 1)
);
}
// 批量保存到数据库
try {
saveOrdersToDatabase(orders);
// 批量提交 offset
consumer.commitSync(commitOffsets);
} catch (Exception e) {
System.err.println("批量保存失败: " + e.getMessage());
// 不提交 offset,下次重新消费
}
}
} finally {
consumer.close();
}
}
/**
* 从指定 offset 开始消费
*/
public void consumeFromOffset(long startOffset) {
// 先订阅并 poll 一次以触发分区分配
// 注意:这次 poll 本身可能已拉到消息;生产代码更稳妥的做法是
// 在 ConsumerRebalanceListener.onPartitionsAssigned 回调中执行 seek
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
Set<TopicPartition> partitions = consumer.assignment();
// 从指定 offset 开始
for (TopicPartition partition : partitions) {
consumer.seek(partition, startOffset);
}
// 开始消费
consume();
}
/**
* 从最早的消息开始消费
*/
public void consumeFromBeginning() {
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
consumer.seekToBeginning(consumer.assignment());
consume();
}
/**
* 从最新的消息开始消费
*/
public void consumeFromLatest() {
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
consumer.seekToEnd(consumer.assignment());
consume();
}
/**
* 按时间戳查找 offset
*/
public void consumeFromTimestamp(long timestamp) {
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
Map<TopicPartition, Long> timestampsToSearch = new HashMap<>();
for (TopicPartition partition : consumer.assignment()) {
timestampsToSearch.put(partition, timestamp);
}
// 查找指定时间戳的 offset
Map<TopicPartition, OffsetAndTimestamp> offsets =
consumer.offsetsForTimes(timestampsToSearch);
// 定位到该 offset
for (Map.Entry<TopicPartition, OffsetAndTimestamp> entry : offsets.entrySet()) {
if (entry.getValue() != null) {
consumer.seek(entry.getKey(), entry.getValue().offset());
}
}
consume();
}
/**
* 暂停与恢复消费
*/
public void consumeWithPauseResume() {
consumer.subscribe(Collections.singletonList(topic));
boolean paused = false;
long pauseStartTime = 0;
try {
while (running) {
// 检查是否需要暂停/恢复
if (shouldPause() && !paused) {
consumer.pause(consumer.assignment());
paused = true;
pauseStartTime = System.currentTimeMillis();
System.out.println("消费已暂停");
} else if (paused && !shouldPause()) {
consumer.resume(consumer.assignment());
paused = false;
System.out.println("消费已恢复");
}
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
for (ConsumerRecord<String, String> record : records) {
processMessage(record);
}
if (!records.isEmpty()) {
consumer.commitSync();
}
}
} finally {
consumer.close();
}
}
/**
* 优雅关闭
*/
public void shutdown() {
running = false;
consumer.wakeup(); // 唤醒 poll 操作
}
private void processMessage(ConsumerRecord<String, String> record) {
// 实际的消息处理逻辑
System.out.println("处理订单: " + record.value());
}
private Order parseOrder(String json) {
// 解析订单 JSON
return new Order("temp", 0.0);
}
private void saveOrdersToDatabase(List<Order> orders) {
// 保存到数据库
System.out.println("保存 " + orders.size() + " 条订单到数据库");
}
private boolean shouldPause() {
// 判断是否需要暂停(例如:下游服务不可用)
return false;
}
// 使用示例
public static void main(String[] args) {
OrderConsumer consumer = new OrderConsumer(
"localhost:9092", "orders", "order-consumer-group");
// 注册关闭钩子
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
System.out.println("正在关闭消费者...");
consumer.shutdown();
}));
// 开始消费
consumer.consume();
}
}
六、Spring Kafka 集成
6.1 依赖配置
<!-- Maven -->
<dependency>
<groupId>org.springframework.kafka</groupId>
<artifactId>spring-kafka</artifactId>
<version>3.0.0</version>
</dependency>
// Gradle
implementation 'org.springframework.kafka:spring-kafka:3.0.0'
6.2 生产者配置
@Configuration
public class KafkaProducerConfig {
@Value("${spring.kafka.bootstrap-servers}")
private String bootstrapServers;
@Bean
public ProducerFactory<String, String> producerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
config.put(ProducerConfig.ACKS_CONFIG, "all");
config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
config.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
return new DefaultKafkaProducerFactory<>(config);
}
@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
return new KafkaTemplate<>(producerFactory());
}
/**
* 事务生产者工厂
*/
@Bean
public ProducerFactory<String, String> transactionalProducerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
DefaultKafkaProducerFactory<String, String> factory =
new DefaultKafkaProducerFactory<>(config);
// 设置事务 ID 前缀即可启用事务(等价于配置 transactional.id,二者设置其一即可)
factory.setTransactionIdPrefix("order-tx-");
return factory;
}
@Bean
public KafkaTransactionManager<String, String> kafkaTransactionManager() {
return new KafkaTransactionManager<>(transactionalProducerFactory());
}
}
6.3 消费者配置
@Configuration
@EnableKafka
public class KafkaConsumerConfig {
@Value("${spring.kafka.bootstrap-servers}")
private String bootstrapServers;
@Bean
public ConsumerFactory<String, String> consumerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class);
config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class);
config.put(ConsumerConfig.GROUP_ID_CONFIG, "order-consumer-group");
config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
return new DefaultKafkaConsumerFactory<>(config);
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String>
kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setConcurrency(3); // 3 个消费者线程
factory.getContainerProperties().setAckMode(
ContainerProperties.AckMode.MANUAL_IMMEDIATE);
return factory;
}
}
6.4 消息发送服务
@Service
@Slf4j
public class OrderMessageService {
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;
private static final String ORDER_TOPIC = "orders";
/**
* 发送订单消息
*/
public void sendOrder(Order order) {
String orderJson = toJson(order);
CompletableFuture<SendResult<String, String>> future =
kafkaTemplate.send(ORDER_TOPIC, order.getId(), orderJson);
future.whenComplete((result, ex) -> {
if (ex == null) {
log.info("订单消息发送成功: key={}, partition={}, offset={}",
order.getId(),
result.getRecordMetadata().partition(),
result.getRecordMetadata().offset());
} else {
log.error("订单消息发送失败: key={}", order.getId(), ex);
}
});
}
/**
* 同步发送
*/
public void sendOrderSync(Order order) throws Exception {
String orderJson = toJson(order);
try {
SendResult<String, String> result =
kafkaTemplate.send(ORDER_TOPIC, order.getId(), orderJson).get();
log.info("订单消息发送成功: key={}", order.getId());
} catch (Exception e) {
log.error("订单消息发送失败: key={}", order.getId(), e);
throw e;
}
}
/**
* 事务发送
*/
@Transactional
public void sendOrderInTransaction(List<Order> orders) {
for (Order order : orders) {
kafkaTemplate.send(ORDER_TOPIC, order.getId(), toJson(order));
}
// 事务会在方法结束时自动提交
}
private String toJson(Order order) {
// 转换为 JSON
return String.format("{\"id\":\"%s\",\"amount\":%.1f}",
order.getId(), order.getAmount());
}
}
6.5 消息消费服务
@Service
@Slf4j
public class OrderConsumerService {
/**
* 监听订单消息
*/
@KafkaListener(
topics = "orders",
groupId = "order-consumer-group",
containerFactory = "kafkaListenerContainerFactory"
)
public void consumeOrder(ConsumerRecord<String, String> record,
Acknowledgment acknowledgment) {
try {
log.info("收到订单消息: key={}, value={}, partition={}, offset={}",
record.key(), record.value(),
record.partition(), record.offset());
// 处理订单
Order order = parseOrder(record.value());
processOrder(order);
// 手动确认
acknowledgment.acknowledge();
} catch (Exception e) {
log.error("处理订单消息失败: key={}", record.key(), e);
// 不确认,消息会被重新消费
}
}
/**
* 批量消费
*/
@KafkaListener(
topics = "orders",
groupId = "order-batch-consumer-group",
containerFactory = "kafkaListenerContainerFactory",
batch = "true"
)
public void consumeBatch(List<ConsumerRecord<String, String>> records,
Acknowledgment acknowledgment) {
try {
List<Order> orders = new ArrayList<>();
for (ConsumerRecord<String, String> record : records) {
orders.add(parseOrder(record.value()));
}
// 批量处理
batchProcessOrders(orders);
// 确认
acknowledgment.acknowledge();
} catch (Exception e) {
log.error("批量处理订单失败", e);
}
}
/**
* 指定分区消费
*/
@KafkaListener(
topicPartitions = @TopicPartition(
topic = "orders",
partitions = {"0", "1"}
),
groupId = "order-partition-consumer"
)
public void consumeFromPartition(ConsumerRecord<String, String> record) {
log.info("从分区 {} 收到消息: {}", record.partition(), record.value());
}
/**
* 带错误处理器的监听
*/
@KafkaListener(
topics = "orders",
groupId = "order-error-handler-group"
)
public void consumeWithErrorHandler(ConsumerRecord<String, String> record) {
// 可能抛出异常
processOrder(parseOrder(record.value()));
}
/**
* 错误处理器(Spring Kafka 2.8+ 已用 CommonErrorHandler 体系取代旧的 ErrorHandler 接口)
* 注意:需要通过 factory.setCommonErrorHandler(errorHandler()) 注册到监听容器工厂才会生效
*/
@Bean
public DefaultErrorHandler errorHandler() {
// 失败后每隔 1 秒重试,最多 3 次;仍失败可配合 DeadLetterPublishingRecoverer 转入死信主题
DefaultErrorHandler handler = new DefaultErrorHandler(new FixedBackOff(1000L, 3));
// 反序列化失败属于不可恢复异常,重试无意义,直接跳过
handler.addNotRetryableExceptions(DeserializationException.class);
return handler;
}
private Order parseOrder(String json) {
// 解析 JSON
return new Order("temp", 0.0);
}
private void processOrder(Order order) {
// 处理订单
log.info("处理订单: {}", order.getId());
}
private void batchProcessOrders(List<Order> orders) {
log.info("批量处理 {} 条订单", orders.size());
}
}
七、消息积压处理
7.1 积压原因分析
常见积压原因:
1. 消费速度 < 生产速度
2. 消费者处理逻辑耗时过长
3. 下游服务响应慢(数据库、外部API)
4. 消费者数量不足
5. 分区数不够,无法扩展消费者
排查方法:
1. 查看消费者 Lag:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--describe --group order-consumer-group
2. 查看生产速率:
监控 Broker 的 MessagesInPerSec 指标
3. 分析消费者日志,找出耗时操作
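Lag 的计算方式本身很简单:分区最新位移(LogEndOffset)减去消费者组已提交位移,按分区求和。下面是一个计算示意(真实数据可通过 AdminClient#listConsumerGroupOffsets 与 KafkaConsumer#endOffsets 获取,此处用 Map 模拟这两组数据):

```java
import java.util.Map;

public class ConsumerLagSketch {
    /**
     * Lag = LogEndOffset - CommittedOffset,按分区累加。
     * Map 的 key 为分区号;从未提交过位移的分区按 0 处理。
     */
    public static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committed) {
        long lag = 0;
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            lag += e.getValue() - committed.getOrDefault(e.getKey(), 0L);
        }
        return lag;
    }
}
```

若 totalLag 持续增长而不是周期性回落,说明消费速度已落后于生产速度,需要按下文方案处理。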
7.2 积压处理方案
/**
* 消息积压处理方案
*/
public class LagResolutionStrategy {
/**
* 方案1:增加消费者实例
*
* 前提:分区数 >= 消费者数
*
* 例如:Topic 有 10 个分区,可以部署 10 个消费者实例
*/
/**
* 方案2:临时消费者(甩积压)
*/
public void createTemporaryConsumer() {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// 使用新的消费者组(从最早开始消费)
props.put(ConsumerConfig.GROUP_ID_CONFIG, "lag-resolution-temp");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1000); // 增加拉取数量
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("orders"));
// 快速消费,不做复杂处理
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
if (records.isEmpty()) break;
// 简单处理或转发到其他地方
for (ConsumerRecord<String, String> record : records) {
// 保存到数据库或发送到其他队列
quickProcess(record.value());
}
consumer.commitSync();
}
consumer.close();
}
/**
* 方案3:异步处理
*/
public void consumeWithAsyncProcessing() {
// 使用线程池异步处理
ExecutorService executor = Executors.newFixedThreadPool(10);
KafkaConsumer<String, String> consumer = createConsumer();
consumer.subscribe(Collections.singletonList("orders"));
try {
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
List<CompletableFuture<Void>> futures = new ArrayList<>();
for (ConsumerRecord<String, String> record : records) {
CompletableFuture<Void> future = CompletableFuture.runAsync(
() -> processRecord(record), executor
);
futures.add(future);
}
// 等待所有任务完成
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
.join();
// 提交 offset
consumer.commitSync();
}
} finally {
consumer.close();
executor.shutdown();
}
}
/**
* 方案4:跳过积压,从最新开始
*/
public void skipLag() {
KafkaConsumer<String, String> consumer = createConsumer();
consumer.subscribe(Collections.singletonList("orders"));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
consumer.seekToEnd(consumer.assignment()); // 跳到最新
// 开始正常消费
}
/**
* 方案5:增加分区数
*/
// 命令行执行:
// kafka-topics.sh --bootstrap-server localhost:9092 \
// --alter --topic orders --partitions 20
private KafkaConsumer<String, String> createConsumer() {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-consumer-group");
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
return new KafkaConsumer<>(props);
}
private void quickProcess(String message) {
// 快速处理逻辑
}
private void processRecord(ConsumerRecord<String, String> record) {
// 处理逻辑
}
}
八、最佳实践
8.1 生产者最佳实践
/**
* 生产者最佳实践配置
*/
public class BestPracticeProducer {
public static Properties getConfig() {
Properties props = new Properties();
// 基础配置
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
"broker1:9092,broker2:9092,broker3:9092");
// 可靠性配置
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
// 性能配置
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32768);
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864);
// 超时配置
props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);
props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);
return props;
}
}
8.2 消费者最佳实践
/**
* 消费者最佳实践配置
*/
public class BestPracticeConsumer {
public static Properties getConfig(String groupId) {
Properties props = new Properties();
// 基础配置
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
"broker1:9092,broker2:9092,broker3:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
// Offset 管理
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
// 性能配置
props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024);
props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);
// 心跳与会话超时(避免频繁 Rebalance)
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 10000);
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 3000);
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 300000);
return props;
}
}
8.3 Topic 设计最佳实践
/**
* Topic 设计建议
*/
public class TopicDesignBestPractices {
/**
* 1. 分区数设计
*
* - 分区数 >= 预期的最大消费者数
* - 考虑吞吐量:每个分区的吞吐量约 10-20 MB/s
* - 不宜过多:会增加文件句柄、影响选举时间
*
* 公式:分区数 = max(目标吞吐量 / 单分区吞吐量, 消费者数)
*/
/**
* 2. 副本因子
*
* - 生产环境:至少 3
* - min.insync.replicas = 2(至少 2 个副本同步)
*/
/**
* 3. 日志保留策略
*/
public void configureRetention() {
// 基于时间(默认 7 天)
// log.retention.hours=168
// 基于大小
// log.retention.bytes=1073741824
// 日志压缩(保留最新值)
// cleanup.policy=compact
}
/**
* 4. Topic 命名规范
*
* 格式:<业务域>.<数据类型>.<版本>
* 例如:
* - order.event.v1
* - user.change.v1
* - payment.log.v1
*/
}
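上面的分区数估算公式可以落成一个小工具方法(单分区吞吐量取 10-20 MB/s 只是经验假设,应以实际压测结果为准):

```java
public class PartitionCountSketch {
    /**
     * 按文中公式估算分区数:max(ceil(目标吞吐量 / 单分区吞吐量), 消费者数)。
     * 吞吐量单位 MB/s;结果仅作为初始值,实际分区数需结合压测与增长预期调整。
     */
    public static int suggestPartitions(double targetMBps, double perPartitionMBps, int consumers) {
        int byThroughput = (int) Math.ceil(targetMBps / perPartitionMBps);
        return Math.max(byThroughput, consumers);
    }

    public static void main(String[] args) {
        // 目标吞吐 100 MB/s、单分区 10 MB/s、6 个消费者 → 建议 10 个分区
        System.out.println(suggestPartitions(100, 10, 6)); // 10
    }
}
```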
8.4 监控指标
/**
* 关键监控指标
*/
public class KafkaMonitoringMetrics {
/**
* Broker 指标:
* - MessagesInPerSec:消息写入速率
* - BytesInPerSec / BytesOutPerSec:吞吐量
* - UnderReplicatedPartitions:副本不足的分区数
* - OfflinePartitionsCount:离线分区数
* - ActiveControllerCount:活跃 Controller 数(应为 1)
*/
/**
* Producer 指标:
* - record-send-rate:发送速率
* - record-error-rate:错误率
* - request-latency-avg:平均延迟
* - buffer-available-bytes:可用缓冲区
*/
/**
* Consumer 指标:
* - records-lag-max:最大 Lag
* - records-consumed-rate:消费速率
* - commit-rate:提交速率
* - join-rate:加入组速率(Rebalance 频率)
*/
/**
* 使用 JMX 或 Prometheus 监控
*/
}
九、总结
Kafka 核心要点
- 高吞吐量:顺序写入、零拷贝、批量发送
- 高可用:副本机制、ISR、Leader 选举
- 可扩展:分区机制、水平扩展
- 消息持久化:日志段文件、稀疏索引
使用建议
- 生产者:启用幂等、选择合适的 acks、使用压缩
- 消费者:手动提交 offset、合理设置心跳参数、监控 Lag
- Topic 设计:合理分区数、副本因子、日志保留策略
- 监控告警:Lag、吞吐量、错误率、Rebalance 频率
常见问题解决
| 问题 | 原因 | 解决方案 |
|---|---|---|
| 消息丢失 | acks=1 或 0 | 使用 acks=all,启用幂等 |
| 消息重复 | 生产者重试 | 消费者实现幂等性 |
| 消息积压 | 消费慢 | 增加消费者、异步处理、临时消费者 |
| Rebalance 频繁 | 心跳超时 | 调整 session.timeout.ms、max.poll.interval.ms |
| 高延迟 | 网络或磁盘 | 优化网络、使用 SSD、调整 batch.size |
十、思考与练习
思考题
- 基础题:Kafka 如何保证消息的顺序性?在什么情况下消息会出现乱序?
- 进阶题:Kafka 的消费者组 Rebalance 是如何工作的?频繁 Rebalance 会导致什么问题?如何避免?
- 实战题:Kafka 与 RabbitMQ 在消息模型、吞吐量、延迟、适用场景上有何区别?在一个电商系统中,你会如何选择使用哪个消息队列?
编程练习
练习:使用Spring Kafka实现一个完整的消息系统,包含:(1) 生产者批量发送与压缩配置;(2) 消费者手动提交offset与异常处理;(3) 消息积压监控与告警;(4) 死信队列处理失败消息。
章节关联
- 前置章节:RabbitMQ核心原理与实战
- 后续章节:RocketMQ核心原理与实战
- 扩展阅读:《Kafka权威指南》、Kafka官方文档
📝 下一章预告
下一章将讲解 Apache RocketMQ:阿里巴巴开源的分布式消息中间件。RocketMQ 以其事务消息、延迟消息、消息轨迹等特性,在电商、金融领域有着独特优势。
本章完
参考资料:
- Apache Kafka 官方文档:kafka.apache.org/documentati...
- Spring Kafka 文档:docs.spring.io/spring-kafk...
- Kafka 权威指南(书籍)