Kafka 核心原理与实战
一、知识概述
Apache Kafka 是一个分布式流处理平台,最初由 LinkedIn 开发,后来贡献给 Apache。Kafka 以高吞吐量、低延迟、高可用性和可扩展性著称,被广泛应用于日志收集、流处理、事件驱动架构等场景。
本文将深入讲解 Kafka 的核心概念、架构设计、消息存储机制、消费者模型,并通过实战代码演示 Kafka 的生产者和消费者开发。
二、核心概念
2.1 基础架构
+------------------+ +------------------+ +------------------+
| Producer 1 | | Producer 2 | | Producer 3 |
+--------+---------+ +--------+---------+ +--------+---------+
| | |
+-----------+---------------+---------------------------+
|
v
+------------------------------------------------------------+
| Kafka Cluster |
| +------------------------+ +------------------------+ |
| | Broker 1 | | Broker 2 | |
| | +------------------+ | | +------------------+ | |
| | | Topic A (P0, P2) | | | | Topic A (P1) | | |
| | | Topic B (P0) | | | | Topic B (P1, P2) | | |
| | +------------------+ | | +------------------+ | |
| +------------------------+ +------------------------+ |
+------------------------------------------------------------+
|
+-----------+---------------+---------------------------+
| | |
v v v
+--------+---------+ +--------+---------+ +--------+---------+
| Consumer Group A | | Consumer Group B | | Consumer Group C |
| (C1, C2, C3) | | (C1, C2) | | (C1) |
+------------------+ +------------------+ +------------------+
2.2 核心组件
Producer(生产者)
负责将消息发送到 Kafka 集群。生产者决定消息被发送到哪个 Topic 的哪个分区。
Broker(代理服务器)
Kafka 集群中的服务节点,负责消息的存储和转发。每个 Broker 都有唯一的 ID。
Topic(主题)
消息的逻辑分类,类似于数据库中的表。生产者将消息发送到特定 Topic,消费者订阅 Topic 消费消息。
Partition(分区)
Topic 的物理分片,每个分区是一个有序的、不可变的消息序列。分区是 Kafka 实现高吞吐量和水平扩展的关键。
Topic: orders
├── Partition 0: [msg0, msg1, msg2, msg3, ...] ← Leader on Broker 1
├── Partition 1: [msg0, msg1, msg2, msg3, ...] ← Leader on Broker 2
└── Partition 2: [msg0, msg1, msg2, msg3, ...] ← Leader on Broker 3
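上图中,带 Key 的消息总是被路由到同一分区,这是"分区内有序"的基础。下面用一段极简的示意代码说明这一映射关系(真实客户端默认分区器使用 murmur2 哈希,这里用 hashCode 代替,仅作原理演示):

```java
public class KeyPartitioningSketch {
    /**
     * 简化示意:对 Key 做哈希后取模,相同 Key 永远落在同一分区,
     * 因此同一 Key 的消息在该分区内保持发送顺序。
     * 注意:这不是 Kafka 的真实实现,默认分区器使用 murmur2 哈希。
     */
    public static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // 同一订单号的多条消息会进入同一分区
        System.out.println(partitionFor("order-1001", 3) == partitionFor("order-1001", 3));
    }
}
```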
Replica(副本)
分区的备份,用于实现数据冗余和高可用。每个分区有一个 Leader 和多个 Follower。
Partition 0:
├── Leader (Broker 1) ← 处理读写请求
├── Follower (Broker 2) ← 同步数据,不处理客户端请求
└── Follower (Broker 3) ← 同步数据,不处理客户端请求
Consumer(消费者)
从 Kafka 拉取消息进行处理。消费者以消费者组(Consumer Group)的形式工作。
Consumer Group(消费者组)
消费者组是 Kafka 实现单播和广播消息模式的核心机制。
场景1:单播(同一消费者组内,每条消息只被一个消费者消费)
Topic: orders (3 partitions)
├── Partition 0 → Consumer 1 (Group A)
├── Partition 1 → Consumer 2 (Group A)
└── Partition 2 → Consumer 3 (Group A)
场景2:广播(不同消费者组可以消费同一消息)
Topic: orders (3 partitions)
├── Partition 0 → Consumer 1 (Group A), Consumer 4 (Group B)
├── Partition 1 → Consumer 2 (Group A), Consumer 5 (Group B)
└── Partition 2 → Consumer 3 (Group A), Consumer 6 (Group B)
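上面"一个分区同一时刻只分配给组内一个消费者"的规则,可以用 RangeAssignor 的思路做一个简化模拟(仅演示分配逻辑,并非客户端的真实实现):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeAssignmentSketch {
    /**
     * 简化模拟 RangeAssignor:分区按连续区间分配给组内消费者,
     * 前 numPartitions % consumers 个消费者各多分到一个分区。
     */
    public static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int base = numPartitions / consumers.size();
        int extra = numPartitions % consumers.size();
        int next = 0;
        for (int i = 0; i < consumers.size(); i++) {
            int count = base + (i < extra ? 1 : 0);
            List<Integer> parts = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                parts.add(next++);
            }
            result.put(consumers.get(i), parts);
        }
        return result;
    }
}
```

例如 3 个分区、组内 4 个消费者时,第 4 个消费者会分到空列表而处于闲置状态,这也是"组内消费者数不应超过分区数"的原因。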
2.3 关键概念
Offset(偏移量)
消息在分区中的唯一标识,是一个递增的整数。消费者通过 Offset 记录消费位置。
Partition 0:
+----+----+----+----+----+----+----+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | ... (Offset)
+----+----+----+----+----+----+----+
ISR(In-Sync Replicas)
与 Leader 保持同步的副本集合。只有 ISR 中的副本才能被选举为新的 Leader。
ACK机制
生产者发送消息后,等待多少个副本确认才算发送成功:
// acks=0:生产者不等待任何确认,吞吐最高,但可能丢消息
props.put(ProducerConfig.ACKS_CONFIG, "0");
// acks=1:等待 Leader 写入即返回(Leader 未同步给 Follower 就宕机仍可能丢消息)
props.put(ProducerConfig.ACKS_CONFIG, "1");
// acks=all 或 -1:等待所有 ISR 副本确认,最安全(Kafka 3.0+ 客户端的默认值)
// 需配合 Topic 级的 min.insync.replicas 才能真正保证多副本写入
props.put(ProducerConfig.ACKS_CONFIG, "all");
三、消息存储机制
3.1 日志段文件
Kafka 使用日志段(Log Segment)文件存储消息:
/kafka-logs/
└── topic-name-partition-0/
├── 00000000000000000000.log ← 较早的段(已写满,等待清理)
├── 00000000000000000000.index ← 偏移量索引
├── 00000000000000000000.timeindex ← 时间戳索引
├── 00000000000000123456.log ← 活跃段(当前写入,文件名即该段起始 offset)
├── 00000000000000123456.index
├── 00000000000000123456.timeindex
└── ...
3.2 索引机制
Kafka 使用稀疏索引加速消息查找:
/**
* 索引文件结构示例
*
* 偏移量索引(.index):
* +-------------+-----------------+
* | Offset | Position |
* +-------------+-----------------+
* | 0 | 0 |
* | 23 | 1024 |
* | 46 | 2048 |
* | 69 | 3072 |
* +-------------+-----------------+
*
* 查找 offset=50 的消息:
* 1. 二分查找索引,找到 offset=46,position=2048
* 2. 从 position=2048 开始顺序扫描
* 3. 找到 offset=50 的消息
*/
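上面注释描述的"二分查找 + 顺序扫描"过程可以写成一个可运行的小示例(索引数据沿用注释中的假设值):

```java
public class SparseIndexSketch {
    /**
     * 稀疏索引查找示意:二分找到"不大于目标 offset 的最大索引项",
     * 返回其物理位置 position,之后从该位置顺序扫描日志文件即可定位消息。
     * index[i] = {offset, position}
     */
    public static long lookupPosition(long[][] index, long targetOffset) {
        int lo = 0, hi = index.length - 1, ans = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (index[mid][0] <= targetOffset) {
                ans = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return index[ans][1];
    }

    public static void main(String[] args) {
        long[][] index = {{0, 0}, {23, 1024}, {46, 2048}, {69, 3072}};
        // 查找 offset=50:命中索引项 (46, 2048),从 2048 处开始顺序扫描
        System.out.println(lookupPosition(index, 50)); // 2048
    }
}
```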
3.3 消息格式
Kafka 消息格式(v2 版本):
Record:
+-------------------+
| Length | 4 bytes
+-------------------+
| Attributes | 1 byte
+-------------------+
| Timestamp Delta | varint
+-------------------+
| Offset Delta | varint
+-------------------+
| Key Length | varint
+-------------------+
| Key | bytes
+-------------------+
| Value Length | varint
+-------------------+
| Value | bytes
+-------------------+
| Headers | array
+-------------------+
3.4 日志清理策略
日志清理策略是 Topic/Broker 级配置(不是生产者客户端参数),常用配置项如下:

cleanup.policy=delete          # 删除策略(默认),也可设为 compact
retention.ms=604800000         # 基于时间的清理,默认保留 7 天
retention.bytes=1073741824     # 基于大小的清理,如 1GB(默认 -1 表示不限制)
cleanup.policy=compact         # 日志压缩:保留每个 Key 的最新值
                               # 适用于 changelog 类主题,如 __consumer_offsets
四、生产者原理
4.1 发送流程
Producer
|
| 1. 序列化
v
+----------------+
| Serializer |
+----------------+
|
| 2. 分区选择
v
+----------------+
| Partitioner |
+----------------+
|
| 3. 消息累积
v
+--------------------+
| RecordAccumulator  |
|  (批量发送缓冲区)  |
+--------------------+
|
| 4. 网络发送
v
+----------------+
| Sender Thread |
+----------------+
|
v
Broker
4.2 分区策略
/**
* 自定义分区器
*/
public class CustomPartitioner implements Partitioner {
@Override
public int partition(String topic, Object key, byte[] keyBytes,
Object value, byte[] valueBytes, Cluster cluster) {
List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
int numPartitions = partitions.size();
if (keyBytes == null) {
// 无 Key:这里演示随机分配(Kafka 2.4+ 的默认分区器对无 Key 消息采用粘性分区,
// 即同一批次固定写一个分区,以提升批量发送效率)
return ThreadLocalRandom.current().nextInt(numPartitions);
}
// 有 Key:Hash 分区
// 特殊业务逻辑:VIP 用户发送到特定分区
if (key instanceof String && ((String) key).startsWith("VIP-")) {
return 0; // VIP 分区
}
// 普通 Key:Hash 取模
return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
}
@Override
public void close() {}
@Override
public void configure(Map<String, ?> configs) {}
}
// 使用自定义分区器
props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, CustomPartitioner.class);
4.3 批量发送与压缩
/**
* 生产者配置优化
*/
public Properties getOptimizedProducerConfig() {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
// 批量发送配置
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384); // 16KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 5); // 等待 5ms 或 batch 满后发送
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432); // 32MB
// 压缩配置(推荐 LZ4 或 ZSTD)
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
// 或 props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");
// 重试配置
props.put(ProducerConfig.RETRIES_CONFIG, 3);
props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 100);
// 幂等生产者(防止重复)
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
return props;
}
4.4 生产者示例
/**
* 完整的生产者示例
*/
public class OrderProducer {
private final KafkaProducer<String, String> producer;
private final String topic;
public OrderProducer(String brokers, String topic) {
this.topic = topic;
this.producer = new KafkaProducer<>(getProducerProps(brokers));
}
private Properties getProducerProps(String brokers) {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32768);
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
return props;
}
/**
* 同步发送
*/
public void sendSync(String orderId, String orderJson) throws Exception {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, orderId, orderJson);
try {
RecordMetadata metadata = producer.send(record).get();
System.out.printf("消息发送成功: partition=%d, offset=%d%n",
metadata.partition(), metadata.offset());
} catch (Exception e) {
System.err.println("消息发送失败: " + e.getMessage());
throw e;
}
}
/**
* 异步发送(带回调)
*/
public void sendAsync(String orderId, String orderJson) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, orderId, orderJson);
producer.send(record, (metadata, exception) -> {
if (exception != null) {
System.err.println("消息发送失败: " + exception.getMessage());
// 可以在这里实现重试逻辑
} else {
System.out.printf("消息发送成功: partition=%d, offset=%d%n",
metadata.partition(), metadata.offset());
}
});
}
/**
* 发送带头的消息
*/
public void sendWithHeaders(String orderId, String orderJson,
Map<String, String> headers) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, null, orderId, orderJson);
// 添加自定义头
headers.forEach((key, value) ->
record.headers().add(key, value.getBytes(StandardCharsets.UTF_8)));
producer.send(record);
}
/**
* 发送带时间戳的消息
*/
public void sendWithTimestamp(String orderId, String orderJson, long timestamp) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, null, timestamp, orderId, orderJson);
producer.send(record);
}
/**
* 事务发送(保证原子性)
* 前提:生产者配置中必须设置 transactional.id,例如
* props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "order-producer-tx");
* 否则 initTransactions() 会直接抛出异常
*/
public void sendInTransaction(List<Order> orders) {
// 初始化事务(整个生产者生命周期只需调用一次,更合理的位置是放在构造方法中)
producer.initTransactions();
try {
// 开启事务
producer.beginTransaction();
// 发送多条消息
for (Order order : orders) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, order.getId(), order.toJson());
producer.send(record);
}
// 提交事务
producer.commitTransaction();
} catch (Exception e) {
// 回滚事务(注意:ProducerFencedException 等不可恢复异常应直接关闭生产者)
producer.abortTransaction();
System.err.println("事务发送失败,已回滚: " + e.getMessage());
}
}
public void close() {
producer.close();
}
// 使用示例
public static void main(String[] args) throws Exception {
OrderProducer orderProducer = new OrderProducer(
"localhost:9092", "orders");
// 异步发送
orderProducer.sendAsync("order-001",
"{\"id\":\"order-001\",\"amount\":100.0}");
// 同步发送
orderProducer.sendSync("order-002",
"{\"id\":\"order-002\",\"amount\":200.0}");
// 发送带头部的消息
Map<String, String> headers = new HashMap<>();
headers.put("source", "mobile-app");
headers.put("version", "1.0");
orderProducer.sendWithHeaders("order-003",
"{\"id\":\"order-003\",\"amount\":300.0}", headers);
// 事务发送
List<Order> batchOrders = Arrays.asList(
new Order("order-004", 400.0),
new Order("order-005", 500.0)
);
orderProducer.sendInTransaction(batchOrders);
// 确保所有消息发送完成
orderProducer.producer.flush();
orderProducer.close();
}
}
class Order {
private String id;
private double amount;
public Order(String id, double amount) {
this.id = id;
this.amount = amount;
}
public String getId() { return id; }
public String toJson() {
return String.format("{\"id\":\"%s\",\"amount\":%.1f}", id, amount);
}
}
五、消费者原理
5.1 消费者组协调
Consumer Group 协调流程:
1. 消费者启动,向 GroupCoordinator 发送 JoinGroupRequest
2. GroupCoordinator 选择一个消费者作为 Leader
3. Leader 制定分区分配方案
4. 所有消费者向 GroupCoordinator 发送 SyncGroupRequest
5. GroupCoordinator 将分配方案下发给每个消费者
6. 消费者开始拉取分配给自己的分区
重新平衡(Rebalance)触发条件:
- 新消费者加入组
- 消费者离开组(崩溃或主动离开)
- Topic 分区数变化
- 订阅的 Topic 数量变化
5.2 消费者配置
/**
* 消费者配置优化
*/
public Properties getOptimizedConsumerConfig(String groupId) {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
// 消费者组配置
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
// 自动提交 offset(不推荐,可能丢失消息)
// props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
// props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 5000);
// 手动提交 offset(推荐)
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
// offset 重置策略
// earliest: 从最早的消息开始消费
// latest: 从最新的消息开始消费(默认)
// none: 抛出异常
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
// 拉取配置
props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1);
props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);
props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1048576);
// 心跳与会话超时
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 10000);
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 3000);
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 300000);
// 隔离级别(用于事务消费)
props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
return props;
}
5.3 消费者示例
/**
* 完整的消费者示例
*/
public class OrderConsumer {
private final KafkaConsumer<String, String> consumer;
private final String topic;
private volatile boolean running = true;
public OrderConsumer(String brokers, String topic, String groupId) {
this.topic = topic;
this.consumer = new KafkaConsumer<>(getConsumerProps(brokers, groupId));
}
private Properties getConsumerProps(String brokers, String groupId) {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 100);
return props;
}
/**
* 简单消费循环
*/
public void consume() {
consumer.subscribe(Collections.singletonList(topic));
try {
while (running) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
for (ConsumerRecord<String, String> record : records) {
System.out.printf(
"收到消息: partition=%d, offset=%d, key=%s, value=%s%n",
record.partition(), record.offset(),
record.key(), record.value());
// 处理消息
processMessage(record);
}
// 手动提交 offset
consumer.commitSync();
}
} finally {
consumer.close();
}
}
/**
* 异步提交 offset
*/
public void consumeWithAsyncCommit() {
consumer.subscribe(Collections.singletonList(topic));
try {
while (running) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
for (ConsumerRecord<String, String> record : records) {
processMessage(record);
}
// 异步提交(不阻塞;提交失败不会自动重试,可能导致重启后重复消费)
consumer.commitAsync((offsets, exception) -> {
if (exception != null) {
System.err.println("Offset 提交失败: " + exception.getMessage());
} else {
System.out.println("Offset 提交成功: " + offsets);
}
});
}
} finally {
// 最后一次同步提交,确保 offset 不丢失
consumer.commitSync();
consumer.close();
}
}
/**
* 精确控制 offset 提交
*/
public void consumeWithManualCommit() {
consumer.subscribe(Collections.singletonList(topic));
try {
while (running) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
for (TopicPartition partition : records.partitions()) {
List<ConsumerRecord<String, String>> partitionRecords =
records.records(partition);
for (ConsumerRecord<String, String> record : partitionRecords) {
try {
processMessage(record);
// 逐条提交 offset(精确控制,但性能较低)
consumer.commitSync(Collections.singletonMap(
partition,
new OffsetAndMetadata(record.offset() + 1)
));
} catch (Exception e) {
System.err.println("处理消息失败: " + e.getMessage());
// 可以选择跳过或重试
break;
}
}
}
}
} finally {
consumer.close();
}
}
/**
* 批量处理与提交
*/
public void consumeBatch() {
consumer.subscribe(Collections.singletonList(topic));
try {
while (running) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
if (records.isEmpty()) continue;
// 批量处理
List<Order> orders = new ArrayList<>();
Map<TopicPartition, OffsetAndMetadata> commitOffsets = new HashMap<>();
for (ConsumerRecord<String, String> record : records) {
Order order = parseOrder(record.value());
orders.add(order);
// 记录最后一条消息的 offset
commitOffsets.put(
new TopicPartition(record.topic(), record.partition()),
new OffsetAndMetadata(record.offset() + 1)
);
}
// 批量保存到数据库
try {
saveOrdersToDatabase(orders);
// 批量提交 offset
consumer.commitSync(commitOffsets);
} catch (Exception e) {
System.err.println("批量保存失败: " + e.getMessage());
// 不提交 offset,下次重新消费
}
}
} finally {
consumer.close();
}
}
/**
* 从指定 offset 开始消费
*/
public void consumeFromOffset(long startOffset) {
// 先订阅并 poll 一次以触发分区分配
// 注意:这次 poll 本身可能已拉到消息;生产代码更稳妥的做法是
// 在 ConsumerRebalanceListener.onPartitionsAssigned 回调中执行 seek
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
Set<TopicPartition> partitions = consumer.assignment();
// 从指定 offset 开始
for (TopicPartition partition : partitions) {
consumer.seek(partition, startOffset);
}
// 开始消费
consume();
}
/**
* 从最早的消息开始消费
*/
public void consumeFromBeginning() {
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
consumer.seekToBeginning(consumer.assignment());
consume();
}
/**
* 从最新的消息开始消费
*/
public void consumeFromLatest() {
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
consumer.seekToEnd(consumer.assignment());
consume();
}
/**
* 按时间戳查找 offset
*/
public void consumeFromTimestamp(long timestamp) {
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
Map<TopicPartition, Long> timestampsToSearch = new HashMap<>();
for (TopicPartition partition : consumer.assignment()) {
timestampsToSearch.put(partition, timestamp);
}
// 查找指定时间戳的 offset
Map<TopicPartition, OffsetAndTimestamp> offsets =
consumer.offsetsForTimes(timestampsToSearch);
// 定位到该 offset
for (Map.Entry<TopicPartition, OffsetAndTimestamp> entry : offsets.entrySet()) {
if (entry.getValue() != null) {
consumer.seek(entry.getKey(), entry.getValue().offset());
}
}
consume();
}
/**
* 暂停与恢复消费
*/
public void consumeWithPauseResume() {
consumer.subscribe(Collections.singletonList(topic));
boolean paused = false;
long pauseStartTime = 0;
try {
while (running) {
// 检查是否需要暂停/恢复
if (shouldPause() && !paused) {
consumer.pause(consumer.assignment());
paused = true;
pauseStartTime = System.currentTimeMillis();
System.out.println("消费已暂停");
} else if (paused && !shouldPause()) {
consumer.resume(consumer.assignment());
paused = false;
System.out.println("消费已恢复");
}
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(1000));
for (ConsumerRecord<String, String> record : records) {
processMessage(record);
}
if (!records.isEmpty()) {
consumer.commitSync();
}
}
} finally {
consumer.close();
}
}
/**
* 优雅关闭
*/
public void shutdown() {
running = false;
consumer.wakeup(); // 唤醒 poll 操作
}
private void processMessage(ConsumerRecord<String, String> record) {
// 实际的消息处理逻辑
System.out.println("处理订单: " + record.value());
}
private Order parseOrder(String json) {
// 解析订单 JSON
return new Order("temp", 0.0);
}
private void saveOrdersToDatabase(List<Order> orders) {
// 保存到数据库
System.out.println("保存 " + orders.size() + " 条订单到数据库");
}
private boolean shouldPause() {
// 判断是否需要暂停(例如:下游服务不可用)
return false;
}
// 使用示例
public static void main(String[] args) {
OrderConsumer consumer = new OrderConsumer(
"localhost:9092", "orders", "order-consumer-group");
// 注册关闭钩子
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
System.out.println("正在关闭消费者...");
consumer.shutdown();
}));
// 开始消费
consumer.consume();
}
}
六、Spring Kafka 集成
6.1 依赖配置
<!-- Maven -->
<dependency>
<groupId>org.springframework.kafka</groupId>
<artifactId>spring-kafka</artifactId>
<version>3.0.0</version>
</dependency>
// Gradle
implementation 'org.springframework.kafka:spring-kafka:3.0.0'
6.2 生产者配置
@Configuration
public class KafkaProducerConfig {
@Value("${spring.kafka.bootstrap-servers}")
private String bootstrapServers;
@Bean
public ProducerFactory<String, String> producerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
config.put(ProducerConfig.ACKS_CONFIG, "all");
config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
config.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
return new DefaultKafkaProducerFactory<>(config);
}
@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
return new KafkaTemplate<>(producerFactory());
}
/**
* 事务生产者工厂
*/
@Bean
public ProducerFactory<String, String> transactionalProducerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class);
DefaultKafkaProducerFactory<String, String> factory =
new DefaultKafkaProducerFactory<>(config);
// 设置事务 ID 前缀即可启用事务(等价于配置 transactional.id,二者设置其一即可)
factory.setTransactionIdPrefix("order-tx-");
return factory;
}
@Bean
public KafkaTransactionManager<String, String> kafkaTransactionManager() {
return new KafkaTransactionManager<>(transactionalProducerFactory());
}
}
6.3 消费者配置
@Configuration
@EnableKafka
public class KafkaConsumerConfig {
@Value("${spring.kafka.bootstrap-servers}")
private String bootstrapServers;
@Bean
public ConsumerFactory<String, String> consumerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class);
config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class);
config.put(ConsumerConfig.GROUP_ID_CONFIG, "order-consumer-group");
config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
return new DefaultKafkaConsumerFactory<>(config);
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String>
kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setConcurrency(3); // 3 个消费者线程
factory.getContainerProperties().setAckMode(
ContainerProperties.AckMode.MANUAL_IMMEDIATE);
return factory;
}
}
6.4 消息发送服务
@Service
@Slf4j
public class OrderMessageService {
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;
private static final String ORDER_TOPIC = "orders";
/**
* 发送订单消息
*/
public void sendOrder(Order order) {
String orderJson = toJson(order);
CompletableFuture<SendResult<String, String>> future =
kafkaTemplate.send(ORDER_TOPIC, order.getId(), orderJson);
future.whenComplete((result, ex) -> {
if (ex == null) {
log.info("订单消息发送成功: key={}, partition={}, offset={}",
order.getId(),
result.getRecordMetadata().partition(),
result.getRecordMetadata().offset());
} else {
log.error("订单消息发送失败: key={}", order.getId(), ex);
}
});
}
/**
* 同步发送
*/
public void sendOrderSync(Order order) throws Exception {
String orderJson = toJson(order);
try {
SendResult<String, String> result =
kafkaTemplate.send(ORDER_TOPIC, order.getId(), orderJson).get();
log.info("订单消息发送成功: key={}", order.getId());
} catch (Exception e) {
log.error("订单消息发送失败: key={}", order.getId(), e);
throw e;
}
}
/**
* 事务发送
*/
@Transactional
public void sendOrderInTransaction(List<Order> orders) {
for (Order order : orders) {
kafkaTemplate.send(ORDER_TOPIC, order.getId(), toJson(order));
}
// 事务会在方法结束时自动提交
}
private String toJson(Order order) {
// 转换为 JSON
return String.format("{\"id\":\"%s\",\"amount\":%.1f}",
order.getId(), order.getAmount());
}
}
6.5 消息消费服务
@Service
@Slf4j
public class OrderConsumerService {
/**
* 监听订单消息
*/
@KafkaListener(
topics = "orders",
groupId = "order-consumer-group",
containerFactory = "kafkaListenerContainerFactory"
)
public void consumeOrder(ConsumerRecord<String, String> record,
Acknowledgment acknowledgment) {
try {
log.info("收到订单消息: key={}, value={}, partition={}, offset={}",
record.key(), record.value(),
record.partition(), record.offset());
// 处理订单
Order order = parseOrder(record.value());
processOrder(order);
// 手动确认
acknowledgment.acknowledge();
} catch (Exception e) {
log.error("处理订单消息失败: key={}", record.key(), e);
// 不确认,消息会被重新消费
}
}
/**
* 批量消费
*/
@KafkaListener(
topics = "orders",
groupId = "order-batch-consumer-group",
containerFactory = "kafkaListenerContainerFactory",
batch = "true"
)
public void consumeBatch(List<ConsumerRecord<String, String>> records,
Acknowledgment acknowledgment) {
try {
List<Order> orders = new ArrayList<>();
for (ConsumerRecord<String, String> record : records) {
orders.add(parseOrder(record.value()));
}
// 批量处理
batchProcessOrders(orders);
// 确认
acknowledgment.acknowledge();
} catch (Exception e) {
log.error("批量处理订单失败", e);
}
}
/**
* 指定分区消费
*/
@KafkaListener(
topicPartitions = @TopicPartition(
topic = "orders",
partitions = {"0", "1"}
),
groupId = "order-partition-consumer"
)
public void consumeFromPartition(ConsumerRecord<String, String> record) {
log.info("从分区 {} 收到消息: {}", record.partition(), record.value());
}
/**
* 带错误处理器的监听
*/
@KafkaListener(
topics = "orders",
groupId = "order-error-handler-group"
)
public void consumeWithErrorHandler(ConsumerRecord<String, String> record) {
// 可能抛出异常
processOrder(parseOrder(record.value()));
}
/**
* 错误处理器(Spring Kafka 2.8+ 已用 CommonErrorHandler 体系取代旧的 ErrorHandler 接口)
* 注意:需要通过 factory.setCommonErrorHandler(errorHandler()) 注册到监听容器工厂才会生效
*/
@Bean
public DefaultErrorHandler errorHandler() {
// 失败后每隔 1 秒重试,最多 3 次;仍失败可配合 DeadLetterPublishingRecoverer 转入死信主题
DefaultErrorHandler handler = new DefaultErrorHandler(new FixedBackOff(1000L, 3));
// 反序列化失败属于不可恢复异常,重试无意义,直接跳过
handler.addNotRetryableExceptions(DeserializationException.class);
return handler;
}
private Order parseOrder(String json) {
// 解析 JSON
return new Order("temp", 0.0);
}
private void processOrder(Order order) {
// 处理订单
log.info("处理订单: {}", order.getId());
}
private void batchProcessOrders(List<Order> orders) {
log.info("批量处理 {} 条订单", orders.size());
}
}
七、消息积压处理
7.1 积压原因分析
常见积压原因:
1. 消费速度 < 生产速度
2. 消费者处理逻辑耗时过长
3. 下游服务响应慢(数据库、外部API)
4. 消费者数量不足
5. 分区数不够,无法扩展消费者
排查方法:
1. 查看消费者 Lag:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--describe --group order-consumer-group
2. 查看生产速率:
监控 Broker 的 MessagesInPerSec 指标
3. 分析消费者日志,找出耗时操作
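Lag 的计算方式本身很简单:分区最新位移(LogEndOffset)减去消费者组已提交位移,按分区求和。下面是一个计算示意(真实数据可通过 AdminClient#listConsumerGroupOffsets 与 KafkaConsumer#endOffsets 获取,此处用 Map 模拟这两组数据):

```java
import java.util.Map;

public class ConsumerLagSketch {
    /**
     * Lag = LogEndOffset - CommittedOffset,按分区累加。
     * Map 的 key 为分区号;从未提交过位移的分区按 0 处理。
     */
    public static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committed) {
        long lag = 0;
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            lag += e.getValue() - committed.getOrDefault(e.getKey(), 0L);
        }
        return lag;
    }
}
```

若 totalLag 持续增长而不是周期性回落,说明消费速度已落后于生产速度,需要按下文方案处理。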
7.2 积压处理方案
/**
* 消息积压处理方案
*/
public class LagResolutionStrategy {
/**
* 方案1:增加消费者实例
*
* 前提:分区数 >= 消费者数
*
* 例如:Topic 有 10 个分区,可以部署 10 个消费者实例
*/
/**
* 方案2:临时消费者(甩积压)
*/
public void createTemporaryConsumer() {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// 使用新的消费者组(从最早开始消费)
props.put(ConsumerConfig.GROUP_ID_CONFIG, "lag-resolution-temp");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1000); // 增加拉取数量
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("orders"));
// 快速消费,不做复杂处理
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
if (records.isEmpty()) break;
// 简单处理或转发到其他地方
for (ConsumerRecord<String, String> record : records) {
// 保存到数据库或发送到其他队列
quickProcess(record.value());
}
consumer.commitSync();
}
consumer.close();
}
/**
* 方案3:异步处理
*/
public void consumeWithAsyncProcessing() {
// 使用线程池异步处理
ExecutorService executor = Executors.newFixedThreadPool(10);
KafkaConsumer<String, String> consumer = createConsumer();
consumer.subscribe(Collections.singletonList("orders"));
try {
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
List<CompletableFuture<Void>> futures = new ArrayList<>();
for (ConsumerRecord<String, String> record : records) {
CompletableFuture<Void> future = CompletableFuture.runAsync(
() -> processRecord(record), executor
);
futures.add(future);
}
// 等待所有任务完成
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
.join();
// 提交 offset
consumer.commitSync();
}
} finally {
consumer.close();
executor.shutdown();
}
}
/**
* 方案4:跳过积压,从最新开始
*/
public void skipLag() {
KafkaConsumer<String, String> consumer = createConsumer();
consumer.subscribe(Collections.singletonList("orders"));
consumer.poll(Duration.ofMillis(1000)); // 触发分区分配
consumer.seekToEnd(consumer.assignment()); // 跳到最新
// 开始正常消费
}
/**
* 方案5:增加分区数
*/
// 命令行执行:
// kafka-topics.sh --bootstrap-server localhost:9092 \
// --alter --topic orders --partitions 20
private KafkaConsumer<String, String> createConsumer() {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-consumer-group");
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
return new KafkaConsumer<>(props);
}
private void quickProcess(String message) {
// 快速处理逻辑
}
private void processRecord(ConsumerRecord<String, String> record) {
// 处理逻辑
}
}
八、最佳实践
8.1 生产者最佳实践
/**
* 生产者最佳实践配置
*/
public class BestPracticeProducer {
public static Properties getConfig() {
Properties props = new Properties();
// 基础配置
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
"broker1:9092,broker2:9092,broker3:9092");
// 可靠性配置
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
// 性能配置
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32768);
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864);
// 超时配置
props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);
props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);
return props;
}
}
8.2 消费者最佳实践
/**
* 消费者最佳实践配置
*/
public class BestPracticeConsumer {
public static Properties getConfig(String groupId) {
Properties props = new Properties();
// 基础配置
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
"broker1:9092,broker2:9092,broker3:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
// Offset 管理
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
// 性能配置
props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024);
props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);
// 心跳与会话超时(避免频繁 Rebalance)
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 10000);
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 3000);
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 300000);
return props;
}
}
8.3 Topic 设计最佳实践
/**
* Topic 设计建议
*/
public class TopicDesignBestPractices {
/**
* 1. 分区数设计
*
* - 分区数 >= 预期的最大消费者数
* - 考虑吞吐量:每个分区的吞吐量约 10-20 MB/s
* - 不宜过多:会增加文件句柄、影响选举时间
*
* 公式:分区数 = max(目标吞吐量 / 单分区吞吐量, 消费者数)
*/
/**
* 2. 副本因子
*
* - 生产环境:至少 3
* - min.insync.replicas = 2(至少 2 个副本同步)
*/
/**
* 3. 日志保留策略
*/
public void configureRetention() {
// 基于时间(默认 7 天)
// log.retention.hours=168
// 基于大小
// log.retention.bytes=1073741824
// 日志压缩(保留最新值)
// cleanup.policy=compact
}
/**
* 4. Topic 命名规范
*
* 格式:<业务域>.<数据类型>.<版本>
* 例如:
* - order.event.v1
* - user.change.v1
* - payment.log.v1
*/
}
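上面的分区数估算公式可以落成一个小工具方法(单分区吞吐量取 10-20 MB/s 只是经验假设,应以实际压测结果为准):

```java
public class PartitionCountSketch {
    /**
     * 按文中公式估算分区数:max(ceil(目标吞吐量 / 单分区吞吐量), 消费者数)。
     * 吞吐量单位 MB/s;结果仅作为初始值,实际分区数需结合压测与增长预期调整。
     */
    public static int suggestPartitions(double targetMBps, double perPartitionMBps, int consumers) {
        int byThroughput = (int) Math.ceil(targetMBps / perPartitionMBps);
        return Math.max(byThroughput, consumers);
    }

    public static void main(String[] args) {
        // 目标吞吐 100 MB/s、单分区 10 MB/s、6 个消费者 → 建议 10 个分区
        System.out.println(suggestPartitions(100, 10, 6)); // 10
    }
}
```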
8.4 监控指标
/**
* 关键监控指标
*/
public class KafkaMonitoringMetrics {
/**
* Broker 指标:
* - MessagesInPerSec:消息写入速率
* - BytesInPerSec / BytesOutPerSec:吞吐量
* - UnderReplicatedPartitions:副本不足的分区数
* - OfflinePartitionsCount:离线分区数
* - ActiveControllerCount:活跃 Controller 数(应为 1)
*/
/**
* Producer 指标:
* - record-send-rate:发送速率
* - record-error-rate:错误率
* - request-latency-avg:平均延迟
* - buffer-available-bytes:可用缓冲区
*/
/**
* Consumer 指标:
* - records-lag-max:最大 Lag
* - records-consumed-rate:消费速率
* - commit-rate:提交速率
* - join-rate:加入组速率(Rebalance 频率)
*/
/**
* 使用 JMX 或 Prometheus 监控
*/
}
九、总结
Kafka 核心要点
- 高吞吐量:顺序写入、零拷贝、批量发送
- 高可用:副本机制、ISR、Leader 选举
- 可扩展:分区机制、水平扩展
- 消息持久化:日志段文件、稀疏索引
使用建议
- 生产者:启用幂等、选择合适的 acks、使用压缩
- 消费者:手动提交 offset、合理设置心跳参数、监控 Lag
- Topic 设计:合理分区数、副本因子、日志保留策略
- 监控告警:Lag、吞吐量、错误率、Rebalance 频率
常见问题解决
| 问题 | 原因 | 解决方案 |
|---|---|---|
| 消息丢失 | acks=1 或 0 | 使用 acks=all,启用幂等 |
| 消息重复 | 生产者重试 | 消费者实现幂等性 |
| 消息积压 | 消费慢 | 增加消费者、异步处理、临时消费者 |
| Rebalance 频繁 | 心跳超时 | 调整 session.timeout.ms、max.poll.interval.ms |
| 高延迟 | 网络或磁盘 | 优化网络、使用 SSD、调整 batch.size |
十、思考与练习
思考题
- 基础题:Kafka 如何保证消息的顺序性?在什么情况下消息会出现乱序?
- 进阶题:Kafka 的消费者组 Rebalance 是如何工作的?频繁 Rebalance 会导致什么问题?如何避免?
- 实战题:Kafka 与 RabbitMQ 在消息模型、吞吐量、延迟、适用场景上有何区别?在一个电商系统中,你会如何选择使用哪个消息队列?
编程练习
练习:使用Spring Kafka实现一个完整的消息系统,包含:(1) 生产者批量发送与压缩配置;(2) 消费者手动提交offset与异常处理;(3) 消息积压监控与告警;(4) 死信队列处理失败消息。
章节关联
- 前置章节:RabbitMQ核心原理与实战
- 后续章节:RocketMQ核心原理与实战
- 扩展阅读:《Kafka权威指南》、Kafka官方文档
📝 下一章预告
下一章将讲解 Apache RocketMQ:阿里巴巴开源的分布式消息中间件。RocketMQ 以其事务消息、延迟消息、消息轨迹等特性,在电商、金融领域有着独特优势。
本章完
参考资料:
- Apache Kafka 官方文档:kafka.apache.org/documentati...
- Spring Kafka 文档:docs.spring.io/spring-kafk...
- Kafka 权威指南(书籍)