Table of Contents
- Kafka Overview
- Core Concepts
- Architecture
- Environment Setup
- Producer Development
- Consumer Development
- Topic Management
- Partitioning Strategies
- Message Reliability
- Performance Tuning
- Monitoring and Management
- Cluster Deployment
- Practical Case Study
- Best Practices
Kafka Overview
Apache Kafka is a distributed streaming platform used primarily to build real-time data pipelines and streaming applications. It offers high throughput, scalability, and fault tolerance, and is widely used for log collection, metrics aggregation, and stream processing.
Core Features
High throughput: a single broker can handle millions of messages per second
Low latency: millisecond-level message delivery
Scalability: clusters scale out horizontally
Durability: messages are persisted to disk and replicated
Fault tolerance: automatic recovery from node failures
Real-time processing: supports real-time stream processing
Multi-language support: client libraries for many languages
Application Scenarios
Log aggregation → Kafka → real-time analytics
User behavior → Kafka → recommendation systems
Sensor data → Kafka → monitoring systems
Database change capture → Kafka → data synchronization
Message queue: asynchronous message delivery
Log aggregation: collect and aggregate logs from distributed systems
Stream processing: real-time data processing and analysis
Event sourcing: infrastructure for event-driven architectures
Data integration: moving data between heterogeneous systems
Core Concepts
Producer
A producer publishes messages to Kafka topics.
java
// Basic producer
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<>("my-topic", "key", "value"));
producer.close();
Consumer
A consumer subscribes to Kafka topics and consumes messages from them.
java
// Basic consumer
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test-group");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("my-topic"));
// Poll loop: fetch and process records
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    records.forEach(record -> System.out.println(record.key() + " = " + record.value()));
}
Topic
A topic is a category of messages: producers publish messages to a topic and consumers subscribe to it.
bash
# Create a topic
kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
# List topics
kafka-topics.sh --list --bootstrap-server localhost:9092
# Describe a topic
kafka-topics.sh --describe --topic my-topic --bootstrap-server localhost:9092
Partition
A partition is a physical subdivision of a topic; each partition is an ordered, append-only sequence of messages.
text
Topic: user-events
├── Partition 0: [msg1, msg4, msg7, ...]
├── Partition 1: [msg2, msg5, msg8, ...]
└── Partition 2: [msg3, msg6, msg9, ...]
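A producer can steer records to partitions explicitly or through the record key. A minimal sketch reusing the producer from the example above; the topic, key, and value names are illustrative:
java
// Explicit partition: the Integer argument pins this record to partition 1
producer.send(new ProducerRecord<>("user-events", 1, "user-42", "login"));

// Key-based partitioning (default): records with the same key always land in the
// same partition, which preserves per-key ordering
producer.send(new ProducerRecord<>("user-events", "user-42", "logout"));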
Offset
An offset is the unique, monotonically increasing identifier of a message within a partition; consumers use it to track their progress.
text
Partition 0: [0][1][2][3][4][5][6]...
                       ↑
               Consumer Offset
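Offsets can be committed and repositioned explicitly on the consumer. A minimal sketch reusing the consumer from the example above; the topic name and offset values are illustrative:
java
TopicPartition tp = new TopicPartition("user-events", 0);

// Commit the current position after the records returned by poll() are processed
consumer.commitSync();

// Rewind to a known offset, e.g. to reprocess data after a bug fix
// (the partition must already be assigned to this consumer)
consumer.seek(tp, 42L);

// Jump to the start or end of the partition
consumer.seekToBeginning(Collections.singletonList(tp));
consumer.seekToEnd(Collections.singletonList(tp));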
Consumer Group
A consumer group is a set of consumers that cooperate to consume a topic's partitions; within a group, each partition is consumed by exactly one consumer.
text
Consumer Group: user-service
├── Consumer 1 → Partition 0, 1
├── Consumer 2 → Partition 2, 3
└── Consumer 3 → Partition 4, 5
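Every consumer started with the same group.id joins the same group, and the group coordinator spreads the topic's partitions across the members. A small sketch to inspect the assignment a given instance received; the group and topic names are illustrative:
java
// Two application instances started with the same group.id form one consumer group;
// Kafka splits the topic's partitions between them automatically.
props.put("group.id", "user-service");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("user-events"));

// The assignment is established on the first poll
consumer.poll(Duration.ofMillis(500));
System.out.println("Partitions assigned to this instance: " + consumer.assignment());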
Architecture
Cluster Architecture
text
┌─────────────────────────────────────────────────────────────┐
│ Kafka Cluster │
├─────────────────┬─────────────────┬─────────────────────────┤
│ Broker 1 │ Broker 2 │ Broker 3 │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────────┐ │
│ │Topic A │ │ │Topic A │ │ │Topic A │ │
│ │Partition 0 │ │ │Partition 1 │ │ │Partition 2 │ │
│ │(Leader) │ │ │(Follower) │ │ │(Leader) │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────────┘ │
└─────────────────┴─────────────────┴─────────────────────────┘
↑ ↑ ↑
┌─────────────────────────────────────────────────────────────┐
│ ZooKeeper Ensemble │
│ Node 1 Node 2 Node 3 │
└─────────────────────────────────────────────────────────────┘
Data Flow
text
Producer → Broker → Partition → Consumer
    ↓          ↓          ↓           ↓
Serialization → Partitioning → Persistence → Deserialization
Environment Setup
Docker Deployment
yaml
# docker-compose.yml
version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.4.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zookeeper-data:/var/lib/zookeeper/data
      - zookeeper-logs:/var/lib/zookeeper/log

  kafka1:
    image: confluentinc/cp-kafka:7.4.0
    hostname: kafka1
    container_name: kafka1
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
      - "19092:19092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka1:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:29092,PLAINTEXT_HOST://0.0.0.0:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 2
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 3
      KAFKA_JMX_PORT: 19092
      KAFKA_JMX_HOSTNAME: localhost
    volumes:
      - kafka1-data:/var/lib/kafka/data

  kafka2:
    image: confluentinc/cp-kafka:7.4.0
    hostname: kafka2
    container_name: kafka2
    depends_on:
      - zookeeper
    ports:
      - "9093:9093"
      - "19093:19093"
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka2:29093,PLAINTEXT_HOST://localhost:9093
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:29093,PLAINTEXT_HOST://0.0.0.0:9093
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 2
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 3
      KAFKA_JMX_PORT: 19093
      KAFKA_JMX_HOSTNAME: localhost
    volumes:
      - kafka2-data:/var/lib/kafka/data

  kafka3:
    image: confluentinc/cp-kafka:7.4.0
    hostname: kafka3
    container_name: kafka3
    depends_on:
      - zookeeper
    ports:
      - "9094:9094"
      - "19094:19094"
    environment:
      KAFKA_BROKER_ID: 3
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka3:29094,PLAINTEXT_HOST://localhost:9094
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:29094,PLAINTEXT_HOST://0.0.0.0:9094
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 2
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 3
      KAFKA_JMX_PORT: 19094
      KAFKA_JMX_HOSTNAME: localhost
    volumes:
      - kafka3-data:/var/lib/kafka/data

  kafka-ui:
    image: provectuslabs/kafka-ui:latest
    container_name: kafka-ui
    depends_on:
      - kafka1
      - kafka2
      - kafka3
    ports:
      - "8080:8080"
    environment:
      KAFKA_CLUSTERS_0_NAME: local
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka1:29092,kafka2:29093,kafka3:29094
      KAFKA_CLUSTERS_0_ZOOKEEPER: zookeeper:2181

volumes:
  zookeeper-data:
  zookeeper-logs:
  kafka1-data:
  kafka2-data:
  kafka3-data:
Native Deployment
bash
# 1. Download Kafka
wget https://downloads.apache.org/kafka/2.8.2/kafka_2.13-2.8.2.tgz
tar -xzf kafka_2.13-2.8.2.tgz
cd kafka_2.13-2.8.2
# 2. Start ZooKeeper
bin/zookeeper-server-start.sh config/zookeeper.properties
# 3. Start the Kafka broker
bin/kafka-server-start.sh config/server.properties
# 4. Create a topic
bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092
# 5. Produce test messages
bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
# 6. Consume test messages
bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
Java Project Dependencies
xml
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>3.5.0</version>
</dependency>
<!-- Spring Kafka -->
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>
<!-- JSON serialization -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
</dependency>
<!-- Avro serialization -->
<dependency>
    <groupId>io.confluent</groupId>
    <artifactId>kafka-avro-serializer</artifactId>
    <version>7.4.0</version>
</dependency>
Producer Development
Basic Producer
java
@Service
@Slf4j
public class KafkaProducerService {

    private final KafkaTemplate<String, Object> kafkaTemplate;

    public KafkaProducerService(KafkaTemplate<String, Object> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void sendMessage(String topic, String key, Object value) {
        try {
            ProducerRecord<String, Object> record = new ProducerRecord<>(topic, key, value);
            // Add message headers
            record.headers().add("timestamp", String.valueOf(System.currentTimeMillis()).getBytes());
            record.headers().add("source", "producer-service".getBytes());
            // Send the message
            ListenableFuture<SendResult<String, Object>> future = kafkaTemplate.send(record);
            // Register callbacks
            future.addCallback(
                result -> log.info("Message sent: topic={}, partition={}, offset={}",
                    result.getRecordMetadata().topic(),
                    result.getRecordMetadata().partition(),
                    result.getRecordMetadata().offset()),
                failure -> log.error("Message send failed: {}", failure.getMessage(), failure)
            );
        } catch (Exception e) {
            log.error("Exception while sending message: topic={}, key={}", topic, key, e);
        }
    }

    public void sendBatchMessages(String topic, List<Object> messages) {
        List<ProducerRecord<String, Object>> records = messages.stream()
            .map(message -> new ProducerRecord<String, Object>(topic, message))
            .collect(Collectors.toList());
        // Send each record; the producer batches them internally
        for (ProducerRecord<String, Object> record : records) {
            kafkaTemplate.send(record);
        }
        // Flush to make sure everything has been sent
        kafkaTemplate.flush();
        log.info("Batch send complete, message count: {}", messages.size());
    }
}
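When the caller needs the broker's acknowledgment before continuing (for example, to return the offset to an upstream API), the same template can be used synchronously by blocking on the returned future. A minimal sketch; the topic, key, and payload names are illustrative:
java
// Synchronous send: block on the future, trading throughput for a confirmed result
SendResult<String, Object> result =
    kafkaTemplate.send("user-events", "user-42", payload).get(10, TimeUnit.SECONDS);
log.info("Acknowledged at offset {}", result.getRecordMetadata().offset());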
Transactional Producer
java
@Service
@Slf4j
@RequiredArgsConstructor
public class TransactionalProducerService {

    private final KafkaTransactionManager<String, Object> transactionManager;
    private final KafkaTemplate<String, Object> kafkaTemplate;
    private final EventRepository eventRepository;

    @Transactional
    public void sendTransactionalMessages(List<MessageEvent> events) {
        try {
            for (MessageEvent event : events) {
                // Send to multiple topics within the same transaction
                kafkaTemplate.send("user-events", event.getUserId(), event);
                kafkaTemplate.send("audit-events", event.getEventId(), event);
                // Database write inside the same transactional boundary
                saveEventToDatabase(event);
            }
            log.info("Transactional messages sent, count: {}", events.size());
        } catch (Exception e) {
            log.error("Transactional send failed", e);
            throw e; // trigger rollback
        }
    }

    private void saveEventToDatabase(MessageEvent event) {
        // Persist the event
        eventRepository.save(event);
    }
}
java
@Configuration
@EnableKafka
@EnableTransactionManagement
public class KafkaTransactionConfig {

    @Bean
    public KafkaTransactionManager<String, Object> kafkaTransactionManager(
            ProducerFactory<String, Object> producerFactory) {
        return new KafkaTransactionManager<>(producerFactory);
    }

    @Bean
    public ProducerFactory<String, Object> producerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        // Transaction-related settings
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "transaction-producer");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
        return new DefaultKafkaProducerFactory<>(props);
    }
}
Custom Partitioner
java
public class CustomPartitioner implements Partitioner {

    private final Random random = new Random();

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int numPartitions = partitions.size();
        if (key == null) {
            // No key: spread records randomly across partitions
            return random.nextInt(numPartitions);
        }
        // Custom partitioning logic
        if (key instanceof String) {
            String keyStr = (String) key;
            // Route VIP users to a fixed partition
            if (keyStr.startsWith("vip_")) {
                return 0;
            }
            // Hash the user ID to pick a partition
            return Math.abs(keyStr.hashCode()) % numPartitions;
        }
        return Math.abs(key.hashCode()) % numPartitions;
    }

    @Override
    public void close() {
        // Release resources
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // Read configuration if needed
    }
}

// Register the custom partitioner
@Bean
public ProducerFactory<String, Object> producerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, CustomPartitioner.class);
    // other settings...
    return new DefaultKafkaProducerFactory<>(props);
}
Consumer Development
Basic Consumer
java
@Component
@Slf4j
public class KafkaConsumerService {

    @KafkaListener(topics = "user-events", groupId = "user-service-group")
    public void handleUserEvent(UserEvent event,
                                @Headers Map<String, Object> headers,
                                ConsumerRecord<String, UserEvent> record) {
        log.info("Received user event: topic={}, partition={}, offset={}, key={}, value={}",
            record.topic(), record.partition(), record.offset(), record.key(), event);
        try {
            // Business logic
            processUserEvent(event);
            log.info("User event processed: {}", event.getEventId());
        } catch (Exception e) {
            log.error("User event processing failed: {}", event.getEventId(), e);
            throw e; // let the error handler / retry mechanism take over
        }
    }

    @KafkaListener(topics = "order-events",
                   groupId = "order-service-group",
                   containerFactory = "orderKafkaListenerContainerFactory")
    public void handleOrderEvent(OrderEvent event) {
        log.info("Handling order event: {}", event);
        switch (event.getStatus()) {
            case CREATED:
                handleOrderCreated(event);
                break;
            case PAID:
                handleOrderPaid(event);
                break;
            case SHIPPED:
                handleOrderShipped(event);
                break;
            case COMPLETED:
                handleOrderCompleted(event);
                break;
            default:
                log.warn("Unknown order status: {}", event.getStatus());
        }
    }
}
Batch Consumer
java
@Component
@Slf4j
@RequiredArgsConstructor
public class BatchKafkaConsumer {

    private final BatchService batchService;
    private final EventService eventService;

    @KafkaListener(topics = "batch-events",
                   groupId = "batch-consumer-group",
                   containerFactory = "batchKafkaListenerContainerFactory")
    public void handleBatchEvents(List<ConsumerRecord<String, Object>> records) {
        log.info("Processing batch, record count: {}", records.size());
        List<Object> events = records.stream()
            .map(ConsumerRecord::value)
            .collect(Collectors.toList());
        try {
            // Process the whole batch at once
            processBatchEvents(events);
            log.info("Batch processed successfully, count: {}", events.size());
        } catch (Exception e) {
            log.error("Batch processing failed", e);
            // Fall back to processing the records one by one
            for (ConsumerRecord<String, Object> record : records) {
                try {
                    processSingleEvent(record.value());
                } catch (Exception ex) {
                    log.error("Single record processing failed: partition={}, offset={}",
                        record.partition(), record.offset(), ex);
                    // Route the failed record to a dead letter queue
                    sendToDeadLetterQueue(record);
                }
            }
        }
    }

    private void processBatchEvents(List<Object> events) {
        // Batch business logic
        batchService.processEvents(events);
    }

    private void processSingleEvent(Object event) {
        // Single-event business logic
        eventService.processEvent(event);
    }
}

// Batch consumer configuration
@Configuration
public class BatchConsumerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, Object>
            batchKafkaListenerContainerFactory(ConsumerFactory<String, Object> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Enable batch listening with three concurrent containers
        factory.setBatchListener(true);
        factory.setConcurrency(3);
        // Container-level settings
        ContainerProperties containerProperties = factory.getContainerProperties();
        containerProperties.setPollTimeout(3000);
        containerProperties.setAckMode(ContainerProperties.AckMode.BATCH);
        return factory;
    }
}
Manual Offset Commit
java
@Component
@Slf4j
public class ManualCommitConsumer {

    @KafkaListener(topics = "manual-commit-topic",
                   groupId = "manual-commit-group",
                   containerFactory = "manualCommitContainerFactory")
    public void handleManualCommit(ConsumerRecord<String, Object> record,
                                   Acknowledgment acknowledgment) {
        log.info("Handling message with manual commit: partition={}, offset={}",
            record.partition(), record.offset());
        try {
            // Business logic
            processMessage(record.value());
            // Commit the offset manually
            acknowledgment.acknowledge();
            log.info("Message processed and offset committed: offset={}", record.offset());
        } catch (Exception e) {
            log.error("Message processing failed, offset not committed: offset={}", record.offset(), e);
            // acknowledge() is not called, so the message will be redelivered after a rebalance or restart
        }
    }
}

// Manual commit configuration
@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object>
        manualCommitContainerFactory(ConsumerFactory<String, Object> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory =
        new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // Require explicit acknowledgment from the listener
    ContainerProperties containerProperties = factory.getContainerProperties();
    containerProperties.setAckMode(ContainerProperties.AckMode.MANUAL);
    return factory;
}
Topic Management
Managing Topics Programmatically
java
@Service
@Slf4j
public class TopicManagementService {

    private final AdminClient adminClient;

    public TopicManagementService(@Value("${spring.kafka.bootstrap-servers}") String bootstrapServers) {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        this.adminClient = AdminClient.create(props);
    }

    public void createTopic(String topicName, int numPartitions, short replicationFactor) {
        NewTopic newTopic = new NewTopic(topicName, numPartitions, replicationFactor);
        // Topic-level configuration
        Map<String, String> configs = new HashMap<>();
        configs.put(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_DELETE);
        configs.put(TopicConfig.RETENTION_MS_CONFIG, "604800000"); // 7 days
        configs.put(TopicConfig.SEGMENT_MS_CONFIG, "86400000");    // 1 day
        newTopic.configs(configs);
        CreateTopicsResult result = adminClient.createTopics(Collections.singletonList(newTopic));
        try {
            result.all().get();
            log.info("Topic created: {}", topicName);
        } catch (Exception e) {
            log.error("Topic creation failed: {}", topicName, e);
            throw new RuntimeException("Topic creation failed", e);
        }
    }

    public void deleteTopic(String topicName) {
        DeleteTopicsResult result = adminClient.deleteTopics(Collections.singletonList(topicName));
        try {
            result.all().get();
            log.info("Topic deleted: {}", topicName);
        } catch (Exception e) {
            log.error("Topic deletion failed: {}", topicName, e);
            throw new RuntimeException("Topic deletion failed", e);
        }
    }

    public List<String> listTopics() {
        try {
            ListTopicsResult result = adminClient.listTopics();
            return new ArrayList<>(result.names().get());
        } catch (Exception e) {
            log.error("Failed to list topics", e);
            throw new RuntimeException("Failed to list topics", e);
        }
    }

    public TopicDescription describeTopic(String topicName) {
        try {
            DescribeTopicsResult result = adminClient.describeTopics(Collections.singletonList(topicName));
            return result.values().get(topicName).get();
        } catch (Exception e) {
            log.error("Failed to describe topic: {}", topicName, e);
            throw new RuntimeException("Failed to describe topic", e);
        }
    }

    public void alterTopicConfig(String topicName, Map<String, String> configs) {
        ConfigResource resource = new ConfigResource(ConfigResource.Type.TOPIC, topicName);
        List<AlterConfigOp> ops = configs.entrySet().stream()
            .map(entry -> new AlterConfigOp(
                new ConfigEntry(entry.getKey(), entry.getValue()),
                AlterConfigOp.OpType.SET))
            .collect(Collectors.toList());
        Map<ConfigResource, Collection<AlterConfigOp>> alterConfigs =
            Collections.singletonMap(resource, ops);
        AlterConfigsResult result = adminClient.incrementalAlterConfigs(alterConfigs);
        try {
            result.all().get();
            log.info("Topic config updated: {}", topicName);
        } catch (Exception e) {
            log.error("Topic config update failed: {}", topicName, e);
            throw new RuntimeException("Topic config update failed", e);
        }
    }
}
Auto-Creating Topics in Spring Boot
java
@Configuration
public class KafkaTopicConfig {

    @Bean
    public NewTopic userEventsTopic() {
        return TopicBuilder.name("user-events")
            .partitions(6)
            .replicas(3)
            .config(TopicConfig.RETENTION_MS_CONFIG, "604800000") // 7-day retention
            .config(TopicConfig.SEGMENT_MS_CONFIG, "86400000")    // 1-day segments
            .config(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_DELETE)
            .build();
    }

    @Bean
    public NewTopic orderEventsTopic() {
        return TopicBuilder.name("order-events")
            .partitions(12)
            .replicas(3)
            .config(TopicConfig.RETENTION_MS_CONFIG, "2592000000") // 30-day retention
            .compact() // log compaction
            .build();
    }

    @Bean
    public NewTopic auditLogTopic() {
        return TopicBuilder.name("audit-logs")
            .partitions(3)
            .replicas(3)
            .config(TopicConfig.RETENTION_MS_CONFIG, "7776000000") // 90-day retention
            .config(TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, "2")
            .build();
    }
}
Partitioning Strategies
Custom Partition Assignment Strategy
java
public class CustomPartitionAssignor extends AbstractPartitionAssignor {

    @Override
    public String name() {
        return "custom";
    }

    @Override
    public GroupAssignment assign(Cluster metadata, GroupSubscription groupSubscription) {
        Map<String, List<TopicPartition>> assignment = new HashMap<>();
        // All members of the group
        List<String> members = new ArrayList<>(groupSubscription.groupSubscription().keySet());
        // All partitions of the subscribed topics
        List<TopicPartition> allPartitions = getAllPartitions(metadata, groupSubscription);
        // Custom assignment: spread partitions evenly across members
        int totalPartitions = allPartitions.size();
        int membersCount = members.size();
        for (int i = 0; i < members.size(); i++) {
            String member = members.get(i);
            List<TopicPartition> memberPartitions = new ArrayList<>();
            // Number of partitions this member should receive
            int partitionsPerMember = totalPartitions / membersCount;
            int extraPartitions = totalPartitions % membersCount;
            int startIndex = i * partitionsPerMember + Math.min(i, extraPartitions);
            int endIndex = startIndex + partitionsPerMember + (i < extraPartitions ? 1 : 0);
            for (int j = startIndex; j < endIndex && j < allPartitions.size(); j++) {
                memberPartitions.add(allPartitions.get(j));
            }
            assignment.put(member, memberPartitions);
        }
        // Build the assignment result
        Map<String, Assignment> groupAssignment = new HashMap<>();
        for (Map.Entry<String, List<TopicPartition>> entry : assignment.entrySet()) {
            groupAssignment.put(entry.getKey(), new Assignment(entry.getValue()));
        }
        return new GroupAssignment(groupAssignment);
    }

    private List<TopicPartition> getAllPartitions(Cluster metadata, GroupSubscription groupSubscription) {
        List<TopicPartition> allPartitions = new ArrayList<>();
        for (String topic : groupSubscription.groupSubscription().values().iterator().next().topics()) {
            int partitionCount = metadata.partitionCountForTopic(topic);
            for (int i = 0; i < partitionCount; i++) {
                allPartitions.add(new TopicPartition(topic, i));
            }
        }
        return allPartitions;
    }
}

// Register the custom assignment strategy
@Bean
public ConsumerFactory<String, Object> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
        Arrays.asList(CustomPartitionAssignor.class.getName()));
    // other settings...
    return new DefaultKafkaConsumerFactory<>(props);
}
Partition Rebalance Listener
java
@Component
@Slf4j
public class RebalanceAwareConsumer {

    @KafkaListener(topics = "rebalance-topic",
                   groupId = "rebalance-group",
                   containerFactory = "rebalanceListenerContainerFactory")
    public void handleMessage(ConsumerRecord<String, Object> record) {
        log.info("Processing message: partition={}, offset={}", record.partition(), record.offset());
        processMessage(record.value());
    }
}

@Configuration
@Slf4j
public class RebalanceListenerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, Object>
            rebalanceListenerContainerFactory(ConsumerFactory<String, Object> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Register a rebalance listener
        factory.getContainerProperties().setConsumerRebalanceListener(new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                log.info("Partitions revoked: {}", partitions);
                // Persist in-flight processing state before giving up the partitions
                for (TopicPartition partition : partitions) {
                    saveProcessingState(partition);
                }
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                log.info("Partitions assigned: {}", partitions);
                // Restore processing state for the newly assigned partitions
                for (TopicPartition partition : partitions) {
                    restoreProcessingState(partition);
                }
            }

            @Override
            public void onPartitionsLost(Collection<TopicPartition> partitions) {
                log.warn("Partitions lost: {}", partitions);
                // Clean up state for partitions that were lost without a clean revoke
                for (TopicPartition partition : partitions) {
                    cleanupPartitionState(partition);
                }
            }
        });
        return factory;
    }

    private void saveProcessingState(TopicPartition partition) {
        // Persist the processing state of this partition
        log.debug("Saving state for partition: {}", partition);
    }

    private void restoreProcessingState(TopicPartition partition) {
        // Restore the processing state of this partition
        log.debug("Restoring state for partition: {}", partition);
    }

    private void cleanupPartitionState(TopicPartition partition) {
        // Discard state for this partition
        log.debug("Cleaning up state for partition: {}", partition);
    }
}
Message Reliability
Producer Reliability Configuration
java
@Configuration
public class ReliableProducerConfig {

    @Bean
    public ProducerFactory<String, Object> reliableProducerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        // Reliability settings
        props.put(ProducerConfig.ACKS_CONFIG, "all");                       // wait for all in-sync replicas
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);        // retry indefinitely
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5); // bound unacknowledged requests
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);          // enable idempotence
        // Batching settings
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);       // 16 KB batch size
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);            // wait up to 10 ms to fill a batch
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);  // 32 MB send buffer
        // Compression
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");  // LZ4 compression
        // Timeouts
        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);    // 30 s request timeout
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);  // 2 min delivery timeout
        return new DefaultKafkaProducerFactory<>(props);
    }
}
Consumer Reliability Configuration
java
@Configuration
public class ReliableConsumerConfig {

    @Bean
    public ConsumerFactory<String, Object> reliableConsumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "reliable-consumer-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
        // Reliability settings
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);         // disable auto commit
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");     // start from the earliest offset
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // read only committed messages
        // Session settings
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 30000);    // 30 s session timeout
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 3000);  // 3 s heartbeat interval
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 300000); // 5 min max poll interval
        // Fetch settings
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);   // at most 500 records per poll
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024);   // fetch at least 1 KB
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);  // wait at most 500 ms
        return new DefaultKafkaConsumerFactory<>(props);
    }
}
Handling Duplicate Messages
java
@Component
@Slf4j
@RequiredArgsConstructor
public class DuplicateMessageHandler {

    private final RedisTemplate<String, String> redisTemplate;
    private final Duration duplicateWindowDuration = Duration.ofHours(24);

    @KafkaListener(topics = "dedupe-topic", groupId = "dedupe-group")
    public void handleMessage(@Payload MessageEvent event,
                              @Header("kafka_receivedMessageKey") String messageKey,
                              ConsumerRecord<String, MessageEvent> record) {
        String dedupeKey = buildDedupeKey(event);
        // Skip messages that were already processed
        if (isDuplicateMessage(dedupeKey)) {
            log.info("Duplicate message detected, skipping: key={}, offset={}",
                messageKey, record.offset());
            return;
        }
        try {
            // Business logic
            processMessage(event);
            // Mark the message as processed
            markMessageProcessed(dedupeKey);
            log.info("Message processed: key={}, eventId={}", messageKey, event.getEventId());
        } catch (Exception e) {
            log.error("Message processing failed: key={}, eventId={}", messageKey, event.getEventId(), e);
            throw e;
        }
    }

    private String buildDedupeKey(MessageEvent event) {
        // Deduplication key, e.g. based on the message ID or a content hash
        return String.format("dedupe:%s:%s", event.getEventType(), event.getEventId());
    }

    private boolean isDuplicateMessage(String dedupeKey) {
        Boolean exists = redisTemplate.hasKey(dedupeKey);
        return Boolean.TRUE.equals(exists);
    }

    private void markMessageProcessed(String dedupeKey) {
        redisTemplate.opsForValue().set(dedupeKey, "processed", duplicateWindowDuration);
    }

    private void processMessage(MessageEvent event) {
        // Business logic per event type
        switch (event.getEventType()) {
            case "USER_CREATED":
                handleUserCreated(event);
                break;
            case "ORDER_PLACED":
                handleOrderPlaced(event);
                break;
            default:
                log.warn("Unknown event type: {}", event.getEventType());
        }
    }
}
Dead Letter Queue Handling
java
@Component
@Slf4j
@RequiredArgsConstructor
public class DeadLetterQueueHandler {

    private final KafkaTemplate<String, Object> kafkaTemplate;

    @KafkaListener(topics = "main-topic",
                   groupId = "main-consumer-group",
                   errorHandler = "kafkaErrorHandler")
    public void handleMainMessage(ConsumerRecord<String, Object> record) {
        log.info("Processing main-topic message: partition={}, offset={}", record.partition(), record.offset());
        // Simulate random failures
        if (Math.random() < 0.1) {
            throw new RuntimeException("Simulated processing failure");
        }
        processMessage(record.value());
    }

    @KafkaListener(topics = "main-topic.DLT", groupId = "dlq-consumer-group")
    public void handleDeadLetterMessage(ConsumerRecord<String, Object> record,
                                        @Headers Map<String, Object> headers) {
        log.info("Processing dead letter: partition={}, offset={}", record.partition(), record.offset());
        log.info("Original exception: {}", headers.get("kafka_dlt-exception-message"));
        try {
            // Dead letter handling logic
            processDeadLetterMessage(record.value(), headers);
        } catch (Exception e) {
            log.error("Dead letter handling failed: {}", record.offset(), e);
            // Raise an alert
            sendAlert("Dead letter handling failed", record, e);
        }
    }

    private void processDeadLetterMessage(Object message, Map<String, Object> headers) {
        // Inspect the failure reason
        String exceptionMessage = (String) headers.get("kafka_dlt-exception-message");
        if (exceptionMessage.contains("timeout")) {
            // Timeout: schedule a delayed retry
            scheduleRetry(message, Duration.ofMinutes(5));
        } else if (exceptionMessage.contains("validation")) {
            // Validation error: log and drop
            logValidationError(message, exceptionMessage);
        } else {
            // Anything else: hand over for manual review
            notifyManualReview(message, exceptionMessage);
        }
    }

    @Bean
    public KafkaListenerErrorHandler kafkaErrorHandler() {
        return (message, exception) -> {
            log.error("Kafka listener error: {}", exception.getMessage());
            // Extract the original record
            ConsumerRecord<?, ?> record = (ConsumerRecord<?, ?>) message.getPayload();
            // Build the dead letter record
            ProducerRecord<String, Object> dlqRecord = new ProducerRecord<>(
                record.topic() + ".DLT",
                record.key() != null ? record.key().toString() : null,
                record.value()
            );
            // Attach failure metadata as headers
            dlqRecord.headers().add("kafka_dlt-exception-message",
                exception.getMessage().getBytes());
            dlqRecord.headers().add("kafka_dlt-original-topic",
                record.topic().getBytes());
            dlqRecord.headers().add("kafka_dlt-original-partition",
                String.valueOf(record.partition()).getBytes());
            dlqRecord.headers().add("kafka_dlt-original-offset",
                String.valueOf(record.offset()).getBytes());
            dlqRecord.headers().add("kafka_dlt-exception-timestamp",
                String.valueOf(System.currentTimeMillis()).getBytes());
            // Publish to the dead letter topic
            kafkaTemplate.send(dlqRecord);
            log.info("Message forwarded to dead letter topic: topic={}, partition={}, offset={}",
                record.topic(), record.partition(), record.offset());
            return null;
        };
    }
}
Performance Tuning
Producer Performance Tuning
java
@Configuration
public class KafkaProducerOptimizationConfig {

    @Bean
    public ProducerFactory<String, Object> producerFactory() {
        Map<String, Object> configProps = new HashMap<>();
        // Basic settings
        configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        // Throughput-oriented settings
        configProps.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);        // 64 KB batch size
        configProps.put(ProducerConfig.LINGER_MS_CONFIG, 50);            // wait up to 50 ms to fill batches
        configProps.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864);  // 64 MB send buffer
        configProps.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");  // LZ4 compression
        // Concurrency
        configProps.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
        // Reliability trade-off: acks=1 favors latency over durability
        configProps.put(ProducerConfig.ACKS_CONFIG, "1");
        configProps.put(ProducerConfig.RETRIES_CONFIG, 3);
        configProps.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 1000);
        // Timeouts
        configProps.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);
        configProps.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);
        return new DefaultKafkaProducerFactory<>(configProps);
    }
}
Consumer Performance Tuning
java
@Configuration
public class KafkaConsumerOptimizationConfig {

    @Bean
    public ConsumerFactory<String, Object> consumerFactory() {
        Map<String, Object> configProps = new HashMap<>();
        // Basic settings
        configProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        configProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        configProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
        configProps.put(ConsumerConfig.GROUP_ID_CONFIG, "high-performance-group");
        // Throughput-oriented settings
        configProps.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 50000);  // fetch at least 50 KB
        configProps.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);  // wait at most 500 ms
        configProps.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1000);  // up to 1000 records per poll
        configProps.put(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 262144);  // 256 KB receive buffer
        configProps.put(ConsumerConfig.SEND_BUFFER_CONFIG, 131072);     // 128 KB send buffer
        // Session management
        configProps.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 45000);
        configProps.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 15000);
        // Offset management: commit manually
        configProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        return new DefaultKafkaConsumerFactory<>(configProps);
    }
}
Monitoring and Management
JMX Monitoring
java
@Service
@Slf4j
public class KafkaMetricsService {

    private MBeanServer mBeanServer;

    @PostConstruct
    public void init() {
        mBeanServer = ManagementFactory.getPlatformMBeanServer();
    }

    // Producer metrics
    public Map<String, Object> getProducerMetrics() {
        Map<String, Object> metrics = new HashMap<>();
        try {
            // record-send-rate for each producer client
            ObjectName producerMetrics = new ObjectName(
                "kafka.producer:type=producer-metrics,client-id=*");
            Set<ObjectName> names = mBeanServer.queryNames(producerMetrics, null);
            for (ObjectName name : names) {
                Double rate = (Double) mBeanServer.getAttribute(name, "record-send-rate");
                metrics.put("record-send-rate-" + name.getKeyProperty("client-id"), rate);
            }
        } catch (Exception e) {
            log.error("Failed to read producer metrics", e);
        }
        return metrics;
    }

    // Consumer metrics
    public Map<String, Object> getConsumerMetrics() {
        Map<String, Object> metrics = new HashMap<>();
        try {
            // records-consumed-rate for each consumer client
            ObjectName consumerMetrics = new ObjectName(
                "kafka.consumer:type=consumer-fetch-manager-metrics,client-id=*");
            Set<ObjectName> names = mBeanServer.queryNames(consumerMetrics, null);
            for (ObjectName name : names) {
                Double rate = (Double) mBeanServer.getAttribute(name, "records-consumed-rate");
                metrics.put("records-consumed-rate-" + name.getKeyProperty("client-id"), rate);
            }
        } catch (Exception e) {
            log.error("Failed to read consumer metrics", e);
        }
        return metrics;
    }
}
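Consumer lag, the key metric alerted on in the Prometheus rules later, can also be computed directly with the AdminClient by comparing the group's committed offsets against each partition's latest offset. A minimal sketch, assuming an AdminClient like the one created in the topic-management section and an illustrative group id:
java
public Map<TopicPartition, Long> getConsumerLag(String groupId) throws Exception {
    // Offsets the group has committed so far
    Map<TopicPartition, OffsetAndMetadata> committed =
        adminClient.listConsumerGroupOffsets(groupId)
            .partitionsToOffsetAndMetadata().get();

    // Latest (log-end) offsets of the same partitions
    Map<TopicPartition, OffsetSpec> request = committed.keySet().stream()
        .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
    Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> endOffsets =
        adminClient.listOffsets(request).all().get();

    // Lag = log end offset - committed offset
    Map<TopicPartition, Long> lag = new HashMap<>();
    committed.forEach((tp, meta) ->
        lag.put(tp, endOffsets.get(tp).offset() - meta.offset()));
    return lag;
}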
Cluster Deployment
Docker Compose Cluster Deployment
yaml
version: '3.8'
services:
  zookeeper-1:
    image: confluentinc/cp-zookeeper:7.4.0
    hostname: zookeeper-1
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_SERVERS: zookeeper-1:2888:3888;zookeeper-2:2888:3888;zookeeper-3:2888:3888

  kafka-1:
    image: confluentinc/cp-kafka:7.4.0
    hostname: kafka-1
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-1:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
    depends_on:
      - zookeeper-1
Practical Case Study
E-commerce Order Processing System
java
// Order event definition
public class OrderEvent {
    private String orderId;
    private String customerId;
    private String productId;
    private Integer quantity;
    private BigDecimal amount;
    private OrderStatus status;
    private LocalDateTime timestamp;
    // getters and setters
}

public enum OrderStatus {
    CREATED, PAID, SHIPPED, DELIVERED, CANCELLED
}
java
// Order event producer
@Service
@Slf4j
public class OrderEventProducer {

    @Autowired
    private KafkaTemplate<String, OrderEvent> kafkaTemplate;

    public void publishOrderCreated(OrderEvent orderEvent) {
        orderEvent.setStatus(OrderStatus.CREATED);
        orderEvent.setTimestamp(LocalDateTime.now());
        ProducerRecord<String, OrderEvent> record = new ProducerRecord<>(
            "order-events",
            orderEvent.getOrderId(),
            orderEvent
        );
        // Attach headers
        record.headers().add("event-type", "ORDER_CREATED".getBytes());
        record.headers().add("version", "1.0".getBytes());
        record.headers().add("source", "order-service".getBytes());
        kafkaTemplate.send(record).addCallback(
            result -> log.info("Order-created event sent: {}", orderEvent.getOrderId()),
            failure -> log.error("Order-created event send failed: {}", orderEvent.getOrderId(), failure)
        );
    }
}
java
// Inventory service consumer
@Service
@Slf4j
public class InventoryService {

    @KafkaListener(topics = "order-events", groupId = "inventory-group")
    public void handleOrderCreated(ConsumerRecord<String, OrderEvent> record) {
        OrderEvent orderEvent = record.value();
        try {
            if (orderEvent.getStatus() == OrderStatus.CREATED) {
                // Check stock
                boolean available = checkInventory(orderEvent.getProductId(), orderEvent.getQuantity());
                if (available) {
                    // Reserve stock
                    reserveInventory(orderEvent.getProductId(), orderEvent.getQuantity());
                    log.info("Stock reserved: order={}, product={}, quantity={}",
                        orderEvent.getOrderId(), orderEvent.getProductId(), orderEvent.getQuantity());
                } else {
                    log.warn("Insufficient stock: order={}, product={}, quantity={}",
                        orderEvent.getOrderId(), orderEvent.getProductId(), orderEvent.getQuantity());
                }
            }
        } catch (Exception e) {
            log.error("Failed to handle order-created event: {}", orderEvent.getOrderId(), e);
        }
    }

    private boolean checkInventory(String productId, Integer quantity) {
        return true; // stock check logic goes here
    }

    private void reserveInventory(String productId, Integer quantity) {
        // stock reservation logic goes here
    }
}
Best Practices
1. Topic Design Principles
java
// Topic naming convention
public class TopicNamingConvention {
    // Format: <environment>.<business domain>.<data type>.<version>
    public static final String USER_EVENTS = "prod.user.events.v1";
    public static final String ORDER_EVENTS = "prod.order.events.v1";
    public static final String PAYMENT_EVENTS = "prod.payment.events.v1";
    // Dead letter topics
    public static final String USER_EVENTS_DLQ = "prod.user.events.v1.dlq";
    // Retry topics
    public static final String USER_EVENTS_RETRY = "prod.user.events.v1.retry";
}
2. Error Handling Strategy
java
@Configuration
public class KafkaErrorHandlingConfig {

    @Bean
    public DefaultErrorHandler errorHandler() {
        // Exponential backoff for retries
        ExponentialBackOff backOff = new ExponentialBackOff(1000L, 2.0);
        backOff.setMaxElapsedTime(300000L); // retry for at most 5 minutes
        DefaultErrorHandler errorHandler = new DefaultErrorHandler(
            deadLetterPublishingRecoverer(), backOff);
        // Exceptions that should not be retried
        errorHandler.addNotRetryableExceptions(
            DeserializationException.class,
            MessageConversionException.class
        );
        return errorHandler;
    }

    @Bean
    public DeadLetterPublishingRecoverer deadLetterPublishingRecoverer() {
        return new DeadLetterPublishingRecoverer(kafkaTemplate(),
            (record, exception) -> {
                if (exception instanceof DeserializationException) {
                    return new TopicPartition(record.topic() + ".dlt.deserialization", 0);
                }
                return new TopicPartition(record.topic() + ".dlt", 0);
            });
    }
}
3. Monitoring and Alerting
yaml
# Prometheus alerting rules
groups:
  - name: kafka-alerts
    rules:
      - alert: KafkaConsumerLag
        expr: kafka_consumer_lag_max > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Kafka consumer lag is high"
          description: "Consumer lag is {{ $value }} for {{ $labels.topic }}"
      - alert: KafkaProducerErrorRate
        expr: rate(kafka_producer_record_error_total[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Kafka producer error rate is high"
Summary
Apache Kafka is a powerful distributed streaming platform. After working through this tutorial you should have:
Core concepts: an understanding of Kafka's basic concepts and architecture
Development skills: the ability to build and configure producers and consumers
Reliability guarantees: techniques for delivering and processing messages reliably
Performance tuning: settings to optimize for different workloads
Operations: cluster deployment, monitoring and alerting, and failure handling
Best practices: guidance for using Kafka correctly in production
Further Reading
Kafka Streams: the stream processing framework
Schema Registry: schema management and evolution
Kafka Connect: the data integration framework
KSQL: streaming SQL queries
Performance tuning: OS kernel parameters and JVM tuning in depth
Kafka is a core component of modern data architectures; mastering it is essential for building high-performance, scalable, data-driven systems.