kafka leader traffic 重平衡

To rebalance broker traffic after a leader imbalance in Apache Kafka, you need to redistribute partition leaders more evenly across your brokers. This ensures that no broker becomes a bottleneck due to hosting too many partition leaders (which handle all client requests).


🧭 Why Leader Imbalance Happens

Leader imbalance can occur when:

  • New brokers are added
  • Broker restarts or failures occur
  • Uneven topic partition assignments
  • Manual reassignment was skipped or misconfigured

✅ Solutions to Rebalance Leader Traffic

✅ Option 1: Use Kafka's Built-In Auto Balancer (Kafka 2.4+)

Kafka has an auto leader rebalance feature.

Check if auto rebalance is enabled:

bash 复制代码
kafka-configs.sh --bootstrap-server <broker>:9092 \
  --entity-type brokers \
  --entity-name <broker-id> \
  --all \
  --describe | grep leader

Look for:

text 复制代码
auto.leader.rebalance.enable=true

Example:

bash 复制代码
bash-4.4$ kafka-configs --bootstrap-server localhost:9092 --entity-type brokers --entity-name 1 --all  --describe | grep leader
  auto.leader.rebalance.enable=true sensitive=false synonyms={DEFAULT_CONFIG:auto.leader.rebalance.enable=true}
  leader.imbalance.check.interval.seconds=300 sensitive=false synonyms={DEFAULT_CONFIG:leader.imbalance.check.interval.seconds=300}
  leader.imbalance.per.broker.percentage=10 sensitive=false synonyms={DEFAULT_CONFIG:leader.imbalance.per.broker.percentage=10}
  unclean.leader.election.enable=false sensitive=false synonyms={DEFAULT_CONFIG:unclean.leader.election.enable=false}

This is the default setting, but if it's false, enable it:

bash 复制代码
kafka-configs.sh --bootstrap-server <broker>:9092 \
  --entity-type brokers \
  --entity-name <broker-id> \
  --alter \
  --add-config auto.leader.rebalance.enable=true

Kafka will then periodically rebalance leaders using leader.imbalance.check.interval.seconds.

重平衡过程:

系统会自动将其他broker上的一些leader转移到新broker,

前提是新broker必须已经是该partition的follower。

转移是渐进式的,不会一次性全部迁移。

自动 Leader 重平衡 仅调整 Leader 角色,不会自动调整 Partition 的副本分配(即不会自动把 Follower 迁移到新 Broker)。


✅ Option 2: Trigger Manual Leader Reassignment

Use the built-in script:

bash 复制代码
kafka-preferred-replica-election.sh --bootstrap-server <broker>:9092

This command triggers preferred leader election . Kafka will try to move partition leaders to their preferred replicas (usually the first broker in the replica list).

This is useful if leader imbalance occurred due to broker failures or restarts.


✅ Option 3: Full Partition Reassignment (Advanced)

If imbalance persists, you may need a full partition reassignment to redistribute both partition leaders and replicas.

Steps:

a) Generate current assignment:
bash 复制代码
kafka-reassign-partitions.sh \
  --bootstrap-server <broker>:9092 \
  --generate \
  --topics-to-move-json-file topics.json \
  --broker-list "1,2,3"

topics.json format:

json 复制代码
{
  "topics": [{"topic": "your-topic-name"}],
  "version": 1
}
b) Review the proposed plan, then execute:
bash 复制代码
kafka-reassign-partitions.sh \
  --bootstrap-server <broker>:9092 \
  --execute \
  --reassignment-json-file reassignment.json

✅ Option 4: Use Cruise Control (Recommended for Large Clusters)

LinkedIn's Cruise Control is a powerful tool to automate:

  • Traffic balancing
  • Leader and replica assignment
  • Self-healing after broker failure

📌 Summary

Method Use When Command
Auto Leader Rebalance Minor imbalance auto.leader.rebalance.enable=true
Manual Election Leader drift after broker restart kafka-preferred-replica-election.sh
Full Reassignment Major imbalance or cluster expansion kafka-reassign-partitions.sh
Cruise Control Ongoing traffic management Cruise Control API/UI
相关推荐
JAVA面经实录9178 小时前
MQ高频面试题标准答案(Java后端/架构面试背诵版)
java·面试·架构·kafka·rabbitmq
Curvatureflight9 小时前
接口幂等性设计:如何避免重复提交、重复扣款和消息重复消费?
分布式·后端·架构
Kyrie_Li10 小时前
Kafka-基础知识总结
运维·分布式·kafka
江华森11 小时前
XXL-JOB 分布式任务调度平台深度学习指南
分布式
Devin~Y14 小时前
从Spring Boot到AI Agent:大厂Java微服务面试三轮实战问答解析
java·spring boot·redis·spring cloud·微服务·ai·kafka
m0_7360348514 小时前
ceph分布式存储
分布式·ceph
Tenifs15 小时前
深入对比分析 RabbitMQ、RocketMQ 和 Kafka
后端·kafka·消息队列·rabbitmq·rocketmq·爱编程的阿彬
冷色调的咖啡师15 小时前
1.大数据架构技术 上——搭建分布式Hadoop集群
大数据·linux·hadoop·分布式·hdfs·架构·yarn
Rick199315 小时前
Kafka、RocketMQ、RabbitMQ 三大消息队列
kafka·rabbitmq·rocketmq
坤昱1 天前
cfs调度类深入解刨——最新内核细节分析5
linux·分布式·cfs调度·eevdf调度·linux调度·linux技术·kernel最新版本内容