Choosing the partition count for a Kafka topic

Create a topic with 1 replica and 1 partition:

kafka-topics.sh --create --topic test --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1 

Run a producer throughput test against this topic

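# Producer throughput test tool (options annotated one per line)
kafka-producer-perf-test.sh  # producer perf test
--topic test # topic to write to
--num-records 300000 # send 300000 records in total
--record-size 1000 # 1000 bytes per record, about 1 KB
--throughput 100000 # throttle to at most about 100000 records/sec (-1 disables throttling)
--producer-props bootstrap.servers=localhost:9092 # broker address
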
[root@dream1 ~]# kafka-producer-perf-test.sh --num-records 300000 --record-size 1000 --topic test --throughput 100000 --producer-props bootstrap.servers=localhost:9092
160273 records sent, 32016.2 records/sec (30.53 MB/sec), 660.6 ms avg latency, 1107.0 ms max latency.
126080 records sent, 25100.5 records/sec (23.94 MB/sec), 1261.6 ms avg latency, 1945.0 ms max latency.
300000 records sent, 28097.780275 records/sec (26.80 MB/sec), 940.64 ms avg latency, 1945.00 ms max latency, 911 ms 50th, 1606 ms 95th, 1883 ms 99th, 1938 ms 99.9th.
# Result (MB/sec): 30.53 and 23.94 per interval, 26.80 overall

[root@dream1 ~]# kafka-producer-perf-test.sh --num-records 300000 --record-size 1000 --topic test --throughput 50000 --producer-props bootstrap.servers=localhost:9092  
151761 records sent, 30352.2 records/sec (28.95 MB/sec), 632.4 ms avg latency, 1495.0 ms max latency.
133584 records sent, 26716.8 records/sec (25.48 MB/sec), 1322.6 ms avg latency, 1696.0 ms max latency.
300000 records sent, 29571.217348 records/sec (28.20 MB/sec), 934.69 ms avg latency, 1696.00 ms max latency, 957 ms 50th, 1589 ms 95th, 1650 ms 99th, 1692 ms 99.9th.
# Result (MB/sec): 28.95 and 25.48 per interval, 28.20 overall

[root@dream1 ~]# kafka-producer-perf-test.sh --num-records 300000 --record-size 1000 --topic test --throughput 70000 --producer-props bootstrap.servers=localhost:9092 
167185 records sent, 33437.0 records/sec (31.89 MB/sec), 585.8 ms avg latency, 1241.0 ms max latency.
300000 records sent, 35756.853397 records/sec (34.10 MB/sec), 712.25 ms avg latency, 1318.00 ms max latency, 785 ms 50th, 1202 ms 95th, 1310 ms 99th, 1317 ms 99.9th.
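# Result (MB/sec): 31.89 for the interval, 34.10 overall. Of the four runs, the one throttled at 70000 records/sec measured the highest throughput: 35756.85 records/sec, or 34.10 MB/sec. Note that 70000 * 1 KB/s, about 70 MB/s, is only the requested cap, not the rate actually reached.
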
[root@dream1 ~]# kafka-producer-perf-test.sh --num-records 300000 --record-size 1000 --topic test --throughput 80000 --producer-props bootstrap.servers=localhost:9092 
154609 records sent, 30500.9 records/sec (29.09 MB/sec), 615.7 ms avg latency, 1852.0 ms max latency.
300000 records sent, 35545.023697 records/sec (33.90 MB/sec), 773.73 ms avg latency, 1878.00 ms max latency, 632 ms 50th, 1521 ms 95th, 1869 ms 99th, 1874 ms 99.9th.
# Result (MB/sec): 29.09 for the interval, 33.90 overall
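
Copying the MB/sec figures by hand gets tedious when there are many runs. The following is a minimal sketch, assuming the tool's output has been saved or piped in; `summary_mb_sec` is a hypothetical helper name, not part of Kafka. It keeps only the final summary line of each run (the one carrying the percentile columns) and prints its MB/sec value.

```shell
# Print the overall MB/sec from each kafka-producer-perf-test.sh summary line.
# Interval lines are skipped because they have no "50th" percentile column.
summary_mb_sec() {
  sed -n '/50th/ s/.*(\([0-9.]*\) MB\/sec).*/\1/p' "$@"
}

# The four summary lines from the runs above; the largest value wins.
summary_mb_sec <<'EOF' | sort -n | tail -n 1
300000 records sent, 28097.780275 records/sec (26.80 MB/sec), 940.64 ms avg latency, 1945.00 ms max latency, 911 ms 50th, 1606 ms 95th, 1883 ms 99th, 1938 ms 99.9th.
300000 records sent, 29571.217348 records/sec (28.20 MB/sec), 934.69 ms avg latency, 1696.00 ms max latency, 957 ms 50th, 1589 ms 95th, 1650 ms 99th, 1692 ms 99.9th.
300000 records sent, 35756.853397 records/sec (34.10 MB/sec), 712.25 ms avg latency, 1318.00 ms max latency, 785 ms 50th, 1202 ms 95th, 1310 ms 99th, 1317 ms 99.9th.
300000 records sent, 35545.023697 records/sec (33.90 MB/sec), 773.73 ms avg latency, 1878.00 ms max latency, 632 ms 50th, 1521 ms 95th, 1869 ms 99th, 1874 ms 99.9th.
EOF
# prints 34.10
```

The helper also accepts file names, so `summary_mb_sec run1.log run2.log` would work on saved logs (the file names here are placeholders).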

Run a consumer throughput test against this topic

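# Consumer throughput test tool (options annotated one per line)
kafka-consumer-perf-test.sh # consumer perf test
--topic test # topic to read from
--broker-list localhost:9092 # broker address
--messages 300000 # consume 300000 messages in total
--threads 6 # 6 consumer threads (deprecated: the tool ignores it, see the WARNING in the output)
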
[root@dream1 ~]# kafka-consumer-perf-test.sh --topic test --broker-list localhost:9092 --messages 300000 --threads 1
WARNING: option [threads] and [num-fetch-threads] have been deprecated and will be ignored by the test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2024-01-11 09:56:52:402, 2024-01-11 09:56:55:365, 286.1099, 96.5609, 300008, 101251.4344, 817, 2146, 133.3224, 139798.6952
[root@dream1 ~]# kafka-consumer-perf-test.sh --topic test --broker-list localhost:9092 --messages 300000 --threads 3
WARNING: option [threads] and [num-fetch-threads] have been deprecated and will be ignored by the test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2024-01-11 09:57:00:232, 2024-01-11 09:57:03:485, 286.1099, 100.9526, 300008, 92225.0231, 626, 2627, 108.9113, 114201.7510
[root@dream1 ~]# kafka-consumer-perf-test.sh --topic test --broker-list localhost:9092 --messages 300000 --threads 10
WARNING: option [threads] and [num-fetch-threads] have been deprecated and will be ignored by the test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2024-01-11 09:57:39:562, 2024-01-11 09:57:42:293, 286.1099, 104.7638, 300008, 109852.8012, 619, 2112, 135.4687, 142049.2424
[root@dream1 ~]# kafka-consumer-perf-test.sh --topic test --broker-list localhost:9092 --messages 300000 --threads 15
WARNING: option [threads] and [num-fetch-threads] have been deprecated and will be ignored by the test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
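2024-01-11 09:57:52:771, 2024-01-11 09:57:55:798, 286.1099, 94.5193, 300008, 99110.6706, 579, 2448, 116.8750, 122552.287

# For the consumer test, repeat the run and average the MB.sec column. The --threads option is ignored (see the WARNING), so the differences between these runs are run-to-run variation. My result is about 100 MB/s (the four runs average 99.20).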
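
The averaging step can be done with awk instead of by hand. This is a sketch under the assumption that the output format matches the runs above, where MB.sec is the fourth comma-separated column and data rows are the only lines starting with a digit; `avg_mb_sec` is a hypothetical helper name.

```shell
# Average the MB.sec column (4th comma-separated field) of consumer perf output.
# Prompt, WARNING, and header lines are skipped: only data rows start with a digit.
avg_mb_sec() {
  awk -F', *' '/^[0-9]/ { sum += $4; n++ }
               END { if (n) printf "%.2f\n", sum / n }' "$@"
}

# The four result rows from the runs above.
avg_mb_sec <<'EOF'
2024-01-11 09:56:52:402, 2024-01-11 09:56:55:365, 286.1099, 96.5609, 300008, 101251.4344, 817, 2146, 133.3224, 139798.6952
2024-01-11 09:57:00:232, 2024-01-11 09:57:03:485, 286.1099, 100.9526, 300008, 92225.0231, 626, 2627, 108.9113, 114201.7510
2024-01-11 09:57:39:562, 2024-01-11 09:57:42:293, 286.1099, 104.7638, 300008, 109852.8012, 619, 2112, 135.4687, 142049.2424
2024-01-11 09:57:52:771, 2024-01-11 09:57:55:798, 286.1099, 94.5193, 300008, 99110.6706, 579, 2448, 116.8750, 122552.287
EOF
# prints 99.20
```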

Calculating the partition count

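  • Let Tp be the producer throughput and Tc the consumer throughput measured on a single partition, and let Tt be the total target throughput. Then partitions = Tt / min(Tp, Tc), rounded up.
  • For example: producer throughput = 70 MB/s, consumer throughput = 100 MB/s, target throughput = 300 MB/s.
  • Partitions = 300 / 70 ≈ 4.3, so 5 partitions (4 would cover only 280 MB/s). With the 34.10 MB/sec actually measured in the producer test, the same formula gives 300 / 34.10 ≈ 8.8, so 9 partitions.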
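
The formula is easy to script so it can be rerun whenever the measurements change. A minimal sketch follows; `partition_count` is a hypothetical helper name, and rounding up is an assumption made here so that the target throughput is actually met.

```shell
# partitions = ceil(Tt / min(Tp, Tc)); all three arguments in MB/s.
# Usage: partition_count Tt Tp Tc
partition_count() {
  awk -v tt="$1" -v tp="$2" -v tc="$3" 'BEGIN {
    per = (tp < tc) ? tp : tc   # throughput a single partition can sustain
    n = int(tt / per)           # whole partitions that fit
    if (n * per < tt) n++       # round up when the target is not yet covered
    print n
  }'
}

partition_count 300 70 100      # the example figures from the text; prints 5
```

Taking the minimum of Tp and Tc reflects that the slower side of the pipeline limits what one partition can carry.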