Choosing the number of partitions for a Kafka topic

Create a topic with 1 replica and 1 partition

kafka-topics.sh --create --topic test --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1 

Run a producer throughput test against this topic

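# Producer throughput test tool: kafka-producer-perf-test.sh
#   --topic test            topic to write to
#   --num-records 300000    send 300,000 records in total
#   --record-size 1000      each record is 1000 bytes (about 1 KB)
#   --throughput 100000     throttle to at most about 100,000 records/sec (-1 disables throttling)
#   --producer-props bootstrap.servers=localhost:9092    producer config, here the broker address
kafka-producer-perf-test.sh \
  --topic test \
  --num-records 300000 \
  --record-size 1000 \
  --throughput 100000 \
  --producer-props bootstrap.servers=localhost:9092

[root@dream1 ~]# kafka-producer-perf-test.sh --num-records 300000 --record-size 1000 --topic test --throughput 100000 --producer-props bootstrap.servers=localhost:9092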
160273 records sent, 32016.2 records/sec (30.53 MB/sec), 660.6 ms avg latency, 1107.0 ms max latency.
126080 records sent, 25100.5 records/sec (23.94 MB/sec), 1261.6 ms avg latency, 1945.0 ms max latency.
300000 records sent, 28097.780275 records/sec (26.80 MB/sec), 940.64 ms avg latency, 1945.00 ms max latency, 911 ms 50th, 1606 ms 95th, 1883 ms 99th, 1938 ms 99.9th.
# Results (MB/sec): 30.53, 23.94, 26.80

[root@dream1 ~]# kafka-producer-perf-test.sh --num-records 300000 --record-size 1000 --topic test --throughput 50000 --producer-props bootstrap.servers=localhost:9092
151761 records sent, 30352.2 records/sec (28.95 MB/sec), 632.4 ms avg latency, 1495.0 ms max latency.
133584 records sent, 26716.8 records/sec (25.48 MB/sec), 1322.6 ms avg latency, 1696.0 ms max latency.
300000 records sent, 29571.217348 records/sec (28.20 MB/sec), 934.69 ms avg latency, 1696.00 ms max latency, 957 ms 50th, 1589 ms 95th, 1650 ms 99th, 1692 ms 99.9th.
# Results (MB/sec): 28.95, 25.48, 28.20

[root@dream1 ~]# kafka-producer-perf-test.sh --num-records 300000 --record-size 1000 --topic test --throughput 70000 --producer-props bootstrap.servers=localhost:9092
167185 records sent, 33437.0 records/sec (31.89 MB/sec), 585.8 ms avg latency, 1241.0 ms max latency.
300000 records sent, 35756.853397 records/sec (34.10 MB/sec), 712.25 ms avg latency, 1318.00 ms max latency, 785 ms 50th, 1202 ms 95th, 1310 ms 99th, 1317 ms 99.9th.
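# Results (MB/sec): 31.89, 34.10
# Comparing the runs, --throughput 70000 performed best. The rate actually measured is about 34 MB/sec;
# the 70 MB/s used later (70000 records/s × 1 KB) is the throttle ceiling for this setting, not a measured value.

[root@dream1 ~]# kafka-producer-perf-test.sh --num-records 300000 --record-size 1000 --topic test --throughput 80000 --producer-props bootstrap.servers=localhost:9092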
154609 records sent, 30500.9 records/sec (29.09 MB/sec), 615.7 ms avg latency, 1852.0 ms max latency.
300000 records sent, 35545.023697 records/sec (33.90 MB/sec), 773.73 ms avg latency, 1878.00 ms max latency, 632 ms 50th, 1521 ms 95th, 1869 ms 99th, 1874 ms 99.9th.
# Results (MB/sec): 29.09, 33.90
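The four runs can also be compared programmatically instead of by eye. The following is a minimal sketch, assuming the summary-line format shown in the output above; `parse_summary` and `runs` are hypothetical names for illustration and are not part of Kafka.

```python
import re

# Matches the final summary line printed by kafka-producer-perf-test.sh,
# assuming the format seen in the runs above.
SUMMARY_RE = re.compile(
    r"(?P<records>\d+) records sent, "
    r"(?P<rps>[\d.]+) records/sec \((?P<mbps>[\d.]+) MB/sec\), "
    r"(?P<avg_ms>[\d.]+) ms avg latency"
)

def parse_summary(line: str) -> dict:
    """Extract record count, records/sec, MB/sec and average latency from one line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not a producer perf-test result line")
    return {
        "records": int(m["records"]),
        "records_per_sec": float(m["rps"]),
        "mb_per_sec": float(m["mbps"]),
        "avg_latency_ms": float(m["avg_ms"]),
    }

# Final summary lines copied from the four runs above, keyed by --throughput.
runs = {
    100000: "300000 records sent, 28097.780275 records/sec (26.80 MB/sec), 940.64 ms avg latency, 1945.00 ms max latency, 911 ms 50th, 1606 ms 95th, 1883 ms 99th, 1938 ms 99.9th.",
    50000: "300000 records sent, 29571.217348 records/sec (28.20 MB/sec), 934.69 ms avg latency, 1696.00 ms max latency, 957 ms 50th, 1589 ms 95th, 1650 ms 99th, 1692 ms 99.9th.",
    70000: "300000 records sent, 35756.853397 records/sec (34.10 MB/sec), 712.25 ms avg latency, 1318.00 ms max latency, 785 ms 50th, 1202 ms 95th, 1310 ms 99th, 1317 ms 99.9th.",
    80000: "300000 records sent, 35545.023697 records/sec (33.90 MB/sec), 773.73 ms avg latency, 1878.00 ms max latency, 632 ms 50th, 1521 ms 95th, 1869 ms 99th, 1874 ms 99.9th.",
}

# Pick the --throughput setting with the highest measured MB/sec.
best = max(runs, key=lambda t: parse_summary(runs[t])["mb_per_sec"])
print(best, parse_summary(runs[best])["mb_per_sec"])
```

For these runs the best setting is --throughput 70000 at 34.10 MB/sec, with 80000 close behind at 33.90 MB/sec, so the difference between the two may be within run-to-run noise.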

Run a consumer throughput test against this topic

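# Consumer throughput test tool: kafka-consumer-perf-test.sh
#   --topic test                   topic to read from
#   --broker-list localhost:9092   broker address (newer Kafka releases deprecate this in favor of --bootstrap-server)
#   --messages 300000              consume 300,000 messages in total
#   --threads 6                    requested consumer threads (deprecated; the tool ignores it, see the warning in the output)
kafka-consumer-perf-test.sh \
  --topic test \
  --broker-list localhost:9092 \
  --messages 300000 \
  --threads 6

[root@dream1 ~]# kafka-consumer-perf-test.sh --topic test --broker-list localhost:9092 --messages 300000 --threads 1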
WARNING: option [threads] and [num-fetch-threads] have been deprecated and will be ignored by the test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2024-01-11 09:56:52:402, 2024-01-11 09:56:55:365, 286.1099, 96.5609, 300008, 101251.4344, 817, 2146, 133.3224, 139798.6952
[root@dream1 ~]# kafka-consumer-perf-test.sh --topic test --broker-list localhost:9092 --messages 300000 --threads 3
WARNING: option [threads] and [num-fetch-threads] have been deprecated and will be ignored by the test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2024-01-11 09:57:00:232, 2024-01-11 09:57:03:485, 286.1099, 100.9526, 300008, 92225.0231, 626, 2627, 108.9113, 114201.7510
[root@dream1 ~]# kafka-consumer-perf-test.sh --topic test --broker-list localhost:9092 --messages 300000 --threads 10
WARNING: option [threads] and [num-fetch-threads] have been deprecated and will be ignored by the test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2024-01-11 09:57:39:562, 2024-01-11 09:57:42:293, 286.1099, 104.7638, 300008, 109852.8012, 619, 2112, 135.4687, 142049.2424
[root@dream1 ~]# kafka-consumer-perf-test.sh --topic test --broker-list localhost:9092 --messages 300000 --threads 15
WARNING: option [threads] and [num-fetch-threads] have been deprecated and will be ignored by the test
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
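2024-01-11 09:57:52:771, 2024-01-11 09:57:55:798, 286.1099, 94.5193, 300008, 99110.6706, 579, 2448, 116.8750, 122552.287
# For the consumer test, repeat the run and average the MB.sec column; my result is about 100 MB/s.
# Because --threads is ignored (see the warning), the four runs above are effectively repeated measurements of the same configuration.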
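The averaging step can be sketched as follows. This is only an illustration, assuming the comma-separated output format shown above; the helper name `column` is hypothetical and the rows are copied from the four consumer runs.

```python
# Header and data rows copied from the consumer runs above.
HEADER = ("start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, "
          "nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec")
ROWS = [
    "2024-01-11 09:56:52:402, 2024-01-11 09:56:55:365, 286.1099, 96.5609, 300008, 101251.4344, 817, 2146, 133.3224, 139798.6952",
    "2024-01-11 09:57:00:232, 2024-01-11 09:57:03:485, 286.1099, 100.9526, 300008, 92225.0231, 626, 2627, 108.9113, 114201.7510",
    "2024-01-11 09:57:39:562, 2024-01-11 09:57:42:293, 286.1099, 104.7638, 300008, 109852.8012, 619, 2112, 135.4687, 142049.2424",
    "2024-01-11 09:57:52:771, 2024-01-11 09:57:55:798, 286.1099, 94.5193, 300008, 99110.6706, 579, 2448, 116.8750, 122552.287",
]

def column(header: str, rows: list, name: str) -> list:
    """Return one named column of the perf-test output as floats."""
    names = [h.strip() for h in header.split(",")]
    idx = names.index(name)
    return [float(row.split(",")[idx]) for row in rows]

mb_sec = column(HEADER, ROWS, "MB.sec")
average = sum(mb_sec) / len(mb_sec)
print(f"{average:.2f} MB/s")
```

The four MB.sec values average to roughly 99.2, which matches the "about 100 MB/s" figure used for the consumer.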

Calculate the number of partitions

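  • Let Tt be the total target throughput, Tp the producer throughput of a single partition, and Tc the consumer throughput of a single partition. Then: number of partitions = Tt / min(Tp, Tc)
  • Example: producer throughput = 70 MB/s, consumer throughput = 100 MB/s, target throughput = 300 MB/s
  • Number of partitions = 300 / min(70, 100) = 300 / 70 ≈ 4.3, so 4 or 5 partitions (5 if the target must be fully met)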
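The formula can be written as a small helper. This is a sketch under one added assumption: the result is rounded up so that the target throughput is fully covered, whereas the text above leaves the choice between 4 and 5 open. The function name `partition_count` is hypothetical.

```python
import math

def partition_count(target: float, producer: float, consumer: float) -> int:
    """Partitions = Tt / min(Tp, Tc), rounded up (rounding up is an assumption).

    All three throughputs must use the same unit, for example MB/s.
    """
    if min(target, producer, consumer) <= 0:
        raise ValueError("throughputs must be positive")
    # The slower of producer and consumer limits what one partition can carry.
    per_partition = min(producer, consumer)
    return math.ceil(target / per_partition)

# Example from the text: Tp = 70, Tc = 100, Tt = 300 (all in MB/s).
print(partition_count(300, 70, 100))
```

With the figures from the text this gives 5. Substituting the 34.10 MB/sec actually measured in the producer runs for Tp gives 9, so the result is sensitive to which producer figure is used.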