Integrating Kafka with Flume

### 1. Flume as a producer for Kafka

Kafka serves as Flume's sink, receiving the data: Flume writes events to the topic as a Kafka producer.

#### 1.1 Flume configuration file

vim $kafka/jobs/flume-kafka.conf

```bash
# agent
a1.sources = r1
a1.sinks = k1
# Only c1 is defined and bound below, so it is the only channel declared here
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = TAILDIR
# File that records the last read position of each monitored file; the location can be left as is
a1.sources.r1.positionFile = /export/server/flume/job/data/tail_dir.json
a1.sources.r1.filegroups = f1 f2
a1.sources.r1.filegroups.f1 = /export/server/flume/job/data/.*file.*
a1.sources.r1.filegroups.f2 = /export/server/flume/job/data/.*log.*

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Must be the same topic that the console consumer in 1.3 reads from
a1.sinks.k1.kafka.topic = consumers
a1.sinks.k1.kafka.bootstrap.servers = node1:9092,node2:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```
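The `filegroups` entries in the configuration above use a regular expression for the file name part only; the directory part is a literal path. The following is a minimal sketch for checking which group a given file name would fall into before starting the agent. The function name `filegroups_for` is hypothetical, the two patterns are copied from the configuration, and the whole-name anchoring (`^...$`) reflects my understanding of how TAILDIR matches names, so treat it as an assumption.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the file groups (f1, f2) whose pattern matches
# the given file name, or "none" if neither pattern matches.
filegroups_for() {
  local name="$1" groups=""
  # Patterns taken from the configuration; anchored to the whole file name
  # (assumption about TAILDIR's matching behaviour).
  local f1_re='^.*file.*$'
  local f2_re='^.*log.*$'

  if [[ "$name" =~ $f1_re ]]; then groups+="f1 "; fi
  if [[ "$name" =~ $f2_re ]]; then groups+="f2 "; fi

  groups="${groups% }"          # drop the trailing space
  echo "${groups:-none}"
}

# Example calls with illustrative file names
for f in file2.txt app.log logfile.txt notes.txt; do
  echo "$f -> $(filegroups_for "$f")"
done
```

A name such as `logfile.txt` matches both patterns, which is worth knowing when the two groups are meant to be disjoint.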

#### 1.2 Start the Flume agent

flume-ng agent -n a1 -c conf/ -f /export/server/kafka/jobs/flume-kafka.conf

#### 1.3 Start a Kafka consumer

kafka-console-consumer.sh --bootstrap-server node1:9092,node2:9092 --topic consumers --from-beginning

#### 1.4 Produce data

Append data to one of the monitored files:

[ljr@node1 data]$ echo hello >> file2.txt

[ljr@node1 data]$ echo ============== >> file2.txt

Check the Kafka consumer:

![](https://img-blog.csdnimg.cn/direct/754950995683442593335d088d65a239.png)

The lines appear in the consumer, so Flume is working as a Kafka producer.

### 2. Flume as a consumer of Kafka

Kafka serves as Flume's source, supplying the data: Flume reads events from the topic as a Kafka consumer.

#### 2.1 Flume configuration file

vim $kafka/jobs/kafka-flume1.conf

```bash
# agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
# Must not exceed the channel's transactionCapacity (100)
a1.sources.r1.batchSize = 50
a1.sources.r1.batchDurationMillis = 200
a1.sources.r1.kafka.bootstrap.servers = node1:9092,node2:9092
a1.sources.r1.kafka.topics = consumers
a1.sources.r1.kafka.consumer.group.id = custom.g.id

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
# Must not be smaller than the source's batchSize (50)
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

#### 2.2 Start the Flume agent

flume-ng agent -n a1 -c conf/ -f /export/server/kafka/jobs/kafka-flume1.conf

#### 2.3 Start a Kafka producer and produce data

kafka-console-producer.sh --bootstrap-server node1:9092,node2:9092 --topic consumers

![](https://img-blog.csdnimg.cn/direct/f4b9a37f761b411882d53fe141864136.png)

Check the Flume console:

![](https://img-blog.csdnimg.cn/direct/16b931bcd36445eb885dde1a8068d1f0.png)

The messages appear in the Flume log output, so Flume is working as a Kafka consumer.
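The configuration in 2.1 notes that the source's `batchSize` must not exceed the channel's `transactionCapacity`. The following is a minimal sketch of a pre-flight check for that constraint. The helper names `get_prop` and `check_batch_fits` are hypothetical, and the property keys assume the agent, source, and channel are named `a1`, `r1`, and `c1` as in this post.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the value of one "key = value" entry
# from a Flume properties file, ignoring comment lines.
get_prop() {
  local file="$1" key="$2"
  awk -v k="$key" -F'=' '
    /^[[:space:]]*#/ { next }                      # skip comments
    {
      name = $1
      gsub(/^[[:space:]]+|[[:space:]]+$/, "", name)
      if (name == k) {
        val = substr($0, index($0, "=") + 1)       # everything after the first "="
        gsub(/^[[:space:]]+|[[:space:]]+$/, "", val)
        print val
        exit
      }
    }' "$file"
}

# Hypothetical helper: print "ok" when batchSize <= transactionCapacity,
# "too large" when it is not, and "missing" when either key is absent.
check_batch_fits() {
  local file="$1" batch cap
  batch="$(get_prop "$file" a1.sources.r1.batchSize)"
  cap="$(get_prop "$file" a1.channels.c1.transactionCapacity)"

  if [ -z "$batch" ] || [ -z "$cap" ]; then
    echo "missing"
    return 2
  fi
  if [ "$batch" -le "$cap" ]; then
    echo "ok"
  else
    echo "too large"
    return 1
  fi
}
```

It could be run as `check_batch_fits /export/server/kafka/jobs/kafka-flume1.conf` before starting the agent in 2.2.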
