Integrating Kafka with Flume

### 1. Flume as a producer for Kafka

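Kafka acts as Flume's sink: Flume writes events into a Kafka topic, so from Flume's point of view Kafka plays the consumer role.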

#### 1.1 Flume configuration file

vim $kafka/jobs/kafka-flume.conf

```bash
# agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = TAILDIR
# File that records the last read position of each monitored file; this location can be left as is
a1.sources.r1.positionFile = /export/server/flume/job/data/tail_dir.json
a1.sources.r1.filegroups = f1 f2
a1.sources.r1.filegroups.f1 = /export/server/flume/job/data/.*file.*
a1.sources.r1.filegroups.f2 = /export/server/flume/job/data/.*log.*

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = consumers
a1.sinks.k1.kafka.bootstrap.servers = node1:9092,node2:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```
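To see which files the two `filegroups` patterns in the configuration above would pick up, the matching can be approximated in Python. This is a minimal sketch, not Flume's own code. It assumes that the directory part of each pattern is compared literally and that the last path component is a regex that must match the whole file name. The helper name `matching_groups` and the file names in `candidates` are made up for illustration.

```python
import re
from pathlib import PurePosixPath

# File group patterns copied from the configuration above.
FILEGROUPS = {
    "f1": "/export/server/flume/job/data/.*file.*",
    "f2": "/export/server/flume/job/data/.*log.*",
}


def matching_groups(path: str) -> list:
    """Return the names of the file groups whose pattern matches `path`.

    Assumption: the directory is compared literally, and the last path
    component of the pattern is a regex that must match the full file name.
    """
    target = PurePosixPath(path)
    matched = []
    for name, pattern in FILEGROUPS.items():
        spec = PurePosixPath(pattern)
        same_dir = spec.parent == target.parent
        if same_dir and re.fullmatch(spec.name, target.name):
            matched.append(name)
    return matched


# Hypothetical file names, used only to illustrate the matching.
candidates = [
    "/export/server/flume/job/data/file2.txt",
    "/export/server/flume/job/data/app.log",
    "/export/server/flume/job/data/tail_dir.json",
    "/export/server/flume/job/other/file2.txt",
]
for path in candidates:
    print(path, "->", matching_groups(path) or "not monitored")
```

Under these assumptions the position file `tail_dir.json` matches neither pattern, so keeping it in the monitored directory does not cause it to be tailed.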

#### 1.2 Start the Flume agent

flume-ng agent -n a1 -c conf/ -f /export/server/kafka/jobs/kafka-flume.conf

#### 1.3 Start a Kafka consumer

kafka-console-consumer.sh --bootstrap-server node1:9092,node2:9092 --topic consumers --from-beginning

#### 1.4 Produce data

Append data to one of the monitored files:

[ljr@node1 data]$ echo hello >> file2.txt

[ljr@node1 data]$ echo ============== >> file2.txt

Check the Kafka consumer:

![](https://img-blog.csdnimg.cn/direct/754950995683442593335d088d65a239.png)

The appended lines show up in the consumer, so Flume is successfully producing to Kafka.

### 2. Flume as a consumer of Kafka

Kafka acts as Flume's source: Flume reads events from a Kafka topic, so from Flume's point of view Kafka plays the producer role.

#### 2.1 Flume configuration file

vim $kafka/jobs/kafka-flume1.conf

```bash
# agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
# Note: must not exceed the channel's transactionCapacity (100)
a1.sources.r1.batchSize = 50
a1.sources.r1.batchDurationMillis = 200
a1.sources.r1.kafka.bootstrap.servers = node1:9092,node2:9092
a1.sources.r1.kafka.topics = consumers
a1.sources.r1.kafka.consumer.group.id = custom.g.id

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
# Note: transactionCapacity must not be smaller than the source's batchSize (50)
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

#### 2.2 Start the Flume agent

flume-ng agent -n a1 -c conf/ -f /export/server/kafka/jobs/kafka-flume1.conf

#### 2.3 Start a Kafka producer and produce data

kafka-console-producer.sh --bootstrap-server node1:9092,node2:9092 --topic consumers

![](https://img-blog.csdnimg.cn/direct/f4b9a37f761b411882d53fe141864136.png)

Check the Flume console:

![](https://img-blog.csdnimg.cn/direct/16b931bcd36445eb885dde1a8068d1f0.png)

The produced messages are logged by the Flume agent, so Flume is successfully consuming from Kafka.
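The two notes in the section 2.1 configuration describe a single constraint: the Kafka source's `batchSize` must not exceed the memory channel's `transactionCapacity`. A small script can check this before the agent is started. The sketch below is illustrative only. `parse_properties` and `check_batch_fits_channel` are hypothetical helper names, not part of Flume. The parser handles only the simple `key = value` lines used in this post, and Flume's built-in defaults for unset properties are not modelled.

```python
def parse_properties(text: str) -> dict:
    """Parse simple `key = value` lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_batch_fits_channel(props: dict, agent: str, source: str) -> list:
    """Return a list of problems; an empty list means the check passed.

    Only explicitly configured values are compared (assumption: defaults
    are out of scope for this sketch).
    """
    problems = []
    batch_key = f"{agent}.sources.{source}.batchSize"
    if batch_key not in props:
        return problems
    batch = int(props[batch_key])
    channels = props.get(f"{agent}.sources.{source}.channels", "").split()
    for channel in channels:
        cap_key = f"{agent}.channels.{channel}.transactionCapacity"
        if cap_key in props and batch > int(props[cap_key]):
            problems.append(
                f"{batch_key} = {batch} exceeds {cap_key} = {props[cap_key]}"
            )
    return problems


# The relevant lines from the section 2.1 configuration.
CONFIG = """
a1.sources.r1.batchSize = 50
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
# transactionCapacity must not be smaller than the source's batchSize
a1.channels.c1.transactionCapacity = 100
"""

for problem in check_batch_fits_channel(parse_properties(CONFIG), "a1", "r1"):
    print("WARNING:", problem)
```

With the values from this post (50 and 100) the script prints nothing. Raising `batchSize` above 100 in `CONFIG` would produce one warning line.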
