Data Collection Tools: Flume

This article walks through collecting data into DataHub with Flume.

1. Download

Flume 1.11.0 is available from the Apache distribution directory: Index of /dist/flume/1.11.0

DataHub plugin download:

https://aliyun-datahub.oss-cn-hangzhou.aliyuncs.com/tools/aliyun-flume-datahub-sink-2.0.9.tar.gz

2. Install

$ tar zxvf aliyun-flume-datahub-sink-x.x.x.tar.gz
$ cd aliyun-flume-datahub-sink-x.x.x
$ mkdir ${FLUME_HOME}/plugins.d
$ mv aliyun-flume-datahub-sink ${FLUME_HOME}/plugins.d
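After the move, it may be worth confirming that the plugin sits where Flume will pick it up. The startup log later in this article shows Flume adding plugins.d/aliyun-flume-datahub-sink/lib and plugins.d/aliyun-flume-datahub-sink/libext to the classpath, so a minimal sketch of a check (the function name check_plugin_layout is made up for illustration) could look like this:

```shell
# Check that the DataHub sink landed where Flume looks for plugins.
# check_plugin_layout is a hypothetical helper, not part of Flume.
# Usage: check_plugin_layout /path/to/flume
check_plugin_layout() {
    flume_home="$1"
    plugin_dir="${flume_home}/plugins.d/aliyun-flume-datahub-sink"
    # Both subdirectories appear on the classpath in the startup log.
    for sub in lib libext; do
        if [ ! -d "${plugin_dir}/${sub}" ]; then
            echo "missing: ${plugin_dir}/${sub}"
            return 1
        fi
    done
    echo "plugin layout ok: ${plugin_dir}"
}
```

Calling check_plugin_layout "${FLUME_HOME}" should print the plugin path when both directories exist, and name the first missing directory otherwise.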

3. Write the configuration file

# A single-node Flume configuration for DataHub
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /soft/data/test.csv
# Describe the sink
a1.sinks.k1.type = com.aliyun.datahub.flume.sink.DatahubSink
a1.sinks.k1.datahub.accessId = 2Z8tAOpDPBm5LEkA
a1.sinks.k1.datahub.accessKey = Tlupsw2G0PdKGCRyPLucHjeESqoCla
a1.sinks.k1.datahub.endPoint = https://datahub.cn-beijing-tbdg-d01.dh.res.bigdata.tbea.com
a1.sinks.k1.datahub.project = bigdata
a1.sinks.k1.datahub.topic = txt_flume
a1.sinks.k1.serializer = DELIMITED
a1.sinks.k1.serializer.delimiter = ,
a1.sinks.k1.serializer.fieldnames = id,name,gender,salary,my_time,decimal
a1.sinks.k1.serializer.charset = UTF-8
a1.sinks.k1.datahub.retryTimes = 5
a1.sinks.k1.datahub.retryInterval = 5
a1.sinks.k1.datahub.batchSize = 100
a1.sinks.k1.datahub.batchTimeout = 5
a1.sinks.k1.datahub.enablePb = true
a1.sinks.k1.datahub.compressType = DEFLATE
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
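
The DELIMITED serializer above splits each tailed line on "," and maps the pieces to the six names in serializer.fieldnames, so each row in test.csv is expected to carry six fields. As a rough sketch (append_row and count_fields are hypothetical helpers, and how the sink treats malformed rows is not covered here), test rows can be checked for the right field count before they are appended:

```shell
# Field list copied from a1.sinks.k1.serializer.fieldnames above.
FIELDNAMES="id,name,gender,salary,my_time,decimal"

# Print the number of comma-separated fields in a string.
count_fields() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Append a row to a file only if its field count matches FIELDNAMES.
# Usage: append_row FILE ROW
append_row() {
    file="$1"
    row="$2"
    expected=$(count_fields "$FIELDNAMES")
    actual=$(count_fields "$row")
    if [ "$actual" -ne "$expected" ]; then
        echo "rejected: expected ${expected} fields, got ${actual}" >&2
        return 1
    fi
    printf '%s\n' "$row" >> "$file"
}
```

For example, append_row /soft/data/test.csv "1,a,b,2,3,4" would append the row, while a three-field row would be rejected. The sample values are invented; real values have to match the column types of the DataHub topic.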

4. Start

flume-ng agent -n a1 -c conf -f ./conf/flume-txt2datahub.conf -Dflume.root.logger=INFO,console

Q: Startup fails with an error

[root@hadoop2 apache-flume-1.11.0-bin]# flume-ng agent -n a1 -c conf -f ./conf/flume-txt2datahub.conf -Dflume.root.logger=INFO,console
Info: Including Hive libraries found via () for Hive access
+ exec /soft/jdk1.8.0_421/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/soft/apache-flume-1.11.0-bin/conf:/soft/apache-flume-1.11.0-bin/lib/*:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/lib/*:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/libext/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application -n a1 -f ./conf/flume-txt2datahub.conf
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-flume-1.11.0-bin/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-flume-1.11.0-bin/plugins.d/aliyun-flume-datahub-sink/libext/slf4j-log4j12-1.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkNotNull(Ljava/lang/Object;Ljava/lang/String;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
        at com.aliyun.datahub.flume.sink.DatahubSink.configure(DatahubSink.java:59)
        at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
        at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:456)
        at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:109)
        at org.apache.flume.node.Application.main(Application.java:491)

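A: Delete the guava jar from Flume's lib directory, then restart. The NoSuchMethodError points to a Guava version conflict: Flume's lib directory comes before the plugin directories on the classpath, and the guava jar it contains apparently lacks the Preconditions.checkNotNull overload that DatahubSink calls.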
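
Instead of deleting the jar outright, it can be moved aside so the change is easy to undo. The following is a minimal sketch; the function name sideline_guava and the lib.bak backup directory are assumptions made for illustration, and the glob guava-*.jar assumes the jar follows the usual guava-<version>.jar naming:

```shell
# Move guava jars out of a Flume lib directory into a backup directory,
# so the change can be undone. Prints each jar that was moved.
# sideline_guava is a hypothetical helper, not part of Flume.
# Usage: sideline_guava /path/to/flume
sideline_guava() {
    flume_home="$1"
    backup_dir="${flume_home}/lib.bak"   # hypothetical backup location
    mkdir -p "$backup_dir"
    for jar in "${flume_home}"/lib/guava-*.jar; do
        [ -e "$jar" ] || continue        # glob matched nothing
        mv "$jar" "$backup_dir/"
        echo "moved $(basename "$jar")"
    done
}
```

After running sideline_guava "${FLUME_HOME}", restart the agent with the same flume-ng command as before. The lib.bak directory is not on the classpath shown in the startup log, so the jars parked there should not be loaded.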
