Apache Flume (5): Multi-Agent Topologies

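Multiple Flume agents can be chained together, with the sink of one agent sending data to the source of another. The standard way to move events between agents over the network is Avro RPC: an Avro sink on the sending agent paired with an Avro source on the receiving agent.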

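Consolidation: logs are collected from multiple web servers, sent to one or more central collector agents, and then forwarded to the central log store: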
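A minimal sketch of this consolidation layout follows. The agent names `web1` and `collector`, the host `collector.example.com`, port 4545, and the access-log path are all assumptions made for illustration, and the logger sink merely stands in for whatever storage sink the log store actually needs.

```properties
# Hypothetical edge agent "web1"; every web server runs one like it
web1.sources = r1
web1.channels = c1
web1.sinks = k1

# Tail the web server's access log (the path is an assumption)
web1.sources.r1.type = exec
web1.sources.r1.command = tail -F /var/log/httpd/access_log
web1.sources.r1.channels = c1

web1.channels.c1.type = memory

# Forward every event to the collector's Avro source
web1.sinks.k1.type = avro
web1.sinks.k1.hostname = collector.example.com
web1.sinks.k1.port = 4545
web1.sinks.k1.channel = c1

# Hypothetical collector agent "collector", running on the collector host
collector.sources = r1
collector.channels = c1
collector.sinks = k1

# One Avro source accepts events from all edge agents
collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 4545
collector.sources.r1.channels = c1

collector.channels.c1.type = memory

# Stand-in for the real storage sink
collector.sinks.k1.type = logger
collector.sinks.k1.channel = c1
```

Each edge agent only needs to know the collector's address, so adding another web server means deploying one more copy of the `web1` configuration without touching the collector.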

Fan-out: the same logs are sent to several different destinations:
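Fan-out can be sketched as one source writing to two channels, each drained by its own sink. The agent name `fan` and the output directory below are assumptions for illustration; the selector line only makes Flume's default replicating behavior explicit.

```properties
# Hypothetical agent "fan": one source, two channels, two sinks
fan.sources = r1
fan.channels = c1 c2
fan.sinks = k1 k2

fan.sources.r1.type = netcat
fan.sources.r1.bind = localhost
fan.sources.r1.port = 44444
fan.sources.r1.channels = c1 c2
# replicating is the default selector; it is spelled out here for clarity
fan.sources.r1.selector.type = replicating

fan.channels.c1.type = memory
fan.channels.c2.type = memory

# Destination 1: the agent's own log output
fan.sinks.k1.type = logger
fan.sinks.k1.channel = c1

# Destination 2: rolling files on local disk (the directory is an assumption)
fan.sinks.k2.type = file_roll
fan.sinks.k2.sink.directory = /var/log/flume
fan.sinks.k2.channel = c2
```

Because each sink reads from its own channel, a slow destination backs up only its own channel rather than blocking the other one.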

The example below combines the two previous patterns.

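The first agent receives data from a NetCat source. On top of its logger sink, it gets a second channel and a second sink, and that sink forwards events to the second agent.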

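The second agent tails a file for changes while also receiving the events sent by the first agent's sink, and writes everything to the console.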

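An Avro sink requires the following properties:

| Property | Default | Description |
|----------|---------|-------------|
| channel | -- | |
| type | -- | avro |
| hostname | -- | Hostname or IP address to connect to (the downstream Avro source) |
| port | -- | Port to connect to (the port the downstream Avro source listens on) |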

An Avro source requires the following properties:

| Property | Default | Description |
|----------|---------|-------------|
| channels | -- | |
| type | -- | avro |
| bind | -- | Hostname or IP address to bind to |
| port | -- | Port to listen on |

Create the configuration file for agent1 (agent1.conf):

# The agent is named a1
# Name its sources, sinks, and channels
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# NetCat source listening on localhost, port 44444
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Sink k1 is a logger sink
a1.sinks.k1.type = logger
# Sink k2 is an Avro sink pointing at agent2's Avro source
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = 192.168.85.132
a1.sinks.k2.port = 55555

# Memory channels: each holds at most 1000 events, and a single transaction
# moves at most 100 events in from the source or out to the sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

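# Bind the source and sinks to the channels; with the default replicating
# selector, r1 copies every event to both c1 and c2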
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2

Create the configuration file for agent2 (agent2.conf):

# The agent is named a2
# Name its sources, sinks, and channels
a2.sources = r1 r2
a2.sinks = k1
a2.channels = c1

# Source r1 is an exec source running tail -F app.log
a2.sources.r1.type = exec
a2.sources.r1.command = tail -F app.log

# Source r2 is an Avro source receiving events from agent1's Avro sink
a2.sources.r2.type = avro
a2.sources.r2.bind = 192.168.85.132
a2.sources.r2.port = 55555

# Sink k1 is a logger sink
a2.sinks.k1.type = logger

# Memory channel: holds at most 1000 events, and a single transaction
# moves at most 100 events in from a source or out to the sink
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Bind both sources and the sink to the channel
a2.sources.r1.channels = c1
a2.sources.r2.channels = c1
a2.sinks.k1.channel = c1

Start agent1 and agent2:

flume-ng agent -n a1 -c conf -f agent1.conf
flume-ng agent -n a2 -c conf -f agent2.conf

First, append some log lines to app.log; the new lines show up in agent2's output.

Then open a Netcat connection to port 44444 and send some data; it shows up in the output of both agent1 and agent2.