flume + kafka

Prerequisites:

1. Download Flume: http://flume.apache.org/download.html

2. Download and configure Kafka: http://www.cnblogs.com/eggplantpro/articles/8428932.html

3. At least three servers; I am using five:

s1: 10.211.55.16  zk & kafka (zk stands for ZooKeeper)

s2: 10.211.55.17  zk

s3: 10.211.55.18  zk

s4: 10.211.55.19  kafka & flume

s5: 10.211.55.20  kafka
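The sink configuration in this post addresses the broker by host name (s1:9092), so the Flume machine has to be able to resolve s1. The post does not say how name resolution was set up; a minimal sketch, assuming plain /etc/hosts entries built from the server list above (the function name hosts_entries is made up for this example):

```shell
# Assumption: host names s1..s5 are resolved through /etc/hosts.
# The addresses are the ones from the server list above.
hosts_entries() {
  cat <<'EOF'
10.211.55.16 s1
10.211.55.17 s2
10.211.55.18 s3
10.211.55.19 s4
10.211.55.20 s5
EOF
}

# Print the entries so they can be reviewed before use.
hosts_entries
```

On each server the output could then be appended with something like hosts_entries | sudo tee -a /etc/hosts.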

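Installation:

1. Extract the archive

The extracted directory layout is conventional; the configuration files live under conf.

2. Configure

Flume is configured differently from most software: a Flume agent of a given kind is defined entirely by its configuration. The conf directory also ships templates. I recommend looking up the settings for each source, channel, and sink you need in the official user guide:

http://flume.apache.org/FlumeUserGuide.html

vim flume-kafka.properties

This is the demo from the official guide; we can simply adapt it: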

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

The modified configuration:

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
#a1.sources.r1.type = netcat
#a1.sources.r1.bind = localhost
#a1.sources.r1.port = 44444
# The source type is exec: it runs a command
a1.sources.r1.type = exec
# tail a log file; every line written to it flows through to the sink
a1.sources.r1.command = tail -F /home/test.log

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Name of the Kafka topic
a1.sinks.k1.kafka.topic = mxb
# Kafka broker address (host:port)
a1.sinks.k1.kafka.bootstrap.servers = s1:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Use a channel which buffers events in memory
# The channel type is memory
a1.channels.c1.type = memory

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
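Flume reads this file in Java properties format, where # starts a comment only when it is the first non-blank character of a line. Anything after the = on a key line, including a trailing # note, becomes part of the value. A minimal sketch of a check for such lines, assuming a grep that supports -E (the function name find_inline_comments is made up for this example):

```shell
# find_inline_comments FILE
# Prints, with line numbers, every key = value line that has a '#'
# somewhere after the '='. Lines whose first non-blank character is
# '#' or '!' are real comments and are skipped.
# Caveat: a value that legitimately contains '#' is flagged as well.
find_inline_comments() {
  grep -nE '^[[:space:]]*[^#![:space:]][^=]*=.*#' "$1"
}

# Self-contained demo on a temporary file.
demo=$(mktemp)
printf '%s\n' \
  '# Describe the sink' \
  'a1.sinks.k1.type = logger' \
  'a1.sinks.k1.kafka.topic = mxb  # topic name' > "$demo"
find_inline_comments "$demo"   # prints: 3:a1.sinks.k1.kafka.topic = mxb  # topic name
rm -f "$demo"
```

Running find_inline_comments flume-kafka.properties before starting the agent should print nothing when every comment sits on its own line.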

3. Start

Run this from the conf directory, since the paths are relative:

../bin/flume-ng agent --conf . --conf-file flume-kafka.properties --name a1 -Dflume.root.logger=INFO,console

4. Verify

Start a Kafka consumer on one of the Kafka servers:

cd /kafka/bin
./kafka-console-consumer.sh --zookeeper s1:2181 --from-beginning --topic mxb
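The --zookeeper option belongs to the old console consumer and was removed in Kafka 2.0; newer releases connect to a broker with --bootstrap-server instead. A sketch of the equivalent command, assuming the same install path /kafka/bin and the broker s1:9092 from the sink configuration. It only prints the command unless the script is actually present:

```shell
# Assumptions: Kafka is installed in /kafka/bin and a broker listens on s1:9092.
KAFKA_BIN=/kafka/bin
CONSUMER="$KAFKA_BIN/kafka-console-consumer.sh"
CONSUMER_ARGS="--bootstrap-server s1:9092 --from-beginning --topic mxb"

if [ -x "$CONSUMER" ]; then
  # Blocks and prints messages until interrupted with Ctrl-C.
  # CONSUMER_ARGS is left unquoted on purpose so it splits into options.
  "$CONSUMER" $CONSUMER_ARGS
else
  echo "would run: $CONSUMER $CONSUMER_ARGS"
fi
```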

Because the Flume source tails a log file, we only need to append content to /home/test.log:

for((i=0;i<5000;i++))
do
echo test$i >> /home/test.log
done

After running the command that writes to the log, the consumed messages appear on the server where the Kafka consumer was started.
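Neither the exec source nor the memory channel guarantees delivery, so it can be worth checking that all 5000 lines arrived. A sketch that counts lines of the form test<N> in a file; count_test_lines is a made-up helper, and consumer.out is a hypothetical file holding the consumer's output (for example captured by redirecting the consumer command into it), not part of the original setup:

```shell
# count_test_lines FILE: print how many lines look like test0, test1, ...
count_test_lines() {
  grep -cE '^test[0-9]+$' "$1"
}

# Self-contained demo on a temporary file.
demo=$(mktemp)
for i in 0 1 2; do echo "test$i"; done > "$demo"
echo "other line" >> "$demo"
count_test_lines "$demo"   # prints 3
rm -f "$demo"
```

Comparing count_test_lines /home/test.log with count_test_lines consumer.out then shows whether any events were lost on the way.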

Reposted from: https://www.cnblogs.com/chouc/p/8429324.html
