1. Environment overview

As shown in the figure, Node.js acts as the data-source producer: it generates messages and sends them to a Kafka queue. The red line shows how data flows in the local development environment (during local development, the Storm topology runs in local mode).

2. Setting up the environment (I use Eclipse + Maven)

1. Create a Maven project, then modify the pom file as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.h3c.storm</groupId>
  <artifactId>storm-samples</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>storm-kafka-test</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>jdk.tools</groupId>
      <artifactId>jdk.tools</artifactId>
      <version>1.7</version>
      <scope>system</scope>
      <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-core</artifactId>
      <version>0.10.0</version>
      <!-- keep storm out of the jar-with-dependencies -->
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.9.2</artifactId>
      <version>0.8.1.1</version>
      <exclusions>
        <exclusion>
          <groupId>org.apache.zookeeper</groupId>
          <artifactId>zookeeper</artifactId>
        </exclusion>
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-kafka</artifactId>
      <version>0.9.2-incubating</version>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-hbase</artifactId>
      <version>0.10.0</version>
    </dependency>
  </dependencies>
</project>
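
Because storm-core is scoped as provided, a cluster-submittable jar is normally built as a jar-with-dependencies that leaves Storm itself out (as the comment in the pom hints). A minimal sketch of the build section this assumes (the original pom shows no build section), using the standard maven-assembly-plugin:

<build>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <descriptorRefs>
          <!-- bundles all non-provided dependencies into one runnable jar -->
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
    </plugin>
  </plugins>
</build>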


2. Sample Node.js producer code. Note that you first have to create a Kafka topic by hand to match the one used in the code; the topic I created here is "historyclients".
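
For reference, the topic can be created with the kafka-topics.sh script that ships with Kafka; the single partition matches the producer code below, while the replication factor here is just an assumption for a minimal setup:

bin/kafka-topics.sh --create --zookeeper 172.27.8.111:2181 --topic historyclients --partitions 1 --replication-factor 1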

var kafka = require('kafka-node');
var Producer = kafka.Producer;
var KeyedMessage = kafka.KeyedMessage;
var conf = '172.27.8.111:2181,172.27.8.112:2181,172.27.8.119:2181';
var client = new kafka.Client(conf);
var producer = new Producer(client);

var clientOnlineInfo = {
    "clientMAC": "0000-0000-0002",
    "acSN": "210235A1AMB159000008",
    "onLineTime": "2016-06-27 10:00:00"
};
var clientOnlineInfoStr = JSON.stringify(clientOnlineInfo);
var msg = [
    { topic: 'historyclients', messages: clientOnlineInfoStr, partition: 0 }
];

producer.on('ready', function () {
    producer.send(msg, function (err, data) {
        console.log("done!");
        console.log(data);
    });
});

producer.on('error', function (err) {
    console.error(err);
});
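
Assuming the script above is saved as producer.js (a file name chosen here for illustration), it only needs the kafka-node package:

npm install kafka-node
node producer.js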


3. Spout code

package com.h3c.storm;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class KafkaSpout extends BaseRichSpout {

    private SpoutOutputCollector collector;
    private ConsumerConnector consumer;
    private String topic;
    private Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap;

    private static ConsumerConfig createConsumerConfig() {
        Properties props = new Properties();
        props.put("zookeeper.connect", "172.27.8.111:2181,172.27.8.112:2181,172.27.8.119:2181");
        props.put("group.id", "group1");
        props.put("zookeeper.session.timeout.ms", "40000");
        props.put("zookeeper.sync.time.ms", "200");
        props.put("auto.commit.interval.ms", "1000");
        return new ConsumerConfig(props);
    }

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        System.err.println("open!!!!!!!!!!!!!!!");
        this.collector = collector;
        /* create consumer */
        this.topic = "historyclients";
        this.consumer = kafka.consumer.Consumer.createJavaConsumerConnector(createConsumerConfig());
        /* topic HashMap, which means the map can include multiple topics */
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, new Integer(1));
        this.consumerMap = consumer.createMessageStreams(topicCountMap);
    }

    @Override
    public void nextTuple() {
        KafkaStream<byte[], byte[]> stream = consumerMap.get(topic).get(0);
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        String toSay = "";
        while (it.hasNext()) {
            toSay = new String(it.next().message());
            System.err.println("receive:" + toSay);
            this.collector.emit(new Values(toSay));
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("clientInfo"));
    }
}
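
One caveat: the while (it.hasNext()) loop above never returns, so nextTuple() blocks its executor thread forever and Storm cannot interleave other work on it. A minimal sketch of a common workaround, not from the original post: consume on a background thread and keep nextTuple() non-blocking.

// Assumed refactoring sketch: a background thread drains Kafka into an
// in-memory queue, and nextTuple() emits at most one message per call.
private final java.util.concurrent.BlockingQueue<String> queue =
        new java.util.concurrent.LinkedBlockingQueue<String>(1000);

// At the end of open():
new Thread(new Runnable() {
    public void run() {
        ConsumerIterator<byte[], byte[]> it = consumerMap.get(topic).get(0).iterator();
        while (it.hasNext()) {
            try {
                // put() blocks when the queue is full, giving crude back-pressure
                queue.put(new String(it.next().message()));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}).start();

// Replacement for nextTuple():
@Override
public void nextTuple() {
    String msg = queue.poll(); // non-blocking; null when the queue is empty
    if (msg != null) {
        collector.emit(new Values(msg));
    }
}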


4. Mapper code required by the storm-hbase API

package com.h3c.storm;

import org.apache.storm.hbase.bolt.mapper.HBaseMapper;
import org.apache.storm.hbase.common.ColumnList;

import backtype.storm.tuple.Tuple;

public class MyHBaseMapper implements HBaseMapper {

    public ColumnList columns(Tuple tuple) {
        ColumnList cols = new ColumnList();
        // arguments are, in order: column family, column qualifier, value
        cols.addColumn("f1".getBytes(), "colMAC".getBytes(),
                tuple.getStringByField("clientInfo").getBytes());
        //System.err.println("BOLT + " + tuple.getStringByField("clientInfo"));
        //cols.addColumn("f1".getBytes(), "hhhhhhh".getBytes(), "0000-0000-0001".getBytes());
        return cols;
    }

    public byte[] rowKey(Tuple tuple) {
        //return tuple.getStringByField("clientInfo").getBytes();
        return "newRowKey".getBytes();
    }
}
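
The mapper writes to column family f1 of table historyclients, so that table must already exist. A hypothetical one-off helper (not part of the original post) that creates it with the HBase 0.98-era admin API, reading hbase-site.xml from the classpath as set up in step 6 below:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateHistoryClientsTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // picks up hbase-site.xml
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("historyclients"));
        desc.addFamily(new HColumnDescriptor("f1")); // the family MyHBaseMapper writes to
        admin.createTable(desc);
        admin.close();
    }
}

Equivalently, from the HBase shell: create 'historyclients', 'f1'. Note also that rowKey() returns the constant "newRowKey", so every tuple overwrites the same row; deriving the key from the tuple (see the commented-out line) is what you would want in practice.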


5. Topology code

package com.h3c.storm;

import java.util.HashMap;
import java.util.Map;

import org.apache.storm.hbase.bolt.HBaseBolt;
import org.apache.storm.hbase.bolt.mapper.HBaseMapper;

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.utils.Utils;

public class PersistTopology {

    private static final String KAFKA_SPOUT = "KAFKA_SPOUT";
    private static final String HBASE_BOLT = "HBASE_BOLT";

    public static void main(String[] args) throws Exception {
        /* define spout */
        KafkaSpout kafkaSpout = new KafkaSpout();
        System.setProperty("hadoop.home.dir", "E:\\eclipse\\");

        /* define HBase bolt */
        HBaseMapper mapper = new MyHBaseMapper();
        HBaseBolt hbaseBolt = new HBaseBolt("historyclients", mapper).withConfigKey("hbase.conf");

        /* define topology */
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout(KAFKA_SPOUT, kafkaSpout);
        builder.setBolt(HBASE_BOLT, hbaseBolt).shuffleGrouping(KAFKA_SPOUT);

        Config conf = new Config();
        conf.setDebug(true);
        Map<String, Object> hbConf = new HashMap<String, Object>();
//        if (args.length > 0) {
//            hbConf.put("hbase.rootdir", args[0]);
//        }
//        hbConf.put("hbase.rootdir", "hdfs://172.27.8.111:8020/apps/hbase/data");
        conf.put("hbase.conf", hbConf);

        if (args != null && args.length > 0) {
            conf.setNumWorkers(3);
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("test", conf, builder.createTopology());
            Utils.sleep(600000);
            cluster.killTopology("test");
            cluster.shutdown();
        }
    }
}
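
When started with arguments, main() submits to a real cluster via StormSubmitter; in that case the jar is launched with the storm CLI, along the lines of the following (jar and topology names assumed from the pom and code above):

storm jar storm-samples-1.0-SNAPSHOT-jar-with-dependencies.jar com.h3c.storm.PersistTopology myTopology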


6. Fetch the hbase-site.xml file from the cluster and add it to the project; this can be configured in the build path.

7. In the hosts file under C:\Windows\System32\drivers\etc, add the mappings from the cluster's IPs to its hostnames:

172.27.8.111 node1.hde.h3c.com node1
172.27.8.112 node2.hde.h3c.com node2
172.27.8.119 node3.hde.h3c.com node3

8. Fixing "java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries."

Download winutils.exe from the internet and put it in a suitable location; for example, I placed it under E:\eclipse\bin. The directory name must end in "bin".

Then just add this line to the code:

System.setProperty("hadoop.home.dir", "E:\\eclipse\\");

References

http://www.tuicool.com/articles/r6ZZBjU

Reposted from: https://www.cnblogs.com/zhengchunhao/p/5630052.html
