1:上传文件到hdfs

[jifeng@feng01 hadoop]$ hadoop fs -mkdir /user/jifeng/fire
15/03/05 13:29:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[jifeng@feng01 hadoop]$ hadoop fs -put /home/jifeng/hadoop/firewallb-week2-origin.sql /user/jifeng/fire/
15/03/05 13:30:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[jifeng@feng01 hadoop]$ 

2:hive中建表

[jifeng@feng01 ~]$ hive
15/03/05 13:31:08 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* insteadLogging initialized using configuration in jar:file:/home/jifeng/hive/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> CREATE EXTERNAL TABLE firewall (         idauto INT,         dt TIMESTAMP,         dtstr STRING,         timeSeconds INT,         priority STRING,         operation STRING,         messageCode STRING,         protocol STRING,         srcIp STRING,         destIp STRING,         srcIPNumber bigINT,         destIPNumber bigINT,         srcPort INT,         destPort INT,         destService STRING,         direction STRING,         flags STRING,         command STRING,         slot1min INT,         slot5min INT,         slot30min INT,         slot60min INT        ) row format delimited fields terminated by '\t';
OK
Time taken: 3.414 seconds
hive> load data inpath '/user/jifeng/fire/firewallb-week2-origin.sql' into table firewall;
Loading data to table default.firewall
Table default.firewall stats: [numFiles=1, numRows=0, totalSize=3415284428, rawDataSize=0]
OK
Time taken: 1.441 seconds

建表语句:

CREATE EXTERNAL TABLE firewall (  idauto INT,  dt TIMESTAMP,  dtstr STRING,  timeSeconds INT,  priority STRING,  operation STRING,  messageCode STRING,  protocol STRING,  srcIp STRING,  destIp STRING,  srcIPNumber bigINT,  destIPNumber bigINT,  srcPort INT,  destPort INT,  destService STRING,  direction STRING,  flags STRING,  command STRING,  slot1min INT,  slot5min INT,  slot30min INT,  slot60min INT   ) row format delimited fields terminated by '\t'; 

3:hive中测试

hive> select * from firewall order by dt desc limit 10;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:set mapreduce.job.reduces=<number>
Starting Job = job_1423061535604_0004, Tracking URL = http://feng01:8088/proxy/application_1423061535604_0004/
Kill Command = /home/jifeng/hadoop/hadoop-2.4.1/bin/hadoop job  -kill job_1423061535604_0004
Hadoop job information for Stage-1: number of mappers: 13; number of reducers: 1
2015-03-05 13:35:57,706 Stage-1 map = 0%,  reduce = 0%
2015-03-05 13:36:58,247 Stage-1 map = 0%,  reduce = 0%
2015-03-05 13:37:26,248 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 5.38 sec
2015-03-05 13:37:34,542 Stage-1 map = 8%,  reduce = 0%, Cumulative CPU 7.63 sec
2015-03-05 13:38:40,840 Stage-1 map = 8%,  reduce = 0%, Cumulative CPU 13.89 sec
2015-03-05 13:39:24,926 Stage-1 map = 10%,  reduce = 0%, Cumulative CPU 79.08 sec
2015-03-05 13:40:00,669 Stage-1 map = 13%,  reduce = 0%, Cumulative CPU 85.37 sec
2015-03-05 13:40:25,540 Stage-1 map = 15%,  reduce = 0%, Cumulative CPU 86.27 sec
2015-03-05 13:40:45,643 Stage-1 map = 26%,  reduce = 0%, Cumulative CPU 107.73 sec
2015-03-05 13:40:46,742 Stage-1 map = 31%,  reduce = 0%, Cumulative CPU 109.79 sec
2015-03-05 13:40:59,426 Stage-1 map = 36%,  reduce = 0%, Cumulative CPU 116.75 sec
2015-03-05 13:41:03,706 Stage-1 map = 38%,  reduce = 0%, Cumulative CPU 120.27 sec
2015-03-05 13:41:07,921 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 125.38 sec
2015-03-05 13:41:15,830 Stage-1 map = 49%,  reduce = 0%, Cumulative CPU 130.26 sec
2015-03-05 13:41:18,002 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 131.86 sec
2015-03-05 13:42:18,780 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 170.98 sec
2015-03-05 13:43:19,428 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 194.07 sec
2015-03-05 13:44:20,028 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 197.32 sec
2015-03-05 13:45:20,494 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 200.93 sec
2015-03-05 13:46:21,209 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 223.6 sec
2015-03-05 13:46:42,075 Stage-1 map = 54%,  reduce = 15%, Cumulative CPU 230.03 sec
2015-03-05 13:46:49,284 Stage-1 map = 56%,  reduce = 15%, Cumulative CPU 218.59 sec
2015-03-05 13:46:51,358 Stage-1 map = 60%,  reduce = 15%, Cumulative CPU 219.07 sec
2015-03-05 13:46:54,448 Stage-1 map = 63%,  reduce = 15%, Cumulative CPU 191.15 sec
2015-03-05 13:46:58,609 Stage-1 map = 64%,  reduce = 15%, Cumulative CPU 193.67 sec
2015-03-05 13:47:05,823 Stage-1 map = 67%,  reduce = 15%, Cumulative CPU 180.53 sec
2015-03-05 13:47:09,938 Stage-1 map = 69%,  reduce = 15%, Cumulative CPU 194.45 sec
2015-03-05 13:47:11,986 Stage-1 map = 69%,  reduce = 18%, Cumulative CPU 194.85 sec
2015-03-05 13:47:47,100 Stage-1 map = 72%,  reduce = 18%, Cumulative CPU 209.86 sec
2015-03-05 13:48:31,364 Stage-1 map = 77%,  reduce = 18%, Cumulative CPU 210.55 sec
2015-03-05 13:48:37,549 Stage-1 map = 77%,  reduce = 21%, Cumulative CPU 210.74 sec
2015-03-05 13:48:53,998 Stage-1 map = 82%,  reduce = 21%, Cumulative CPU 177.87 sec
2015-03-05 13:48:55,021 Stage-1 map = 82%,  reduce = 23%, Cumulative CPU 178.51 sec
2015-03-05 13:49:06,447 Stage-1 map = 90%,  reduce = 23%, Cumulative CPU 143.53 sec
2015-03-05 13:49:11,591 Stage-1 map = 90%,  reduce = 26%, Cumulative CPU 144.64 sec
2015-03-05 13:49:12,612 Stage-1 map = 95%,  reduce = 26%, Cumulative CPU 145.78 sec
2015-03-05 13:49:14,665 Stage-1 map = 95%,  reduce = 28%, Cumulative CPU 145.81 sec
2015-03-05 13:49:28,043 Stage-1 map = 97%,  reduce = 28%, Cumulative CPU 147.61 sec
2015-03-05 13:49:29,075 Stage-1 map = 100%,  reduce = 28%, Cumulative CPU 147.67 sec
2015-03-05 13:50:06,153 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 148.22 sec
2015-03-05 13:50:09,249 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 149.17 sec
MapReduce Total cumulative CPU time: 2 minutes 29 seconds 170 msec
Ended Job = job_1423061535604_0004
MapReduce Jobs Launched:
Job 0: Map: 13  Reduce: 1   Cumulative CPU: 149.68 sec   HDFS Read: 3418565375 HDFS Write: 2076 SUCCESS
Total MapReduce CPU Time Spent: 2 minutes 29 seconds 680 msec
OK
16600916        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.6.6.7 10.0.2.2        168166919       167772674       42465   52372   52372_tcp       (empty) (empty) (empty) 227670004553400 758900  379450
16600915        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.13.77.49      10.0.3.2        168643889       167772930       58200   32179   32179_tcp       (empty) (empty) (empty) 22767000 4553400 758900  379450
16600931        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.6.6.7 10.0.2.2        168166919       167772674       42464   33004   33004_tcp       (empty) (empty) (empty) 227670004553400 758900  379450
16600919        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.6.6.7 10.0.2.2        168166919       167772674       42465   38300   38300_tcp       (empty) (empty) (empty) 227670004553400 758900  379450
16600918        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.13.77.49      10.0.3.2        168643889       167772930       58199   25087   25087_tcp       (empty) (empty) (empty) 22767000 4553400 758900  379450
16600912        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.6.6.7 10.0.2.2        168166919       167772674       42465   556     remotefs        (empty) (empty) (empty) 227670004553400 758900  379450
16600913        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.13.77.49      10.0.3.2        168643889       167772930       58199   32179   32179_tcp       (empty) (empty) (empty) 22767000 4553400 758900  379450
16600920        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Info    Teardown        ASA-6-302014    TCP      10.6.6.7        172.10.0.2      168166919       2886336514      42706   3389    3389_tcp        inbound SYN Timeout      (empty) 22767000        4553400 758900  379450
16600930        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.13.77.49      10.0.3.2        168643889       167772930       58200   13817   13817_tcp       (empty) (empty) (empty) 22767000 4553400 758900  379450
16600911        2013-04-15 10:00:00     15/Apr/2013 10:00:00    1366020000      Warning Deny    ASA-4-106023    TCP     10.13.77.49      10.0.3.2        168643889       167772930       58200   1993    1993_tcp        (empty) (empty) (empty) 22767000 4553400 758900  379450
Time taken: 879.31 seconds, Fetched: 10 row(s)

4:启动spark sql server

[jifeng@feng02 spark-1.2.0-bin-2.4.1]$ ./sbin/start-thriftserver.sh  --hiveconf hive.server2.thrift.port=10000  --hiveconf hive.server2.thrift.bind.host=feng02 --master spark://feng02:7077
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /home/jifeng/hadoop/spark-1.2.0-bin-2.4.1/sbin/../logs/spark-jifeng-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-feng02.out

运行beeline

[jifeng@feng02 spark-1.2.0-bin-2.4.1]$ ./bin/beeline
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Beeline version ??? by Apache Hive

连接:

beeline> !connect jdbc:hive2://feng02:10000 scott tiger org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://feng02:10000
log4j:WARN No appenders could be found for logger (org.apache.thrift.transport.TSaslTransport).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Connected to: Spark SQL (version 1.2.0)
Driver: null (version null)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://feng02:10000> show tables;
+----------------+
|     result     |
+----------------+
| course         |
| firewall       |
| hbase_table_1  |
| pokes          |
| student        |
+----------------+
5 rows selected (0.489 seconds)

5:spark sql测试

0: jdbc:hive2://feng02:10000> select * from firewall order by dt desc limit 10;
+-----------+------------------------+-----------------------+--------------+-----------+------------+---------------+--+
|  idauto   |           dt           |         dtstr         | timeseconds  | priority  | operation  |  messagecode  |  |
+-----------+------------------------+-----------------------+--------------+-----------+------------+---------------+--+
| 16600918  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
| 16600914  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
| 16600917  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
| 16600912  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
| 16600913  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
| 16600915  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
| 16600916  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
| 16600911  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
| 16600920  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Info      | Teardown   | ASA-6-302014  |  |
| 16600919  | 2013-04-15 10:00:00.0  | 15/Apr/2013 10:00:00  | 1366020000   | Warning   | Deny       | ASA-4-106023  |  |
+-----------+------------------------+-----------------------+--------------+-----------+------------+---------------+--+
10 rows selected (119.692 seconds)

统计IP出现次数:

0: jdbc:hive2://feng02:10000> select srcIp,count(*) as num  from firewall  group by srcIp order by  num desc limit 100;
+-----------------+----------+
|      srcIp      |   num    |
+-----------------+----------+
| 10.12.15.152    | 3875019  |
| 10.13.77.49     | 1954742  |
| 10.6.6.7        | 1769857  |
| 10.138.235.111  | 1630920  |
| 10.12.14.15     | 756475   |
| 10.17.15.10     | 720936   |
| 172.10.0.4      | 482568   |
| 10.78.100.150   | 300530   |
| 10.247.58.182   | 292384   |
| 10.38.217.48    | 282588   |
| 172.30.0.4      | 232850   |
| 10.10.11.102    | 205579   |
| 10.170.32.110   | 202735   |
| 10.170.32.181   | 200539   |
| 10.138.214.18   | 197864   |
| 10.247.106.27   | 193912   |
| 10.156.165.120  | 183929   |
| 10.15.7.85      | 172422   |
| 10.0.0.42       | 165063   |
| 10.200.20.2     | 164519   |
| 10.70.68.127    | 161735   |
| 172.20.0.15     | 145924   |
| 10.250.178.101  | 140053   |
| 172.20.0.4      | 122768   |
| 172.30.1.223    | 51329    |
| 172.20.1.81     | 51328    |
| 172.10.2.66     | 51304    |
| 172.20.1.47     | 51298    |
| 172.30.1.218    | 51236    |
| 172.10.2.135    | 51233    |
| 172.20.1.23     | 51199    |
| 172.10.2.106    | 51174    |
| 172.20.0.3      | 20795    |
| 172.10.0.2      | 16296    |
| 10.13.77.4      | 15622    |
| 10.77.101.2     | 15307    |
| 10.77.24.4      | 15274    |
| 10.0.2.181      | 15264    |
| 10.0.3.77       | 15086    |
| 10.58.99.97     | 15021    |
| 10.83.101.25    | 14997    |
| 10.62.15.101    | 14930    |
| 10.1.55.43      | 14783    |
| 172.20.0.2      | 14754    |
| 10.27.200.49    | 14690    |
| 10.18.37.65     | 14645    |
| 10.43.51.79     | 14457    |
| 172.30.0.3      | 11113    |
| 172.30.0.2      | 10129    |
| 172.10.0.3      | 9598     |
| 172.10.0.9      | 3899     |
| 172.10.0.8      | 3852     |
| 172.20.0.6      | 3833     |
| 172.10.0.7      | 3820     |
| 172.10.0.5      | 3805     |
| 10.57.84.113    | 3420     |
| 10.57.105.222   | 3329     |
| 10.170.32.239   | 3231     |
| 172.30.1.215    | 2973     |
| 172.30.1.248    | 2908     |
| 172.30.1.213    | 2890     |
| 172.30.1.217    | 2713     |
| 172.30.1.210    | 2686     |
| 172.20.0.5      | 2668     |
| 172.30.1.205    | 2639     |
| 10.76.187.67    | 2609     |
| 172.30.1.204    | 2556     |
| 172.30.1.227    | 2553     |
| 172.30.1.242    | 2525     |
| 172.30.1.236    | 2521     |
| 10.156.165.212  | 2513     |
| 10.88.52.24     | 2513     |
| 172.30.1.247    | 2513     |
| 10.41.9.89      | 2509     |
| 10.89.101.53    | 2503     |
| 10.88.25.4      | 2494     |
| 172.30.1.243    | 2473     |
| 172.30.1.245    | 2457     |
| 172.30.1.241    | 2442     |
| 10.51.88.167    | 2418     |
| 172.30.1.202    | 2415     |
| 172.30.1.237    | 2414     |
| 10.0.3.178      | 2414     |
| 10.36.156.99    | 2411     |
| 10.32.42.116    | 2410     |
| 10.78.90.9      | 2408     |
| 172.30.1.229    | 2408     |
| 10.33.0.15      | 2406     |
| 10.0.125.6      | 2400     |
| 10.138.78.75    | 2400     |
| 10.51.88.253    | 2400     |
| 10.38.37.13     | 2398     |
| 10.76.187.68    | 2397     |
| 10.76.193.186   | 2397     |
| 10.149.59.54    | 2396     |
| 10.170.148.52   | 2395     |
| 10.179.19.15    | 2395     |
| 10.57.105.44    | 2395     |
| 172.30.1.224    | 2394     |
+-----------------+----------+
|      srcIp      |   num    |
+-----------------+----------+
| 10.38.37.229    | 2393     |
+-----------------+----------+
100 rows selected (65.642 seconds)
0: jdbc:hive2://feng02:10000> select count(*) from firewall;
+-----------+
|    _c0    |
+-----------+
| 16600931  |
+-----------+
1 row selected (47.636 seconds)

spark sql and hive 3g数据测试相关推荐

  1. spark sql读取hive底层_scala – 从一个hive表中读取并使用spark sql写回来

    我正在使用Spark SQL读取Hive表并将其分配给 scala val val x = sqlContext.sql("select * from some_table") 然 ...

  2. Spark SQL整合Hive

    Spark SQL官方释义 Spark SQL is Apache Spark's module for working with structured data. 一.使用Spark SQL访问Hi ...

  3. Hive on Spark和Spark sql on Hive,你能分的清楚么

    摘要:结构上Hive On Spark和SparkSQL都是一个翻译层,把一个SQL翻译成分布式可执行的Spark程序. 本文分享自华为云社区<Hive on Spark和Spark sql o ...

  4. spark基础之Spark SQL和Hive的集成以及ThriftServer配置

    如果希望Maven编译Spark时支持Hive,需要给定-Phive -Phive-thriftserver.比如比如:mvn -Pyarn -Phadoop-2.6 -Dhadoop.version ...

  5. 使用Spark SQL读取Hive上的数据

    Spark SQL主要目的是使得用户可以在Spark上使用SQL,其数据源既可以是RDD,也可以是外部的数据源(比如Parquet.Hive.Json等).Spark SQL的其中一个分支就是Spar ...

  6. spark sql on hive初探

    前一段时间由于shark项目停止更新,sql on spark拆分为两个方向,一个是spark sql on hive,另一个是hive on spark.hive on spark达到可用状态估计还 ...

  7. Spark SQL 与 Hive 的第一场会师

    "你好,一杯热美式,加 2 份shot, 1 份焦糖,谢谢" L 跨进汇智国际中心大厦的 Starbucks, 拿着 iPhone 对着点餐机轻轻一扫,对黑带服务员小妹抛出一个笑脸 ...

  8. spark sql读写hive的过程

    Spark sql读写hive需要hive相关的配置,所以一般将hive-site.xml文件放到spark的conf目录下.代码调用都是简单的,关键是源码分析过程,spark是如何与hive交互的. ...

  9. Spark SQL 和 Hive UDF ExceptionInInitializerError getRemoteBlockReaderFromTcp BlockReaderFactory

    文章目录 1.背景 2. hive UDF函数 2. 注册到hive中 3. Spark SQL 4.运行报错 5. HDFS读取问题? 6. 牛 ???????? 7 . 解决后在谷歌搜索发现 8. ...

最新文章

  1. 2017年计算机四级试题,2017年全国计算机四级考试模拟试题与答案
  2. php如何包含html模板,php html模板中怎么使用类似include的功能?
  3. 聊聊 SpringCloud 中的父子容器
  4. 基于STM32,无人船岸基信息处理代码--python实现
  5. Android studio中不同颜色代表什么意思
  6. HDU-3065 病毒侵袭持续中 AC自动机又是一板子!
  7. 测试一下Windows Live Writer能否正常使用。
  8. 【转载+思考】追问泄题之源:命题权寻租打开方便之门
  9. MySQL十四:单表最大2000W行数据
  10. 关于 联想昭阳 E43A 无线开关设置的 正确解决办法
  11. 【iOS】—— 获取设备唯一标识
  12. vs2015遇见问题:后面有“::”的名称一定是类名或命名空间名
  13. 使用MapReduce计算Pi
  14. 【青少年编程】【二级】绘制五彩缤纷的多瓣花
  15. VMware无法启动虚拟机的解决办法
  16. 2020新版软件自动化测试自学全套教程——中级程序员学习路线
  17. 分享关于UE4中matinee工具的使用教程
  18. Oracle 存储过程 问题记录
  19. SpringMVC Controller中常用到的一些注解
  20. 机器人学习——姿态表达,ros通讯机制

热门文章

  1. java 正则 空格_Java中关于空格的正则表达式
  2. 离开当前屏幕的判断方法_Android App内存泄露测试方法总结
  3. linux ps1 目录颜色,Linux下PS1终端下的颜色设置
  4. python remove函数_python中remove函数的用法是什么?
  5. go mongodb排序查询_「赵强老师」MongoDB中的索引(下)
  6. catia如何测量毛料尺寸_浅谈线束尺寸测量基准点的定义
  7. 九十九、Python所学经验分享
  8. 直播预告 | AAAI 2022论文解读:对称的语义感知的妆容迁移与移除网络
  9. EMNLP 2021 | 大道至简:微软提出可生成高质量文本表示的预训练语言模型
  10. 预训练语言模型论文分类整理:综述、基准数据集、PLM的设计和分析