Hive notes (a companion to the previous post, "Hadoop Cluster Setup")

2024-05-07 11:47:34

1. Upload the Hive installation package

2. Extract it

2.1

create table trade_detail (id bigint, account string, income double, expenses double, time string);

2.2

Create the file /root/trade_detail.txt:

1 Jason@ahome.com 3000000000.0 0.0 2017-10-01

2 Tony@ahome.com 3000000000.0 0.0 2017-10-01

3 Scott@ahome.com 3000000000.0 0.0 2017-10-01

2.3

load data local inpath '/root/trade_detail.txt' into table trade_detail;

2.4 Specify the field delimiter

create table teacher(id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by '\t';

load data local inpath '/root/trade_detail.txt' into table teacher;
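Why the delimiter matters: when no ROW FORMAT clause is given, Hive splits each text row on its default field delimiter, the control character \001 (Ctrl-A), so a tab-separated file would not break into columns. The Python sketch below only imitates that splitting step. It assumes the lines of trade_detail.txt are tab-separated, and the helper name split_fields is made up for illustration; it is not part of Hive.

```python
# Illustration only: mimics how a delimited text row is cut into fields.
HIVE_DEFAULT_DELIM = "\x01"  # Hive's default field delimiter (Ctrl-A)


def split_fields(line, delim):
    """Split one text row into column values on the given delimiter."""
    return line.rstrip("\n").split(delim)


# Assumed tab-separated version of the first sample row.
row = "1\tJason@ahome.com\t3000000000.0\t0.0\t2017-10-01\n"

# Default delimiter: the whole line stays in a single field.
print(len(split_fields(row, HIVE_DEFAULT_DELIM)))  # 1

# Tab delimiter, as declared for the teacher table: five fields.
print(len(split_fields(row, "\t")))  # 5
```

With the tab delimiter the five values line up with the five declared columns, which is what the teacher table relies on.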

2.5 Create a table over all files in a directory (using an HDFS directory as the table location)

2.5.1 First put the files on HDFS

hive> dfs -put /root/student.txt /data/a.txt;

hive> dfs -put /root/student.txt /data/b.txt;

2.5.2 Create the table and point it at the HDFS directory /data

hive> create table bd_teacher(id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by '\t' location '/data';

2.6 Hadoop shell commands can be run from the Hive console:

hive> dfs -ls /;

hive> dfs -mkdir /data;

hive> dfs -put /root/student.txt /data/a.txt;

hive> dfs -put /root/student.txt /data/b.txt;

hive> create external table bd_student(id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by '\t' location '/data';

hive> select * from bd_student;

OK

1 Jason@ahome.com 3.0E9 0.0 2017-10-01

2 Tony@ahome.com 3.0E9 0.0 2017-10-01

3 Scott@ahome.com 3.0E9 0.0 2017-10-01

1 Jason@ahome.com 3.0E9 0.0 2017-10-01

2 Tony@ahome.com 3.0E9 0.0 2017-10-01

3 Scott@ahome.com 3.0E9 0.0 2017-10-01

Time taken: 0.331 seconds, Fetched: 6 row(s)

hive> dfs -put /root/student.txt /data/c.txt;

hive> select * from bd_student;

OK

1 Jason@ahome.com 3.0E9 0.0 2017-10-01

2 Tony@ahome.com 3.0E9 0.0 2017-10-01

3 Scott@ahome.com 3.0E9 0.0 2017-10-01

1 Jason@ahome.com 3.0E9 0.0 2017-10-01

2 Tony@ahome.com 3.0E9 0.0 2017-10-01

3 Scott@ahome.com 3.0E9 0.0 2017-10-01

1 Jason@ahome.com 3.0E9 0.0 2017-10-01

2 Tony@ahome.com 3.0E9 0.0 2017-10-01

3 Scott@ahome.com 3.0E9 0.0 2017-10-01

Time taken: 0.331 seconds, Fetched: 9 row(s)

In other words, once data files are placed in the directory, Hive treats every file there as the table's data and queries them all.
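The behaviour seen in 2.6 (6 rows with two files, 9 rows after a third is added) can be imitated locally. This is a rough Python sketch, not how Hive is implemented: a temporary local directory stands in for /data on HDFS, the helper names read_table and put are hypothetical, and the sample rows are assumed to be tab-separated.

```python
import os
import tempfile


def read_table(directory, delim="\t"):
    """Return every row from every file in the directory, in file-name order."""
    rows = []
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name), encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    rows.append(line.rstrip("\n").split(delim))
    return rows


def put(directory, name, text):
    """Stand-in for `dfs -put`: drop a file into the table directory."""
    with open(os.path.join(directory, name), "w", encoding="utf-8") as f:
        f.write(text)


# Assumed contents of student.txt, matching the rows shown in the query output.
STUDENT = (
    "1\tJason@ahome.com\t3000000000.0\t0.0\t2017-10-01\n"
    "2\tTony@ahome.com\t3000000000.0\t0.0\t2017-10-01\n"
    "3\tScott@ahome.com\t3000000000.0\t0.0\t2017-10-01\n"
)

with tempfile.TemporaryDirectory() as data_dir:
    put(data_dir, "a.txt", STUDENT)
    put(data_dir, "b.txt", STUDENT)
    rows_before = len(read_table(data_dir))  # two files of three rows each
    put(data_dir, "c.txt", STUDENT)
    rows_after = len(read_table(data_dir))   # the new file is picked up as well

print(rows_before, rows_after)  # 6 9
```

No reload step is needed in the sketch because the "table" is just whatever files are in the directory at query time.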

2.7 Partitioned tables

hive> create external table beauties (id bigint, name string, size double) partitioned by (nation string) row format delimited fields terminated by '\t' location '/beauty';

cat /root/beauty.txt

1 jingtian 35

2 bingbing 35

cat /root/beauty2.txt

5 bdyjy 32

6 jzmb 35

cat /root/beauty3.txt

8 zl 1

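// A file uploaded directly with hadoop lands in the table root, outside any partition, so queries on the partitioned table return no rows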

hadoop fs -put beauty.txt /beauty/b.c

hive> select * from beauties;

OK

Time taken: 0.424 seconds

// Load the data through Hive instead

hive> load data local inpath '/root/beauty.txt' into table beauties partition (nation='China');

hive> select * from beauties;

OK

1 jingtian 35.0 China

2 bingbing 35.0 China

Time taken: 0.125 seconds, Fetched: 2 row(s)

// Add another partition

alter table beauties add partition (nation='Japan');

// Add data to it

dfs -put /root/beauty2.txt /beauty/nation=Japan;
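The steps in 2.7 follow from how partitions map to directories: each partition is a subdirectory named column=value under the table location, and a query only reads partitions that have been registered. Below is a hedged Python sketch of that idea. A temporary local directory stands in for /beauty, the list registered stands in for the partition list kept by the metastore, and the helper names partition_dir, put and scan are hypothetical.

```python
import os
import tempfile


def partition_dir(table_dir, column, value):
    """Hive-style partition layout: <table_dir>/<column>=<value>."""
    return os.path.join(table_dir, f"{column}={value}")


def put(path, text):
    """Stand-in for uploading a file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def scan(table_dir, column, registered, delim="\t"):
    """Read rows only from registered partitions, appending the partition value."""
    rows = []
    for value in registered:
        pdir = partition_dir(table_dir, column, value)
        for name in sorted(os.listdir(pdir)):
            with open(os.path.join(pdir, name), encoding="utf-8") as f:
                for line in f:
                    if line.strip():
                        rows.append(line.rstrip("\n").split(delim) + [value])
    return rows


with tempfile.TemporaryDirectory() as beauty:
    registered = []  # stand-in for the metastore's partition list

    # A file dropped in the table root belongs to no partition and is never read.
    put(os.path.join(beauty, "b.c"), "1\tjingtian\t35\n2\tbingbing\t35\n")
    rows_unregistered = scan(beauty, "nation", registered)

    # Loading with a partition spec creates the directory and registers it.
    china = partition_dir(beauty, "nation", "China")
    os.makedirs(china)
    put(os.path.join(china, "beauty.txt"), "1\tjingtian\t35\n2\tbingbing\t35\n")
    registered.append("China")

    # Registering the partition first, then putting a file into its directory.
    japan = partition_dir(beauty, "nation", "Japan")
    os.makedirs(japan)
    registered.append("Japan")
    put(os.path.join(japan, "beauty2.txt"), "5\tbdyjy\t32\n6\tjzmb\t35\n")

    rows_all = scan(beauty, "nation", registered)

print(rows_unregistered)             # []
print([r[-1] for r in rows_all])     # ['China', 'China', 'Japan', 'Japan']
```

The partition value is never stored in the data files; in the sketch, as in the query output above, it is appended from the directory name.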

2.8 Partitioned tables, part 2

create table sms (id bigint, connect string, area string) partitioned by (area string) row format delimited fields terminated by '\t';

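This statement fails: a column already declared in the table (here, area) cannot also be used as a partition column, short of modifying the Hive source code.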

2.9 Import MySQL table data into Hive with Sqoop:

Create the Hive tables:

create table trade_detail (id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by '\t';

create table user_info (id bigint, account string, name string, age int) row format delimited fields terminated by '\t';

Import the data into the Hive database:

sqoop import --connect jdbc:mysql://hadoop202:3306/hadoop --username root --password root --table trade_detail --hive-import --hive-overwrite --hive-database userdb --hive-table trade_detail --fields-terminated-by '\t'

sqoop import --connect jdbc:mysql://hadoop202:3306/hadoop --username root --password root --table user_info --hive-import --hive-overwrite --hive-database userdb --hive-table user_info --fields-terminated-by '\t'
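The two commands above differ only in the table name, so they can be generated rather than retyped. The following is a small Python sketch that only builds and prints the command lines; it does not run Sqoop. The function name sqoop_import_cmd and its parameter names are hypothetical, while the flags and default values are copied from the commands above.

```python
import shlex


def sqoop_import_cmd(table, host="hadoop202", mysql_db="hadoop",
                     hive_db="userdb", user="root", password="root"):
    """Build the argument list for one import, using only the flags shown above."""
    return [
        "sqoop", "import",
        "--connect", f"jdbc:mysql://{host}:3306/{mysql_db}",
        "--username", user,
        "--password", password,
        "--table", table,
        "--hive-import",
        "--hive-overwrite",
        "--hive-database", hive_db,
        "--hive-table", table,
        "--fields-terminated-by", "\\t",  # literal backslash-t, as on the command line
    ]


for table in ("trade_detail", "user_info"):
    # shlex.join quotes each argument for the shell where needed.
    print(shlex.join(sqoop_import_cmd(table)))
```

Keeping the arguments as a list means the same value could later be handed to subprocess.run without going through a shell.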

Join query:

select t.account, u.name, t.income, t.expenses, t.surplus from user_info u join (select account, sum(income) as income, sum(expenses) as expenses, sum(income-expenses) as surplus from trade_detail group by account) t on u.account = t.account;
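To make the query's logic easier to follow, here is the same computation written out in plain Python: first total income, expenses and surplus per account (the inner subquery), then keep only accounts that also appear in user_info (the inner join). The rows are made-up sample data, not values from the real tables.

```python
from collections import defaultdict

# Made-up rows: (id, account, income, expenses, time)
trade_detail = [
    (1, "a@x.com", 100.0, 30.0, "2017-10-01"),
    (2, "a@x.com", 50.0, 20.0, "2017-10-02"),
    (3, "b@x.com", 10.0, 40.0, "2017-10-01"),
]

# Made-up rows: (id, account, name, age)
user_info = [
    (1, "a@x.com", "Ann", 30),
    (2, "b@x.com", "Bob", 25),
    (3, "c@x.com", "Cid", 41),  # no trades, so dropped by the inner join
]

# Inner subquery: group by account and sum income, expenses, income - expenses.
totals = defaultdict(lambda: [0.0, 0.0, 0.0])
for _id, account, income, expenses, _time in trade_detail:
    t = totals[account]
    t[0] += income
    t[1] += expenses
    t[2] += income - expenses

# Join on account: one output row per user that has aggregated trades.
result = [
    (account, name, *totals[account])
    for _id, account, name, _age in user_info
    if account in totals
]

for row in result:
    print(row)
# ('a@x.com', 'Ann', 150.0, 50.0, 100.0)
# ('b@x.com', 'Bob', 10.0, 40.0, -30.0)
```

Aggregating before joining, as the query does, means each account is matched against user_info once instead of once per trade row.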

3. Configuration

3.1 Install MySQL

List previously installed MySQL packages

rpm -qa | grep mysql

Force-remove this package, ignoring dependencies

rpm -e mysql-libs-5.1.66-2.el6_3.i686 --nodeps

rpm -ivh MySQL-server-5.1.73-1.glibc23.i386.rpm

rpm -ivh MySQL-client-5.1.73-1.glibc23.i386.rpm

Run this command to configure MySQL

/usr/bin/mysql_secure_installation

Add Hive to the environment variables

In MySQL, grant privileges on the hive database:

GRANT ALL PRIVILEGES ON hive.* TO 'root'@'%' IDENTIFIED BY '123' WITH GRANT OPTION;

FLUSH PRIVILEGES;

Create two tables in Hive

create table trade_detail (id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by '\t';

create table user_info (id bigint, account string, name string, age int) row format delimited fields terminated by '\t';

Import the data from MySQL directly into Hive

sqoop import --connect jdbc:mysql://192.168.1.10:3306/itcast --username root --password 123 --table trade_detail --hive-import --hive-overwrite --hive-table trade_detail --fields-terminated-by '\t'

sqoop import --connect jdbc:mysql://192.168.1.10:3306/itcast --username root --password 123 --table user_info --hive-import --hive-overwrite --hive-table user_info --fields-terminated-by '\t'

Create a result table to store the output of the previous SQL query

create table result row format delimited fields terminated by '\t' as select t2.account, t2.name, t1.income, t1.expenses, t1.surplus from user_info t2 join (select account, sum(income) as income, sum(expenses) as expenses, sum(income-expenses) as surplus from trade_detail group by account) t1 on (t1.account = t2.account);

create table user (id int, name string) row format delimited fields terminated by '\t';

Load data from the local file system into Hive

load data local inpath '/root/user.txt' into table user;

Create an external table

create external table stubak (id int, name string) row format delimited fields terminated by '\t' location '/stubak';

Create a partitioned table

Regular vs. partitioned tables: use a partitioned table when large volumes of data keep being added.

create table book (id bigint, name string) partitioned by (pubdate string) row format delimited fields terminated by '\t';

Load data into a partitioned table

load data local inpath './book.txt' overwrite into table book partition (pubdate='2010-08-22');
