Hive complex data types: querying tables, plus the control expressions case when and if

-- Complex data types in Hive
-- Arrays
-- Given the following data
战狼2,吴京:吴刚:龙母,2017-08-16
三生三世十里桃花,刘亦菲:杨洋,2017-08-20
普罗米修斯,苍老师:小泽老师:波多老师,2017-09-17
-- Create a table that maps this data:

create table t_movie(movie_name string,actors array<string>,first_show date)
row format delimited fields terminated by ','
collection items terminated by ':';
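
The create table statement only describes the layout, so the sample rows still have to be loaded before any query returns data. A minimal sketch, assuming the three lines above were saved to a local file (the path /tmp/movie.dat is hypothetical, not from the original):

```sql
-- Hypothetical local path; point it at wherever the sample file actually lives.
load data local inpath '/tmp/movie.dat' into table t_movie;

-- Quick check that the actors column was parsed into an array.
select * from t_movie;
```

If the file is already on HDFS, dropping the local keyword makes Hive move it from the HDFS path instead.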

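-- Query an element by index (arrays are zero-based, so actors[1] is the second actor)
select movie_name,actors[1],first_show from t_movie;
+-------------------+-----------+-------------+
| movie_name        | _c1       | first_show  |
+-------------------+-----------+-------------+
| 战狼2             | 吴刚      | 2017-08-16  |
| 三生三世十里桃花  | 杨洋      | 2017-08-20  |
| 普罗米修斯        | 小泽老师  | 2017-09-17  |
+-------------------+-----------+-------------+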

-- Find the movies whose cast contains a given actor

select movie_name,actors,first_show
from t_movie
where array_contains(actors,'吴刚');

-- Count the actors in each movie

select movie_name,actors,first_show,size(actors) as actor_number
from t_movie;

-- Movie name, number of lead actors, and first show date

select movie_name,size(actors) as actor_number,first_show
from t_movie;

2. map type
-- Given the following data

1,zhangsan,father:xiaoming#mother:xiaohuang#brother:xiaoxu,28
2,lisi,father:mayun#mother:huangyi#brother:guanyu,22
3,wangwu,father:wangjianlin#mother:ruhua#brother:jingtian,29
4,mayun,father:mayongzhen#mother:angelababy,26

-- Create a table that maps this data

create table t_family(id int,name string,family_members map<string,string>,age int)
row format delimited fields terminated by ','
collection items terminated by '#'
map keys terminated by ':';

+-------------+---------------+----------------------------------------------------------------+--------------+
| t_family.id | t_family.name | t_family.family_members                                        | t_family.age |
+-------------+---------------+----------------------------------------------------------------+--------------+
| 1           | zhangsan      | {"father":"xiaoming","mother":"xiaohuang","brother":"xiaoxu"}  | 28           |
| 2           | lisi          | {"father":"mayun","mother":"huangyi","brother":"guanyu"}       | 22           |
| 3           | wangwu        | {"father":"wangjianlin","mother":"ruhua","brother":"jingtian"} | 29           |
| 4           | mayun         | {"father":"mayongzhen","mother":"angelababy"}                  | 26           |
+-------------+---------------+----------------------------------------------------------------+--------------+

-- Get each person's father

select id,name,family_members['father'],age
from t_family;

+-----+-----------+--------------+------+
| id  | name      | _c2          | age  |
+-----+-----------+--------------+------+
| 1   | zhangsan  | xiaoming     | 28   |
| 2   | lisi      | mayun        | 22   |
| 3   | wangwu    | wangjianlin  | 29   |
| 4   | mayun     | mayongzhen   | 26   |
+-----+-----------+--------------+------+

-- Get each person's father and sister (a key that is missing from the map yields NULL)

select id,name,family_members['father'],family_members['sister'],age
from t_family;

-- Get the kinds of relatives each person has

select id,name,map_keys(family_members) as relations,age
from t_family;

+-----+-----------+--------------------------------+------+
| id  | name      | relations                      | age  |
+-----+-----------+--------------------------------+------+
| 1   | zhangsan  | ["father","mother","brother"]  | 28   |
| 2   | lisi      | ["father","mother","brother"]  | 22   |
| 3   | wangwu    | ["father","mother","brother"]  | 29   |
| 4   | mayun     | ["father","mother"]            | 26   |
+-----+-----------+--------------------------------+------+

-- Get the names of each person's relatives

select id,name,map_values(family_members),age
from t_family;

-- Count each person's relatives
select id,name,size(family_members) as relations,age
from t_family;
+-----+-----------+------------+------+
| id  | name      | relations  | age  |
+-----+-----------+------------+------+
| 1   | zhangsan  | 3          | 28   |
| 2   | lisi      | 3          | 22   |
| 3   | wangwu    | 3          | 29   |
| 4   | mayun     | 2          | 26   |
+-----+-----------+------------+------+

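-- Option 1
-- Find everyone who has a brother, and who that brother is. Functions can be nested: map_keys first returns the map's keys as an array, then array_contains checks whether that array includes 'brother'.
select id,name,age,family_members['brother']
from t_family where array_contains(map_keys(family_members),'brother');
+-----+-----------+------+-----------+
| id  | name      | age  | _c3       |
+-----+-----------+------+-----------+
| 1   | zhangsan  | 28   | xiaoxu    |
| 2   | lisi      | 22   | guanyu    |
| 3   | wangwu    | 29   | jingtian  |
+-----+-----------+------+-----------+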

-- Option 2: subquery

select id,name,age,family_members['brother']
from
(select id,name,age,map_keys(family_members) as relations,
family_members
from t_family) tmp where array_contains(relations,'brother');
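
Because looking up a key that is not in the map returns NULL, the same question can also be answered without map_keys. A sketch of that variant follows; the column alias brother is introduced here for readability and is not in the original:

```sql
-- A missing key yields NULL, so a NULL check keeps only the people with a brother.
select id,name,age,family_members['brother'] as brother
from t_family
where family_members['brother'] is not null;
```

The two approaches differ in one edge case: a row whose map contains the key brother with a NULL value passes the array_contains filter but not this one.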

3. struct type
Data:
1,zhangsan,18:male:北京
2,lisi,28:female:重庆
3,wangwu,38:male:广州
4,赵六,26:female:深圳
5,钱七,17:male:南宁
6,王八,48:female:南京

-- Create a table that maps the data above
create table t_user(id int,name string,info struct<age:int,sex:string,addr:string>)
row format delimited fields terminated by ','
collection items terminated by ':';

-- Get each person's id, name, and address
select id,name,info.addr
from t_user;
+-----+-----------+-------+
| id  | name      | addr  |
+-----+-----------+-------+
| 1   | zhangsan  | 北京  |
| 2   | lisi      | 重庆  |
| 3   | wangwu    | 广州  |
| 4   | 赵六      | 深圳  |
| 5   | 钱七      | 南宁  |
| 6   | 王八      | 南京  |
+-----+-----------+-------+
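
Struct fields reached with the dot syntax behave like ordinary columns, so they can appear in a where clause as well as in the select list. A small sketch against the same t_user table (the filter values are chosen only for illustration):

```sql
-- Filter on two struct fields and return two others.
select id,name,info.age,info.addr
from t_user
where info.sex='female' and info.age>=18;
```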

4. split and explode
-- Split each sentence on spaces

0: jdbc:hive2://localhost:10000> select split(sentence,' ') from t_wc;
+--------------------------------------------+
| _c0                                        |
+--------------------------------------------+
| ["hello","tom","hello","jim"]              |
| ["hello","rose","hello","tom"]             |
| ["tom","love","rose","rose","love","jim"]  |
| ["jim","love","tom","love","is","what"]    |
| ["what","is","love","how","love"]          |
+--------------------------------------------+

-- explode breaks an array apart: one input row becomes multiple output rows, one per element
select explode(split(sentence,' ')) from t_wc;
+--------+
| col    |
+--------+
| hello  |
| tom    |
| hello  |
| jim    |
| hello  |
| rose   |
| hello  |
| tom    |
| tom    |
| love   |
| rose   |
| rose   |
| love   |
| jim    |
| jim    |
| love   |
| tom    |
| love   |
| is     |
| what   |
| what   |
| is     |
| love   |
| how    |
| love   |
+--------+

-- Compute the word count
select word,count(1) from
(select explode(split(sentence,' ')) as word from t_wc) tmp
group by word;

+--------+------+
| word   | _c1  |
+--------+------+
| hello  | 4    |
| how    | 1    |
| is     | 2    |
| jim    | 3    |
| love   | 6    |
| rose   | 3    |
| tom    | 4    |
| what   | 2    |
+--------+------+
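
The word count above wraps explode in a subquery. Hive's lateral view clause is another common way to write it, sketched below. The t_wc definition is an assumption (the original never shows it), and cnt and w are names introduced here:

```sql
-- Assumed definition: one sentence per row in a single string column.
create table if not exists t_wc(sentence string);

-- lateral view joins each input row with the rows explode() produces for it,
-- so the exploded words can be grouped directly without a subquery.
select word,count(1) as cnt
from t_wc lateral view explode(split(sentence,' ')) w as word
group by word;
```

A practical difference is that lateral view lets other columns of t_wc sit next to word in the select list, which a bare explode in the select list does not allow.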

5. Conditional functions
-- case when
+------------+--------------+------------------------------------------+
| t_user.id  | t_user.name  | t_user.info                              |
+------------+--------------+------------------------------------------+
| 1          | zhangsan     | {"age":18,"sex":"male","addr":"北京"}    |
| 2          | lisi         | {"age":28,"sex":"female","addr":"重庆"}  |
| 3          | wangwu       | {"age":38,"sex":"male","addr":"广州"}    |
| 4          | 赵六         | {"age":26,"sex":"female","addr":"深圳"}  |
| 5          | 钱七         | {"age":17,"sex":"male","addr":"南宁"}    |
| 6          | 王八         | {"age":48,"sex":"female","addr":"南京"}  |
+------------+--------------+------------------------------------------+

-- Get each user's id, name, and age group: under 30 shows 青年 (young), 30 up to but not including 40 shows 中年 (middle-aged), 40 and over shows 老年 (senior)
select id,name,
case
when info.age<30 then '青年'
when info.age>=30 and info.age<40 then '中年'
else '老年'
end
from t_user;

+-----+-----------+-------+
| id  | name      | _c2   |
+-----+-----------+-------+
| 1   | zhangsan  | 青年  |
| 2   | lisi      | 青年  |
| 3   | wangwu    | 中年  |
| 4   | 赵六      | 青年  |
| 5   | 钱七      | 青年  |
| 6   | 王八      | 老年  |
+-----+-----------+-------+
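
A case when expression can be grouped on as well as selected, which turns the labelling above into a per-group count. A sketch using the same thresholds follows; age_group and cnt are names introduced here. Branches are tested in order, so the second one no longer needs the explicit >=30 check:

```sql
-- Count users per age group; the case expression is repeated in group by.
select
  case
    when info.age<30 then '青年'
    when info.age<40 then '中年'
    else '老年'
  end as age_group,
  count(1) as cnt
from t_user
group by
  case
    when info.age<30 then '青年'
    when info.age<40 then '中年'
    else '老年'
  end;
```

With the six sample rows this should count four users as 青年, one as 中年, and one as 老年, matching the labels in the table above.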

-- if

+--------------------+--------------------------------------------+--------------------+
| t_movie.movie_name | t_movie.actors                             | t_movie.first_show |
+--------------------+--------------------------------------------+--------------------+
| 战狼2              | ["吴京","吴刚","龙母"]                     | 2017-08-16         |
| 三生三世十里桃花   | ["刘亦菲","杨洋"]                          | 2017-08-20         |
| 普罗米修斯         | ["苍老师","小泽老师","波多老师"]           | 2017-09-17         |
| 战狼2              | ["吴京","吴刚","龙母"]                     | 2017-08-16         |
| 三生三世十里桃花   | ["刘亦菲","杨洋"]                          | 2017-08-20         |
| 美女与野兽         | ["加藤鹰","苍老师","小泽老师","波多老师"]  | 2017-09-17         |
+--------------------+--------------------------------------------+--------------------+

-- Requirement: list the movie info, showing 好片 (good movie) when 吴刚 is in the cast and 烂片 (bad movie) otherwise
select movie_name,actors,first_show,
if(array_contains(actors,'吴刚'),'好片','烂片')
from t_movie;

±------------±-----------------------------±------------±-----+
| movie_name | actors | first_show | _c3 |
±------------±-----------------------------±------------±-----+
| 战狼2 | [“吴京”,“吴刚”,“龙母”] | 2017-08-16 | 好片 |
| 三生三世十里桃花 | [“刘亦菲”,“杨洋”] | 2017-08-20 | 烂片 |
| 普罗米修斯 | [“苍老师”,“小泽老师”,“波多老师”] | 2017-09-17 | 烂片 |
| 战狼2 | [“吴京”,“吴刚”,“龙母”] | 2017-08-16 | 好片 |
| 三生三世十里桃花 | [“刘亦菲”,“杨洋”] | 2017-08-20 | 烂片 |
| 美女与野兽 | [“加藤鹰”,“苍老师”,“小泽老师”,“波多老师”] | 2017-09-17 | 烂片 |
±------------±-----------------------------±------------±-----+
