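HiveQL: Data Manipulation

Contents

1. Loading Data into Managed Tables
2. Inserting Data into Tables from Queries
3. Dynamic Partition Inserts
4. Creating a Table and Loading Data with a Single Query
5. Exporting Data

Notes from studying Programming Hive (《Hive编程指南》).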

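1. Loading Data into Managed Tables

hive (default)> load data local inpath "/home/hadoop/workspace/student.txt"
              > overwrite into table student1;

For a partitioned table, add a partition (key1 = v1, key2 = v2, …) clause after the table name.

With local: the file is copied from the local file system into HDFS.
Without local: the file is already in HDFS and is moved from its current path to the new HDFS path.

With overwrite: any data already in the target directory is deleted first.
Without overwrite: the new files are added to the target directory and the existing data is kept.

The path given after inpath must not contain any subdirectories.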
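The partition clause and the local keyword can be combined as in the following sketch. The table student_part, its partition columns, and the HDFS staging path are assumptions made for illustration; only the local file path comes from the session above.

```sql
-- Hypothetical partitioned table; the name student_part and the partition
-- columns are assumptions made for this example.
create table if not exists student_part (
  id   int,
  name string
)
partitioned by (country string, sex string)
row format delimited fields terminated by '\t';

-- With local: copy a file from the local file system into one partition,
-- replacing whatever that partition already holds.
load data local inpath '/home/hadoop/workspace/student.txt'
overwrite into table student_part
partition (country = 'china', sex = 'male');

-- Without local: move a file that is already in HDFS (assumed path) and
-- append it to the partition's existing files.
load data inpath '/user/hadoop/staging/student.txt'
into table student_part
partition (country = 'china', sex = 'female');
```

Hive does not parse or validate the file during load data; it only places the file in the partition directory, so the file must already match the table's row format.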

2. Inserting Data into Tables from Queries

hadoop@dblab-VirtualBox:~/workspace$ cat stu.txt
1   michael male    china
2   ming    male    china1
3   haha    female  china
4   huahua  female  china1

Create the table and load the data:

hive (default)> create table stu(
              > id int,
              > name string,
              > sex string,
              > country string)
              > row format delimited fields terminated by '\t';
hive (default)> load data local inpath '/home/hadoop/workspace/stu.txt'
              > into table stu;

Use a select statement to populate another table:

hive (default)> create table employee(
              > name string,
              > country string)
              > row format delimited fields terminated by '\t';

hive (default)> from stu s
              > insert overwrite table employee
              > select s.name, s.country where s.id%2=1;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20210408224138_1df23614-7945-40c0-9a4d-df88e4f58ea1
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2021-04-08 22:41:40,081 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_local1437521177_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://localhost:9000/user/hive/warehouse/employee/.hive-staging_hive_2021-04-08_22-41-38_345_1863326332876590299-1/-ext-10000
Loading data to table default.employee
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 83 HDFS Write: 180 SUCCESS
Total MapReduce CPU Time Spent: 0 msec

hive (default)> select * from employee;
OK
michael china
haha    china
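The statement above puts the from clause first. As a sketch, the same insert can also be written in the more familiar insert ... select ... from order. It reuses the stu and employee tables from this section; the second statement is an added illustration of into versus overwrite and is not part of the original session.

```sql
-- Same effect as the from-first form: replace the contents of employee
-- with the name and country of every odd-id row in stu.
insert overwrite table employee
select s.name, s.country
from stu s
where s.id % 2 = 1;

-- With "into" instead of "overwrite", the selected rows are appended
-- and the rows already in employee are kept.
insert into table employee
select s.name, s.country
from stu s
where s.id % 2 = 0;
```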

Inserting into multiple tables:

hive (default)> from stu s
              > insert into table employee
              > select s.name, s.country where s.sex='female'
              > insert into table employee1
              > select s.name, s.country where s.sex='male';
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20210408230623_bc69bccf-348e-467d-b88e-498664f27017
Total jobs = 5
Launching Job 1 out of 5
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2021-04-08 23:06:24,405 Stage-2 map = 100%,  reduce = 0%
Ended Job = job_local2065691620_0003
Stage-5 is selected by condition resolver.
Stage-4 is filtered out by condition resolver.
Stage-6 is filtered out by condition resolver.
Stage-11 is selected by condition resolver.
Stage-10 is filtered out by condition resolver.
Stage-12 is filtered out by condition resolver.
Moving data to directory hdfs://localhost:9000/user/hive/warehouse/employee/.hive-staging_hive_2021-04-08_23-06-23_001_7974131043339100692-1/-ext-10000
Moving data to directory hdfs://localhost:9000/user/hive/warehouse/employee1/.hive-staging_hive_2021-04-08_23-06-23_001_7974131043339100692-1/-ext-10002
Loading data to table default.employee
Loading data to table default.employee1
MapReduce Jobs Launched:
Stage-Stage-2:  HDFS Read: 470 HDFS Write: 474 SUCCESS
Total MapReduce CPU Time Spent: 0 msec

hive (default)> select * from employee;
ming    china1
huahua  china1
haha    china
huahua  china1

hive (default)> select * from employee1;
michael china
ming    china1
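The session does not show how employee1 was created. The sketch below assumes it has the same columns and row format as employee and creates it with create table ... like before running the multi-table insert. Because insert into appends, each target table keeps the rows it held before the statement ran, in addition to the newly selected ones.

```sql
-- Assumption: employee1 has the same schema and row format as employee.
create table if not exists employee1 like employee;

-- One scan of stu feeds both inserts; each insert clause has its own
-- select list and filter.
from stu s
insert into table employee
  select s.name, s.country where s.sex = 'female'
insert into table employee1
  select s.name, s.country where s.sex = 'male';
```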

3. Dynamic Partition Inserts

hive (default)> from stu s
              > insert overwrite table employee2
              > partition (country, sex)
              > select s.id, s.name, s.country, s.sex;
hive (default)> select * from employee2;
OK
3   haha    china   female
1   michael china   male
4   huahua  china1  female
2   ming    china1  male
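The definition of employee2 and the session settings are not shown above. The following is a minimal sketch of what the insert needs, assuming employee2 is partitioned by country and sex with the column types shown here. With no static partition value in the partition clause, Hive's default strict mode rejects the insert, so the mode is switched to nonstrict.

```sql
-- Assumed definition of employee2 (not shown in the original session).
-- The partition columns must correspond to the last columns of the
-- select list, in the same order.
create table if not exists employee2 (
  id   int,
  name string
)
partitioned by (country string, sex string);

-- Enable dynamic partitioning and allow every partition column
-- to be determined from the data.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

from stu s
insert overwrite table employee2
partition (country, sex)
select s.id, s.name, s.country, s.sex;

-- List the partitions created by the insert.
show partitions employee2;
```

Hive matches dynamic partition values by position, not by name, which is why country and sex come last in the select list.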

4. Creating a Table and Loading Data with a Single Query

The table's schema is derived from the select clause.

hive (default)> create table employee3
              > as select id, name from stu
              > where country='china';
hive (default)> select * from employee3;
1   michael
3   haha

This feature cannot be used with external tables (their data is not loaded into Hive; it stays in an external location).
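As a sketch of how the select clause shapes the new table, the variant below renames the columns with aliases and sets an explicit row format. The table name employee3_tsv and the aliases emp_id and emp_name are hypothetical.

```sql
-- Hypothetical table name employee3_tsv. The column names and types
-- come from the select list; the aliases become the column names.
create table employee3_tsv
row format delimited fields terminated by '\t'
as
select id as emp_id, name as emp_name
from stu
where country = 'china';

-- Inspect the schema that was generated from the select.
describe employee3_tsv;
```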

5. Exporting Data

hive (default)> from stu s
              > insert overwrite local directory '/tmp/employee'
              > select s.id, s.name, s.sex
              > where country='china';

You can write to several directories in the same statement by repeating the insert clause.

hive (default)> ! ls /tmp/employee -r;
000000_0
hive (default)> ! cat /tmp/employee/000000_0;
1michaelmale
3hahafemale
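The fields in the exported file look fused because Hive's default field delimiter is the non-printing Ctrl-A character (\001), which the terminal does not display. The sketch below sets a tab delimiter on the export, which Hive supports from version 0.11 on, and repeats the insert clause to write two directories in one pass. The second directory, /tmp/employee_china1, is a hypothetical path chosen for this example.

```sql
-- Export with an explicit tab delimiter, writing two local directories
-- from a single scan of stu.
-- /tmp/employee_china1 is a hypothetical path chosen for this example.
from stu s
insert overwrite local directory '/tmp/employee'
  row format delimited fields terminated by '\t'
  select s.id, s.name, s.sex where s.country = 'china'
insert overwrite local directory '/tmp/employee_china1'
  row format delimited fields terminated by '\t'
  select s.id, s.name, s.sex where s.country = 'china1';
```

Note that insert overwrite directory replaces everything already in the target directory, so it should point at a directory used only for the export.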
