Data types for transferring data between Hive and MySQL

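1. Primitive data types

A value of the newer TIMESTAMP data type can be given as:

• An integer: the number of seconds since the Unix epoch (midnight UTC, January 1, 1970)

• A floating-point number: the number of seconds since the Unix epoch, with nanosecond precision (9 digits after the decimal point)

• A string: the time-string format defined by the JDBC convention, YYYY-MM-DD hh:mm:ss.fffffffff

The BINARY data type stores variable-length binary data.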
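The string and integer forms can be tried out directly in a query. The sketch below is an illustration with made-up literal values, not part of the original walkthrough; it assumes a Hive version that accepts SELECT without a FROM clause (0.13 or later).

```sql
-- Sketch only: the literal values are made up for illustration.

-- String form: a JDBC-style time string cast to TIMESTAMP
-- (up to 9 fractional digits, i.e. nanoseconds).
SELECT CAST('2018-04-21 10:30:00.123456789' AS TIMESTAMP);

-- Integer form: seconds since the Unix epoch, rendered as a time string.
-- from_unixtime() formats the result in the session time zone.
SELECT from_unixtime(1524306600);

-- Reverse direction: a time string converted to epoch seconds.
SELECT unix_timestamp('2018-04-21 10:30:00');
```

The results of the last two statements depend on the time zone of the Hive session, so no expected output is shown here.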

2. Complex data types

3. Example using the data types

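-- Create the employee table using the default delimiters
-- (the element types of the complex columns are missing from the source; the ones below are assumed)
CREATE TABLE employee(
  name STRING,
  salary FLOAT,
  leader ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
);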

4. Column delimiters

Encoding of text-file data in HiveQL:

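-- (the element types of the complex columns are missing from the source; the ones below are assumed)
CREATE TABLE employee(
  name STRING,
  salary FLOAT,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
COLLECTION ITEMS TERMINATED BY '\002'
MAP KEYS TERMINATED BY '\003'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;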

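• ROW FORMAT DELIMITED introduces the delimiter clauses that tell Hive how to split each record when it reads the table's data files.

• FIELDS TERMINATED BY '\001': \001 is the octal code for ^A (Ctrl-A). This clause tells Hive to use the ^A character as the column (field) delimiter.

• COLLECTION ITEMS TERMINATED BY '\002': \002 is the octal code for ^B. This clause tells Hive to use the ^B character as the delimiter between the elements of a collection.

• MAP KEYS TERMINATED BY '\003': \003 is the octal code for ^C. This clause tells Hive to use the ^C character as the delimiter between a map key and its value.

• STORED AS TEXTFILE is a separate clause and does not require the ROW FORMAT DELIMITED keywords. LINES TERMINATED BY '\n', in contrast, is written as part of the ROW FORMAT DELIMITED clause.

• Hive currently supports only '\n' for LINES TERMINATED BY, so the delimiter between lines can only be '\n'.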
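The delimiters only matter when Hive parses the data files; queries address the parsed values directly. A minimal sketch, assuming subordinates is an ARRAY<STRING>, deductions is a MAP<STRING, FLOAT>, and address is a STRUCT with a city field. The map key 'Federal Taxes' is a made-up example value.

```sql
-- Sketch only: assumes employee was created with
--   subordinates ARRAY<STRING>, deductions MAP<STRING, FLOAT>,
--   and an address STRUCT that has a city field.
-- subordinates[0]             : array element by 0-based index
-- deductions['Federal Taxes'] : map value by key (hypothetical key)
-- address.city                : struct field by dot notation
SELECT
  name,
  subordinates[0] AS first_subordinate,
  deductions['Federal Taxes'] AS federal_tax,
  address.city AS city
FROM employee;
```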

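Basic Hive commands

1. Creating a database:

Creating a database essentially creates a directory on HDFS. Use comment to add a description of the database; the description goes in quotes. Database properties follow the description and are added with with dbproperties: the properties go inside parentheses, each property name and value is quoted and joined with an equals sign, and multiple properties are separated by commas.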

-- Create a database named myhive with a description and properties
create database myhive comment 'this is myhive db'
with dbproperties ('author'='me','date'='2018-4-21');

-- View the properties
describe database extended myhive;

-- Add a new property to the existing database
alter database myhive set dbproperties ('id'='1');

-- Switch to the database
use myhive;

-- Drop the database
drop database myhive;
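A few related statements make these steps easier to repeat. This is a sketch rather than part of the original walkthrough. Note that a plain drop database is rejected while the database still contains tables, which is what the cascade keyword is for.

```sql
-- Create the database only if it does not already exist
create database if not exists myhive;

-- List all databases to confirm it is there
show databases;

-- Drop the database together with any tables it still contains
drop database if exists myhive cascade;
```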

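2. Creating tables

By default a table is created in the current database (default is Hive's default database). Creating a table essentially also creates a directory on HDFS.

================== Exercise: using array, loading local data, comparing Hive with MySQL ========================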

-- Create table t_array, which maps the data file array.txt
-- (the array element type is missing from the source; int is assumed)
create table if not exists t_array(
  id int comment 'this is id',
  score array<int>
)
comment 'this is my table'
row format delimited fields terminated by ','
collection items terminated by '|'
tblproperties ('id'='11','author'='me');

-- Load the data file array.txt from the local file system
load data local inpath '/testdata/array.txt' into table t_array;

-- Query the data in the table
select * from t_array;

-- Query the first score for id=1
select score[0] from t_array where id=1;

-- Query the number of scores for id=2
select size(score) from t_array where id=2;

-- Query the total number of rows
select count(*) from t_array;

-- Append array1.txt from the local file system to this table
load data local inpath '/testdata/array1.txt' into table t_array;

-- Append test.txt from the local file system to this table
load data local inpath '/testdata/test.txt' into table t_array;

-- Load array.txt from the local file system into t_array, overwriting the existing data
load data local inpath '/testdata/array.txt' overwrite into table t_array;
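The contents of array.txt are not shown in the source. Given the delimiters declared for t_array, each line would hold an id, a comma, and then the scores separated by '|', for example a hypothetical line such as 1,90|85|77. Under that assumption, and assuming score is an array of int, the array can also be searched or flattened into one row per score:

```sql
-- Sketch only: assumes score is array<int> and that rows look like
-- 1,90|85|77 (made-up data).

-- Rows whose score array contains the value 90
select id from t_array where array_contains(score, 90);

-- One output row per array element, using a lateral view over explode()
select id, s
from t_array
lateral view explode(score) exploded as s;
```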

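==================== Exercise: using map, viewing a table's create statement, specifying the data location when creating a table ===================

-- Create table t_map, which maps the data file map.txt
-- (the map key and value types are missing from the source; string keys and int values are assumed)
create table if not exists t_map(
  id int,
  score map<string,int>
)
row format delimited fields terminated by ','
collection items terminated by '|'
map keys terminated by ':'
stored as textfile;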

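-- Load data from HDFS; map.txt is moved out of its original HDFS location into the table's directory
load data inpath '/testdata/map.txt' into table t_map;

-- Query the math score for id=1
select score['math'] from t_map where id=1;

-- Query how many subjects each person took
select size(score) from t_map;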
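Besides looking up a single key, the keys and values of a map column can be listed as arrays. A small sketch using Hive's built-in map_keys and map_values functions on the t_map table; the column aliases are chosen here for illustration.

```sql
-- Subjects (map keys) and scores (map values) for every row
select id, map_keys(score) as subjects, map_values(score) as scores
from t_map;
```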

-- View the statement that created the table
-- (in the output below, the map's key and value types are missing from the source and assumed)
show create table t_map;

CREATE TABLE `t_map`(
  `id` int,
  `score` map<string,int>)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  COLLECTION ITEMS TERMINATED BY '|'
  MAP KEYS TERMINATED BY ':'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://linux5:8020/user/hive/warehouse/t_map'
;

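-- Create a table and specify the location of its data at the same time
-- (the map key and value types are missing from the source; string keys and int values are assumed)
create table if not exists t_map2(
  id int,
  score map<string,int>
)
row format delimited fields terminated by ','
collection items terminated by '|'
map keys terminated by ':'
stored as textfile
location '/test';

-- Drop a table
drop table test2;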

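==================== Exercise: using struct, creating external tables, summarizing the differences between managed (internal) and external tables =====================

-- Create table t_struct, which maps the data file struct.txt (an external table, created with the external keyword and an explicit data location)
-- (the struct fields are missing from the source; only score is implied by the query on grade.score, and the course field and both types are assumed)
create external table if not exists t_struct(
  id int,
  grade struct<course:string,score:int>
)
row format delimited fields terminated by ','
collection items terminated by '|'
location '/external';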

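-- View the rows with score > 90
select * from t_struct where grade.score>90;

-- Create external table t_struct1
-- (the struct fields are missing from the source; course and score are assumed)
create external table if not exists t_struct1(
  id int,
  grade struct<course:string,score:int>
)
row format delimited fields terminated by ','
collection items terminated by '|';

-- Append data with insert into
insert into table t_struct1 select * from t_struct;

-- Drop the table: only the metadata is deleted; the data files remain on HDFS
drop table t_struct;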

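3. Loading data into Hive tables:

Loading places the data file in the directory of the target table: a local file is copied there, while a file that is already on HDFS is moved.

-- Load from the local file system; the data is copied into the table's HDFS directory
--   Append
load data local inpath 'local data path' into table tablename;
--   Overwrite
load data local inpath 'local data path' overwrite into table tablename;

-- Load from HDFS; the data is moved into the table's HDFS directory
--   Append
load data inpath 'HDFS data path' into table tablename;
--   Overwrite
load data inpath 'HDFS data path' overwrite into table tablename;

-- Insert data into a table from a query
--   Append
insert into table table1 select * from table2;
--   Overwrite
insert overwrite table table1 select * from table2;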

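4. Managed (internal) and external tables

Managed table: when you create a table in Hive, Hive manages the data by default; that is, Hive moves the data into its warehouse directory.

External table: the user controls the creation and deletion of the data. The location of the external data is normally specified when the table is created. With the EXTERNAL keyword, Hive knows that it does not manage the data itself, so it does not move the data into its warehouse directory. In fact, it does not even check whether the external location exists when the table is defined. This is a very useful feature, because it means you can defer creating the data until after the table has been created.

Difference: dropping a managed table deletes the table's metadata together with its data. Dropping an external table leaves the data untouched: Hive deletes only the metadata and not the data files themselves.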
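The difference can be observed directly. The sketch below uses two throwaway tables; their names and the /tmp/demo_external path are made up for illustration. describe formatted reports each table's type (MANAGED_TABLE or EXTERNAL_TABLE) and the location of its data.

```sql
-- Hypothetical tables, used only to illustrate the difference
create table demo_managed (id int);
create external table demo_external (id int) location '/tmp/demo_external';

-- The "Table Type" row shows MANAGED_TABLE or EXTERNAL_TABLE,
-- and the "Location" row shows where the data lives
describe formatted demo_managed;
describe formatted demo_external;

-- Deletes the metadata and the table's data directory in the warehouse
drop table demo_managed;

-- Deletes the metadata only; /tmp/demo_external is left on HDFS
drop table demo_external;
```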

5. Modifying table properties

-- Create table log2; the table-level comment clause adds a description
CREATE external TABLE log2(
  id string COMMENT 'this is id column',
  phonenumber bigint,
  mac string,
  ip string,
  url string,
  status1 string,
  status2 string,
  up int,
  down int,
  code int,
  dt String
)
COMMENT 'this is log table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
LINES TERMINATED BY '\n'
stored as textfile;

-- Load the data
load data local inpath '/home/data.log.txt' into table log2;

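Renaming a table: rename to

alter table old_name rename to new_name

-- The table created above is log2 and the statements that follow use log4
alter table log2 rename to log4;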

Renaming a column: change column

alter table table_name change column old_column_name new_column_name column_type [comment 'description'];

-- Rename a column
alter table log4 change column ip myip String;

-- Rename a column and add a column comment
alter table log4 change column myip ip String comment 'this is myip';

-- Use the after keyword to place the changed column after another column
alter table log4 change column myip ip String comment 'this is myip' after code;

-- Use the first keyword to move the changed column to the first position
alter table log4 change column ip myip int comment 'this is myip' first;

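Adding columns: add columns

-- Add columns with add columns followed by parentheses containing the new columns and their optional comments; separate multiple columns with commas
alter table log4 add columns(
  x int comment 'this x',
  y int
);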

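Dropping columns:

-- Drop columns with replace columns. The parenthesized list replaces the table's entire column list, so it names the columns to keep, separated by commas; every column left out is dropped.

-- Keeps only x and y and drops all other columns
alter table log4 replace columns(x int,y int);

-- Lists every column except x and y, so x and y are dropped
alter table log4 replace columns(
  myip int,
  id string,
  phonenumber bigint,
  mac string,
  url string,
  status1 string,
  status2 string,
  up int,
  down int,
  code int,
  dt string
);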
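Column changes of this kind rewrite the column list in the metastore and leave the data files as they are, so it is worth checking the resulting schema after each statement. A minimal check on the log4 table:

```sql
-- Show the current column names, types and comments of log4
describe log4;
```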

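Converting between managed and external tables:

-- Managed -> external
alter table log4 set tblproperties(
  'EXTERNAL' = 'TRUE'
);

-- External -> managed
alter table log4 set tblproperties(
  'EXTERNAL' = 'false'
);

-- External -> managed, with the value written in upper case
alter table log4 set tblproperties(
  'EXTERNAL' = 'FALSE'
);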
