Differences between the GP database and MySQL: the GP (Greenplum) database (creating tables and partitions)

1. Create a database:

create database db_name;

2. Drop a database:

drop database db_name;

3. Create a table:

create table table_name(
    id integer,
    name text,
    price numeric  -- high-precision decimal type, equivalent to MySQL's decimal
);

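3-1. Column-level constraints when creating a GP table

create table table_name(
    id integer primary key,          -- primary key constraint
    name text not null,              -- not-null constraint
    price numeric check(price>0),    -- check constraint
    type integer unique              -- unique constraint
);

[Note]: A primary key or unique key cannot coexist with random distribution. GP also requires every primary key or unique constraint to contain all distribution key columns, so the statement above only illustrates the syntax of each constraint: a real table would normally keep either the primary key on id or the unique constraint on type, not both.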

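3-2. Declaring a table's distribution policy

The distribution key in GP spreads rows evenly across the storage nodes (segments) so that queries get the full benefit of parallel execution. GP supports two distribution policies: hash distribution and random distribution.

Keyword for hash distribution: distributed by (column_name)

Keyword for random distribution: distributed randomly

When you create a table or alter its definition, use a distributed clause to set the distribution policy explicitly so that data is stored evenly across the segments. If the clause is omitted, GP picks a default for you (the primary key if there is one, otherwise typically the first eligible column).

If the table has a primary key, the distribution key must consist of primary key columns; otherwise the table cannot be created.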

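(1) Declaring hash distribution

create table table_name(
    id integer primary key,          -- primary key constraint
    name text not null,              -- not-null constraint
    price numeric check(price>0),    -- check constraint
    type integer                     -- no unique constraint here: it would have to include the distribution key (id)
)distributed by(id);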

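(2) Declaring random distribution

create table table_name(
    id integer,                      -- a primary key constraint can no longer be declared here
    name text not null,              -- not-null constraint
    price numeric check(price>0),    -- check constraint
    type integer                     -- a unique constraint can no longer be declared here
)distributed randomly;               -- random distribution

[Note]: A primary key or unique key cannot coexist with random distribution.

Declaring one anyway fails with: ERROR: PRIMARY KEY and DISTRIBUTED RANDOMLY are incompatible

[Remark]: Geometric data types and user-defined data types are not suitable as GP distribution keys. If no column can guarantee an even spread of the data, use random distribution.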
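
Whether a chosen key really spreads rows evenly can be checked once data has been loaded. A minimal sketch, assuming a hypothetical table named orders that already holds rows; gp_segment_id is the system column GP exposes on every table to identify the segment that stores each row:

```sql
-- Count rows per segment; roughly equal counts indicate an even distribution.
-- "orders" is a hypothetical table name used only for illustration.
select gp_segment_id, count(*) as row_count
from orders
group by gp_segment_id
order by gp_segment_id;
```

If a few segments hold far more rows than the rest, a different distribution key or random distribution may be worth trying.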

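4. Characteristics of partitioned tables

(1) Partitioning a table actually creates one parent table and multiple child tables.

(2) Each partition is created with its own distinct check constraint.

(3) Any change to the partition structure or the table structure must go through the parent table, using alter table with a partition clause.

To check whether a table is partitioned:

select count(*) from pg_partition where parrelid='test_schema.table_name'::regclass;

Result: a non-zero count means the table is partitioned; zero means it is not.

(4) Limits of selective partition scanning

The query planner can scan partitions selectively only when the predicate uses immutable comparison operators: = , < , > , <= , >= , <>

The query planner does not recognize volatile functions when deciding which partitions to scan.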
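
The difference can be observed with explain. A sketch, assuming a table test that is range-partitioned on a date column c2, like the range example in this article; current_date is a stable function and clock_timestamp() a volatile one in PostgreSQL-derived systems, and the exact plan text depends on the GP version and optimizer:

```sql
-- Constant predicate with an immutable operator: only the matching partition should appear in the plan.
explain select * from test where c2 = date '2019-09-01';

-- Stable function: still eligible for selective scanning.
explain select * from test where c2 >= current_date;

-- Volatile function: partitions cannot be ruled out in advance, so expect all of them in the plan.
explain select * from test where c2 >= clock_timestamp()::date;
```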

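(5) Creating and managing partitioned tables

Partition types and when to use them:

range: a sequence of ranges, such as dates, numbers, or prices

list: a list of discrete values, such as product names

[Note]: A primary key or unique key must contain all of the table's partition key columns.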
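
As an illustration of that rule, here is a sketch with a hypothetical table sales_pk whose primary key covers both the distribution key (id) and the partition key (sale_date); every name in it is made up for the example:

```sql
create table sales_pk(
    id integer,
    sale_date date,
    amount numeric,
    primary key (id, sale_date)   -- contains the distribution key id and the partition key sale_date
)distributed by(id)
partition by range(sale_date)
(
    start (date '2019-09-01') inclusive
    end (date '2019-09-03') exclusive
    every (interval '1 day')
);
```

Declaring primary key (id) alone would be rejected, because the key would not contain the partition column sale_date.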

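1) Define a range-partitioned table, using start, end, and every to set the partition increment so that GP creates the partitions automatically

create table test(
    c1 integer,
    c2 date
)distributed by(c1)
partition by range(c2)
(
    start (date '2019-09-01') inclusive
    end (date '2019-09-03') exclusive
    every (interval '1 day')
);

Result:

test (parent table)

test_1_prt_1

test_1_prt_2

Keywords:

start: the value at which partitioning begins

end: the value at which partitioning ends

inclusive: the boundary value itself belongs to the partition

exclusive: the boundary value itself does not belong to the partition

every: the step by which each partition's range advances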

2) Define a date-range partitioned table and name each partition individually

create table test(
    c1 integer,
    c2 date
)distributed by(c1)
partition by range(c2)
(
    partition one start (date '2019-09-01') inclusive,
    partition two start (date '2019-09-02') inclusive,
    partition thr start (date '2019-09-03') inclusive
                  end (date '2019-09-04') exclusive
);

Result:

test (parent table)

test_1_prt_one

test_1_prt_two

test_1_prt_thr

3) Define a numeric-range partitioned table

create table test2(
    id int,
    year int
)distributed by(id)
partition by range(year)
(
    start (2014)
    end (2016)
    every (1),
    default partition extra
);

Result:

test2 (parent table)

test2_1_prt_extra

test2_1_prt_2

test2_1_prt_3

4) Create a list-partitioned table

create table test2(
    id int,
    gender char(1)
)distributed by(id)
partition by list(gender)
(
    partition boys values('M'),
    partition girls values('F'),
    default partition thr
);

[Remark]: default partition partition_name defines a default partition. Rows that satisfy a partition's check constraint go into that partition; rows that fall outside every partition's check range go into the default partition.

Example:

insert into test2 values
(1,'M'),
(2,'M'),
(3,'F'),
(4,'K');

select * from test2_1_prt_boys;

Result:

1 M

2 M

select * from test2_1_prt_girls;

Result:

3 F

select * from test2_1_prt_thr;

Result: (K equals neither M nor F, so the row is stored in the default partition)

4 K

5) Define a multi-level partitioned table

Create a table with two partition levels: the first level partitions by the date column, the second level by a list of regions.

create table test2(
    id int,
    dates date,
    region text
)distributed by(id)
partition by range(dates)
subpartition by list(region)
subpartition template
(
    subpartition usa values('usa'),
    subpartition uk values('uk'),
    subpartition ch values('ch'),
    default subpartition otherRegions
)
(
    start (date '2019-09-20') inclusive
    end (date '2019-09-22') exclusive
    every (interval '1 day'),
    default partition otherDays
);

6) Check which child partitions are scanned

explain select * from test2;

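7) Exchanging a partition (the statement is listed as item 8 under maintaining partitioned tables)

8) Viewing the partition design

Use the pg_partitions view to inspect a partitioned table's design:

select partitionboundary, partitiontablename, partitionname, partitionlevel, partitionrank from pg_partitions;

Use pg_partition_templates to view the subpartition template:

select * from pg_partition_templates where tablename='test2';

Use pg_partition_columns to view the partition key:

select * from pg_partition_columns where tablename='test2';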

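(6) Maintaining partitioned tables

Maintenance covers adding new partitions, renaming, splitting, changing the subpartition template, and dropping partitions.

1) Add a new partition (if the table has a default partition, drop it first)

alter table test2 drop default partition;

alter table test2 add partition partition_name start('2019-09-20'::date) end('2019-09-30'::date);

Example partition name: p20200317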
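
Dropping the default partition also discards the rows stored in it. When those rows need to be kept, GP can instead carve a new partition out of the default one with split default partition. A sketch, assuming a hypothetical single-level table sales_by_day that is range-partitioned on a date column and has a default partition; the table name, partition name, and dates are illustrative:

```sql
-- Rows in the default partition that fall inside the new range move to p20190923;
-- all other rows stay in the default partition.
alter table sales_by_day split default partition
    start (date '2019-09-23') inclusive
    end (date '2019-09-24') exclusive
    into (partition p20190923, default partition);
```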

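1-1) Create a new empty partition (this is the declarative partition of syntax introduced in PostgreSQL 10; GP releases built on older PostgreSQL versions do not accept it):

create table table_name_y2008m02 partition of parent_table
    for values from ('2008-02-01') to ('2008-03-01')
    tablespace fasttablespace;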

2) Rename the table (this changes the parent table and therefore affects every partition)

alter table test2 rename to test22;

3) Rename a single partition (for('2019-09-20') takes a partition key value, which here identifies the partition table test22_1_prt_4)

alter table test22 rename partition for('2019-09-20') to change_par;

Result: test22_1_prt_4 is renamed to test22_1_prt_change_par

4) Drop a partition

alter table schema.test22 drop default partition;  -- drop the default partition

alter table schema.test22 drop partition if exists "partitionName";

5) Truncate a partition's data

alter table test22 truncate partition for(rank(1));

6) Change the subpartition template

alter table test22 set subpartition template
(
    subpartition usa values('usa'),
    subpartition africa values('africa'),
    default subpartition other
);

7) Split a partition (p20120105 and p20120106 are partition names)

alter table table_name split partition p20120105 at ('2012-01-06'::date) into
(partition p20120105, partition p20120106);

8) Exchange a partition (p20120102 is a partition name)

alter table table_name exchange partition p20120102 with table new_table_name;
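
Exchange is typically used to load data into a staging table and then swap it in as a partition. A sketch of that workflow, assuming a hypothetical partitioned table sales_by_day with a partition named p20120102; the staging table name and the CSV path are also illustrative:

```sql
-- 1. Create a staging table with the same columns and distribution as the partitioned table.
create table sales_stage (like sales_by_day);

-- 2. Load and validate the new data in the staging table (psql client-side copy).
\copy sales_stage from '/data/sales_20120102.csv' with delimiter ','

-- 3. Swap the staging table with the partition; the partition's previous rows end up in sales_stage.
alter table sales_by_day exchange partition p20120102 with table sales_stage;
```

The staging rows must satisfy the partition's check constraint, so it is worth validating them before step 3.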

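5. Data storage formats

Heap storage (heap): suited to small tables whose data changes frequently

Append-only storage (Append-Only, i.e. AO tables): suited to large tables that are usually bulk-loaded and then only queried

The default storage mode for a new table is heap.

Create a heap table:

create table test(id int)distributed by(id);

Create an AO table:

create table test(
    id int,
    name text not null,
    sex text not null check(sex in('male','female'))
)with (appendonly=true)
distributed by(id);

[Note]: AO tables do not support primary keys or unique constraints.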

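6. Creating tables quickly

(1) To create a table with exactly the same structure as an existing one, use create table like. Some special properties of the source table, such as compression and append-only (appendonly) storage, are not carried over.

If no distribution key is specified, the new table uses the same distribution key as the source table.

create table test2 (like test1) distributed by(id);

(Note: a table created with create table like contains no data, and the parentheses around like source_table are required.)

(2) To create a table from a query result, use create table as or select into

create table as and select into do the same job; select into has a simpler syntax but cannot set the distribution key by hand, so it always uses the default one.

Create table test2:

Option 1:

create table test2 as select id,name from test1 distributed by(id);

Option 2:

select id,name into test2 from test1;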

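7. Creating column-oriented tables

Choosing between column-oriented and row-oriented storage:

(1) If the table's data needs to be updated, choose row storage.

(2) If the table receives frequent inserts, choose row storage.

(3) If queries reference all or most of the table's columns in select and where, choose row storage.

(4) If queries aggregate a single column, with where or having predicates that return a small number of rows, choose column storage.

(5) Row storage is more efficient when many columns of a row are needed together or when rows are relatively small.

(6) Column storage performs better only for queries that touch a few columns of a wide table.

(7) Column storage compresses better.

(8) By default, tables are stored row-oriented.

(9) A column-oriented table must be an AO table, otherwise it cannot be created; use with(orientation=column) to request column storage.

Example:

create table test(
    id integer,
    name text not null
)with(orientation=column,appendonly=true)
distributed by(id);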

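8. Creating compressed tables

Table compression reduces the storage a table occupies. It is used for fact tables in a data warehouse whose data and structure are rarely modified. A compressed table must be an AO table.

GP offers two levels of compression: table-level and column-level.

Row storage: table-level compression; algorithms: ZLIB, QUICKLZ

Column storage: table-level and column-level compression; algorithms: RLE_TYPE, ZLIB, QUICKLZ

(1) Table-level compression

Functions for inspecting AO tables:

get_ao_distribution('table_name'): shows how an AO table's rows are distributed across segments

get_ao_compression_ratio('table_name'): shows an AO table's compression ratio

pg_total_relation_size('table_name'): shows the space a table occupies (usually combined with pg_size_pretty())

Distribution of AO table test1:

select get_ao_distribution('test1');

Compression ratio of AO table test1:

select get_ao_compression_ratio('test1');

Space used by AO table test1:

select pg_size_pretty(pg_total_relation_size('test1'));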

(2) Create a compressed table (table-level compression)

create table test1(
    id integer,
    name text not null
)with(appendonly=true,compresstype=zlib)
distributed by(id);

(3) Create a compressed table (column-level compression)

create table test1(
    id integer encoding(compresstype=zlib),
    name text not null encoding(compresstype=quicklz),
    sex text not null encoding(compresstype=none)
)with(appendonly=true,orientation=column)
distributed by(id);

[Note]: When a compressed table uses both table-level and column-level compression, the column-level settings override the table-level ones.
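
A sketch of that precedence, assuming a hypothetical table named events; the compression levels are arbitrary illustrative values:

```sql
create table events(
    id integer,
    payload text encoding(compresstype=zlib, compresslevel=5),  -- column-level setting wins for this column
    note text                                                   -- no encoding clause: uses the table-level setting
)with(appendonly=true, orientation=column, compresstype=zlib, compresslevel=1)
distributed by(id);
```

Here payload is compressed at level 5 while id and note fall back to the table-wide level 1.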

9. Loading data

Load 10,000,000 records into a database table:

\copy test22 from '/data/test.csv' with delimiter ','
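
If the file is a CSV with a header row, the csv and header options make copy parse quoted fields and skip the first line. A sketch that reuses the table and file path from the command above; that the file actually has a header row is an assumption:

```sql
-- Client-side load through psql; the header line is skipped instead of being loaded as data.
\copy test22 from '/data/test.csv' with csv header delimiter ','

-- Quick sanity check on the number of rows loaded.
select count(*) from test22;
```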
