标签

PostgreSQL , TPC-H , HTAP


背景

去年测的PostgreSQL 10 vs Deepgreen.

《TPC-H测试 - PostgreSQL 10 vs Deepgreen(Greenplum)》

PG每年发一个大版本,此次PostgreSQL 11即将release,对PostgreSQL 11 beta3 vs Deepgreen v18 测试的TPC-H结果如下。

事实证明PG的AP能力又有了质的飞越,OLAP已无限接近MPP数据库deepgreen的性能。同时拥有单实例百万级QPS的OLTP能力,应该是目前覆盖业务场景可以做到最广泛的数据库(PG的TAG:全栈数据库 无疑)。

测试 环境

ECS , 32核 , 本地SSD 8TB.

PostgreSQL 11 beta3

Deepgreen v18 (16 segments)

SF=10

deepgreen部署quick guide

http://vitessedata.com/products/deepgreen-db/download/

以os用户digoal为例, 部署deepgreen(6个segment)

# 部署目录
mkdir -p /data01/dp/dbfast0
mkdir -p /data01/dp/dbfast1
mkdir -p /data01/dp/dbfast2
mkdir -p /data02/dp/dbfast1
mkdir -p /data02/dp/dbfast2
mkdir -p /data03/dp/dbfast1
mkdir -p /data03/dp/dbfast2
chown -R digoal:digoal /data01/dp
chown -R digoal:digoal /data02/dp
chown -R digoal:digoal /data03/dp
su - digoalwget  https://s3.amazonaws.com/vitessedata/download/deepgreendb.18.08.rh7.x86_64.180823.bin   chmod  +x deepgreendb.18.08.rh7.x86_64.180823.bin./deepgreendb.18.08.rh7.x86_64.180823.binvi ./deepgreendb/greenplum_path.sh#追加
# 使用实际目录
export MASTER_DATA_DIRECTORY=/data01/dp/dbfast0/dg-1
export PGDATA=$MASTER_DATA_DIRECTORY
export PGHOST=127.0.0.1
export PGPORT=15432
# 使用os用户名
export PGUSER=digoal
export PGPASSWORD=123
export PGDATABASE=postgres. ./deepgreendb/greenplum_path.shecho "`hostname`" > hostfilegpssh-exkeys -h localhostgpssh-exkeys -h `hostname`cat > cluster.conf <<HEREHEREARRAY_NAME="mpp1 cluster"
CLUSTER_NAME="mpp1 cluster"
MACHINE_LIST_FILE=hostfile
SEG_PREFIX=dg
DATABASE_PREFIX=dg
PORT_BASE=25432
declare -a DATA_DIRECTORY=(/data01/dp/dbfast1 /data01/dp/dbfast2 /data02/dp/dbfast1 /data02/dp/dbfast2 /data03/dp/dbfast1 /data03/dp/dbfast2)
MASTER_HOSTNAME=`hostname`
MASTER_DIRECTORY=/data01/dp/dbfast0
MASTER_PORT=15432
IP_ALLOW=0.0.0.0/0
TRUSTED_SHELL=/usr/bin/ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
export MASTER_DATA_DIRECTORY
export TRUSTED_SHELL
DEFAULT_QD_MAX_CONNECT=25
QE_CONNECT_FACTOR=5HEREHEREgpinitsystem -c cluster.conf -h hostfile

测试过程

PG 11(未使用分区表), DP v18(使用分区表,1个月1分区).

1、对所有列加索引

《PostgreSQL 快速给指定表每个字段创建索引》

do language plpgsql $$
declare    tables name[] := array['customer','lineitem','nation','orders','part','partsupp','region','supplier'];     n name;     x name;     i int;    sql text;    ts1 timestamp;    ts2 timestamp;
begin    set max_parallel_maintenance_workers=24;   -- 创建索引时的并行度    set max_parallel_workers=32;    set max_parallel_workers_per_gather=32;    set maintenance_work_mem='4GB';    set min_parallel_table_scan_size=0;    set min_parallel_index_scan_size=0;    set parallel_setup_cost=0;    set parallel_tuple_cost=0;    foreach n in array tables loop    i := 1;      for x in select attname from pg_attribute where attrelid=n::regclass and attnum>=1 and not attisdropped loop    sql := format('create index IF NOT EXISTS idx_%s__%s on %s (%s) tablespace tbs1', n, i, n, x);   -- 封装创建索引的SQL    ts1 := clock_timestamp();    raise notice '% start execute: %', ts1, sql;    execute sql;  -- 创建索引      commit;  -- pg 11已支持procedure, do里面开子事务  ts2 := clock_timestamp();    raise notice '% end execute: % , duration: %', ts2, sql, ts2-ts1;    i:=i+1;    end loop;    execute format('analyze %s', n);     end loop;
end;
$$;

多轮测试后得到实际使用的索引(包括pk, uk在内), 使用下面的CODE替换之前的所有索引。

select string_agg(pg_get_indexdef(indexrelname::regclass),';') from pg_stat_all_indexes  where relname = any(array['customer','lineitem','nation','orders','part','partsupp','region','supplier']) and idx_scan<>0;
do language plpgsql $$
declaretables name[] := array['customer','lineitem','nation','orders','part','partsupp','region','supplier'];n name;idx_sql text;ts1 timestamp;ts2 timestamp;sql text;
beginset max_parallel_maintenance_workers=24;   -- 创建索引时的并行度set max_parallel_workers=32;set max_parallel_workers_per_gather=32;set maintenance_work_mem='4GB';set min_parallel_table_scan_size=0;set min_parallel_index_scan_size=0;set parallel_setup_cost=0;set parallel_tuple_cost=0;-- set default_tablespace='tbs1';  -- 是否使用其他表空间idx_sql := 'CREATE UNIQUE INDEX IF NOT EXISTS customer_pkey ON public.customer USING btree (c_custkey);CREATE INDEX IF NOT EXISTS idx1 ON public.lineitem USING btree (l_orderkey) WHERE (l_commitdate < l_receiptdate);CREATE INDEX IF NOT EXISTS idx_lineitem__11 ON public.lineitem USING btree (l_shipdate);CREATE INDEX IF NOT EXISTS idx_lineitem__2 ON public.lineitem USING btree (l_partkey);CREATE INDEX IF NOT EXISTS idx_lineitem__3 ON public.lineitem USING btree (l_suppkey);CREATE INDEX IF NOT EXISTS idx_lineitem_orderkey ON public.lineitem USING btree (l_orderkey);CREATE INDEX IF NOT EXISTS idx_lineitem_shipdate ON public.lineitem USING btree (l_shipdate, l_discount, l_quantity);CREATE UNIQUE INDEX IF NOT EXISTS lineitem_pkey ON public.lineitem USING btree (l_orderkey, l_linenumber);CREATE INDEX IF NOT EXISTS idx6 ON public.nation USING btree (n_nationkey, n_regionkey);CREATE INDEX IF NOT EXISTS idx_nation_regionkey ON public.nation USING btree (n_regionkey);CREATE UNIQUE INDEX IF NOT EXISTS nation_pkey ON public.nation USING btree (n_nationkey);CREATE INDEX IF NOT EXISTS idx_orders__6 ON public.orders USING btree (o_orderpriority);CREATE INDEX IF NOT EXISTS idx_orders_custkey ON public.orders USING btree (o_custkey);CREATE INDEX IF NOT EXISTS idx_orders_orderdate ON public.orders USING btree (o_orderdate);CREATE UNIQUE INDEX IF NOT EXISTS orders_pkey ON public.orders USING btree (o_orderkey);CREATE INDEX IF NOT EXISTS idx_part_1 ON public.part USING btree (p_container, p_brand);CREATE INDEX IF NOT EXISTS idx_part__5 ON public.part USING btree (p_type);CREATE INDEX IF NOT EXISTS idx_part__6 ON public.part USING btree (p_size);CREATE UNIQUE INDEX IF NOT EXISTS part_pkey ON public.part USING btree (p_partkey);CREATE INDEX IF NOT EXISTS idx ON public.partsupp USING btree (ps_partkey, ps_suppkey, ps_supplycost);CREATE INDEX IF NOT EXISTS idx_partsupp__4 ON public.partsupp USING btree (ps_supplycost);CREATE INDEX IF NOT EXISTS idx_partsupp_partkey ON public.partsupp USING btree (ps_partkey);CREATE INDEX IF NOT EXISTS idx_partsupp_suppkey ON public.partsupp USING btree (ps_suppkey);CREATE UNIQUE INDEX IF NOT EXISTS partsupp_pkey ON public.partsupp USING btree (ps_partkey, ps_suppkey);CREATE INDEX IF NOT EXISTS idx5 ON public.region USING btree (r_name, r_regionkey);CREATE UNIQUE INDEX IF NOT EXISTS region_pkey ON public.region USING btree (r_regionkey);CREATE INDEX IF NOT EXISTS idx4 ON public.supplier USING btree (s_suppkey, s_nationkey);CREATE INDEX IF NOT EXISTS idx_supplier_nation_key ON public.supplier USING btree (s_nationkey);CREATE UNIQUE INDEX IF NOT EXISTS supplier_pkey ON public.supplier USING btree (s_suppkey);';for sql in select regexp_split_to_table(idx_sql, ';') loopts1 := clock_timestamp();raise notice '% start execute: %', ts1, sql;execute sql; commit;  -- pg 11已支持procedure, do里面开子事务ts2 := clock_timestamp();raise notice '% end execute: % , duration: %', ts2, sql, ts2-ts1;end loop;foreach n in array tables loopexecute format('analyze %s', n);end loop;
end;
$$;

2、强制测试过程中,使用并行查询

alter role postgres set enable_nestloop=off;
alter role postgres set enable_mergejoin=off;
alter role postgres set work_mem ='8GB';  alter role postgres set max_parallel_workers_per_gather=24;
alter role postgres set parallel_setup_cost=0;
alter role postgres set parallel_tuple_cost=0;
alter role postgres set min_parallel_table_scan_size=0;
alter role postgres set min_parallel_index_scan_size=0;
alter role postgres set geqo_threshold=100;alter table customer set (parallel_workers=24);
alter table lineitem set (parallel_workers=24);
alter table nation set (parallel_workers=24);
alter table orders set (parallel_workers=24);
alter table part set (parallel_workers=24);
alter table partsupp set (parallel_workers=24);
alter table region set (parallel_workers=24);
alter table supplier set (parallel_workers=24);

3、Q17,加了一个索引

create index idx_part_1 on part (p_container,p_brand);

4、PG 11校对了成本因子

《优化器成本因子校对 - PostgreSQL explain cost constants alignment to timestamp》

postgresql.auto.confcpu_tuple_cost=0.00018884145574257426
cpu_index_tuple_cost = 0.00433497085216479990
cpu_operator_cost = 0.00216748542608239995
seq_page_cost=0.014329
random_page_cost =0.016
postgresql.conflisten_addresses = '0.0.0.0'
port = 1921
max_connections = 2000
superuser_reserved_connections = 3
unix_socket_directories = '., /var/run/postgresql, /tmp'
tcp_keepalives_idle = 60
tcp_keepalives_interval = 10
tcp_keepalives_count = 10
shared_buffers = 48GB
max_prepared_transactions = 2000
work_mem = 4GB
maintenance_work_mem = 2GB
dynamic_shared_memory_type = posix
vacuum_cost_delay = 0
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 10.0
effective_io_concurrency = 4
max_worker_processes = 128
max_parallel_maintenance_workers = 16
max_parallel_workers_per_gather = 24
parallel_leader_participation = on
max_parallel_workers = 32
wal_level = minimal
synchronous_commit = off
wal_writer_delay = 10ms
checkpoint_timeout = 35min
max_wal_size = 96GB
min_wal_size = 24GB
checkpoint_completion_target = 0.1
max_wal_senders = 0
random_page_cost = 1.1
effective_cache_size = 400GB
default_statistics_target = 1000
log_destination = 'csvlog'
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose
log_line_prefix = '%m [%p] '
log_timezone = 'PRC'
log_autovacuum_min_duration = 0
autovacuum_freeze_max_age = 1200000000
autovacuum_multixact_freeze_max_age = 1400000000
autovacuum_vacuum_cost_delay = 0ms
vacuum_freeze_table_age = 1150000000
vacuum_multixact_freeze_table_age = 1150000000
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
default_text_search_config = 'pg_catalog.english'

5、deepgreen v18

http://vitessedata.com/products/deepgreen-db/download/install/

gp_vmem_protect_limit=16384;

TPC-H (SF=10) 性能对比

数据库 q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21 q22
PostgreSQL 10 17 2 41 6 14 4 27 31 54 16 7 100 213 4 25 60 552 1317 11 82 95 143
PostgreSQL 11beta3 3 2 2 5 3 4 2 2 9 3 1 2 4 2 6 10 27 46 2 10 13 2
DeepGreen v16 6 4 10 4 7 1 27 10 283 5 2 5 14 1 2 5 31 53 6 10 246 7
DeepGreen v18 2 1 1 1 1 0 3 1 3 2 1 2 2 0 0 1 5 3 2 1 3 1

SF=200

SF=200 Deepgreen on ECS(64c, 512G, 8*2TB(ssd) lvm strip)(33 segments)

2018-08-25 15:03:59 [1535180639] : running TPC-H benchmark
2018-08-25 15:03:59 [1535180639] : running queries defined in TPC-H benchmark
2018-08-25 15:03:59 [1535180639] :   running query 1
2018-08-25 15:03:59 [1535180639] : run explain
2018-08-25 15:03:59 [1535180639] : run the query on background
2018-08-25 15:04:12 [1535180652] :     query 1 finished OK (12 seconds)
2018-08-25 15:04:12 [1535180652] :   running query 2
2018-08-25 15:04:12 [1535180652] : run explain
2018-08-25 15:04:12 [1535180652] : run the query on background
2018-08-25 15:04:15 [1535180655] :     query 2 finished OK (3 seconds)
2018-08-25 15:04:15 [1535180655] :   running query 3
2018-08-25 15:04:15 [1535180655] : run explain
2018-08-25 15:04:15 [1535180655] : run the query on background
2018-08-25 15:04:27 [1535180667] :     query 3 finished OK (11 seconds)
2018-08-25 15:04:27 [1535180667] :   running query 4
2018-08-25 15:04:27 [1535180667] : run explain
2018-08-25 15:04:27 [1535180667] : run the query on background
2018-08-25 15:04:34 [1535180674] :     query 4 finished OK (7 seconds)
2018-08-25 15:04:34 [1535180674] :   running query 5
2018-08-25 15:04:34 [1535180674] : run explain
2018-08-25 15:04:34 [1535180674] : run the query on background
2018-08-25 15:04:47 [1535180687] :     query 5 finished OK (12 seconds)
2018-08-25 15:04:47 [1535180687] :   running query 6
2018-08-25 15:04:47 [1535180687] : run explain
2018-08-25 15:04:47 [1535180687] : run the query on background
2018-08-25 15:04:48 [1535180688] :     query 6 finished OK (1 seconds)
2018-08-25 15:04:48 [1535180688] :   running query 7
2018-08-25 15:04:48 [1535180688] : run explain
2018-08-25 15:04:48 [1535180688] : run the query on background
2018-08-25 15:05:21 [1535180721] :     query 7 finished OK (32 seconds)
2018-08-25 15:05:21 [1535180721] :   running query 8
2018-08-25 15:05:21 [1535180721] : run explain
2018-08-25 15:05:21 [1535180721] : run the query on background
2018-08-25 15:05:32 [1535180732] :     query 8 finished OK (11 seconds)
2018-08-25 15:05:32 [1535180732] :   running query 9
2018-08-25 15:05:32 [1535180732] : run explain
2018-08-25 15:05:32 [1535180732] : run the query on background
2018-08-25 15:06:02 [1535180762] :     query 9 finished OK (29 seconds)
2018-08-25 15:06:02 [1535180762] :   running query 10
2018-08-25 15:06:02 [1535180762] : run explain
2018-08-25 15:06:02 [1535180762] : run the query on background
2018-08-25 15:06:11 [1535180771] :     query 10 finished OK (9 seconds)
2018-08-25 15:06:11 [1535180771] :   running query 11
2018-08-25 15:06:11 [1535180771] : run explain
2018-08-25 15:06:11 [1535180771] : run the query on background
2018-08-25 15:06:14 [1535180774] :     query 11 finished OK (2 seconds)
2018-08-25 15:06:14 [1535180774] :   running query 12
2018-08-25 15:06:14 [1535180774] : run explain
2018-08-25 15:06:14 [1535180774] : run the query on background
2018-08-25 15:06:23 [1535180783] :     query 12 finished OK (9 seconds)
2018-08-25 15:06:23 [1535180783] :   running query 13
2018-08-25 15:06:23 [1535180783] : run explain
2018-08-25 15:06:23 [1535180783] : run the query on background
2018-08-25 15:06:41 [1535180801] :     query 13 finished OK (17 seconds)
2018-08-25 15:06:41 [1535180801] :   running query 14
2018-08-25 15:06:41 [1535180801] : run explain
2018-08-25 15:06:41 [1535180801] : run the query on background
2018-08-25 15:06:42 [1535180802] :     query 14 finished OK (1 seconds)
2018-08-25 15:06:42 [1535180802] :   running query 15
2018-08-25 15:06:42 [1535180802] : run explain
2018-08-25 15:06:42 [1535180802] : run the query on background
2018-08-25 15:06:46 [1535180806] :     query 15 finished OK (4 seconds)
2018-08-25 15:06:46 [1535180806] :   running query 16
2018-08-25 15:06:46 [1535180806] : run explain
2018-08-25 15:06:46 [1535180806] : run the query on background
2018-08-25 15:06:56 [1535180816] :     query 16 finished OK (9 seconds)
2018-08-25 15:06:56 [1535180816] :   running query 17
2018-08-25 15:06:56 [1535180816] : run explain
2018-08-25 15:06:56 [1535180816] : run the query on background
2018-08-25 15:08:06 [1535180886] :     query 17 finished OK (69 seconds)
2018-08-25 15:08:06 [1535180886] :   running query 18
2018-08-25 15:08:06 [1535180886] : run explain
2018-08-25 15:08:06 [1535180886] : run the query on background
2018-08-25 15:08:30 [1535180910] :     query 18 finished OK (24 seconds)
2018-08-25 15:08:30 [1535180910] :   running query 19
2018-08-25 15:08:30 [1535180910] : run explain
2018-08-25 15:08:31 [1535180911] : run the query on background
2018-08-25 15:08:42 [1535180922] :     query 19 finished OK (11 seconds)
2018-08-25 15:08:42 [1535180922] :   running query 20
2018-08-25 15:08:42 [1535180922] : run explain
2018-08-25 15:08:42 [1535180922] : run the query on background
2018-08-25 15:08:55 [1535180936] :     query 20 finished OK (13 seconds)
2018-08-25 15:08:56 [1535180936] :   running query 21
2018-08-25 15:08:56 [1535180936] : run explain
2018-08-25 15:08:56 [1535180936] : run the query on background
2018-08-25 15:09:16 [1535180956] :     query 21 finished OK (20 seconds)
2018-08-25 15:09:16 [1535180956] :   running query 22
2018-08-25 15:09:16 [1535180956] : run explain
2018-08-25 15:09:16 [1535180956] : run the query on background
2018-08-25 15:09:26 [1535180966] :     query 22 finished OK (9 seconds)
2018-08-25 15:09:26 [1535180966] : finished TPC-H benchmark

SF=200 PG 11

使用HINT改几条慢SQL。

gp_tpch/dss/queries/2.sql17.sql18.sql

《PostgreSQL 开启with-llvm(JIT)后,新增插件异常(clang: Command not found)处理》

《关键时刻HINT出彩 - PG优化器的参数优化、执行计划固化CASE》

Q2

/*+ Leading((partsupp ((nation region) supplier))) Leading((part (((nation region) supplier) partsupp))) */
selects_acctbal,s_name,n_name,p_partkey,p_mfgr,s_address,s_phone,s_comment
frompart,supplier,partsupp,nation,region
wherep_partkey = ps_partkeyand s_suppkey = ps_suppkeyand p_size = 47and p_type like '%NICKEL'and s_nationkey = n_nationkeyand n_regionkey = r_regionkeyand r_name = 'ASIA'and ps_supplycost = (selectmin(ps_supplycost)frompartsupp,supplier,nation,regionwherep_partkey = ps_partkeyand s_suppkey = ps_suppkeyand s_nationkey = n_nationkeyand n_regionkey = r_regionkeyand r_name = 'ASIA')
order bys_acctbal desc,n_name,s_name,p_partkey
LIMIT 100;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------Limit  (cost=4589445377.60..4589445377.60 rows=1 width=198)->  Sort  (cost=4589445377.60..4589445377.60 rows=1 width=198)Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey->  Hash Join  (cost=225752.90..4589445377.59 rows=1 width=198)Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))->  Gather  (cost=0.00..18775.09 rows=158816 width=34)Workers Planned: 28->  Parallel Seq Scan on part  (cost=0.00..18775.09 rows=5672 width=34)Filter: (((p_type)::text ~~ '%NICKEL'::text) AND (p_size = 47))->  Hash  (cost=69355.84..69355.84 rows=31998515 width=180)->  Gather  (cost=69220.57..69355.84 rows=31998515 width=180)Workers Planned: 28->  Parallel Hash Join  (cost=69220.57..69355.84 rows=1142804 width=180)Hash Cond: (supplier.s_suppkey = partsupp.ps_suppkey)->  Parallel Hash Join  (cost=858.16..861.87 rows=14286 width=172)Hash Cond: (nation.n_nationkey = supplier.s_nationkey)->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=34)Hash Cond: (nation.n_regionkey = region.r_regionkey)->  Parallel Seq Scan on nation  (cost=0.00..0.01 rows=1 width=42)->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)->  Parallel Seq Scan on region  (cost=0.00..0.01 rows=1 width=8)Filter: (r_name = 'ASIA'::bpchar)->  Parallel Hash  (cost=689.83..689.83 rows=71429 width=154)->  Parallel Seq Scan on supplier  (cost=0.00..689.83 rows=71429 width=154)->  Parallel Hash  (cost=54898.31..54898.31 rows=5714021 width=24)->  Parallel Seq Scan on partsupp  (cost=0.00..54898.31 rows=5714021 width=24)SubPlan 1->  Aggregate  (cost=1806.01..1806.01 rows=1 width=8)->  Hash Join  (cost=1804.89..1806.00 rows=7 width=8)Hash Cond: (partsupp_1.ps_suppkey = supplier_1.s_suppkey)->  Index Scan using idx_partsupp_partkey on partsupp partsupp_1  (cost=0.49..1.48 rows=37 width=16)Index Cond: (part.p_partkey = ps_partkey)->  Hash  (cost=861.87..861.87 rows=400000 width=8)->  Gather  (cost=858.16..861.87 rows=400000 width=8)Workers Planned: 28->  Parallel Hash Join  (cost=858.16..861.87 rows=14286 width=8)Hash Cond: (nation_1.n_nationkey = supplier_1.s_nationkey)->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=8)Hash Cond: (nation_1.n_regionkey = region_1.r_regionkey)->  Parallel Seq Scan on nation nation_1  (cost=0.00..0.01 rows=1 width=16)->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)->  Parallel Seq Scan on region region_1  (cost=0.00..0.01 rows=1 width=8)Filter: (r_name = 'ASIA'::bpchar)->  Parallel Hash  (cost=689.83..689.83 rows=71429 width=16)->  Parallel Seq Scan on supplier supplier_1  (cost=0.00..689.83 rows=71429 width=16)JIT:Functions: 98Inlining: trueOptimization: true
(49 rows)

更新

Q2 建议加这个索引, 使用nestloop join

create index idx on partsupp (ps_partkey,ps_suppkey,ps_supplycost);
set enable_nestloop=on;or/*+ Set(enable_nestloop on) */ select                                                                                                           s_acctbal,s_name,   n_name,p_partkey,p_mfgr,   s_address,s_phone,  s_comment
from             part,supplier,partsupp,nation,  region
where         p_partkey = ps_partkeyand s_suppkey = ps_suppkeyand p_size = 47           and p_type like '%NICKEL'and s_nationkey = n_nationkeyand n_regionkey = r_regionkeyand r_name = 'ASIA'          and ps_supplycost = (select       min(ps_supplycost)from                      partsupp,supplier,nation,  region where         p_partkey = ps_partkeyand s_suppkey = ps_suppkeyand s_nationkey = n_nationkeyand n_regionkey = r_regionkeyand r_name = 'ASIA'          )
order by s_acctbal desc,n_name,        s_name,p_partkey
LIMIT 100;Time: 5345.093 ms (00:05.345)QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------Limit  (cost=2807136.16..2807136.16 rows=1 width=198)->  Sort  (cost=2807136.16..2807136.16 rows=1 width=198)Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey->  Nested Loop  (cost=17.90..2807136.15 rows=1 width=198)Join Filter: (nation.n_nationkey = supplier.s_nationkey)->  Nested Loop  (cost=17.88..2807135.97 rows=6 width=180)->  Nested Loop  (cost=17.51..2807133.70 rows=6 width=42)->  Index Scan using idx_part__6 on part  (cost=0.49..15906.04 rows=160623 width=34)Index Cond: (p_size = 47)Filter: ((p_type)::text ~~ '%NICKEL'::text)->  Index Only Scan using idx on partsupp  (cost=17.02..17.38 rows=1 width=24)Index Cond: ((ps_partkey = part.p_partkey) AND (ps_supplycost = (SubPlan 1)))SubPlan 1->  Aggregate  (cost=16.52..16.52 rows=1 width=8)->  Hash Join  (cost=0.91..16.51 rows=8 width=8)Hash Cond: (supplier_1.s_nationkey = nation_1.n_nationkey)->  Nested Loop  (cost=0.86..16.34 rows=38 width=16)->  Index Scan using idx_partsupp_partkey on partsupp partsupp_1  (cost=0.49..0.78 rows=38 width=16)Index Cond: (part.p_partkey = ps_partkey)->  Index Scan using supplier_pkey on supplier supplier_1  (cost=0.37..0.41 rows=1 width=16)Index Cond: (s_suppkey = partsupp_1.ps_suppkey)->  Hash  (cost=0.03..0.03 rows=5 width=8)->  Gather  (cost=0.02..0.03 rows=5 width=8)Workers Planned: 28->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=8)Hash Cond: (nation_1.n_regionkey = region_1.r_regionkey)->  Parallel Seq Scan on nation nation_1  (cost=0.00..0.01 rows=1 width=16)->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)->  Parallel Seq Scan on region region_1  (cost=0.00..0.01 rows=1 width=8)Filter: (r_name = 'ASIA'::bpchar)->  Index Scan using supplier_pkey on supplier  (cost=0.37..0.38 rows=1 width=154)Index Cond: (s_suppkey = partsupp.ps_suppkey)->  Materialize  (cost=0.02..0.06 rows=5 width=34)->  Gather  (cost=0.02..0.03 rows=5 width=34)Workers Planned: 28->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=34)Hash Cond: (nation.n_regionkey = region.r_regionkey)->  Parallel Seq Scan on nation  (cost=0.00..0.01 rows=1 width=42)->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)->  Parallel Seq Scan on region  (cost=0.00..0.01 rows=1 width=8)Filter: (r_name = 'ASIA'::bpchar)JIT:Functions: 67Inlining: trueOptimization: true
(45 rows)

Q4

create index idx1 on lineitem (l_orderkey) where l_commitdate < l_receiptdate;

Q17

/*+ Set(work_mem 12GB) Set(max_parallel_workers_per_gather 28) */ explain selectsum(l_extendedprice) / 7.0 as avg_yearly
fromlineitem,part,(SELECT l_partkey AS agg_partkey, 0.2 * avg(l_quantity) AS avg_quantity FROM lineitem GROUP BY l_partkey) part_agg
wherep_partkey = l_partkeyand agg_partkey = l_partkeyand p_brand = 'Brand#11'and p_container = 'SM PKG'and l_quantity < avg_quantity
LIMIT 1;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------Limit  (cost=6130644.31..6130644.31 rows=1 width=8)->  Aggregate  (cost=6130644.31..6130644.31 rows=1 width=8)->  Hash Join  (cost=5647079.65..6129794.97 rows=391854 width=8)Hash Cond: (part.p_partkey = lineitem_1.l_partkey)Join Filter: (lineitem.l_quantity < (('0.2'::double precision * avg(lineitem_1.l_quantity))))->  Gather  (cost=11834.33..491832.73 rows=1193799 width=32)Workers Planned: 28->  Parallel Hash Join  (cost=11834.33..491832.73 rows=42636 width=32)Hash Cond: (lineitem.l_partkey = part.p_partkey)->  Parallel Seq Scan on lineitem  (cost=0.00..382461.00 rows=42857385 width=24)->  Parallel Hash  (cost=11830.99..11830.99 rows=1421 width=8)->  Parallel Bitmap Heap Scan on part  (cost=427.06..11830.99 rows=1421 width=8)Recheck Cond: ((p_container = 'SM PKG'::bpchar) AND (p_brand = 'Brand#11'::bpchar))->  Bitmap Index Scan on idx_part_1  (cost=0.00..418.44 rows=39794 width=0)Index Cond: ((p_container = 'SM PKG'::bpchar) AND (p_brand = 'Brand#11'::bpchar))->  Hash  (cost=5542429.84..5542429.84 rows=39389900 width=16)->  Finalize HashAggregate  (cost=5356798.88..5534991.39 rows=39389900 width=16)Group Key: lineitem_1.l_partkey->  Gather  (cost=568246.52..575684.96 rows=1102917200 width=40)Workers Planned: 28->  Partial HashAggregate  (cost=568246.52..575684.96 rows=39389900 width=40)Group Key: lineitem_1.l_partkey->  Parallel Seq Scan on lineitem lineitem_1  (cost=0.00..382461.00 rows=42857385 width=16)JIT:Functions: 44Inlining: trueOptimization: true
(27 rows)

Q18

/*+ Leading( ((lineitem (customer orders)) ANY_subquery) ) */  explain selectc_name,c_custkey,o_orderkey,o_orderdate,o_totalprice,sum(l_quantity)
fromcustomer,orders,lineitem
whereo_orderkey in (selectl_orderkeyfromlineitemgroup byl_orderkey havingsum(l_quantity) > 312)and c_custkey = o_custkeyand o_orderkey = l_orderkey
group byc_name,c_custkey,o_orderkey,o_orderdate,o_totalprice
order byo_totalprice desc,o_orderdate
LIMIT 100;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------Limit  (cost=5094012.87..5094013.09 rows=100 width=55)->  Sort  (cost=5094012.87..5104543.55 rows=4858479 width=55)Sort Key: orders.o_totalprice DESC, orders.o_orderdate->  HashAggregate  (cost=4932105.34..4933022.83 rows=4858479 width=55)Group Key: customer.c_custkey, orders.o_orderkey->  Hash Join  (cost=1182913.25..4900513.30 rows=4858479 width=55)Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)->  Gather  (cost=160021.94..1146574.91 rows=1200006784 width=63)Workers Planned: 28->  Parallel Hash Join  (cost=160021.94..1146574.91 rows=42857385 width=63)Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)->  Parallel Seq Scan on lineitem  (cost=0.00..382461.00 rows=42857385 width=16)->  Parallel Hash  (cost=134775.97..134775.97 rows=10714120 width=47)->  Parallel Hash Join  (cost=108506.79..134775.97 rows=10714120 width=47)Hash Cond: (customer.c_custkey = orders.o_custkey)->  Parallel Seq Scan on customer  (cost=0.00..11101.38 rows=1071468 width=27)->  Parallel Hash  (cost=83260.82..83260.82 rows=10714120 width=28)->  Parallel Seq Scan on orders  (cost=0.00..83260.82 rows=10714120 width=28)->  Hash  (cost=1020029.33..1020029.33 rows=1214594 width=8)->  Finalize HashAggregate  (cost=1011214.02..1019799.96 rows=1214594 width=8)Group Key: lineitem_1.l_orderkeyFilter: (sum(lineitem_1.l_quantity) > '312'::double precision)->  Gather  (cost=568246.52..568934.61 rows=102025924 width=16)Workers Planned: 28->  Partial HashAggregate  (cost=568246.52..568934.61 rows=3643783 width=16)Group Key: lineitem_1.l_orderkey->  Parallel Seq Scan on lineitem lineitem_1  (cost=0.00..382461.00 rows=42857385 width=16)
(27 rows)

SF=200 pg 11 结果

2018-08-25 19:46:22 [1535197582] : running TPC-H benchmark
2018-08-25 19:46:22 [1535197582] : running queries defined in TPC-H benchmark
2018-08-25 19:46:22 [1535197582] :   running query 1
2018-08-25 19:46:22 [1535197582] : run explain
2018-08-25 19:46:22 [1535197582] : run the query on background
2018-08-25 19:46:40 [1535197600] :     query 1 finished OK (18 seconds)
2018-08-25 19:46:40 [1535197600] :   running query 2
2018-08-25 19:46:40 [1535197600] : run explain
2018-08-25 19:46:40 [1535197600] : run the query on background
2018-08-25 19:47:19 [1535197639] :     query 2 finished OK (38 seconds)
2018-08-25 19:47:19 [1535197639] :   running query 3
2018-08-25 19:47:19 [1535197639] : run explain
2018-08-25 19:47:21 [1535197641] : run the query on background
2018-08-25 19:47:46 [1535197666] :     query 3 finished OK (25 seconds)
2018-08-25 19:47:46 [1535197666] :   running query 4
2018-08-25 19:47:46 [1535197666] : run explain
2018-08-25 19:47:47 [1535197667] : run the query on background
2018-08-25 19:48:19 [1535197699] :     query 4 finished OK (32 seconds)
2018-08-25 19:48:19 [1535197699] :   running query 5
2018-08-25 19:48:19 [1535197699] : run explain
2018-08-25 19:48:19 [1535197699] : run the query on background
2018-08-25 19:49:16 [1535197756] :     query 5 finished OK (57 seconds)
2018-08-25 19:49:17 [1535197757] :   running query 6
2018-08-25 19:49:17 [1535197757] : run explain
2018-08-25 19:49:17 [1535197757] : run the query on background
2018-08-25 19:49:25 [1535197765] :     query 6 finished OK (8 seconds)
2018-08-25 19:49:25 [1535197765] :   running query 7
2018-08-25 19:49:25 [1535197765] : run explain
2018-08-25 19:49:25 [1535197765] : run the query on background
2018-08-25 19:50:17 [1535197817] :     query 7 finished OK (52 seconds)
2018-08-25 19:50:17 [1535197817] :   running query 8
2018-08-25 19:50:18 [1535197818] : run explain
2018-08-25 19:50:18 [1535197818] : run the query on background
2018-08-25 19:50:42 [1535197842] :     query 8 finished OK (24 seconds)
2018-08-25 19:50:42 [1535197842] :   running query 9
2018-08-25 19:50:42 [1535197842] : run explain
2018-08-25 19:50:42 [1535197842] : run the query on background
2018-08-25 19:51:49 [1535197909] :     query 9 finished OK (66 seconds)
2018-08-25 19:51:49 [1535197909] :   running query 10
2018-08-25 19:51:49 [1535197909] : run explain
2018-08-25 19:51:49 [1535197909] : run the query on background
2018-08-25 19:52:27 [1535197947] :     query 10 finished OK (38 seconds)
2018-08-25 19:52:27 [1535197947] :   running query 11
2018-08-25 19:52:27 [1535197947] : run explain
2018-08-25 19:52:28 [1535197948] : run the query on background
2018-08-25 19:52:52 [1535197972] :     query 11 finished OK (24 seconds)
2018-08-25 19:52:52 [1535197972] :   running query 12
2018-08-25 19:52:52 [1535197972] : run explain
2018-08-25 19:52:52 [1535197972] : run the query on background
2018-08-25 19:53:18 [1535197998] :     query 12 finished OK (26 seconds)
2018-08-25 19:53:18 [1535197998] :   running query 13
2018-08-25 19:53:18 [1535197998] : run explain
2018-08-25 19:53:19 [1535197999] : run the query on background
2018-08-25 19:54:58 [1535198098] :     query 13 finished OK (98 seconds)
2018-08-25 19:54:58 [1535198098] :   running query 14
2018-08-25 19:54:58 [1535198098] : run explain
2018-08-25 19:54:58 [1535198098] : run the query on background
2018-08-25 19:55:11 [1535198111] :     query 14 finished OK (13 seconds)
2018-08-25 19:55:11 [1535198111] :   running query 15
2018-08-25 19:55:11 [1535198111] : run explain
2018-08-25 19:55:11 [1535198111] : run the query on background
2018-08-25 19:56:10 [1535198170] :     query 15 finished OK (58 seconds)
2018-08-25 19:56:10 [1535198170] :   running query 16
2018-08-25 19:56:10 [1535198170] : run explain
2018-08-25 19:56:10 [1535198170] : run the query on background
2018-08-25 19:58:05 [1535198285] :     query 16 finished OK (114 seconds)
2018-08-25 19:58:05 [1535198285] :   running query 17
2018-08-25 19:58:05 [1535198285] : run explain
2018-08-25 19:58:07 [1535198287] : run the query on background
2018-08-25 20:10:21 [1535199021] :     query 17 finished OK (732 seconds)
2018-08-25 20:10:21 [1535199021] :   running query 18
2018-08-25 20:10:21 [1535199021] : run explain
2018-08-25 20:10:22 [1535199022] : run the query on background
2018-08-25 20:20:19 [1535199619] :     query 18 finished OK (595 seconds)
2018-08-25 20:20:19 [1535199619] :   running query 19
2018-08-25 20:20:19 [1535199619] : run explain
2018-08-25 20:20:19 [1535199619] : run the query on background
2018-08-25 20:20:31 [1535199631] :     query 19 finished OK (12 seconds)
2018-08-25 20:20:31 [1535199631] :   running query 20
2018-08-25 20:20:31 [1535199631] : run explain
2018-08-25 20:20:34 [1535199634] : run the query on background
2018-08-25 20:24:08 [1535199848] :     query 20 finished OK (213 seconds)
2018-08-25 20:24:08 [1535199848] :   running query 21
2018-08-25 20:24:08 [1535199848] : run explain
2018-08-25 20:24:08 [1535199848] : run the query on background
2018-08-25 20:26:13 [1535199973] :     query 21 finished OK (124 seconds)
2018-08-25 20:26:13 [1535199973] :   running query 22
2018-08-25 20:26:13 [1535199973] : run explain
2018-08-25 20:26:14 [1535199974] : run the query on background
2018-08-25 20:26:28 [1535199988] :     query 22 finished OK (14 seconds)
2018-08-25 20:26:28 [1535199988] : finished TPC-H benchmark

SF=200

数据库 q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21 q22
PostgreSQL 11beta3 18 38 25 32 57 8 52 24 66 38 24 26 98 13 58 114 732 595 12 213 124 14
DeepGreen v18 12 3 11 7 12 1 32 11 29 9 2 9 17 1 4 9 69 24 11 13 20 9

小结

PostgreSQL 11 parallel hash agg, parallel hash join的性能有了质的飞跃。

同时PostgreSQL 11在创建索引时,也支持了PARALLEL,创建索引的速度有了飞跃。

(SF=10)相比PostgreSQL 10,PostgreSQL 11版本AP能力得到了极大增强。在“没有使用列存储、没有使用分区表的”情况下,已接近Deepgreen(Greenplum原班人马)。ck, 田丰老师 .

PostgreSQL 11优化器还有可以加强的地方,在TPC-H的一些SQL中,执行计划并不是最好的,比如通过aqo提高优化器的能力。

https://github.com/postgrespro/aqo

PostgreSQL实现HTAP是必然的趋势。将来还可以结合citus插件实现sharding, (同时引入列存功能)。

参考

《TPC-H测试 - PostgreSQL 10 vs Deepgreen(Greenplum)》

https://github.com/digoal/gp_tpch

《PostgreSQL 10 + PostGIS + Sharding(pg_pathman) + MySQL(fdw外部表) on ECS 部署指南(适合新用户) - 珍藏级》

《PostgreSQL on ECS多云盘的部署、快照备份和恢复》

(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen相关推荐

  1. PostgreSQL 11 tpcc 测试(103万tpmC on ECS) - use sysbench-tpcc by Percona-Lab

    标签 PostgreSQL , tpcc 背景 环境 阿里云虚拟机 [root@pg11-test ~]# lscpu Architecture: x86_64 CPU op-mode(s): 32- ...

  2. PostgreSQL 11 1Kw TPCC , 1亿 TPCB 7*24 强压耐久测试

    标签 PostgreSQL , tpcc , tpcb 背景 TPCC, TPCB是工业标准的OLTP类型业务的数据库测试,包含大量的读.写.更新.删除操作. 7*24小时强压耐久测试,主要看数据库在 ...

  3. PostgreSQL 11 1Kw TPCC , 1亿 TPCB 7*24 强压耐久测试 1

    标签 PostgreSQL , tpcc , tpcb 背景 TPCC, TPCB是工业标准的OLTP类型业务的数据库测试,包含大量的读.写.更新.删除操作. 7*24小时强压耐久测试,主要看数据库在 ...

  4. PostgreSQL 11 1000亿 tpcb、1000W tpcc 性能测试 - on 阿里云ECS + ESSD (含quorum based 0丢失多副本配置与性能测试)...

    标签 PostgreSQL , pgbench , tpcb , tpcc , tpch 背景 https://help.aliyun.com/knowledge_detail/64950.html ...

  5. PostgreSQL 11 100亿 tpcb 性能测试 on ECS

    标签 PostgreSQL , tpcb , pgbench , 100亿 背景 PostgreSQL 11 发布在即,以下是tpcc与tpch测试的结果: <PostgreSQL 11 tpc ...

  6. PostgreSQL 11 preview - bloom filter 误报率评估测试及如何降低误报 - 暨bloom filter应用于HEAP与INDEX的一致性检测...

    标签 PostgreSQL , bloom filter , 误报率 , amcheck 背景 bloom filter是一个空间压缩,概率数据结构,可以存储超容量的数据集,用于判断某个元素是否存在于 ...

  7. 小学生除法检验,10道题,每小题10分,被除数和除数随机产生。被除数是20~200之间的数,除数是2~9之间的一位数。用户输入答案,计算机判卷,做对了加10分,做错了不允许重做,最后输出成绩;

    欢迎加qq群:453398542 学习讨论,会定期分享资料课程,解答问题. 1.         小学生除法检验,10道题,每小题10分,被除数和除数随机产生.被除数是20~200之间的数,除数是2~ ...

  8. 10个200万的网络监控摄像机应该配置百兆还是千兆的交换机?

    200万像素的网络摄像机,是1080P网络摄像机,主码流4-6M,子码流为0.8~1M,共约7M,10个200万的网络摄像机需要的码流是70M,不管是百兆交换机还是千兆交换机,他们的实际带宽只有百分之 ...

  9. 物联网平台 源码 智慧农业物联网平台 ,支持mqtt,h ttp,coap协议 java+postgresql,支持集群

    物联网平台 源码 智慧农业物联网平台 ,支持mqtt,h ttp,coap协议 java+postgresql,支持集群. 支持萤石云摄像头接入,云台控制,支持nbiot,lora物联网平台,5g物联 ...

  10. 测试人员转型是大势所趋:我的10年自动化测试经验分享

    做测试十多年,有不少人问过我下面问题: 现在的手工测试真的不行了吗? 测试工程师,三年多快四年的经验,入门自动化测试需要多久? 自学自动化测试到底需要学哪些东西? 不得不说,随着行业的竞争加剧,互联网 ...

最新文章

  1. 联手IBM布局云计算,王健林如何再造一个新万达?
  2. 微众WeCross 跨链平台(3)交易流程
  3. python sorted原理_Python sort()和sorted()
  4. 增强:MB1A物料价格检查
  5. OpenCV基于ARM的Linux系统的交叉编译
  6. 前魅族科技高级副总裁李楠:iPhone 11的工业设计完全崩盘
  7. python100以内自然数之和_python输出100以内的质数与合数实例代码
  8. [20170515]数据库启动的一个疑问.txt
  9. 智能语音识别究竟是如何实现的?
  10. [转载] python中callable_Python callable() 函数
  11. 方差分析软件_手把手教你用Graphpad做单因素方差分析
  12. vscode配置Git管理工具GitLens
  13. 2013年11月16Aspx源码发布详情
  14. 一个VSCode插件实现软妹音程序员鼓励师24小时在线,还能吐槽PM
  15. USB接口的功能和区分全解析
  16. office转换为还原度高的html,使用Aspose把office文件转换为Html文件及生成文件浏览乱码的解决...
  17. 开关二极管IN4148
  18. k8s(18)之污点与容忍
  19. provide inject
  20. mysql好评中评统计_scrapy爬取京东笔记本及评论信息存入MySQL数据库

热门文章

  1. LCT动态树【史上最精简易懂的LCT讲解】
  2. jmeter-如何进行参数化-循环读取参数
  3. 微服务自动化部署(ansible playbook)干货之--zookeeper部署
  4. iOS 开发 入门:使用Ad Hoc 进行用户测试
  5. xcode6以后, 使用.pch
  6. PHP笔记(CSS篇)
  7. 【习题基础知识】输入与输出、简单循环(好多图贴上去好累...要看图的话我把word文档发给你)...
  8. 【转】Java多线程面试问题集锦
  9. Microsoft Enterprise Library 5.0 系列(二) Cryptography Application Block (高级)
  10. 化学堵水剂处理油井出水问题