(TPC-H benchmark, SF=10 and SF=200) PostgreSQL 11 vs 10 vs Deepgreen
Tags
PostgreSQL , TPC-H , HTAP
Background
Last year I benchmarked PostgreSQL 10 against Deepgreen:
《TPC-H benchmark - PostgreSQL 10 vs Deepgreen (Greenplum)》
PostgreSQL ships one major version per year. With PostgreSQL 11 about to be released, below are the TPC-H results for PostgreSQL 11 beta3 vs Deepgreen v18.
The results show that PG's analytical capability has taken another qualitative leap: its OLAP performance now comes very close to the MPP database Deepgreen, while PG also retains single-instance OLTP throughput in the millions of QPS. It is arguably the database covering the broadest range of workloads today (PG's tag of "full-stack database" is well deserved).
Test environment
ECS, 32 cores, 8 TB local SSD.
PostgreSQL 11 beta3
Deepgreen v18 (16 segments)
SF=10
Deepgreen deployment quick guide:
http://vitessedata.com/products/deepgreen-db/download/
Taking OS user digoal as an example, deploy Deepgreen (6 segments):
# deployment directories
mkdir -p /data01/dp/dbfast0
mkdir -p /data01/dp/dbfast1
mkdir -p /data01/dp/dbfast2
mkdir -p /data02/dp/dbfast1
mkdir -p /data02/dp/dbfast2
mkdir -p /data03/dp/dbfast1
mkdir -p /data03/dp/dbfast2
chown -R digoal:digoal /data01/dp
chown -R digoal:digoal /data02/dp
chown -R digoal:digoal /data03/dp
su - digoal
wget https://s3.amazonaws.com/vitessedata/download/deepgreendb.18.08.rh7.x86_64.180823.bin
chmod +x deepgreendb.18.08.rh7.x86_64.180823.bin
./deepgreendb.18.08.rh7.x86_64.180823.bin

vi ./deepgreendb/greenplum_path.sh
# append the following
# use the actual master data directory
export MASTER_DATA_DIRECTORY=/data01/dp/dbfast0/dg-1
export PGDATA=$MASTER_DATA_DIRECTORY
export PGHOST=127.0.0.1
export PGPORT=15432
# use the OS user name
export PGUSER=digoal
export PGPASSWORD=123
export PGDATABASE=postgres

. ./deepgreendb/greenplum_path.sh
echo "`hostname`" > hostfile
gpssh-exkeys -h localhost
gpssh-exkeys -h `hostname`

cat > cluster.conf <<HEREHERE
ARRAY_NAME="mpp1 cluster"
CLUSTER_NAME="mpp1 cluster"
MACHINE_LIST_FILE=hostfile
SEG_PREFIX=dg
DATABASE_PREFIX=dg
PORT_BASE=25432
declare -a DATA_DIRECTORY=(/data01/dp/dbfast1 /data01/dp/dbfast2 /data02/dp/dbfast1 /data02/dp/dbfast2 /data03/dp/dbfast1 /data03/dp/dbfast2)
MASTER_HOSTNAME=`hostname`
MASTER_DIRECTORY=/data01/dp/dbfast0
MASTER_PORT=15432
IP_ALLOW=0.0.0.0/0
TRUSTED_SHELL=/usr/bin/ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
export MASTER_DATA_DIRECTORY
export TRUSTED_SHELL
DEFAULT_QD_MAX_CONNECT=25
QE_CONNECT_FACTOR=5
HEREHERE

gpinitsystem -c cluster.conf -h hostfile
Test procedure
PG 11 (no partitioned tables); DP v18 (partitioned tables, one partition per month).
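The actual Deepgreen partitioning DDL is not shown in this post. As a hypothetical sketch, a by-month range partitioning of lineitem on l_shipdate in Deepgreen/Greenplum syntax might look like this (column list abbreviated):

```sql
CREATE TABLE lineitem (
    l_orderkey   bigint,
    l_shipdate   date
    -- ... remaining TPC-H lineitem columns ...
)
DISTRIBUTED BY (l_orderkey)           -- hash-distribute rows across segments
PARTITION BY RANGE (l_shipdate)
(
    START (date '1992-01-01') INCLUSIVE
    END   (date '1999-01-01') EXCLUSIVE
    EVERY (INTERVAL '1 month')        -- one child partition per month
);
```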
1. Create an index on every column
《PostgreSQL: quickly create an index on every column of a given table》
do language plpgsql $$
declare
  tables name[] := array['customer','lineitem','nation','orders','part','partsupp','region','supplier'];
  n name;
  x name;
  i int;
  sql text;
  ts1 timestamp;
  ts2 timestamp;
begin
  set max_parallel_maintenance_workers=24;  -- parallelism for index builds
  set max_parallel_workers=32;
  set max_parallel_workers_per_gather=32;
  set maintenance_work_mem='4GB';
  set min_parallel_table_scan_size=0;
  set min_parallel_index_scan_size=0;
  set parallel_setup_cost=0;
  set parallel_tuple_cost=0;
  foreach n in array tables loop
    i := 1;
    for x in select attname from pg_attribute where attrelid=n::regclass and attnum>=1 and not attisdropped loop
      sql := format('create index IF NOT EXISTS idx_%s__%s on %s (%s) tablespace tbs1', n, i, n, x);  -- build the CREATE INDEX statement
      ts1 := clock_timestamp();
      raise notice '% start execute: %', ts1, sql;
      execute sql;  -- create the index
      commit;  -- PG 11 allows transaction control inside DO blocks
      ts2 := clock_timestamp();
      raise notice '% end execute: % , duration: %', ts2, sql, ts2-ts1;
      i := i+1;
    end loop;
    execute format('analyze %s', n);
  end loop;
end;
$$;
After several test rounds, extract the indexes actually used (including PKs and UKs), and replace all previous indexes with the code below.
select string_agg(pg_get_indexdef(indexrelname::regclass),';') from pg_stat_all_indexes where relname = any(array['customer','lineitem','nation','orders','part','partsupp','region','supplier']) and idx_scan<>0;
do language plpgsql $$
declare
  tables name[] := array['customer','lineitem','nation','orders','part','partsupp','region','supplier'];
  n name;
  idx_sql text;
  ts1 timestamp;
  ts2 timestamp;
  sql text;
begin
  set max_parallel_maintenance_workers=24;  -- parallelism for index builds
  set max_parallel_workers=32;
  set max_parallel_workers_per_gather=32;
  set maintenance_work_mem='4GB';
  set min_parallel_table_scan_size=0;
  set min_parallel_index_scan_size=0;
  set parallel_setup_cost=0;
  set parallel_tuple_cost=0;
  -- set default_tablespace='tbs1';  -- optionally build in another tablespace
  idx_sql := '
CREATE UNIQUE INDEX IF NOT EXISTS customer_pkey ON public.customer USING btree (c_custkey);
CREATE INDEX IF NOT EXISTS idx1 ON public.lineitem USING btree (l_orderkey) WHERE (l_commitdate < l_receiptdate);
CREATE INDEX IF NOT EXISTS idx_lineitem__11 ON public.lineitem USING btree (l_shipdate);
CREATE INDEX IF NOT EXISTS idx_lineitem__2 ON public.lineitem USING btree (l_partkey);
CREATE INDEX IF NOT EXISTS idx_lineitem__3 ON public.lineitem USING btree (l_suppkey);
CREATE INDEX IF NOT EXISTS idx_lineitem_orderkey ON public.lineitem USING btree (l_orderkey);
CREATE INDEX IF NOT EXISTS idx_lineitem_shipdate ON public.lineitem USING btree (l_shipdate, l_discount, l_quantity);
CREATE UNIQUE INDEX IF NOT EXISTS lineitem_pkey ON public.lineitem USING btree (l_orderkey, l_linenumber);
CREATE INDEX IF NOT EXISTS idx6 ON public.nation USING btree (n_nationkey, n_regionkey);
CREATE INDEX IF NOT EXISTS idx_nation_regionkey ON public.nation USING btree (n_regionkey);
CREATE UNIQUE INDEX IF NOT EXISTS nation_pkey ON public.nation USING btree (n_nationkey);
CREATE INDEX IF NOT EXISTS idx_orders__6 ON public.orders USING btree (o_orderpriority);
CREATE INDEX IF NOT EXISTS idx_orders_custkey ON public.orders USING btree (o_custkey);
CREATE INDEX IF NOT EXISTS idx_orders_orderdate ON public.orders USING btree (o_orderdate);
CREATE UNIQUE INDEX IF NOT EXISTS orders_pkey ON public.orders USING btree (o_orderkey);
CREATE INDEX IF NOT EXISTS idx_part_1 ON public.part USING btree (p_container, p_brand);
CREATE INDEX IF NOT EXISTS idx_part__5 ON public.part USING btree (p_type);
CREATE INDEX IF NOT EXISTS idx_part__6 ON public.part USING btree (p_size);
CREATE UNIQUE INDEX IF NOT EXISTS part_pkey ON public.part USING btree (p_partkey);
CREATE INDEX IF NOT EXISTS idx ON public.partsupp USING btree (ps_partkey, ps_suppkey, ps_supplycost);
CREATE INDEX IF NOT EXISTS idx_partsupp__4 ON public.partsupp USING btree (ps_supplycost);
CREATE INDEX IF NOT EXISTS idx_partsupp_partkey ON public.partsupp USING btree (ps_partkey);
CREATE INDEX IF NOT EXISTS idx_partsupp_suppkey ON public.partsupp USING btree (ps_suppkey);
CREATE UNIQUE INDEX IF NOT EXISTS partsupp_pkey ON public.partsupp USING btree (ps_partkey, ps_suppkey);
CREATE INDEX IF NOT EXISTS idx5 ON public.region USING btree (r_name, r_regionkey);
CREATE UNIQUE INDEX IF NOT EXISTS region_pkey ON public.region USING btree (r_regionkey);
CREATE INDEX IF NOT EXISTS idx4 ON public.supplier USING btree (s_suppkey, s_nationkey);
CREATE INDEX IF NOT EXISTS idx_supplier_nation_key ON public.supplier USING btree (s_nationkey);
CREATE UNIQUE INDEX IF NOT EXISTS supplier_pkey ON public.supplier USING btree (s_suppkey);';
  for sql in select regexp_split_to_table(idx_sql, ';') loop
    continue when btrim(sql) = '';  -- skip the empty fragment after the last semicolon
    ts1 := clock_timestamp();
    raise notice '% start execute: %', ts1, sql;
    execute sql;
    commit;  -- PG 11 allows transaction control inside DO blocks
    ts2 := clock_timestamp();
    raise notice '% end execute: % , duration: %', ts2, sql, ts2-ts1;
  end loop;
  foreach n in array tables loop
    execute format('analyze %s', n);
  end loop;
end;
$$;
2. Force parallel query throughout the tests
alter role postgres set enable_nestloop=off;
alter role postgres set enable_mergejoin=off;
alter role postgres set work_mem ='8GB';
alter role postgres set max_parallel_workers_per_gather=24;
alter role postgres set parallel_setup_cost=0;
alter role postgres set parallel_tuple_cost=0;
alter role postgres set min_parallel_table_scan_size=0;
alter role postgres set min_parallel_index_scan_size=0;
alter role postgres set geqo_threshold=100;
alter table customer set (parallel_workers=24);
alter table lineitem set (parallel_workers=24);
alter table nation set (parallel_workers=24);
alter table orders set (parallel_workers=24);
alter table part set (parallel_workers=24);
alter table partsupp set (parallel_workers=24);
alter table region set (parallel_workers=24);
alter table supplier set (parallel_workers=24);
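As a quick sanity check (not part of the original run), after applying these settings an EXPLAIN of a simple aggregate should show a Gather node with the expected number of workers:

```sql
explain (costs off) select count(*) from lineitem;
-- expect a plan shaped roughly like:
--   Finalize Aggregate
--     ->  Gather
--           Workers Planned: 24
--           ->  Partial Aggregate
--                 ->  Parallel Seq Scan on lineitem
```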
3. Q17: one additional index
create index idx_part_1 on part (p_container,p_brand);
4. PG 11: calibrated optimizer cost constants
《Optimizer cost constant calibration - PostgreSQL explain cost constants alignment to timestamp》
postgresql.auto.conf:
cpu_tuple_cost=0.00018884145574257426
cpu_index_tuple_cost = 0.00433497085216479990
cpu_operator_cost = 0.00216748542608239995
seq_page_cost=0.014329
random_page_cost =0.016
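Values in postgresql.auto.conf are normally written with ALTER SYSTEM rather than edited by hand; assuming that approach, the calibration above could be applied as:

```sql
ALTER SYSTEM SET cpu_tuple_cost = 0.00018884145574257426;
ALTER SYSTEM SET cpu_index_tuple_cost = 0.00433497085216479990;
ALTER SYSTEM SET cpu_operator_cost = 0.00216748542608239995;
ALTER SYSTEM SET seq_page_cost = 0.014329;
ALTER SYSTEM SET random_page_cost = 0.016;
SELECT pg_reload_conf();  -- these GUCs only need a reload, not a restart
```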
postgresql.conf:
listen_addresses = '0.0.0.0'
port = 1921
max_connections = 2000
superuser_reserved_connections = 3
unix_socket_directories = '., /var/run/postgresql, /tmp'
tcp_keepalives_idle = 60
tcp_keepalives_interval = 10
tcp_keepalives_count = 10
shared_buffers = 48GB
max_prepared_transactions = 2000
work_mem = 4GB
maintenance_work_mem = 2GB
dynamic_shared_memory_type = posix
vacuum_cost_delay = 0
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 10.0
effective_io_concurrency = 4
max_worker_processes = 128
max_parallel_maintenance_workers = 16
max_parallel_workers_per_gather = 24
parallel_leader_participation = on
max_parallel_workers = 32
wal_level = minimal
synchronous_commit = off
wal_writer_delay = 10ms
checkpoint_timeout = 35min
max_wal_size = 96GB
min_wal_size = 24GB
checkpoint_completion_target = 0.1
max_wal_senders = 0
random_page_cost = 1.1
effective_cache_size = 400GB
default_statistics_target = 1000
log_destination = 'csvlog'
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose
log_line_prefix = '%m [%p] '
log_timezone = 'PRC'
log_autovacuum_min_duration = 0
autovacuum_freeze_max_age = 1200000000
autovacuum_multixact_freeze_max_age = 1400000000
autovacuum_vacuum_cost_delay = 0ms
vacuum_freeze_table_age = 1150000000
vacuum_multixact_freeze_table_age = 1150000000
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
default_text_search_config = 'pg_catalog.english'
5. Deepgreen v18
http://vitessedata.com/products/deepgreen-db/download/install/
gp_vmem_protect_limit=16384;
TPC-H (SF=10) performance comparison (times in seconds)
数据库 | q1 | q2 | q3 | q4 | q5 | q6 | q7 | q8 | q9 | q10 | q11 | q12 | q13 | q14 | q15 | q16 | q17 | q18 | q19 | q20 | q21 | q22 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PostgreSQL 10 | 17 | 2 | 41 | 6 | 14 | 4 | 27 | 31 | 54 | 16 | 7 | 100 | 213 | 4 | 25 | 60 | 552 | 1317 | 11 | 82 | 95 | 143 |
PostgreSQL 11beta3 | 3 | 2 | 2 | 5 | 3 | 4 | 2 | 2 | 9 | 3 | 1 | 2 | 4 | 2 | 6 | 10 | 27 | 46 | 2 | 10 | 13 | 2 |
DeepGreen v16 | 6 | 4 | 10 | 4 | 7 | 1 | 27 | 10 | 283 | 5 | 2 | 5 | 14 | 1 | 2 | 5 | 31 | 53 | 6 | 10 | 246 | 7 |
DeepGreen v18 | 2 | 1 | 1 | 1 | 1 | 0 | 3 | 1 | 3 | 2 | 1 | 2 | 2 | 0 | 0 | 1 | 5 | 3 | 2 | 1 | 3 | 1 |
SF=200
SF=200: Deepgreen on ECS (64 cores, 512 GB RAM, 8 × 2 TB SSD, LVM striped), 33 segments
2018-08-25 15:03:59 [1535180639] : running TPC-H benchmark
2018-08-25 15:03:59 [1535180639] : running queries defined in TPC-H benchmark
2018-08-25 15:03:59 [1535180639] : running query 1
2018-08-25 15:03:59 [1535180639] : run explain
2018-08-25 15:03:59 [1535180639] : run the query on background
2018-08-25 15:04:12 [1535180652] : query 1 finished OK (12 seconds)
2018-08-25 15:04:12 [1535180652] : running query 2
2018-08-25 15:04:12 [1535180652] : run explain
2018-08-25 15:04:12 [1535180652] : run the query on background
2018-08-25 15:04:15 [1535180655] : query 2 finished OK (3 seconds)
2018-08-25 15:04:15 [1535180655] : running query 3
2018-08-25 15:04:15 [1535180655] : run explain
2018-08-25 15:04:15 [1535180655] : run the query on background
2018-08-25 15:04:27 [1535180667] : query 3 finished OK (11 seconds)
2018-08-25 15:04:27 [1535180667] : running query 4
2018-08-25 15:04:27 [1535180667] : run explain
2018-08-25 15:04:27 [1535180667] : run the query on background
2018-08-25 15:04:34 [1535180674] : query 4 finished OK (7 seconds)
2018-08-25 15:04:34 [1535180674] : running query 5
2018-08-25 15:04:34 [1535180674] : run explain
2018-08-25 15:04:34 [1535180674] : run the query on background
2018-08-25 15:04:47 [1535180687] : query 5 finished OK (12 seconds)
2018-08-25 15:04:47 [1535180687] : running query 6
2018-08-25 15:04:47 [1535180687] : run explain
2018-08-25 15:04:47 [1535180687] : run the query on background
2018-08-25 15:04:48 [1535180688] : query 6 finished OK (1 seconds)
2018-08-25 15:04:48 [1535180688] : running query 7
2018-08-25 15:04:48 [1535180688] : run explain
2018-08-25 15:04:48 [1535180688] : run the query on background
2018-08-25 15:05:21 [1535180721] : query 7 finished OK (32 seconds)
2018-08-25 15:05:21 [1535180721] : running query 8
2018-08-25 15:05:21 [1535180721] : run explain
2018-08-25 15:05:21 [1535180721] : run the query on background
2018-08-25 15:05:32 [1535180732] : query 8 finished OK (11 seconds)
2018-08-25 15:05:32 [1535180732] : running query 9
2018-08-25 15:05:32 [1535180732] : run explain
2018-08-25 15:05:32 [1535180732] : run the query on background
2018-08-25 15:06:02 [1535180762] : query 9 finished OK (29 seconds)
2018-08-25 15:06:02 [1535180762] : running query 10
2018-08-25 15:06:02 [1535180762] : run explain
2018-08-25 15:06:02 [1535180762] : run the query on background
2018-08-25 15:06:11 [1535180771] : query 10 finished OK (9 seconds)
2018-08-25 15:06:11 [1535180771] : running query 11
2018-08-25 15:06:11 [1535180771] : run explain
2018-08-25 15:06:11 [1535180771] : run the query on background
2018-08-25 15:06:14 [1535180774] : query 11 finished OK (2 seconds)
2018-08-25 15:06:14 [1535180774] : running query 12
2018-08-25 15:06:14 [1535180774] : run explain
2018-08-25 15:06:14 [1535180774] : run the query on background
2018-08-25 15:06:23 [1535180783] : query 12 finished OK (9 seconds)
2018-08-25 15:06:23 [1535180783] : running query 13
2018-08-25 15:06:23 [1535180783] : run explain
2018-08-25 15:06:23 [1535180783] : run the query on background
2018-08-25 15:06:41 [1535180801] : query 13 finished OK (17 seconds)
2018-08-25 15:06:41 [1535180801] : running query 14
2018-08-25 15:06:41 [1535180801] : run explain
2018-08-25 15:06:41 [1535180801] : run the query on background
2018-08-25 15:06:42 [1535180802] : query 14 finished OK (1 seconds)
2018-08-25 15:06:42 [1535180802] : running query 15
2018-08-25 15:06:42 [1535180802] : run explain
2018-08-25 15:06:42 [1535180802] : run the query on background
2018-08-25 15:06:46 [1535180806] : query 15 finished OK (4 seconds)
2018-08-25 15:06:46 [1535180806] : running query 16
2018-08-25 15:06:46 [1535180806] : run explain
2018-08-25 15:06:46 [1535180806] : run the query on background
2018-08-25 15:06:56 [1535180816] : query 16 finished OK (9 seconds)
2018-08-25 15:06:56 [1535180816] : running query 17
2018-08-25 15:06:56 [1535180816] : run explain
2018-08-25 15:06:56 [1535180816] : run the query on background
2018-08-25 15:08:06 [1535180886] : query 17 finished OK (69 seconds)
2018-08-25 15:08:06 [1535180886] : running query 18
2018-08-25 15:08:06 [1535180886] : run explain
2018-08-25 15:08:06 [1535180886] : run the query on background
2018-08-25 15:08:30 [1535180910] : query 18 finished OK (24 seconds)
2018-08-25 15:08:30 [1535180910] : running query 19
2018-08-25 15:08:30 [1535180910] : run explain
2018-08-25 15:08:31 [1535180911] : run the query on background
2018-08-25 15:08:42 [1535180922] : query 19 finished OK (11 seconds)
2018-08-25 15:08:42 [1535180922] : running query 20
2018-08-25 15:08:42 [1535180922] : run explain
2018-08-25 15:08:42 [1535180922] : run the query on background
2018-08-25 15:08:55 [1535180936] : query 20 finished OK (13 seconds)
2018-08-25 15:08:56 [1535180936] : running query 21
2018-08-25 15:08:56 [1535180936] : run explain
2018-08-25 15:08:56 [1535180936] : run the query on background
2018-08-25 15:09:16 [1535180956] : query 21 finished OK (20 seconds)
2018-08-25 15:09:16 [1535180956] : running query 22
2018-08-25 15:09:16 [1535180956] : run explain
2018-08-25 15:09:16 [1535180956] : run the query on background
2018-08-25 15:09:26 [1535180966] : query 22 finished OK (9 seconds)
2018-08-25 15:09:26 [1535180966] : finished TPC-H benchmark
SF=200, PG 11
A few slow SQL statements were tuned with hints, modifying these query files:
gp_tpch/dss/queries/2.sql, 17.sql, 18.sql
《PostgreSQL built with-llvm (JIT): handling extension build errors (clang: Command not found)》
《When hints shine - PG optimizer parameter tuning and plan-pinning cases》
Q2
/*+ Leading((partsupp ((nation region) supplier))) Leading((part (((nation region) supplier) partsupp))) */
select
    s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from
    part, supplier, partsupp, nation, region
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = 47
    and p_type like '%NICKEL'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'ASIA'
    and ps_supplycost = (
        select min(ps_supplycost)
        from partsupp, supplier, nation, region
        where p_partkey = ps_partkey
          and s_suppkey = ps_suppkey
          and s_nationkey = n_nationkey
          and n_regionkey = r_regionkey
          and r_name = 'ASIA'
    )
order by
    s_acctbal desc, n_name, s_name, p_partkey
LIMIT 100;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=4589445377.60..4589445377.60 rows=1 width=198)
  ->  Sort  (cost=4589445377.60..4589445377.60 rows=1 width=198)
        Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
        ->  Hash Join  (cost=225752.90..4589445377.59 rows=1 width=198)
              Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))
              ->  Gather  (cost=0.00..18775.09 rows=158816 width=34)
                    Workers Planned: 28
                    ->  Parallel Seq Scan on part  (cost=0.00..18775.09 rows=5672 width=34)
                          Filter: (((p_type)::text ~~ '%NICKEL'::text) AND (p_size = 47))
              ->  Hash  (cost=69355.84..69355.84 rows=31998515 width=180)
                    ->  Gather  (cost=69220.57..69355.84 rows=31998515 width=180)
                          Workers Planned: 28
                          ->  Parallel Hash Join  (cost=69220.57..69355.84 rows=1142804 width=180)
                                Hash Cond: (supplier.s_suppkey = partsupp.ps_suppkey)
                                ->  Parallel Hash Join  (cost=858.16..861.87 rows=14286 width=172)
                                      Hash Cond: (nation.n_nationkey = supplier.s_nationkey)
                                      ->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=34)
                                            Hash Cond: (nation.n_regionkey = region.r_regionkey)
                                            ->  Parallel Seq Scan on nation  (cost=0.00..0.01 rows=1 width=42)
                                            ->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)
                                                  ->  Parallel Seq Scan on region  (cost=0.00..0.01 rows=1 width=8)
                                                        Filter: (r_name = 'ASIA'::bpchar)
                                      ->  Parallel Hash  (cost=689.83..689.83 rows=71429 width=154)
                                            ->  Parallel Seq Scan on supplier  (cost=0.00..689.83 rows=71429 width=154)
                                ->  Parallel Hash  (cost=54898.31..54898.31 rows=5714021 width=24)
                                      ->  Parallel Seq Scan on partsupp  (cost=0.00..54898.31 rows=5714021 width=24)
              SubPlan 1
                ->  Aggregate  (cost=1806.01..1806.01 rows=1 width=8)
                      ->  Hash Join  (cost=1804.89..1806.00 rows=7 width=8)
                            Hash Cond: (partsupp_1.ps_suppkey = supplier_1.s_suppkey)
                            ->  Index Scan using idx_partsupp_partkey on partsupp partsupp_1  (cost=0.49..1.48 rows=37 width=16)
                                  Index Cond: (part.p_partkey = ps_partkey)
                            ->  Hash  (cost=861.87..861.87 rows=400000 width=8)
                                  ->  Gather  (cost=858.16..861.87 rows=400000 width=8)
                                        Workers Planned: 28
                                        ->  Parallel Hash Join  (cost=858.16..861.87 rows=14286 width=8)
                                              Hash Cond: (nation_1.n_nationkey = supplier_1.s_nationkey)
                                              ->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=8)
                                                    Hash Cond: (nation_1.n_regionkey = region_1.r_regionkey)
                                                    ->  Parallel Seq Scan on nation nation_1  (cost=0.00..0.01 rows=1 width=16)
                                                    ->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)
                                                          ->  Parallel Seq Scan on region region_1  (cost=0.00..0.01 rows=1 width=8)
                                                                Filter: (r_name = 'ASIA'::bpchar)
                                              ->  Parallel Hash  (cost=689.83..689.83 rows=71429 width=16)
                                                    ->  Parallel Seq Scan on supplier supplier_1  (cost=0.00..689.83 rows=71429 width=16)
JIT:
  Functions: 98
  Inlining: true
  Optimization: true
(49 rows)
Update
For Q2, it is better to add the following index and use a nestloop join:
create index idx on partsupp (ps_partkey,ps_suppkey,ps_supplycost);
set enable_nestloop=on;
-- or
/*+ Set(enable_nestloop on) */
select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from part, supplier, partsupp, nation, region
where p_partkey = ps_partkey
  and s_suppkey = ps_suppkey
  and p_size = 47
  and p_type like '%NICKEL'
  and s_nationkey = n_nationkey
  and n_regionkey = r_regionkey
  and r_name = 'ASIA'
  and ps_supplycost = (
      select min(ps_supplycost)
      from partsupp, supplier, nation, region
      where p_partkey = ps_partkey
        and s_suppkey = ps_suppkey
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = 'ASIA'
  )
order by s_acctbal desc, n_name, s_name, p_partkey
LIMIT 100;

Time: 5345.093 ms (00:05.345)

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=2807136.16..2807136.16 rows=1 width=198)
  ->  Sort  (cost=2807136.16..2807136.16 rows=1 width=198)
        Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
        ->  Nested Loop  (cost=17.90..2807136.15 rows=1 width=198)
              Join Filter: (nation.n_nationkey = supplier.s_nationkey)
              ->  Nested Loop  (cost=17.88..2807135.97 rows=6 width=180)
                    ->  Nested Loop  (cost=17.51..2807133.70 rows=6 width=42)
                          ->  Index Scan using idx_part__6 on part  (cost=0.49..15906.04 rows=160623 width=34)
                                Index Cond: (p_size = 47)
                                Filter: ((p_type)::text ~~ '%NICKEL'::text)
                          ->  Index Only Scan using idx on partsupp  (cost=17.02..17.38 rows=1 width=24)
                                Index Cond: ((ps_partkey = part.p_partkey) AND (ps_supplycost = (SubPlan 1)))
                                SubPlan 1
                                  ->  Aggregate  (cost=16.52..16.52 rows=1 width=8)
                                        ->  Hash Join  (cost=0.91..16.51 rows=8 width=8)
                                              Hash Cond: (supplier_1.s_nationkey = nation_1.n_nationkey)
                                              ->  Nested Loop  (cost=0.86..16.34 rows=38 width=16)
                                                    ->  Index Scan using idx_partsupp_partkey on partsupp partsupp_1  (cost=0.49..0.78 rows=38 width=16)
                                                          Index Cond: (part.p_partkey = ps_partkey)
                                                    ->  Index Scan using supplier_pkey on supplier supplier_1  (cost=0.37..0.41 rows=1 width=16)
                                                          Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
                                              ->  Hash  (cost=0.03..0.03 rows=5 width=8)
                                                    ->  Gather  (cost=0.02..0.03 rows=5 width=8)
                                                          Workers Planned: 28
                                                          ->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=8)
                                                                Hash Cond: (nation_1.n_regionkey = region_1.r_regionkey)
                                                                ->  Parallel Seq Scan on nation nation_1  (cost=0.00..0.01 rows=1 width=16)
                                                                ->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)
                                                                      ->  Parallel Seq Scan on region region_1  (cost=0.00..0.01 rows=1 width=8)
                                                                            Filter: (r_name = 'ASIA'::bpchar)
                    ->  Index Scan using supplier_pkey on supplier  (cost=0.37..0.38 rows=1 width=154)
                          Index Cond: (s_suppkey = partsupp.ps_suppkey)
              ->  Materialize  (cost=0.02..0.06 rows=5 width=34)
                    ->  Gather  (cost=0.02..0.03 rows=5 width=34)
                          Workers Planned: 28
                          ->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=34)
                                Hash Cond: (nation.n_regionkey = region.r_regionkey)
                                ->  Parallel Seq Scan on nation  (cost=0.00..0.01 rows=1 width=42)
                                ->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)
                                      ->  Parallel Seq Scan on region  (cost=0.00..0.01 rows=1 width=8)
                                            Filter: (r_name = 'ASIA'::bpchar)
JIT:
  Functions: 67
  Inlining: true
  Optimization: true
(45 rows)
Q4
create index idx1 on lineitem (l_orderkey) where l_commitdate < l_receiptdate;
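For context (the query text is not reproduced in this post), TPC-H Q4 counts orders that have at least one late lineitem via an EXISTS subquery whose predicate matches this partial index exactly, so the subquery can be answered from the index alone. The standard Q4 shape, with substitution parameters that vary by run:

```sql
select o_orderpriority, count(*) as order_count
from orders
where o_orderdate >= date '1993-07-01'
  and o_orderdate < date '1993-07-01' + interval '3' month
  and exists (
      select * from lineitem
      where l_orderkey = o_orderkey
        and l_commitdate < l_receiptdate   -- same predicate as the partial index
  )
group by o_orderpriority
order by o_orderpriority;
```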
Q17
/*+ Set(work_mem 12GB) Set(max_parallel_workers_per_gather 28) */ explain
select
    sum(l_extendedprice) / 7.0 as avg_yearly
from
    lineitem,
    part,
    (SELECT l_partkey AS agg_partkey, 0.2 * avg(l_quantity) AS avg_quantity FROM lineitem GROUP BY l_partkey) part_agg
where
    p_partkey = l_partkey
    and agg_partkey = l_partkey
    and p_brand = 'Brand#11'
    and p_container = 'SM PKG'
    and l_quantity < avg_quantity
LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=6130644.31..6130644.31 rows=1 width=8)
  ->  Aggregate  (cost=6130644.31..6130644.31 rows=1 width=8)
        ->  Hash Join  (cost=5647079.65..6129794.97 rows=391854 width=8)
              Hash Cond: (part.p_partkey = lineitem_1.l_partkey)
              Join Filter: (lineitem.l_quantity < (('0.2'::double precision * avg(lineitem_1.l_quantity))))
              ->  Gather  (cost=11834.33..491832.73 rows=1193799 width=32)
                    Workers Planned: 28
                    ->  Parallel Hash Join  (cost=11834.33..491832.73 rows=42636 width=32)
                          Hash Cond: (lineitem.l_partkey = part.p_partkey)
                          ->  Parallel Seq Scan on lineitem  (cost=0.00..382461.00 rows=42857385 width=24)
                          ->  Parallel Hash  (cost=11830.99..11830.99 rows=1421 width=8)
                                ->  Parallel Bitmap Heap Scan on part  (cost=427.06..11830.99 rows=1421 width=8)
                                      Recheck Cond: ((p_container = 'SM PKG'::bpchar) AND (p_brand = 'Brand#11'::bpchar))
                                      ->  Bitmap Index Scan on idx_part_1  (cost=0.00..418.44 rows=39794 width=0)
                                            Index Cond: ((p_container = 'SM PKG'::bpchar) AND (p_brand = 'Brand#11'::bpchar))
              ->  Hash  (cost=5542429.84..5542429.84 rows=39389900 width=16)
                    ->  Finalize HashAggregate  (cost=5356798.88..5534991.39 rows=39389900 width=16)
                          Group Key: lineitem_1.l_partkey
                          ->  Gather  (cost=568246.52..575684.96 rows=1102917200 width=40)
                                Workers Planned: 28
                                ->  Partial HashAggregate  (cost=568246.52..575684.96 rows=39389900 width=40)
                                      Group Key: lineitem_1.l_partkey
                                      ->  Parallel Seq Scan on lineitem lineitem_1  (cost=0.00..382461.00 rows=42857385 width=16)
JIT:
  Functions: 44
  Inlining: true
  Optimization: true
(27 rows)
Q18
/*+ Leading( ((lineitem (customer orders)) ANY_subquery) ) */ explain
select
    c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from
    customer, orders, lineitem
where
    o_orderkey in (
        select l_orderkey
        from lineitem
        group by l_orderkey
        having sum(l_quantity) > 312
    )
    and c_custkey = o_custkey
    and o_orderkey = l_orderkey
group by
    c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by
    o_totalprice desc, o_orderdate
LIMIT 100;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=5094012.87..5094013.09 rows=100 width=55)
  ->  Sort  (cost=5094012.87..5104543.55 rows=4858479 width=55)
        Sort Key: orders.o_totalprice DESC, orders.o_orderdate
        ->  HashAggregate  (cost=4932105.34..4933022.83 rows=4858479 width=55)
              Group Key: customer.c_custkey, orders.o_orderkey
              ->  Hash Join  (cost=1182913.25..4900513.30 rows=4858479 width=55)
                    Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)
                    ->  Gather  (cost=160021.94..1146574.91 rows=1200006784 width=63)
                          Workers Planned: 28
                          ->  Parallel Hash Join  (cost=160021.94..1146574.91 rows=42857385 width=63)
                                Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                ->  Parallel Seq Scan on lineitem  (cost=0.00..382461.00 rows=42857385 width=16)
                                ->  Parallel Hash  (cost=134775.97..134775.97 rows=10714120 width=47)
                                      ->  Parallel Hash Join  (cost=108506.79..134775.97 rows=10714120 width=47)
                                            Hash Cond: (customer.c_custkey = orders.o_custkey)
                                            ->  Parallel Seq Scan on customer  (cost=0.00..11101.38 rows=1071468 width=27)
                                            ->  Parallel Hash  (cost=83260.82..83260.82 rows=10714120 width=28)
                                                  ->  Parallel Seq Scan on orders  (cost=0.00..83260.82 rows=10714120 width=28)
                    ->  Hash  (cost=1020029.33..1020029.33 rows=1214594 width=8)
                          ->  Finalize HashAggregate  (cost=1011214.02..1019799.96 rows=1214594 width=8)
                                Group Key: lineitem_1.l_orderkey
                                Filter: (sum(lineitem_1.l_quantity) > '312'::double precision)
                                ->  Gather  (cost=568246.52..568934.61 rows=102025924 width=16)
                                      Workers Planned: 28
                                      ->  Partial HashAggregate  (cost=568246.52..568934.61 rows=3643783 width=16)
                                            Group Key: lineitem_1.l_orderkey
                                            ->  Parallel Seq Scan on lineitem lineitem_1  (cost=0.00..382461.00 rows=42857385 width=16)
(27 rows)
SF=200, PG 11 results
2018-08-25 19:46:22 [1535197582] : running TPC-H benchmark
2018-08-25 19:46:22 [1535197582] : running queries defined in TPC-H benchmark
2018-08-25 19:46:22 [1535197582] : running query 1
2018-08-25 19:46:22 [1535197582] : run explain
2018-08-25 19:46:22 [1535197582] : run the query on background
2018-08-25 19:46:40 [1535197600] : query 1 finished OK (18 seconds)
2018-08-25 19:46:40 [1535197600] : running query 2
2018-08-25 19:46:40 [1535197600] : run explain
2018-08-25 19:46:40 [1535197600] : run the query on background
2018-08-25 19:47:19 [1535197639] : query 2 finished OK (38 seconds)
2018-08-25 19:47:19 [1535197639] : running query 3
2018-08-25 19:47:19 [1535197639] : run explain
2018-08-25 19:47:21 [1535197641] : run the query on background
2018-08-25 19:47:46 [1535197666] : query 3 finished OK (25 seconds)
2018-08-25 19:47:46 [1535197666] : running query 4
2018-08-25 19:47:46 [1535197666] : run explain
2018-08-25 19:47:47 [1535197667] : run the query on background
2018-08-25 19:48:19 [1535197699] : query 4 finished OK (32 seconds)
2018-08-25 19:48:19 [1535197699] : running query 5
2018-08-25 19:48:19 [1535197699] : run explain
2018-08-25 19:48:19 [1535197699] : run the query on background
2018-08-25 19:49:16 [1535197756] : query 5 finished OK (57 seconds)
2018-08-25 19:49:17 [1535197757] : running query 6
2018-08-25 19:49:17 [1535197757] : run explain
2018-08-25 19:49:17 [1535197757] : run the query on background
2018-08-25 19:49:25 [1535197765] : query 6 finished OK (8 seconds)
2018-08-25 19:49:25 [1535197765] : running query 7
2018-08-25 19:49:25 [1535197765] : run explain
2018-08-25 19:49:25 [1535197765] : run the query on background
2018-08-25 19:50:17 [1535197817] : query 7 finished OK (52 seconds)
2018-08-25 19:50:17 [1535197817] : running query 8
2018-08-25 19:50:18 [1535197818] : run explain
2018-08-25 19:50:18 [1535197818] : run the query on background
2018-08-25 19:50:42 [1535197842] : query 8 finished OK (24 seconds)
2018-08-25 19:50:42 [1535197842] : running query 9
2018-08-25 19:50:42 [1535197842] : run explain
2018-08-25 19:50:42 [1535197842] : run the query on background
2018-08-25 19:51:49 [1535197909] : query 9 finished OK (66 seconds)
2018-08-25 19:51:49 [1535197909] : running query 10
2018-08-25 19:51:49 [1535197909] : run explain
2018-08-25 19:51:49 [1535197909] : run the query on background
2018-08-25 19:52:27 [1535197947] : query 10 finished OK (38 seconds)
2018-08-25 19:52:27 [1535197947] : running query 11
2018-08-25 19:52:27 [1535197947] : run explain
2018-08-25 19:52:28 [1535197948] : run the query on background
2018-08-25 19:52:52 [1535197972] : query 11 finished OK (24 seconds)
2018-08-25 19:52:52 [1535197972] : running query 12
2018-08-25 19:52:52 [1535197972] : run explain
2018-08-25 19:52:52 [1535197972] : run the query on background
2018-08-25 19:53:18 [1535197998] : query 12 finished OK (26 seconds)
2018-08-25 19:53:18 [1535197998] : running query 13
2018-08-25 19:53:18 [1535197998] : run explain
2018-08-25 19:53:19 [1535197999] : run the query on background
2018-08-25 19:54:58 [1535198098] : query 13 finished OK (98 seconds)
2018-08-25 19:54:58 [1535198098] : running query 14
2018-08-25 19:54:58 [1535198098] : run explain
2018-08-25 19:54:58 [1535198098] : run the query on background
2018-08-25 19:55:11 [1535198111] : query 14 finished OK (13 seconds)
2018-08-25 19:55:11 [1535198111] : running query 15
2018-08-25 19:55:11 [1535198111] : run explain
2018-08-25 19:55:11 [1535198111] : run the query on background
2018-08-25 19:56:10 [1535198170] : query 15 finished OK (58 seconds)
2018-08-25 19:56:10 [1535198170] : running query 16
2018-08-25 19:56:10 [1535198170] : run explain
2018-08-25 19:56:10 [1535198170] : run the query on background
2018-08-25 19:58:05 [1535198285] : query 16 finished OK (114 seconds)
2018-08-25 19:58:05 [1535198285] : running query 17
2018-08-25 19:58:05 [1535198285] : run explain
2018-08-25 19:58:07 [1535198287] : run the query on background
2018-08-25 20:10:21 [1535199021] : query 17 finished OK (732 seconds)
2018-08-25 20:10:21 [1535199021] : running query 18
2018-08-25 20:10:21 [1535199021] : run explain
2018-08-25 20:10:22 [1535199022] : run the query on background
2018-08-25 20:20:19 [1535199619] : query 18 finished OK (595 seconds)
2018-08-25 20:20:19 [1535199619] : running query 19
2018-08-25 20:20:19 [1535199619] : run explain
2018-08-25 20:20:19 [1535199619] : run the query on background
2018-08-25 20:20:31 [1535199631] : query 19 finished OK (12 seconds)
2018-08-25 20:20:31 [1535199631] : running query 20
2018-08-25 20:20:31 [1535199631] : run explain
2018-08-25 20:20:34 [1535199634] : run the query on background
2018-08-25 20:24:08 [1535199848] : query 20 finished OK (213 seconds)
2018-08-25 20:24:08 [1535199848] : running query 21
2018-08-25 20:24:08 [1535199848] : run explain
2018-08-25 20:24:08 [1535199848] : run the query on background
2018-08-25 20:26:13 [1535199973] : query 21 finished OK (124 seconds)
2018-08-25 20:26:13 [1535199973] : running query 22
2018-08-25 20:26:13 [1535199973] : run explain
2018-08-25 20:26:14 [1535199974] : run the query on background
2018-08-25 20:26:28 [1535199988] : query 22 finished OK (14 seconds)
2018-08-25 20:26:28 [1535199988] : finished TPC-H benchmark
TPC-H (SF=200) performance comparison (times in seconds)
数据库 | q1 | q2 | q3 | q4 | q5 | q6 | q7 | q8 | q9 | q10 | q11 | q12 | q13 | q14 | q15 | q16 | q17 | q18 | q19 | q20 | q21 | q22 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PostgreSQL 11beta3 | 18 | 38 | 25 | 32 | 57 | 8 | 52 | 24 | 66 | 38 | 24 | 26 | 98 | 13 | 58 | 114 | 732 | 595 | 12 | 213 | 124 | 14 |
DeepGreen v18 | 12 | 3 | 11 | 7 | 12 | 1 | 32 | 11 | 29 | 9 | 2 | 9 | 17 | 1 | 4 | 9 | 69 | 24 | 11 | 13 | 20 | 9 |
Summary
PostgreSQL 11's parallel hash aggregate and parallel hash join deliver a qualitative performance leap.
PostgreSQL 11 also supports PARALLEL index builds, so index creation is dramatically faster as well.
(SF=10) Compared with PostgreSQL 10, PostgreSQL 11's analytical capability is greatly enhanced: without column storage and without partitioned tables, it already comes close to Deepgreen (built by the original Greenplum team). Thanks to ck and Tian Feng.
PostgreSQL 11's optimizer still has room to improve; for some TPC-H queries the chosen plan is not optimal. Extensions such as aqo can strengthen the optimizer:
https://github.com/postgrespro/aqo
HTAP is an inevitable direction for PostgreSQL. Going forward, the citus extension could add sharding (and column storage could be introduced as well).
References
《TPC-H benchmark - PostgreSQL 10 vs Deepgreen (Greenplum)》
https://github.com/digoal/gp_tpch
《PostgreSQL 10 + PostGIS + Sharding (pg_pathman) + MySQL (fdw foreign tables) on ECS deployment guide (for new users)》
《PostgreSQL on ECS: multi-cloud-disk deployment, snapshot backup and restore》