Author: Ibrar Ahmed, Percona

https://www.percona.com/blog/2019/07/30/parallelism-in-postgresql/

PostgreSQL is one of the finest object-relational databases, and its architecture is process-based rather than thread-based. While almost all current database systems use threads for parallelism, PostgreSQL’s process-based architecture was implemented prior to POSIX threads. PostgreSQL launches a “postmaster” process on startup and then spawns a new process whenever a new client connects to PostgreSQL.

Before version 9.6 there was no parallelism within a single connection. Queries from different clients could run concurrently because each connection has its own process, but they gained no performance benefit from one another. In other words, a single query ran serially and could not use more than one CPU core, which is a huge limitation on multi-core machines. Parallelism was introduced in PostgreSQL 9.6. Here, parallelism means that a single query can be spread across several worker processes so that it uses multiple cores of the system. This gives PostgreSQL intra-query parallelism.

Parallelism in PostgreSQL was implemented as part of multiple features which cover sequential scans, aggregates, and joins.

Components of Parallelism in PostgreSQL

There are three important components of parallelism in PostgreSQL: the process itself (the leader), the Gather node, and the workers. Without parallelism, the process itself handles all the data. When the planner decides that a query, or part of it, can be parallelized, it adds a Gather node within the parallelizable portion of the plan and makes it the root node of that subtree. Query execution starts at the leader, and all the serial parts of the plan are run by the leader. If parallelism is enabled and permissible for any part (or the whole) of the query, a Gather node with a set of workers is allocated for that part.

Workers are processes that run in parallel, each executing the part of the tree (the partial plan) that needs to be parallelized. The relation’s blocks are divided among the workers in such a way that access to the relation remains sequential. The number of workers is governed by settings in PostgreSQL’s configuration file (for example, max_parallel_workers_per_gather). The workers coordinate and communicate using shared memory, and once they have completed their work, the results are passed on to the leader for accumulation.
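The division of work between the leader, the Gather node, and the workers can be illustrated with a small model. The following Python sketch is not PostgreSQL code: the function name `parallel_scan`, the round-robin block assignment, and the list-of-lists table layout are assumptions made for illustration, and the workers are simulated inside a single process.

```python
from collections import deque


def parallel_scan(blocks, num_workers, predicate):
    """Model of a Gather node over a parallel scan (illustrative only).

    blocks      -- list of blocks, each block being a list of row values
    num_workers -- number of simulated workers
    predicate   -- filter applied to every row
    """
    pending = deque(range(len(blocks)))            # shared "next block" state
    emitted = [[] for _ in range(num_workers)]     # rows produced per worker
    claimed = [[] for _ in range(num_workers)]     # block numbers per worker
    worker = 0
    while pending:
        block_no = pending.popleft()               # blocks leave in ascending order
        claimed[worker].append(block_no)
        emitted[worker].extend(row for row in blocks[block_no] if predicate(row))
        worker = (worker + 1) % num_workers        # assumption: round-robin hand-out
    # The leader gathers whatever each worker emitted.
    gathered = [row for rows in emitted for row in rows]
    return gathered, claimed


table = [[5, -1], [0, 7], [3, 3], [-2, 9]]         # four blocks of abalance values
rows, claimed = parallel_scan(table, num_workers=2,
                              predicate=lambda abalance: abalance > 0)
print(claimed)       # [[0, 2], [1, 3]]
print(sorted(rows))  # [3, 3, 5, 7, 9]
```

Every block is claimed by exactly one worker, so the gathered result contains the same rows as a serial scan, although not necessarily in the same order.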

Parallel Sequential Scans

In PostgreSQL 9.6, support for the parallel sequential scan was added. A sequential scan is a scan on a table in which a sequence of blocks is evaluated one after the other. By its very nature this allows parallelism, so it was a natural candidate for the first implementation of parallelism. In a parallel sequential scan, the whole table is scanned sequentially by multiple workers. Here is a simple query against the pgbench_accounts table, which has 1,500,000,000 tuples; the query returns 63,165 rows. The total execution time is 4,343,080 ms. As there is no index defined, a sequential scan is used. The whole table is scanned by a single process, so only a single CPU core is used regardless of how many cores are available.

    db=# EXPLAIN ANALYZE SELECT *
    FROM pgbench_accounts
    WHERE abalance > 0;
                                  QUERY PLAN
    ----------------------------------------------------------------------
     Seq Scan on pgbench_accounts (cost=0.00..73708261.04 rows=1 width=97)
                                  (actual time=6868.238..4343052.233 rows=63165 loops=1)
       Filter: (abalance > 0)
       Rows Removed by Filter: 1499936835
     Planning Time: 1.155 ms
     Execution Time: 4343080.557 ms
    (5 rows)

What if these 1,500,000,000 rows are scanned in parallel using 10 workers? The execution time drops drastically.

    db=# EXPLAIN ANALYZE select * from pgbench_accounts where abalance > 0;
                                  QUERY PLAN
    ----------------------------------------------------------------------
     Gather  (cost=1000.00..45010087.20 rows=1 width=97)
             (actual time=14356.160..1628287.828 rows=63165 loops=1)
       Workers Planned: 10
       Workers Launched: 10
       ->  Parallel Seq Scan on pgbench_accounts
                 (cost=0.00..45009087.10 rows=1 width=97)
                 (actual time=43694.076..1628068.096 rows=5742 loops=11)
             Filter: (abalance > 0)
             Rows Removed by Filter: 136357894
     Planning Time: 37.714 ms
     Execution Time: 1628295.442 ms
    (8 rows)

Now the total execution time is 1,628,295 ms. With 10 workers scanning the table, the query runs roughly 2.67 times faster than the serial scan, which is about 62.5% less execution time.
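The ratio between the two execution times can be checked with a few lines of arithmetic. This is a minimal Python sketch; the two timings are taken from the EXPLAIN ANALYZE outputs above, while the helper names `speedup` and `time_reduction_pct` are introduced here only for illustration.

```python
def speedup(serial_ms, parallel_ms):
    """How many times faster the parallel run is than the serial run."""
    return serial_ms / parallel_ms


def time_reduction_pct(serial_ms, parallel_ms):
    """Percentage of the serial execution time that was saved."""
    return (1 - parallel_ms / serial_ms) * 100


serial_ms = 4343080.557      # Seq Scan, single process
parallel_ms = 1628295.442    # Parallel Seq Scan, 10 workers

print(f"{speedup(serial_ms, parallel_ms):.2f}x faster")              # 2.67x faster
print(f"{time_reduction_pct(serial_ms, parallel_ms):.1f}% less time")  # 62.5% less time
```

Note that ten workers yield far less than a tenfold speedup here, which matches the diminishing returns seen in the benchmark.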

Query used for the Benchmark:  SELECT * FROM pgbench_accounts WHERE abalance > 0;

Size of Table: 426GB

Total Rows in Table: 1500000000

The system used for the Benchmark:

    CPU: 2 Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz

    RAM: 256GB DDR3 1600

    DISK: ST3000NM0033

The above graph clearly shows how parallelism improves performance for a sequential scan. When a single worker is added, performance understandably degrades: no parallelism is gained, yet the creation of an additional Gather node and a single worker adds overhead. With more than one worker, however, performance improves significantly. It is also important to note that performance does not increase in a linear or exponential fashion. It improves gradually until adding more workers gives no further boost, much like approaching a horizontal asymptote. This benchmark was performed on a 64-core machine, and it is clear that having more than 10 workers will not give any significant performance boost.

Parallel Aggregates

In databases, calculating aggregates is a very expensive operation. When evaluated in a single process, aggregates take a reasonably long time. PostgreSQL 9.6 added the ability to calculate them in parallel by simply dividing the work into chunks (a divide and conquer strategy). Multiple workers each calculate a part of the aggregate, and the leader then calculates the final value(s) from these partial results. More technically speaking, PartialAggregate nodes are added to the plan tree, and each PartialAggregate node takes the output from one worker. These outputs are then emitted to a FinalizeAggregate node, which combines the aggregates from all PartialAggregate nodes. So effectively, the parallel plan has a FinalizeAggregate node at the root, with a Gather node below it that has the PartialAggregate nodes as children.
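The partial/finalize split can be sketched in a few lines of Python. This is an illustrative model rather than PostgreSQL's implementation: the function names and the way rows are split into chunks are assumptions, and each "worker" is just a function call. The average example shows why a partial result sometimes has to carry more state than the final value.

```python
def split(rows, num_workers):
    """Assumed chunking scheme: every num_workers-th row goes to the same worker."""
    return [rows[i::num_workers] for i in range(num_workers)]


def partial_count(chunk):
    """Partial Aggregate for count(*): one worker counts its own chunk."""
    return len(chunk)


def finalize_count(partials):
    """Finalize Aggregate for count(*): the leader adds up the partial counts."""
    return sum(partials)


def partial_avg(chunk):
    """Partial state for avg(): a (sum, count) pair, not an average."""
    return (sum(chunk), len(chunk))


def finalize_avg(partials):
    """Combine the (sum, count) pairs and only then divide."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


values = list(range(1, 101))
chunks = split(values, 4)
print(finalize_count([partial_count(c) for c in chunks]))  # 100
print(finalize_avg([partial_avg(c) for c in chunks]))      # 50.5
```

Averaging the per-worker averages would only be correct when all chunks have the same size, which is why the partial state keeps the sum and the count separately.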

    db=# EXPLAIN ANALYZE SELECT count(*) from pgbench_accounts;
                                  QUERY PLAN
    ----------------------------------------------------------------------
     Aggregate  (cost=73708261.04..73708261.05 rows=1 width=8)
                (actual time=2025408.357..2025408.358 rows=1 loops=1)
       ->  Seq Scan on pgbench_accounts  (cost=0.00..67330666.83 rows=2551037683 width=0)
                                         (actual time=8.162..1963979.618 rows=1500000000 loops=1)
     Planning Time: 54.295 ms
     Execution Time: 2025419.744 ms
    (4 rows)

The following is an example of a plan in which the aggregate is evaluated in parallel. You can clearly see the performance improvement here.

    db=# EXPLAIN ANALYZE SELECT count(*) from pgbench_accounts;
                                  QUERY PLAN
    ----------------------------------------------------------------------
     Finalize Aggregate  (cost=45010088.14..45010088.15 rows=1 width=8)
                         (actual time=1737802.625..1737802.625 rows=1 loops=1)
       ->  Gather  (cost=45010087.10..45010088.11 rows=10 width=8)
                   (actual time=1737791.426..1737808.572 rows=11 loops=1)
             Workers Planned: 10
             Workers Launched: 10
             ->  Partial Aggregate
                       (cost=45009087.10..45009087.11 rows=1 width=8)
                       (actual time=1737752.333..1737752.334 rows=1 loops=11)
                   ->  Parallel Seq Scan on pgbench_accounts
                             (cost=0.00..44371327.68 rows=255103768 width=0)
                             (actual time=7.037..1731083.005 rows=136363636 loops=11)
     Planning Time: 46.031 ms
     Execution Time: 1737817.346 ms
    (8 rows)

With parallel aggregates, in this particular case, the execution time drops from 2025419.744 ms to 1737817.346 ms when 10 parallel workers are involved. That is a performance boost of just over 16%, or about 14% less execution time.

Query used for the Benchmark:  SELECT count(*) FROM pgbench_accounts WHERE abalance > 0;

Size of Table: 426GB

Total Rows in Table: 1500000000

The system used for the Benchmark:

    CPU: 2 Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz

    RAM: 256GB DDR3 1600

    DISK: ST3000NM0033

Parallel Index (B-Tree) Scans

Parallel support for B-Tree indexes means that index pages are scanned in parallel. The B-Tree index is one of the most used index types in PostgreSQL. In the parallel version of a B-Tree scan, a worker descends the B-Tree and, when it reaches a leaf node, scans that block and wakes up a waiting worker to scan the next block.

Confused? Let’s look at an example. Suppose we have a table foo with id and name columns and 18 rows of data. We create an index on the id column of table foo. A system column, CTID, is attached to each row of the table and identifies the physical location of the row. A CTID value has two parts: the block number and the offset within that block.

    postgres=# SELECT ctid, id FROM foo;
      ctid  | id
    --------+-----
     (0,55) | 200
     (0,56) | 300
     (0,57) | 210
     (0,58) | 220
     (0,59) | 230
     (0,60) | 203
     (0,61) | 204
     (0,62) | 300
     (0,63) | 301
     (0,64) | 302
     (0,65) | 301
     (0,66) | 302
     (1,31) | 100
     (1,32) | 101
     (1,33) | 102
     (1,34) | 103
     (1,35) | 104
     (1,36) | 105
    (18 rows)
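To make the (block, offset) structure of the CTID values concrete, the short Python sketch below parses them and groups the rows by block. The `rows` list is copied from the query output above; the helper `parse_ctid` is a hypothetical name used only for this illustration.

```python
from collections import defaultdict

# (ctid, id) pairs copied from the SELECT output above
rows = [
    ("(0,55)", 200), ("(0,56)", 300), ("(0,57)", 210), ("(0,58)", 220),
    ("(0,59)", 230), ("(0,60)", 203), ("(0,61)", 204), ("(0,62)", 300),
    ("(0,63)", 301), ("(0,64)", 302), ("(0,65)", 301), ("(0,66)", 302),
    ("(1,31)", 100), ("(1,32)", 101), ("(1,33)", 102), ("(1,34)", 103),
    ("(1,35)", 104), ("(1,36)", 105),
]


def parse_ctid(ctid):
    """Split a textual ctid such as '(1,31)' into (block, offset) integers."""
    block, offset = ctid.strip("()").split(",")
    return int(block), int(offset)


rows_per_block = defaultdict(list)
for ctid, row_id in rows:
    block, offset = parse_ctid(ctid)
    rows_per_block[block].append((offset, row_id))

for block in sorted(rows_per_block):
    print(block, len(rows_per_block[block]))
# 0 12
# 1 6
```

The 18 rows live in two heap blocks: block 0 holds the ids from 200 upward, and block 1 holds the ids 100 to 105.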

Let’s create the B-Tree index on that table’s id column.

    CREATE INDEX foo_idx ON foo(id);

Suppose we want to select values where id <= 200 with 2 workers. Worker-0 starts from the root node and scans until the leaf node 200. It then hands over the next block, under node 105, to Worker-1, which is blocked in a wait state. If there are more workers, the blocks are divided among them. A similar pattern is repeated until the scan is completed.

Parallel Bitmap Scans

To parallelize a bitmap heap scan, we need to be able to divide blocks among workers in a way very similar to parallel sequential scan. To do that, a scan on one or more indexes is done and a bitmap indicating which blocks are to be visited is created. This is done by a leader process, i.e. this part of the scan is run sequentially. However, the parallelism kicks in when the identified blocks are passed to workers, the same way as in a parallel sequential scan.
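The two phases described above, a serial bitmap build followed by a parallel heap scan, can be modelled as follows. This Python sketch rests on stated assumptions: the index is represented as a plain list of (key, (block, offset)) entries, the bitmap as a sorted list of block numbers, and the blocks are handed to workers round-robin. None of these names or structures come from PostgreSQL itself.

```python
def build_block_bitmap(index_entries, predicate):
    """Leader, serial phase: mark every heap block that holds a matching key."""
    return sorted({block for key, (block, _offset) in index_entries
                   if predicate(key)})


def split_blocks(bitmap, num_workers):
    """Parallel phase: hand the marked blocks to workers (assumed round-robin)."""
    return [bitmap[i::num_workers] for i in range(num_workers)]


# Hypothetical index entries: key -> (heap block, offset)
index_entries = [
    (100, (1, 31)), (200, (0, 55)), (300, (0, 56)),
    (150, (2, 4)), (120, (3, 9)), (400, (4, 2)),
]

bitmap = build_block_bitmap(index_entries, lambda key: key <= 200)
print(bitmap)                   # [0, 1, 2, 3]
print(split_blocks(bitmap, 2))  # [[0, 2], [1, 3]]
```

Block 4 is never visited because no matching key points to it, and block 0 appears only once even though two index entries refer to it.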

Parallel Joins

Parallel support for merge joins is also one of the hottest features added in PostgreSQL 10. Here, a table is joined with the inner side of a hash or merge join. In either case, no parallelism is supported within the inner side. The inner side is always scanned as a whole, and the parallelism comes from each worker executing the inner side in full for its own share of the outer rows. The results of each worker’s join are sent to the Gather node, which accumulates them and produces the final result.
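The join strategy described here, where every worker processes the whole inner side but only a share of the outer side, can be sketched with a hash join. The Python below is an illustrative model under assumptions: the function names, the tuple layout, and the round-robin split of outer rows are all hypothetical, and the workers are simulated sequentially.

```python
def hash_join_worker(outer_share, inner_rows):
    """One worker: build a complete hash table on the inner side, then probe it."""
    table = {}
    for key, inner_payload in inner_rows:
        table.setdefault(key, []).append(inner_payload)
    return [(key, outer_payload, inner_payload)
            for key, outer_payload in outer_share
            for inner_payload in table.get(key, [])]


def parallel_hash_join(outer_rows, inner_rows, num_workers):
    """Split only the outer side; every worker sees the whole inner side."""
    shares = [outer_rows[i::num_workers] for i in range(num_workers)]
    per_worker = [hash_join_worker(share, inner_rows) for share in shares]
    # Gather: concatenate the per-worker join results.
    return [row for part in per_worker for row in part]


outer = [(1, "a"), (2, "b"), (3, "c"), (1, "d")]
inner = [(1, "x"), (3, "y"), (3, "z")]
print(sorted(parallel_hash_join(outer, inner, num_workers=2)))
# [(1, 'a', 'x'), (1, 'd', 'x'), (3, 'c', 'y'), (3, 'c', 'z')]
```

Because each outer row is assigned to exactly one worker and each worker has the full inner side, no match is lost or produced twice. The price is that the inner-side work is repeated once per worker.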

Summary

It is obvious from what we’ve already discussed in this blog that parallelism gives significant performance boosts for some queries, slight gains for others, and may cause performance degradation in some cases. Ensure that parallel_setup_cost and parallel_tuple_cost are set appropriately to enable the query planner to choose a parallel plan. If a parallel plan is still not produced even after setting low values for these GUCs, refer to the PostgreSQL documentation on parallelism for details.

For a parallel plan, you can get per-worker statistics for each plan node to understand how the load is distributed amongst workers. You can do that through EXPLAIN (ANALYZE, VERBOSE). As with any other performance feature, there is no one rule that applies to all workloads. Parallelism should be carefully configured for whatever the need may be, and you must ensure that the probability of gaining performance is significantly higher than the probability of a drop in performance.
