
Tags

PostgreSQL , pg_stat , real-time quality monitoring

Background

As a business system grows, data exchange between business lines becomes more and more frequent, and that introduces a problem:

Data quality.


For example: has the upstream dropped some fields? Is upstream data arriving on time? Has the upstream data itself gone bad?

Monitoring business data quality lets you catch these problems.

PostgreSQL's built-in statistics already cover most real-time business data quality monitoring needs.

For more business-specific, customized data quality monitoring, PostgreSQL also offers read-and-burn (read-then-delete), streaming computation, and asynchronous messaging, all of which support real-time quality checks.

Built-in features: real-time business data quality monitoring

PostgreSQL's built-in statistics are as follows:

1. Near-real-time row count

postgres=# \d pg_class
         Table "pg_catalog.pg_class"
       Column        |     Type     | Collation | Nullable | Default
---------------------+--------------+-----------+----------+---------
 relname             | name         |           | not null |   -- object name
 relnamespace        | oid          |           | not null |   -- schema the object belongs to (references pg_namespace.oid)
 relpages            | integer      |           | not null |   -- estimated number of pages (in units of block_size)
 reltuples           | real         |           | not null |   -- estimated number of rows

2. Near-real-time per-column statistics (NULL fraction, average width, number of distinct values, most common values and their frequencies, equi-depth histogram, physical correlation, most common elements with their frequencies, and an element-count histogram)

The columns are explained below:

postgres=# \d pg_stats
          View "pg_catalog.pg_stats"
         Column         |   Type   | Default
------------------------+----------+---------
 schemaname             | name     |   -- schema of the object
 tablename              | name     |   -- object (table) name
 attname                | name     |   -- column name
 inherited              | boolean  |   -- whether the stats cover inheritance children (false: this table only; true: this table plus all children)
 null_frac              | real     |   -- fraction of NULLs in the column
 avg_width              | integer  |   -- average width of the column, in bytes
 n_distinct             | real     |   -- number of distinct values (-1: every value is unique; other negative values: minus the fraction of rows that are distinct; >= 1: actual distinct count)
 most_common_vals       | anyarray |   -- most common values in the column
 most_common_freqs      | real[]   |   -- frequencies of the most common values
 histogram_bounds       | anyarray |   -- histogram bounds (equi-depth: each bucket holds roughly the same number of rows)
 correlation            | real     |   -- correlation between physical and logical row order (-1 to 1; the closer the absolute value is to 0, the more scattered the storage; < 0: inverse correlation, > 0: positive correlation)
 most_common_elems      | anyarray |   -- for multi-valued (array) columns: most common elements
 most_common_elem_freqs | real[]   |   -- frequencies of the most common elements
 elem_count_histogram   | real[]   |   -- histogram of the count of distinct non-null elements per row

3. Near-real-time per-table statistics (how many sequential scans, rows read by sequential scans, how many index scans, rows fetched by index scans, rows inserted, updated, and deleted, how many dead tuples, and so on).

postgres=# \d pg_stat_all_tables
     View "pg_catalog.pg_stat_all_tables"
       Column        |           Type           | Default
---------------------+--------------------------+---------
 relid               | oid                      |
 schemaname          | name                     |
 relname             | name                     |
 seq_scan            | bigint                   | -- number of sequential scans
 seq_tup_read        | bigint                   | -- rows read by sequential scans
 idx_scan            | bigint                   | -- number of index scans
 idx_tup_fetch       | bigint                   | -- rows fetched by index scans
 n_tup_ins           | bigint                   | -- rows inserted
 n_tup_upd           | bigint                   | -- rows updated
 n_tup_del           | bigint                   | -- rows deleted
 n_tup_hot_upd       | bigint                   | -- rows HOT-updated
 n_live_tup          | bigint                   | -- estimated live (visible) rows
 n_dead_tup          | bigint                   | -- estimated dead rows
 n_mod_since_analyze | bigint                   |
 last_vacuum         | timestamp with time zone |
 last_autovacuum     | timestamp with time zone |
 last_analyze        | timestamp with time zone |
 last_autoanalyze    | timestamp with time zone |
 vacuum_count        | bigint                   |
 autovacuum_count    | bigint                   |
 analyze_count       | bigint                   |
 autoanalyze_count   | bigint                   |

4. Statistics collection scheduling

PostgreSQL collects statistics automatically based on how much a table has changed. Scheduling is controlled by the following parameters:

#track_counts = on
#autovacuum = on                        # Enable autovacuum subprocess?  'on'
autovacuum_naptime = 15s                # time between autovacuum runs
#autovacuum_analyze_threshold = 50      # min number of row updates before analyze
#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze

With the defaults, a table is re-analyzed automatically after roughly 0.1% of it has changed.
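As a minimal sketch (assuming the default values of 50 and 0.1 for the two parameters above), you can estimate which tables are currently due for an autoanalyze:

select s.relname,
       s.n_mod_since_analyze,
       50 + 0.1 * c.reltuples as analyze_trigger_point  -- 50 / 0.1 are the assumed defaults
from pg_stat_all_tables s
join pg_class c on c.oid = s.relid
where s.n_mod_since_analyze > 50 + 0.1 * c.reltuples;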

These built-in statistics therefore give you:

1. Near-real-time row counts

2. Per-column statistics (NULL fraction, average width, number of distinct values, most common values and their frequencies, equi-depth histogram, correlation, most common elements with their frequencies, and element-count histograms)

With this feedback, business data quality problems can be spotted in near real time; a sketch follows.
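For instance, a minimal sketch (the 10% threshold is an arbitrary assumption) that flags columns whose NULL fraction has crept above normal, which often means the upstream stopped populating a field:

select schemaname, tablename, attname, null_frac
from pg_stats
where schemaname = 'public'
  and null_frac > 0.1;  -- alert threshold: an assumption, tune per business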

Example

1. Create a test table

create table test(id int primary key, c1 int, c2 int, info text, crt_time timestamp);
create index idx_test_1 on test (crt_time);

2. Create a stress test script

vi test.sql

\set id random(1,10000000)
insert into test values (:id, random()*100, random()*10000, random()::text, now()) on conflict (id) do update set crt_time=now();

3. Run the stress test

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 1200

4. Schedule a cleanup job that keeps only the most recent 30 seconds of data.

delete from test where ctid = any (array(
  select ctid from test where crt_time < now()-interval '30 second'
));

Run it once every 0.1 seconds:

psql

delete from test where ctid = any (array(
  select ctid from test where crt_time < now()-interval '30 second'
));
\watch 0.1

The log looks like this:

DELETE 18470
Fri 08 Dec 2017 04:31:54 PM CST (every 0.1s)
DELETE 19572
Fri 08 Dec 2017 04:31:55 PM CST (every 0.1s)
DELETE 20159
Fri 08 Dec 2017 04:31:55 PM CST (every 0.1s)
DELETE 20143
Fri 08 Dec 2017 04:31:55 PM CST (every 0.1s)
DELETE 21401
Fri 08 Dec 2017 04:31:55 PM CST (every 0.1s)
DELETE 21956
Fri 08 Dec 2017 04:31:56 PM CST (every 0.1s)
DELETE 19978
Fri 08 Dec 2017 04:31:56 PM CST (every 0.1s)
DELETE 21916

5. Monitor the statistics in real time

Per-column statistics:

postgres=# select attname,null_frac,avg_width,n_distinct,most_common_vals,most_common_freqs,histogram_bounds,correlation from pg_stats where tablename='test';
attname           | id
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  | {25,99836,193910,289331,387900,492669,593584,695430,795413,890787,1001849,1100457,1203161,1301537,1400265,1497824,1595610,1702278,1809415,1912946,2006274,2108505,2213771,2314440,2409333,2513067,2616217,2709052,2813209,2916342,3016292,3110554,3210817,3305896,3406145,3512379,3616638,3705990,3804538,3902207,4007939,4119100,4214497,4314986,4405492,4513675,4613327,4704905,4806556,4914360,5020248,5105998,5194904,5292779,5394640,5497986,5600441,5705246,5806209,5905498,6006522,6115688,6212831,6308451,6408320,6516028,6622895,6720613,6817877,6921460,7021999,7118151,7220074,7315355,7413563,7499978,7603076,7695692,7805120,7906168,8000492,8099783,8200918,8292854,8389462,8491879,8589691,8696502,8798076,8892978,8992364,9089390,9192142,9294759,9399562,9497099,9601571,9696437,9800758,9905327,9999758}
correlation       | -0.00220302
.....
attname           | c2
null_frac         | 0
avg_width         | 4
n_distinct        | 9989
most_common_vals  | {3056,6203,1352,1649,1777,3805,7029,420,430,705,1015,1143,2810,3036,3075,3431,3792,4459,4812,5013,5662,5725,5766,6445,6882,7034,7064,7185,7189,7347,8266,8686,8897,9042,9149,9326,9392,9648,9652,9802,63,164,235,453,595,626,672,813,847,1626,1636,1663,1749,1858,2026,2057,2080,2106,2283,2521,2596,2666,2797,2969,3131,3144,3416,3500,3870,3903,3956,3959,4252,4265,4505,4532,4912,5048,5363,5451,5644,5714,5734,5739,5928,5940,5987,6261,6352,6498,6646,6708,6886,6914,7144,7397,7589,7610,7640,7687}
most_common_freqs | {0.000366667,0.000366667,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667,0.000266667}
histogram_bounds  | {0,103,201,301,399,495,604,697,802,904,1009,1121,1224,1320,1419,1514,1623,1724,1820,1930,2045,2147,2240,2335,2433,2532,2638,2738,2846,2942,3038,3143,3246,3342,3443,3547,3644,3744,3852,3966,4064,4162,4262,4354,4460,4562,4655,4755,4851,4948,5046,5143,5237,5340,5428,5532,5625,5730,5830,5932,6048,6144,6248,6349,6456,6562,6657,6768,6859,6964,7060,7161,7264,7357,7454,7547,7638,7749,7852,7956,8046,8138,8240,8337,8445,8539,8626,8728,8825,8924,9016,9116,9214,9311,9420,9512,9603,9709,9811,9911,10000}
correlation       | -0.00246515
...
attname           | crt_time
null_frac         | 0
avg_width         | 8
n_distinct        | -0.931747
most_common_vals  | {"2017-12-08 16:32:53.836223","2017-12-08 16:33:02.700473","2017-12-08 16:33:03.226319","2017-12-08 16:33:03.613826","2017-12-08 16:33:08.171908","2017-12-08 16:33:14.727654","2017-12-08 16:33:20.857187","2017-12-08 16:33:22.519299","2017-12-08 16:33:23.388035","2017-12-08 16:33:23.519205"}
most_common_freqs | {6.66667e-05,6.66667e-05,6.66667e-05,6.66667e-05,6.66667e-05,6.66667e-05,6.66667e-05,6.66667e-05,6.66667e-05,6.66667e-05}
histogram_bounds  | {"2017-12-08 16:32:50.397367","2017-12-08 16:32:50.987576","2017-12-08 16:32:51.628523","2017-12-08 16:32:52.117421","2017-12-08 16:32:52.610271","2017-12-08 16:32:53.152021","2017-12-08 16:32:53.712685","2017-12-08 16:32:54.3036","2017-12-08 16:32:54.735576","2017-12-08 16:32:55.269238","2017-12-08 16:32:55.691081","2017-12-08 16:32:56.066085","2017-12-08 16:32:56.541396","2017-12-08 16:32:56.865717","2017-12-08 16:32:57.350169","2017-12-08 16:32:57.698694","2017-12-08 16:32:58.062828","2017-12-08 16:32:58.464265","2017-12-08 16:32:58.92354","2017-12-08 16:32:59.27284","2017-12-08 16:32:59.667347","2017-12-08 16:32:59.984229","2017-12-08 16:33:00.310772","2017-12-08 16:33:00.644104","2017-12-08 16:33:00.976184","2017-12-08 16:33:01.366153","2017-12-08 16:33:01.691384","2017-12-08 16:33:02.021643","2017-12-08 16:33:02.382856","2017-12-08 16:33:02.729636","2017-12-08 16:33:03.035666","2017-12-08 16:33:03.508461","2017-12-08 16:33:03.829351","2017-12-08 16:33:04.151727","2017-12-08 16:33:04.4596","2017-12-08 16:33:04.76933","2017-12-08 16:33:05.125295","2017-12-08 16:33:05.537555","2017-12-08 16:33:05.83828","2017-12-08 16:33:06.15387","2017-12-08 16:33:06.545922","2017-12-08 16:33:06.843679","2017-12-08 16:33:07.111281","2017-12-08 16:33:07.414602","2017-12-08 16:33:07.707961","2017-12-08 16:33:08.119891","2017-12-08 16:33:08.388883","2017-12-08 16:33:08.674867","2017-12-08 16:33:08.979336","2017-12-08 16:33:09.339377","2017-12-08 16:33:09.647791","2017-12-08 16:33:09.94157","2017-12-08 16:33:10.232294","2017-12-08 16:33:10.652072","2017-12-08 16:33:10.921087","2017-12-08 16:33:11.17986","2017-12-08 16:33:11.477399","2017-12-08 16:33:11.776529","2017-12-08 16:33:12.110676","2017-12-08 16:33:12.382742","2017-12-08 16:33:12.70362","2017-12-08 16:33:13.020485","2017-12-08 16:33:13.477398","2017-12-08 16:33:13.788134","2017-12-08 16:33:14.072125","2017-12-08 16:33:14.346058","2017-12-08 16:33:14.625692","2017-12-08 16:33:14.889661","2017-12-08 16:33:15.139977","2017-12-08 16:33:15.390732","2017-12-08 16:33:15.697878","2017-12-08 16:33:16.127449","2017-12-08 16:33:16.438117","2017-12-08 16:33:16.725608","2017-12-08 16:33:17.01954","2017-12-08 16:33:17.344609","2017-12-08 16:33:17.602447","2017-12-08 16:33:17.919983","2017-12-08 16:33:18.201386","2017-12-08 16:33:18.444387","2017-12-08 16:33:18.714402","2017-12-08 16:33:19.099394","2017-12-08 16:33:19.402888","2017-12-08 16:33:19.673556","2017-12-08 16:33:19.991907","2017-12-08 16:33:20.23329","2017-12-08 16:33:20.517752","2017-12-08 16:33:20.783084","2017-12-08 16:33:21.032402","2017-12-08 16:33:21.304109","2017-12-08 16:33:21.725122","2017-12-08 16:33:21.998994","2017-12-08 16:33:22.232959","2017-12-08 16:33:22.462384","2017-12-08 16:33:22.729792","2017-12-08 16:33:23.001244","2017-12-08 16:33:23.251215","2017-12-08 16:33:23.534155","2017-12-08 16:33:23.772144","2017-12-08 16:33:24.076088","2017-12-08 16:33:24.471151"}
correlation       | 0.760231

Row count:

postgres=# select reltuples from pg_class where relname='test';
-[ RECORD 1 ]----------
reltuples | 3.74614e+06
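A minimal sketch (the expected range is an arbitrary assumption) that alerts when the near-real-time row count drifts outside what the business expects, e.g. because upstream data stopped arriving:

select relname, reltuples::bigint as estimated_rows
from pg_class
where relname = 'test'
  and reltuples not between 1000000 and 10000000;  -- expected range: an assumption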

DML activity statistics:

postgres=# select * from pg_stat_all_tables where relname ='test';
-[ RECORD 1 ]-------+------------------------------
relid               | 591006
schemaname          | public
relname             | test
seq_scan            | 2
seq_tup_read        | 0
idx_scan            | 28300980
idx_tup_fetch       | 24713736
n_tup_ins           | 19730476
n_tup_upd           | 8567352
n_tup_del           | 16143587
n_tup_hot_upd       | 0
n_live_tup          | 3444573
n_dead_tup          | 24748887
n_mod_since_analyze | 547474
last_vacuum         |
last_autovacuum     | 2017-12-08 16:31:10.820459+08
last_analyze        |
last_autoanalyze    | 2017-12-08 16:35:16.75293+08
vacuum_count        | 0
autovacuum_count    | 1
analyze_count       | 0
autoanalyze_count   | 124
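Building on this view, here is a minimal sketch (the 20% threshold is an arbitrary assumption) that flags tables where dead tuples are piling up faster than vacuum reclaims them:

select relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup::numeric / greatest(n_live_tup + n_dead_tup, 1), 2) as dead_ratio
from pg_stat_all_tables
where schemaname = 'public'
  and n_dead_tup::numeric / greatest(n_live_tup + n_dead_tup, 1) > 0.2;  -- threshold: an assumption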

Data cleanup scheduling

Since this is data quality monitoring, there is no need to retain all the data. The following approach clears expired data efficiently without blocking writes or reads.

《如何根据行号高效率的清除过期数据 - 非分区表,数据老化实践》

On a single instance, this deletes roughly 2.63 million rows per second.

How to reset statistics

postgres=# select pg_stat_reset_single_table_counters('test'::regclass);

How to force a manual statistics collection

postgres=# analyze verbose test;
INFO:  analyzing "public.test"
INFO:  "test": scanned 30000 of 238163 pages, containing 560241 live rows and 4294214 dead rows; 30000 rows in sample, 4319958 estimated total rows
ANALYZE

Customized real-time business data quality monitoring

Use the read-and-burn (read-then-delete) approach to monitor data quality in real time, as sketched below.
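A minimal sketch of the read-and-burn pattern with a CTE (the aggregate here is a stand-in for whatever quality metric the business needs); rows are consumed and deleted in one statement, so each row is processed exactly once:

with burned as (
  delete from test
  where crt_time < now() - interval '5 second'
  returning *
)
select count(*) as rows_consumed,
       avg(c1)  as avg_c1   -- example quality metric computed on the way out
from burned;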

For worked examples, see:

《HTAP数据库 PostgreSQL 场景与性能测试之 32 - (OLTP) 高吞吐数据进出(堆存、行扫、无需索引) - 阅后即焚(JSON + 函数流式计算)》

《HTAP数据库 PostgreSQL 场景与性能测试之 31 - (OLTP) 高吞吐数据进出(堆存、行扫、无需索引) - 阅后即焚(读写大吞吐并测)》

《HTAP数据库 PostgreSQL 场景与性能测试之 27 - (OLTP) 物联网 - FEED日志, 流式处理 与 阅后即焚 (CTE)》

《PostgreSQL 异步消息实践 - Feed系统实时监测与响应(如 电商主动服务) - 分钟级到毫秒级的实现》
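For the asynchronous-message approach covered in the last article above, a minimal sketch (the channel name, quality rule, and payload format are all assumptions) using a trigger plus pg_notify:

create or replace function quality_alert() returns trigger as $$
begin
  if new.c1 is null then  -- example quality rule: c1 must not be NULL (an assumption)
    perform pg_notify('quality_alerts', 'NULL c1 in test, id=' || new.id);
  end if;
  return new;
end;
$$ language plpgsql;

create trigger test_quality_check
  after insert or update on test
  for each row execute procedure quality_alert();

-- a monitoring client subscribes with:  LISTEN quality_alerts;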


References

《如何根据行号高效率的清除过期数据 - 非分区表,数据老化实践》

《PostgreSQL 统计信息pg_statistic格式及导入导出dump_stat - 兼容Oracle》

《PostgreSQL pg_stat_ pg_statio_ 统计信息(scan,read,fetch,hit)源码解读》


