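PostgreSQL Source Code Study (53): vacuum ② - heap_vacuum_rel, the lazy vacuum function
I. The table_relation_vacuum function
1. Function definition
At the end of the previous post (https://blog.csdn.net/Hehuyi_In/article/details/128749517) we mentioned the table_relation_vacuum function (in tableam.h). This post picks up from there.
As noted before, both manual VACUUM and autovacuum end up in this function, and it requires a level-4 lock (ShareUpdateExclusiveLock) on the table. The function only serves lazy vacuum, so VACUUM FULL, CLUSTER and ANALYZE never reach it.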
static inline void
table_relation_vacuum(Relation rel, struct VacuumParams *params,
                      BufferAccessStrategy bstrategy)
{
    rel->rd_tableam->relation_vacuum(rel, params, bstrategy);
}
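Here we hit a problem: relation_vacuum is not actually a function. It is a callback pointer reached through rel->rd_tableam, so searching the source files for a function body of that name finds nothing, and we cannot simply read on into the lower-level function. Let's trace it with gdb instead.
2. Tracing the lower-level function
b table_relation_vacuum
Stepping in from this breakpoint shows that the call actually lands in heap_vacuum_rel (in vacuumlazy.c). After all that wandering around, we finally get to see the real lazy vacuum function. Before studying it, though, let's go over a few prerequisites so the code does not leave us completely lost.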
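The indirection found with gdb is an ordinary function-pointer dispatch. In the real source tree the heap access method fills a routine table (in heapam_handler.c) whose relation_vacuum slot points at heap_vacuum_rel. Below is a minimal sketch of that pattern; all Demo* and demo_* names are hypothetical stand-ins, not PostgreSQL's actual types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for Relation and TableAmRoutine. */
typedef struct DemoRelation DemoRelation;

typedef struct DemoTableAmRoutine
{
    /* Callback slot: each table access method plugs in its own vacuum routine. */
    int (*relation_vacuum) (DemoRelation *rel);
} DemoTableAmRoutine;

struct DemoRelation
{
    const char *relname;
    const DemoTableAmRoutine *rd_tableam;   /* access method that owns this table */
};

/* Counts how often the heap implementation was reached. */
int demo_heap_vacuum_calls = 0;

/* Stand-in for heap_vacuum_rel(): only records that it ran. */
int
demo_heap_vacuum_rel(DemoRelation *rel)
{
    (void) rel;
    return ++demo_heap_vacuum_calls;
}

/* The heap access method fills the slot with its own function. */
const DemoTableAmRoutine demo_heapam_methods = {
    .relation_vacuum = demo_heap_vacuum_rel,
};

/* Same shape as table_relation_vacuum(): a thin wrapper around the pointer. */
int
demo_table_relation_vacuum(DemoRelation *rel)
{
    assert(rel->rd_tableam != NULL && rel->rd_tableam->relation_vacuum != NULL);
    return rel->rd_tableam->relation_vacuum(rel);
}
```

This is why grepping for a function named relation_vacuum finds nothing: the name is only a struct member, and the function it points to depends on the table's access method.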
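II. Prerequisites
1. The TransactionIdPrecedesOrEquals function
We have met this old friend before. It decides which of two transaction IDs is older. For how it works, see: https://blog.csdn.net/Hehuyi_In/article/details/102869893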
/*
 * TransactionIdPrecedesOrEquals --- is id1 logically <= id2?
 */
bool
TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2)
{
    int32       diff;

    if (!TransactionIdIsNormal(id1) || !TransactionIdIsNormal(id2))
        return (id1 <= id2);

    diff = (int32) (id1 - id2);
    return (diff <= 0);
}
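To see why the signed cast matters, here is a small standalone sketch of the same comparison using plain C types. The demo_* names are hypothetical. The value 3 for the first normal XID matches PostgreSQL's reserved IDs (0 invalid, 1 bootstrap, 2 frozen).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DemoXid;

/* XIDs 0, 1 and 2 are reserved, so the first "normal" XID is 3. */
#define DEMO_FIRST_NORMAL_XID 3u

bool
demo_xid_is_normal(DemoXid xid)
{
    return xid >= DEMO_FIRST_NORMAL_XID;
}

/* Same logic as TransactionIdPrecedesOrEquals, on plain C types. */
bool
demo_xid_precedes_or_equals(DemoXid id1, DemoXid id2)
{
    int32_t diff;

    /* Reserved XIDs are compared as plain unsigned numbers. */
    if (!demo_xid_is_normal(id1) || !demo_xid_is_normal(id2))
        return id1 <= id2;

    /*
     * The unsigned subtraction wraps modulo 2^32. Reinterpreting the result
     * as signed gives the distance on the XID circle (the conversion behaves
     * this way on two's-complement platforms, which covers mainstream
     * compilers).
     */
    diff = (int32_t) (id1 - id2);
    assert(id1 != id2 || diff == 0);
    return diff <= 0;
}
```

With this, demo_xid_precedes_or_equals(4294967000u, 100) is true: the counter has wrapped around, so XID 100 is logically newer than 4294967000 even though it is numerically smaller.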
2. The LVRelState struct
LV stands for Lazy Vacuum. Judging by the name, this struct holds per-table state for a lazy vacuum run.
typedef struct LVRelState
{
    /* Target heap relation and its indexes */
    Relation    rel;
    Relation   *indrels;
    int         nindexes;

    /* Wraparound failsafe has been triggered? */
    bool        failsafe_active;
    /* Consider index vacuuming bypass optimization? */
    bool        consider_bypass_optimization;

    /* Doing index vacuuming, index cleanup, rel truncation? */
    bool        do_index_vacuuming;
    bool        do_index_cleanup;
    bool        do_rel_truncate;

    /* Buffer access strategy and parallel state */
    BufferAccessStrategy bstrategy;
    LVParallelState *lps;

    /* Statistics from pg_class when we start out */
    BlockNumber old_rel_pages;      /* previous value of pg_class.relpages */
    double      old_live_tuples;    /* previous value of pg_class.reltuples */
    /* rel's initial relfrozenxid and relminmxid */
    TransactionId relfrozenxid;
    MultiXactId relminmxid;

    /* VACUUM operation's cutoff for pruning */
    TransactionId OldestXmin;
    /* VACUUM operation's cutoff for freezing XIDs and MultiXactIds */
    TransactionId FreezeLimit;
    MultiXactId MultiXactCutoff;

    /* Error reporting state */
    char       *relnamespace;
    char       *relname;
    char       *indname;
    BlockNumber blkno;              /* used only for heap operations */
    OffsetNumber offnum;            /* used only for heap operations */
    VacErrPhase phase;

    /*
     * State managed by lazy_scan_heap() follows
     */
    LVDeadTuples *dead_tuples;      /* items to vacuum from indexes */
    BlockNumber rel_pages;          /* total number of pages */
    BlockNumber scanned_pages;      /* number of pages we examined */
    BlockNumber pinskipped_pages;   /* # of pages skipped due to a pin */
    BlockNumber frozenskipped_pages;    /* # of frozen pages we skipped */
    BlockNumber tupcount_pages;     /* pages whose tuples we counted */
    BlockNumber pages_removed;      /* pages remove by truncation */
    BlockNumber lpdead_item_pages;  /* # pages with LP_DEAD items */
    BlockNumber nonempty_pages;     /* actually, last nonempty page + 1 */

    /* Statistics output by us, for table */
    double      new_rel_tuples;     /* new estimated total # of tuples */
    double      new_live_tuples;    /* new estimated total # of live tuples */
    /* Statistics output by index AMs */
    IndexBulkDeleteResult **indstats;

    /* Instrumentation counters */
    int         num_index_scans;
    int64       tuples_deleted;     /* # deleted from table */
    int64       lpdead_items;       /* # deleted from indexes */
    int64       new_dead_tuples;    /* new estimated total # of dead items in
                                     * table */
    int64       num_tuples;         /* total number of nonremovable tuples */
    int64       live_tuples;        /* live tuples (reltuples estimate) */
} LVRelState;
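III. The heap_vacuum_rel function
According to its header comment, this function vacuums a single heap table, cleans out its indexes, and updates the relpages and reltuples statistics. By the time we enter it, the transaction has already been started and the level-4 lock (ShareUpdateExclusiveLock) on the table has already been acquired.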
/*
 * heap_vacuum_rel() -- perform VACUUM for one heap relation
 *
 * This routine vacuums a single heap, cleans out its indexes, and
 * updates its relpages and reltuples statistics.
 *
 * At entry, we have already established a transaction and opened
 * and locked the relation.
 */
void
heap_vacuum_rel(Relation rel, VacuumParams *params,
                BufferAccessStrategy bstrategy)
{
    LVRelState *vacrel;
    PGRUsage    ru0;
    TimestampTz starttime = 0;
    WalUsage    walusage_start = pgWalUsage;
    WalUsage    walusage = {0, 0, 0};
    long        secs;
    int         usecs;
    double      read_rate,
                write_rate;
    bool        aggressive;             /* should we scan all unfrozen pages? */
    bool        scanned_all_unfrozen;   /* actually scanned all such pages? */
    char      **indnames = NULL;
    TransactionId xidFullScanLimit;
    MultiXactId mxactFullScanLimit;
    BlockNumber new_rel_pages;
    BlockNumber new_rel_allvisible;
    double      new_live_tuples;
    TransactionId new_frozen_xid;
    MultiXactId new_min_multi;
    ErrorContextCallback errcallback;
    PgStat_Counter startreadtime = 0;
    PgStat_Counter startwritetime = 0;
    TransactionId OldestXmin;
    TransactionId FreezeLimit;
    MultiXactId MultiXactCutoff;
…
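First, the freeze-related input parameters are used to compute the various cutoff values, which are written into the output variables (every argument passed with &). These cutoffs then decide whether the vacuum runs in aggressive mode. As the declaration comment above says, aggressive = true means all unfrozen pages must be scanned. For background on freezing, see my earlier notes "postgresql_internals-14 学习笔记(三)冻结、rebuild" on the Hehuyi_In CSDN blog.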
    /*
     * Compute the cutoff values from the freeze parameters (every argument
     * passed with & is an output). They are used below to decide whether to
     * do an aggressive vacuum.
     */
    vacuum_set_xid_limits(rel,
                          params->freeze_min_age,
                          params->freeze_table_age,
                          params->multixact_freeze_min_age,
                          params->multixact_freeze_table_age,
                          &OldestXmin, &FreezeLimit, &xidFullScanLimit,
                          &MultiXactCutoff, &mxactFullScanLimit);

    /*
     * If the table's relfrozenxid <= xidFullScanLimit (roughly: the next XID
     * to be assigned minus vacuum_freeze_table_age), trigger an aggressive
     * scan; the MultiXactId check works the same way. DISABLE_PAGE_SKIPPING
     * also forces an aggressive scan.
     */
    aggressive = TransactionIdPrecedesOrEquals(rel->rd_rel->relfrozenxid,
                                               xidFullScanLimit);
    aggressive |= MultiXactIdPrecedesOrEquals(rel->rd_rel->relminmxid,
                                              mxactFullScanLimit);
    if (params->options & VACOPT_DISABLE_PAGE_SKIPPING)
        aggressive = true;
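The aggressive decision can be sketched with plain integers. This is a simplified illustration only: the demo_* names are hypothetical, the MultiXactId half is left out, and the real vacuum_set_xid_limits applies further adjustments to the table-age setting that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DemoXid;

#define DEMO_FIRST_NORMAL_XID 3u

/* Full-scan limit: next XID minus the table-age setting, kept a normal XID. */
DemoXid
demo_full_scan_limit(DemoXid next_xid, uint32_t freeze_table_age)
{
    DemoXid limit = next_xid - freeze_table_age;    /* wraps modulo 2^32 */

    if (limit < DEMO_FIRST_NORMAL_XID)
        limit = DEMO_FIRST_NORMAL_XID;
    return limit;
}

/* Circular "a <= b", valid when both arguments are normal XIDs. */
bool
demo_precedes_or_equals(DemoXid a, DemoXid b)
{
    return (int32_t) (a - b) <= 0;
}

/* XID half of the aggressive decision made in heap_vacuum_rel. */
bool
demo_is_aggressive(DemoXid relfrozenxid, DemoXid next_xid,
                   uint32_t freeze_table_age, bool disable_page_skipping)
{
    DemoXid limit = demo_full_scan_limit(next_xid, freeze_table_age);
    bool    aggressive;

    assert(limit >= DEMO_FIRST_NORMAL_XID);
    aggressive = demo_precedes_or_equals(relfrozenxid, limit);
    if (disable_page_skipping)
        aggressive = true;
    return aggressive;
}
```

For example, with the next XID at 200,000,000 and a table age of 150,000,000, the limit is 50,000,000: a table whose relfrozenxid is 40,000,000 gets an aggressive scan, while one at 60,000,000 does not unless page skipping is disabled.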
Next, vacrel is allocated and its fields are initialized according to the various options. See the LVRelState definition above for what each field means.
    vacrel = (LVRelState *) palloc0(sizeof(LVRelState));

    /* Set up high level stuff about rel */
    vacrel->rel = rel;
    /* Open the table's indexes; returns the opened index relations and their count */
    vac_open_indexes(vacrel->rel, RowExclusiveLock, &vacrel->nindexes,
                     &vacrel->indrels);
    vacrel->failsafe_active = false;
    vacrel->consider_bypass_optimization = true;

    /*
     * The index_cleanup param either disables index vacuuming and cleanup or
     * forces it to go ahead when we would otherwise apply the index bypass
     * optimization. The default is 'auto', which leaves the final decision
     * up to lazy_vacuum().
     *
     * The truncate param allows user to avoid attempting relation truncation,
     * though it can't force truncation to happen.
     */
    Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
    Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
           params->truncate != VACOPTVALUE_AUTO);
    vacrel->do_index_vacuuming = true;
    vacrel->do_index_cleanup = true;
    vacrel->do_rel_truncate = (params->truncate != VACOPTVALUE_DISABLED);
    if (params->index_cleanup == VACOPTVALUE_DISABLED)
    {
        /* Force disable index vacuuming up-front */
        vacrel->do_index_vacuuming = false;
        vacrel->do_index_cleanup = false;
    }
    else if (params->index_cleanup == VACOPTVALUE_ENABLED)
    {
        /* Force index vacuuming. Note that failsafe can still bypass. */
        vacrel->consider_bypass_optimization = false;
    }
    else
    {
        /* Default/auto, make all decisions dynamically */
        Assert(params->index_cleanup == VACOPTVALUE_AUTO);
    }

    vacrel->bstrategy = bstrategy;
    vacrel->old_rel_pages = rel->rd_rel->relpages;
    vacrel->old_live_tuples = rel->rd_rel->reltuples;
    vacrel->relfrozenxid = rel->rd_rel->relfrozenxid;
    vacrel->relminmxid = rel->rd_rel->relminmxid;

    /* Set cutoffs for entire VACUUM */
    vacrel->OldestXmin = OldestXmin;
    vacrel->FreezeLimit = FreezeLimit;
    vacrel->MultiXactCutoff = MultiXactCutoff;

    vacrel->relnamespace = get_namespace_name(RelationGetNamespace(rel));
    vacrel->relname = pstrdup(RelationGetRelationName(rel));
    vacrel->indname = NULL;
    vacrel->phase = VACUUM_ERRCB_PHASE_UNKNOWN;
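The way the index_cleanup and truncate options map onto the vacrel flags is easy to lose in the flattened code, so here is a standalone sketch of just that mapping. The enum and struct names are hypothetical, and the three states are a reduced stand-in for the VACOPTVALUE_* values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state option, modelled on AUTO / DISABLED / ENABLED. */
typedef enum DemoOptValue
{
    DEMO_OPT_AUTO,
    DEMO_OPT_DISABLED,
    DEMO_OPT_ENABLED
} DemoOptValue;

typedef struct DemoVacFlags
{
    bool do_index_vacuuming;
    bool do_index_cleanup;
    bool consider_bypass_optimization;
    bool do_rel_truncate;
} DemoVacFlags;

/* Resolve the user-facing options into the flags lazy vacuum works with. */
DemoVacFlags
demo_resolve_flags(DemoOptValue index_cleanup, DemoOptValue truncate)
{
    DemoVacFlags f = { true, true, true, true };

    /* As in the source, truncate is expected to be resolved already. */
    assert(truncate != DEMO_OPT_AUTO);
    f.do_rel_truncate = (truncate != DEMO_OPT_DISABLED);

    if (index_cleanup == DEMO_OPT_DISABLED)
    {
        /* Index work is switched off up front. */
        f.do_index_vacuuming = false;
        f.do_index_cleanup = false;
    }
    else if (index_cleanup == DEMO_OPT_ENABLED)
    {
        /* Index vacuuming is forced, so the bypass is not considered. */
        f.consider_bypass_optimization = false;
    }
    /* DEMO_OPT_AUTO: keep the defaults and decide later. */

    return f;
}
```

Note the asymmetry: disabling index cleanup turns two flags off, while enabling it only removes the bypass optimization from consideration, because index vacuuming is already on by default.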
…
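lazy_scan_heap is the core function of lazy vacuum. It first scans the table (consulting the visibility map file), finds dead tuples and pages with free space, then computes the number of live tuples in the table, and finally performs the cleanup of the table and its indexes.
    /* Do the vacuuming (the core function) */
    lazy_scan_heap(vacrel, params, aggressive);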
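After lazy_scan_heap returns, the rest of the function does the following:
- Close the table's indexes
- Work out whether all unfrozen pages were actually scanned (always the case in aggressive mode) and set scanned_all_unfrozen accordingly
- Call lazy_truncate_heap to truncate empty pages at the end of the file (optional); this space can be returned to the operating system. Note that this function briefly takes a level-8 lock (AccessExclusiveLock), which can affect the workload:
ConditionalLockRelation(vacrel->rel, AccessExclusiveLock)
- Update the statistics in pg_class
- Free the index statistics held in vacrel and the index names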
    /* Done with indexes: close the table's indexes */
    vac_close_indexes(vacrel->nindexes, vacrel->indrels, NoLock);

    /*
     * Compute whether we actually scanned the all unfrozen pages. If we did,
     * we can adjust relfrozenxid and relminmxid.
     *
     * NB: We need to check this before truncating the relation, because that
     * will change ->rel_pages.
     */
    if ((vacrel->scanned_pages + vacrel->frozenskipped_pages)
        < vacrel->rel_pages)
    {
        Assert(!aggressive);
        scanned_all_unfrozen = false;
    }
    else
        scanned_all_unfrozen = true;

    /*
     * Optionally truncate the relation (try to truncate the pages at the end
     * of the file).
     */
    if (should_attempt_truncation(vacrel))
    {
        /*
         * Update error traceback information. This is the last phase during
         * which we add context information to errors, so we don't need to
         * revert to the previous phase.
         */
        update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_TRUNCATE,
                                 vacrel->nonempty_pages,
                                 InvalidOffsetNumber);
        lazy_truncate_heap(vacrel);
    }

    /*
     * Update statistics in pg_class.
     */
    new_rel_pages = vacrel->rel_pages;
    new_live_tuples = vacrel->new_live_tuples;

    visibilitymap_count(rel, &new_rel_allvisible, NULL);
    if (new_rel_allvisible > new_rel_pages)
        new_rel_allvisible = new_rel_pages;

    new_frozen_xid = scanned_all_unfrozen ? FreezeLimit : InvalidTransactionId;
    new_min_multi = scanned_all_unfrozen ? MultiXactCutoff : InvalidMultiXactId;

    vac_update_relstats(rel,
                        new_rel_pages,
                        new_live_tuples,
                        new_rel_allvisible,
                        vacrel->nindexes > 0,
                        new_frozen_xid,
                        new_min_multi,
                        false);
    …
    /* Cleanup index statistics and index names */
    for (int i = 0; i < vacrel->nindexes; i++)
    {
        if (vacrel->indstats[i])
            pfree(vacrel->indstats[i]);

        if (indnames && indnames[i])
            pfree(indnames[i]);
    }
}
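The bookkeeping at the end of the function comes down to three small decisions, sketched below on plain integers. The demo_* names are hypothetical; my reading is that an invalid XID tells the stats update to leave relfrozenxid alone, which is an assumption not shown in the excerpt above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DemoBlockNumber;
typedef uint32_t DemoXid;

#define DEMO_INVALID_XID 0u

/* Every page was either scanned or skipped because it was already frozen. */
bool
demo_scanned_all_unfrozen(DemoBlockNumber scanned_pages,
                          DemoBlockNumber frozenskipped_pages,
                          DemoBlockNumber rel_pages)
{
    return !((scanned_pages + frozenskipped_pages) < rel_pages);
}

/* Only advance relfrozenxid when no unfrozen page was left unvisited. */
DemoXid
demo_new_frozen_xid(bool scanned_all_unfrozen, DemoXid freeze_limit)
{
    return scanned_all_unfrozen ? freeze_limit : DEMO_INVALID_XID;
}

/* The all-visible page count can never exceed the table's page count. */
DemoBlockNumber
demo_clamp_allvisible(DemoBlockNumber allvisible, DemoBlockNumber rel_pages)
{
    if (allvisible > rel_pages)
        allvisible = rel_pages;
    assert(allvisible <= rel_pages);
    return allvisible;
}
```

For a 100-page table, scanning 70 pages and skipping 30 frozen ones counts as a complete pass, so the freeze cutoff can be recorded; scanning 60 and skipping 30 leaves 10 pages unaccounted for (for instance skipped through the visibility map as all-visible), so the old relfrozenxid has to stay.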
As you can see, we have fallen into yet another rabbit hole: lazy_scan_heap. We will dig into that function in a later post.
References:
《PostgreSQL数据库内核分析》
PostgreSQL 源码解读(128)- MVCC#12(vacuum过程-heap_vacuum_rel函数), ITPUB blog: http://blog.itpub.net/6906/viewspace-2564641/
Postgresql Freezing 实现原理, 51CTO blog (13446560): https://www.pudn.com/news/6277722b517cd20ea491bf39.html