SQL Server Transaction Log, Part 1: Log Structure and Write-Ahead Logging (WAL) Algorithm

The SQL Server transaction log is one of the most critical parts of the engine and, at the same time, one of the most misunderstood. When neglected, it can easily become a bottleneck for the whole SQL Server environment. We need to keep this in mind and take care of our transaction logs in order to improve query performance and increase log throughput.

This sounds great, but it is not as straightforward as it looks. To make our lives easier and stop worrying about transaction logs, we should understand what actually stands behind the *.ldf files sitting somewhere in our file system and how they record information.

Log Structure

Every SQL Server database has a transaction log that maps onto one or more physical files. The Database Engine divides each file into a number of virtual log files (VLFs) that hold the information about everything happening inside the database. The number of VLFs is not magic; it is chosen dynamically. The calculation takes place whenever the log file is created or extended, regardless of whether this is done manually or automatically. Here is the formula, based on the size being added:

Under 1 MB: 2 new VLFs (roughly 1/2 of the new size each)

Above 1 MB and up to 64 MB: 4 new VLFs (roughly 1/4 of the new size each)

Above 64 MB and up to 1 GB: 8 new VLFs (roughly 1/8 of the new size each)

Above 1 GB: 16 new VLFs (roughly 1/16 of the new size each)
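As a rough illustration, the thresholds above can be written as a CASE expression. This is only a sketch of the rules as listed in this article, not the engine's actual code; the variable name @growth_mb and its value are assumptions chosen for the example, and a growth of exactly 1 MB (which the list does not cover) is placed in the 4-VLF bucket here.

```sql
-- Sketch only: applies the thresholds listed above to a hypothetical growth size.
DECLARE @growth_mb DECIMAL(18, 2) = 300;  -- assumed value: size being added, in MB

SELECT
    @growth_mb AS growth_mb,
    v.new_vlfs,
    CAST(@growth_mb / v.new_vlfs AS DECIMAL(18, 2)) AS approx_vlf_size_mb
FROM (
    SELECT CASE
               WHEN @growth_mb < 1     THEN 2   -- under 1 MB
               WHEN @growth_mb <= 64   THEN 4   -- up to 64 MB
               WHEN @growth_mb <= 1024 THEN 8   -- up to 1 GB
               ELSE 16                          -- above 1 GB
           END AS new_vlfs
) AS v;
```

For a 300 MB log this evaluates to 8 new VLFs of about 37.5 MB each.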

You can check how many VLFs a log file has and how large they are by using the undocumented DBCC LOGINFO command. Note that the VLFs will be roughly, but not exactly, the same size. Let's create a simple database with this script:


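-- Create a test database with an 80 MB data file and a 300 MB log file.
-- Adjust the file paths to a folder the SQL Server service account can write to.
USE [master]
GO
CREATE DATABASE [myTestVLF] ON PRIMARY
( NAME = N'myTestVLF', FILENAME = N'C:\myTestVLF.mdf'
,SIZE = 80 MB)
LOG ON
( NAME = N'myTestVLF_log', FILENAME = N'C:\myTestVLF_log.ldf'
,SIZE = 300 MB)
GO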

We can now use the DBCC command to check the result:


USE myTestVLF;
GO
DBCC LOGINFO;
GO

In the output (the FileSize column), we can see that the last VLF has a size that differs slightly from the others.
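On newer versions there is also a documented alternative to DBCC LOGINFO. The query below is a small sketch that assumes SQL Server 2016 SP2 or later, where the sys.dm_db_log_info function is available, and that the myTestVLF database from the script above exists.

```sql
USE myTestVLF;
GO
-- One row per VLF; vlf_size_mb makes the slightly different last VLF easy to spot.
SELECT
    file_id,
    vlf_sequence_number,
    vlf_begin_offset,
    vlf_size_mb,
    vlf_active,
    vlf_status
FROM sys.dm_db_log_info(DB_ID());
GO
```

The number of rows returned is the VLF count of the current database.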

The transaction log is a wrap-around file. Whenever we reach the end of the file, the engine tries to reuse the first VLF, which requires that VLF to be inactive. If that is not possible and autogrowth is enabled, SQL Server attempts to grow the file.
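To see why a log cannot wrap around and whether it is allowed to grow, the catalog views can be queried. This is a sketch that assumes the myTestVLF database created earlier; the views and columns used are standard ones.

```sql
-- Why can log space not be reused right now? (NOTHING means reuse is not blocked.)
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'myTestVLF';

-- Autogrowth settings of the log file. size, and growth when
-- is_percent_growth = 0, are expressed in 8 KB pages; growth = 0 means no autogrowth.
SELECT name, size * 8 / 1024 AS size_mb, growth, is_percent_growth, max_size
FROM myTestVLF.sys.database_files
WHERE type_desc = N'LOG';
```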

Log Blocks

The next physical layer of the transaction log is the log block. Virtual log files are divided into log blocks between 512 bytes and 60 KB in size. These are, respectively, the smallest and largest amounts of log data that can be flushed to disk. Log blocks act as containers for the log records.

Log Records

In SQL Server, almost every operation is logged in some way. This includes data modifications (insert, update, and delete), page allocation and deallocation, the start and end of a transaction, and so on. The Database Engine records these changes by generating log records. Each log record is identified by a unique number called the Log Sequence Number (LSN). LSNs are ever increasing, so the next log record always has a higher LSN than the previous one. An LSN is 10 bytes long and consists of three parts: VLF Sequence Number (4 bytes), Log Block Number (4 bytes), and Log Record Number (2 bytes).
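The three parts are easy to pull apart when an LSN is shown in its usual colon-separated hexadecimal text form. The snippet below is a sketch; the LSN value is made up for illustration and the column aliases simply follow the part names used above.

```sql
-- Hypothetical LSN in the 'VLF:block:record' hexadecimal text form.
DECLARE @lsn CHAR(22) = '0000002e:00000170:0001';

SELECT
    -- style 2 = hexadecimal string without the 0x prefix
    CONVERT(INT, CONVERT(VARBINARY(4), SUBSTRING(@lsn, 1, 8), 2))  AS vlf_sequence_number,
    CONVERT(INT, CONVERT(VARBINARY(4), SUBSTRING(@lsn, 10, 8), 2)) AS log_block_number,
    CONVERT(INT, CONVERT(VARBINARY(2), SUBSTRING(@lsn, 19, 4), 2)) AS log_record_number;
```

For the sample value this yields 46, 368 and 1.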

Log records are not always part of a transaction. Some log records, such as PFS free-space changes and differential bitmap changes, do not belong to any transaction.

Transactions are reflected in the log as a chain of log records. Records are stored serially as they are created, so the log records that belong to the same transaction are not necessarily located next to each other. Each transaction has an ID that is stored in every log record it generates. SQL Server uses backward pointers to link these records into a chain, so it can track all the LSNs that belong to a specific transaction and speed up the rollback process.

We can examine the log records by using the table-valued function fn_dblog. This is beyond the scope of this article, but if you are curious you can inspect the function with sp_help:

sp_help 'sys.fn_dblog'
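For the curious, a minimal way to look at the log records themselves is sketched below. fn_dblog is undocumented, so treat this as an exploration aid on a test database (myTestVLF is assumed here) rather than something to rely on in production.

```sql
USE myTestVLF;
GO
SELECT TOP (20)
    [Current LSN],
    [Operation],
    [Context],
    [Transaction ID],      -- shared by all records of the same transaction
    [Previous LSN],        -- backward pointer used to walk the chain
    [Log Record Length]
FROM sys.fn_dblog(NULL, NULL)   -- NULL, NULL = no start or end LSN filter
ORDER BY [Current LSN];
GO
```

Filtering on a single [Transaction ID] shows the chain of records that a rollback would have to walk.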

Write-Ahead Logging (WAL)

Like other contemporary relational database management systems, SQL Server needs to guarantee the durability of your transactions (once you commit your data, it is there even in the event of power loss) and the ability to roll back changes made by uncommitted transactions. The mechanism used for this is called Write-Ahead Logging (WAL). It simply means that SQL Server must write the log records associated with a particular modification to disk before it writes the modified page to disk, regardless of whether the page write happens because of a checkpoint or as part of Lazy Writer activity. It sounds natural, but how does this affect logging operations in SQL Server?

There is a common misunderstanding about when log records are flushed to disk. The general belief is that log records, the information about the modifications we make in our databases, are sent to disk immediately and hardened in the log. This is not true. While we are changing our data, SQL Server generates a log record for every change. These records are first held in log blocks in memory, in the log cache, which SQL Server manages separately from the data pages in the buffer pool.

So the first step is to keep the log records in memory; they are then flushed to disk in one of the following situations:

  • When we commit or roll back a transaction
  • When a log block reaches its maximum size of 60 KB
  • When a data page is being written to disk: all the log records up to and including the last one affecting this page must be written to disk first, regardless of which transactions they belong to
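One way to observe these flushes is through the log-related performance counters. The query below is a sketch assuming the myTestVLF database; the counters are cumulative despite the "/sec" suffix, so sample them before and after a workload and compare the values.

```sql
-- Cumulative log flush counters for one database.
SELECT RTRIM(counter_name) AS counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%:Databases%'      -- matches default and named instances
  AND instance_name = N'myTestVLF'
  AND counter_name IN (N'Log Flushes/sec',
                       N'Log Bytes Flushed/sec',
                       N'Log Flush Waits/sec');
```

Committing many tiny transactions should raise the flush count much faster than committing the same changes in one larger transaction, which is the first situation in the list above at work.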

The last case raises a question when the page write is caused by a checkpoint: why not simply flush all the log records when the checkpoint begins? The checkpoint operation can take a while, and pages modified after it started could end up being written to the file system before their log records. If that happened, we would violate the WAL protocol, and SQL Server might not be able to roll back the changes made by specific transactions.

Understanding the log structure, the number of VLFs we have, and WAL is essential to our work and can dramatically improve our effectiveness as SQL Server engineers. The next part of this series goes through the top reasons for performance problems with the transaction log. Stay tuned!

Translated from: https://www.sqlshack.com/sql-server-transaction-log-part-1-log-structure-write-ahead-logging-wal-algorithm/
