FORCESCAN and Partitioned Tables in SQL Server

I would like to share one curious case that I recently came across.

Long story short:

There is a bug that may lead to incorrect results if you query a partitioned table with the FORCESCAN hint.

Bug

Consider the following example; let’s keep it simple.

The following script creates a partition function that has 4 ranges:

Range 1 – (…,0]
Range 2 – (0,10]
Range 3 – (10,100]
Range 4 – (100,…)

Then it creates the partition scheme on that function and a table partitioned by its clustered primary key. The table contains 4 values: 1, 10, 50, 100.

Now run the script, which creates the test data and runs the queries (I added (select 1) to avoid simple parameterization):

use tempdb;
go
create partition function pf(int) as range for values (0, 10, 100)
create partition scheme ps as partition pf all to ([primary]);
go
if object_id('t') is not null drop table t;
create table t (a int primary key) on ps(a);
insert t(a) values (1),(10),(50),(100);
go
select a,b=(select 1) from t where a < 100;
select a,b=(select 1) from t with(forcescan) where a < 100;
go
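
To double-check the ranges listed above, the boundaries of the partition function can be read from the catalog views once the script has run. This is a minimal sketch that is not part of the original script; it assumes pf was created in the current database (tempdb here).

```sql
-- Sketch (not from the original script): list the boundaries of pf.
-- boundary_value_on_right = 0 means RANGE LEFT, i.e. each boundary value
-- belongs to the partition on its left: (…,0], (0,10], (10,100], (100,…).
select pf.name,
       pf.boundary_value_on_right,
       prv.boundary_id,
       prv.value
from sys.partition_functions as pf
join sys.partition_range_values as prv
    on prv.function_id = pf.function_id
where pf.name = 'pf'
order by prv.boundary_id;
```

Because the function was declared without LEFT or RIGHT, it defaults to RANGE LEFT, which is why the value 10 falls into partition 2 and the value 100 into partition 3.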

Results

The first query returned results according to the specified predicate a < 100: among 1, 10, 50, 100 those are 1, 10, 50.

The second one is missing the value 50! That is not what we expected.

Exploration

Before we dive into the details, I’d advise refreshing some background if needed. Here is a remarkable resource on how the optimizer treats partitioned tables starting from the 2008 version: Partitioned Tables in SQL Server 2008.

Long story short, with a quote:

“SQL Server 2005 treats partitioned tables specially and creates special plans, such as the above one, for partitioned tables. SQL Server 2008, on the other hand, for the most part treats partitioned tables as regular tables that just happen to be logically indexed on the partition id column. For example, for the purposes of query optimization and query execution, SQL Server 2008 treats the above table not as a heap but as an index on [PtnId]. If we create a partitioned index (clustered or non-clustered), SQL Server 2008 logically adds the partition id column as the first column of the index.”

Let us examine the query plans, starting with the rightmost Clustered Index Seek operator of the first plan (the one that produces correct results) and its Seek Predicates property.

You may observe two seek keys (not two seek predicates): the second one is what we specified in the query, and the query optimizer adds the first one automatically. Its purpose is partition elimination.

This is an optimization technique that speeds up execution by not accessing partitions the query does not need. For example, if we ask for “a < 100” and the table is partitioned by a, why would we access partitions that we know beforehand cannot contain matching data? The range of partitions is determined up front so that the useless ones are eliminated.

There are two types of partition elimination: static (our case, because we know the value 100 in the “a < 100” predicate) and dynamic (if we use a parameter or a variable).

Both static and dynamic elimination are vulnerable to the bug.
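
For the dynamic case, one way to experiment is to replace the literal with a variable. This is only a sketch for trying it out: @v is a name introduced here, not part of the original script, and whether the second query drops rows may depend on the SQL Server build.

```sql
-- Sketch: same predicate as in the test script, but the boundary is only
-- known at run time, so partition elimination has to be dynamic.
-- @v is a hypothetical variable name used for illustration.
declare @v int = 100;
select a,b=(select 1) from t where a < @v;
select a,b=(select 1) from t with(forcescan) where a < @v;
```

Comparing the two result sets, and the Seek Predicates of the two plans, shows whether the dynamic variant is affected on a given instance.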

Back to our query plan: with that knowledge we can see that partition elimination was applied by adding a predicate: Start: PtnId1000 >= Scalar Operator((1)); End: PtnId1000 <= Scalar Operator((3))

That means the optimizer, taking the partition function into account, decided to scan partitions 1 to 3.

We can confirm it by looking at the Actual Partitions Accessed property of the actual plan:

Now let’s examine the second query plan.

What we see is (from top to bottom):

Residual predicate (what we specified in the query predicate)

Static partition elimination, but with a complex key: note the delimiter “;” in the End property. The seek starts from partition 1, PtnId1000 >= Scalar Operator(1), and ends with: PtnId1000; [tempdb].[dbo].[t].a <= Scalar Operator((3)-(1)); Scalar Operator((100)).

Notice the bright red square – the partitions we are going to end with are (3)-(1)!

Indeed if we examine the plan property:

If we run a simple query to see which partition each value belongs to, we see that we missed partition 3, which holds the value 50 that our predicate “a < 100” demands:

select a, PtnId = $partition.pf(a) from t

We should access three of them (partitions 1 to 3).
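
The same distribution can also be read from the catalog. The query below is a small sketch, not part of the original article; it assumes the table t from the test script exists in the current database and that its clustered primary key has index_id = 1.

```sql
-- Sketch: row count per partition of the clustered index on t.
-- With the test data (1, 10, 50, 100) and the RANGE LEFT function pf,
-- partitions 2 and 3 should each hold two rows; 1 and 4 should be empty.
select partition_number, rows
from sys.partitions
where object_id = object_id('t')
  and index_id = 1
order by partition_number;
```

Any plan that stops at partition 2 therefore cannot return the value 50.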

Explanation

The devil is in the details. First, we should recall some theory of the query optimization process. Before the query execution plan is built, the Query Optimizer builds a physical operator tree, a tree of C++ objects in memory that reflects the query's logical demands but is already expressed in physical operators. Next, the tree is extracted and transformed into what we usually call a query plan. This process is known as post-optimization rewrite.

One of those rewrites merges “a filter and a scan operator” into “a scan with a residual predicate”. Fortunately, there is a trace flag that disables this transformation: TF 9130. It is quite helpful for debugging SQL Server plans, and it is helpful here as well.

When we apply the FORCESCAN hint, we force a scan with a residual predicate, but during the post-optimization rewrite phase it is then rewritten to a Seek on the partitioned index.

Now, what if we disable this “filter merging” rewrite and run the query with the magic TF:

select a,b=(select 1) from t with(forcescan) where a < 100;
select a,b=(select 1) from t with(forcescan) where a < 100 option(querytraceon 9130);

Plans:

Results:

We observe that when the filter is not combined with the scan of the partitioned table, we have an extra Filter operator and the expected results. When it is combined, we have a Seek and wrong results.

Now we clearly see that this issue is not about optimizing the query, but about rewriting it into the query plan.

The question is: what is done during that rewrite?

The optimizer takes a last chance to push the predicate down closer to the scan and succeeds. It calculates the partition range for elimination, and that is where the error sits: it does the wrong math. It tries to consider both predicates, residual and seek, and combine them to determine the right range for partition elimination.

Let's run the following and examine the Seek Predicates property in the query plans:

select a,b=(select 1) from t where a < 100 and a > 10;
select a,b=(select 1) from t with(forcescan) where a < 100 and a > 10;

Results:

The second one has no rows at all!

Now the plans.

The first one has:

As you see, it has combined the start of the seek with the “a > 10” predicate and calculated that for this condition it should start with Partition ID = 2 + 1. Here 2 is the partition that holds the value 10, and 1 is added on top of it; in total 2 + 1 = 3, which is the partition the value 50 belongs to. The end of the scan is also calculated correctly, because there is no residual predicate, and it is 3. So we scan only one partition, with PtnID = 3. You can see it if you examine the “Actual Partitions Accessed” property.

The second one does the same math for the start predicate but, unfortunately, also brings the residual predicate into the calculation and ends up with the following:

The condition for the partition elimination is

PtnID <= 3-1 and PtnID >= 2+1, i.e. PtnID <= 2 and PtnID >= 3: obviously, no partition satisfies that, so no rows are returned.
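
The numbers in this condition can be reproduced with the partition function itself. The snippet below is only a sketch that mirrors the arithmetic shown in the plan; it is not the optimizer's actual code, and the column aliases start_ptn and end_ptn are names introduced here for illustration. It assumes pf from the test script exists in the current database.

```sql
-- Sketch: reproduce the partition range arithmetic seen in the plan.
-- $partition.pf(10) is 2 and $partition.pf(100) is 3 for this function.
select start_ptn = $partition.pf(10) + 1,   -- 2 + 1 = 3
       end_ptn   = $partition.pf(100) - 1;  -- 3 - 1 = 2
```

Since start_ptn is greater than end_ptn, the seek range over the partition id is empty, which matches the empty result of the second query.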

Also, you may see that this plan has one Seek Key, but the first one has two Seek Keys. Combining them into one Seek Key and applying the same logic as if there were two of them leads to the wrong specification. It is not about the residual itself, though; it is about how the predicates get combined to calculate the partition range. I may be wrong in subtle details, because I do not have the source code, but I think I’m right in general.

Solution

This is a bug, and the only proper solution is to update SQL Server. However, I don’t know whether it is a known bug; it is not removed by TF 4199 (generally, I think “incorrect results” bugs are fixed without any TF), so I’ve filed a Connect item. Feel free to vote: more votes, more attention, quicker fix.

As an immediate fix, if you do suffer from incorrect results, remove the FORCESCAN hint or add WITH(INDEX(0)). That removes the error but may lead to an inefficient plan.
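
Applied to the test query from this article, the two workarounds could look like the sketch below. It only restates the suggestions above in code form; it is worth verifying the results on your own build before relying on either variant.

```sql
-- Option 1: drop the FORCESCAN hint and let the optimizer choose.
select a,b=(select 1) from t where a < 100;

-- Option 2: ask for a clustered index scan via INDEX(0) instead of FORCESCAN.
select a,b=(select 1) from t with(index(0)) where a < 100;
```

Both queries are expected to return 1, 10 and 50, at the possible cost of a less efficient plan on large tables.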

References

Introduction to Partitioned Tables
Partitioned Tables in SQL Server 2008
Partitioned Indexes in SQL Server 2008
Dynamic Partition Elimination Performance

Translated from: https://www.sqlshack.com/forcescan-hint-and-partitioned-table/
