Runtime Constants Sniffing in SQL Server

Most people know about so-called “Parameter Sniffing”. The topic has been covered from many angles in a number of great articles. Interestingly, not only parameters may be “sniffed” during the first execution, but also runtime constant functions. Let’s look at an example.

Test Data

I will use a test server and an administrator account to run the scripts below. Be sure you have sufficient privileges on your test server if you want to try them out.

First, we will create a database and two users.

-- 1. Create database
create database RtConst;
go
alter database RtConst set compatibility_level = 110;
go
use RtConst;
go
-- 2. Create users
create login user1 with password = '123', check_policy = off;
create login user2 with password = '123', check_policy = off;
create user user1 for login user1;
create user user2 for login user2;
grant select, showplan to user1;
grant select, showplan to user2;
go

Now, assume our system has a kind of log implemented with the table below, with a non-clustered index on the user column of that table.

-- 3. Create table Log and fill with data
create table dbo.ActionLog
(
    ActionDate datetime not null,
    ActionUser sysname not null,
    ActionData varchar(1000) not null,
    primary key (ActionDate, ActionUser)
);
create index ix_ActionUser on dbo.ActionLog(ActionUser);
go

It is common for users to have different amounts of activity and therefore different numbers of log records. For the demonstration, let’s create only two users: the first has 10 logged actions, the second has 9,990.

-- 10 000 numbers
with nums(n) as
(
    select top(10000) row_number() over(order by (select null))
    from master..spt_values v1, master..spt_values v2
)
insert dbo.ActionLog(ActionDate, ActionUser, ActionData)
select
    ActionDate = dateadd(mi, n, '20150101'), -- Some date
    ActionUser = case when n%1000 = 0 then N'user1' else N'user2' end, -- 10 actions from user1, 9990 actions from user2
    ActionData = 'Some Action ' + convert(varchar(10), n) -- Some action data
from nums;
go

Now let’s query the table for the user actions and examine the plans and IO stats.

-- 4. Check for the plans
set statistics xml, io on;
select * from dbo.ActionLog where ActionUser = N'user1';
select * from dbo.ActionLog where ActionUser = N'user2';
set statistics xml, io off;
go

For the user with the small number of entries, the “Index Seek + Lookup” strategy is the better choice. It is cheaper to seek the non-covering non-clustered index and then look up the rest of the data using the clustered index.

For the second query, it is cheaper to scan the clustered index because the query touches many more rows.

The IO statistics for those queries are:

Runtime Constant Function Sniffing

Imagine that our system has a query that displays the logged actions for the current user, implemented as follows.

select * from dbo.ActionLog where ActionUser = suser_sname();

Let’s run exactly the same query first under user1, then under user2 and gather IO stats and plans.

-- 5. Sniffed RT const
dbcc freeproccache; -- Warning: this clears the entire plan cache
go
set statistics xml, io on;
go
execute as login = 'user1';
go
select * from dbo.ActionLog where ActionUser = suser_sname();
go
revert;
go
execute as login = 'user2';
go
select * from dbo.ActionLog where ActionUser = suser_sname();
go
revert;
go
set statistics xml, io off;
go

The plans will be the same for both users, even though a Clustered Index Scan would be the better strategy for the second user.

That means the IO statistics for the second query are poor.
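
One way to confirm that both executions shared a single cached plan is to look at the plan cache. The sketch below is not part of the original script. It assumes you hold the VIEW SERVER STATE permission and run it right after script 5, before the cache is cleared again. The LIKE filters are only an illustrative way of locating the statement text.

```sql
-- Sketch: list cached plans whose text mentions the demo query.
-- usecounts shows how many times each cached plan has been used.
select cp.usecounts, cp.objtype, st.text
from sys.dm_exec_cached_plans as cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) as st
where st.text like N'%ActionLog%suser_sname%'
  and st.text not like N'%dm_exec_cached_plans%'; -- exclude this diagnostic query itself
```

A single row for the query with a usecounts value above 1 would suggest that the second user reused the plan compiled for the first one.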

Let’s clear the cache and replay this example in reverse order: first execute as user2, then as user1.

-- 6. Opposite way
dbcc freeproccache; -- Warning: this clears the entire plan cache
go
set statistics xml, io on;
go
execute as login = 'user2';
go
select * from dbo.ActionLog where ActionUser = suser_sname();
go
revert;
go
execute as login = 'user1';
go
select * from dbo.ActionLog where ActionUser = suser_sname();
go
revert;
go
set statistics xml, io off;
go

Both plans are now Clustered Index Scans with 63 logical reads, even the plan for user1.

If you stop for a moment and think about it, you may notice that this behavior is very similar to “Parameter Sniffing”, except that we have no parameters here. As in the “parameter sniffing” pattern, the plan depends on the value that was used to build it during the first execution, in this case the value that the intrinsic function returned at that time.

Runtime Constant Functions

For some scalar functions, SQL Server pulls the function expression out of the operator tree, caches its value, and reuses it during query execution. For example, if you issue the query «select sysdatetime(), sysdatetime()» you will get exactly the same time in both columns, down to 10^-6 seconds. If the function were really invoked twice during the execution, then, because it is not deterministic, the time in one column would be slightly different from the other. That does not happen, because the function expression is extracted and evaluated only once per execution. For more details, see Conor Cunningham’s blog post “Conor vs. Runtime Constant Functions”.
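
The behavior described above can be tried with a few ad hoc queries. This is a minimal sketch that is not part of the original article: the column aliases are illustrative, sys.objects is used only as a convenient source of rows, and the newid() query is added as a contrast, since newid() is evaluated for every row rather than treated as a runtime constant.

```sql
-- Both columns should show the same value, as described in the text.
select t1 = sysdatetime(), t2 = sysdatetime();

-- The same holds across rows: each row shows the same getdate() value.
select top (5) name, query_time = getdate()
from sys.objects;

-- Contrast: newid() is evaluated per row, so every row gets a different value.
select top (5) name, row_id = newid()
from sys.objects;
```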

The interesting part is that this expression extraction happens during plan compilation, and the value is sniffed during the first execution, just like a parameter in a module.

We can observe this behavior by attaching a debugger, setting a breakpoint on the sqlmin!CQuery::AddExprCache method, and running two queries.

go
dbcc freeproccache; -- Warning: this clears the entire plan cache
go
declare @user sysname = N'someuser';
select * from dbo.ActionLog where ActionUser = @user;
go
dbcc freeproccache; -- Warning: this clears the entire plan cache
go
select * from dbo.ActionLog where ActionUser = suser_sname();
go

The first query returns immediately: no breakpoint is hit because there is nothing to cache. The second query hits breakpoints when the runtime constant is created (sqllang!CNormalizeExpr::PvrCreateRTConst) and when it is cached (sqlmin!CQuery::AddExprCache), at which point the call stack can be inspected in WinDbg.

Another Example

Now let’s imagine that you have some kind of order management system and want to find the orders created during the last hour. For that purpose, we may write a query like this.

select * from dbo.Orders where OrderDate >= dateadd(hh,-1, getdate()) and OrderDate < getdate();

If you issue this query for the first time at a moment when there are very few orders, you will likely get a plan with the Index Seek + Lookup strategy. If there is some kind of “rush hour” during the day (say, very few orders in the morning and a lot of orders in the evening), SQL Server will still use the plan cached for the small number of orders, unless, of course, adding new orders triggers a statistics update and the plan is recompiled. However, that might not happen immediately.

For demo purposes, rather than waiting an hour, we will make the time window very small: 10 seconds. We will also insert the “future” orders beforehand, dated now + 10 seconds. This avoids having to create a big table: adding the orders to a small table afterwards would exceed the 20% statistics threshold and trigger a statistics update and recompilation, so we would not observe the cached expression effect.

As the first step, we will compile and run the query for the time window [now-10 sec; now]. At that moment, only 10 orders fit this window. Then we will wait for 10 seconds and re-run the query, but this time the number of matching orders is much larger (pretending that a lot of orders were created during that time). As a final step, we will manually trigger recompilation with the procedure sp_recompile, to see what the efficient plan should be.

Let’s run the whole script at once.

-- 7. Another example
if object_id('dbo.Orders') is not null drop table dbo.Orders;
create table dbo.Orders
(
    OrderID int identity primary key,
    OrderDate datetime not null,
    OrderData varchar(1000) not null
);
create index ix_OrderDate on dbo.Orders (OrderDate);
go
with nums(n) as
(
    select top(10000) row_number() over(order by (select null))
    from master..spt_values v1, master..spt_values v2
)
insert dbo.Orders(OrderDate, OrderData)
select
    OrderDate = dateadd(ss, case when n <= 10 then 1 else 10 end, getdate()),
    OrderData = 'Some Data'
from nums;
go
update statistics dbo.Orders with fullscan;
go
waitfor delay '00:00:01';
go
set statistics xml, io on;
select * from dbo.Orders where OrderDate >= dateadd(ss,-10, getdate()) and OrderDate < getdate();
set statistics xml, io off;
go
waitfor delay '00:00:10';
go
set statistics xml, io on;
select * from dbo.Orders where OrderDate >= dateadd(ss,-10, getdate()) and OrderDate < getdate();
set statistics xml, io off;
go
sp_recompile 'dbo.Orders';
go
set statistics xml, io on;
select * from dbo.Orders where OrderDate >= dateadd(ss,-10, getdate()) and OrderDate < getdate();
set statistics xml, io off;
go

The first two plans both use the Seek + Lookup pattern, and the reads are poor for the second query. The third plan, after recompilation, uses a Scan and the reads are fine.
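
To check that the second run really reused the cached plan rather than being recompiled after a statistics update, one option is to inspect the statistics on the table. This is a sketch under assumptions rather than part of the original script: it relies on sys.dm_db_stats_properties, which is only available from SQL Server 2008 R2 SP2 and SQL Server 2012 SP1 onward, and the column aliases are illustrative.

```sql
-- Sketch: show when each statistics object on dbo.Orders was last updated
-- and how many modifications have accumulated since then.
select s.name as stats_name,
       sp.last_updated,
       sp.rows,
       sp.modification_counter
from sys.stats as s
cross apply sys.dm_db_stats_properties(s.object_id, s.stats_id) as sp
where s.object_id = object_id(N'dbo.Orders');
```

Since the script inserts all rows before running update statistics, the modification counter would be expected to stay low, which is consistent with no automatic statistics update having taken place between the runs.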

SQL Server Version

As you may have noticed, in the database creation step I set the compatibility level to 110 (SQL Server 2012). That is done to force the old cardinality estimation (CE) behavior. With the new CE, the first example (with the log table) will produce scans both times. The expression is still cached, but the new CE estimates the number of rows using density (as it does with the optimize for unknown hint). The second example does not depend on the CE version.
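
To compare the two estimators yourself, you could switch the compatibility level of the demo database and re-run the first example. The following is a sketch that assumes an instance of SQL Server 2014 or later, where level 120 is available and enables the new CE by default.

```sql
-- Check which compatibility level the demo database currently uses.
select name, compatibility_level
from sys.databases
where name = N'RtConst';
go
-- Switch to the SQL Server 2014 level to get the new CE,
-- then re-run script 5 and compare the plans.
alter database RtConst set compatibility_level = 120;
go
-- Switch back to the level used throughout the article.
alter database RtConst set compatibility_level = 110;
go
```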

Conclusion

Though the queries and situations presented here are artificial, I think it is useful to know about this kind of SQL Server behavior. As a curiosity, neither TF 4136 (disabling parameter sniffing) nor optimize for unknown has any effect on it. That is because different classes are responsible for handling parameters and runtime constant functions internally. If this kind of “sniffing” becomes a problem, you may replace the direct call of the intrinsic function with a variable, and then treat the situation as you normally would with variables or parameters.
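
The variable-based workaround might look like the sketch below. The variable name is illustrative, and option (recompile) is only one of the usual ways of handling variables, chosen here as an assumption rather than as the author's recommendation.

```sql
-- Capture the runtime constant in a variable first.
declare @user sysname = suser_sname();

-- option (recompile) lets the optimizer see the variable's current value
-- on every execution, at the cost of compiling the statement each time.
select *
from dbo.ActionLog
where ActionUser = @user
option (recompile);
```

Without the hint, a local variable is typically estimated from density rather than from its actual value, so which variant fits best depends on how skewed the data is and how often the query runs.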

That’s all, thanks for reading.

Translated from: https://www.sqlshack.com/runtime-constants-sniffing/