Columnstore Index Enhancements – Index stats update in clone databases

SQL Server was launched on Windows NT in 1993 and recently marked its 25th anniversary. It has come a long way since that first release. Around the same time, Microsoft announced a preview version of SQL Server 2019. SQL Server 2019 extends support to big data, Apache Spark, and the Hadoop Distributed File System (HDFS), and it brings enhancements to database performance, security, and SQL Server on Linux, along with new features.

SQL Server 2012 first introduced columnstore indexes to improve workload performance. With this feature, data is stored column by column rather than row by row. This gives large performance improvements for large-scale queries, especially in data warehouse and business intelligence workloads. A columnstore index also provides a high degree of data compression compared with the uncompressed data.

You can explore the concepts of columnstore indexes in these articles:

Columnstore Index in SQL Server
Columnstore Indexes to improve your Data Warehouse Staging Environment
Create a Clustered Columnstore Index on a Memory-Optimized Table

SQL Server 2019 provides the following enhancements to columnstore indexes:

Columnstore index stats update in clone databases
Compression estimates for columnstore
Resumable online index creation

In this series of articles on columnstore index enhancements in SQL Server 2019, we start with the columnstore index stats update in clone databases.

Columnstore index stats update in clone databases

Before we move further, let us briefly cover DBCC CLONEDATABASE, a feature introduced in SQL Server 2014 SP2. This command creates a schema-only copy of an online source user database: the clone contains no user data. We can use it to analyze or troubleshoot query performance issues related to the query optimizer. The command takes an internal snapshot of the source database and copies its metadata, schema, and statistics to the new clone database.

When we combine columnstore indexes with a clone database, the statistics do not behave as expected in SQL Server 2017 and earlier. Let me demonstrate this in SQL Server 2017 first, and then we will move to the SQL Server 2019 enhancement.

Columnstore index and clone database in SQL Server 2017

First, we will prepare the environment with the sample database and load data into it.

-- Create the sample database
CREATE DATABASE SQLShackDemoColumnSore;
GO
USE SQLShackDemoColumnSore;
GO
-- Simple two-column table used throughout the demo
CREATE TABLE [dbo].[Employees]
(
    [EmpID]   [int]         NOT NULL,
    [EmpName] [varchar](50) NOT NULL
);
GO

Next, we load about 2 million rows of test data into the table.
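The original load script is not shown, so the following is only one possible way to generate the rows. It is a minimal sketch that assumes the table definition above; the `Employee_<n>` name pattern and the use of `sys.all_columns` as a row source are illustrative choices, not part of the original demo.

```sql
USE SQLShackDemoColumnSore;
GO
-- Generate 2,000,000 sequential rows in one set-based statement.
-- The cross join of sys.all_columns is only a convenient row source
-- (an assumption of this sketch); its contents are not used.
WITH Numbers AS (
    SELECT TOP (2000000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_columns AS a
    CROSS JOIN sys.all_columns AS b
)
INSERT INTO dbo.Employees (EmpID, EmpName)
SELECT n, CONCAT('Employee_', n)   -- hypothetical naming pattern
FROM Numbers;
GO
```

A set-based insert like this is usually much faster than a row-by-row loop, which matters when the goal is simply to get test volume into the table.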

In the next step, create the clustered columnstore index on this table with the following query.

USE SQLShackDemoColumnSore;
GO
-- Convert the table to a clustered columnstore index
CREATE CLUSTERED COLUMNSTORE INDEX [CCS_Employees] ON [dbo].[Employees];
GO

Let us check the statistics for this table using the sys.stats catalog view. This view returns one row for each statistics object that exists on tables, indexes, and indexed views in the database.

SELECT * FROM sys.stats WHERE object_id = OBJECT_ID('dbo.Employees');
GO

In the result set, we can see a statistics object named 'CCS_Employees'. It is created automatically when the columnstore index is created.

Let us insert another 3 million records into this table and run a select operation so that SQL Server creates automatic statistics.
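The statements used for this step are not shown in the original. As a hedged sketch, the second load and a statistics-triggering query could look like the following. The row generator, the `Employee_<n>` naming pattern, and the choice of `EmpName` as the filtered column are all assumptions made for illustration.

```sql
USE SQLShackDemoColumnSore;
GO
-- Continue numbering after the highest existing EmpID (0 if the table is empty).
DECLARE @start int = (SELECT ISNULL(MAX(EmpID), 0) FROM dbo.Employees);

-- Append 3,000,000 generated rows; the cross join is only a row source.
WITH Numbers AS (
    SELECT TOP (3000000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_columns AS a
    CROSS JOIN sys.all_columns AS b
)
INSERT INTO dbo.Employees (EmpID, EmpName)
SELECT @start + n, CONCAT('Employee_', @start + n)   -- hypothetical naming pattern
FROM Numbers;
GO
-- Filtering on a column gives the optimizer a reason to auto-create
-- a column statistic (requires AUTO_CREATE_STATISTICS ON, the default).
SELECT COUNT(*) AS matching_rows
FROM dbo.Employees
WHERE EmpName = 'Employee_4500000';
GO
```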

Now, if we view the statistics again, we can see one more statistics object with a name starting with _WA_Sys_. Statistics with this prefix are created automatically by SQL Server.
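To see which statistics were auto-created and how many rows each one reflects, sys.stats can be joined to the sys.dm_db_stats_properties function. This is a sketch rather than the query used in the original article; the column list is simply a selection that seems useful here.

```sql
USE SQLShackDemoColumnSore;
GO
-- One row per statistics object on dbo.Employees.
-- OUTER APPLY keeps the row even when the function returns nothing
-- for a given statistic.
SELECT s.name                  AS stats_name,
       s.stats_id,
       s.auto_created,         -- 1 for _WA_Sys_* statistics
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.modification_counter
FROM sys.stats AS s
OUTER APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Employees');
GO
```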

Let us also view the execution plan to verify the estimated and actual number of rows in the table. The plan shows 5 million rows, which matches the 2 million plus 3 million rows we inserted.

Create the clone database using the DBCC CLONEDATABASE command as shown below.

DBCC CLONEDATABASE (SQLShackDemoColumnSore, SQLShackDemoColumnSore_CLONE);

The clone has been created, and it is named SQLShackDemoColumnSore_CLONE. Note the message in the command output: 'clone database should be used for diagnostics purpose only and is not supported for use in a production environment'.
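A quick way to confirm that the new database really is a clone is to query sys.databases together with the IsClone property of DATABASEPROPERTYEX. This check is not part of the original walkthrough; it is a small sketch that assumes the two database names used above.

```sql
-- List the source and the clone side by side.
-- is_read_only shows whether each database is currently read-only,
-- and IsClone reports whether it was created by DBCC CLONEDATABASE.
SELECT name,
       is_read_only,
       DATABASEPROPERTYEX(name, 'IsClone') AS is_clone
FROM sys.databases
WHERE name IN (N'SQLShackDemoColumnSore', N'SQLShackDemoColumnSore_CLONE');
```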

We can view both the Source and Clone database in SSMS.

The clone database contains both of the statistics objects that we checked earlier in the source database.

In SSMS, we can also compare the statistics of the two databases.
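The same comparison can be made in T-SQL by reading the statistics header in each database. The sketch below uses DBCC SHOW_STATISTICS with the STAT_HEADER option on the CCS_Employees statistic; running it in both databases and comparing the Rows and Updated columns is a suggestion of this rewrite, not a step from the original article.

```sql
-- Header of the columnstore index statistic in the source database
USE SQLShackDemoColumnSore;
GO
DBCC SHOW_STATISTICS ('dbo.Employees', CCS_Employees) WITH STAT_HEADER;
GO
-- Same statistic in the clone database
USE SQLShackDemoColumnSore_CLONE;
GO
DBCC SHOW_STATISTICS ('dbo.Employees', CCS_Employees) WITH STAT_HEADER;
GO
```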

Now let us run a query against the clone database. In the actual execution plan, a warning appears: 'Columns with No statistics'.
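The query used in the original is not shown. As an illustration only, the following sketch runs a hypothetical filter query against the clone and returns the actual execution plan as XML, where plan warnings and the estimated row counts can be inspected. The predicate value is an assumption.

```sql
USE SQLShackDemoColumnSore_CLONE;
GO
-- Return the actual execution plan as XML alongside the result set
SET STATISTICS XML ON;
GO
-- Hypothetical test query; the clone holds no data, so it returns no rows,
-- but the optimizer still builds the plan from the copied statistics.
SELECT EmpID, EmpName
FROM dbo.Employees
WHERE EmpName = 'Employee_100';
GO
SET STATISTICS XML OFF;
GO
```

In SSMS, enabling "Include Actual Execution Plan" before running the query gives the same information in graphical form.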

Also, note that the estimated number of rows is only 2 million, while our source table contains 5 million rows. The estimate should be 5 million, since both statistics objects are present in the clone database as well. This shows that the clone does not carry up-to-date statistics for the columnstore index.

Now let us view this behavior in SQL Server 2019.

SQL Server 2019 columnstore index and clone database behavior

We observed above that, up to SQL Server 2017, a clone of a source database with a columnstore index does not carry updated statistics for that index. Without up-to-date statistics, it is difficult to analyze performance issues on the clone.

SQL Server 2019 enhances how columnstore index statistics are handled for clone databases. These statistics are now updated automatically in a database created using DBCC CLONEDATABASE.

Let us perform the same test that we ran in SQL Server 2017. I copied the data at each stage using the Import and Export Wizard so that the data is identical in the SQL Server 2017 and SQL Server 2019 instances.

When we execute the query in the clone database on SQL Server 2019, the execution plan shows no warning, and the number of rows is 5 million, matching the source table. This shows that in SQL Server 2019 the statistics are updated automatically for columnstore indexes as well. Unlike in previous versions of SQL Server, the clone now gives us an execution plan with the same details as the source database.

Conclusion

The SQL Server 2019 enhancement to columnstore index statistics in clone databases makes it possible to troubleshoot performance issues for columnstore indexes on a clone as well. In SQL Server 2017 and earlier, we had to handle the statistics update manually. More enhancements in this area may arrive in future releases. In the next articles of this series, we will look at a few more columnstore index enhancements in SQL Server 2019.

Table of contents

Columnstore Index Enhancements – Index stats update in clone databases

Columnstore Index Enhancements – data compression, estimates and savings

Columnstore Index Enhancements – online and offline (re)builds

Translated from: https://www.sqlshack.com/columnstore-index-enhancements-index-stats-update-in-clone-databases/
