Top five considerations for SQL Server index design


In this article, we will discuss the most important points to consider when designing an optimal SQL index. Before going through the index design procedure, let us review the SQL Server index concept.

SQL Server index overview

A SQL index is one of the most important factors in SQL Server performance tuning. It speeds up queries by providing swift access to the requested data, an operation called an index seek, instead of scanning the whole table to retrieve a few records. It works much like the index of a book, which gives you the page where each word appears so that you do not have to spend the whole week reading the book to find a specific subject. In other words, the index saves time and resources.

For more information about the SQL index structure, the operations that can be performed on an index, and how to take advantage of indexes when tuning T-SQL query performance, check the following articles:
- SQL Server Index Structure and Concepts
- SQL Server index operations
- Tracing and tuning queries using SQL Server indexes

SQL Server provides us with two main types of indexes. The first is the clustered index, which stores the whole table data sorted by the selected index key. The key consists of one or multiple columns, and only one clustered index can be created per table. Creating a clustered index converts the table from an unsorted heap into a sorted clustered table.

For more information, see "Designing effective SQL Server clustered indexes".

The second main type is the non-clustered index, in which the leaf nodes store only the index key values together with a pointer to the location of the row in the underlying heap or clustered index. You can create up to 999 non-clustered indexes per table.

For more information, see "Designing effective SQL Server non-clustered indexes".
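The two index types described above can be sketched in T-SQL as follows. This is a minimal illustration, not a recommended design: the table dbo.DemoOrders, its columns, and the index names are hypothetical, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later.

```sql
-- Hypothetical demo table; all names are illustrative only.
DROP TABLE IF EXISTS dbo.DemoOrders;   -- assumes SQL Server 2016 or later

CREATE TABLE dbo.DemoOrders
(
    OrderID     INT           NOT NULL,
    CustomerID  INT           NOT NULL,
    OrderDate   DATE          NOT NULL,
    TotalAmount DECIMAL(10,2) NOT NULL
);
-- With no clustered index, the table is stored as a heap at this point.

-- Only one clustered index is allowed per table.
-- Its leaf level holds the data rows themselves, ordered by OrderID.
CREATE UNIQUE CLUSTERED INDEX CIX_DemoOrders_OrderID
    ON dbo.DemoOrders (OrderID);

-- A non-clustered index: its leaf level holds CustomerID plus a row locator
-- (the clustering key here, because the table now has a clustered index).
CREATE NONCLUSTERED INDEX IX_DemoOrders_CustomerID
    ON dbo.DemoOrders (CustomerID);

-- List the indexes on the table together with their types.
SELECT name, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.DemoOrders');
```

The final query reads the sys.indexes catalog view, so you can confirm that the table reports one CLUSTERED and one NONCLUSTERED index rather than a HEAP.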

SQL Server also provides other special-purpose SQL indexes, derived from the clustered and non-clustered types, that can help in improving the performance of T-SQL queries. These include the Unique index, Filtered index, Spatial index, XML index, Columnstore index, Full-Text index, and Hash index.

For more information, see "Working with different SQL Server indexes types".

After creating the index, we also need to monitor its usage to make sure that it is still efficient and useful for us. This can be done by gathering statistical information about the indexes and their usage, and then performing the proper maintenance operations on these indexes to keep them in a healthy state.
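One possible way to gather such usage information is the sys.dm_db_index_usage_stats dynamic management view. The query below is a sketch that compares how often each index in the current database is read (seeks, scans, lookups) with how often it is written (updates). Which thresholds count as "unused" is left to your judgment, and the counters are reset when the instance restarts.

```sql
-- Read activity versus write activity per index in the current database.
SELECT
    OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
    OBJECT_NAME(i.object_id)        AS table_name,
    i.name                          AS index_name,
    i.type_desc,
    us.user_seeks,
    us.user_scans,
    us.user_lookups,
    us.user_updates                 -- maintenance cost paid on data changes
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY us.user_updates DESC;
```

An index with many updates but few or no seeks, scans, and lookups is a candidate for review, since it is costing maintenance without helping reads. Rows with NULL counters have not been touched since the counters were last reset.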

Index design considerations

The goal of the SQL Server index design task is to have an index that the SQL Server Query Optimizer will choose to enhance the performance of the submitted queries. In all my articles and sessions, I describe the index as a double-edged sword. In this way, I make sure that we are not blaming the index for our mistakes. The index will be our superhero and improve the performance of our queries if we design it correctly. But if the index is poorly designed, it will degrade the performance of our queries and slow down the data retrieval process. In other words, having no index is better than having a poorly designed one.

Selecting the right SQL Server index for your database and workload requirements is not an easy mission, but it is not an impossible one either. In this process, you need to balance the index gain, in the form of faster data retrieval, against the index overhead on data insertion and modification operations.

To help you design a proper index that the SQL Server Query Optimizer will use to enhance the performance of your queries, we will discuss here the top five points that you need to consider when planning to create an index.

Database design

To design a proper index, you need to study the characteristics of the database on which the SQL Server index will be created. If the database handles an Online Transaction Processing (OLTP) workload, with a large number of insert and data modification queries, it is recommended not to overload the database with a large number of indexes. This is because inserting, updating, or deleting any row in the underlying table also requires applying the same change to all related indexes on that table. So, you should create the minimum possible number of indexes on the OLTP tables, with the fewest possible columns participating in each index key. In this way, we can use the SQL indexes to speed up data retrieval with minimal overhead on the data modification operations.

If the database handles an Online Analytical Processing (OLAP) workload, as in a Data Warehouse that is part of a Business Intelligence structure, most of the workload will consist of SELECT queries that retrieve a large amount of data for analysis or reporting purposes, with only a small number of data modification queries. In this case, you can create a larger number of SQL Server indexes, adding all required columns as index key or non-key columns, to enhance the performance of the SELECT queries and get the requested data faster.

Another thing to consider when indexing a database table is the size of the table. If the table is small, with fewer than roughly 1000 pages, indexing it is unlikely to bring any performance gain, as the SQL Server Query Optimizer will usually prefer scanning the whole table over using the SQL index. In other words, the index on the small table will probably not be used, yet it will still add overhead, as it must be maintained whenever the table changes.
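To see which tables fall below that size, you can count the pages each table uses. The query below is a sketch based on the sys.dm_db_partition_stats view. The 1000-page figure is the rule of thumb from the text, not a hard limit in SQL Server. Because each page is 8 KB, 1000 pages correspond to about 8 MB.

```sql
-- Pages and rows per user table, counting only the table data itself
-- (index_id 0 = heap, index_id 1 = clustered index).
SELECT
    OBJECT_SCHEMA_NAME(ps.object_id) AS schema_name,
    OBJECT_NAME(ps.object_id)        AS table_name,
    SUM(ps.used_page_count)          AS used_pages,
    SUM(ps.row_count)                AS total_rows
FROM sys.dm_db_partition_stats AS ps
WHERE ps.index_id IN (0, 1)
  AND OBJECTPROPERTY(ps.object_id, 'IsUserTable') = 1
GROUP BY ps.object_id
ORDER BY used_pages;
```

The SUM is needed because a partitioned table has one row per partition in this view.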

You also need to look at the database views, identify the ones that contain multiple joins and aggregations, and create indexes on these views to enhance the performance of reading from them.

For more information, see "SQL Server Indexed Views".
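As a minimal sketch of an indexed view over an aggregation, consider the example below. The table dbo.DemoSales and the view dbo.vDemoSalesByCustomer are hypothetical. The script assumes SQL Server 2016 or later (for DROP ... IF EXISTS), a tool that understands the GO batch separator such as SSMS or sqlcmd, and the default session SET options that indexed views require.

```sql
DROP VIEW  IF EXISTS dbo.vDemoSalesByCustomer;
DROP TABLE IF EXISTS dbo.DemoSales;
GO

CREATE TABLE dbo.DemoSales
(
    SaleID     INT           NOT NULL PRIMARY KEY,
    CustomerID INT           NOT NULL,
    Amount     DECIMAL(10,2) NOT NULL   -- SUM in an indexed view needs a non-nullable expression
);
GO

-- The view must be schema-bound and use two-part object names.
CREATE VIEW dbo.vDemoSalesByCustomer
WITH SCHEMABINDING
AS
SELECT CustomerID,
       COUNT_BIG(*) AS SaleCount,     -- required when the view uses GROUP BY
       SUM(Amount)  AS TotalAmount
FROM dbo.DemoSales
GROUP BY CustomerID;
GO

-- The first index on a view must be unique and clustered;
-- creating it materializes the aggregated result on disk.
CREATE UNIQUE CLUSTERED INDEX CIX_vDemoSalesByCustomer
    ON dbo.vDemoSalesByCustomer (CustomerID);
```

Keep in mind that the materialized result has to be maintained on every change to dbo.DemoSales, so the same read-versus-write trade-off discussed for table indexes applies here.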

T-SQL query

Studying the queries that hit the database tables most frequently, by checking with the system developers or by using profiling tools such as SQL Profiler or Extended Events, will help in designing the SQL Server indexes that contribute most to the overall system performance.

After getting statistics about the frequently executed queries, we should check the columns that are used in the predicates and join conditions of these queries and create the proper index. To speed up the data retrieval operation, add all the columns needed to cover the frequently executed query, and avoid any unnecessary column.
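A covering index for one such query might look like the sketch below. The table, the query, and the index name are hypothetical, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later. The columns used in the predicate go into the index key, while the column that is only returned goes into INCLUDE as a non-key column.

```sql
DROP TABLE IF EXISTS dbo.DemoOrders;

CREATE TABLE dbo.DemoOrders
(
    OrderID     INT           NOT NULL PRIMARY KEY CLUSTERED,
    CustomerID  INT           NOT NULL,
    OrderDate   DATE          NOT NULL,
    TotalAmount DECIMAL(10,2) NOT NULL,
    Notes       NVARCHAR(200) NULL      -- not needed by the query, so left out of the index
);

-- Key columns: the columns filtered on in the WHERE clause.
-- Included column: returned by the query but never filtered or sorted on.
CREATE NONCLUSTERED INDEX IX_DemoOrders_CustomerID_OrderDate
    ON dbo.DemoOrders (CustomerID, OrderDate)
    INCLUDE (TotalAmount);

-- The frequently executed query that the index is meant to cover.
SELECT OrderDate, TotalAmount
FROM dbo.DemoOrders
WHERE CustomerID = 42
  AND OrderDate >= '2024-01-01';
```

Every column the query touches is present in the index, so the query can be answered from the index alone without going back to the base table. Whether the optimizer actually chooses it depends on the data and statistics, so check the execution plan.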

As a SQL Server developer, it is recommended to write data insertion and modification queries that insert, update, or delete as many rows as possible in the same statement, rather than writing multiple statements. This helps reduce the index overhead on data modification, as all the changes made to the table are applied to the SQL index in one shot when executed as a single statement.
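The difference between the two styles can be sketched as follows. The table and values are hypothetical, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later. How much the set-based form saves depends on the workload, so treat this as an illustration of the pattern rather than a measurement.

```sql
DROP TABLE IF EXISTS dbo.DemoOrders;

CREATE TABLE dbo.DemoOrders
(
    OrderID     INT           NOT NULL PRIMARY KEY CLUSTERED,
    CustomerID  INT           NOT NULL,
    TotalAmount DECIMAL(10,2) NOT NULL
);

CREATE NONCLUSTERED INDEX IX_DemoOrders_CustomerID
    ON dbo.DemoOrders (CustomerID);

-- Row by row: three statements, each maintaining both indexes separately.
INSERT INTO dbo.DemoOrders (OrderID, CustomerID, TotalAmount) VALUES (1, 42, 10.00);
INSERT INTO dbo.DemoOrders (OrderID, CustomerID, TotalAmount) VALUES (2, 42, 20.00);
INSERT INTO dbo.DemoOrders (OrderID, CustomerID, TotalAmount) VALUES (3, 43, 30.00);

-- Set-based: one statement inserts the same number of rows in a single operation.
INSERT INTO dbo.DemoOrders (OrderID, CustomerID, TotalAmount)
VALUES (4, 42, 40.00),
       (5, 43, 50.00),
       (6, 44, 60.00);

-- Likewise, one UPDATE over a set of rows instead of a loop of single-row updates.
UPDATE dbo.DemoOrders
SET TotalAmount = TotalAmount * 1.10
WHERE CustomerID = 42;
```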

Columns

After studying the characteristics of the frequently executed query that we need to enhance, and listing the columns that could participate in the index key, we need to consider some points when choosing which columns to add to the index.

The first point is the column characteristics. Not all data types are recommended as an index key. For example, the best candidate data type for a SQL Server index key is an integer column, due to its small size. On the other hand, columns with the text, ntext, image, varchar(max), nvarchar(max), and varbinary(max) data types cannot participate in the index key. However, the varchar(max), nvarchar(max), and varbinary(max) columns can still be added as non-key (included) columns of a non-clustered index. A column with the XML data type can be indexed as a key only through an XML index. In addition, a column with UNIQUE and NOT NULL values is a good candidate for an index key column, as its high selectivity makes the index more useful.

The second point is the location of the column in the query. For instance, the columns used in the WHERE clause, the JOIN predicate, LIKE conditions, and the ORDER BY clause are the best candidate columns to be indexed. In addition, indexing the computed columns and the foreign key columns will enhance the performance of the queries that read from these columns.

An important point to consider after selecting the columns for the index key is the order of the columns within the key, especially when the key consists of multiple columns. Try to place the columns that are used in the query conditions first in the SQL Server index key. Also, try to define the sort direction of each column, ascending or descending, in a way that matches the order used in the ORDER BY clause of your query. In this way, you avoid the high overhead of the SORT operator and enhance the performance of the query.
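The sketch below shows key order and sort direction chosen to match one hypothetical query. The table and index names are illustrative, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later. The filtered column comes first in the key, and the second key column is declared DESC to match the ORDER BY.

```sql
DROP TABLE IF EXISTS dbo.DemoOrders;

CREATE TABLE dbo.DemoOrders
(
    OrderID    INT  NOT NULL PRIMARY KEY CLUSTERED,
    CustomerID INT  NOT NULL,
    OrderDate  DATE NOT NULL
);

-- CustomerID first because the query filters on it;
-- OrderDate DESC because the query sorts newest first.
CREATE NONCLUSTERED INDEX IX_DemoOrders_Customer_DateDesc
    ON dbo.DemoOrders (CustomerID ASC, OrderDate DESC);

-- Within one CustomerID the index rows are already in OrderDate DESC order,
-- so the rows can be returned in the requested order without a separate sort.
SELECT OrderID, OrderDate
FROM dbo.DemoOrders
WHERE CustomerID = 42
ORDER BY OrderDate DESC;
```

To confirm the effect on your own data, compare the execution plans with and without the index and check whether a Sort operator appears.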

Index types

At this step, we have decided that we need to create an index on a specific database table to cover a specific query that is called very frequently, and we know which candidate column(s) to add to the index key. We now need to decide which SQL index type fits the query requirements. In other words, we need to specify whether to create a clustered or non-clustered index, a unique or non-unique index, and a columnstore or rowstore index. All these decisions are made based on the query coverage and enhancement requirements.

As mentioned previously, SQL Server provides different types of special-purpose indexes that we can use to enhance the performance of the queries. For example, try to use a Filtered index on columns that have well-defined data subsets, such as Sparse columns with mostly NULL values.

For more information, see "Working with different SQL Server indexes types".
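A filtered index on a mostly NULL column could be sketched as below. The table, the ShippedDate column, and the index name are hypothetical, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later. The WHERE clause on the index restricts it to the rows that actually carry a value.

```sql
DROP TABLE IF EXISTS dbo.DemoOrders;

CREATE TABLE dbo.DemoOrders
(
    OrderID     INT  NOT NULL PRIMARY KEY CLUSTERED,
    CustomerID  INT  NOT NULL,
    ShippedDate DATE SPARSE NULL   -- mostly NULL in this hypothetical workload
);

-- Only rows with a non-NULL ShippedDate are stored in the index,
-- which keeps it smaller and cheaper to maintain than a full index.
CREATE NONCLUSTERED INDEX IX_DemoOrders_ShippedDate_NotNull
    ON dbo.DemoOrders (ShippedDate)
    WHERE ShippedDate IS NOT NULL;

-- A query whose predicate falls inside the filter can use the index.
SELECT OrderID, ShippedDate
FROM dbo.DemoOrders
WHERE ShippedDate >= '2024-01-01';
```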

It is recommended to start indexing the table by creating a clustered index on the column(s) queried very frequently, which converts the table from a heap into a sorted clustered table, and then to create the non-clustered indexes required to cover the remaining queries in the system. In this way, the non-clustered indexes are built over the SQL Server clustered index, and the pointers in their leaf-level nodes point to the location of the row in the sorted clustered index.

If the clustered index is created on a table where non-clustered indexes already exist, all the non-clustered indexes will be rebuilt so that the pointers in their leaf-level nodes, which were pointing to the heap, point to the newly created clustered index. Creating the indexes in the correct order therefore avoids the overhead of rebuilding the non-clustered indexes.

Index storage

By default, when a SQL Server index is created, it is stored in the same filegroup as the underlying table. A non-clustered index, or a partitioned clustered index, can be stored either on the same filegroup as the main table or on a different filegroup.

Selecting the proper storage for the index during the design phase helps improve query performance by increasing I/O performance. For instance, creating the non-clustered index on a filegroup located on a different disk drive than the one holding the main table can improve the performance of the queries that use that non-clustered index, because the data pages and the SQL index pages can then be read concurrently from different disk drives instead of competing for the same one.
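Placing a non-clustered index on its own filegroup can be sketched as follows. The filegroup name FG_Indexes is an assumption: the script expects that a filegroup with this name already exists in the current database and has at least one data file, otherwise the CREATE INDEX statement fails. The table and index names are also hypothetical, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later.

```sql
DROP TABLE IF EXISTS dbo.DemoOrders;

-- The table and its clustered index go to the database's default filegroup.
CREATE TABLE dbo.DemoOrders
(
    OrderID    INT NOT NULL PRIMARY KEY CLUSTERED,
    CustomerID INT NOT NULL
);

-- ASSUMPTION: filegroup FG_Indexes already exists, ideally on a separate drive.
CREATE NONCLUSTERED INDEX IX_DemoOrders_CustomerID
    ON dbo.DemoOrders (CustomerID)
    ON [FG_Indexes];

-- Show which filegroup (data space) each index of the table lives on.
SELECT i.name  AS index_name,
       ds.name AS data_space_name
FROM sys.indexes AS i
JOIN sys.data_spaces AS ds
  ON ds.data_space_id = i.data_space_id
WHERE i.object_id = OBJECT_ID(N'dbo.DemoOrders');
```

Whether this helps in practice depends on the storage layout: the benefit described in the text comes from the filegroups being on physically separate drives.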

In addition, the clustered and non-clustered indexes that are created over large tables can be partitioned across multiple filegroups, with each filegroup stored on a separate disk drive. This improves concurrent data access and retrieval, because the data within the SQL index is distributed over different disk drives and the Query Optimizer processes only the partitions that the query needs to access, excluding all other partitions.

Another important storage concept to consider is the FILLFACTOR option, which can be defined when creating or rebuilding an index. Its value, between 0 and 100, specifies the percentage of space that will be filled on each leaf-level page of the index (0, the default, is treated the same as 100). For example, setting the FILLFACTOR value to 80 leaves 20% of each leaf page empty during the SQL Server index creation or rebuild process. This free space helps when new data is inserted, or when existing data is modified and no longer fits in its current space: the data is placed in the free space instead of splitting the current page into multiple pages, which would cause index fragmentation that degrades the index performance over time. A well-chosen fill factor therefore reduces page splits and index maintenance overhead, while a value that is too low wastes space and increases the number of pages that each read has to touch.
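Setting the option at creation time and again at rebuild time can be sketched as below. The table and index names are hypothetical, the value 80 simply mirrors the example in the text rather than a recommendation, and DROP TABLE IF EXISTS assumes SQL Server 2016 or later.

```sql
DROP TABLE IF EXISTS dbo.DemoOrders;

CREATE TABLE dbo.DemoOrders
(
    OrderID    INT NOT NULL PRIMARY KEY CLUSTERED,
    CustomerID INT NOT NULL
);

-- Fill each leaf page to 80 percent when the index is built.
CREATE NONCLUSTERED INDEX IX_DemoOrders_CustomerID
    ON dbo.DemoOrders (CustomerID)
    WITH (FILLFACTOR = 80);

-- The fill factor is only applied when the index is created or rebuilt,
-- so a rebuild is what restores the free space after pages have filled up.
ALTER INDEX IX_DemoOrders_CustomerID
    ON dbo.DemoOrders
    REBUILD WITH (FILLFACTOR = 80);

-- Check the fill factor stored for each index (0 means the default, full pages).
SELECT name, fill_factor
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.DemoOrders');
```

A lower value suits indexes whose keys receive inserts in the middle of the key range, while an ever-increasing key such as an identity column usually has little need for free space on its leaf pages.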

It is also recommended to create narrow indexes with the fewest possible useful columns, rather than wide indexes with many unnecessary columns, as narrow indexes require less disk space and have less SQL Server index maintenance overhead.

All the previously mentioned points will help in designing an optimal index that enhances the performance of the T-SQL queries. However, it is very important to test the index on a development environment before creating it on the production environment, to make sure that it is useful for your workload, and to keep monitoring and maintaining it once it is created on production.

Translated from: https://www.sqlshack.com/top-five-considerations-for-sql-server-index-design/
