Inserts and Updates with CTEs in SQL Server (Common Table Expressions)


In CTEs in SQL Server; Querying Common Table Expressions, the first article of this series, we looked at creating common table expressions for select statements to help us organize data. This can be useful in aggregates, partition-based selections from within data, or calculations where ordering data within groups can help us. We also saw that we weren’t required to explicitly create a table and insert data, but we did have to ensure that each column had a name and that the names were unique. Now, we’ll use our select statements for inserts and updates.


Development styles with Inserts, Updates and Deletes

Outside of environments that use all three SQL CRUD operations (inserts, updates and deletes), there are two predominant development styles with these write operations that are useful to know when we consider common table expressions and write operations:


Remove everything and reload: in SQL Server, this can be achieved with a truncate followed by an insert. If designed in a horizontally scaled manner, this can provide the fastest route for new data with only a small amount of updated data (or no updated data).
Never delete, only add and update: this is the soft delete or soft transaction approach, where records aren’t removed but updated to be inactive. This removes the delete operation, but can add storage and performance costs since data are never removed.

These are the other popular combinations along with environments that use all three write CRUD operations. What’s important to consider relates to how inserts, updates and deletes fundamentally function and what this means for performance:


Inserts add data from a data set, whether that data set is a file, table, variable, hard-coded value, or other data. This can be all the data from that data set or a subset of the data. Relative to design (new records versus adding records between existing records), inserts can be a light write operation.
Updates change existing data either in full or in part from other information, whether a data source or a data variable. Relative to the space with existing data and the update performed, this can be costly in fragmentation, in the data source that is used as a reference, etc. Common table expressions may reduce our likelihood of having to reverse an update, which can be very costly in some cases.
Deletes remove existing partial data from sets. I will assume here that anyone wanting to remove all data from a table will use a truncate to reduce logging (though there may be reasons truncate is avoided). Deletes therefore inherently use a select in that they remove partial data, and the removal may create extra space among existing records in storage, along with how deletes mark records in the transaction log.

The reason these points are important is that we can optimize write operations for the best performance, but we can’t out-optimize their inherent design. If we have to remove 100 records from a table and doing so will cause fragmentation because of how our records are organized, no common table expression or subquery will subvert the minimum cost required by the transaction. We can use this tool to help us reduce the cost to as close to the minimum as possible, but each write operation will come with costs.


Inserts with SQL CTEs

Generally, many insert transactions do not require significant complexity outside of transformations or validation. For this reason, I will rarely use any common table expression, subquery or temp table structure with insert transactions. If I use any of these three tools with inserts, the query almost always meets the following criteria:


The insert requires organization of data on top of a new or added structure. An example of this would be a query that has data partitioned by year that then needs the totals and average for the year. The partitioned data would be the new structure on top of the data, and the aggregates would be the organization on top of that new structure.
The insert comes from a query that involves comparing data sets or comparing values, where the organization of the values occurs before the comparison. An example of this would be a join of two tables by a value that must be derived from a query, such as getting the year from a date field to join tables.

The above scenarios tend to be more common in data warehouse (OLAP) environments and like with other transactions, we have alternatives that may be more appropriate. For an example of an insert with common table expressions, in the below query, we see an insert occur to the table, reportOldestAlmondAverages, with the table being created through the select statement (and dropped before if it exists).


IF OBJECT_ID('reportOldestAlmondAverages') IS NOT NULL
BEGIN
    DROP TABLE reportOldestAlmondAverages
END

;WITH GroupAlmondDates AS(
    SELECT YEAR(AlmondDate) AlmondYear, AlmondDate, AlmondValue
    FROM tbAlmondData
    WHERE AlmondDate < '1990-12-31'
), GetAverageByYear AS(
    SELECT AlmondYear, AVG(AlmondValue) AvgAlmondValueForYear
    FROM GroupAlmondDates
    GROUP BY AlmondYear
)
SELECT t.AlmondDate, tt.AvgAlmondValueForYear AnnualAvg, (t.AlmondValue - tt.AvgAlmondValueForYear) ValueDiff
INTO reportOldestAlmondAverages
FROM GroupAlmondDates t
    INNER JOIN GetAverageByYear tt ON t.AlmondYear = tt.AlmondYear

SELECT * FROM reportOldestAlmondAverages

*Our created report table from the two CTEs joined.*

The CTE in SQL Server offers us one way to solve the above query – reporting on the annual average and value difference from this average for the first three years of our data. We take the least amount of data we’ll need to use in our first common table expression, then get the average in our next, and join these together to return our report.


The above insert statement also illustrates a development technique that we should apply to all data operations: filter as early as possible and use as little data as required. We don’t want aggregates being run against a full table if we only want to run an aggregate for a small timeframe. While SQL CTEs can make development easy, there is a tendency to get everything early, then filter later (this is common with other data operations too). The better development technique is to filter as strictly as possible early so that we return the fewest data points we need, dropping both unnecessary rows and unnecessary columns. This becomes especially true if we migrate data to another server and our query is involved in a linked server query.


As with other transactions, including select statements, the data from the wrapped query inside the parentheses is inserted, meaning that if the wrapped query returns 100 records, 100 records will be inserted unless a WHERE clause excludes them (the actual columns are determined by what is selected).
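
As a minimal sketch of the same idea against an existing table (rather than SELECT ... INTO), a CTE can also feed an INSERT ... SELECT, and a WHERE on the outer select narrows what is inserted. The target table reportAlmondYearlyAverages, its column types, and the outer filter are hypothetical and introduced only for illustration; tbAlmondData, AlmondDate and AlmondValue come from the example above, and AlmondValue is assumed to be numeric.

```sql
-- Hypothetical target table, assumed for this sketch only
CREATE TABLE reportAlmondYearlyAverages(
    AlmondYear INT,
    AvgAlmondValueForYear DECIMAL(13,4)
)

;WITH GetAverageByYear AS(
    -- Filter early: only rows before the cutoff reach the aggregate
    SELECT YEAR(AlmondDate) AlmondYear, AVG(AlmondValue) AvgAlmondValueForYear
    FROM tbAlmondData
    WHERE AlmondDate < '1990-12-31'
    GROUP BY YEAR(AlmondDate)
)
INSERT INTO reportAlmondYearlyAverages (AlmondYear, AvgAlmondValueForYear)
SELECT AlmondYear, AvgAlmondValueForYear
FROM GetAverageByYear
WHERE AvgAlmondValueForYear IS NOT NULL  -- rows excluded here are not inserted
```

Only the rows that survive both the CTE's filter and the outer WHERE reach the target table, and only the selected columns are written.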


Updates with SQL CTEs

We can use common table expressions to update data in a table, and this becomes very intuitive when we do updates with JOINs. Similar to other operations, we will use a wrapped select for the data we want to update, and the transaction will only run against the records that are a part of the select statement. We’ll first look at a simple update, then look at the ease of doing a joined update.


In the below example, we first add a column to our table that allows 9 varchar characters, and we use a SQL CTE to update all the records in our table to a blank value (the new column starts out as null for existing records). Following what we’ve learned in inserts and selects, we only select what we want to update and nothing more: we always want to get in the practice of returning the least amount of data we need (both for performance and security). Once we add our column and update our records to blank, we can use the wrapped query inside the common table expression to check our blank values.


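-- Run the ALTER TABLE as its own batch first; SQL Server compiles a whole batch
-- before running it, so the UPDATE cannot see the new column in the same batch.
ALTER TABLE tbAlmondData ADD Timeframe VARCHAR(9)

;WITH UpdateAll AS(
    SELECT Timeframe
    FROM tbAlmondData
)
UPDATE UpdateAll
SET Timeframe = ''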

We can run a validation after we run the update by highlighting the query inside the SQL CTE.


If I had specified top 10 or had added a where clause for only 10 values, the update would have only run against those 10 values. This becomes incredibly useful to limit the scope of updates with our select statement inside the SQL CTE specifying the exact records to update.
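
A hedged sketch of that scoping: the CTE below returns at most ten rows, so the update touches at most ten rows. The CTE name UpdateSubset, the IS NULL filter and the ordering by AlmondDate are assumptions chosen for illustration, not part of the original example.

```sql
-- Only rows returned by the CTE's select are updated
;WITH UpdateSubset AS(
    SELECT TOP (10) Timeframe
    FROM tbAlmondData
    WHERE Timeframe IS NULL   -- assumed filter: rows not yet given a value
    ORDER BY AlmondDate       -- makes "which ten" deterministic
)
UPDATE UpdateSubset
SET Timeframe = ''
```

Highlighting and running just the inner select first shows exactly which rows the update would change.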


Next, we’ll create a quarter table, insert four records into it, and use it in an update CTE in SQL Server with a join. For our update, we’ll join our tbAlmondData to our newly created QuarterTable on the quarter part of the AlmondDate (we could run this update by using the DATEPART function alone, but this example will also show how we can use a join statement to make updating easy with SQL CTEs). We want our new timeframe column to hold a value in the form QN YYYY, such as Q1 1989. For our select statement inside the common table expression, we’ll select our Timeframe column (which will need to be updated) as well as the varchar combination of QuarterValue and the year of our AlmondDate column cast as a varchar of size four. We can check how the existing Timeframe column and the NewTimeframe column look before we run the update.


CREATE TABLE QuarterTable(
    QuarterId TINYINT IDENTITY(1,1),
    QuarterValue VARCHAR(2)
)

INSERT INTO QuarterTable
VALUES ('Q1'), ('Q2'), ('Q3'), ('Q4')

;WITH UpdateTimeframe AS(
    SELECT
        t.Timeframe
        , tt.QuarterValue + ' ' + CAST(YEAR(AlmondDate) AS VARCHAR(4)) NewTimeframe
    FROM tbAlmondData t
        INNER JOIN QuarterTable tt ON tt.QuarterId = DATEPART(QUARTER,t.AlmondDate)
)
UPDATE UpdateTimeframe
SET Timeframe = NewTimeframe

*Using the joined query, we update our timeframe column.*

If we needed to update one column that would be created from three joined tables, we could apply the same logic as in the above query: in the wrapped select statement, join our data so that it returns the existing value alongside the new value we need to update it to, then run our update. Not only does this allow us to run a quick check before we make an update (because we can select and run the wrapped query), it also means we can use the intuitive design of joins when updating data with selects. In a similar manner, by choosing CTE names that capture what we’re doing and using column names that indicate the existing versus the new value, the SQL CTE itself explains the update with little confusion.
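
One way to extend that quick check, sketched here under assumptions: run the joined update inside an explicit transaction, inspect the result, and only then decide whether to keep it. The transaction wrapper and the verification select are additions for illustration; the CTE itself is the one from the example above.

```sql
BEGIN TRANSACTION

;WITH UpdateTimeframe AS(
    SELECT
        t.Timeframe
        , tt.QuarterValue + ' ' + CAST(YEAR(t.AlmondDate) AS VARCHAR(4)) NewTimeframe
    FROM tbAlmondData t
        INNER JOIN QuarterTable tt ON tt.QuarterId = DATEPART(QUARTER, t.AlmondDate)
)
UPDATE UpdateTimeframe
SET Timeframe = NewTimeframe

-- Inspect a sample of the updated rows before deciding
SELECT TOP (20) AlmondDate, Timeframe
FROM tbAlmondData
ORDER BY AlmondDate

-- Keep the change with COMMIT TRANSACTION, or undo it as shown here
ROLLBACK TRANSACTION
```

The open transaction holds locks on the updated rows until it is committed or rolled back, so this kind of check is best kept short on a busy table.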


Conclusion

We see that we can quickly create insert and update statements with common table expressions and organize our data easily. We can combine these with other development techniques, such as temp tables or transaction-based queries, to simplify our troubleshooting if we experience issues. As with large select statements, SQL CTEs may have drawbacks if we stack too many of them on each other, as we won’t have the convenient ability to query the wrapped data. In addition, we may still find situations where we don’t want to use them because they don’t offer the best performance.


Table of contents

CTEs in SQL Server; Querying Common Table Expressions

Inserts and Updates with CTEs in SQL Server (Common Table Expressions)

CTE SQL Deletes; Considerations when Deleting Data with Common Table Expressions in SQL Server

CTEs in SQL Server; Using Common Table Expressions To Solve Rebasing an Identifier Column


Translated from: https://www.sqlshack.com/inserts-and-updates-with-ctes-in-sql-server-common-table-expressions/
