SQL Server Database Migration: Cloning a Database to Another Collation

Database migration is a vital task in any environment, complex or otherwise. Seamless migrations are the goal, but the effort required to ensure them is tremendous.

Backing up and restoring the database is surely a preferred and a robust approach, but does it work well in all situations? How do we plan this when the source and the destination databases need different configurations? How do we make such a migration seamless?

I recently worked on a database migration project that required the case-sensitive collation “SQL_Latin1_General_CP1_CS_AS”, but the source database used the case-insensitive collation “SQL_Latin1_General_CP1_CI_AS”.

It might seem fairly straightforward to restore the database and then change its collation to the required one. But when I dug into the database, I found that many columns were referenced by foreign keys and that many were part of a clustered index. Just as you cannot change the collation of a column that is referenced by a foreign key, you cannot change the collation of a column that is part of the clustered index, which here is the primary key column of the table.

A global SQL Server Collation Sequence update is the answer to the puzzle, but it is a multi-step operation. The first step is to identify the case sensitivity of the collation assigned to the repository database. If you find that the repository database is not case sensitive, you must create a new empty (blank) case-sensitive database to migrate the data to. You must then import the data from the old case-insensitive repository database to the new empty case-sensitive database, and then establish the new case-sensitive database as the repository database.

Background

The collation setting defines the sorting rules, case sensitivity, and accent sensitivity of character data types such as char and varchar. Collations apply not only to data stored in tables, but also to all text used in SQL Server, including variable names, metadata, and temporary tables. If the collation is case sensitive, uppercase letters are treated differently from lowercase letters. A collation can also be accent sensitive, in which case accented and unaccented letters are treated as different characters and sort in a specific order.
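To make the effect of case sensitivity concrete, here is a minimal sketch that compares the same two string literals under the case-insensitive and the case-sensitive collation discussed in this article. The literals are arbitrary and chosen only for illustration.

```sql
-- Compare two literals that differ only in case, once per collation.
-- The COLLATE clause forces the comparison to use the named collation.
SELECT
    CASE WHEN 'SQLShack' = 'sqlshack' COLLATE SQL_Latin1_General_CP1_CI_AS
         THEN 'equal' ELSE 'different' END AS CI_Comparison,
    CASE WHEN 'SQLShack' = 'sqlshack' COLLATE SQL_Latin1_General_CP1_CS_AS
         THEN 'equal' ELSE 'different' END AS CS_Comparison;
```

The first column should report the strings as equal and the second as different, which is the behavior change an application sees after moving from a CI to a CS database.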

By default, the server collation applies to the entire SQL Server instance. When we install SQL Server, it picks the default collation from the Windows system locale. If the system language is set to English (US), the default collation for SQL Server is “SQL_Latin1_General_CP1_CI_AS”.

Collation Names

SQL Server maintains a list of pre-defined collations that conform to the following pattern:

SQL_<SortRules><[_Pref]>_<CPCodepage>_<CaseSensitivity>_<AccentSensitivity>

Each field within the collation name is separated by an underscore character (_). Values in these fields display characteristics of the collation and its sequence.

The SortRules and CPCodepage fields identify the name of the language/locale or alphabet that the collation was designed to support and the character numbering rule used when sorting terms. The CaseSensitivity and AccentSensitivity fields identify additional sorting rules to use when two letters within that language/locale or alphabet do not share the same case or accent.

To identify the case sensitivity of a selected collation, review the fifth term (CaseSensitivity) in the collation name. This term can contain one of the following values:

CI: Case Insensitive
CS: Case Sensitive

To create a case-sensitive repository database, select the collation that supports the language/locale or alphabet used by your repository database and include the value CS in the fifth term of the collation name.
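As a rough aid for picking such a collation, the following sketch filters the built-in collation list down to the SQL Latin1 General collations whose name contains the CS term. The name pattern is an assumption that fits the English (US) scenario in this article; adjust it for other languages or locales.

```sql
-- List case-sensitive SQL_Latin1_General collations.
-- [_] matches a literal underscore; a bare _ would match any single character.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE 'SQL[_]Latin1[_]General[_]%[_]CS[_]%'
ORDER BY name;
```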

How to Identify the Collation Setting and Its Various Levels

Oftentimes, we’re stuck at a point where we’re unable to find what collation setting is applied to a certain object. Let us look at some of the useful queries that will help you find the collation and its levels.

First of all, you need to know that collation can be set at four levels:

Server-level collations

Set during the installation of SQL Server
Default collation for all the databases, both system and user-defined databases

If you’re uncertain what collation has been assigned to a certain SQL Server instance, use the SERVERPROPERTY system function to find that out:

SELECT CONVERT (varchar(128), SERVERPROPERTY('collation')) SQLServerCollation;

Review the fifth term of the collation name returned by the query. If the fifth term is CI, the server collation is case-insensitive.

Database-level collations

It can be set when the database is created, or changed later, using the COLLATE clause of the CREATE DATABASE or ALTER DATABASE statement
If we don’t explicitly specify a collation, a newly created database inherits its collation from the instance.


SELECT name, collation_name FROM sys.databases;
--OR
SELECT CONVERT (varchar(128), DATABASEPROPERTYEX('SQLShack','collation'));

In this case the fifth term is CS, which means that the database collation is case-sensitive.
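The sketch below illustrates why changing the database-level collation alone does not solve the migration problem: ALTER DATABASE ... COLLATE changes the default for new columns, while existing columns keep the collation they were created with. The database name CollationDemo and the table dbo.Demo are hypothetical and exist only for this illustration; run it on a test instance.

```sql
-- Hypothetical scratch database, created case-insensitive.
CREATE DATABASE CollationDemo COLLATE SQL_Latin1_General_CP1_CI_AS;
GO
USE CollationDemo;
GO
-- The column inherits the database collation (CI) at creation time.
CREATE TABLE dbo.Demo (code varchar(10) NOT NULL);
GO
USE master;
GO
-- Change the database default to case-sensitive (needs exclusive access).
ALTER DATABASE CollationDemo COLLATE SQL_Latin1_General_CP1_CS_AS;
GO
-- Database default is now CS ...
SELECT DATABASEPROPERTYEX('CollationDemo', 'Collation') AS DatabaseCollation;
-- ... but the existing column still reports its original collation.
SELECT name, collation_name
FROM CollationDemo.sys.columns
WHERE object_id = OBJECT_ID('CollationDemo.dbo.Demo');
GO
-- Clean up the scratch database.
DROP DATABASE CollationDemo;
```

This mismatch between the database default and existing columns is the reason the rest of the article moves the data into a freshly created case-sensitive database.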

Column-level collation

It can be set when the table is created, or altered later, using the COLLATE clause
A column is a child object of the database, so the collation is inherited from the database, unless explicitly changed.


SELECT name, collation_name FROM sys.columns
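The query above returns every column in the database, including system objects and non-character columns. As a possible refinement, the following sketch limits the output to user-table columns whose collation differs from the current database default; the column aliases are illustrative.

```sql
-- Character columns in user tables whose collation differs from the
-- database default. Non-character columns have a NULL collation_name.
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS SchemaName,
       OBJECT_NAME(c.object_id)        AS TableName,
       c.name                          AS ColumnName,
       c.collation_name
FROM sys.columns AS c
INNER JOIN sys.tables AS t ON t.object_id = c.object_id
WHERE c.collation_name IS NOT NULL
  -- COLLATE DATABASE_DEFAULT on both sides avoids a collation conflict
  -- in the comparison itself.
  AND c.collation_name COLLATE DATABASE_DEFAULT
      <> CONVERT(sysname, DATABASEPROPERTYEX(DB_NAME(), 'Collation')) COLLATE DATABASE_DEFAULT
ORDER BY SchemaName, TableName, c.column_id;
```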

Expression-level collation

Set when a SQL statement is run using the COLLATE clause


SELECT name, collation_name FROM sys.columns order by name collate Latin1_General_100_CI_AS

The sort order is based on the defined collation setting

To view all the collations supported in SQL Server, you can use the following query:


--The sys.fn_helpcollations system table function is used to view all collations supported by SQL Server 2017
SELECT name, description FROM sys.fn_helpcollations();

Preparing a Case-Insensitive Repository Database for Migration

The first step in preparing the case-insensitive repository database for transfer to a case-sensitive database is to run the command DBCC CHECKDB WITH DATA_PURITY. This command identifies any issues you must address before transferring the data.

In the Object Explorer pane, right-click the repository database object, and then click New Query.
Type the following statement in the SQL Query pane.
```
USE WF_REPP
GO
DBCC CHECKDB WITH DATA_PURITY
```
Click Execute.

In case of any errors, troubleshoot and fix the problems and run all of the above steps again. In this case, all of the statistics are set to zero (0), which means that there are no issues.
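On a large database, the informational messages can bury the lines that matter. One possible variation, sketched here with the database name from the statement above, names the database explicitly and suppresses informational output so that only errors are reported.

```sql
-- Same check as above, without switching database context first.
-- NO_INFOMSGS hides informational messages; ALL_ERRORMSGS lists every error found.
DBCC CHECKDB (N'WF_REPP') WITH DATA_PURITY, NO_INFOMSGS, ALL_ERRORMSGS;
```

With these options, a clean database should produce no result messages beyond the completion notice.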

Create the Database Backup

The second step is to create a backup copy of the case-insensitive version of the repository database using SQL Server database backup. We can use this backup to recover the data if issues occur during or after the transfer from the case-insensitive database to the case-sensitive database.
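A minimal T-SQL sketch of this backup step follows. The database name is taken from the DBCC statement earlier in the article, and the file path is a placeholder, not something the article specifies; point it at a folder the SQL Server service account can write to.

```sql
-- Full, copy-only backup of the case-insensitive repository database.
-- COPY_ONLY leaves the regular backup chain untouched; CHECKSUM validates pages.
BACKUP DATABASE WF_REPP
TO DISK = N'C:\Backup\WF_REPP_CI.bak'   -- placeholder path
WITH COPY_ONLY, CHECKSUM, INIT, STATS = 10;

-- Confirm that the backup file is readable before relying on it.
RESTORE VERIFYONLY
FROM DISK = N'C:\Backup\WF_REPP_CI.bak'
WITH CHECKSUM;
```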

Creating a Case-Sensitive Database

In the Object Explorer pane, right-click the Databases folder, and then click New Database.
Type the name of your new repository database in the Name field of the New Database dialog box. For example, let’s call the new database WF_REPP.
In the Select a page pane, click Options.
Open the Collation field drop-down list, and click the case-sensitive collation SQL_Latin1_General_CP1_CS_AS.
Click OK.
An icon for the new repository database appears in the Databases folder in the Object Explorer pane.
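If you prefer a script to the dialog box, the same result can be sketched in T-SQL. This assumes default file locations and sizes, which the New Database dialog would otherwise let you adjust.

```sql
-- Create the new repository database with the case-sensitive collation.
CREATE DATABASE WF_REPP COLLATE SQL_Latin1_General_CP1_CS_AS;
GO
-- Confirm the collation that was applied.
SELECT name, collation_name
FROM sys.databases
WHERE name = N'WF_REPP';
```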

Scripting Objects of the Case-Insensitive Repository: Generate and Publish Scripts Wizard

In Object Explorer, expand Databases, right-click a database, point to Tasks, and then click Generate Scripts. Follow the steps in the wizard to script the database objects.
On the Choose Objects page, select the entire database and all the database objects.
Click Next.
On the Set Scripting Options page, select Save to new query window.
To specify advanced scripting options, click the Advanced button. Leave the collation option at its default value, and change the script option to drop and create.
On the Summary page, review your selections. Click Previous to change your selections. Click Next to generate a script of the objects you selected. On the Save or Publish Scripts page, monitor the progress of the script generation.
Fix the code and check related objects such as UDFs and TVFs, views, computed columns, and constraints.
Execute the script on the case-sensitive repository.
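To know what needs reviewing when fixing the code, it can help to inventory the objects that contain T-SQL. The queries below are one way to build that list in the source database; the column aliases are illustrative, and the review itself (for example, checking identifier casing and string comparisons) remains manual.

```sql
-- Views, functions (UDFs/TVFs), procedures, and triggers that carry T-SQL code.
SELECT OBJECT_SCHEMA_NAME(o.object_id) AS SchemaName,
       o.name                          AS ObjectName,
       o.type_desc                     AS ObjectType
FROM sys.objects AS o
INNER JOIN sys.sql_modules AS m ON m.object_id = o.object_id
WHERE o.is_ms_shipped = 0
ORDER BY ObjectType, SchemaName, ObjectName;

-- Computed columns and their expressions.
SELECT OBJECT_SCHEMA_NAME(cc.object_id) AS SchemaName,
       OBJECT_NAME(cc.object_id)        AS TableName,
       cc.name                          AS ColumnName,
       cc.definition
FROM sys.computed_columns AS cc;

-- Check constraints and their expressions.
SELECT OBJECT_SCHEMA_NAME(ck.parent_object_id) AS SchemaName,
       OBJECT_NAME(ck.parent_object_id)        AS TableName,
       ck.name                                 AS ConstraintName,
       ck.definition
FROM sys.check_constraints AS ck;
```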

Identify the Dependency

By now, we know how to fix all the logical objects of the database. Let’s now take a look at fixing the collation settings on the tables. It is important to understand the foreign key relationships between the objects. Use the following query to list the dependent objects whose foreign key columns use the collation SQL_Latin1_General_CP1_CI_AS:


SELECT ROW_NUMBER() OVER (ORDER BY OBJECT_NAME(rkeyid), c1.name) AS ID,
       OBJECT_NAME(rkeyid) AS Parent_Table, OBJECT_NAME(fkeyid) AS Child_Table,
       OBJECT_NAME(constid) AS FKey_Name, c1.name AS FKey_Col, c2.name AS Ref_KeyCol
FROM sys.sysforeignkeys s
INNER JOIN sys.syscolumns c1 ON (s.fkeyid = c1.id AND s.fkey = c1.colid)
INNER JOIN sys.syscolumns c2 ON (s.rkeyid = c2.id AND s.rkey = c2.colid)
WHERE c1.collation = 'SQL_Latin1_General_CP1_CI_AS'
ORDER BY Parent_Table, Child_Table;
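The query above relies on the sys.sysforeignkeys and sys.syscolumns compatibility views. As an alternative, here is a sketch of the same lookup written against the current catalog views; it is meant to return the same relationships, with the column aliases kept as in the original query.

```sql
-- Foreign key columns that use the case-insensitive collation,
-- based on sys.foreign_keys / sys.foreign_key_columns.
SELECT OBJECT_NAME(fk.referenced_object_id) AS Parent_Table,
       OBJECT_NAME(fk.parent_object_id)     AS Child_Table,
       fk.name                              AS FKey_Name,
       pc.name                              AS FKey_Col,
       rc.name                              AS Ref_KeyCol
FROM sys.foreign_keys AS fk
INNER JOIN sys.foreign_key_columns AS fkc
        ON fkc.constraint_object_id = fk.object_id
INNER JOIN sys.columns AS pc                 -- referencing (child) column
        ON pc.object_id = fkc.parent_object_id
       AND pc.column_id = fkc.parent_column_id
INNER JOIN sys.columns AS rc                 -- referenced (parent) column
        ON rc.object_id = fkc.referenced_object_id
       AND rc.column_id = fkc.referenced_column_id
WHERE pc.collation_name = 'SQL_Latin1_General_CP1_CI_AS'
ORDER BY Parent_Table, Child_Table;
```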

Identify Primary Key column dependency


SELECT si.name AS PrimaryKey,
       OBJECT_NAME(ic.object_id) AS TableName,
       COL_NAME(ic.object_id, ic.column_id) AS ColumnName,
       sc.collation_name AS Collation
FROM sys.indexes AS si
INNER JOIN sys.index_columns AS ic ON si.object_id = ic.object_id AND si.index_id = ic.index_id
INNER JOIN sys.columns AS sc ON ic.object_id = sc.object_id AND sc.column_id = ic.column_id
WHERE si.is_primary_key = 1
  AND sc.collation_name = 'SQL_Latin1_General_CP1_CI_AS';

The above information shows that using ALTER COLUMN to modify the collation of these columns will not work until the dependent keys are dropped and re-created. Let’s see a demo that changes the collation of a primary key column:


--Step 1: Create the table with CS (Case-Sensitive) collation
CREATE TABLE [dbo].[SQLShackDemo](
    [customer_id] [varchar](20) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL,
    [customer_name] [varchar](20) NOT NULL,
    [phone_number] [int] NULL,
    [sales_total] [int] NULL,
    CONSTRAINT [PK_CustomerID] PRIMARY KEY CLUSTERED ([customer_id] ASC)
);
--Step 2: Insert the dummy data
INSERT INTO SQLShackDemo VALUES (1,'Prashanth',168082,1000),(2,'Jayaram',68082,1500),(3,'thanVitha',68083,2500);
SELECT * FROM [SQLShackDemo];
--Step 3: Drop the PK constraint
ALTER TABLE [SQLShackDemo] DROP CONSTRAINT [PK_CustomerID];
--Step 4: Alter the column with the new collation setting
--Syntax: ALTER TABLE [<Table>] ALTER COLUMN [<Column>] <DataType(Size)> COLLATE <NewCollationType>
ALTER TABLE [SQLShackDemo] ALTER COLUMN [customer_id] [varchar](20) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL;
--Step 5: Re-add the Primary Key constraint
ALTER TABLE [SQLShackDemo] ADD CONSTRAINT [PK_CustomerID] PRIMARY KEY (customer_id);
SELECT * FROM [SQLShackDemo];
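To confirm that the demo had the intended effect, a quick check like the following can be run afterwards. It assumes the dbo.SQLShackDemo table from the demo still exists.

```sql
-- Show the collation now recorded for each column of the demo table.
-- Non-character columns (the int columns) report NULL.
SELECT c.name AS ColumnName, c.collation_name
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.SQLShackDemo');

-- Confirm that the primary key was re-created on the table.
SELECT name, type_desc, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.SQLShackDemo');
```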

Here’s what we did in the last steps:

Script the drop and re-create statements for the foreign key constraints
Drop all the foreign key constraints
Migrate data using SQL Server Import and Export Wizard
Modify the target database by re-enabling the foreign key constraints
Rebuild the indexes
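Two of the steps above lend themselves to small helper scripts. The sketch below generates the DROP statements for every foreign key in the target database and shows an index rebuild for a single table; dbo.SQLShackDemo stands in for whichever table you are working on. Script the matching re-create statements first (for example with the Generate Scripts wizard), because the generated drops do not preserve the constraint definitions.

```sql
-- Generate one ALTER TABLE ... DROP CONSTRAINT statement per foreign key.
-- Review the output, then run it against the target database.
SELECT 'ALTER TABLE '
     + QUOTENAME(OBJECT_SCHEMA_NAME(fk.parent_object_id)) + '.'
     + QUOTENAME(OBJECT_NAME(fk.parent_object_id))
     + ' DROP CONSTRAINT ' + QUOTENAME(fk.name) + ';' AS DropStatement
FROM sys.foreign_keys AS fk
ORDER BY DropStatement;

-- After the data load and once the constraints are back in place,
-- rebuild all indexes on a table (example table name).
ALTER INDEX ALL ON dbo.SQLShackDemo REBUILD;
```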

Please note that the above steps are an option only when working with very few tables. Doing this on a huge table would be a daunting task, not to mention the significant amount of resources that would be consumed if the data volume is large.

Summary

As administrators, we often test various migration methodologies. This article demonstrates a simple technique for moving data to a new database with a different configuration while maintaining data integrity.

SQL Server supports storing objects that have different collations within a database; we saw how collation can be set at multiple levels.

The execution plan may vary depending on the collation used in the Transact-SQL statement, and the same statement may return different results under different collation settings.

If you want to use a collation other than the default when installing SQL Server, be sure to change the collation on the Collation tab of the wizard’s Server Configuration screen.

You should try your best to get the server collation right when you install SQL Server because changing it after the installation is no small feat. You must take steps such as backing up the data, rebuilding the master database, recreating the user databases and all the objects within and importing the data into the newly created tables.

Fortunately, instead of changing the server collation, you can assign a different collation to your user databases, and you can even assign a specific collation to an individual character column. But, as usual, it is better to keep the inheritance in mind so that things stay simple and manageable.

Translated from: https://www.sqlshack.com/sql-server-db-migration-cloning-a-database-to-another-collation/