Some Replication Architecture Errors and Their Resolutions

Background

From time to time, I’ve run into replication issues in inherited environments that I did not architect. Some of these environments experienced replication errors because of how they were constructed from the beginning. In this tip, we look at some of the basics of replication architecture and then at solving some of these problems. Some of the replication issues I’ve seen are caused by misunderstanding what is and is not possible with replication.

Discussion

In an ideal replication setup, we have a publication server, a distribution server, and a subscriber server (see the image below).

The reason for this design is scale from the beginning, especially if we’re replicating large data sets for analysis: we can scale out publishers if we distribute publications horizontally, and we can reduce the load on an individual distribution database. Companies looking to reduce or cut costs might want to consider using Microsoft Azure SQL for subscriptions to publications. Another helpful way to reduce costs is to consider which version of SQL Server you’re using for the distributor, publisher and subscriber. For example, if your heaviest analysis is on your subscriber, you might want to make sure it’s built to handle that load.

What would happen if I replicated one table to the same subscriber twice in two different publications? In the image below, we see the exact same table with the same data set being replicated from two different publications to the exact same destination table. Let’s assume that in this case it comes from the same database and the same server.

We will see primary key errors in the replication error log because the primary key values are identical and we’re attempting to insert the same rows twice. I’ve seen this issue before where DBAs simply added replication to a target that already existed, and the errors flooding the log tied up the distribution database. This may also occur when a destination table receives data from two identical source tables that are on different servers; in the subscriber folder we will see the sources and can track which publications may be sending duplicate data.
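
One way to confirm this symptom is to read the errors that the replication agents have logged. The sketch below is not from the original article; it assumes it is run on the distributor, that the distribution database uses the default name distribution, and that the duplicate rows surface as error 2627 (primary key violation). Other environments may log a different code.

```sql
-- Assumes the distribution database uses the default name.
USE distribution;
GO

-- Most recent primary key violations logged by replication agents.
-- error_code is stored as text, so the comparison uses a string.
SELECT TOP (50)
    e.time,
    e.error_code,
    e.error_text,
    e.xact_seqno,
    e.command_id
FROM dbo.MSrepl_errors AS e
WHERE e.error_code = N'2627'
ORDER BY e.time DESC;
```

The xact_seqno and command_id values identify the failing command, which can then be inspected with sp_browsereplcmds to see which article and row it targets.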

Similar to the above issue, another error I’ve seen that also involved poor architecture was replicating the same publisher table twice, using different columns in each case, to two different destinations in the same database, yet using the same underlying stored procedures. If we’re going to use stored procedures to replicate data, then we either have to name them so that each stored procedure name is unique (for example, by including the destination database, schema and table name), or consider an alternate way to replicate the data, like direct inserts, updates or deletes. If we don’t, the same named stored procedure will try to apply two different column sets to two different tables, and it can only match one of them. Typically, in these cases we’ll see data failures and skipped records and finally a publication going inactive. When we look at the stored procedure, we may realize that its definition doesn’t match one of the tables.
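
As a rough illustration of the naming idea, the sketch below adds the same source table to two publications and gives each article its own set of procedure names through the @ins_cmd, @upd_cmd and @del_cmd parameters of sp_addarticle. All publication, article, database and table names are hypothetical, both publications are assumed to exist already, and the column filtering for each article is left out to keep the example short.

```sql
-- Run at the Publisher, in the publication database.
-- Every name below is hypothetical; both publications must already exist.

-- Article 1: dbo.Orders delivered to ReportDB.dbo.OrdersSummary.
EXEC sp_addarticle
    @publication       = N'Pub_OrdersSummary',
    @article           = N'Orders_Summary',
    @source_owner      = N'dbo',
    @source_object     = N'Orders',
    @destination_owner = N'dbo',
    @destination_table = N'OrdersSummary',
    @ins_cmd = N'CALL sp_MSins_ReportDB_dbo_OrdersSummary',
    @upd_cmd = N'SCALL sp_MSupd_ReportDB_dbo_OrdersSummary',
    @del_cmd = N'CALL sp_MSdel_ReportDB_dbo_OrdersSummary';

-- Article 2: same source table, different destination, different procedures.
EXEC sp_addarticle
    @publication       = N'Pub_OrdersShipping',
    @article           = N'Orders_Shipping',
    @source_owner      = N'dbo',
    @source_object     = N'Orders',
    @destination_owner = N'dbo',
    @destination_table = N'OrdersShipping',
    @ins_cmd = N'CALL sp_MSins_ReportDB_dbo_OrdersShipping',
    @upd_cmd = N'SCALL sp_MSupd_ReportDB_dbo_OrdersShipping',
    @del_cmd = N'CALL sp_MSdel_ReportDB_dbo_OrdersShipping';

-- Alternative: pass N'SQL' for @ins_cmd, @upd_cmd and @del_cmd to replicate
-- plain INSERT, UPDATE and DELETE statements instead of procedure calls.
```

Because the procedure names carry the destination database, schema and table, neither article can pick up the other's procedure definition.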

I do not like the idea of replicating the same table with different column sets to the same subscriber (though there are exceptions where the two sets together are significantly smaller than the full table), so I’ll generally push back on this and replicate the full table along with two views that each pull part of the table, or use Entity Framework to select the needed columns.
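
A minimal sketch of the full-table-plus-views approach, assuming dbo.Orders is the full replicated table on the subscriber. The table, view and column names are hypothetical.

```sql
-- Run at the Subscriber. dbo.Orders is the full table delivered by
-- replication; all object and column names here are hypothetical.
CREATE VIEW dbo.vOrders_Shipping
AS
SELECT OrderID, CustomerID, ShipDate, ShipAddress
FROM dbo.Orders;
GO

CREATE VIEW dbo.vOrders_Billing
AS
SELECT OrderID, CustomerID, OrderTotal, PaidDate
FROM dbo.Orders;
GO
```

Reports and applications then query the views, and only one article with one set of replication procedures has to be maintained.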

Another issue involves using DELETEs over drop-and-recreate or TRUNCATEs. I do not have any problems with using DELETEs, as there are situations where it is the only way to eliminate data, as inconvenient and slow as it is. DBAs who use DELETEs over the other methods must first understand what overhead they add and must optimize for it, otherwise they will experience issues.

Some useful tips and questions

First, always verify that what you’re planning to replicate doesn’t already exist. This is a very basic step in replication and it helps avoid flooding the error log with a meaningless error that reflects a misunderstanding of replication. In addition, adding replication to the exact same destination from the exact same source is wasteful. This would be like trying to create a database that already exists and getting upset when it fails; if it already exists, use it. The basic query below lists each replicated article with its source and destination, so you can see whether a duplicate article is being replicated to the same database, schema and table on the same server. You would need to expand this to all publisher servers if you want to track across multiple servers:

-- Run in the distribution database.
SELECT a.publisher_db + '.' + a.source_owner + '.' + a.source_object AS SourceObject,
    s.subscriber_db + '.' + COALESCE(a.destination_owner, a.source_owner) + '.' + a.destination_object AS DestinationObject
FROM MSarticles a
INNER JOIN MSsubscriptions s ON a.publisher_id = s.publisher_id AND a.publisher_db = s.publisher_db AND a.publication_id = s.publication_id AND a.article_id = s.article_id
ORDER BY DestinationObject DESC
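
The listing query above returns every article, so duplicates still have to be spotted by eye. A possible refinement, sketched here under the same assumptions (default distribution database name, run on the distributor), keeps only destinations that receive the same source object from more than one publication. The filter on subscriber_id assumes that virtual subscription rows carry a negative subscriber_id.

```sql
USE distribution;
GO

WITH ArticleTargets AS (
    SELECT
        a.publisher_db + '.' + a.source_owner + '.' + a.source_object AS SourceObject,
        s.subscriber_id,
        s.subscriber_db + '.' + COALESCE(a.destination_owner, a.source_owner)
            + '.' + a.destination_object AS DestinationObject,
        a.publication_id
    FROM dbo.MSarticles AS a
    INNER JOIN dbo.MSsubscriptions AS s
        ON  a.publisher_id   = s.publisher_id
        AND a.publisher_db   = s.publisher_db
        AND a.publication_id = s.publication_id
        AND a.article_id     = s.article_id
    WHERE s.subscriber_id >= 0   -- assumption: negative ids are virtual subscriptions
)
-- Keep only source/destination pairs fed by two or more publications.
SELECT
    SourceObject,
    subscriber_id,
    DestinationObject,
    COUNT(DISTINCT publication_id) AS PublicationCount
FROM ArticleTargets
GROUP BY SourceObject, subscriber_id, DestinationObject
HAVING COUNT(DISTINCT publication_id) > 1
ORDER BY DestinationObject;
```

An empty result means no destination table on this distributor is fed twice from the same source object.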

Second, DBAs should question whether replication needs to be added in the first place, as there are situations where other tools exist and are superior. For example, timed loads might be better handled with ETL. In the case of replicating different column sets of the same source table to the same destination database, how much data do the two sets add up to in comparison to the full table, and why is that the choice over replicating the full table with two views, or allowing Entity Framework to select a different column set for the report or application? These architecture questions can save a lot of time when troubleshooting later. If something sounds wrong or redundant, always push back, as most redundancy comes from misunderstanding the problem.

Third, when using DELETEs over other methods, DBAs should do the following:

Optimize the logs for DELETEs, otherwise you will have to stay on top of log growth (log micromanagement). DELETEs also affect the system’s resources; they do have their place, provided that someone stays on top of the effects. In my experience, log micromanagement increases the risk of integrity check failures; I highly advise against it. If using DELETEs, the best practice is to give your log drive at least double the maximum space required for replication.

Do not promise clients or end users quick turnarounds. DELETEs are incredibly slow whereas TRUNCATEs or drop-and-recreates are not. Microsoft even advises using TRUNCATE if all data in the table must be removed (see the references below). If we choose a slow method, we can’t promise a quick turnaround.
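
Where DELETE is the only option, one common way to limit the log impact (a general technique, not something the original article prescribes) is to delete in small batches so that each batch commits on its own. The sketch below assumes a hypothetical table dbo.OrdersStaging with a LoadDate column; the batch size is an arbitrary starting point.

```sql
-- Hypothetical table and filter; tune the batch size for your log and workload.
DECLARE @BatchSize   int = 5000;
DECLARE @RowsDeleted int = 1;

WHILE @RowsDeleted > 0
BEGIN
    -- Each DELETE commits on its own, keeping individual transactions small.
    DELETE TOP (@BatchSize)
    FROM dbo.OrdersStaging
    WHERE LoadDate < '20200101';

    SET @RowsDeleted = @@ROWCOUNT;
END;

-- Report log size and percent used for every database on the instance.
DBCC SQLPERF(LOGSPACE);
```

Note that TRUNCATE TABLE is not allowed on tables published for transactional or merge replication, which is one of the situations where DELETE is the only way to remove the data.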

Final thoughts

Many replication problems occur because of poor architecture, and the same is true of custom-made ETL design. If an environment commits to using replication, I’d suggest thinking about the design, being strict about the objects that are replicated, and building validation steps both before and after replication processes. Without all of this in place, replication won’t be a solution, but a problem.
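
For the validation step after replication, SQL Server ships a built-in check for transactional publications. The sketch below is one possible starting point rather than the author's method; the publication name is hypothetical.

```sql
-- Run at the Publisher, in the publication database.
-- The publication name is hypothetical.
EXEC sp_publication_validation
    @publication   = N'Pub_OrdersSummary',
    @rowcount_only = 1;   -- 1 = compare row counts only
```

The request is picked up by the Distribution Agent, and the outcome is reported in the agent history rather than returned by the procedure itself.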

Translated from: https://www.sqlshack.com/some-replication-architecture-errors-and-their-resolutions/
