介绍 (Introduction)

在上一篇有关introduction to SQL Server business intelligence we covered the general structure of an enterprise business intelligence solution. The tools needed to build these solutions were briefly mentioned. The purpose of this article is to provide you with a deeper understanding into the creation of an ETL (Extract, Transform and Load) dataflow. To do this one needs to use SQL Server Data Tools – Business Intelligence (previously known as BIDS or Business Intelligence Development Studio). In this article we’ll take a look at the basic functionality of SQL Server Data Tools and how to use it to keep your data warehouse up to date. It’s worth noting that there are many different ways to go about building your ETL solution. This article gives sound advice and pointers as to how to approach the problem. SQL Server商业智能简介的文章中,我们介绍了企业商业智能解决方案的一般结构。 简要介绍了构建这些解决方案所需的工具。 本文的目的是使您对ETL(提取,转换和加载)数据流的创建有更深入的了解。 为此,需要使用SQL Server数据工具–商业智能(以前称为BIDS或商业智能开发工作室)。 在本文中,我们将研究SQL Server数据工具的基本功能以及如何使用它来使您的数据仓库保持最新状态。 值得注意的是,构建ETL解决方案有很多不同的方法。 本文提供了有关如何解决该问题的合理建议和指示。

馈送数据仓库 (Feeding the Data Warehouse)

This article presumes that you have already created a de-normalized data warehouse. If you need more information as to how to create a star-schema based database you can read more here

本文假定您已经创建了非规范化数据仓库。 如果您需要有关如何创建基于星形模式的数据库的更多信息,请在此处内容。

Once you have created a de-normalized database model for your data warehouse you will need to fill it with data and schedule regular updates to keep your SQL Server business intelligence data up to date. Depending on your version of SQL Server you will be able to do this with either BIDS or SQL Server Data Tools.

为数据仓库创建非标准化数据库模型后,您将需要为它填充数据并安排定期更新,以使SQL Server商业智能数据保持最新。 根据您SQL Server版本,您将可以使用BIDS或SQL Server数据工具来执行此操作。

This program is an extremely handy ETL and automation tool. You can do anything from executing systems processes and PowerShell scripts, running raw T-SQL and custom C# scripts to sending e-mails or connecting to FTP servers or Web Services. However, the most commonly used tool is the Data Flow Task.

该程序是非常方便的ETL和自动化工具。 从执行系统进程和PowerShell脚本,运行原始T-SQL和自定义C#脚本到发送电子邮件或连接到FTP服务器或Web服务,您可以执行任何操作。 但是,最常用的工具是数据流任务。

跟踪更新 (Keeping track of updates)

A data warehouse does not need to be updated often. As the business intelligence is about analyzing historical data the data of the current day is not critically important. It is also obvious that one should not empty and refill a database every day. It is therefore necessary to keep a log table for the processing history of the data warehouse.

数据仓库不需要经常更新。 由于商业智能是要分析历史数据,因此当前数据并不是至关重要的。 同样很明显,不应每天清空并重新填充数据库。 因此,有必要为数据仓库的处理历史保留一个日志表。


CREATE TABLE [dbo].[ProcessingLog]([ProcessingLogID] [int] IDENTITY(1,1) NOT NULL,[ProcessingTime] [datetime] NOT NULL,[Object] [sysname] NULL
)

连接管理器 (Connection managers)

Allowing for connections between one or many sources and/or one or many destinations, a connection manager is very often the starting point of an ETL package. There are ordinary database connections (ODBC, OLEDB, ADO.NET etc.), connections to HTTP, FTP and SMTP servers and connections to various file types like CSV, flat-file or even Excel. For a simple BI data flow however two OLE DB connections would suffice: one for the data warehouse and another for the OLTP/Production database. You can set them up in the following way.

允许连接一个或多个源和/或一个或多个目的地之间的连接时,连接管理器通常是ETL包的起点。 有普通的数据库连接(ODBC,OLEDB,ADO.NET等),与HTTP,FTP和SMTP服务器的连接以及与各种文件类型(如CSV,平面文件甚至Excel)的连接。 但是,对于简单的BI数据流,两个OLE DB连接就足够了:一个用于数据仓库,另一个用于OLTP /生产数据库。 您可以通过以下方式进行设置。

(Note that this connection is on the local machine and uses Windows Authentication. You can always connect directly to an instance in IP_ADDRESS\INSTANCE format and you may use SQL Server authentication if the instance does not support Active Directory accounts.)

(请注意,此连接位于本地计算机上,并且使用Windows身份验证。您始终可以直接连接到IP_ADDRESS \ INSTANCE格式的实例,如果该实例不支持Active Directory帐户,则可以使用SQL Server身份验证。)

Now that you have your connections you will need to make your first Data Flow Task.

现在您已建立连接,您将需要执行第一个数据流任务。

数据流任务 (The Data Flow Task)

To describe this task very simply; it is a unit of work that links a data source with a destination and provides the possibility to transform the dataflow in a number of ways. Sorting, conversion, merging, counting aggregating are just some of the possible tools that can be used within the Data Flow Task.

简单地描述这个任务; 它是将数据源与目标链接在一起的工作单元,并提供了以多种方式转换数据流的可能性。 排序,转换,合并,计数聚合只是“数据流任务”中可以使用的一些可能工具。

For SQL Server business intelligence solutions it is usually about sourcing data from the OLTP database, converting it from its relational model and inserting it into the star or snowflake model of your data warehouse. A tidy approach, if you’re comfortable with writing T-SQL, is to write SELECT statements that could fill your destination fact and dimension tables based on your OLTP database tables and save them as views in the source database. This would allow you to have a direct mapping from an object in your OLTP database to an object in your data warehouse without having to create complex ETL packages.

对于SQL Server商业智能解决方案,通常是从OLTP数据库中获取数据,从其关系模型转换数据并将其插入数据仓库的星型或雪花型。 如果您习惯于编写T-SQL,那么一种整洁的方法是编写SELECT语句,这些语句可以基于OLTP数据库表填充目标事实表和维表,并将它们另存为源数据库中的视图。 这将使您能够从OLTP数据库中的对象直接映射到数据仓库中的对象,而无需创建复杂的ETL包。

For fact tables you will normally have date/time fields and you will be able to create a simple data flow from your data warehouse view in the OLTP database to the fact table in the data warehouse. You can do this by running an Execute SQL task just before your data task and mapping the result set to the a variable storing the last processing time for that table. You can use the following query for this

对于事实表,通常将具有日期/时间字段,并且您将能够创建从OLTP数据库中的数据仓库视图到数据仓库中的事实表的简单数据流。 为此,您可以在数据任务之前运行Execute SQL任务,然后将结果集映射到一个变量,该变量存储该表的最后处理时间。 您可以为此使用以下查询


SELECT TOP 1 ProcessingTime
FROM ProcessingLog
WHERE object = 'YourFactTable'
ORDER BY ProcessingTime DESC

As you can see above you have added a “?” parameter to the source query. This allows you to add the last processed time to the query. You can do this by clicking on “Parameters…” and setting Parameter0 to you variable.

如上所示,您已经添加了“?” 源查询的参数。 这使您可以将最后处理的时间添加到查询中。 您可以通过单击“ Parameters…”并将参数0设置为变量来执行此操作。

Once you have your source that pulls only the most recent data that is not in your data warehouse you can create your destination connection and map the fields to your fact table fields.

一旦您拥有仅提取不在数据仓库中的最新数据的源,就可以创建目标连接并将字段映射到事实表字段。

Once you have completed the data flow task you must add an Execute SQL task to update the log table to prepare for the next day’s ETL processing:

完成数据流任务后,必须添加执行SQL任务以更新日志表,以为第二天的ETL处理做准备:


INSERT INTO [ProcessingLog] ([ProcessingTime], [Object])SELECTGETDATE(),'FactFinance'

Dimension tables often have to be dealt with differently but two common options involve comparing the source and destination objects and transferring the difference. This can be down by using the Slowly Changing Dimension or an Execute SQL task and a MERGE statement if the two databases are in the same instance.

维度表通常必须以不同的方式处理,但是两个常见的选项涉及比较源对象和目标对象并转移差异。 如果两个数据库在同一实例中,则可以使用“缓慢变化的维”或“执行SQL”任务和MERGE语句来降低此开销。

SQL Server代理 (SQL Server agent)

It may seem strange to mention the SQL Server agent when focusing on business intelligence, however, it is central to keeping the structure up to date. Once you have a functioning ETL package you need to automate it and run it regularly to make sure your data warehouse gets updated.

在专注于商业智能时,提到SQL Server代理似乎有些奇怪,但是,对于保持结构的最新状态至关重要。 具有功能正常的ETL软件包后,您需要对其进行自动化并定期运行以确保您的数据仓库得到更新。

从这里到哪里? (Where to from here?)

Now that you have a running ETL solution that feeds your data warehouse with interesting, business intelligence data, you can take one of two routes. If, for any reason, you cannot use SSAS to build a multidimensional cube you can start building SSRS reports using SQL queries based directly on your new data warehouse system.

既然您已经有了一个运行中的ETL解决方案,可以为数据仓库提供有趣的商业智能数据,那么您可以采用以下两种方法之一。 如果由于某种原因而无法使用SSAS构建多维数据集,则可以直接基于新数据仓库系统使用SQL查询开始构建SSRS报表。

资源资源 (Resources)

Designing the Star Schema Database
SQL Server Integration Services (SSIS) overview
Dimension Tables

设计星型数据库
SQL Server集成服务(SSIS)概述
尺寸表

翻译自: https://www.sqlshack.com/sql-server-business-intelligence-features-sql-server-data-tools-business-intelligence/

SQL Server商业智能功能– SQL Server数据工具–商业智能相关推荐

  1. ssdt_用于商业智能开发SQL Server数据工具(SSDT)的演变

    ssdt By all accounts, the introduction of SQL Server Data Tools (SSDT) in SQL Server 2012 was a wate ...

  2. sql示例_SQL Server Lead功能概述和示例

    sql示例 This article explores the SQL Server Lead function and its usage with various examples. 本文通过各种 ...

  3. sql server 加密_SQL Server机密–第一部分–加密基础知识和SQL Server加密功能

    sql server 加密 介绍 (Intro) We use cryptography every day: on the internet, mobile devices, ATM machine ...

  4. 【数据库和SQL学习笔记】1.SQL语言的功能和特点,数据定义语言和应用,主键和外键

    本专栏是我对数据库系统和SQL语言的学习笔记分享~ 数据库系统软件:SQL Server 2019 Express(免费,初学者使用足够,足够部署小型项目) 操作系统:Windows 10 安装过程略 ...

  5. 经典解读商业智能BI、大数据、数据中台三者关系

    大数据.数据中台都是商业智能BI发展到一定阶段的产物,核心都是围绕数据,数据采集.数据处理能力.算力的提升催生了大数据,数据资产和数据服务催生了数据中台,核心的数仓建模自商业智能BI一脉相承未曾改变, ...

  6. SQL Server商业智能功能–创建简单的OLAP多维数据集

    介绍 (Introduction) 多维多维数据集和事务数据库是两个非常不同的事物. 从经验中,我什至是经验丰富的DBA都完全避免使用多维数据集这一主题,因为对于他们而言,多维数据集实在是一个未知领域 ...

  7. tfs 文件系统部署_使用SQL Server数据工具和使用自定义工作流文件的TFS部署到多个数据库

    tfs 文件系统部署 In the previous blog post : Deployment to several databases using SQL Server Data Tools a ...

  8. 使用SQL Server数据工具进行SQL单元测试

    This article on SQL Unit Testing is the second part on the series about SSDT and database developmen ...

  9. 使用SQL Server数据工具和Visual Studio Online进行连续部署

    In the previous posts 在以前的帖子中 Deployment to several databases using SQL Server Data Tools and TFS us ...

最新文章

  1. Android 5.0学习之ListView升级版RecyclerView
  2. pytorch函数整理
  3. python安装MySQLdb包遇到的坑:EnvironmentError: mysql_config not found
  4. Day.04 selenium的基础
  5. c#.net2005 调用evc4.0生成的dll文件
  6. map(平均平均精度_客户的平均平均精度
  7. c语言格式化知识点(sscanf)
  8. cassandra随机获取数据_Cassandra维护最终一致性 和存储机制 分区策略
  9. 建立高可用性的数据库群集
  10. 使用winRAR脚本bat,需要的参数
  11. 计算机的生产过程,笔记本电脑生产流程介绍 .pptx
  12. stm32之HAL库实现us延时方法
  13. python单机考试答题系统如何制作_在线答题系统是如何创建试卷的?
  14. 德鲁克:卓有成效管理者的5项基本能力
  15. 计算机科学与技术专业成功人士,我校2002级计算机科学与技术专业校友重返母校...
  16. matlab 色温图,什么是色温?色温的定义及色温图
  17. 如何在centos环境下运行.exe文件
  18. 恩格尔注塑机数据采集
  19. 写作小课堂:如何写好工作邮件?
  20. 新笑傲江湖手游如何用电脑玩 新笑傲江湖手游PC电脑版玩法教程

热门文章

  1. rss spring 接口_spring mvc: rss(xml)输出
  2. 记录表类型 oracle,[转]关于oracle的记录类型
  3. graphpad多条不同的曲线_应用Graphpad Prism制作多组ROC曲线图
  4. SQL Server中Text和varchar(max)数据类型区别
  5. java set集合与List集合练习
  6. 对象内存布局 (15)
  7. 用Supermemo背单词达到8000词条
  8. 【24】淘宝sdk——入门实战之左右悬浮模块
  9. TempTable临时表
  10. Js + Css的msn式的popup提示窗口的实现 (转自:月牙儿)