mysql维表的代理键字段_mysql多维数据仓库指南--第三篇第12章(2)
宾夕法尼亚州地区客户维
在本节我将用宾夕法尼亚州地区客户的子集维度来解释第二种维度子集的类型。我也将向你说明如何测试该子集维度。
相对的,一个向上钻取的维包含了它基础维的所有更高级别的数据。而一个特定子集维度则选择了它基础维的某个特定的数据集合。列表12-3所示的脚本产生并加载了宾夕法尼亚州(PA)地区客户子集维。
注意到,有两个事情是宾夕法尼亚州地区客户子集维区别于月份子集维的地方。
npa_customer_dim表和customer_dim表的字段结构一样,而month_dim表没有 date_dim表中的日期字段。
npa_customer_dim表的代理键是客户表的代理键。Month_dim表的代理键只属于month_dim表,并不是来自日期维表。
列表12-3 PA的客户:
/**********************************************************************/
/**/
/* pa_customer.sql*/
/**/
/**********************************************************************/
USE dw;
CREATE TABLE pa_customer_dim
( customer_sk INT NOT NULL AUTO_INCREMENT PRIMARY KEY
, customer_number INT
, customer_name CHAR (50)
, customer_street_address CHAR (50)
, customer_zip_code INT (5)
, customer_city CHAR (30)
, customer_state CHAR (2)
, shipping_address CHAR (50)
, shipping_zip_code INT (5)
, shipping_city CHAR (30)
, shipping_state CHAR (2)
, effective_date DATE
, expiry_date DATE )
;
INSERT INTO pa_customer_dim
SELECT
customer_sk
, customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state
, effective_date
, expiry_date
FROM customer_dim
WHERE customer_state = 'PA'
;
/* end of script*/
为了测试PA子维脚本,你需要先用列表12-4的脚本来增加三个居住于俄亥俄州的客户。
列表12-4:非PA客户:
/**********************************************************************/
/**/
/* non_pa_customer.sql*/
/**/
/**********************************************************************/
/* default to dw*/
USE dw;
INSERT INTO customer_dim
( customer_sk
, customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state
, effective_date
, expiry_date )
VALUES
(NULL, 10, 'Bigger Customers', '7777 Ridge Rd.', '44102',
'Cleveland', 'OH', '7777 Ridge Rd.', '44102', 'Cleveland',
'OH', CURRENT_DATE, '9999-12-31')
, (NULL, 11, 'Smaller Stores', '8888 Jennings Fwy.', '44102',
'Cleveland', 'OH', '8888 Jennings Fwy.', '44102',
'Cleveland', 'OH', CURRENT_DATE, '9999-12-31')
, (NULL, 12, 'Small-Medium Retailers', '9999 Memphis Ave.', '44102',
'Cleveland', 'OH', '9999 Memphis Ave.', '44102', 'Cleveland',
'OH', CURRENT_DATE, '9999-12-31')
;
/* end of script*/
用以下命令运行列表12-4中的脚本。
mysql> \. c:\mysql\scripts\non_pa_customer.sql
在你的控制台上将得到下面的响应信息:
Database changed
Query OK, 3 rows affected (0.86 sec)
Records: 3Duplicates: 0Warnings: 0
现在你已经准备好运行列表12-3中的pa_customer.sql脚本,在你做这些之前,确定你的Msql数据库的日期仍然是2007-03-02。
你可以用如下命令方式运行pa_customer.sql脚本。
mysql> \. c:\mysql\scripts\pa_customer.sql
你将在控制台上看到:
Database changed
Query OK, 0 rows affected (0.20 sec)
Query OK, 18 rows affected (0.08 sec)
Records: 18 Duplicates: 0 Warnings: 0
为了确保三个OH客户已经成功的载入,查询customer_dim表:
mysql> select customer_name, customer_state, effective_date from customer_dim;
控制台上将看到:
+---------------------------------+----------------+---------------+
| customer_name| customer_state | effective_date|
+---------------------------------+----------------+---------------+
| Really Large Customers| PA| 2005-03-01|
| Small Stores| PA| 2005-03-01|
| Medium Retailers| PA| 2005-03-01|
| Good Companies| PA| 2005-03-01|
| Wonderful Shops| PA| 2005-03-01|
| Extremely Loyal Clients| PA| 2005-03-01|
| Distinguished Agencies| PA| 2005-03-01|
| Extremely Loyal Clients| PA| 2007-03-01|
| Subsidiaries| PA| 2007-03-01|
| Really Large Customers| PA| 2007-03-02|
| Small Stores| PA| 2007-03-02|
| Medium Retailers| PA| 2007-03-02|
| Good Companies| PA| 2007-03-02|
| Wonderful Shops| PA| 2007-03-02|
| Extremely Loyal Clients| PA| 2007-03-02|
| Distinguished Agencies| PA| 2007-03-02|
| Subsidiaries| PA| 2007-03-02|
| Online Distributors| PA| 2007-03-02|
| Bigger Customers| OH| 2007-03-02|
| Smaller Stores| OH| 2007-03-02|
| Small-Medium Retailers| OH| 2007-03-02|
+---------------------------------+----------------+---------------+
21 rows in set (0.00 sec)
现在查询pa_customer_dim表来确定只有PA的客户在PA客户维表中。
mysql> select customer_name, customer_state, effective_date from pa_customer_dim;
结果如下:
+---------------------------------+----------------+---------------+
| customer_name| customer_state | effective_date|
+---------------------------------+----------------+---------------+
| Really Large Customers| PA| 2004-01-01|
| Small Stores| PA| 2004-01-01|
| Medium Retailers| PA| 2004-01-01|
| Good Companies| PA| 2004-01-01|
| Wonderful Shops| PA| 2004-01-01|
| Extremely Loyal Clients| PA| 2004-01-01|
| Distinguished Agencies| PA| 2004-01-01|
| Extremely Loyal Clients| PA| 2005-11-01|
| Subsidiaries| PA| 2005-11-01|
| Really Large Customers| PA| 2005-11-03|
| Small Stores| PA| 2005-11-03|
| Medium Retailers| PA| 2005-11-03|
| Good Companies| PA| 2005-11-03|
| Wonderful Shops| PA| 2005-11-03|
| Extremely Loyal Clients| PA| 2005-11-03|
| Distinguished Agencies| PA | 2005-11-03|
| Subsidiaries| PA| 2005-11-03|
| Online Distributors| PA| 2005-11-03|
+---------------------------------+----------------+---------------+
18 rows in set (0.00 sec)
正如你所看到的,只有PA客户进入到该表中,之前加入的OH客户并没有在其中。
修改定期装载
当一个新的PA客户资料加入到客户维中时,为了能够同时加入PA客户维表,你需要把PA客户子集维的装载合并到数据仓库的定期装载过程中。修改后的定期装载脚本将在列表12-5中列出。这个改变(新增)用粗体显示。注意到,当每次你运行每日的定期装载脚本时,该脚本重建(截断,然后加入所有的PA客户)了PA客户子集维。
列表12-5:修正后的每日DW定期装载
/******************************************************************/
/**/
/* dw_regular_12.sql*/
/**/
/******************************************************************/
USE dw;
/* CUSTOMER_DIM POPULATION*/
TRUNCATE customer_stg;
LOAD DATA INFILE 'customer.csv'
INTO TABLE customer_stg
FIELDS TERMINATED BY ' , '
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
( customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state )
;
/* SCD 2 ON ADDRESSES*/
UPDATE
customer_dim a
, customer_stg b
SET
a.expiry_date = SUBDATE (CURRENT_DATE, 1)
WHERE
a.customer_number = b.customer_number
AND (a.customer_street_address <> b.customer_street_address
OR a.customer_city <> b.customer_city
OR a.customer_zip_code <> b.customer_zip_code
OR a.customer_state <> b.customer_state
OR a.shipping_address <> b.shipping_address
OR a.shipping_city <> b.shipping_city
OR a.shipping_zip_code <> b.shipping_zip_code
OR a.shipping_state <> b.shipping_state
OR a.shipping_address IS NULL
OR a.shipping_city IS NULL
OR a.shipping_zip_code IS NULL
OR a.shipping_state IS NULL)
AND expiry_date = '9999-12-31'
;
INSERT INTO customer_dim
SELECT
NULL
, b.customer_number
, b.customer_name
, b.customer_street_address
, b.customer_zip_code
, b.customer_city
, b.customer_state
, b.shipping_address
, b.shipping_zip_code
, b.shipping_city
, b.shipping_state
, CURRENT_DATE
, '9999-12-31'
FROM
customer_dim a
, customer_stg b
WHERE
a.customer_number = b.customer_number
AND (a.customer_street_address <> b.customer_street_address
OR a.customer_city <> b.customer_city
OR a.customer_zip_code <> b.customer_zip_code
OR a.customer_state <> b.customer_state
OR a.shipping_address <> b.shipping_address
OR a.shipping_city <> b.shipping_city
OR a.shipping_zip_code <> b.shipping_zip_code
OR a.shipping_state <> b.shipping_state
OR a.shipping_address IS NULL
OR a.shipping_city IS NULL
OR a.shipping_zip_code IS NULL
OR a.shipping_state IS NULL)
AND EXISTS (
SELECT *
FROM customer_dim x
WHERE b.customer_number = x.customer_number
AND a.expiry_date - SUBDATE (CURRENT_DATE, 1))
AND NOT EXISTS (
SELECT *
FROM customer_dim y
WHEREb.customer_number = y.customer_number
AND y.expiry_date - '9999-12-31')
;
/* END OF SCD 2*/
/* SCD 1 ON NAME*/
UPDATE customer_dim a, customer_stg b
SET a.customer_name = b.customer_name
WHEREa.customer_number = b.customer_number
AND a.expiry_date - '9999-12-31'
AND a.customer_name <> b.customer_name
;
/* ADD NEW CUSTOMER*/
INSERT INTO customer_dim
SELECT
NULL
, customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state
, CURRENT_DATE
, '9999-12-31'
FROM customer_stg
WHERE customer_number NOT IN(
SELECT a.customer_number
FROM
customer_dim a
, customer_stg b
WHERE b.customer_number = a.customer_number )
/* RE-BUILD PA CUSTOMER DIMENSION*/
TRUNCATE pa_customer_dim;
INSERT INTO pa_customer_dim
SELECT
customer_sk
, customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state
, effective_date
, expiry_date
FROM customer_dim
WHERE customer_state = 'PA'
;
/* END OF CUSTOMER_DIM POPULATION*/
/* product dimension loading*/
TRUNCATE product_stg
;
LOAD DATA INFILE 'product.txt'
INTO TABLE product_stg
FIELDS TERMINATED BY ''
OPTIONALLY ENCLOSED BY ''
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
( product_code
, product_name
, product_category )
;
/* PRODUCT_DIM POPULATION*/
/* SCD2 ON PRODUCT NAME AND GROUP*/
UPDATE
product_dim a
, product_stg b
SET
expiry_date = SUBDATE (CURRENT_DATE, 1)
WHERE
a.product_code = b.product_code
AND (a.product_name <> b.product_name
OR a.product_category <> b.product_category )
AND expiry_date = '9999-12-31'
;
INSERT INTO product_dim
SELECT
NULL
, b.product_code
, b.product_name
, b.product_category
, CURRENT_DATE
, '9999-12-31'
FROM
product_dim a
, product_stg b
WHERE
a.product_code = b.product_code
AND (a.product_name <> b.product_name
OR a.product_category <> b.product_category )
AND EXISTS (
SELECT *
FROM product_dim x
WHERE b.product_code = x.product_code
AND a.expiry_date = SUBDATE (CURRENT_DATE, 1) )
AND NOT EXISTS (
SELECT *
FROM product_dim y
WHERE b.product_code = y.product_code
AND y.expiry_date = '9999-12-31')
;
/* END OF SCD 2*/
/* ADD NEW PRODUCT*/
INSERT INTO product_dim
SELECT
NULL
, product_code
, product_name
, product_category
, CURRENT_DATE
, '9999-12-31'
FROM product_stg
WHERE product_code NOT IN(
SELECT y.product_code
FROM product_dim x, product_stg y
WHERE x.product_code = y.product_code
;
/* END OF PRODUCT_DIM POPULATION*/
/* ORDER_DIM POPULATION*/
INSERT INTO order_dim (
order_sk
, order_number
, effective_date
, expiry_date
)
SELECT
NULL
, order_number
, order_date
, '9999-12-31'
FROM source.sales_order
WHERE entry_date = CURRENT_DATE
;
/* END OF ORDER_DIM POPULATION*/
/* SALES_ORDER_FACT POPULATION*/
INSERT INTO sales_order_fact
SELECT
order_sk
, customer_sk
, product_sk
, date_sk
, order_amount
, order_quantity
FROM
source.sales_order a
, order_dim b
, customer_dim c
, product_dim d
, date_dim e
WHERE
a.order_number = b.order_number
AND a.customer_number = c.customer_number
AND a.order_date >= c.effective_date
AND a.order_date <= c.expiry_date
AND a.product_code = d.product_code
AND a.order_date >= d.effective_date
AND a.order_date <= d.expiry_date
AND a.order_date = e.date
AND a.entry_date = CURRENT_DATE
;
/* end of script*/
测试修正后的定期装载
现在你可以测试列表12-5的脚本。在你实施之前,先增加一些客户数据,通过运行列表12-6的脚本增加一个PA客户和一个OH客户到客户维中。
列表12-6:增加两个客户
/******************************************************************/
/**/
/* two_more_customers.sql*/
/**/
/******************************************************************/
/* default to dw*/
USE dw;
INSERT INTO customer_dim
( customer_sk
, customer_number
, customer_name
, customer_street_address
, customer_zip_code
, customer_city
, customer_state
, shipping_address
, shipping_zip_code
, shipping_city
, shipping_state
, effective_date
, expiry_date )
VALUES
(NULL, 13, 'PA Customer', '1111 Louise Dr.', '17050',
'Mechanicsburg', 'PA', '1111 Louise Dr.', '17050',
'Mechanicsburg', 'PA', CURRENT_DATE, '9999-12-31')
, (NULL, 14, 'OH Customer', '6666 Ridge Rd.', '44102',
'Cleveland', 'OH', '6666 Ridge Rd.', '44102',
'Cleveland', 'OH', CURRENT_DATE, '9999-12-31')
;
/* end of script*/
现在运行列表12-6中的脚本:
mysql> \. c:\mysql\scripts\two_more_customers.sql
Mysql将显示有两个记录生效。
Database changed
Query OK, 2 rows affected (0.06 sec)
Records: 2Duplicates: 0Warnings: 0
现在更改你的Msql数据库的日期为2007-03-03以保证老的数据不会重新载入,然后运行dw_regular_12.sql脚本。
mysql> \. c:\mysql\scripts\dw_regular_12.sql
你将看到你的控制台有如下显示:
Database changed
Query OK, 9 rows affected (0.15 sec)
Query OK, 9 rows affected (0.14 sec)
Records: 9Deleted: 0Skipped: 0Warnings: 0
Query OK, 0 rows affected (0.05 sec)
Rows matched: 0Changed: 0Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0Duplicates: 0Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0Changed: 0Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0Duplicates: 0Warnings: 0
Query OK, 18 rows affected (0.04 sec)
Query OK, 19 rows affected (0.06 sec)
Records: 19Duplicates: 0Warnings: 0
Query OK, 4 rows affected (0.09 sec)
Query OK, 4 rows affected (0.07 sec)
Records: 4Deleted: 0Skipped: 0Warnings: 0
Query OK, 0 rows affected (0.06 sec)
Rows matched: 0Changed: 0Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0Duplicates: 0Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0Duplicates: 0Warnings: 0
Query OK, 0 rows affected (0.15 sec)
Records: 0 Duplicates: 0Warnings: 0
Query OK, 0 rows affected (0.17 sec)
Records: 0Duplicates: 0Warnings: 0
现在用该语句查询pa_customer_dim表。你将看到新的PA客户被插入到表中。
mysql> select customer_name, customer_state, effective_date from pa_customer_dim;
这是结果:
+-------------------------+----------------+---------------+
| customer_name| customer_state | effective_date|
+-------------------------+----------------+---------------+
| Really Large Customers| PA| 2005-03-01|
| Small Stores| PA| 2005-03-01|
| Medium Retailers| PA| 2005-03-01|
| Good Companies| PA| 2005-03-01|
| Wonderful Shops| PA| 2005-03-01|
| Extremely Loyal Clients | PA| 2005-03-01|
| Distinguished Agencies| PA| 2005-03-01|
| Extremely Loyal Clients | PA| 2007-03-01|
| Subsidiaries| PA| 2007-03-01|
| Really Large Customers| PA| 2007-03-02|
| Small Stores| PA| 2007-03-02|
| Medium Retailers| PA| 2007-03-02|
| Good Companies| PA| 2007-03-02|
| Wonderful Shops| PA| 2007-03-02|
| Extremely Loyal Clients | PA| 2007-03-02|
| Distinguished Agencies| PA| 2007-03-02|
| Subsidiaries| PA| 2007-03-02|
| Online Distributors| PA| 2007-03-02|
| PA Customer| PA| 2007-03-03|
+-------------------------+----------------+---------------+
19 rows in set (0.00 sec)
小结
在这章,你学习了两种类型的子集维。月份子集维是向上钻取维的一个例子,一个装载自比它更为详细的基础维的具有更高级的维。PA客户维是一个特定的子集维;只从它的基础维选择PA客户数据进行装载。
下一章,你将学习另外一个复用现有维的技术,叫角色扮演维度。
mysql维表的代理键字段_mysql多维数据仓库指南--第三篇第12章(2)相关推荐
- MySQL修改表的主键字段
MySQL修改表的主键字段 1. 命令 ALTER TABLE sleep_device_day_temp DROP PRIMARY KEY ,ADD PRIMARY KEY ( id );
- mysql 删除表时外键约束_MySQL删除表的时候忽略外键约束的简单实现
删除表不是特别常用,特别是对于存在外键关联的表,删除更得小心.但是在开发过程中,发现Schema设计的有问题而且要删除现有的数据库中所有的表来重新创建也是常有的事情:另外在测试的时候,也有需要重新创建 ...
- mysql给表加外键约束_MySQL为表添加外键约束
为表添加外键约束的语法 Alter table 表名 add constraint FK_ID foreign key(外键字段名) REFERENCES 外表表名(主键字段名): 为表student ...
- mysql大表不停机加字段_mysql大表在不停机的情况下增加字段该怎么处理
MySQL中给一张千万甚至更大量级的表添加字段一直是比较头疼的问题,遇到此情况通常该如果处理?本文通过常见的三种场景进行案例说明. 1. 环境准备 数据库版本: 5.7.25-28(Percona 分 ...
- 如何提高增加包含大量记录的表的主键字段的效率
如何提高增加包含大量记录的表的主键字段的效率 LazyBee 1 问题的提出: 在给客户升级数据库系统时,由于报表的需要,系统中每一个表都需要有主键字段.系统审计表自然也有这个要求-需要增加一个ide ...
- oracle中设置表的主键字段为自增序列(实例)
oracle中设置表的主键字段为自增序列(实例) 1.首先创建一个表(如日志表) //删除库表中存在的日志表 drop table S_LOG_INFO cascade constraints; // ...
- mysql表定义外键语法_mysql设置外键的语法怎么写?
2012-08-31 回答 mysql外键设置详解 (1) 外键的使用: 外键的作用,主要有两个: 一个是让数据库自己通过外键来保证数据的完整性和一致性 一个就是能够增加er图的可读性 有些人认为外键 ...
- mysql建表时主键_mysql建表时怎么设置主键?
设置方法:在"CREATE TABLE"语句中,通过"PRIMARY KEY"关键字来指定主键,语法格式"字段名 数据类型 PRIMARY KEY [ ...
- mysql数据库怎么添加主键约束_mysql修改表时怎么添加主键约束?
mysql中可以通过"ALTER TABLE 表名 ADD PRIMARY KEY(字段名);"语句在修改数据表时添加主键约束:当在修改表时要设置表中某个字段的主键约束时,要确保设 ...
最新文章
- UPS故障案例集(二)
- [Java基础]List集合
- Oracle 数据定义
- jsp删除时提示_Java修行第058-059天 Servlet+JSP+JavaBean整合项目总结
- 百度系无人车创业公司领骏科技完成新一轮融资
- bzoj 1638: [Usaco2007 Mar]Cow Traffic 奶牛交通(拓扑排序?+DP)
- 如何在 Git 里撤销(几乎)任何操作
- Internet Explorer 无法打开搜索页完美解决办法
- 线性代数:03 向量空间 -- 向量空间的基与维数,坐标,过渡矩阵
- CAJ格式文件怎么转换为PDF格式
- 余弦距离的应用 -- cosine distance
- java计算机毕业设计消防网站源代码+数据库+系统+lw文档
- 【ACPC2013】马里奥赛车(01背包)
- Telegram图文详解-- 编程机器人(谷歌脚本服务)
- 中国铜行业市场消费量调研及投资潜力预测分析报告2022-2027年
- 深银色心扉之Silverlight-MMORPG游戏引擎第一阶段移植
- 微型计算机控制系统优缺点,微型计算机控制系统习题总结精华要点.pdf
- 最流行的10种Android恶意软件类别解释
- 学金融和计算机有什么前途,金融学和计算机科学与技术哪个有前途
- 关于字符串转整型、整型转字符串 Java
热门文章
- docker启动elasticsearch——ERROR: Elasticsearch did not exit normally - check the logs at xxx
- MyBatis-Plus_查询进阶05
- 工作组访问不到别人的计算机,众果搜的博客
- break详细讲解啊
- 土木转计算机 但计算机学院不好,土木妹子转计算机,较高三维水科研,求指导!...
- python可以构建sem模型_python-分组的熊猫DataFrames:如何将scipy.stats.sem应用于它们?...
- java 多态与重载的区别_java实现多态 方法的重写和重载的区别
- mysql在单片机移植_移植MySQL到嵌入式ARM平台
- c# 溢出抛异常_Rust竟然没有异常处理?
- php里push的用法,php array_push函数怎么用?