Oracle优化器:星型转换
星型转换的基本思路是尽量避免直接去扫描星型模式中的事实表,因为这些事实表总会因为存有大量数据而十分庞大,对这些表的全表扫描会引起大量物理读并且效率低下。在典型的星型查询中,事实表总是会和多个与之相比小得多的维度表发生连接(join)操作。典型的事实表针对每一个维度表会存在一个外键(foreign key),除去这些键值(key)外还会存在一些度量字段譬如销售额度(sales amount)。与之对应的键值(key)在维度表上扮演主键的角色。而事实表与维度表间的连接操作一般都会发生在事实表上的外键和与之对应的维度表的主键间。同时这类查询总是会在维度表的其他列上存在限制十分严格的过滤谓词。充分结合这些维度表上的过滤谓词可以有效减少需要从事实表上访问的数据集合。这也就是星型转换(star transformation)的根本目的,仅访问事实表上相关的、过滤后精简的数据集合。 Oracle在Sample Schema示例模式中就存有星型模型的Schema,譬如SH:
SQL> select * from v$version;BANNER ---------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production PL/SQL Release 11.2.0.1.0 - Production CORE 11.2.0.1.0 Production TNS for 32-bit Windows: Version 11.2.0.1.0 - Production NLSRTL Version 11.2.0.1.0 - ProductionSQL> select * from global_name;GLOBAL_NAME ----------------------------------- www.oracledatabase12g.comSQL> conn maclean/maclean Connected.SQL> select table_name,comments2 from dba_tab_comments3 where owner = 'SH'4 and table_name in ('SALES', 'CUSTOMERS', 'CHANNELS', 'TIMES');TABLE_NAME COMMENTS ------------------------------ -------------------------------------------------------------------------------- CHANNELS small dimension table CUSTOMERS dimension table SALES facts table, without a primary key; all rows are uniquely identified by the comb TIMES Time dimension table to support multiple hierarchies and materialized views
可以从以上各表的注释(comment)中看到,SALES表是SH模式下一个没有主键的事实表,而CHANNELS、CUSTOMERS、TIMES三个小表充当维度表的角色。我们试着构建以下星型查询语句,该查询用以检索出从1999年12月至2000年2月间Florida州所有城市直销形式的每月销售额。
SQL> col name for a35 SQL> col description for a45 SQL> col value for a8 SQL> select name,value,description from v$system_parameter where name='star_transformation_enabled';NAME VALUE DESCRIPTION ----------------------------------- -------- --------------------------------------------- star_transformation_enabled FALSE enable the use of star transformation/* 初始化参数star_transformation_enabled用以控制如何启用星型转换,默认为FALSE,该参数可以动态修改 */SELECT c.cust_city,t.calendar_quarter_desc,SUM(s.amount_sold) sales_amountFROM sh.sales s, sh.times t, sh.customers c, sh.channels chWHERE s.time_id = t.time_idAND s.cust_id = c.cust_idAND s.channel_id = ch.channel_idAND c.cust_state_province = 'FL'AND ch.channel_desc = 'Direct Sales'AND t.calendar_quarter_desc IN ('2000-01', '2000-02','1999-12')GROUP BY c.cust_city, t.calendar_quarter_desc;SQL> select * from table(dbms_xplan.display_cursor(format => 'IOSTATS'));PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------------------SQL_ID ddjm7k72b8p2a, child number 1 ------------------------------------- SELECT /*+ gather_plan_statistics */ c.cust_city, t.calendar_quarter_desc, SUM(s.amount_sold) sales_amount FROM sh.sales s, sh.times t, sh.customers c, sh.channels ch WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = 'FL' AND ch.channel_desc = 'Direct Sales' AND t.calendar_quarter_desc IN ('2000-01', '2000-02','1999-12') GROUP BY c.cust_city, t.calendar_quarter_descPlan hash value: 382868716--------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | --------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 24 |00:00:00.62 | 1735 | 1726 | | 1 | HASH GROUP BY | | 1 | 24 | 24 |00:00:00.62 | 1735 | 1726 | |* 2 | HASH JOIN | | 1 | 1580 | 6015 |00:00:00.42 | 1735 | 1726 | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 1 | 2438 | 2438 |00:00:01.73 | 1459 | 1455 | |* 4 | HASH JOIN | | 1 | 4575 | 74631 |00:00:00.18 | 276 | 271 | | 5 | PART JOIN FILTER CREATE | :BF0000 | 1 | 227 | 182 |00:00:00.04 | 59 | 60 | | 6 | MERGE JOIN CARTESIAN | | 1 | 227 | 182 |00:00:00.04 | 59 | 60 | |* 7 | TABLE ACCESS FULL | CHANNELS | 1 | 1 | 1 |00:00:00.01 | 3 | 6 | | 8 | BUFFER SORT | | 1 | 227 | 182 |00:00:00.02 | 56 | 54 | |* 9 | TABLE ACCESS FULL | TIMES | 1 | 227 | 182 |00:00:00.02 | 56 | 54 | | 10 | PARTITION RANGE JOIN-FILTER| | 1 | 117K| 117K|00:00:00.09 | 217 | 211 | | 11 | TABLE ACCESS FULL | SALES | 2 | 117K| 117K|00:00:00.07 | 217 | 211 | ---------------------------------------------------------------------------------------------------------------
可以看到在以上不使用星型转换的执行计划中对事实表SALES执行了全表扫描,这是我们不希望发生的。因为SALES表中每一行记录都对应于一笔销售记录,因此其可能包含数百万行记录。但实际上这其中仅有极小部分是我们在查询中指定的季度在弗罗里达州直销的纪录。若我们启用星型转换,执行计划是否有所改善?
SQL> alter session set star_transformation_enabled=temp_disable; Session altered.SQL> alter session set events '10053 trace name context forever,level 1'; Session altered.在我们的理想当中星型变化会将原查询语句转换成如下形式:SELECT c.cust_city,t.calendar_quarter_desc,SUM(s.amount_sold) sales_amountFROM sh.sales s, sh.times t, sh.customers cWHERE s.time_id = t.time_idAND s.cust_id = c.cust_idAND c.cust_state_province = 'FL'AND t.calendar_quarter_desc IN ('2000-01', '2000-02', '1999-12')AND s.time_id IN(SELECT time_idFROM sh.timesWHERE calendar_quarter_desc IN ('2000-01', '2000-02', '1999-12'))AND s.cust_id IN(SELECT cust_id FROM sh.customers WHERE cust_state_province = 'FL')AND s.channel_id IN(SELECT channel_idFROM sh.channelsWHERE channel_desc = 'Direct Sales')GROUP BY c.cust_city, t.calendar_quarter_desc;/* 以添加AND..IN的形式明确了利用组合过滤谓词来减少需要处理的数据集 */通过10053优化trace我们可以了解Oracle优化器是如何真正产生这部分过度谓词的:FPD: Considering simple filter push in query block SEL$C3AF6D21 (#1) "S"."CHANNEL_ID"=ANY (SELECT /*+ SEMIJOIN_DRIVER */ "CH"."CHANNEL_ID" FROM "SH"."CHANNELS" "CH") AND "S"."CUST_ID"=ANY (SELECT /*+ SEMIJOIN_DRIVER */ "C"."CUST_ID" FROM "SH"."CUSTOMERS" "C") AND "S"."TIME_ID"=ANY (SELECT /*+ SEMIJOIN_DRIVER */ "T"."TIME_ID FPD: Considering simple filter push in query block SEL$ACF30367 (#4) "T"."CALENDAR_QUARTER_DESC"='2000-01' OR "T"."CALENDAR_QUARTER_DESC"='2000-02' OR "T"."CALENDAR_QUARTER_DESC"='1999-12' try to generate transitive predicate from check constraints for query block SEL$ACF30367 (#4) finally: "T"."CALENDAR_QUARTER_DESC"='2000-01' OR "T"."CALENDAR_QUARTER_DESC"='2000-02' OR "T"."CALENDAR_QUARTER_DESC"='1999-12'FPD: Considering simple filter push in query block SEL$F6045C7B (#3) "C"."CUST_STATE_PROVINCE"='FL' try to generate transitive predicate from check constraints for query block SEL$F6045C7B (#3) finally: "C"."CUST_STATE_PROVINCE"='FL'FPD: Considering simple filter push in query block SEL$6EE793B7 (#2) "CH"."CHANNEL_DESC"='Direct Sales' try to generate transitive predicate from check constraints for query block SEL$6EE793B7 (#2) finally: "CH"."CHANNEL_DESC"='Direct Sales'try to generate transitive predicate from check constraints for query block SEL$C3AF6D21 (#1) finally: "S"."CHANNEL_ID"=ANY (SELECT /*+ SEMIJOIN_DRIVER */ "CH"."CHANNEL_ID" FROM "SH"."CHANNELS" "CH") AND "S"."CUST_ID"=ANY (SELECT /*+ SEMIJOIN_DRIVER */ "C"."CUST_ID" FROM "SH"."CUSTOMERS" "C") AND "S"."TIME_ID"=ANY (SELECT /*+ SEMIJOIN_DRIVER */ "T"."TIME_IDFinal query after transformations:******* UNPARSED QUERY IS *******最终转换后的查询语句:SELECT "C"."CUST_CITY" "CUST_CITY","T"."CALENDAR_QUARTER_DESC" "CALENDAR_QUARTER_DESC",SUM("S"."AMOUNT_SOLD") "SALES_AMOUNT"FROM "SH"."SALES" "S", "SH"."TIMES" "T", "SH"."CUSTOMERS" "C"WHERE "S"."CHANNEL_ID" = ANY (SELECT /*+ SEMIJOIN_DRIVER */"CH"."CHANNEL_ID" "ITEM_1"FROM "SH"."CHANNELS" "CH"WHERE "CH"."CHANNEL_DESC" = 'Direct Sales')AND "S"."CUST_ID" = ANY (SELECT /*+ SEMIJOIN_DRIVER */"C"."CUST_ID" "ITEM_1"FROM "SH"."CUSTOMERS" "C"WHERE "C"."CUST_STATE_PROVINCE" = 'FL')AND "S"."TIME_ID" = ANY(SELECT /*+ SEMIJOIN_DRIVER */"T"."TIME_ID" "ITEM_1"FROM "SH"."TIMES" "T"WHERE "T"."CALENDAR_QUARTER_DESC" = '2000-01'OR "T"."CALENDAR_QUARTER_DESC" = '2000-02'OR "T"."CALENDAR_QUARTER_DESC" = '1999-12')AND "S"."TIME_ID" = "T"."TIME_ID"AND "S"."CUST_ID" = "C"."CUST_ID"AND "C"."CUST_STATE_PROVINCE" = 'FL'AND ("T"."CALENDAR_QUARTER_DESC" = '2000-01' OR"T"."CALENDAR_QUARTER_DESC" = '2000-02' OR"T"."CALENDAR_QUARTER_DESC" = '1999-12')GROUP BY "C"."CUST_CITY", "T"."CALENDAR_QUARTER_DESC"/* 要比我们想想的复杂一些,子查询将IN语句化解了,并且AND...ANY的形式追加了过度谓词条件 */------------------------------------------------------------------+-----------------------------------+---------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | Pstart| Pstop | ------------------------------------------------------------------+-----------------------------------+---------------+ | 0 | SELECT STATEMENT | | | | 1710 | | | | | 1 | HASH GROUP BY | | 1254 | 77K | 1710 | 00:00:21 | | | | 2 | HASH JOIN | | 1254 | 77K | 1283 | 00:00:16 | | | | 3 | HASH JOIN | | 1254 | 45K | 877 | 00:00:11 | | | | 4 | TABLE ACCESS FULL | TIMES | 227 | 3632 | 18 | 00:00:01 | | | | 5 | PARTITION RANGE SUBQUERY | | 1254 | 26K | 858 | 00:00:11 | KEY(SUBQUERY)| KEY(SUBQUERY)| | 6 | TABLE ACCESS BY LOCAL INDEX ROWID | SALES | 1254 | 26K | 858 | 00:00:11 | KEY(SUBQUERY)| KEY(SUBQUERY)| | 7 | BITMAP CONVERSION TO ROWIDS | | | | | | | | | 8 | BITMAP AND | | | | | | | | | 9 | BITMAP MERGE | | | | | | | | | 10 | BITMAP KEY ITERATION | | | | | | | | | 11 | BUFFER SORT | | | | | | | | | 12 | TABLE ACCESS FULL | CHANNELS | 1 | 13 | 3 | 00:00:01 | | | | 13 | BITMAP INDEX RANGE SCAN | SALES_CHANNEL_BIX| | | | | KEY(SUBQUERY)| KEY(SUBQUERY)| | 14 | BITMAP MERGE | | | | | | | | | 15 | BITMAP KEY ITERATION | | | | | | | | | 16 | BUFFER SORT | | | | | | | | | 17 | TABLE ACCESS FULL | TIMES | 227 | 3632 | 18 | 00:00:01 | | | | 18 | BITMAP INDEX RANGE SCAN | SALES_TIME_BIX | | | | | KEY(SUBQUERY)| KEY(SUBQUERY)| | 19 | BITMAP MERGE | | | | | | | | | 20 | BITMAP KEY ITERATION | | | | | | | | | 21 | BUFFER SORT | | | | | | | | | 22 | TABLE ACCESS FULL | CUSTOMERS | 2438 | 38K | 406 | 00:00:05 | | | | 23 | BITMAP INDEX RANGE SCAN | SALES_CUST_BIX | | | | | KEY(SUBQUERY)| KEY(SUBQUERY)| | 24 | TABLE ACCESS FULL | CUSTOMERS | 2438 | 62K | 406 | 00:00:05 | | | ------------------------------------------------------------------+-----------------------------------+---------------+ Predicate Information: ---------------------- 2 - access("S"."CUST_ID"="C"."CUST_ID") 3 - access("S"."TIME_ID"="T"."TIME_ID") 4 - filter(("T"."CALENDAR_QUARTER_DESC"='1999-12' OR "T"."CALENDAR_QUARTER_DESC"='2000-01' OR "T"."CALENDAR_QUARTER_DESC"='2000-02')) 12 - filter("CH"."CHANNEL_DESC"='Direct Sales') 13 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") 17 - filter(("T"."CALENDAR_QUARTER_DESC"='1999-12' OR "T"."CALENDAR_QUARTER_DESC"='2000-01' OR "T"."CALENDAR_QUARTER_DESC"='2000-02')) 18 - access("S"."TIME_ID"="T"."TIME_ID") 22 - filter("C"."CUST_STATE_PROVINCE"='FL') 23 - access("S"."CUST_ID"="C"."CUST_ID") 24 - filter("C"."CUST_STATE_PROVINCE"='FL')
从以上演示中可以看到,星型转换添加了必要的对应于维度表约束的子查询谓词。这些子查询谓词又被称为位图半连接谓词(bitmap semi-join predicates,见SEMIJOIN_DRIVER hint)。通过迭代来自于子查询的键值,再通过位图(bitmap)的AND、OR操作(这些位图可以源于位图索引bitmap index,但也可以取自普通的B*tree索引),我们可以做到仅仅访问事实表上的查询相关记录。理想状况下维度表上的过滤谓词可以帮我们过滤掉大量的数据,这样就可以使执行计划效率大大提升。当我们获取到事实表上的相关行后,这部分结果集可能仍需要同维度表使用原谓词重复连接(join back)。某些情况下,重复连接可以被省略,之后我们会提到。 如上演示中列出了星型转换后的查询语句的执行计划。这里可以看到Oracle是使用"TABLE ACCESS BY LOCAL INDEX ROWID"形式访问SALES事实表的,而非全表扫描。这里我们仅关心7-23行的执行计划,服务进程分别在(12,17,22)行从维度表中取得各维度表的相关键值(key value),同时对部分结果集执行了BUFFER SORT操作;在(13,18,23)行的'bitmap index range scan'操作中服务进程从事实表的三个对应于维度表外键的位图索引上(SALES_CHANNEL_BIX,SALES_TIME_BIX,SALES_CUST_BIX)获取了最原始的位图。位图上的每一个bit都对应于事实表上的一行记录。若从子查询中获取的键值(key values)与事实表上的值一致则bit置为1,否则为0。举例而言位图bitmap:[1][0][1][1][0][0][0]..[0](之后都为零)表示事实表上仅有第一、三、四行匹配于由子查询提供的键值。我们假设以上位图是由times表子查询提供的众多键值中的一个(如'2000-01')的对应于事实表的位图表达式。 接着在执行计划的(10,15,20)行上的'bitmap key iteration'操作会迭代每一个由子查询提供的键值并获取相应的位图。我们假设times表子查询提供的另外2个键值'2000-02'和'1999-12'分别对应的位图为[0][0][0][0][0][1]..[0]和[0][0][0][0][1][0]...[0]即每键值都只有一行符合。 毫无疑问ITERATION迭代操作会为我们生成众多位图,接下来需要对这些不同键值对应的位图进行位图合并操作(BITMAP MERGE,相当于对位图做OR操作),可以看到在上例执行计划中为(9,14,19)行;以我们假设的times表子查询位图合并而言,会生产一个简单的位图[1][0][1][1][1][1][0][0]..[0],这个位图对应事实表上的第一、三、四、五、六行,是对'2000-01','2000-02','1999-12'三个键值对应位图的合并。 在获得最后位图前我们还需要对来自于三个子查询的位图进一步处理,因为原始查询语句中各约束条件是AND与的形式,因此我们还要对这些已合并的位图执行AND与操作,如执行计划中的第八行"BITMAP AND",因为是AND与操作所以这步又会过滤掉大量记录。我们假设最终获得的位图是[1][0][1][0]...[0],即仅有第一、三行。 通过最终bitmap位图Oracle可以极高效地生成事实表的ROWID,此步骤表现为第七行的"BITMAP CONVERSION TO ROWIDS",我们使用这些ROWID来访问事实表取得少量的"绝对"相关记录。以我们的假设而言最终位图仅有2位为1,只需要用这2行的ROWID从事实表上直接fetch2条记录即可,从而避免了低效的全表扫描。
省略重复连接
因为子查询及位图树只是通过维度表上的过滤条件为事实表过滤掉大量的数据,所以从事实表上获取的相关数据仍可能需要重复一次和维度表的连接。省略重复连接的前提是维度表上所有的谓词都是半连接谓词子查询的一部分,And 由子查询检索到的列均唯一(unique) And 维度表的列不被select或group by涉及。在上例中无需对CHANNELS表再次连接的理由是没有select(或group by)CHANNEL表上的列,且channel_id列是唯一的。
临时表转换
若在已知星型转换中重复连接维度表无法被省略的话,Oracle可以将对维度表的子查询结果集存储到内存中的全局临时表(global temporary table)上以避免重复扫描维度表。此外,因为将子查询的结果集物化了,故而若使用并行查询则每个并行子进程(slave)可以直接从物化结果集的临时表中获得数据,而不需要反复执行子查询。 试看以下示例,了解Oracle是如何利用物化临时表避免反复连接的:
SQL> alter session set star_transformation_enabled=true; Session altered.SQL> alter session set events '10053 trace name context forever,level 1'; Session altered.SELECT "T1"."C1" "CUST_CITY","T"."CALENDAR_QUARTER_DESC" "CALENDAR_QUARTER_DESC",SUM("S"."AMOUNT_SOLD") "SALES_AMOUNT"FROM "SH"."SALES" "S","SH"."TIMES" "T","SYS"."SYS_TEMP_0FD9D660E_1DF5D6" "T1"WHERE "S"."CUST_ID" = ANY (SELECT /*+ SEMIJOIN_DRIVER CACHE_TEMP_TABLE ("T1") */"T1"."C0" "C0"FROM "SYS"."SYS_TEMP_0FD9D660E_1DF5D6" "T1")AND "S"."CHANNEL_ID" = ANY(SELECT /*+ SEMIJOIN_DRIVER */"CH"."CHANNEL_ID" "ITEM_1"FROM "SH"."CHANNELS" "CH"WHERE "CH"."CHANNEL_DESC" = 'Direct Sales')AND "S"."TIME_ID" = ANY(SELECT /*+ SEMIJOIN_DRIVER */"T"."TIME_ID" "ITEM_1"FROM "SH"."TIMES" "T"WHERE "T"."CALENDAR_QUARTER_DESC" = '2000-01'OR "T"."CALENDAR_QUARTER_DESC" = '2000-02'OR "T"."CALENDAR_QUARTER_DESC" = '1999-12')AND "S"."TIME_ID" = "T"."TIME_ID"AND "S"."CUST_ID" = "T1"."C0"AND ("T"."CALENDAR_QUARTER_DESC" = '2000-01' OR"T"."CALENDAR_QUARTER_DESC" = '2000-02' OR"T"."CALENDAR_QUARTER_DESC" = '1999-12')GROUP BY "T1"."C1", "T"."CALENDAR_QUARTER_DESC"以上为启用临时表后的星型转换后的查询语句,相应的执行计划如下: ---------------------------------------------------------------------------+-----------------------------------+---------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | Pstart| Pstop | ---------------------------------------------------------------------------+-----------------------------------+---------------+ | 0 | SELECT STATEMENT | | | | 911 | | | | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | | | 2 | LOAD AS SELECT | | | | | | | | | 3 | TABLE ACCESS FULL | CUSTOMERS | 2438 | 62K | 406 | 00:00:05 | | | | 4 | HASH GROUP BY | | 1254 | 64K | 506 | 00:00:07 | | | | 5 | HASH JOIN | | 1254 | 64K | 479 | 00:00:06 | | | | 6 | HASH JOIN | | 1254 | 45K | 475 | 00:00:06 | | | | 7 | TABLE ACCESS FULL | TIMES | 227 | 3632 | 18 | 00:00:01 | | | | 8 | PARTITION RANGE SUBQUERY | | 1254 | 26K | 456 | 00:00:06 | KEY(SUBQUERY)| KEY(SUBQUERY)| | 9 | TABLE ACCESS BY LOCAL INDEX ROWID | SALES | 1254 | 26K | 456 | 00:00:06 | KEY(SUBQUERY)| KEY(SUBQUERY)| | 10 | BITMAP CONVERSION TO ROWIDS | | | | | | | | | 11 | BITMAP AND | | | | | | | | | 12 | BITMAP MERGE | | | | | | | | | 13 | BITMAP KEY ITERATION | | | | | | | | | 14 | BUFFER SORT | | | | | | | | | 15 | TABLE ACCESS FULL | CHANNELS | 1 | 13 | 3 | 00:00:01 | | | | 16 | BITMAP INDEX RANGE SCAN | SALES_CHANNEL_BIX | | | | | KEY(SUBQUERY)| KEY(SUBQUERY)| | 17 | BITMAP MERGE | | | | | | | | | 18 | BITMAP KEY ITERATION | | | | | | | | | 19 | BUFFER SORT | | | | | | | | | 20 | TABLE ACCESS FULL | TIMES | 227 | 3632 | 18 | 00:00:01 | | | | 21 | BITMAP INDEX RANGE SCAN | SALES_TIME_BIX | | | | | KEY(SUBQUERY)| KEY(SUBQUERY)| | 22 | BITMAP MERGE | | | | | | | | | 23 | BITMAP KEY ITERATION | | | | | | | | | 24 | BUFFER SORT | | | | | | | | | 25 | TABLE ACCESS FULL | SYS_TEMP_0FD9D660E_1DF5D6| 2438 | 12K | 4 | 00:00:01 | | | | 26 | BITMAP INDEX RANGE SCAN | SALES_CUST_BIX | | | | | KEY(SUBQUERY)| KEY(SUBQUERY)| | 27 | TABLE ACCESS FULL | SYS_TEMP_0FD9D660E_1DF5D6| 2438 | 36K | 4 | 00:00:01 | | | ---------------------------------------------------------------------------+-----------------------------------+---------------+Predicate Information: ---------------------- 3 - filter("C"."CUST_STATE_PROVINCE"='FL') 5 - access("S"."CUST_ID"="C0") 6 - access("S"."TIME_ID"="T"."TIME_ID") 7 - filter(("T"."CALENDAR_QUARTER_DESC"='1999-12' OR "T"."CALENDAR_QUARTER_DESC"='2000-01' OR "T"."CALENDAR_QUARTER_DESC"='2000-02')) 15 - filter("CH"."CHANNEL_DESC"='Direct Sales') 16 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") 20 - filter(("T"."CALENDAR_QUARTER_DESC"='1999-12' OR "T"."CALENDAR_QUARTER_DESC"='2000-01' OR "T"."CALENDAR_QUARTER_DESC"='2000-02')) 21 - access("S"."TIME_ID"="T"."TIME_ID") 26 - access("S"."CUST_ID"="C0")
从以上trace中可以看到系统命名的临时表SYS_TEMP_0FD9D660E_1DF5D6缓存CUSTOMERS表,之后原先CUSTOMERS表被SYS_TEMP_0FD9D660E_1DF5D6所取代,原CUSTOMERS表上的cust_id和cust_city列均被替换为别名为T1的临时表的C0和C1列。实际上该临时表也仅需要这2列即可满足计划的需求,所以该临时表以如下查询语句填充:
ST: Subquery text:******* UNPARSED QUERY IS ******* SELECT "C"."CUST_ID" "ITEM_1","C"."CUST_CITY" "ITEM_2" FROM "SH"."CUSTOMERS" "C" WHERE "C"."CUST_STATE_PROVINCE"='FL' Copy query block qb# -1 () : SELECT /*+ CACHE_TEMP_TABLE(T1) */ "C0" FROM "SYS"."SYS_TEMP_0FD9D660E_1DF5D6" T1 ST: Subquery (temp table) text:******* UNPARSED QUERY IS ******* SELECT /*+ CACHE_TEMP_TABLE ("T1") */ "T1"."C0" "C0" FROM "SYS"."SYS_TEMP_0FD9D660E_1DF5D6" "T1" Copy query block qb# -1 () : SELECT /*+ CACHE_TEMP_TABLE(T1) */ "C0", "C1" FROM "SYS"."SYS_TEMP_0FD9D660E_1DF5D6" T1 ST: Join back qbc text:******* UNPARSED QUERY IS ******* SELECT /*+ CACHE_TEMP_TABLE ("T1") */ "T1"."C0" "C0","T1"."C1" "C1" FROM "SYS"."SYS_TEMP_0FD9D660E_1DF5D6" "T1"
可以从以上执行计划中看到第一、二、三行的"TEMP TABLE TRANSFORMATION LOAD AS SELECT TABLE ACCESS FULL CUSTOMERS"看到Oracle是如何将子查询物化为临时表的。在第25行,Oracle直接以该临时表替代了子查询来构建我们所需要的位图。到第27行Oracle直接利用该临时表来重复连接,避免再次扫描customers表。因为我们在构建临时表时已经使用谓词条件(如上面的红字语句),故而我们无需对临时表再次过滤。
如何启用星型查询
星型转换由初始化参数star_transformation_enabled控制,该参数可以有三种选项:
- TRUE: Oracle优化器自动识别语句中的事实表和约束维度表并进行星型转换。这一切优化尝试都在CBO的藩篱内,优化器需要确定转换后的执行计划成本要低于不转换的执行计划;同时优化器还会尝试利用物化的临时表,如果那样真的好的话。
- False: 优化器不会考虑星型转换。
- TEMP_DISABLE:当一个维度表超过100个块时,"如果简单地设置star_transformation_enabled为TRUE来启用星型变换,那么会话会创建一个内存中的全局临时表(global temporary table)来保存已过滤的维度数据,这在过去会造成很多问题;"这里说的100个块其实是隐式参数_temp_tran_block_threshold(number of blocks for a dimension before we temp transform)的默认值,此外隐式参数_temp_tran_cache(determines if temp table is created with cache option,默认为TRUE)决定了这类临时表是否被缓存住;为了避免创建全局临时表可能带来的问题,就可以用到TEMP_DISABLE这个禁用临时表的选项,让优化器不再考虑使用物化的临时表。
默认该参数为False,若要问这是为什么?因为星型转换适用的场景是数据仓库环境中具有星型模型的模式,而且需要事实表的各个连接列上均有良好的索引时才能发挥其优势。如果能确定以上因素,那么我们可以放心的使用星型转换了,把star_transformation_enabled改为true或temp_disable吧!
总结
星型转换可以有效改善大的事实表与多个具有良好选择率的维度表间连接的查询。星型转换有效避免了全表扫描的性能窘境。它只fetch那些事实表上的"绝对"相关行。同时星型转换是基于CBO优化器的,Oracle能很好地认清使用该种转换是否有利。一旦维度表上的过滤无法有效减少需要从事实表上处理的数据集和时,那么可能全表扫描相对而言更为恰当。 以上我们力图通过一些简单的查询和执行计划来诠释星型转换的基本理念,但现实生产环境中实际的查询语句可能要复杂的多;举例而言如果查询涉及星型模型中的多个事实表的话,那么其复杂度就大幅提高了;如何正确构建事实表上的索引,收集相关列上的柱状图信息,在Oracle优化器无法正确判断的情况下循循善诱,都是大型数据仓库环境中DBA所面临的难题。
本文转自maclean_007 51CTO博客,原文链接:http://blog.51cto.com/maclean/1277672
Oracle优化器:星型转换相关推荐
- Oracle优化器:星型转换(Star Query Transformation )
Oracle优化器:星型转换(Star Query Transformation ) Star query是一个事实表(fact table)和一些维度表(dimension)的join.每个维度 ...
- Oracle 12CR2查询转换之星型转换
星型转换是一种优化转换它用来避免对星型方案中的事实表进行全表扫描.一个星型方案将数据分成事实与维度表.事实是对一个事件比如销售的测量通常是数字.维度是标识事实的分类,比如日期,位置与产品.一个事实表有 ...
- Oracle 优化器
什么是优化器 优化器是Oracle中的一个核心模块,它的作用是为用户输入的SQL选择一个它计算出来的最高效的执行计划.SQL语句在Oracle中的执行过程如下图所示: 基于规则优化器RBO 基于规则的 ...
- mysql和oracle优化器_Oracle 优化器
一.优化器基本知识 Oracle在执行一个SQL之前,首先要分析一下语句的执行计划,然后再按执行计划去执行.分析语句的执行计划的工作是由优化器(Optimizer)来完成的.不同的情况,一条SQL可能 ...
- 【云和恩墨大讲堂】从执行计划洞察ORACLE优化器的“小聪明”
作者简介 黄浩 惠普 十年一剑,十年磨砺.3年通信行业,写就近3万条SQL:5年制造行业,遨游在ETL的浪潮:2年性能优化,厚积薄发自成一家 主题介绍: Oracle执行计划的另类解读:调皮的执行计 ...
- oracle stalestats_深入理解oracle优化器统计数据(Optimizer Statistics)
理解oracle优化器统计数据 首先来介绍oracle数据库使用基于规则优化器(RBO)来决定如何执行一个sql语句.基于规则优化器顾名思义,它是遵循一组规则来判断一个sql语句的执行计划.这组规则是 ...
- Oracle 优化器_表连接
概述 在写SQL的时候,有时候涉及到的不仅只有一个表,这个时候,就需要表连接了.Oracle优化器处理SQL语句时,根据SQL语句,确定表的连接顺序(谁是驱动表,谁是被驱动表及 哪个表先和哪个表做链接 ...
- 将oracle优化器改为CBO,【DB.Oracle】Oracle 优化器 (RBO, CBO)
Oracle优化器 Oracle数据库中的优化器又叫查询优化器(Query Optimizer).它是SQL分析和执行的优化工具,它负责生成.制定SQL的执行计划.Oracle的优化器有两种,基于规则 ...
- oracle默认的优化器,Oracle优化器相关参数设置
oracle优化器相关参数设置 Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Prod ...
最新文章
- go系列(5)- beego自己写controller
- 列表推导式 python原理_Python进阶-列表推导式详解总结
- 虚拟化部署之灵活应用Hyper-V快照
- php 价钱计算,php公式计算
- leetcode130. 被围绕的区域
- teradata 数据定义
- 剁手党也有春天 -- 淘宝 UWP ”比较“功能诞生记
- 解决eclipse中mybatis的xml配置文件无代码提示问题
- 有关 strongSwan 的英文文档
- maven插件打包exec_Exec Maven插件–从Maven Build运行Java程序
- 《数学模型(第五版)》简记
- MySQL聚集索引与非聚集索引相关总结
- GTK+:GTK+的简介、安装、使用方法之详细攻略
- ce游戏逆向修改之植物大战僵尸
- 天龙微信游戏找服务器,《天龙八部3》微信公众号积分系统维护公告
- 视频工作室必备利器——群晖NAS
- POJ - 3067
- Oracle中用户角色权限管理
- TR069协议向导—— 一个帮助你了解TR069协议的简明教程(二)
- 【DP】CF940E Cashback