Advanced Oracle SQL Programming: A Complete Guide to Analytic (Window) Functions

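Overview

An analytic function, also called a window function, computes a value for each row over a subset of the result set that is related to the current row.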

General syntax

function(arg1 , arg2 , ...) over( partition_by_clause order_by_clause windowing_clause )

windowing_clause : rows | range between start_expr and end_expr

start_expr is unbounded preceding | current row | n preceding | n following

end_expr is unbounded following | current row | n preceding | n following

Not all analytic functions support the windowing clause.
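As a minimal sketch of the three clauses working together, the query below runs against a small inline data set built from dual. The emp_demo name, its columns and its values are made up for illustration and are not part of the test schema used in this article.

```sql
-- Hypothetical inline data; emp_demo, dept, emp and sal are illustrative names only.
with emp_demo as (
  select 'A' as dept, 'e1' as emp, 100 as sal from dual union all
  select 'A', 'e2', 200 from dual union all
  select 'A', 'e3', 300 from dual union all
  select 'B', 'e4', 400 from dual union all
  select 'B', 'e5', 500 from dual
)
select dept, emp, sal,
       sum(sal) over (
         partition by dept                                  -- partition clause
         order by sal                                       -- order by clause
         rows between unbounded preceding and current row   -- windowing clause
       ) as running_sal
from emp_demo
order by dept, sal;
```

Each department is its own partition, so the running total restarts at the first row of department B.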

Create the test table

create table sales_fact as

select country_name country , country_subregion region , prod_name product , calendar_year year , calendar_week_number week ,

sum(amount_sold) sale , sum(amount_sold*

(case when mod(rownum , 10 ) = 0 then 1.4

when mod(rownum , 5)= 0 then 0.6

when mod(rownum , 2)= 0 then 0.9

when mod(rownum , 2)=1 then 1.2

else 1 end ) ) receipts

from sales , times , customers , countries , products

where sales.time_id = times.time_id and

sales.prod_id = products.prod_id and

sales.cust_id = customers.cust_id and

customers.country_id = countries.country_id

group by country_name , country_subregion , prod_name , calendar_year , calendar_week_number ;

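Using aggregate functions as analytic functions

An analytic function only adds one more column of values, one value per row; it has no effect on any other part of the query.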

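The queries below illustrate the following points:

1. Columns in the PARTITION BY clause of OVER need not appear in the select list, but they must exist in the row source.

2. Columns in the ORDER BY clause of OVER need not appear in the select list either, but they must exist in the row source.

3. The ORDER BY inside OVER sorts the rows within each partition. It is independent of the ORDER BY of the SELECT statement, but it does affect the result of the analytic function.

4. The window never extends beyond the partition. rows between unbounded preceding and current row means from the first row of the partition to the current row.

5. Analytic functions operate on the result set after the WHERE conditions have been applied.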

The analytic column in the following query is the running total of sales from the start of the year up to the current week.

SH@ prod> select year , week , sale ,

sum(sale) over( partition by region , year order by week rows between unbounded preceding and current row ) running_sum_ytd

from sales_fact

where country in ('Australia') and product='Xtend Memory' and week < 10

order by year , week ;

YEAR WEEK SALE RUNNING_SUM_YTD

---------- ---------- ---------- ---------------

1998 1 58.15 58.15

1998 2 29.39 87.54

1998 3 29.49 117.03

1998 4 29.49 146.52

1998 5 29.8 176.32

1998 6 58.78 235.1

1998 9 58.78 293.88

1999 1 53.52 53.52

1999 3 94.6 148.12

1999 4 40.5 188.62

1999 5 80.01 268.63

1999 6 40.5 309.13

1999 8 103.11 412.24

1999 9 53.34 465.58

2000 1 46.7 46.7

2000 3 93.41 140.11

2000 4 46.54 186.65

2000 5 46.7 233.35

2000 7 70.8 304.15

2000 8 46.54 350.69

2001 1 92.26 92.26

2001 2 118.38 210.64

2001 3 47.24 257.88

2001 4 256.7 514.58

2001 5 93.44 608.02

2001 6 22.44 630.46

2001 7 69.96 700.42

YEAR WEEK SALE RUNNING_SUM_YTD

---------- ---------- ---------- ---------------

2001 8 46.06 746.48

2001 9 92.67 839.15

29 rows selected.

The following query returns the same values as above, only sorted differently, so the analytic column no longer shows an obvious pattern.

SH@ prod> select year , week , sale , sum(sale) over( partition by region , year order by week rows between unbounded preceding and current row ) running_sum_ytd from sales_fact where country in ('Australia') and product='Xtend Memory' and week < 10 order by year , sale ;

YEAR WEEK SALE RUNNING_SUM_YTD

---------- ---------- ---------- ---------------

1998 2 29.39 87.54

1998 4 29.49 146.52

1998 3 29.49 117.03

1998 5 29.8 176.32

1998 1 58.15 58.15

1998 6 58.78 235.1

1998 9 58.78 293.88

1999 4 40.5 188.62

1999 6 40.5 309.13

1999 9 53.34 465.58

1999 1 53.52 53.52

1999 5 80.01 268.63

1999 3 94.6 148.12

1999 8 103.11 412.24

2000 4 46.54 186.65

2000 8 46.54 350.69

2000 1 46.7 46.7

2000 5 46.7 233.35

2000 7 70.8 304.15

2000 3 93.41 140.11

2001 6 22.44 630.46

2001 8 46.06 746.48

2001 3 47.24 257.88

2001 7 69.96 700.42

2001 1 92.26 92.26

2001 9 92.67 839.15

2001 5 93.44 608.02

YEAR WEEK SALE RUNNING_SUM_YTD

---------- ---------- ---------- ---------------

2001 2 118.38 210.64

2001 4 256.7 514.58

29 rows selected.

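If the ordering inside the partition is chosen badly, the analytic column becomes meaningless. The choice of partitioning, ordering and window determines what the analytic column means.

SH@ prod> select year , week , sale , sum(sale) over( partition by region , year order by sale rows between unbounded preceding and current row ) running_sum_ytd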

from sales_fact

where country in ('Australia') and product='Xtend Memory' and week < 10 order by year , week ;

YEAR WEEK SALE RUNNING_SUM_YTD

---------- ---------- ---------- ---------------

1998 1 58.15 176.32

1998 2 29.39 29.39

1998 3 29.49 88.37

1998 4 29.49 58.88

1998 5 29.8 118.17

1998 6 58.78 235.1

1998 9 58.78 293.88

1999 1 53.52 187.86

1999 3 94.6 362.47

1999 4 40.5 40.5

1999 5 80.01 267.87

1999 6 40.5 81

1999 8 103.11 465.58

1999 9 53.34 134.34

2000 1 46.7 186.48

2000 3 93.41 350.69

2000 4 46.54 46.54

2000 5 46.7 139.78

2000 7 70.8 257.28

2000 8 46.54 93.08

2001 1 92.26 277.96

2001 2 118.38 582.45

2001 3 47.24 115.74

2001 4 256.7 839.15

2001 5 93.44 464.07

2001 6 22.44 22.44

2001 7 69.96 185.7

YEAR WEEK SALE RUNNING_SUM_YTD

---------- ---------- ---------- ---------------

2001 8 46.06 68.5

2001 9 92.67 370.63

29 rows selected.

Execution plans for analytic functions

Even with an analytic function, the query still needs only one full table scan, but it does need an extra sort.

A WINDOW SORT step in the plan is the hallmark of an analytic function.

SH@ prod> explain plan for select year , week , sale , sum(sale) over( partition by region , year order by sale rows between unbounded preceding and current row ) running_sum_ytd from sales_fact where country in ('Australia') and product='Xtend Memory' and week < 10 order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display()) ;

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------

Plan hash value: 173857439

----------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 18 | 1890 | 311 (1)| 00:00:04 |

| 1 | SORT ORDER BY | | 18 | 1890 | 311 (1)| 00:00:04 |

| 2 | WINDOW SORT | | 18 | 1890 | 311 (1)| 00:00:04 |

|* 3 | TABLE ACCESS FULL| SALES_FACT | 18 | 1890 | 309 (1)| 00:00:04 |

----------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND

"WEEK"<10)

Note

-----

- dynamic sampling used for this statement (level=2)    <-- the table has no statistics yet

20 rows selected.

Without the analytic column, the plan simply has no WINDOW SORT step.

SH@ prod> explain plan for

2 select year , week , sale

3 from sales_fact

4 where country in ('Australia') and product='Xtend Memory' and week < 10

5 order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display()) ;

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------

Plan hash value: 1978576542

---------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 18 | 1584 | 310 (1)| 00:00:04 |

| 1 | SORT ORDER BY | | 18 | 1584 | 310 (1)| 00:00:04 |

|* 2 | TABLE ACCESS FULL| SALES_FACT | 18 | 1584 | 309 (1)| 00:00:04 |

---------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND

"WEEK"<10)

Note

-----

- dynamic sampling used for this statement (level=2)

19 rows selected.

Making the window span the entire partition

SH@ prod> select year , week , sale , max(sale) over(partition by product , country , region , year

2 order by week

3 rows between unbounded preceding and unbounded following )

4 max_sale

5 from sales_fact

6 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

7 order by product , country , year , week ;

YEAR WEEK SALE MAX_SALE

---------- ---------- ---------- ----------

1998 1 58.15 58.78

1998 2 29.39 58.78

1998 3 29.49 58.78

1998 4 29.49 58.78

1998 5 29.8 58.78

1998 6 58.78 58.78

1998 9 58.78 58.78

1999 1 53.52 103.11

1999 3 94.6 103.11

1999 4 40.5 103.11

1999 5 80.01 103.11

1999 6 40.5 103.11

1999 8 103.11 103.11

1999 9 53.34 103.11

2000 1 46.7 93.41

2000 3 93.41 93.41

2000 4 46.54 93.41

2000 5 46.7 93.41

2000 7 70.8 93.41

2000 8 46.54 93.41

2001 1 92.26 256.7

2001 2 118.38 256.7

2001 3 47.24 256.7

2001 4 256.7 256.7

2001 5 93.44 256.7

2001 6 22.44 256.7

2001 7 69.96 256.7

YEAR WEEK SALE MAX_SALE

---------- ---------- ---------- ----------

2001 8 46.06 256.7

2001 9 92.67 256.7

29 rows selected.

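Windows with two sliding boundaries

The window in the following statement covers the two preceding rows, the current row and the two following rows, that is, five weeks of data in total. (Near a partition boundary the window shrinks automatically.)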

SH@ prod> select year , week , sale , max(sale) over(partition by product , country , region , year

2 order by week

3 rows between 2 preceding and 2 following )

4 max_sale

5 from sales_fact

6 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

7 order by product , country , year , week ;

YEAR WEEK SALE MAX_SALE

---------- ---------- ---------- ----------

1998 1 58.15 58.15

1998 2 29.39 58.15

1998 3 29.49 58.15

1998 4 29.49 58.78

1998 5 29.8 58.78

1998 6 58.78 58.78

1998 9 58.78 58.78

1999 1 53.52 94.6

1999 3 94.6 94.6

1999 4 40.5 94.6

1999 5 80.01 103.11

1999 6 40.5 103.11

1999 8 103.11 103.11

1999 9 53.34 103.11

2000 1 46.7 93.41

2000 3 93.41 93.41

2000 4 46.54 93.41

2000 5 46.7 93.41

2000 7 70.8 70.8

2000 8 46.54 70.8    <-- 70.8 because the window shrinks at the partition boundary

2001 1 92.26 118.38

2001 2 118.38 256.7

2001 3 47.24 256.7

2001 4 256.7 256.7

2001 5 93.44 256.7

2001 6 22.44 256.7

2001 7 69.96 93.44

YEAR WEEK SALE MAX_SALE

---------- ---------- ---------- ----------

2001 8 46.06 92.67

2001 9 92.67 92.67

29 rows selected.

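What is the default window? The output below makes it clear: with an ORDER BY and no windowing clause, the window runs from the start of the partition to the current row (range between unbounded preceding and current row).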

SH@ prod> select year , week , sale , max(sale) over(partition by product , country , region , year

2 order by week )

3 max_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , week ;

YEAR WEEK SALE MAX_SALE

---------- ---------- ---------- ----------

1998 1 58.15 58.15

1998 2 29.39 58.15

1998 3 29.49 58.15

1998 4 29.49 58.15

1998 5 29.8 58.15

1998 6 58.78 58.78

1998 9 58.78 58.78

1999 1 53.52 53.52

1999 3 94.6 94.6

1999 4 40.5 94.6

1999 5 80.01 94.6

1999 6 40.5 94.6

1999 8 103.11 103.11

1999 9 53.34 103.11

2000 1 46.7 46.7

2000 3 93.41 93.41

2000 4 46.54 93.41

2000 5 46.7 93.41

2000 7 70.8 93.41

2000 8 46.54 93.41

2001 1 92.26 92.26

2001 2 118.38 118.38

2001 3 47.24 118.38

2001 4 256.7 256.7

2001 5 93.44 256.7

2001 6 22.44 256.7

2001 7 69.96 256.7

YEAR WEEK SALE MAX_SALE

---------- ---------- ---------- ----------

2001 8 46.06 256.7

2001 9 92.67 256.7

29 rows selected.
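To make the default explicit, the following sketch computes the same running maximum twice, once with no windowing clause and once with the default written out. The default_demo data set is an assumption for illustration; its four values are borrowed from the 1998 rows above.

```sql
-- Hypothetical inline data; values borrowed from the 1998 rows of the output above.
with default_demo as (
  select 1 as wk, 58.15 as sale from dual union all
  select 2, 29.39 from dual union all
  select 3, 29.49 from dual union all
  select 6, 58.78 from dual
)
select wk, sale,
       -- no windowing clause: Oracle applies its default window
       max(sale) over (order by wk) as max_default,
       -- the same window spelled out
       max(sale) over (order by wk
                       range between unbounded preceding and current row) as max_explicit
from default_demo
order by wk;
```

The two columns should be identical. Note that the default uses RANGE rather than ROWS, so rows that tie on the ORDER BY key all fall inside each other's window; when the sort key is unique, as here, the two forms behave the same.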

Lead and Lag (functions that do not support windowing)

Adding a windowing clause to them raises an error like this:

2

3

4

rows between 2 preceding and 2 following )

*

ERROR at line 3:

ORA-00907: missing right parenthesis

LEAD returns the value from the next row, not the previous one. At the lower boundary of the partition, LEAD returns NULL.

SH@ prod> select year , week , sale , lead(sale) over(partition by product , country , region , year

2 order by week )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15 29.39

1998 2 29.39 29.49

1998 3 29.49 29.49

1998 4 29.49 29.8

1998 5 29.8 58.78

1998 6 58.78 58.78

1998 9 58.78

1999 1 53.52 94.6

1999 3 94.6 40.5

1999 4 40.5 80.01

1999 5 80.01 40.5

1999 6 40.5 103.11

1999 8 103.11 53.34

1999 9 53.34

2000 1 46.7 93.41

2000 3 93.41 46.54

2000 4 46.54 46.7

2000 5 46.7 70.8

2000 7 70.8 46.54

2000 8 46.54

2001 1 92.26 118.38

2001 2 118.38 47.24

2001 3 47.24 256.7

2001 4 256.7 93.44

2001 5 93.44 22.44

2001 6 22.44 69.96

2001 7 69.96 46.06

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 92.67

2001 9 92.67

29 rows selected.

LAG returns the value from the previous row. At the upper boundary of the partition it returns NULL.

SH@ prod> select year , week , sale , lag(sale) over(partition by product , country , region , year

2 order by week )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15

1998 2 29.39 58.15

1998 3 29.49 29.39

1998 4 29.49 29.49

1998 5 29.8 29.49

1998 6 58.78 29.8

1998 9 58.78 58.78

1999 1 53.52

1999 3 94.6 53.52

1999 4 40.5 94.6

1999 5 80.01 40.5

1999 6 40.5 80.01

1999 8 103.11 40.5

1999 9 53.34 103.11

2000 1 46.7

2000 3 93.41 46.7

2000 4 46.54 93.41

2000 5 46.7 46.54

2000 7 70.8 46.7

2000 8 46.54 70.8

2001 1 92.26

2001 2 118.38 92.26

2001 3 47.24 118.38

2001 4 256.7 47.24

2001 5 93.44 256.7

2001 6 22.44 93.44

2001 7 69.96 22.44

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 69.96

2001 9 92.67 46.06

29 rows selected.

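More on Lead and Lag

The first argument of LEAD and LAG is the column to return, the second is the row offset (non-negative), and the third is the default value returned when no such row exists (it may be the value from the current row).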

SH@ prod> select year , week , sale , lag(sale , 2 , 0 ) over(partition by product , country , region , year

2 order by week )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15 0

1998 2 29.39 0

1998 3 29.49 58.15

1998 4 29.49 29.39

1998 5 29.8 29.49

1998 6 58.78 29.49

1998 9 58.78 29.8

1999 1 53.52 0

1999 3 94.6 0

1999 4 40.5 53.52

1999 5 80.01 94.6

1999 6 40.5 40.5

1999 8 103.11 80.01

1999 9 53.34 40.5

2000 1 46.7 0

2000 3 93.41 0

2000 4 46.54 46.7

2000 5 46.7 93.41

2000 7 70.8 46.54

2000 8 46.54 46.7

2001 1 92.26 0

2001 2 118.38 0

2001 3 47.24 92.26

2001 4 256.7 118.38

2001 5 93.44 47.24

2001 6 22.44 256.7

2001 7 69.96 93.44

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 22.44

2001 9 92.67 69.96

29 rows selected.

Using the current row's value as the default:

SH@ prod> select year , week , sale , lag(sale , 2 , sale ) over(partition by product , country , region , year

2 order by week )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15 58.15

1998 2 29.39 29.39

1998 3 29.49 58.15

1998 4 29.49 29.39

1998 5 29.8 29.49

1998 6 58.78 29.49

1998 9 58.78 29.8

1999 1 53.52 53.52

1999 3 94.6 94.6

1999 4 40.5 53.52

1999 5 80.01 94.6

1999 6 40.5 40.5

1999 8 103.11 80.01

1999 9 53.34 40.5

2000 1 46.7 46.7

2000 3 93.41 93.41

2000 4 46.54 46.7

2000 5 46.7 93.41

2000 7 70.8 46.54

2000 8 46.54 46.7

2001 1 92.26 92.26

2001 2 118.38 118.38

2001 3 47.24 92.26

2001 4 256.7 118.38

2001 5 93.44 47.24

2001 6 22.44 256.7

2001 7 69.96 93.44

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 22.44

2001 9 92.67 69.96

29 rows selected.

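LEAD, LAG and gaps in the data

LAG(sale , 10) returns the value 10 rows back, whereas what I actually want is the value from 10 weeks earlier. If weeks are missing in between, the two are not the same and the result is seriously wrong.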
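One way to sidestep the gap problem, sketched here on made-up data rather than on sales_fact, is to define the window by value with RANGE instead of by row count. The weekly_demo name and its values are assumptions for illustration only; week 3 is deliberately missing, and an offset of 2 is used instead of 10 to keep the data small.

```sql
-- Hypothetical data with a gap: there is no row for week 3.
with weekly_demo as (
  select 1 as wk, 10 as sale from dual union all
  select 2, 20 from dual union all
  select 4, 40 from dual union all
  select 5, 50 from dual
)
select wk, sale,
       -- physical offset: the value two ROWS back, whatever week that is
       lag(sale, 2) over (order by wk) as two_rows_back,
       -- logical offset: only a row whose wk equals the current wk - 2 qualifies
       max(sale) over (order by wk
                       range between 2 preceding and 2 preceding) as two_weeks_back
from weekly_demo
order by wk;
```

For wk = 4, two_rows_back picks up the week 1 value while two_weeks_back picks up week 2. For wk = 5 the logical version yields NULL because week 3 does not exist. MAX is used only because the window holds at most one row here; any aggregate that passes a single value through would do.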

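FIRST_VALUE and LAST_VALUE

Combined with a suitable ORDER BY, these two functions can return the maximum and minimum values.

FIRST_VALUE returns the first value in the window. IGNORE NULLS skips NULLs: if the first value is NULL, the first non-NULL value is returned instead.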

SH@ prod> select year , week , sale , first_value(sale ignore nulls) over(partition by product , country , region , year

2 order by week

3 rows between unbounded preceding and unbounded following )

4 former_sale

5 from sales_fact

6 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

7 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15 58.15

1998 2 29.39 58.15

1998 3 29.49 58.15

1998 4 29.49 58.15

1998 5 29.8 58.15

1998 6 58.78 58.15

1998 9 58.78 58.15

1999 1 53.52 53.52

1999 3 94.6 53.52

1999 4 40.5 53.52

1999 5 80.01 53.52

1999 6 40.5 53.52

1999 8 103.11 53.52

1999 9 53.34 53.52

2000 1 46.7 46.7

2000 3 93.41 46.7

2000 4 46.54 46.7

2000 5 46.7 46.7

2000 7 70.8 46.7

2000 8 46.54 46.7

2001 1 92.26 92.26

2001 2 118.38 92.26

2001 3 47.24 92.26

2001 4 256.7 92.26

2001 5 93.44 92.26

2001 6 22.44 92.26

2001 7 69.96 92.26

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 92.26

2001 9 92.67 92.26

29 rows selected.

LAST_VALUE returns the last value in the window. RESPECT NULLS keeps NULLs: if the last value is NULL, NULL is returned.

SH@ prod> select year , week , sale , last_value(sale respect nulls) over(partition by product , country , region , year

2 order by week

3 rows between unbounded preceding and unbounded following )

4 former_sale

5 from sales_fact

6 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

7 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15 58.78

1998 2 29.39 58.78

1998 3 29.49 58.78

1998 4 29.49 58.78

1998 5 29.8 58.78

1998 6 58.78 58.78

1998 9 58.78 58.78

1999 1 53.52 53.34

1999 3 94.6 53.34

1999 4 40.5 53.34

1999 5 80.01 53.34

1999 6 40.5 53.34

1999 8 103.11 53.34

1999 9 53.34 53.34

2000 1 46.7 46.54

2000 3 93.41 46.54

2000 4 46.54 46.54

2000 5 46.7 46.54

2000 7 70.8 46.54

2000 8 46.54 46.54

2001 1 92.26 92.67

2001 2 118.38 92.67

2001 3 47.24 92.67

2001 4 256.7 92.67

2001 5 93.44 92.67

2001 6 22.44 92.67

2001 7 69.96 92.67

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 92.67

2001 9 92.67 92.67

29 rows selected.

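NTH_VALUE: accessing an arbitrary row of the window

FIRST_VALUE(sale) is equivalent to NTH_VALUE(sale , 1) or, written out in full, NTH_VALUE(sale , 1) from first respect nulls.

Combined with an ORDER BY, it can return the n-th largest or n-th smallest value.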

SH@ prod> select year , week , sale , nth_value(sale , 1 ) from last ignore nulls over(partition by product , country , region , year

2 order by week

3 rows between unbounded preceding and unbounded following )

4 former_sale

5 from sales_fact

6 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

7 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15 58.78

1998 2 29.39 58.78

1998 3 29.49 58.78

1998 4 29.49 58.78

1998 5 29.8 58.78

1998 6 58.78 58.78

1998 9 58.78 58.78

1999 1 53.52 53.34

1999 3 94.6 53.34

1999 4 40.5 53.34

1999 5 80.01 53.34

1999 6 40.5 53.34

1999 8 103.11 53.34

1999 9 53.34 53.34

2000 1 46.7 46.54

2000 3 93.41 46.54

2000 4 46.54 46.54

2000 5 46.7 46.54

2000 7 70.8 46.54

2000 8 46.54 46.54

2001 1 92.26 92.67

2001 2 118.38 92.67

2001 3 47.24 92.67

2001 4 256.7 92.67

2001 5 93.44 92.67

2001 6 22.44 92.67

2001 7 69.96 92.67

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 92.67

2001 9 92.67 92.67

29 rows selected.
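As a sketch of the "n-th largest" use, the query below sorts by sale in descending order and asks for the second row of the window. The nth_demo data set is an assumption for illustration; its values are borrowed from the 2001 rows above.

```sql
-- Hypothetical inline data; values borrowed from the 2001 rows of the output above.
with nth_demo as (
  select 2001 as yr, 1 as wk, 92.26 as sale from dual union all
  select 2001, 2, 118.38 from dual union all
  select 2001, 3, 47.24 from dual union all
  select 2001, 4, 256.7 from dual
)
select yr, wk, sale,
       nth_value(sale, 2) over (
         partition by yr
         order by sale desc
         -- the window must span the whole partition, otherwise early rows
         -- would not yet "see" a second row
         rows between unbounded preceding and unbounded following
       ) as second_highest
from nth_demo
order by yr, wk;
```

NTH_VALUE counts rows, not distinct values: if the two highest sales tie, the second row carries the same value as the first.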

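RANK (no windowing clause; operates on the whole partition)

An ORDER BY is mandatory: RANK assigns ranks according to the columns in the ORDER BY clause.

When there are ties, the ranks produced by RANK are not consecutive.

For example: 1 2 2 4 5 6 7 8 9. If two rows tie for second place, there is no third place.

Watch out for NULLs; NULLS LAST in the ORDER BY clause puts them at the end.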

SH@ prod> select year , week , sale , rank() over(partition by product , country , region , year

2 order by sale )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15 5    <-- there is no rank 3

1998 2 29.39 1

1998 3 29.49 2

1998 4 29.49 2

1998 5 29.8 4

1998 6 58.78 6

1998 9 58.78 6

1999 1 53.52 4

1999 3 94.6 6

1999 4 40.5 1

1999 5 80.01 5

1999 6 40.5 1

1999 8 103.11 7

1999 9 53.34 3

2000 1 46.7 3

2000 3 93.41 6

2000 4 46.54 1

2000 5 46.7 3

2000 7 70.8 5

2000 8 46.54 1

2001 1 92.26 5

2001 2 118.38 8

2001 3 47.24 3

2001 4 256.7 9

2001 5 93.44 7

2001 6 22.44 1

2001 7 69.96 4

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 2

2001 9 92.67 6

29 rows selected.

DENSE_RANK (unlike RANK, the ranks are always consecutive)

SH@ prod> select year , week , sale , dense_rank() over(partition by product , country , region , year

2 order by sale )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , week ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 1 58.15 4    <-- rank 3 exists

1998 2 29.39 1

1998 3 29.49 2

1998 4 29.49 2

1998 5 29.8 3

1998 6 58.78 5

1998 9 58.78 5

1999 1 53.52 3

1999 3 94.6 5

1999 4 40.5 1

1999 5 80.01 4

1999 6 40.5 1

1999 8 103.11 6

1999 9 53.34 2

2000 1 46.7 2

2000 3 93.41 4

2000 4 46.54 1

2000 5 46.7 2

2000 7 70.8 3

2000 8 46.54 1

2001 1 92.26 5

2001 2 118.38 8

2001 3 47.24 3

2001 4 256.7 9

2001 5 93.44 7

2001 6 22.44 1

2001 7 69.96 4

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 8 46.06 2

2001 9 92.67 6

29 rows selected.
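The difference is easiest to see with both functions side by side. The sketch below uses an inline copy of the seven 1998 rows from the outputs above; the sales_1998 name is hypothetical.

```sql
-- Inline copy of the 1998 rows shown above; sales_1998 is an illustrative name.
with sales_1998 as (
  select 1 as wk, 58.15 as sale from dual union all
  select 2, 29.39 from dual union all
  select 3, 29.49 from dual union all
  select 4, 29.49 from dual union all
  select 5, 29.8  from dual union all
  select 6, 58.78 from dual union all
  select 9, 58.78 from dual
)
select wk, sale,
       rank()       over (order by sale) as rnk,        -- leaves gaps after ties
       dense_rank() over (order by sale) as dense_rnk   -- never leaves gaps
from sales_1998
order by sale, wk;
```

The two 29.49 rows tie at 2 in both columns; the next row (29.8) gets 4 from RANK but 3 from DENSE_RANK, in line with the two outputs above.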

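ROW_NUMBER (no windowing clause; non-deterministic)

Assigns an increasing number to each row in the partition. If rows have equal values in the ORDER BY columns, their relative order is arbitrary.

SH@ prod> select year , week , sale , row_number() over(partition by product , country , region , year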

2 order by sale )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , sale ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 2 29.39 1

1998 4 29.49 2

1998 3 29.49 3

1998 5 29.8 4

1998 1 58.15 5

1998 6 58.78 6

1998 9 58.78 7

1999 4 40.5 1

1999 6 40.5 2

1999 9 53.34 3

1999 1 53.52 4

1999 5 80.01 5

1999 3 94.6 6

1999 8 103.11 7

2000 4 46.54 1

2000 8 46.54 2

2000 5 46.7 3

2000 1 46.7 4

2000 7 70.8 5

2000 3 93.41 6

2001 6 22.44 1

2001 8 46.06 2

2001 3 47.24 3

2001 7 69.96 4

2001 1 92.26 5

2001 9 92.67 6

2001 5 93.44 7

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 2 118.38 8

2001 4 256.7 9

29 rows selected.
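If a repeatable numbering is needed, a common remedy is to add a tie-breaking column to the ORDER BY. The following is a sketch on an inline copy of the 1998 rows; the sales_1998 name and the choice of wk as tie-breaker are assumptions for illustration.

```sql
-- Inline copy of the 1998 rows shown above; sales_1998 is an illustrative name.
with sales_1998 as (
  select 1 as wk, 58.15 as sale from dual union all
  select 2, 29.39 from dual union all
  select 3, 29.49 from dual union all
  select 4, 29.49 from dual union all
  select 5, 29.8  from dual union all
  select 6, 58.78 from dual union all
  select 9, 58.78 from dual
)
select wk, sale,
       row_number() over (order by sale)     as rn_arbitrary,  -- order among ties is not guaranteed
       row_number() over (order by sale, wk) as rn_stable      -- wk breaks the ties
from sales_1998
order by sale, wk;
```

Because (sale, wk) is unique in this data, rn_stable is fully determined, whereas rn_arbitrary may number the two 29.49 rows (and the two 58.78 rows) in either order.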

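Ratio_to_report (ratio of the current row's value to the partition total)

This function supports neither ORDER BY nor a windowing clause.

The following query computes each week's share of the yearly sales and of the product's total sales.

SH@ prod> select year , week , sale ,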

2 trunc(100* ratio_to_report(sale) over(partition by year ) , 2) sales_yr ,

3 trunc(100* ratio_to_report(sale) over() , 2 ) sales_prod

4 from sales_fact

5 where country in ('Australia') and product = 'Xtend Memory' and week < 10

6 order by year , week ;

YEAR WEEK SALE SALES_YR SALES_PROD

---------- ---------- ---------- ---------- ----------

1998 1 58.15 19.78 2.98

1998 2 29.39 10 1.5

1998 3 29.49 10.03 1.51

1998 4 29.49 10.03 1.51

1998 5 29.8 10.14 1.52

1998 6 58.78 20 3.01

1998 9 58.78 20 3.01

1999 1 53.52 11.49 2.74

1999 3 94.6 20.31 4.85

1999 4 40.5 8.69 2.07

1999 5 80.01 17.18 4.1

1999 6 40.5 8.69 2.07

1999 8 103.11 22.14 5.28

1999 9 53.34 11.45 2.73

2000 1 46.7 13.31 2.39

2000 3 93.41 26.63 4.79

2000 4 46.54 13.27 2.38

2000 5 46.7 13.31 2.39

2000 7 70.8 20.18 3.63

2000 8 46.54 13.27 2.38

2001 1 92.26 10.99 4.73

2001 2 118.38 14.1 6.07

2001 3 47.24 5.62 2.42

2001 4 256.7 30.59 13.16

2001 5 93.44 11.13 4.79

2001 6 22.44 2.67 1.15

2001 7 69.96 8.33 3.58

YEAR WEEK SALE SALES_YR SALES_PROD

---------- ---------- ---------- ---------- ----------

2001 8 46.06 5.48 2.36

2001 9 92.67 11.04 4.75

29 rows selected.
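RATIO_TO_REPORT can be thought of as shorthand for dividing the value by an analytic SUM over the same partition. The sketch below shows both forms next to each other on an inline copy of the year 2000 rows; the sales_2000 name is hypothetical.

```sql
-- Inline copy of the year 2000 rows shown above; sales_2000 is an illustrative name.
with sales_2000 as (
  select 1 as wk, 46.7 as sale from dual union all
  select 3, 93.41 from dual union all
  select 4, 46.54 from dual union all
  select 5, 46.7  from dual union all
  select 7, 70.8  from dual union all
  select 8, 46.54 from dual
)
select wk, sale,
       trunc(100 * ratio_to_report(sale) over (), 2)  as pct_builtin,
       trunc(100 * (sale / sum(sale) over ()), 2)     as pct_manual   -- same ratio by hand
from sales_2000
order by wk;
```

The empty OVER () makes the whole result set a single partition, which here plays the role of partition by year in the query above.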

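Percent_rank (what percentage of rows rank ahead)

It returns the relative percentile position of the current row's rank.

For example, telling someone you came 10th may not sound impressive, but 10th out of 100000 puts you in the top 1/10000, which is remarkable.

Its relationship to RANK is:

PERCENT_RANK = (RANK - 1) / (N - 1), where N is the number of rows in the partition.

RANK - 1 is the number of rows ranked ahead of the current row.

N - 1 is the number of rows other than the current row.

Put together, it is the fraction of the other rows that rank ahead of the current row.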
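The formula can be checked directly by computing it from RANK and COUNT next to the built-in function. This is a sketch on an inline copy of the year 2000 rows; the sales_2000 name is hypothetical, and the CASE guard for a single-row partition is an addition made here to avoid dividing by zero.

```sql
-- Inline copy of the year 2000 rows from the outputs in this article; sales_2000 is illustrative.
with sales_2000 as (
  select 1 as wk, 46.7 as sale from dual union all
  select 3, 93.41 from dual union all
  select 4, 46.54 from dual union all
  select 5, 46.7  from dual union all
  select 7, 70.8  from dual union all
  select 8, 46.54 from dual
)
select wk, sale,
       percent_rank() over (order by sale) as pr_builtin,
       case
         when count(*) over () = 1 then 0              -- single row: avoid 0 / 0
         else (rank() over (order by sale) - 1)
              / (count(*) over () - 1)                 -- (RANK - 1) / (N - 1)
       end as pr_formula
from sales_2000
order by sale, wk;
```

With six rows, the tied 46.7 rows have rank 3, so both columns give (3 - 1) / (6 - 1) = 0.4 for them.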

SH@ prod> select year , week , sale , rank() over(partition by product , country , region , year

2 order by sale )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , sale ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 2 29.39 1

1998 4 29.49 2

1998 3 29.49 2

1998 5 29.8 4

1998 1 58.15 5

1998 6 58.78 6

1998 9 58.78 6

1999 4 40.5 1

1999 6 40.5 1

1999 9 53.34 3

1999 1 53.52 4

1999 5 80.01 5

1999 3 94.6 6

1999 8 103.11 7

2000 4 46.54 1

2000 8 46.54 1

2000 5 46.7 3

2000 1 46.7 3

2000 7 70.8 5

2000 3 93.41 6

2001 6 22.44 1

2001 8 46.06 2

2001 3 47.24 3

2001 7 69.96 4

2001 1 92.26 5

2001 9 92.67 6

2001 5 93.44 7

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 2 118.38 8

2001 4 256.7 9

29 rows selected.

SH@ prod> select year , week , sale , 100*percent_rank() over(partition by product , country , region , year

2 order by sale )

3 former_sale

4 from sales_fact

5 where country in ( 'Australia') and product = 'Xtend Memory' and week < 10

6 order by product , country , year , sale ;

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

1998 2 29.39 0

1998 4 29.49 16.6666667

1998 3 29.49 16.6666667

1998 5 29.8 50

1998 1 58.15 66.6666667

1998 6 58.78 83.3333333

1998 9 58.78 83.3333333

1999 4 40.5 0

1999 6 40.5 0

1999 9 53.34 33.3333333

1999 1 53.52 50

1999 5 80.01 66.6666667

1999 3 94.6 83.3333333

1999 8 103.11 100

2000 4 46.54 0

2000 8 46.54 0

2000 5 46.7 40

2000 1 46.7 40

2000 7 70.8 80

2000 3 93.41 100

2001 6 22.44 0

2001 8 46.06 12.5

2001 3 47.24 25

2001 7 69.96 37.5

2001 1 92.26 50

2001 9 92.67 62.5

2001 5 93.44 75

YEAR WEEK SALE FORMER_SALE

---------- ---------- ---------- -----------

2001 2 118.38 87.5

2001 4 256.7 100

29 rows selected.

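Percentile_cont (roughly: the value that sits at a given percentile)

Put another way: suppose a value were inserted into the partition so that its percent_rank is exactly N%; the function returns that value.

If some row has a percent_rank exactly equal to N, the result is that row's value. If no row matches, the result is obtained by linear interpolation between the two adjacent values.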

SH@ prod> select year , week , sale ,

2 percentile_cont(0.5) within group(order by sale desc )over(partition by year) pc ,

3 percent_rank() over( partition by year order by sale desc ) pr

4 from sales_fact

5 where country in ('Australia') and product = 'Xtend Memory' and week < 11 ;

YEAR WEEK SALE PC PR

---------- ---------- ---------- ---------- ----------

1998 10 117.76 43.975 0

1998 9 58.78 43.975 .142857143

1998 6 58.78 43.975 .142857143

1998 1 58.15 43.975 .428571429

1998 5 29.8 43.975 .571428571

1998 3 29.49 43.975 .714285714

1998 4 29.49 43.975 .714285714

1998 2 29.39 43.975 1

1999 8 103.11 62.76 0

1999 3 94.6 62.76 .142857143

1999 5 80.01 62.76 .285714286

1999 10 72 62.76 .428571429

1999 1 53.52 62.76 .571428571

1999 9 53.34 62.76 .714285714

1999 6 40.5 62.76 .857142857

1999 4 40.5 62.76 .857142857

2000 3 93.41 46.7 0

2000 7 70.8 46.7 .2

2000 5 46.7 46.7 .4

2000 1 46.7 46.7 .4

2000 4 46.54 46.7 .8

2000 8 46.54 46.7 .8

2001 4 256.7 81.11 0

2001 2 118.38 81.11 .111111111

2001 5 93.44 81.11 .222222222

2001 9 92.67 81.11 .333333333

2001 1 92.26 81.11 .444444444

YEAR WEEK SALE PC PR

---------- ---------- ---------- ---------- ----------

2001 7 69.96 81.11 .555555556

2001 10 69.05 81.11 .666666667

2001 3 47.24 81.11 .777777778

2001 8 46.06 81.11 .888888889

2001 6 22.44 81.11 1

32 rows selected.
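The 1998 figure of 43.975 can be reproduced by hand. For percentile P over N sorted rows, the target position is RN = 1 + P * (N - 1); with P = 0.5 and N = 8 that is 4.5, halfway between the 4th and 5th rows in descending order (58.15 and 29.8). The sketch below compares the built-in with that manual calculation; the s1998 and ranked names are hypothetical, and the manual branch uses a plain average, which is only valid for P = 0.5.

```sql
-- The eight 1998 sale values (week < 11) copied from the output above.
with s1998 as (
  select 117.76 as sale from dual union all
  select 58.78 from dual union all
  select 58.78 from dual union all
  select 58.15 from dual union all
  select 29.8  from dual union all
  select 29.49 from dual union all
  select 29.49 from dual union all
  select 29.39 from dual
),
ranked as (
  select sale,
         row_number() over (order by sale desc) as rn,  -- position in sort order
         count(*)     over ()                   as n    -- N, the number of rows
  from s1998
)
select b.pc_builtin, m.pc_manual
from (select percentile_cont(0.5) within group (order by sale desc) as pc_builtin
      from s1998) b
cross join
     (select avg(sale) as pc_manual   -- equal weights: valid for P = 0.5 only
      from ranked
      where rn in (floor(1 + 0.5 * (n - 1)), ceil(1 + 0.5 * (n - 1)))) m;
```

Both columns should come out as (58.15 + 29.8) / 2 = 43.975, the PC value shown for 1998 above. For other percentiles the two neighbouring values are weighted by their distance from RN rather than averaged.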

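Percentile_disc (much the same as Percentile_cont)

The difference is that the value it returns is always one that actually occurs in the partition.

If no row matches exactly, Percentile_disc does not interpolate. It takes the earlier of the two neighbouring values in the sort order (more precisely, the first value whose cumulative distribution reaches the requested percentile).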

SH@ prod> select year , week , sale ,

2 percentile_disc(0.5) within group(order by sale desc )over(partition by year) pc ,

3 percent_rank() over( partition by year order by sale desc ) pr

4 from sales_fact

5 where country in ('Australia') and product = 'Xtend Memory' and week < 11 ;

YEAR WEEK SALE PC PR

---------- ---------- ---------- ---------- ----------

1998 10 117.76 58.15 0

1998 9 58.78 58.15 .142857143

1998 6 58.78 58.15 .142857143

1998 1 58.15 58.15 .428571429

1998 5 29.8 58.15 .571428571

1998 3 29.49 58.15 .714285714

1998 4 29.49 58.15 .714285714

1998 2 29.39 58.15 1

1999 8 103.11 72 0

1999 3 94.6 72 .142857143

1999 5 80.01 72 .285714286

1999 10 72 72 .428571429

1999 1 53.52 72 .571428571

1999 9 53.34 72 .714285714

1999 6 40.5 72 .857142857

1999 4 40.5 72 .857142857

2000 3 93.41 46.7 0

2000 7 70.8 46.7 .2

2000 5 46.7 46.7 .4

2000 1 46.7 46.7 .4

2000 4 46.54 46.7 .8

2000 8 46.54 46.7 .8

2001 4 256.7 92.26 0

2001 2 118.38 92.26 .111111111

2001 5 93.44 92.26 .222222222

2001 9 92.67 92.26 .333333333

2001 1 92.26 92.26 .444444444

YEAR WEEK SALE PC PR

---------- ---------- ---------- ---------- ----------

2001 7 69.96 92.26 .555555556

2001 10 69.05 92.26 .666666667

2001 3 47.24 92.26 .777777778

2001 8 46.06 92.26 .888888889

2001 6 22.44 92.26 1

32 rows selected.

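NTILE (similar to building a histogram; no windowing clause)

Distributes the sorted rows evenly into the specified number of buckets and returns the bucket number. If the rows cannot be divided equally, the bucket sizes differ by at most one row.

In later processing, outliers can be removed by discarding the first or last bucket.

Note: rows are assigned by position, not by value.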

SH@ prod> select year , week , sale ,

2 ntile(10) over(order by sale ) group#

3 from sales_fact

4 where country in ('Australia') and product = 'Xtend Memory' and year = 1998 order by year , sale;

YEAR WEEK SALE GROUP#

---------- ---------- ---------- ----------

1998 50 28.76 1

1998 2 29.39 1

1998 4 29.49 1

1998 3 29.49 1

1998 5 29.8 2

1998 43 57.52 2

1998 35 57.52 2

1998 40 57.52 2

1998 46 57.52 3

1998 27 57.52 3

1998 45 57.52 3

1998 44 57.52 3

1998 47 57.72 4

1998 29 57.72 4

1998 28 57.72 4

1998 1 58.15 4

1998 41 58.32 5

1998 51 58.32 5

1998 14 58.78 5

1998 9 58.78 5

1998 15 58.78 6

1998 17 58.78 6

1998 6 58.78 6

1998 19 58.98 6

1998 21 59.6 7

1998 12 59.6 7

1998 52 86.38 7

YEAR WEEK SALE GROUP#

---------- ---------- ---------- ----------

1998 34 115.44 8

1998 39 115.84 8

1998 42 115.84 8

1998 38 115.84 9

1998 23 117.56 9

1998 18 117.56 9

1998 26 117.56 10

1998 10 117.76 10

1998 48 172.56 10

36 rows selected.
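Discarding the first and last bucket might look like the following sketch. The readings data set and its values are invented for illustration. Because an analytic function cannot appear in a WHERE clause, the bucket number is computed in a subquery first.

```sql
-- Hypothetical data with one suspiciously low and one suspiciously high value.
with readings as (
  select 1 as id, 2 as val from dual union all
  select 2, 48 from dual union all
  select 3, 50 from dual union all
  select 4, 51 from dual union all
  select 5, 49 from dual union all
  select 6, 52 from dual union all
  select 7, 47 from dual union all
  select 8, 53 from dual union all
  select 9, 50 from dual union all
  select 10, 400 from dual
),
bucketed as (
  select id, val,
         ntile(5) over (order by val) as bucket   -- 10 rows, 5 buckets of 2 rows each
  from readings
)
select id, val, bucket
from bucketed
where bucket not in (1, 5)   -- drop the lowest and the highest bucket
order by val;
```

Since NTILE assigns rows by position, this also drops the perfectly ordinary neighbours that share a bucket with the extreme values (47 and 53 here), so it is a blunt trimming tool rather than a true outlier test.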

Stddev: standard deviation (the square root of the variance; supports windowing)

SH@ prod> select year , week , sale ,

2 stddev(sale) over(

3 partition by product , country , region , year

4 order by sale desc

5 rows between 2 preceding and 2 following ) stddv

6 from sales_fact

7 where country in ('Australia') and product = 'Xtend Memory' and week < 10

8 order by year , week ;

YEAR WEEK SALE STDDV

---------- ---------- ---------- ----------

1998 1 58.15 15.8453416

1998 2 29.39 .057735027

1998 3 29.49 .178021534

1998 4 29.49 12.7945918

1998 5 29.8 15.815738

1998 6 58.78 .36373067

1998 9 58.78 14.3880654

1999 1 53.52 22.178931

1999 3 94.6 21.7319902

1999 4 40.5 7.46550065

1999 5 80.01 22.9761992

1999 6 40.5 7.41317746

1999 8 103.11 11.6825953

1999 9 53.34 16.1305511

2000 1 46.7 21.0022332

2000 3 93.41 23.3589605

2000 4 46.54 .092376043

2000 5 46.7 10.8139207

2000 7 70.8 22.4285538

2000 8 46.54 .092376043

2001 1 92.26 20.3811452

2001 2 118.38 78.5152276

2001 3 47.24 26.5077898

2001 4 256.7 87.947194

2001 5 93.44 71.309193

2001 6 22.44 13.9900965

2001 7 69.96 22.9124643

YEAR WEEK SALE STDDV

---------- ---------- ---------- ----------

2001 8 46.06 19.407678

2001 9 92.67 17.1409691

29 rows selected.

Listagg (concatenates the values of a partition in order; no windowing clause)

SH@ prod> col stddv for a60

SH@ prod> select year , week , sale ,

2 listagg(sale , ',')within group(order by sale desc) over(

3 partition by product , country , region , year ) stddv

4 from sales_fact

5 where country in ('Australia') and product = 'Xtend Memory' and week < 5

6 order by year , week ;

YEAR WEEK SALE STDDV

---------- ---------- ---------- ------------------------------------------------------------

1998 1 58.15 58.15 , 29.49 , 29.49 , 29.39

1998 2 29.39 58.15 , 29.49 , 29.49 , 29.39

1998 3 29.49 58.15 , 29.49 , 29.49 , 29.39

1998 4 29.49 58.15 , 29.49 , 29.49 , 29.39

1999 1 53.52 94.6 , 53.52 , 40.5

1999 3 94.6 94.6 , 53.52 , 40.5

1999 4 40.5 94.6 , 53.52 , 40.5

2000 1 46.7 93.41 , 46.7 , 46.54

2000 3 93.41 93.41 , 46.7 , 46.54

2000 4 46.54 93.41 , 46.7 , 46.54

2001 1 92.26 256.7 , 118.38 , 92.26 , 47.24

2001 2 118.38 256.7 , 118.38 , 92.26 , 47.24

2001 3 47.24 256.7 , 118.38 , 92.26 , 47.24

2001 4 256.7 256.7 , 118.38 , 92.26 , 47.24

14 rows selected.
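The analytic form above repeats the same list on every row of the partition. When one row per partition is enough, LISTAGG can also be used as an ordinary aggregate with group by. The following is a minimal sketch against the same sales_fact table; the alias sale_list is a hypothetical name.

```sql
-- Sketch: LISTAGG as a plain aggregate, one row per (product, country, region, year).
select year ,
       listagg(sale , ',') within group (order by sale desc) sale_list
from sales_fact
where country in ('Australia') and product = 'Xtend Memory' and week < 5
group by product , country , region , year
order by year ;
```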

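Effect of analytic functions on predicate pushing

A view that uses an analytic function limits predicate pushing into the view. The result of an analytic function is computed by referencing other rows, so trimming the data source before the function is evaluated could change the result. In the plan below, the predicates on the partitioning columns (country, product, region, year) are pushed down to the table scan, while week < 14 is applied only at the VIEW step, after the window has been computed.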

SH@ prod> create or replace view max_5_weeks_vw as

2 select country , product , region , year , week , sale ,

3 max(sale) over(

4 partition by product , country , region , year order by year , week

5 rows between 2 preceding and 2 following ) max_weeks_5

6 from sales_fact ;

View created.

SH@ prod> select year , week , sale , max_weeks_5 from max_5_weeks_vw

2 where country in ('Australia' ) and product = 'Xtend Memory'

3 and region = 'Australia' and year = 2000 and week < 14

4 order by year , week ;

YEAR WEEK SALE MAX_WEEKS_5

---------- ---------- ---------- -----------

2000 1 46.7 93.41

2000 3 93.41 93.41

2000 4 46.54 93.41

2000 5 46.7 93.41

2000 7 70.8 93.74

2000 8 46.54 93.74

2000 11 93.74 117.5

2000 12 46.54 117.67

2000 13 117.5 117.67

9 rows selected.

SH@ prod> explain plan for

2 select year , week , sale , max_weeks_5 from max_5_weeks_vw

3 where country in ('Australia' ) and product = 'Xtend Memory'

4 and region = 'Australia' and year = 2000 and week < 14

5 order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------

Plan hash value: 4167461139

--------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

--------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 90 | 5220 | 310 (1)| 00:00:04 |

|* 1 | VIEW | MAX_5_WEEKS_VW | 90 | 5220 | 310 (1)| 00:00:04 |

| 2 | WINDOW SORT | | 90 | 9450 | 310 (1)| 00:00:04 |

|* 3 | TABLE ACCESS FULL| SALES_FACT | 90 | 9450 | 309 (1)| 00:00:04 |

--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("WEEK"<14)

3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND

"REGION"='Australia' AND "YEAR"=2000)

Note

-----

- dynamic sampling used for this statement (level=2)

21 rows selected.

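For comparison, consider a view without an analytic function: all the predicates, including week < 14, are pushed straight into the view.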

SH@ prod> create or replace view max_5_weeks_vw1 as

2 select country , product , region , year , week , sale

3 from sales_fact ;

View created.

SH@ prod> explain plan for

2 select year , week , sale from max_5_weeks_vw1

3 where country in ('Australia' ) and product = 'Xtend Memory'

4 and region = 'Australia' and year = 2000 and week < 14

5 order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------

Plan hash value: 1978576542

---------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 105 | 310 (1)| 00:00:04 |

| 1 | SORT ORDER BY | | 1 | 105 | 310 (1)| 00:00:04 |

|* 2 | TABLE ACCESS FULL| SALES_FACT | 1 | 105 | 309 (1)| 00:00:04 |

---------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory' AND

"REGION"='Australia' AND "YEAR"=2000 AND "WEEK"<14)

Note

-----

- dynamic sampling used for this statement (level=2)

19 rows selected.
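To see why the optimizer cannot push week < 14 below the window, compare with a query that applies the filter before the analytic function is evaluated. The sketch below inlines the definition of max_5_weeks_vw (an assumption made for illustration). With the filter inside, rows with week >= 14 are no longer visible to max(), so values near the boundary can change. In the view output shown earlier, the 117.67 reported for weeks 12 and 13 does not appear in any row with week < 14, so it must come from a later week.

```sql
-- Sketch: the week filter is applied BEFORE the window is computed,
-- so the 5-row window can no longer look at weeks 14 and later.
select year , week , sale ,
       max(sale) over(
         partition by product , country , region , year
         order by year , week
         rows between 2 preceding and 2 following ) max_weeks_5
from sales_fact
where country in ('Australia') and product = 'Xtend Memory'
  and region = 'Australia' and year = 2000 and week < 14
order by year , week ;
```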

Using analytic functions in dynamic SQL

SH@ prod> create or replace procedure analytic_dynamic_prc ( part_col_string varchar2 , v_country varchar2 , v_product varchar2 )

2 is

3 type numtab is table of number(18 , 2) index by binary_integer ;

4 l_year numtab ;

5 l_week numtab ;

6 l_sale numtab ;

7 l_rank numtab ;

8 l_sql_string varchar2(512) ;

9 begin

10 l_sql_string := 'select * from ( select year , week , sale , rank() over( partition by ' || part_col_string

11 || ' order by sale desc ) sales_rank from sales_fact where country in ('

12 || chr(39) || v_country || chr(39)

13 || ') and product = ' || chr(39) || v_product || chr(39)

14 || ' order by product , country , year , week ) where sales_rank <= 10 order by 1,4' ;

15 execute immediate l_sql_string bulk collect into l_year , l_week , l_sale , l_rank ;

16 for i in 1..l_year.count loop

17 dbms_output.put_line( l_year(i) || '|' || l_week(i) || '|' || l_sale(i) || '|' || l_rank(i) ) ;

18 end loop ;

19 end ;

20 /

Procedure created.

SH@ prod> exec analytic_dynamic_prc('product , country , region' , 'Australia' , 'Xtend Memory' ) ;

1998 | 48 | 172.56 | 9

2000 | 46 | 246.74 | 3

2000 | 21 | 187.48 | 5

2000 | 43 | 179.12 | 7

2000 | 34 | 178.52 | 8

2001 | 16 | 278.44 | 1

2001 | 4 | 256.7 | 2

2001 | 21 | 233.7 | 4

2001 | 48 | 182.96 | 6

2001 | 30 | 162.91 | 10

2001 | 14 | 162.91 | 10

PL/SQL procedure successfully completed.
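The procedure above builds the literals for country and product by concatenation. A variant that passes them as bind variables is sketched below. The procedure name analytic_dynamic_bind_prc and the bind names :b_country and :b_product are hypothetical. The partition column list cannot be bound, so it is still concatenated and should only ever come from trusted input.

```sql
create or replace procedure analytic_dynamic_bind_prc (
  part_col_string varchar2 , v_country varchar2 , v_product varchar2 )
is
  type numtab is table of number(18 , 2) index by binary_integer ;
  l_year numtab ;
  l_week numtab ;
  l_sale numtab ;
  l_rank numtab ;
  l_sql_string varchar2(512) ;
begin
  -- Column names cannot be bind variables, so the partition list is concatenated.
  l_sql_string :=
       'select * from ( select year , week , sale , rank() over( partition by '
    || part_col_string
    || ' order by sale desc ) sales_rank from sales_fact'
    || ' where country = :b_country and product = :b_product )'
    || ' where sales_rank <= 10 order by 1 , 4' ;
  -- Values are supplied through USING, in the order the placeholders appear.
  execute immediate l_sql_string
    bulk collect into l_year , l_week , l_sale , l_rank
    using v_country , v_product ;
  for i in 1..l_year.count loop
    dbms_output.put_line( l_year(i) || '|' || l_week(i) || '|' || l_sale(i) || '|' || l_rank(i) ) ;
  end loop ;
end ;
/
```

It would be called the same way as the original, for example exec analytic_dynamic_bind_prc('product , country , region' , 'Australia' , 'Xtend Memory').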

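"Nesting" analytic functions

Analytic functions cannot be nested directly, but the same effect can be achieved with a subquery.

select year , week , top_sale_year ,

lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer

from (

select distinct

-- max() cannot be used in place of first_value here: the column returned differs from the column used for ordering

first_value(year) over (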

partition by product , country , region , year

order by sale desc

rows between unbounded preceding and unbounded following ) year ,

first_value(week) over (

partition by product , country , region , year

order by sale desc

rows between unbounded preceding and unbounded following ) week ,

first_value(sale) over (

partition by product , country , region , year

order by sale desc

rows between unbounded preceding and unbounded following ) top_sale_year

from sales_fact

where country in ('Australia') and product = 'Xtend Memory' )

order by year , week ;

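Execution result. Because the outer window is ordered by year desc, lag() here returns the top sale of the following year, so the last year has no value.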

SH@ prod> select year , week , top_sale_year ,

2 lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer

3 from (

4 select distinct

5 first_value(year) over (

6 partition by product , country , region , year

7 order by sale desc

8 rows between unbounded preceding and unbounded following ) year ,

9 first_value(week) over (

10 partition by product , country , region , year

11 order by sale desc

12 rows between unbounded preceding and unbounded following ) week ,

13 first_value(sale) over (

14 partition by product , country , region , year

15 order by sale desc

16 rows between unbounded preceding and unbounded following ) top_sale_year

17 from sales_fact

18 where country in ('Australia') and product = 'Xtend Memory' )

19 order by year , week ;

YEAR WEEK TOP_SALE_YEAR PREV_TOP_SALE_YER

---------- ---------- ------------- -----------------

1998 48 172.56 148.12

1999 17 148.12 246.74

2000 46 246.74 278.44

2001 16 278.44
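The same per-year top sale can also be sketched without first_value and distinct, using the aggregate form max(...) keep (dense_rank first order by ...) and applying lag() to the aggregate in the same query block. This is an alternative technique, not the one used in the article. The alias top_week is hypothetical, and if two weeks tie for the top sale this version returns the later week, whereas first_value may return either.

```sql
-- Sketch: one row per year via GROUP BY, then LAG over the aggregated value.
select year ,
       max(week) keep (dense_rank first order by sale desc) top_week ,
       max(sale) top_sale_year ,
       lag( max(sale) ) over ( order by year desc ) prev_top_sale_yer
from sales_fact
where country in ('Australia') and product = 'Xtend Memory'
group by product , country , region , year
order by year ;
```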

Parallelism with analytic functions

SH@ prod> explain plan for

2 select year , week , top_sale_year ,

3 lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer

4 from (

5 select distinct

6 first_value(year) over (

7 partition by product , country , region , year

8 order by sale desc

9 rows between unbounded preceding and unbounded following ) year ,

10 first_value(week) over (

11 partition by product , country , region , year

12 order by sale desc

13 rows between unbounded preceding and unbounded following ) week ,

14 first_value(sale) over (

15 partition by product , country , region , year

16 order by sale desc

17 rows between unbounded preceding and unbounded following ) top_sale_year

18 from sales_fact

19 where country in ('Australia') and product = 'Xtend Memory' )

20 order by year , week ;

Explained.

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------

Plan hash value: 2124823565

-------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 197 | 7683 | 313 (2)| 00:00:04 |

| 1 | SORT ORDER BY | | 197 | 7683 | 313 (2)| 00:00:04 |

| 2 | WINDOW SORT | | 197 | 7683 | 313 (2)| 00:00:04 |

| 3 | VIEW | | 197 | 7683 | 311 (1)| 00:00:04 |

| 4 | HASH UNIQUE | | 197 | 20685 | 311 (1)| 00:00:04 |

| 5 | WINDOW SORT | | 197 | 20685 | 311 (1)| 00:00:04 |

|* 6 | TABLE ACCESS FULL| SALES_FACT | 197 | 20685 | 309 (1)| 00:00:04 |

-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

6 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')

Note

-----

- dynamic sampling used for this statement (level=2)

22 rows selected.

(Note that the DISTINCT operation is implemented with HASH UNIQUE rather than a sort.)

Now add a parallel hint to the statement above.

SH@ prod> explain plan for

2 select /*+ parallel(3)*/ year , week , top_sale_year ,

3 lag( top_sale_year ) over ( order by year desc ) prev_top_sale_yer

4 from (

5 select distinct

6 first_value(year) over (

7 partition by product , country , region , year

8 order by sale desc

9 rows between unbounded preceding and unbounded following ) year ,

10 first_value(week) over (

11 partition by product , country , region , year

12 order by sale desc

13 rows between unbounded preceding and unbounded following ) week ,

14 first_value(sale) over (

15 partition by product , country , region , year

16 order by sale desc

17 rows between unbounded preceding and unbounded following ) top_sale_year

18 from sales_fact

19 where country in ('Australia') and product = 'Xtend Memory' )

20 order by year , week ;

Explained.

SH@ prod> set linesize 180

SH@ prod> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Plan hash value: 2880616722

----------------------------------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |

----------------------------------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 197 | 7683 | 119 (5)| 00:00:02 | | | |

| 1 | SORT ORDER BY | | 197 | 7683 | 119 (5)| 00:00:02 | | | |

| 2 | WINDOW BUFFER | | 197 | 7683 | 119 (5)| 00:00:02 | | | |

| 3 | PX COORDINATOR | | | | | | | | |

| 4 | PX SEND QC (ORDER) | :TQ10003 | 197 | 7683 | 119 (5)| 00:00:02 | Q1,03 | P->S | QC (ORDER) |

| 5 | SORT ORDER BY | | 197 | 7683 | 119 (5)| 00:00:02 | Q1,03 | PCWP | |

| 6 | PX RECEIVE | | 197 | 7683 | 117 (3)| 00:00:02 | Q1,03 | PCWP | |

| 7 | PX SEND RANGE | :TQ10002 | 197 | 7683 | 117 (3)| 00:00:02 | Q1,02 | P->P | RANGE |

| 8 | VIEW | | 197 | 7683 | 117 (3)| 00:00:02 | Q1,02 | PCWP | |

| 9 | HASH UNIQUE | | 197 | 20685 | 117 (3)| 00:00:02 | Q1,02 | PCWP | |

| 10 | PX RECEIVE | | 197 | 20685 | 117 (3)| 00:00:02 | Q1,02 | PCWP | |

| 11 | PX SEND HASH | :TQ10001 | 197 | 20685 | 117 (3)| 00:00:02 | Q1,01 | P->P | HASH |

| 12 | WINDOW SORT | | 197 | 20685 | 117 (3)| 00:00:02 | Q1,01 | PCWP | |

| 13 | PX RECEIVE | | 197 | 20685 | 114 (0)| 00:00:02 | Q1,01 | PCWP | |

| 14 | PX SEND HASH | :TQ10000 | 197 | 20685 | 114 (0)| 00:00:02 | Q1,00 | P->P | HASH |

| 15 | PX BLOCK ITERATOR | | 197 | 20685 | 114 (0)| 00:00:02 | Q1,00 | PCWC | |

|* 16 | TABLE ACCESS FULL| SALES_FACT | 197 | 20685 | 114 (0)| 00:00:02 | Q1,00 | PCWP | |

----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

16 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')

Note

-----

- dynamic sampling used for this statement (level=2)

- Degree of Parallelism is 3 because of hint

33 rows selected.

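Advanced ranking functions:

[ ROW_NUMBER() | RANK() | DENSE_RANK() ] OVER (partition by xx order by xx)

1. row_number(): consecutive, strictly increasing numbers (1 2 3 4). Syntax: row_number() over (partition by xx order by xx)

-- Partition the student table by class (class_id), order by score descending within each class, and assign a position within the group.

-- Rows with equal scores are numbered in an arbitrary order unless a tie-breaker (for example the student number) is added to the order by.

select row_number() over(partition by class_id order by score desc) rn, t.* from student2016 t;

2. rank(): ranking with gaps. Equal values share the same rank and the following ranks are skipped (1 2 2 2 5). Syntax: rank() over (partition by xx order by xx)

select rank() over(partition by class_id order by score desc) rn, t.* from student2016 t;

3. dense_rank(): ranking without gaps. Equal values share the same rank and the next rank follows consecutively (1 2 2 2 3). Syntax: dense_rank() over (partition by xx order by xx)

select dense_rank() over(partition by class_id order by score desc) rn, t.* from student2016 t;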
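The three functions can be compared side by side on a small inline data set. The sketch below does not use the student2016 table. The rows are made up with dual purely for illustration, and student_id is a hypothetical column used as a tie-breaker for row_number().

```sql
-- Sketch: five made-up rows with a three-way tie on score.
with scores as (
  select 1 student_id , 90 score from dual union all
  select 2 , 85 from dual union all
  select 3 , 85 from dual union all
  select 4 , 85 from dual union all
  select 5 , 70 from dual
)
select student_id , score ,
       row_number() over (order by score desc , student_id) rn ,
       rank()       over (order by score desc) rk ,
       dense_rank() over (order by score desc) drk
from scores
order by score desc , student_id ;
-- student_id  score  rn  rk  drk
--          1     90   1   1    1
--          2     85   2   2    2
--          3     85   3   2    2
--          4     85   4   2    2
--          5     70   5   5    3
```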

----------------------------------------------------------------------------------------------------------------------------

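Advanced grouping functions

group by rollup(a,b,c)

select a,b,c,sum(d) from test group by rollup(a,b,c)

rollup groups by the listed columns and then drops one column at a time from right to left, until no column is left (that is, a grand total over the whole table).

A rollup with n arguments produces n+1 groupings.

That is: group by a,b,c union all group by a,b union all group by a union all the grand total of test.

----------------------------------------------------------------------------------

group by cube(a,b,c)

A cube with n arguments produces 2^n groupings.

That is: abc, ab, ac, bc, a, b, c, and finally the grand total.

----------------------------------------------------------------------------------

group by grouping sets(a,b)

Returns only the result of grouping by a together with the result of grouping by b.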
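One way to make the groupings above concrete is to spell rollup and cube out as explicit grouping sets. The sketch below assumes the test table with columns a, b, c, d used in the statement above. Each pair of queries should produce the same groupings.

```sql
-- rollup(a, b, c): n + 1 = 4 groupings
select a , b , c , sum(d) from test group by rollup(a , b , c) ;
select a , b , c , sum(d) from test
group by grouping sets ( (a , b , c) , (a , b) , (a) , () ) ;

-- cube(a, b): 2^n = 4 groupings
select a , b , sum(d) from test group by cube(a , b) ;
select a , b , sum(d) from test
group by grouping sets ( (a , b) , (a) , (b) , () ) ;
```

The empty set () stands for the grand total over the whole table.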

-- Create the sales table

create table sales_tab(

year_id number not null,

month_id number not null,

day_id number not null,

sales_value number(10,2) not null

);

-- Insert data (1000 random rows)

insert into sales_tab

select trunc(dbms_random.value(low=>2010,high=>2012)) as year_id,

trunc(dbms_random.value(low=>1,high=>13)) as month_id,

trunc(dbms_random.value(low=>1,high=>32)) as day_id,

round(dbms_random.value(low=>1,high=>100)) as sales_value

from dual

connect by level <=1000;

-- Query the data with plain group by
-- (the row counts in the comments are from the author's run; the data is random)

select sum(t.sales_value) from SALES_TAB t; -- 1 row

select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales from SALES_TAB t group by t.year_id,t.month_id,t.day_id
order by t.year_id,t.month_id,t.day_id desc; -- 540 rows

select t.year_id,t.month_id,sum(t.sales_value) sales from SALES_TAB t group by t.year_id,t.month_id
order by t.year_id,t.month_id desc; -- 24 rows

select t.year_id,sum(t.sales_value) sales from SALES_TAB t group by t.year_id
order by t.year_id desc; -- 2 rows

-- Use the advanced grouping functions

-- group by rollup(a,b,c)

select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales from SALES_TAB t group by rollup(t.year_id,t.month_id,t.day_id)
order by t.year_id,t.month_id,t.day_id; -- 567 rows = 1+540+24+2 from the queries above

-- group by cube(a,b,c)

select t.year_id,t.month_id,t.day_id,sum(t.sales_value) sales from SALES_TAB t group by cube(t.year_id,t.month_id,t.day_id)
order by t.year_id,t.month_id,t.day_id;

-- group by cube(a,b) compared with group by grouping sets(a,b)

select t.year_id,t.month_id,sum(t.sales_value) sales from SALES_TAB t group by cube(t.year_id,t.month_id)
order by 1,2; -- 39 rows

select t.year_id,t.month_id,sum(t.sales_value) sales from SALES_TAB t group by grouping sets(t.year_id,t.month_id)
order by 1,2; -- 14 rows
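In the rollup and cube results, subtotal rows show NULL in the columns that were rolled up, which can be hard to tell apart at a glance. The grouping() function marks those rows explicitly. The following is a minimal sketch on the same SALES_TAB table; the aliases g_year and g_month are hypothetical names.

```sql
-- Sketch: grouping(col) returns 1 when col has been rolled up in that row, 0 otherwise.
select t.year_id , t.month_id ,
       grouping(t.year_id)  g_year ,
       grouping(t.month_id) g_month ,
       sum(t.sales_value) sales
from SALES_TAB t
group by rollup(t.year_id , t.month_id)
order by t.year_id , t.month_id ;
```

Rows with g_month = 1 and g_year = 0 are the per-year subtotals, and the single row with both flags set to 1 is the grand total.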

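To make it easier to study and test, all the examples below are built in Oracle's bundled Scott schema.

Note: where a section heading includes order by (shown in red in the original), order by is mandatory when using that function.

1. rank()/dense_rank() over(partition by ... order by ...)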

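Suppose a customer asks for the details of the highest-paid employee in each department. Anyone with some Oracle experience can write the following SQL statement:

select e.ename, e.job, e.sal, e.deptno
  from scott.emp e,
       (select e.deptno, max(e.sal) sal from scott.emp e group by e.deptno) me
 where e.deptno = me.deptno
   and e.sal = me.sal;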

Having met the requirement, it is a good habit to ask whether there is another way. There is: the rank() over(partition by ...) or dense_rank() over(partition by ...) syntax from this section's heading. The two SQL statements are:

select e.ename, e.job, e.sal, e.deptno
  from (select e.ename,
               e.job,
               e.sal,
               e.deptno,
               rank() over(partition by e.deptno order by e.sal desc) rank
          from scott.emp e) e
 where e.rank = 1;

select e.ename, e.job, e.sal, e.deptno
  from (select e.ename,
               e.job,
               e.sal,
               e.deptno,
               dense_rank() over(partition by e.deptno order by e.sal desc) rank
          from scott.emp e) e
 where e.rank = 1;

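Why do these return the same result as the first statement? Here is a closer look at the syntax rank()/dense_rank() over(partition by e.deptno order by e.sal desc):

over: defines the set of rows on which the function operates.

partition by e.deptno: partition the rows by department number.

order by e.sal desc: order by salary from highest to lowest (order by is mandatory with rank()/dense_rank(); omitting it is an error).

rank()/dense_rank(): assign the ranking.

Taken together, the statement ranks the employees within each department by salary from highest to lowest. The rank is a number that increases from 1 (the smallest value is always 1).

So how do rank() and dense_rank() differ? rank() leaves gaps: if two rows tie for rank 1, the next row gets rank 3. dense_rank() is consecutive: if two rows tie for rank 1, the next row still gets rank 2.

Exercise: query the details of the lowest-paid employee in each department.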
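One possible answer to the exercise, as a sketch: the only change from the queries above is that the window is ordered by salary ascending, so rank 1 is the lowest salary in each department. The aliases sal_rank and r are hypothetical names chosen for readability.

```sql
-- Sketch: lowest-paid employee(s) per department; ties all receive rank 1.
select r.ename , r.job , r.sal , r.deptno
  from (select e.ename ,
               e.job ,
               e.sal ,
               e.deptno ,
               rank() over(partition by e.deptno order by e.sal) sal_rank
          from scott.emp e) r
 where r.sal_rank = 1 ;
```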

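2. min()/max() over(partition by ...)

Now that we have the highest and lowest salary per department, a new requirement arrives: while listing the employees, also compute the difference between each employee's salary and the department's highest and lowest salary. This is still fairly simple. Starting from the group by statement of section 1, modify it as follows:

select e.ename,
       e.job,
       e.sal,
       e.deptno,
       e.sal - me.min_sal diff_min_sal,
       me.max_sal - e.sal diff_max_sal
  from scott.emp e,
       (select e.deptno, min(e.sal) min_sal, max(e.sal) max_sal
          from scott.emp e
         group by e.deptno) me
 where e.deptno = me.deptno
 order by e.deptno, e.sal;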

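The statement above uses min() and max(); the former returns the minimum and the latter the maximum. What happens when these two functions are combined with over(partition by ...)? Look at the following SQL statement:

select e.ename,
       e.job,
       e.sal,
       e.deptno,
       nvl(e.sal - min(e.sal) over(partition by e.deptno), 0) diff_min_sal,
       nvl(max(e.sal) over(partition by e.deptno) - e.sal, 0) diff_max_sal
  from scott.emp e;

The two statements return the same data (apart from row order, since the second has no final order by). As you can see, min() and max() still compute the minimum and maximum, only now within each partition defined by partition by.

Exercise: what result do you get if order by is added to the over clause in this example?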
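As a hint for the exercise, the sketch below adds order by e.sal inside over(). When order by is present and no windowing clause is given, Oracle uses the default window range between unbounded preceding and current row, so min() and max() become running values over the rows up to the current salary. The aliases running_min and running_max are hypothetical.

```sql
-- Sketch: with ORDER BY and no explicit window, the aggregate only sees
-- rows from the start of the partition up to the current row (and its peers).
select e.ename ,
       e.sal ,
       e.deptno ,
       min(e.sal) over(partition by e.deptno order by e.sal) running_min ,
       max(e.sal) over(partition by e.deptno order by e.sal) running_max
  from scott.emp e ;
```

With this ordering, running_min stays at the department minimum, while running_max is simply the current row's salary, so it no longer represents the department maximum.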

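3. lead()/lag() over(partition by ... order by ...)

People love to compare themselves with others, and customers even more so. Not satisfied with comparing against the highest and lowest salary, the customer now comes up with a rather outlandish requirement: compute the difference between each employee's salary and the salaries ranked immediately above and below it. This one really stumped me; I had no idea how to do it with a group by statement. But now that we have over(partition by ...), everything looks simple:

select e.ename,
       e.job,
       e.sal,
       e.deptno,
       lead(e.sal, 1, 0) over(partition by e.deptno order by e.sal) lead_sal,
       lag(e.sal, 1, 0) over(partition by e.deptno order by e.sal) lag_sal,
       nvl(lead(e.sal) over(partition by e.deptno order by e.sal) - e.sal,
           0) diff_lead_sal,
       nvl(e.sal - lag(e.sal) over(partition by e.deptno order by e.sal), 0) diff_lag_sal
  from scott.emp e;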

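After reading the statement above, you may feel it was a false alarm after all (breaking into a cold sweat and then getting all excited is a good way to catch a cold). Let us go through the two new functions used above.

lead(column, n, m): the value of the column in the n-th row after the current row, or the default m if there is no such row. Without the arguments n and m, it looks at the first row after the current row and returns null if there is none.

lag(column, n, m): the value of the column in the n-th row before the current row, or the default m if there is no such row. Without the arguments n and m, it looks at the first row before the current row and returns null if there is none.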
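The behaviour of the offset and default arguments is easiest to see on a tiny data set. The sketch below uses three made-up values selected from dual rather than scott.emp; the table alias t and the column v are hypothetical.

```sql
-- Sketch: three made-up rows to show the offset and default arguments.
with t as (
  select 10 v from dual union all
  select 20 from dual union all
  select 30 from dual
)
select v ,
       lag(v , 1 , 0)  over (order by v) lag_v ,
       lead(v , 1 , 0) over (order by v) lead_v ,
       lead(v)         over (order by v) lead_v_null
from t
order by v ;
-- v=10: lag_v=0,  lead_v=20, lead_v_null=20
-- v=20: lag_v=10, lead_v=30, lead_v_null=30
-- v=30: lag_v=20, lead_v=0,  lead_v_null=null
```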

Below are some other commonly used functions in this syntax (note: a function shown with an order by clause requires order by when it is used):

select e.ename,
       e.job,
       e.sal,
       e.deptno,
       -- without order by, which row first_value/last_value picks is not deterministic
       first_value(e.sal) over(partition by e.deptno) first_sal,
       last_value(e.sal) over(partition by e.deptno) last_sal,
       sum(e.sal) over(partition by e.deptno) sum_sal,
       avg(e.sal) over(partition by e.deptno) avg_sal,
       count(e.sal) over(partition by e.deptno) count_num,
       row_number() over(partition by e.deptno order by e.sal) row_num
  from scott.emp e;
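A common surprise with last_value is that adding order by alone does not return the last row of the partition, because the default window then ends at the current row. The sketch below contrasts the default window with an explicit full-partition window; the aliases last_default and last_full are hypothetical.

```sql
-- Sketch: the default window (up to the current row) versus the whole partition.
select e.ename ,
       e.sal ,
       e.deptno ,
       last_value(e.sal) over(partition by e.deptno order by e.sal) last_default ,
       last_value(e.sal) over(partition by e.deptno order by e.sal
                              rows between unbounded preceding
                                       and unbounded following) last_full
  from scott.emp e ;
```

Here last_default simply repeats the current row's salary, while last_full returns the highest salary in the department.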

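Important: after reading this article you might get the impression that OVER (PARTITION BY ...) is better than GROUP BY. That is not the case. The former cannot replace the latter, and it is also less efficient to execute; it simply offers more functionality. Choose between them according to what the requirement calls for.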
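The difference in shape between the two can be sketched with a pair of queries on scott.emp: group by collapses each department into one row, whereas over(partition by ...) keeps every employee row and attaches the department total to it. The alias dept_sal is hypothetical.

```sql
-- GROUP BY: one row per department.
select e.deptno , sum(e.sal) dept_sal
  from scott.emp e
 group by e.deptno ;

-- OVER (PARTITION BY ...): one row per employee, each carrying the department total.
select e.ename , e.deptno , e.sal ,
       sum(e.sal) over(partition by e.deptno) dept_sal
  from scott.emp e ;
```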
