[20170604]12c Top Frequency histogram supplement.txt

1. Environment:
SCOTT@test01p> @ ver1
PORT_STRING                    VERSION        BANNER                                                                               CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0           12.1.0.1.0     Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0

--//Several conditions must be met before a Top Frequency histogram is created:
--//Link: raajeshwaran.blogspot.co.id/2016/06/top-frequency-histogram-in-12c.html

The database creates a Top Frequency histogram when the following criteria are met:

NDV is greater than n, where n is the requested number of buckets (default 254).
The percentage of rows occupied by the top n frequent values is greater than or equal to the threshold p, where p = (1-(1/n))*100.
The estimate_percent parameter of the dbms_stats gathering procedure is set to auto_sample_size (the default).
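
--//A minimal Python sketch of the three criteria quoted above, added only to make the threshold arithmetic concrete.
--//The function names and the sample numbers (900 or 899 rows out of 1000) are illustrative assumptions, not Oracle's internal logic.
--//It restates the quoted conditions; it does not predict what dbms_stats will actually build.

```python
from fractions import Fraction


def threshold_percent(n_buckets: int) -> Fraction:
    """Threshold p = (1 - 1/n) * 100, kept exact with Fraction."""
    return (1 - Fraction(1, n_buckets)) * 100


def may_be_top_frequency(ndv, n_buckets, top_rows, total_rows,
                         auto_sample_size=True) -> bool:
    """Check the three quoted criteria (hypothetical helper).

    top_rows is the number of rows held by the n most frequent values.
    """
    coverage = Fraction(top_rows, total_rows) * 100
    return (ndv > n_buckets
            and coverage >= threshold_percent(n_buckets)
            and auto_sample_size)


# With the default of 254 buckets the threshold is about 99.606 percent.
default_threshold = round(float(threshold_percent(254)), 3)
print(default_threshold)

# Illustrative numbers only: 32 distinct values, 10 buckets.
print(may_be_top_frequency(32, 10, 900, 1000))   # coverage exactly 90 percent
print(may_be_top_frequency(32, 10, 899, 1000))   # coverage just under 90 percent
```

--//Exact fractions are used so that a coverage of exactly p is not lost to floating-point rounding.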

SCOTT@test01p> create table t as select * from dba_objects;
Table created.

select column_name,num_distinct,density,histogram,SAMPLE_SIZE
  from user_tab_col_statistics
  where table_name ='T'
  and column_name ='OWNER';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32     .03125 NONE                  91695

--//In 12c, CTAS gathers statistics automatically but does not build a histogram. density = 1/32 = .03125.
SCOTT@test01p> select count(*) from t;
  COUNT(*)
----------
     91695

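--//A quick ad hoc SQL statement. N1 = rows per owner, N2 = total rows, N3 = running total, X1 = N3/N2, X2 = 1-1/rownum.
--//Note: rows with equal N1 are peers in the running sum, so tied rows (for example rows 26 and 27) show the same N3.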
with a as (select distinct owner,count(*) over(partition by owner) n1 ,count(*) over () n2 from t order by 2 desc ),
b as (select owner,n1,n2,sum(n1) over (order by n1 desc) n3  from a order by n1 desc)
select rownum,owner,n1,n2,n3,round(n3/n2,5) x1,round(1-1/rownum,5) x2 from b;

ROWNUM OWNER                N1         N2         N3         X1         X2
------ ----------------- ----- ---------- ---------- ---------- ----------
     1 SYS               41942      91695      41942     .45741          0
     2 PUBLIC            37142      91695      79084     .86247         .5
     3 APEX_040200        3405      91695      82489      .8996     .66667
     4 ORDSYS             3157      91695      85646     .93403        .75
     5 MDSYS              1819      91695      87465     .95387         .8
     6 XDB                 985      91695      88450     .96461     .83333
     7 SYSTEM              641      91695      89091      .9716     .85714
     8 CTXSYS              405      91695      89496     .97602       .875
     9 WMSYS               387      91695      89883     .98024     .88889
    10 DVSYS               352      91695      90235     .98408         .9
    11 SH                  309      91695      90544     .98745     .90909
    12 ORDDATA             292      91695      90836     .99063     .91667
    13 LBACSYS             209      91695      91045     .99291     .92308
    14 OE                  142      91695      91187     .99446     .92857
    15 SCOTT                96      91695      91283     .99551     .93333
    16 GSMADMIN_INTERNAL    77      91695      91360     .99635      .9375
    17 IX                   58      91695      91418     .99698     .94118
    18 DBSNMP               55      91695      91473     .99758     .94444
    19 PM                   44      91695      91517     .99806     .94737
    20 HR                   35      91695      91552     .99844        .95
    21 OLAPSYS              25      91695      91577     .99871     .95238
    22 OJVMSYS              23      91695      91600     .99896     .95455
    23 DVF                  19      91695      91619     .99917     .95652
    24 FLOWS_FILES          13      91695      91632     .99931     .95833
    25 AUDSYS               12      91695      91644     .99944        .96
    26 ORDPLUGINS           10      91695      91664     .99966     .96154
    27 OUTLN                10      91695      91664     .99966     .96296
    28 BI                    8      91695      91688     .99992     .96429
    29 ORACLE_OCM            8      91695      91688     .99992     .96552
    30 SI_INFORMTN_SCHEM     8      91695      91688     .99992     .96667
    31 APPQOSSYS             5      91695      91693     .99998     .96774
    32 TEST                  2      91695      91695          1     .96875
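
--//A small Python cross-check of the X1/X2 comparison in the output above, using the N1 column as input.
--//The variable and function names are assumptions made for this sketch. It recomputes the running total row by row,
--//so tied counts get distinct totals (row 26 becomes 91654, whereas the SQL output shows 91664 for both rows 26 and 27).

```python
from fractions import Fraction
from itertools import accumulate

# N1 column from the output above, already sorted in descending order.
counts = [41942, 37142, 3405, 3157, 1819, 985, 641, 405, 387, 352,
          309, 292, 209, 142, 96, 77, 58, 55, 44, 35,
          25, 23, 19, 13, 12, 10, 10, 8, 8, 8, 5, 2]

total = sum(counts)                 # N2
running = list(accumulate(counts))  # strict row-by-row running total
ndv = len(counts)


def meets_threshold(n: int) -> bool:
    """True when the top n values cover at least 1 - 1/n of all rows."""
    return Fraction(running[n - 1], total) >= 1 - Fraction(1, n)


# Bucket counts below NDV for which the coverage test passes.
candidates = [n for n in range(1, ndv) if meets_threshold(n)]

print(total, running[9], running[25], running[26])
print(candidates == list(range(1, ndv)))
```

--//On this data every bucket count from 1 to 31 passes the coverage arithmetic. The arithmetic alone does not settle the histogram type:
--//the run with 2 buckets further down returns HYBRID, and this sketch does not explain why.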

D:\temp>cat a1.sql
exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size &1');
select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';

SCOTT@test01p> @ a1.sql 2
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32     .03125 HYBRID                 5500

SCOTT@test01p> @ a1.sql 3
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 5.4529E-06 TOP-FREQUENCY         91695

SCOTT@test01p> @ a1.sql 4
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 5.4529E-06 TOP-FREQUENCY         91695

SCOTT@test01p> @ a1.sql 31
PL/SQL procedure successfully completed.
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 5.4529E-06 TOP-FREQUENCY         91695

SCOTT@test01p> @ a1.sql 32
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 5.4529E-06 FREQUENCY             91695
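
--//An arithmetic observation on the DENSITY column, sketched in Python. The value .03125 equals 1/NUM_DISTINCT,
--//and 5.4529E-06 is numerically consistent with 1/(2*SAMPLE_SIZE). The second formula is an assumption inferred
--//from the printed numbers, not something the test above set out to prove.

```python
num_distinct = 32
sample_size = 91695

# No histogram (and the HYBRID run with 2 buckets): density printed as .03125
density_no_histogram = 1 / num_distinct

# TOP-FREQUENCY and FREQUENCY runs: density printed as 5.4529E-06
density_frequency = 1 / (2 * sample_size)
density_frequency_text = f"{density_frequency:.4E}"

print(density_no_histogram)
print(density_frequency_text)
```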

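--//Except for bucket=2 and bucket=32, which produce HYBRID and FREQUENCY histograms respectively, every run builds a TOP-FREQUENCY histogram.
--//Take 10 buckets as an example. Solving (90235-x)/(91695-x)=0.9 gives x=77095, so 77095 rows must be removed from the top values to land exactly on the threshold.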

--//delete t where owner='SYS' and rownum<=41000;
--//delete t where owner='PUBLIC' and rownum<=36095;
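
--//The equation above can be solved in closed form: x = (top - p*total) / (1 - p), with p = 1 - 1/n.
--//A short Python sketch of that step follows. The function name is an assumption, and it assumes the deleted rows
--//belong to values that remain among the top n afterwards (true here for SYS and PUBLIC).

```python
from fractions import Fraction


def rows_to_remove(top_rows: int, total_rows: int, n_buckets: int) -> Fraction:
    """Solve (top - x) / (total - x) = 1 - 1/n for x, exactly."""
    p = 1 - Fraction(1, n_buckets)
    return (top_rows - p * total_rows) / (1 - p)


x = rows_to_remove(90235, 91695, 10)
print(x)

# The two planned deletes (41000 SYS rows, 36095 PUBLIC rows) add up to x.
planned = 41000 + 36095

# Remaining rows: top-10 rows and total rows after the deletes.
top_after = 90235 - planned
total_after = 91695 - planned
print(top_after, total_after, Fraction(top_after, total_after))
```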

SCOTT@test01p> delete t where owner='SYS' and rownum<=41000;
41000 rows deleted.

SCOTT@test01p> delete t where owner='PUBLIC' and rownum<=36095;
36095 rows deleted.

SCOTT@test01p> commit ;
Commit complete.

with a as (select distinct owner,count(*) over(partition by owner) n1 ,count(*) over () n2 from t order by 2 desc ),
b as (select owner,n1,n2,sum(n1) over (order by n1 desc) n3  from a order by n1 desc)
select rownum,owner,n1,n2,n3,round(n3/n2,5) x1,round(1-1/rownum,5) x2 from b where rownum<=11;

ROWNUM OWNER         N1         N2         N3         X1         X2
------ ----------- ---- ---------- ---------- ---------- ----------
     1 APEX_040200 3405      14600       3405     .23322          0
     2 ORDSYS      3157      14600       6562     .44945         .5
     3 MDSYS       1819      14600       8381     .57404     .66667
     4 PUBLIC      1047      14600       9428     .64575        .75
     5 XDB          985      14600      10413     .71322         .8
     6 SYS          942      14600      11355     .77774     .83333
     7 SYSTEM       641      14600      11996     .82164     .85714
     8 CTXSYS       405      14600      12401     .84938       .875
     9 WMSYS        387      14600      12788     .87589     .88889
    10 DVSYS        352      14600      13140         .9         .9
    11 SH           309      14600      13449     .92116     .90909
11 rows selected.
--//With bucket=10, the top 10 values account for exactly 90% of the rows.

SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .000034247 TOP-FREQUENCY         14600

--//Delete one more row.
SCOTT@test01p> delete t where owner='SYS' and rownum<=1;
1 row deleted.

SCOTT@test01p> commit ;
Commit complete.

ROWNUM OWNER         N1         N2         N3         X1         X2
------ ----------- ---- ---------- ---------- ---------- ----------
     1 APEX_040200 3405      14599       3405     .23324          0
     2 ORDSYS      3157      14599       6562     .44948         .5
     3 MDSYS       1819      14599       8381     .57408     .66667
     4 PUBLIC      1047      14599       9428      .6458        .75
     5 XDB          985      14599      10413     .71327         .8
     6 SYS          941      14599      11354     .77772     .83333
     7 SYSTEM       641      14599      11995     .82163     .85714
     8 CTXSYS       405      14599      12400     .84937       .875
     9 WMSYS        387      14599      12787     .87588     .88889
    10 DVSYS        352      14599      13139     .89999         .9
    11 SH           309      14599      13448     .92116     .90909
11 rows selected.
--//Rerunning the query above shows that the top 10 values now account for .89999, just below the 0.9 threshold.

SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32    .018378 HYBRID                14599

--//The histogram created is no longer TOP-FREQUENCY but HYBRID.
--//Add one row back to turn it into TOP-FREQUENCY again.
SCOTT@test01p> insert into t  select * from dba_objects where owner='SYS' and rownum=1;
1 row created.

SCOTT@test01p> commit ;
Commit complete.

SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .000034247 TOP-FREQUENCY         14600
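
--//The one-row difference between TOP-FREQUENCY and HYBRID can be checked with integer arithmetic.
--//This Python sketch computes how many rows the top 10 values need for a given table size; the helper names are assumptions.

```python
import math
from fractions import Fraction


def top_rows_needed(total_rows: int, n_buckets: int) -> int:
    """Smallest row count c with c / total_rows >= 1 - 1/n_buckets."""
    return math.ceil(Fraction(total_rows * (n_buckets - 1), n_buckets))


def shortfall(top_rows: int, total_rows: int, n_buckets: int) -> int:
    """Rows missing from the top values; 0 means the threshold is met."""
    return max(0, top_rows_needed(total_rows, n_buckets) - top_rows)


# (top-10 rows, total rows) for the two table states tested above.
with_14600_rows = shortfall(13140, 14600, 10)   # state that gave TOP-FREQUENCY
with_14599_rows = shortfall(13139, 14599, 10)   # state that gave HYBRID

print(with_14600_rows, with_14599_rows)
```

--//With 14599 rows the top 10 values would need 13140 rows but hold 13139, one short, which matches the switch to HYBRID observed above.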

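--//Everything above is from yesterday's test.
--//Earlier I mentioned that a TOP-FREQUENCY histogram may not be created if the sample size is not auto_sample_size. Let's test that.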

2. Sample size Estimate_Percent  => NULL.
SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => NULL);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32    .018379 HYBRID                14600

--//With full sampling (Estimate_Percent => NULL), a HYBRID histogram is generated instead.

3. Sample size Estimate_Percent  => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE, Block_sample => TRUE.
SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE,Block_sample => TRUE);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .000034247 TOP-FREQUENCY         14600

--//With Estimate_Percent=>SYS.DBMS_STATS.AUTO_SAMPLE_SIZE the result is still TOP-FREQUENCY, even with block sampling.

4. Try Estimate_Percent  => 100 and 90.
SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => 100);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .062908991 HEIGHT BALANCED       14600

SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => 90);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32      .0625 HEIGHT BALANCED       13108

--//It seems that only Estimate_Percent  => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE produces a TOP-FREQUENCY histogram.
SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .000034247 TOP-FREQUENCY         14600

--//I will study HYBRID histograms when I get the chance.

Reposted from: https://www.cnblogs.com/lfree/p/6947430.html
