Bitmap 索引 vs. B-tree 索引：如何选择以及何时使用？—

——理解适当使用每个索引对性能的影响

Bitmap 索引 vs. B-tree 索引：如何选择以及何时使用？——1-5

Bitmap 索引 vs. B-tree 索引：如何选择以及何时使用？——2-5

Bitmap 索引 vs. B-tree 索引：如何选择以及何时使用？——3-5

Bitmap 索引 vs. B-tree 索引：如何选择以及何时使用？——4-5

Bitmap 索引 vs. B-tree 索引：如何选择以及何时使用？——5-5

本文内容

比较索引
步骤 3A（TEST_NORMAL 表 EMPNO 列创建 Bitmap 索引，执行范围查询）
步骤 3B（TEST_NORMAL 表 EMPNO 列创建 B-tree 索引，执行范围查询）
步骤 4A（TEST_RANDOM 表 EMPNO 列创建 Bitmap 索引，执行范围查询）
步骤 4B（TEST_RANDOM 表 EMPNO 列创建 B-tree 索引，执行范围查询）

步骤 3A（在 TEST_NORMAL）

该步骤创建 Bitmap 索引（同步骤 1A）。我们已经知道索引大小（28MB）及其聚类系数（等于表的行数）。现在执行一些范围谓词谓词查询。

SQL> drop index normal_empno_idx;

索引已删除。

SQL> create Bitmap index normal_empno_bmx on test_normal(empno);

索引已创建。

SQL> analyze table test_normal compute statistics for table for all indexes for

all columns;

表已分析。

SQL>

SQL> set autot traceonly

SQL> select * from test_normal where empno between &range1 and &range2;

输入 range1 的值:  1

输入 range2 的值:  2300

原值    1: select * from test_normal where empno between &range1 and &range2

新值    1: select * from test_normal where empno between 1 and 2300

已选择2300行。

执行计划

----------------------------------------------------------

Plan hash value: 641040856

-------------------------------------------------------------------------------------------------

| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT             |                  |  2299 | 85063 |   417  (0)| 00:00:06 |

|   1 |  TABLE ACCESS BY INDEX ROWID | TEST_NORMAL      |  2299 | 85063 |   417  (0)| 00:00:06 |

|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |           |          |

|*  3 |    BITMAP INDEX RANGE SCAN   | NORMAL_EMPNO_BMX |       |       |           |          |

-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   3 - access("EMPNO">=1 AND "EMPNO"<=2300)

统计信息

----------------------------------------------------------

          1  recursive calls

          0  db block gets

        331  consistent gets

          0  physical reads

          0  redo size

     130220  bytes sent via SQL*Net to client

       2202  bytes received via SQL*Net from client

        155  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       2300  rows processed

SQL>

步骤 3B（在 TEST_NORMAL）

该步骤对 TEST_NORMAL 表 EMPNO 列使用 B-tree 索引，执行范围谓词查询。

SQL> set autot off

SQL> drop index normal_empno_bmx;

索引已删除。

SQL> create index normal_empno_idx on test_normal(empno);

索引已创建。

SQL> analyze table test_normal compute statistics for table for all indexes for

all indexed columns;

表已分析。

SQL>

SQL> set autot traceonly

SQL> select * from test_normal where empno between &range1 and &range2;

输入 range1 的值:  1

输入 range2 的值:  2300

原值    1: select * from test_normal where empno between &range1 and &range2

新值    1: select * from test_normal where empno between 1 and 2300

已选择2300行。

执行计划

----------------------------------------------------------

Plan hash value: 1781697849

------------------------------------------------------------------------------------------------

| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |

------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                  |  2299 | 85063 |    23 (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TEST_NORMAL      |  2299 | 85063 |    23 (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | NORMAL_EMPNO_IDX |  2299 |       |     8 (0)| 00:00:01 |

------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - access("EMPNO">=1 AND "EMPNO"<=2300)

统计信息

----------------------------------------------------------

          1  recursive calls

          0  db block gets

        329  consistent gets

          0  physical reads

          0  redo size

     130220  bytes sent via SQL*Net to client

       2202  bytes received via SQL*Net from client

        155  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       2300  rows processed

SQL>

当输入不同范围查询时，结果如下所示：

表 4 TEST_NORMAL 表 EMPNO 列利用 Bitmap 和 B-tree 索引执行范围查询比较

Bitmap			B-tree
Consistent Reads	Physical Reads	EMPNO (Range)	Consistent Reads	Physical Reads
331	0	1-2300	329	0
285	0	8-1980	283	0
346	19	1850-4250	344	16
427	31	28888-31850	424	28
371	27	82900-85478	367	23
2157	149	984888-1000000	2139	35

如上表所示，两个索引的 consistent gets 和 physical reads 值很接近。表最后一行查询范围 (984888-1000000) 返回了将近 15000 行。因此，当我们要求一个全表扫描时（指定优化器提示为 /*+ full(test_normal) */ ），consistent read 和 physical read 值分别为 7239 和 5663。

3A 和 3B 的演示，在 TEST_NORMAL 表执行范围查询时，优化器使用了 EMPNO 列上的相应索引，逻辑 IO 和物理 IO 只是稍有差异。

步骤 4A（在 TEST_RANDOM）

该步骤在 TEST_RANDOM 表 EMPNO 列使用 Bitmap 索引进行范围查询，检查 consistent gets 和 physical reads 值。这里，你会看到聚类系数的影响。

SQL> drop index random_empno_idx;

索引已删除。

SQL> create Bitmap index random_empno_bmx on test_random(empno);

索引已创建。

SQL> analyze table test_random compute statistics for table for all indexes for

all indexed columns;

表已分析。

SQL>

SQL> set autot traceonly

SQL> select * from test_random where empno between &range1 and &range2;

输入 range1 的值:  1

输入 range2 的值:  2300

原值    1: select * from test_random where empno between &range1 and &range2

新值    1: select * from test_random where empno between 1 and 2300

已选择2300行。

执行计划

----------------------------------------------------------

Plan hash value: 4105816815

-------------------------------------------------------------------------------------------------

| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT             |                  |  2299 | 89661 |   418  (1)| 00:00:06 |

|   1 |  TABLE ACCESS BY INDEX ROWID | TEST_RANDOM      |  2299 | 89661 |   418  (1)| 00:00:06 |

|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |           |          |

|*  3 |    BITMAP INDEX RANGE SCAN   | RANDOM_EMPNO_BMX |       |       |           |          |

-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   3 - access("EMPNO">=1 AND "EMPNO"<=2300)

统计信息

----------------------------------------------------------

          1  recursive calls

          0  db block gets

       2463  consistent gets

          0  physical reads

          0  redo size

     130220  bytes sent via SQL*Net to client

       2202  bytes received via SQL*Net from client

        155  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       2300  rows processed

SQL>

步骤 4B（在 TEST_RANDOM）

该步骤在 TEST_RANDOM 表 EMPNO 列使用 B-tree 索引进行范围查询。回忆一下索引的聚类系数非常接近表中行数（因此，不会很有效）。下面看看优化器是如何说的：

SQL> drop index random_empno_bmx;

索引已删除。

SQL> create index random_empno_idx on test_random(empno);

索引已创建。

SQL> analyze table test_random compute statistics for table for all indexes for

  2  all indexed columns;

表已分析。

SQL>

SQL> select * from test_random where empno between &range1 and &range2;

输入 range1 的值:  1

输入 range2 的值:  2300

原值    1: select * from test_random where empno between &range1 and &range2

新值    1: select * from test_random where empno between 1 and 2300

已选择2300行。

执行计划

----------------------------------------------------------

Plan hash value: 2650160170

---------------------------------------------------------------------------------

| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time

---------------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |             |  2299 | 89661 |  1740   (1)| 00:00:21|

|*  1 |  TABLE ACCESS FULL| TEST_RANDOM |  2299 | 89661 |  1740   (1)| 00:00:21|

---------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - filter("EMPNO"<=2300 AND "EMPNO">=1)

统计信息

----------------------------------------------------------

          1  recursive calls

          0  db block gets

       6412  consistent gets

          0  physical reads

          0  redo size

     121076  bytes sent via SQL*Net to client

       2202  bytes received via SQL*Net from client

        155  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       2300  rows processed

SQL>

由于聚类系数的原因，优化器选择了全表扫描，而不是索引：

Bitmap			B-tree
Consistent Reads	Physical Reads	EMPNO (Range)	Consistent Reads	Physical Reads
2463	1200	1-2300	6415	4910
2114	31	8-1980	6389	4910
2572	1135	1850-4250	6418	4909
3173	1620	28888-31850	6456	4909
2762	1358	82900-85478	6431	4909
7254	3329	984888-1000000	7254	4909

只有表最后一行，对 Bitmap 索引，优化器选择了全表扫描，而对于所有的范围查询，对 B-tree 索引，优化器选择全表扫描。这种差异是由于簇因素：当使用 bitmap 索引产生执行计划时，优化器不考虑聚类系数的值，而对 B-tree 索引，则考虑。在这个场景，Bitmap 索引比 B-tree 索引更有效率。

4A 和 4B 的演示，在 TEST_RANDOM 表执行范围查询时，当索引时 Bitmap 索引时，优化器使用了；可当索引时 B-tree 索引时，优化器没有使用，而是进行了全表扫描，逻辑 IO 和物理 IO 自然也就差异很大。

原因就在于 TEST_NORMAL 表是已组织的，而 TEST_RANDOM 表示无组织的。这就好像数据结构中的查找算法或排序算法，如果当前数组是已有序的，查找和排序会快很多。

下面步骤会揭示关于索引更有趣的事实。