MySQL索引知识总结

对MySQL索引知识的总结笔记。

普通索引

索引是一种数据结构，主要用于性能的提高。

比如我们有一个表t_users，有4个字段：

create table t_users (

id bigint(20) not null auto_increment,

name varchar(255) not null,

age bigint(20) not null,

num bigint(20) not null,

primary key (id)

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

往这个表中插入100w条数据，name字段是format-1到100w，age字段也是1-100w，num字段是个随机的10以内的数字。然后执行sql语句进行查询：

mysql> select * from t_users where name = 'format-500000';

+--------+---------------+--------+-----+

| id | name | age | num |

+--------+---------------+--------+-----+

| 500000 | format-500000 | 500000 | 38 |

+--------+---------------+--------+-----+

1 row in set (0.47 sec)

explain一下这条语句：

mysql> explain select * from t_users where name = 'format-500000';

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

发现查询了996677条数据，进行了全表的扫描。

我们给name字段加上索引：

1	mysql> create index IDX_FOR_NAME on t_users(name);

再次执行sql语句：

mysql> select * from t_users where name = 'format-500000';

+--------+---------------+--------+-----+

| id | name | age | num |

+--------+---------------+--------+-----+

| 500000 | format-500000 | 500000 | 38 |

+--------+---------------+--------+-----+

1 row in set (0.00 sec)

explain一下：

mysql> explain select * from t_users where name = 'format-500000';

+----+-------------+---------+------+---------------+--------------+---------+-------+------+-----------------------+

+----+-------------+---------+------+---------------+--------------+---------+-------+------+-----------------------+

+----+-------------+---------+------+---------------+--------------+---------+-------+------+-----------------------+

1 row in set (0.00 sec)

只查询了1条数据，因为我们加了btree索引，可以快速定位到具体的值。

接下来我们查询num字段：

mysql> select * from t_users where num = 66;

+--------+---------------+--------+-----+

....

| 965109 | format-965109 | 965109 | 66 |

| 965172 | format-965172 | 965172 | 66 |

| 965182 | format-965182 | 965182 | 66 |

| 965213 | format-965213 | 965213 | 66 |

....

+--------+---------------+--------+-----+

10029 rows in set (0.30 sec)

由于num字段也是没有加索引的，查询的时候也进行全表的扫描，查询耗时。接下来我们给num字段加上索引。

1	mysql> create index IDX_FOR_NUM on t_users(num);

然后进行查询：

mysql> select * from t_users where num = 5;

+--------+---------------+--------+-----+

....

| 965109 | format-965109 | 965109 | 5 |

| 965172 | format-965172 | 965172 | 5 |

| 965182 | format-965182 | 965182 | 5 |

| 965213 | format-965213 | 965213 | 5 |

....

+--------+---------------+--------+-----+

10029 rows in set (0.04 sec)

explain一下：

mysql> explain select * from t_users where num = 5;

+----+-------------+---------+------+---------------+-------------+---------+-------+-------+-------+

+----+-------------+---------+------+---------------+-------------+---------+-------+-------+-------+

+----+-------------+---------+------+---------------+-------------+---------+-------+-------+-------+

虽然我们在num字段上加了索引，但是由于num值在这100w条数据中有很多重复的数据，这个时候索引对查询速度的提高就没有那么明显了。

因为不论是hash类型的索引还是btree类型的索引，他们对于重复的数据的查询并没有提高多少。相反，由于添加了索引，导致数据写入性能变差，而查询性能又没有增强多少。所以说不能盲目地添加索引。

复合索引

复合索引也叫联合索引，表示索引建立在多个列上。

我们删除t_users上的索引，然后创建一个复合索引。

mysql> drop index IDX_FOR_NUM on t_users;

mysql> drop index IDX_FOR_NAME on t_users;

mysql> create index INDEX_FOR_NAME_AGE on t_users(name, age);

复合索引支持最左原则，也就是说INDEX_FOR_NAME_AGE索引支持name字段的查找，支持name，age字段的查找，但是不支持age，name字段的查找以及age字段的查找。

mysql> explain select * from t_users where name = 'format-100000';

+----+-------------+---------+------+--------------------+--------------------+---------+-------+------+-----------------------+

+----+-------------+---------+------+--------------------+--------------------+---------+-------+------+-----------------------+

+----+-------------+---------+------+--------------------+--------------------+---------+-------+------+-----------------------+

mysql> explain select * from t_users where name = 'format-100000' and age = 100000;

+----+-------------+---------+------+--------------------+--------------------+---------+-------------+------+-----------------------+

+----+-------------+---------+------+--------------------+--------------------+---------+-------------+------+-----------------------+

+----+-------------+---------+------+--------------------+--------------------+---------+-------------+------+-----------------------+

mysql> explain select * from t_users where age = 100000;

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

使用age，name字段查找：

mysql> explain select * from t_users where age = 100000 and name = 'format-100000';

+----+-------------+---------+------+--------------------+--------------------+---------+-------------+------+-----------------------+

+----+-------------+---------+------+--------------------+--------------------+---------+-------------+------+-----------------------+

+----+-------------+---------+------+--------------------+--------------------+---------+-------------+------+-----------------------+

我们发现使用age，name字段的查找使用了复合索引，这是因为MySQL内部有个查询优化器帮我们进行了优化。

索引的生效

索引创建之后，查询的过程并不一定会使用索引。整理如下(t_users表中name和num字段都有单独的索引)：

1.当使用索引查询比全表扫描更慢。比如下面这句sql中num字段的值分布在1-10之间

mysql> explain select * from t_users where num > 1 and num < 8;

+----+-------------+----------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+----------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+----------+------+---------------+------+---------+------+--------+-------------+

2.使用or进行查询，并且or左右两边的列有不存在索引的列

mysql> explain select * from t_users where name = 'format-4000' or age = 50;

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

or两边的列都有索引：

mysql> explain select * from t_users where name = 'format-4000' or num = 50;

+----+-------------+----------+-------------+---------------------------------+---------------------------------+---------+------+------+-----------------------------------------------------------+

+----+-------------+----------+-------------+---------------------------------+---------------------------------+---------+------+------+-----------------------------------------------------------+

+----+-------------+----------+-------------+---------------------------------+---------------------------------+---------+------+------+-----------------------------------------------------------+

3.使用like，并以 % 开头

mysql> explain select * from t_users where name like "%format-200";

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

mysql> explain select * from t_users where name like "%format-200%";

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

以 % 结尾的查询会使用索引：

mysql> explain select * from t_users where name like "format-200%";

+----+-------------+---------+-------+---------------+--------------+---------+------+------+-----------------------+

+----+-------------+---------+-------+---------------+--------------+---------+------+------+-----------------------+

+----+-------------+---------+-------+---------------+--------------+---------+------+------+-----------------------+

4.复合索引

不是复合索引的最左字段(t_users表有(name,age)复合索引)。

mysql> explain select * from t_users where age = 100000;

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

5.对varchar类型的字段进行查询的时候，没有加上引号

mysql> explain select * from t_users where name = 111;

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

mysql> explain select * from t_users where name = "111";

+----+-------------+---------+------+---------------+--------------+---------+-------+------+-----------------------+

+----+-------------+---------+------+---------------+--------------+---------+-------+------+-----------------------+

+----+-------------+---------+------+---------------+--------------+---------+-------+------+-----------------------+

hash索引和btree索引的区别

hash索引不能使用范围查询，只能使用一些比如 “=”, “<>”, “in”查询。因为hash索引会计算索引列的hash值，计算出后的hash值经过了hash算法与原先的值完全不一样，只能进行等值的过滤，不能基于范围的过滤
hash索引遇到大量hash值相同的情况下，性能比btree要差
hash索引并不一定一次可以定位到数据。因为基于索引列计算出的hash值会有重复，重复的话需要扫描hash表进行比较
由于hash索引中存放的是经过hash计算之后的hash值，而且hash值的大小关系并不一定和hash运算前的键值完全一样，所以数据库无法利用索引的数据来避免任何排序运算
对于组合索引，hash索引在计算hash值的时候是组合索引键合并后再一起计算hash值，而不是单独计算hash值，所以通过组合索引的前面一个或几个索引键进行查询的时候，hash索引也无法被利用
InnoDB和MyISAM引擎不支持hash索引

本文作者：Format
原文链接： http://fangjian0423.github.io/2017/07/05/mysql-index-summary/