MySQL 8 Multi-Valued Indexes

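  • 1. Multi-valued index usage (from the official example)
    • 1.1 Creating the table and the multi-valued index
    • 1.2 Using the index in queries
  • 2. Performance test
    • 2.1 Without an index
    • 2.2 With an index
    • 2.3 Test results
  • 3. Extended usage
  • 4. Multi-valued indexes on character data
    • 4.1 Collation check, example 1 (utf8mb4_0900_ai_ci)
    • 4.2 Collation check, example 2 (utf8mb4_0900_as_cs)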

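A multi-valued index is defined on an array stored in a JSON column. MySQL has supported the JSON data type since 5.7, but multi-valued indexes only since 8.0.17. This article assumes basic familiarity with the JSON data type; there are plenty of introductions online, so it is not covered here.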

The tests below were run on MySQL 8.0.22.
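As a rough mental model (not InnoDB's actual storage format), a multi-valued index can be pictured as an inverted index that holds one entry per array element, so a single row may appear under several keys. The Python sketch below builds such a mapping for two made-up rows; `build_multi_valued_index` and the row data are illustrative names, not anything MySQL exposes.

```python
from collections import defaultdict


def build_multi_valued_index(rows):
    """Map each array element to the sorted ids of the rows that contain it."""
    index = defaultdict(set)
    for row_id, values in rows.items():
        for value in values:  # one index entry per array element
            index[value].add(row_id)
    return {value: sorted(ids) for value, ids in index.items()}


# Hypothetical rows: row id -> zipcode array taken from a JSON document.
rows = {1: [94477, 94536], 2: [94536]}
index = build_multi_valued_index(rows)
print(index[94536])  # ids of every row whose array contains 94536
```

Looking up a single value is then one key lookup, which is the kind of access the queries in section 1.2 rely on.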

1. Multi-valued index usage (from the official example)

Official documentation: https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-multi-valued

1.1 Creating the table and the multi-valued index

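Sample data: the customers table has a customer-info column, custinfo, of type JSON, in which zipcode holds multiple values, for example:

{"user":"Bob","user_id":31,"zipcode":[94477,94536]}

The goal is to index zipcode and query on it.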

Option 1: create the index together with the table

CREATE TABLE customers (
    id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    modified DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    custinfo JSON,
    INDEX zips( (CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)) )
);

Option 2: create the table first, then add the index with ALTER TABLE

CREATE TABLE customers (
    id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    modified DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    custinfo JSON
);

ALTER TABLE customers ADD INDEX zips( (CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)) );

Option 3: create the table first, then add the index with CREATE INDEX

CREATE TABLE customers (
    id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    modified DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    custinfo JSON
);

CREATE INDEX zips ON customers ( (CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)) );

PS: a multi-valued key part can also be used as part of a composite index, for example:

CREATE TABLE customers (
    id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    modified DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    custinfo JSON
);

ALTER TABLE customers ADD INDEX comp(id, modified, (CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)) );

1.2 Using the index in queries

Insert some data:

mysql> INSERT INTO customers VALUES
    ->     (NULL, NOW(), '{"user":"Jack","user_id":37,"zipcode":[94582,94536]}'),
    ->     (NULL, NOW(), '{"user":"Jill","user_id":22,"zipcode":[94568,94507,94582]}'),
    ->     (NULL, NOW(), '{"user":"Bob","user_id":31,"zipcode":[94477,94507]}'),
    ->     (NULL, NOW(), '{"user":"Mary","user_id":72,"zipcode":[94536]}'),
    ->     (NULL, NOW(), '{"user":"Ted","user_id":56,"zipcode":[94507,94582]}');

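Three functions can be used to query the array:

MEMBER OF(): true if the array contains the given value

JSON_CONTAINS(): true if the array contains all of the given values

JSON_OVERLAPS(): true if the array contains at least one of the given values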
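The difference between the three functions is easiest to see on the five sample rows. The Python sketch below models them for the flat array-of-scalars case only (the real MySQL functions also accept nested documents and objects); `member_of`, `json_contains` and `json_overlaps` are illustrative helper names.

```python
# Sample rows from the INSERT above: id -> zipcode array.
zipcodes = {
    1: [94582, 94536],
    2: [94568, 94507, 94582],
    3: [94477, 94507],
    4: [94536],
    5: [94507, 94582],
}


def member_of(value, array):
    """MEMBER OF: the array contains this one value."""
    return value in array


def json_contains(array, candidates):
    """JSON_CONTAINS: every candidate value is present in the array."""
    return set(candidates) <= set(array)


def json_overlaps(array, candidates):
    """JSON_OVERLAPS: at least one candidate value is present in the array."""
    return bool(set(candidates) & set(array))


wanted = [94507, 94582]
member_ids = [i for i, z in zipcodes.items() if member_of(94507, z)]
contains_ids = [i for i, z in zipcodes.items() if json_contains(z, wanted)]
overlaps_ids = [i for i, z in zipcodes.items() if json_overlaps(z, wanted)]
print(member_ids, contains_ids, overlaps_ids)
```

The three id lists correspond to the rows returned by the three SQL queries that follow.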

mysql> SELECT * FROM customers
    ->     WHERE 94507 MEMBER OF(custinfo->'$.zipcode');
+----+---------------------+-------------------------------------------------------------------+
| id | modified            | custinfo                                                          |
+----+---------------------+-------------------------------------------------------------------+
|  2 | 2019-06-29 22:23:12 | {"user": "Jill", "user_id": 22, "zipcode": [94568, 94507, 94582]} |
|  3 | 2019-06-29 22:23:12 | {"user": "Bob", "user_id": 31, "zipcode": [94477, 94507]}         |
|  5 | 2019-06-29 22:23:12 | {"user": "Ted", "user_id": 56, "zipcode": [94507, 94582]}         |
+----+---------------------+-------------------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM customers
    ->     WHERE JSON_CONTAINS(custinfo->'$.zipcode', CAST('[94507,94582]' AS JSON));
+----+---------------------+-------------------------------------------------------------------+
| id | modified            | custinfo                                                          |
+----+---------------------+-------------------------------------------------------------------+
|  2 | 2019-06-29 22:23:12 | {"user": "Jill", "user_id": 22, "zipcode": [94568, 94507, 94582]} |
|  5 | 2019-06-29 22:23:12 | {"user": "Ted", "user_id": 56, "zipcode": [94507, 94582]}         |
+----+---------------------+-------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM customers
    ->     WHERE JSON_OVERLAPS(custinfo->'$.zipcode', CAST('[94507,94582]' AS JSON));
+----+---------------------+-------------------------------------------------------------------+
| id | modified            | custinfo                                                          |
+----+---------------------+-------------------------------------------------------------------+
|  1 | 2019-06-29 22:23:12 | {"user": "Jack", "user_id": 37, "zipcode": [94582, 94536]}        |
|  2 | 2019-06-29 22:23:12 | {"user": "Jill", "user_id": 22, "zipcode": [94568, 94507, 94582]} |
|  3 | 2019-06-29 22:23:12 | {"user": "Bob", "user_id": 31, "zipcode": [94477, 94507]}         |
|  5 | 2019-06-29 22:23:12 | {"user": "Ted", "user_id": 56, "zipcode": [94507, 94582]}         |
+----+---------------------+-------------------------------------------------------------------+
4 rows in set (0.00 sec)

Running EXPLAIN on the queries above shows that all three use the zips index:

mysql> EXPLAIN SELECT * FROM customers
    ->     WHERE 94507 MEMBER OF(custinfo->'$.zipcode');
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | customers | NULL       | ref  | zips          | zips | 9       | const |    1 |   100.00 | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT * FROM customers
    ->     WHERE JSON_CONTAINS(custinfo->'$.zipcode', CAST('[94507,94582]' AS JSON));
+----+-------------+-----------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table     | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-----------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | customers | NULL       | range | zips          | zips | 9       | NULL |    6 |   100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT * FROM customers
    ->     WHERE JSON_OVERLAPS(custinfo->'$.zipcode', CAST('[94507,94582]' AS JSON));
+----+-------------+-----------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table     | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-----------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | customers | NULL       | range | zips          | zips | 9       | NULL |    6 |   100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

2. Performance test

2.1 Without an index

Create the test table without an index:

CREATE TABLE `test_int_array` (
    `a` bigint NOT NULL AUTO_INCREMENT,
    `b` json NOT NULL,
    PRIMARY KEY (`a`)
) ENGINE=InnoDB AUTO_INCREMENT=19264 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Generate the test data (100,000 rows) with a stored procedure:

DROP PROCEDURE IF EXISTS proc_initData;
DELIMITER $$
CREATE PROCEDURE proc_initData ()
BEGIN
    DECLARE i INT DEFAULT 1;
    DECLARE j INT;
    DECLARE jsondoc JSON;
    DECLARE type_code INT;
    -- Generate 100,000 rows
    WHILE i <= 100000 DO
        SET jsondoc = '{"user":"Jack","user_id":37,"type":[1100]}';
        -- Prepend five random codes (100 to 1000 in steps of 100) to the type array
        SET j = 0;
        WHILE j < 5 DO
            SET type_code = CEILING(RAND() * 10) * 100;
            SET jsondoc = JSON_SET(jsondoc, '$.type', JSON_ARRAY_INSERT(jsondoc->'$.type', '$[0]', type_code));
            SET j = j + 1;
        END WHILE;
        INSERT INTO test_int_array VALUES (NULL, jsondoc);
        SET i = i + 1;
    END WHILE;
END $$
DELIMITER ;
CALL proc_initData ();
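To make the shape of the generated data concrete, the per-row logic of the procedure above can be mirrored in a few lines of Python. This is only an illustration of what one `type` array looks like; `make_type_array` is a hypothetical helper and the fixed seed is there purely to make the sketch repeatable.

```python
import math
import random


def make_type_array(rng):
    """Mirror one row of proc_initData: start from [1100], prepend five codes."""
    types = [1100]
    for _ in range(5):
        code = math.ceil(rng.random() * 10) * 100  # CEILING(RAND() * 10) * 100
        types.insert(0, code)                      # JSON_ARRAY_INSERT(..., '$[0]', code)
    return types


rng = random.Random(0)  # fixed seed, only so the sketch is repeatable
sample = make_type_array(rng)
print(sample)
```

Each array therefore has six elements: five random codes that may repeat, followed by the constant 1100.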

Query test:

select count(*) from test_int_array where 500 member of(b->'$.type')
> OK
> Query Time: 0.155s
count(*)
40800

select count(*) from test_int_array where JSON_CONTAINS(b->'$.type', '[500]')
> OK
> Query Time: 0.344s
count(*)
40800

select count(*) from test_int_array where JSON_CONTAINS(b->'$.type', '[500,600,900]')
> OK
> Query Time: 0.435s
count(*)
4095

select count(*) from test_int_array where JSON_OVERLAPS(b->'$.type', '[500]')
> OK
> Query Time: 0.409s
count(*)
40800

select count(*) from test_int_array where JSON_OVERLAPS(b->'$.type', '[500,600,900]')
> OK
> Query Time: 0.453s
count(*)
83504

2.2 With an index

# Add the index
CREATE INDEX types ON test_int_array ( (CAST(b->'$.type' AS UNSIGNED ARRAY)) );

select count(*) from test_int_array where 500 member of(b->'$.type')
> OK
> Query Time: 0.248s

select count(*) from test_int_array where JSON_CONTAINS(b->'$.type', '[500]')
> OK
> Query Time: 0.678s

select count(*) from test_int_array where JSON_CONTAINS(b->'$.type', '[500,600,900]')
> OK
> Query Time: 0.855s

select count(*) from test_int_array where JSON_OVERLAPS(b->'$.type', '[500]')
> OK
> Query Time: 0.764s

select count(*) from test_int_array where JSON_OVERLAPS(b->'$.type', '[500,600,900]')
> OK
> Query Time: 0.81s

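2.3 Test results

With the index in place the queries got slower, not faster: each of the five queries took roughly 1.6 to 2 times as long as it did without the index.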
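One way to put this result in context is to estimate how many rows each predicate is expected to match. Assuming the table holds the 100,000 rows mentioned in 2.1, and that each array contains five independent, uniformly random codes from 100 to 1000 plus the fixed 1100 (which is what proc_initData generates), the expected match fractions follow from basic probability. The helper names below are illustrative.

```python
from math import comb

DISTINCT_CODES = 10  # 100, 200, ..., 1000
RANDOM_SLOTS = 5     # random codes per row (the sixth element is always 1100)


def p_overlaps(m):
    """Probability that at least one of m given codes appears in a row."""
    return 1 - ((DISTINCT_CODES - m) / DISTINCT_CODES) ** RANDOM_SLOTS


def p_contains_all(m):
    """Probability that all m given codes appear in a row (inclusion-exclusion)."""
    return sum(
        (-1) ** j * comb(m, j) * ((DISTINCT_CODES - j) / DISTINCT_CODES) ** RANDOM_SLOTS
        for j in range(m + 1)
    )


rows = 100_000
print(round(rows * p_overlaps(1)))      # 500 MEMBER OF / JSON_CONTAINS '[500]'
print(round(rows * p_contains_all(3)))  # JSON_CONTAINS '[500,600,900]'
print(round(rows * p_overlaps(3)))      # JSON_OVERLAPS '[500,600,900]'
```

The estimates (about 41%, 4.4% and 83% of the table) are in line with the observed counts of 40800, 4095 and 83504. With predicates that match such a large share of the rows, going through the index and then fetching each row may plausibly cost more than a plain scan, which could explain the slowdown; that explanation is a guess and was not verified here.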

3. Extended usage

Here zipcode is pulled out of the JSON document and stored as its own JSON column:

CREATE TABLE `customers2` (
    `id` bigint NOT NULL AUTO_INCREMENT,
    `modified` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    `user` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
    `user_id` int DEFAULT NULL,
    `zipcode` json DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `zips` ((cast(`zipcode` as unsigned array)))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO customers2 VALUES
    (NULL, NOW(), "Jack", 37, '[94582,94536]'),
    (NULL, NOW(), "Jill", 22, json_array(94568,94507,94582)),
    (NULL, NOW(), "Bob", 31, '[94477,94507]'),
    (NULL, NOW(), "Mary", 72, '[94536]'),
    (NULL, NOW(), "Ted", 56, '[94507,94582]');

SELECT * FROM customers2 WHERE 94507 MEMBER OF(zipcode);

SELECT * FROM customers2 WHERE JSON_CONTAINS(zipcode, CAST('[94507,94582]' AS JSON));

Query plans:

explain SELECT * FROM customers2 WHERE 94507 MEMBER OF(zipcode);

explain SELECT * FROM customers2 WHERE JSON_CONTAINS(zipcode, CAST('[94507,94582]' AS JSON));

The queries run and return the correct rows, but the index is not used. It is unclear whether the index was defined incorrectly or whether this usage is simply not supported.

4. Multi-valued indexes on character data

The zipcode examples above index integer values. Multi-valued indexes on character data come with restrictions.

(From the official documentation)
Character sets and collations other than the following two combinations of character set and collation are not supported for multi-valued indexes:
a. The binary character set with the default binary collation
b. The utf8mb4 character set with the default utf8mb4_0900_as_cs collation.

To create a multi-valued index on character data, change UNSIGNED ARRAY to CHAR ARRAY in the index definition.

4.1 Collation check, example 1 (utf8mb4_0900_ai_ci)

CREATE TABLE `customers3` (
    `id` bigint NOT NULL AUTO_INCREMENT,
    `modified` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    `custinfo` json DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `zips` ((cast(json_extract(`custinfo`,_utf8mb4'$.zipcode') as char(6) array)))
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO customers3 VALUES
    (NULL, NOW(), '{"user":"Jack","user_id":37,"zipcode":["94582","94536"]}'),
    (NULL, NOW(), '{"user":"Jill","user_id":22,"zipcode":["94568","94507","94582"]}'),
    (NULL, NOW(), '{"user":"Bob","user_id":31,"zipcode":["94477","94507"]}'),
    (NULL, NOW(), '{"user":"Mary","user_id":72,"zipcode":["94536"]}'),
    (NULL, NOW(), '{"user":"Ted","user_id":56,"zipcode":["94507","94582"]}');

SELECT * FROM customers3 WHERE "94507" MEMBER OF(custinfo->'$.zipcode');

SELECT * FROM customers3 WHERE JSON_CONTAINS(custinfo->'$.zipcode', CAST('["94507","94582"]' AS JSON));

SELECT * FROM customers3 WHERE JSON_OVERLAPS(custinfo->'$.zipcode', CAST('["94507","94582"]' AS JSON));
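4.2 Collation check, example 2 (utf8mb4_0900_as_cs)

CREATE TABLE `customers4` (
    `id` bigint NOT NULL AUTO_INCREMENT,
    `modified` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    `custinfo` json DEFAULT NULL,
    PRIMARY KEY (`id`),
    KEY `zips` ((cast(json_extract(`custinfo`,_utf8mb4'$.zipcode') as char(6) array)))
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_cs;

The remaining steps are the same as in example 1.

Result: examples 1 and 2 behave identically. The execution plans show the index being used in both cases, so the table's default collation does not seem to matter. Whether there is any difference in actual performance was not tested.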
