一条SQL语句执行成本的计算
在一条单表查询语句真正执行之前,MySQL的查询优化器会找出执行该语句所有可能使用的方案,对比之后找出成本最低的方案,这个成本最低的方案就是所谓的执行计划,之后才会调用存储引擎提供的接口真正的执行查询。
准备数据
employees数据库来自MySQL官方示例数据库employees。
mysql> use employees;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -ADatabase changed
mysql> create table t_emp like employees;
Query OK, 0 rows affected (0.03 sec)mysql> insert into t_emp select * from employees;
Query OK, 300024 rows affected (2.57 sec)
Records: 300024 Duplicates: 0 Warnings: 0mysql> alter table t_emp add index idx_hire_date(hire_date);
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0mysql> alter table t_emp add index idx_birth_date(birth_date);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0mysql> show index from t_emp;
+-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_emp | 0 | PRIMARY | 1 | emp_no | A | 299645 | NULL | NULL | | BTREE | | |
| t_emp | 1 | idx_hire_date | 1 | hire_date | A | 5590 | NULL | NULL | | BTREE | | |
| t_emp | 1 | idx_birth_date | 1 | birth_date | A | 4770 | NULL | NULL | | BTREE | | |
+-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)
待分析的SQL:
mysql> explain select * from t_emp where hire_date > '1990-11-20' and birth_date > '1840-11-20';
+----+-------------+-------+------------+------+------------------------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | t_emp | NULL | ALL | idx_hire_date,idx_birth_date | NULL | NULL | NULL | 299645 | 25.00 | Using where |
+----+-------------+-------+------------+------+------------------------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
为什么走的是全表扫描,而不是索引idx_hire_date或idx_birth_date呢?
成本的计算过程
根据搜索条件,找出所有可能使用的索引
计算全表扫描的代价
计算使用不同索引执行查询的代价
对比各种执行方案的代价,找出成本最低的那个
计算全表扫描的代价
全表扫描即将聚簇索引从对应的页面加载到内存,然后检测记录是否满足条件计算。
计算公式:页面数 x 1 + 记录数 x 0.2
页面数和记录数如何获得?查看表的统计信息。
mysql> show table status like 't_emp'\G;
*************************** 1. row ***************************Name: t_empEngine: InnoDBVersion: 10Row_format: DynamicRows: 299645Avg_row_length: 50Data_length: 15220736
Max_data_length: 0Index_length: 9469952Data_free: 2097152Auto_increment: NULLCreate_time: 2021-08-19 07:19:44Update_time: 2021-08-19 07:19:31Check_time: NULLCollation: latin1_swedish_ciChecksum: NULLCreate_options:Comment:
1 row in set (0.00 sec)
从上面可以看到Rows: 299645
,记录数为299645,数据的大小为Data_length=15220736
,页面数=数据大小/16/1024=929。
从下面的统计表的数据中也可以直接看出记录数和页面数:
mysql> select * from mysql.innodb_table_stats where table_name='t_emp'\G;
*************************** 1. row ***************************database_name: employeestable_name: t_emplast_update: 2021-08-19 07:19:44n_rows: 299645clustered_index_size: 929
sum_of_other_index_sizes: 578
1 row in set (0.00 sec)
因此全表扫描的总成本为:929x1.0+299645x0.2=60858。
来看一下MySQL算的是多少:
mysql> explain format=json select * from t_emp where hire_date > '1990-11-20' and birth_date > '1840-11-20';
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"query_block": {"select_id": 1,"cost_info": {"query_cost": "60858.00"},"table": {"table_name": "t_emp","access_type": "ALL","possible_keys": ["idx_hire_date","idx_birth_date"],"rows_examined_per_scan": 299645,"rows_produced_per_join": 74910,"filtered": "25.00","cost_info": {"read_cost": "45875.85","eval_cost": "14982.15","prefix_cost": "60858.00","data_read_per_join": "3M"},"used_columns": ["emp_no","birth_date","first_name","last_name","gender","hire_date"],"attached_condition": "((`employees`.`t_emp`.`hire_date` > '1990-11-20') and (`employees`.`t_emp`.`birth_date` > '1840-11-20'))"}}
} |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
计算使用索引idx_hire_date的代价
计算二级索引成本的大步骤为:
计算搜索二级索引树中满足条件的记录
回表查询
搜索二级索引树代价计算:
IO成本:范围区间占用页面数量
,查询优化器粗暴的认为读取索引的一个范围区间的I/O成本和读取一个页面是相同的。
CPU成本:需要回表的记录数。
如何计算idx_hire_date在hire_date > '1990-11-20’这个范围区间中包含多少二级索引记录
:
当左边界记录和右边界记录相隔较小时
,直接遍历这些页面的PAGE HEADER中的PAGE_N_RECS字段(记录该页面有多少条数据)可以获得精确值。当左边界记录和右边界记录相隔较大时
,从左边界所在页面向右读取10个页面,计算平均每个页面中包含多少记录。再乘以左边界和右边界之间的页面数量即可
,左边界和右边界之间的页面数量可以从B+树中的父节点获取(若跨越太多页面则需要递归)。
这里我们可以用count(*)计算一下:
mysql> select count(*) from t_emp where hire_date > '1990-11-20';
+----------+
| count(*) |
+----------+
| 112374 |
+----------+
1 row in set (0.05 sec)
搜索二级索引树的成本:1x1.0+112374x0.2=22475.8
回表查询代价计算
:回表查询聚簇索引需加载的页面数量(用于计算IO成本)
,设计MySQL的大叔评估回表操作的I/O成本依旧很豪放,他们认为每次回表操作都相当于访问一个页面
,回表查询定位页面后,需要定位记录并且判断其他过滤条件(用于计算CPU成本)。
回表的成本:112374x1.0+112374x0.2=134848.8
总成本:22475.8+134848.8=157324.6
再来看一下MySQL算的是多少:
mysql> explain format=json select * from t_emp force index(idx_hire_date) where hire_date > '1990-11-20' and birth_date > '1840-11-20';
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"query_block": {"select_id": 1,"cost_info": {"query_cost": "209751.81"},"table": {"table_name": "t_emp","access_type": "range","possible_keys": ["idx_hire_date"],"key": "idx_hire_date","used_key_parts": ["hire_date"],"key_length": "3","rows_examined_per_scan": 149822,"rows_produced_per_join": 49935,"filtered": "33.33","index_condition": "(`employees`.`t_emp`.`hire_date` > '1990-11-20')","cost_info": {"read_cost": "199764.68","eval_cost": "9987.13","prefix_cost": "209751.81","data_read_per_join": "2M"},"used_columns": ["emp_no","birth_date","first_name","last_name","gender","hire_date"],"attached_condition": "(`employees`.`t_emp`.`birth_date` > '1840-11-20')"}}
} |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
为什么和我们算的不一样?因为MySQL使用的是统计数据,计算出来的记录数为149822。
总成本=1 x 1.0 + 149822 x 0.2 + 149822 x 1.0 + 149822 x 0.2 = 209751.8
计算使用索引idx_birth_date的代价
二级索引idx_birth_date的成本的计算方式类似。
记录数:
mysql> select count(*) from t_emp where birth_date > '1840-11-20';
+----------+
| count(*) |
+----------+
| 300024 |
+----------+
1 row in set (0.08 sec)
二级索引扫描成本:1 x 1.0 + 300024 x 0.2=60005.8
回表成本:300024 x 1.0 + 300024 x 0.2=360,028.8
总成本:420034.6
通过对比各种执行方案的代价,找出成本最低的那个为全表扫描。
看下MySQL算出来的成本:
mysql> explain format=json select * from t_emp force index(idx_birth_date) where hire_date > '1990-11-20' and birth_date > '1840-11-20';
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"query_block": {"select_id": 1,"cost_info": {"query_cost": "209751.81"},"table": {"table_name": "t_emp","access_type": "range","possible_keys": ["idx_birth_date"],"key": "idx_birth_date","used_key_parts": ["birth_date"],"key_length": "3","rows_examined_per_scan": 149822,"rows_produced_per_join": 49935,"filtered": "33.33","index_condition": "(`employees`.`t_emp`.`birth_date` > '1840-11-20')","cost_info": {"read_cost": "199764.68","eval_cost": "9987.13","prefix_cost": "209751.81","data_read_per_join": "2M"},"used_columns": ["emp_no","birth_date","first_name","last_name","gender","hire_date"],"attached_condition": "(`employees`.`t_emp`.`hire_date` > '1990-11-20')"}}
} |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
这个索引可以看出MySQL的统计数据很不准确。
optimizer_trace
上面只能看到MySQL选择的方案的成本,要想看到MySQL对SQL的执行过程可以使用optimizer_trace来查看。
参数介绍:
QUERY:跟踪语句的文本。
TRACE:跟踪,JSON格式。
MISSING_BYTES_BEYOND_MAX_MEM_SIZE:每个记住的跟踪都是一个字符串,随着优化的进行扩展并将其附加数据。该optimizer_trace_max_mem_size 变量设置所有当前记忆的跟踪所使用的内存总量的限制。如果达到此限制,则当前跟踪不会扩展(因此是不完整的),并且该MISSING_BYTES_BEYOND_MAX_MEM_SIZE列显示该跟踪丢失的字节数。
INSUFFICIENT_PRIVILEGES:如果跟踪的查询使用SQL SECURITY值为的视图或存储的例程DEFINER,则可能是拒绝了除定义者之外的其他用户查看查询的跟踪。在这种情况下,跟踪显示为空,INSUFFICIENT_PRIVILEGES值为1。否则值为0。
我们可以对SQL进行optimizer_trace :
-- optimizer_trace默认不开启,开启会影响性能,用完记得关闭
mysql> show variables like '%optimizer_trace%';
+------------------------------+----------------------------------------------------------------------------+
| Variable_name | Value |
+------------------------------+----------------------------------------------------------------------------+
| optimizer_trace | enabled=off,one_line=off |
| optimizer_trace_features | greedy_search=on,range_optimizer=on,dynamic_range=on,repeated_subselect=on |
| optimizer_trace_limit | 1 |
| optimizer_trace_max_mem_size | 16384 |
| optimizer_trace_offset | -1 |
+------------------------------+----------------------------------------------------------------------------+
5 rows in set (0.01 sec)-- 设置开启optimizer_trace
mysql> set optimizer_trace="enabled=on";
Query OK, 0 rows affected (0.00 sec)mysql> explain select * from t_emp where hire_date > '1990-11-20' and birth_date > '1840-11-20';
+----+-------------+-------+------------+------+------------------------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | t_emp | NULL | ALL | idx_hire_date,idx_birth_date | NULL | NULL | NULL | 299468 | 25.00 | Using where |
+----+-------------+-------+------------+------+------------------------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)mysql> select * from INFORMATION_SCHEMA.OPTIMIZER_TRACE\G;
*************************** 1. row ***************************QUERY: explain select * from t_emp where hire_date > '1990-11-20' and birth_date > '1840-11-20'TRACE: {"steps": [{"join_preparation": {"select#": 1,"steps": [{"expanded_query": "/* select#1 */ select `t_emp`.`emp_no` AS `emp_no`,`t_emp`.`birth_date` AS `birth_date`,`t_emp`.`first_name` AS `first_name`,`t_emp`.`last_name` AS `last_name`,`t_emp`.`gender` AS `gender`,`t_emp`.`hire_date` AS `hire_date` from `t_emp` where ((`t_emp`.`hire_date` > '1990-11-20') and (`t_emp`.`birth_date` > '1840-11-20'))"}]}},{"join_optimization": {"select#": 1,"steps": [{"condition_processing": {"condition": "WHERE","original_condition": "((`t_emp`.`hire_date` > '1990-11-20') and (`t_emp`.`birth_date` > '1840-11-20'))","steps": [{"transformation": "equality_propagation","resulting_condition": "((`t_emp`.`hire_date` > '1990-11-20') and (`t_emp`.`birth_date` > '1840-11-20'))"},{"transformation": "constant_propagation","resulting_condition": "((`t_emp`.`hire_date` > '1990-11-20') and (`t_emp`.`birth_date` > '1840-11-20'))"},{"transformation": "trivial_condition_removal","resulting_condition": "((`t_emp`.`hire_date` > '1990-11-20') and (`t_emp`.`birth_date` > '1840-11-20'))"}]}},{"substitute_generated_columns": {}},{"table_dependencies": [{"table": "`t_emp`","row_may_be_null": false,"map_bit": 0,"depends_on_map_bits": []}]},{"ref_optimizer_key_uses": []},{"rows_estimation": [{"table": "`t_emp`","range_analysis": {"table_scan": {"rows": 299645,"cost": 60860},"potential_range_indexes": [{"index": "PRIMARY","usable": false,"cause": "not_applicable"},{"index": "idx_hire_date","usable": true,"key_parts": ["hire_date","emp_no"]},{"index": "idx_birth_date","usable": true,"key_parts": ["birth_date","emp_no"]}],"setup_range_conditions": [],"group_index_range": {"chosen": false,"cause": "not_group_by_or_distinct"},"analyzing_range_alternatives": {"range_scan_alternatives": [{"index": "idx_hire_date","ranges": ["0x748d0f < hire_date"],"index_dives_for_eq_ranges": true,"rowid_ordered": false,"using_mrr": false,"index_only": false,"rows": 149822,"cost": 179787,"chosen": false,"cause": "cost"},{"index": "idx_birth_date","ranges": ["0x74610e < birth_date"],"index_dives_for_eq_ranges": true,"rowid_ordered": false,"using_mrr": false,"index_only": false,"rows": 149822,"cost": 179787,"chosen": false,"cause": "cost"}],"analyzing_roworder_intersect": {"usable": false,"cause": "too_few_roworder_scans"}}}}]},{"considered_execution_plans": [{"plan_prefix": [],"table": "`t_emp`","best_access_path": {"considered_access_paths": [{"rows_to_scan": 299645,"access_type": "scan","resulting_rows": 74911,"cost": 60858,"chosen": true}]},"condition_filtering_pct": 100,"rows_for_plan": 74911,"cost_for_plan": 60858,"chosen": true}]},{"attaching_conditions_to_tables": {"original_condition": "((`t_emp`.`hire_date` > '1990-11-20') and (`t_emp`.`birth_date` > '1840-11-20'))","attached_conditions_computation": [],"attached_conditions_summary": [{"table": "`t_emp`","attached": "((`t_emp`.`hire_date` > '1990-11-20') and (`t_emp`.`birth_date` > '1840-11-20'))"}]}},{"refine_plan": [{"table": "`t_emp`"}]}]}},{"join_explain": {"select#": 1,"steps": []}}]
}
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.00 sec)
细心的你肯定会发现通过optimizer_trace打印的idx_hire_date成本与explain打印的不一致,差别在于回表的CPU成本。
explain:总成本=1 x 1.0 + 149822 x 0.2 + 149822 x 1.0 + 149822 x 0.2 = 209751.8
optimizer_trace:总成本=1 x 1.0 + 149822 x 0.2 + 149822 x 1.0 = 179787.4
一条SQL语句执行成本的计算相关推荐
- 一条SQL语句计算出商品剩余库存数量
库存表(stock) 商品名称(mc) 商品总量(zl) A 100 B 120 销售表(sales) 商品名 ...
- 52条SQL语句性能优化策略
本文会提到 52 条 SQL 语句性能优化策略. 1.对查询进行优化,应尽量避免全表扫描,首先应考虑在 WHERE 及 ORDER BY 涉及的列上建立索引. 2.应尽量避免在 WHERE 子句中对字 ...
- sql加上唯一索引后批量插入_阿里大佬总结的52条SQL语句性能优化策略,建议收藏...
你知道的越多,不知道的就越多,业余的像一棵小草! 你来,我们一起精进!你不来,我和你的竞争对手一起精进! 编辑:业余草 cnblogs.com/SimpleWu/p/9929043.html 推荐:h ...
- oracle sql语句加减,实现四则运算的一条sql语句
实现四则运算的一条sql语句 1.建立一个测试表 create table mar_test( id number, text varchar2(200)) insert into mar_Test ...
- 一条SQL语句查询所有任务分数都在60以上的用户名字
一条SQL语句查询所有任务分数都在60以上的用户名字 昨天面试一家公司的笔试题,有两道SQL语句查询错误,回到家进行再想想思路,做一个记录 有两张表:数据是假设的,不是真实的 用户表(user) 任务 ...
- 一条SQL语句求全年平均值
一年有8760个小时!(才这么点...) 有个气候表,存储了当地从1到8760小时的温度数据.现在,要求全年的温度每天平均值. CREATE TABLE #Climate(h INT ,t DECIM ...
- 【MySQL】47 条SQL语句性能优化策略
本文会提到 47 条 SQL 语句性能优化策略. 1. 对查询进行优化,应尽量避免全表扫描,首先应考虑在 WHERE 及 ORDER BY 涉及的列上建立索引. 2. 应尽量避免在 WHERE 子句中 ...
- 52条SQL语句性能优化策略,建议收藏
点击上方 "编程技术圈"关注, 星标或置顶一起成长 后台回复"大礼包"有惊喜礼包! 每日英文 Promise yourself to be so strong ...
- 52条 SQL 语句性能优化策略全面解析,你知道多少种?建议收藏!
点击上方 "编程技术圈"关注, 星标或置顶一起成长 后台回复"大礼包"有惊喜礼包! 每日英文 Don't always in the memories of t ...
最新文章
- 树莓派安装Ubuntu
- maven项目的pom.xml 标签的含义
- python字符串数组中最短的_python求解数组中两个字符串的最小距离
- Jquery- 错误消息Date未定义,String未定义
- 谷歌2007年上交大考试最后一题解答
- 使用JSF 2.0可以更轻松地进行多字段验证
- [css] 有用过Flex吗?简要说下你对它的了解
- java线程如何避免死锁_Java面试问题,如何避免Java线程中的死锁?
- ffmpeg 2.6.3在Linux下的编译
- 电位器替换遥控器做远程遥控方案,远程控制云台方案
- JVM和GC知识点整理
- scala上界_Scala方差,上界和下界
- 基于MATLAB实现四阶龙格库塔法求解一、二阶微分方程实例
- Hadoop 3.1.0 单机版伪分布式的搭建
- Leftist Heaps 习题解
- 基于auto.js的安卓抢购软件---可实现毫秒级
- ae合成设置快捷键_超全AE快捷键大全
- 访问网络计算机提示0 80070035,Win7局域网连接提示0x80070035错误的解决攻略
- Android error: “Apostrophe not preceded by \” 解决办法
- wps去除登陆_WPS强制用户登录功能,这是要闹哪样?
热门文章
- ENVI中的辐射校正
- SSM框架解决QQ邮箱激活535 Error: ÇëʹÓÃÊÚȨÂëµÇ¼¡£ÏêÇéÇë¿´及端口25被占用问题
- Python中字符串截取
- 熬夜肝了万字Android View 知识体系
- 有道笔记 android,随时随地,记录一切:有道笔记Android版体验
- LeetCode 362. 敲击计数器
- 分数的加减乘除(运算符重载)
- 如何根据自己的需要培养游戏开发技能?又一篇游戏编程入门指南
- 哥德巴赫猜想python版
- TCP/IP协议详解(TCP/IP protocol)