SQL治理经验谈：索引覆盖

背景

explain - format

id: query sql 的标识id
SELECT_TYPE: 查询的类型*（SIMPLE/PRIMARY/SUBQUERY/DERIVED/UNION/UNION RESULT/* DEPENDENT SUBQUERY/DEPENDENT UNION）
table: 表名
Partitions: 表连接的分区数
type: 查询中使用的访问类型（system/const/eq_ref/ref/range/index/ALL），见下表type的字段解析
possible_keys
- 显示可能应用在这张表中的索引，一个或多个。
- 查询涉及到的字段上若存在索引，则该索引将被列出，但不一定被查询实际使用
key
- 实际使用的索引，如果为NULL，则没有使用索引.
- 查询中若使用了覆盖索引，则该索引和查询的select字段重叠
key_len: 使用到的索引的长度
ref: 显示了查询条件类型（const/field_name/func）
rows: query查询到的行数量
- query完成索引命中之后，才会去检查的行总数
- 估算性能耗时：rows用来表示在SQL执行过程中会被扫描的行数，该数值越大，意味着需要扫描的行数，相应的耗时更长。但是需要注意的是EXPLAIN中输出的rows只是一个估算值，不能完全对其百分之百相信，如EXPLAIN中对LIMITS的支持就比较有限。可以参考文章《MySQL EXPLAIN limits and errors》
- 这个rows就是mysql认为必须要逐行去检查和判断的记录的条数。
Filtered: where子句的过滤条件
Extra: query子句执行的附加信息

expalin - type

|-----------------|--------|----------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| System Name | Color | Text on Visual Diagram | Tooltip Related Information |
| SYSTEM | Blue | Single row: system constant | Very low cost |
| CONST | Blue | Single row: constant | Very low cost |
| EQ_REF | Green | Unique Key Lookup | Low cost -- The optimizer is able to find an index that it can use to retrieve the required records. It is fast because the index search directly leads to the page with all the row data |
| REF | Green | Non-Unique Key Lookup | Low-medium -- Low if the number of matching rows is small; higher as the number of rows increases |
| FULLTEXT | Yellow | Fulltext Index Search | Specialized FULLTEXT search. Low -- for this specialized search requirement |
| REF_OR_NULL | Green | Key Lookup + Fetch NULL Values | Low-medium -- if the number of matching rows is small; higher as the number of rows increases |
| INDEX_MERGE | Green | Index Merge | Medium -- look for a better index selection in the query to improve performance |
| UNIQUE_SUBQUERY | Orange | Unique Key Lookup into table of subquery | Low -- Used for efficient Subquery processing |
| INDEX_SUBQUERY | Orange | Non-Unique Key Lookup into table of subquery | Low -- Used for efficient Subquery processing |
| RANGE | Orange | Index Range Scan | Medium -- partial index scan |
| INDEX | Red | Full Index Scan | High -- especially for large indexes |
| ALL | Red | Full Table Scan | Very High -- very costly for large tables, but less of an impact for small ones. No usable indexes were found for the table, which forces the optimizer to search every row. This could also mean that the search range is so broad that the index would be useless. |
| UNKNOWN | Black | unknown | Note: This is the default, in case a match cannot be determined |

实践

建表

sql 复制代码

CREATE TABLE `test_like` (
  `id` bigint NOT NULL AUTO_INCREMENT COMMENT '主键',
  `name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT '账号',
  `age` int NOT NULL DEFAULT '0' COMMENT '年龄',
  `email` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT '邮箱'
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

测试数据

sql 复制代码

INSERT into test_like 
(id, name, age, email)
values
(1, 'aaa', 111, '1111'),
(2, 'aaa', 111, '1111'),
(3, 'aaa', 111, '1111'),
(4, 'aaa', 111, '1111'),
(5, 'aaa', 111, '1111');

添加单列索引

sql 复制代码

-- 单索引 索引覆盖，通过索引idx_on_name的B+树的页节点，可以直接过滤到
ALTER table test_like add key idx_on_name (`name`);

验证案例

案例一：全列检索 select *

测试结果

通配符在右侧

explain select * from test_like where name like '11%';

单列索引，通配符在右侧，select * 会走索引，type=range检索性能好

通配符在左右两侧

explain select * from test_like where name like '%11%';

单列索引，通配符在左右两侧，select * 不会走索引，type=ALL检索性能差

通配符在左侧

explain select * from test_like where name like '%11';

单列索引，通配符在左侧，select * 不会走索引，type=ALL检索性能差

结果说明

加了单列索引还不够，因为select 列和 where 条件语句两个变量，都可能导致sql执行效果走了全表扫描，性能非常差。想要性能过得去，得确保通配符在右侧。

案例二：索引列检索 select [索引列]

测试结果

通配符在右侧

单列索引，通配符在右侧，select [索引列] 会走索引，type=range检索性能好

通配符在左右两侧

explain select * from test_like where name like '%11%';

单列索引，通配符在右侧，select [索引列] 会走索引，

type=index，检索性能和二级索引树的数据量相关；sql性能随着二级索引树节点数量变多，性能变差

通配符在左侧

explain select * from test_like where name like '%11';

单列索引，通配符在左侧，select [索引列] 会走索引，

type=index，检索性能和二级索引树的数据量相关；sql性能随着二级索引树节点数量变多，性能变差

结果说明

案例一的select * 性能优化，可以通过案例二的方法去优化：

select [索引列] 因为索引覆盖，所以会避免了全表扫描的结果，最终性能有提高

案例三：联合索引列检索 select [多索引列]

我们对表的 name、age设置了联合索引

sql 复制代码

-- 联合索引 索引覆盖，通过联合索引idx_on_name_age的页节点，可以直接过滤到
ALTER table test_like add key idx_on_name_age (`name`, `age`);

测试结果

select 满足最左匹配原则 - 单列

select [联合索引左侧单列] + 通配符在右侧

explain select name from test_like where name like '11%';

联合索引，通配符在右侧，select [索引列] 会走索引，type=range检索性能好

select [联合索引左侧单列] + 通配符在左右两侧

explain select name from test_like where name like '%11%';

联合索引，通配符在两侧，select [索引列] 会走索引，type=index，检索性能随索引数据量变大而变差

select [联合索引左侧单列] + 通配符在左侧

explain select name from test_like where name like '%11';

联合索引，通配符在左侧，select [索引列] 会走索引，type=index，检索性能随索引数据量变大而变差

select 不满足最左匹配原则 - 单列

select [联合索引非最左侧单列] + 通配符右侧

explain select age from test_like where name like '11%';

联合索引，通配符在右侧，select [索引列] 会走索引，type=range，检索性能好

select [联合索引非最左侧单列] + 通配符左右两侧

explain select age from test_like where name like '%11%';

联合索引，通配符在两侧，select [索引列] 会走索引，type=ALL，检索性能差

select [联合索引非最左侧单列] + 通配符左侧

explain select age from test_like where name like '%11';

联合索引，通配符在左侧，select [索引列] 会走索引，type=ALL，检索性能差

select 满足最左匹配原则 - 多列索引列（不包含非索引列）

select [联合索引多列] + 通配符在右侧

explain select name,age from test_like where name like '11%';

联合索引，通配符在右侧，select [索引列] 会走索引，type=range，检索性能好

select [联合索引多列] + 通配符在两侧

explain select name,age from test_like where name like '%11%';

联合索引，通配符在两侧，select [索引列] 会走索引，type=ALL，检索性能差

select [联合索引多列] + 通配符在左侧

explain select name,age from test_like where name like '%11';

联合索引，通配符在左侧，select [索引列] 会走索引，type=ALL，检索性能差

select 满足最左匹配原则 - 多列索引列（包含非索引列）

select [联合索引多列+非索引列] + 通配符在右侧

explain select name,age,email from test_like where name like '11%';

联合索引，通配符在右侧，select [索引列] 会走索引，type=range，检索性能好

select [联合索引多列+非索引列] + 通配符在两侧

explain select name,age,email from test_like where name like '%11%';

联合索引，通配符在两侧，select [索引列] 会走索引，type=ALL，检索性能差

select [联合索引多列+非索引列] + 通配符在左侧

explain select name,age,email from test_like where name like '%11';

联合索引，通配符在左侧，select [索引列] 会走索引，type=ALL，检索性能差

结果说明

联合索引下，检索最左侧单列，无论通配符位置，都会索引覆盖
联合索引下，检索非最左侧单列，只有通配符左侧位置，才会索引覆盖
联合索引下，检索多列索引，无论是否包含非索引列，通配符在右侧时，才会索引覆盖

参考文章

EXPLAIN in SQL：https://www.geeksforgeeks.org/explain-in-sql/

官网Mysql的EXPLAIN信息描述：

https://dev.mysql.com/doc/workbench/en/wb-performance-explain.html

其他文章

SQL性能治理经验谈

理解到位：灾备和只读数据库

记录一次Mysql死锁事件（由Insert与uniqueKey导致）

一文带你看懂：亿级大表垂直拆分的工程实践

亿级大表冷热分级的工程实践