order by语句执行顺序

后面也会持续更新，学到新东西会在其中补充。

建议按顺序食用，欢迎批评或者交流！

缺什么东西欢迎评论！我都会及时修改的！

感谢各位大佬写的文章让我学到很多东西！只是在各位大佬的基础加了我自己的思路！

参考文献

Mysql order by 优化及分析执行的神器-Optimizer TraceMysql order by. 优化 - 掘金

看一遍就理解：order by详解

Mysql调优之Using filesort一般情况-CSDN博客

Mysql中单路排序和双路排序详解-CSDN博客

测试环境

MySql5.7、MySql9.0.4

环境搭建

sql 复制代码

CREATE TABLE `staff` (
`id` BIGINT ( 11 ) AUTO_INCREMENT COMMENT '主键id',
`id_card` VARCHAR ( 20 ) NOT NULL COMMENT '身份证号码',
`name` VARCHAR ( 64 ) NOT NULL COMMENT '姓名',
`age` INT ( 4 ) NOT NULL COMMENT '年龄',
`city` VARCHAR ( 64 ) NOT NULL COMMENT '城市',
PRIMARY KEY ( `id`),
INDEX idx_city ( `city` )
) ENGINE = INNODB COMMENT '员工表';
 
insert staff values (1,'449006xxxxxxxx2134','小明',22,'上海');
insert staff values (2,'449006xxxxxxxx2134','小李',23,'深圳');
insert staff values (3,'449006xxxxxxxx2134','小刚',28,'东羡');
insert staff values (4,'449006xxxxxxxx2134','小红',20,'上海');
insert staff values (5,'449006xxxxxxxx2134','小芳',36,'北京');
insert staff values (6,'449006xxxxxxxx2134','小莉',19,'深圳');
insert staff values (7,'449006xxxxxxxx2134','小华',27,'振江');

sql 复制代码

explain select name,age,city from staff where city = '深圳' order by age limit 10;

全字段排序

需要做到心中有"树"。

MySQL 会给每个查询线程分配一块小内存，用于排序的，称为 sort_buffer。

MySQL 为对应的线程初始化sort_buffer，放入需要查询的name、age、city字段；

运行条件

sql 复制代码

MySql9.0.4
mysql> show variables like '%max_length_for_sort_data%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| max_length_for_sort_data | 4096  |
+--------------------------+-------+
1 row in set (0.02 sec)

小于4096 走全字段排序 大于4096 走RowID排序

查询所需的字段全部读取到sort_buffer中，就是全字段排序。

sort_buffer是一块内存来的，如果数据量太大，sort_buffer放不下怎么办呢？

磁盘临时文件辅助排序

如果要排序的数据小于sort_buffer_size，排序在sort_buffer 内存中完成，如果要排序的数据大于sort_buffer_size，则借助磁盘文件来进行排序。

RowID排序

全字段和RowID

MySQL的一个设计思想：如果内存够，就要多利用内存，尽量减少磁盘访问。

rowID 进行两次回表。

全字段进行一次回表。

全字段一次性从磁盘中读取查询需要的所有列，按照 order by 列在 sort_buffer(排序缓存) 缓冲区对他们进行排序，然后扫描排序后的列表输出。因为全字段效率更快，避免了二次读取数据，把随机IO变成了顺序IO，但是会使用更多的空间。

优化的方向则是覆盖索引

全字段测试

sql 复制代码

## 打开optimizer_trace，开启统计
set optimizer_trace = "enabled=on";
## 执行SQL语句
select name,age,city from staff where city = '深圳' order by age limit 10;
## 查询输出的统计信息
SELECT  * FROM `information_schema`.`OPTIMIZER_TRACE`;

MySql9.0.4

接下来解释一下filesort_summary里面的参数！

memory_available: 262144 字节 (256 KB) 是分配给此排序操作可用的最大内存。

key_size: 排序键的大小是 8 字节。也就是索引键(create index idx on student(id))
row_size: 每行数据的大小为 532 字节。和行的类型有关！
max_rows_per_buffer: 每个内存缓冲区最多可以容纳 15 行数据。
num_rows_estimate: 预估的行数为 15 行。
num_rows_found: 实际找到并处理了 2 行数据。
num_initial_chunks_spilled_to_disk: 初始时没有数据块溢出到磁盘（即所有排序都在内存中完成）。
peak_memory_used: 排序过程中使用的峰值内存量为 32784 字节。

排序模式sort_mode

<varlen_sort_key,rowid>：使用了rowid排序模式
<varlen_sort_key, additional_fields>：使用了全字段排序模式
<varlen_sort_key, packed_additional_fields>：使用了打包字段排序模式（与全字段排序模式工作原理一致，不同点在于会将字段紧密的排列在一起，而不是固定长度的空间）通俗的讲就是比如：一个字段定义为VARCHAR(32)，不打包占用32字节，打包后可能占用 20字节。

MySql5.7

Rows Estimate: 预估需要排序的行数为 936 行。
Row Size: 每行数据的大小为 532 字节。
Memory Available: 可用内存为 262144 字节 (256 KB)。

Chosen : 优先队列优化被选中使用 (chosen 为 true)，这意味着当结果集很大时，MySQL 使用了优先队列算法来优化 LIMIT 查询，以减少处理的数据量和提高效率。
Rows: 实际返回的行数为 2 行。
Examined Rows: 实际检查的行数也是 2 行。
Number of Tmp Files: 创建的临时文件数量为 0，表示整个排序过程都在内存中完成，无需使用磁盘上的临时文件。
Sort Buffer Size: 排序缓冲区的大小为 5944 字节。
Sort Mode : 排序模式为 <sort_key, additional_fields>，意味着排序键加上额外的字段一起被排序。

RowID测试

MySql9.0.4

sql 复制代码

mysql> show variables like '%max_length_for_sort_data%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| max_length_for_sort_data | 4096  |
+--------------------------+-------+
1 row in set (0.02 sec)

# 设置的小一点，让它走RowID排序； 这个值小于 name,age,city 这三个值
set max_length_for_sort_data = 10;

## 打开optimizer_trace，开启统计
set optimizer_trace = "enabled=on";
## 执行SQL语句
select name,age,city from staff where city = '深圳' order by age limit 10;
## 查询输出的统计信息
SELECT  * FROM `information_schema`.`OPTIMIZER_TRACE`;

发现优化器还是走了打包字段排序。

sql 复制代码

set max_length_for_sort_data = 4096;

MySql5.7

sql 复制代码

mysql> show variables like '%max_length_for_sort_data%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| max_length_for_sort_data | 1024  |
+--------------------------+-------+
1 row in set (0.01 sec)

# 设置的小一点，让它走RowID排序； 这个值小于 name,age,city 这三个值
set max_length_for_sort_data = 10;

## 打开optimizer_trace，开启统计
set optimizer_trace = "enabled=on";
## 执行SQL语句
select name,age,city from staff where city = '深圳' order by age limit 10;
## 查询输出的统计信息
SELECT  * FROM `information_schema`.`OPTIMIZER_TRACE`;

sql 复制代码

mysql> set max_length_for_sort_data = 1024;
Query OK, 0 rows affected (0.00 sec)

优化

1.调整参数：

max_length_for_sort_data ；修改其大小，可以让mysql 排序选择全局排序还是rowid 字段排序。(从 8.0.20 开始废弃了)
sort_buffer_size 增加 [sort_buffer_size](MySQL :: MySQL 8.0 Reference Manual :: 7.1.8 Server System Variables) 变量值。理想情况下，该值应该足够大，以便整个结果集适合排序缓冲区（以避免写入磁盘和合并过程）。
增加 [read_rnd_buffer_size](MySQL :: MySQL 8.0 参考手册 :: 7.1.8 服务器系统变量 --- MySQL :: MySQL 8.0 Reference Manual :: 7.1.8 Server System Variables) 变量值以便一次读取更多行。

2.使用覆盖索引

若字段上有索引order by还会文件排序吗？

sql 复制代码

mysql> show index from staff;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| staff |          0 | PRIMARY  |            1 | id          | A         |           7 |     NULL | NULL   |      | BTREE      |         |               |
| staff |          1 | idx_city |            1 | city        | A         |           5 |     NULL | NULL   |      | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.01 sec)

第一段代码就是走了覆盖索引。

没有where条件，order by字段需要加索引吗？

不加索引

sql 复制代码

mysql> show index from staff;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| staff |          0 | PRIMARY  |            1 | id          | A         |           7 |     NULL | NULL   |      | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)

mysql> explain select * from staff order by city;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | staff | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    7 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)

加索引

sql 复制代码

mysql> create index idx_city on staff(city);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select * from staff order by city;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | staff | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    7 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.01 sec)

mysql> show index from staff;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| staff |          0 | PRIMARY  |            1 | id          | A         |           7 |     NULL | NULL   |      | BTREE      |         |               |
| staff |          1 | idx_city |            1 | city        | A         |           5 |     NULL | NULL   |      | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)

无条件查询的话，即使city上有索引，也不会使用到。因为MySQL优化器认为走普通二级索引，再去回表成本比全表扫描排序更高。所以选择走全表扫描,然后根据全字段排序或者rowid排序来进行。

修改一下SQL语句

无条件查询,如果m值较小,是可以走索引的.因为MySQL优化器认为，根据索引有序性去回表查数据,然后得到m条数据,就可以终止循环,那么成本比全表扫描小,则选择走二级索引。

索引存储顺序与order by不一致，如何优化？

sql 复制代码

select name,age from staff order by age ,name desc limit 10;
                            age 小到大    name 大到小

我们知道name相对age是有序，name是小到大而我们查询的语句是从大到小就会出现文件排序。

sql 复制代码

mysql> create index idx_name_age on staff(name,age);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select name,age from staff order by name,age desc;
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | staff | NULL       | index | NULL          | idx_name_age | 262     | NULL |    7 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select name,age from staff order by name,age desc limit 1;
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | staff | NULL       | index | NULL          | idx_name_age | 262     | NULL |    7 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.01 sec)

如何优化呢修改一下索引存储顺序

说白了就算索引存储顺序要和order by顺序对应。

使用了in条件多个属性时，SQL执行是否有排序过程

sql 复制代码

mysql> create index idx_name_age on staff(name,age);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show index from staff;
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| staff |          0 | PRIMARY      |            1 | id          | A         |           7 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| staff |          1 | idx_name_age |            1 | name        | A         |           7 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| staff |          1 | idx_name_age |            2 | age         | A         |           7 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
3 rows in set (0.01 sec)

和我之前说的相似in('深圳') 相当于 = '深圳'

in ('深圳','122121') 有点类似于like

这是因为:in有两个条件，在满足深圳时，age是排好序的，但是把满足122121的age也加进来，就不能保证满足所有的age都是排好序的。因此需要Using filesort。

假如id为主键索引，因为主键索引是有序的，所以导致order by id 失效！

sql 复制代码

select * from a order by id

不是按照最左前缀法则

分页limit过大时，会导致大量排序怎么办?（待续）

假设SQL如下：

复制代码

select * from A order by a limit 100000,10

可以记录上一页最后的id，下一页查询时，查询条件带上id，如：where id > 上一页最后id limit 10。
也可以在业务允许的情况下，限制页数。

Using filesort文件排序

环境搭建：

sql 复制代码

create table tb(id int, type int, weight int, index t_index(type), index w_index(weight));


编写存储过程！
DELIMITER $

DROP PROCEDURE IF EXISTS fill_test_data; $

CREATE PROCEDURE `fill_test_data`()
BEGIN
    DECLARE i INT DEFAULT 1;
    DECLARE w INT DEFAULT 100;
    DECLARE t INT DEFAULT 1;

    WHILE i <= 100000 DO
        INSERT INTO tb VALUES (i, t, w);
        SET i = i + 1;
        SET w = (w + 10) % 1000;
        SET t = (t + 1) % 10;
    END WHILE;
END $

DELIMITER ;

call fill_test_data(); 需要执行一段时间

type和weight分别创建了索引

第一段代码：走t_index索引->回表->全字段排序/rowid排序->结果集

第二段代码：全表扫描-> 全字段排序/rowid排序->结果集

和select *好像并没有关系，因为索引表中只有type 和主键ID weight的数据只能去原数据表拿取。

众所周知索引是有序的我们是否能把两个字段结合起来呢？

再进行测试一下！

第一段代码/第二段代码：都是覆盖索引

第三段代码：范围扫描idx_sum索引->结果集

发现using filesort被排除了！

需要注意的地方

order by 字段不是索引字段
order by 字段是索引字段，但是 select 中没有使用覆盖索引，如：select * from staffs order by age asc;
order by 中同时存在 ASC 升序排序和 DESC 降序排序，如：select a, b from staffs order by a desc, b asc;
order by 多个字段排序时，不是按照索引顺序进行 order by，即不是按照最左前缀法则，如：select a, b from staffs order by b asc, a asc;

总结

遵守最左原则

选择index列进行排序

FileSort和Index，Index效率较高，FileSort方式效率较低。

留点问题

select * from a where type = 5 order by id 会发生文件排序吗？

在type 和 id 上分别建索引呢？

那么在type 和 id 上建了索引呢？

order by语句执行顺序

参考文献

测试环境

环境搭建

全字段排序

磁盘临时文件辅助排序

RowID排序

全字段和RowID

全字段测试

RowID测试

优化

若字段上有索引order by还会文件排序吗？

没有where条件，order by字段需要加索引吗 ？

索引存储顺序与order by不一致，如何优化？

使用了in条件多个属性时，SQL执行是否有排序过程

假如id为主键索引，因为主键索引是有序的，所以导致order by id 失效！

不是按照最左前缀法则

分页limit过大时，会导致大量排序怎么办?（待续）

Using filesort文件排序

需要注意的地方

总结

留点问题

没有where条件，order by字段需要加索引吗？