为什么使用order by子句可以利用索引？

Question

20 浏览2023年5月27日

匿名的 2023年5月27日

0 Comments

假设在tableX中，我们有id（主键），name和age，phone，所有都有索引。

在这个查询中：

select phone from tableX where name='Dennis' order by age

我猜测过程是：

使用name索引获取与Dennis匹配的id。将id集合表示为S
使用age索引对步骤1中获取的id进行排序，得到排序后的id列表，表示为L
使用排序后的id列表L获取phone

我猜测在步骤2中，它可能会沿着B+树叶节点进行顺序扫描，检查该叶节点中的id是否在步骤1中获取的id集合S中。如果是，则将其添加到列表L中，然后我们可以得到按age排序的id列表L。

但是，这与简单的顺序扫描有什么区别呢？它们不都是顺序扫描吗？

编辑：

explain显示它使用索引name并执行了filesort

+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+
| id | select_type | table  | partitions | type | possible_keys | key      | key_len | ref   | rows | filtered | Extra          |
+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+
|  1 | SIMPLE      | tableX | NULL       | ref  | idx_name      | idx_name | 123     | const |    1 |   100.00 | Using filesort |
+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+

实际上，我不确定在什么情况下索引可以在order by子句中起作用，所以我想出了一个糟糕的例子来说明我的疑问。

但是，Tim Biegeleisen提供的例子很好。

（如果您感兴趣，以下是更多的表细节：）

mysql> create table tableX(
    -> id int primary key,
    -> name varchar(30),
    -> age int,
    -> phone varchar(30)
    -> );
Query OK, 0 rows affected (0.07 sec)
mysql> create index idx_name on tableX(name);
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> create index idx_age on tableX(age);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> create index idx_phone on tableX(phone);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> show index from tableX;
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table  | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| tableX |          0 | PRIMARY   |            1 | id          | A         |           1 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_name  |            1 | name        | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_age   |            1 | age         | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_phone |            1 | phone       | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
4 rows in set (0.01 sec)
mysql> select * from tableX;
+----+--------+------+-------+
| id | name   | age  | phone |
+----+--------+------+-------+
|  1 | Jack   |   20 | 180   |
|  2 | Dennis |   22 | 180   |
|  3 | Dennis |   18 | 1790  |
+----+--------+------+-------+

0

1 答案

匿名的 · Answer 1 · 2023-09-04T12:14:45+00:00

为什么使用ORDER BY子句可以利用索引？

在这里，应该有助于的索引是一个复合索引：

CREATE INDEX idx ON tableX (name, age, phone)

如果使用上述索引，可能会有以下步骤：

可以在B树中通过Dennis的名字记录进行搜索。
一旦定位到B树的该区域，记录将按照年龄进行排序。
可以执行扫描操作，将Dennis的记录按年龄排序填充结果集。
索引还包括phone列，使其成为查询的覆盖索引；所需的所有列都可以直接从索引中读取。

phone是否在此处只是为了使其只需读取索引而不需要读取表格？

是的，这是正确的。包含phone是为了使其成为覆盖索引。在其他数据库上，我们可能有其他选项。例如，在SQL Server上，我们可以使用INCLUDE仅在叶节点中包含phone值（而不使整个索引变大）。