MySQL高级-索引-使用规则-前缀索引

文章目录

1、前缀索引

当字段类型为字符串(varchar,text等)时,有时候需要索引很长的字符串,这会让索引变得很大,查询时,浪费大量的磁盘IO,影响查询效率。此时可以只将字符串的一部分前缀,建立索引,这样可以大大节约索引空间,从而提高索引效率。

sql 复制代码
create index idx_xxxx on table_name(column(n))

2、前缀长度

可以根据索引的选择性来决定,而选择性是指不重复的索引值(基数)和数据表的记录总数的比值,索引选择性越高则查询效率越高,唯一索引的选择性是1,这是最好的索引选择性,性能页是最好的。

sql 复制代码
select count(distinct email)/count(*) from tb_user;
select count(distinct substring(email,1,5))/count(*) from tb_user;

3、查询表数据

cpp 复制代码
mysql> select * from tb_user;
+----+--------+-------------+-----------------------+----------------------+------+--------+--------+---------------------+
| id | name   | phone       | email                 | profession           | age  | gender | status | createtime          |
+----+--------+-------------+-----------------------+----------------------+------+--------+--------+---------------------+
|  1 | 吕布   | 17799990000 | lvbu666@163.com       | 软件工程             |   23 | 1      | 6      | 2001-02-02 00:00:00 |
|  2 | 曹操   | 17799990001 | caocao666@qq.com      | 通讯工程             |   33 | 1      | 0      | 2001-03-05 00:00:00 |
|  3 | 赵云   | 17799990002 | 17799990@139.com      | 英语                 |   34 | 1      | 2      | 2002-03-02 00:00:00 |
|  4 | 孙悟空 | 17799990003 | 17799990@sina.com     | 工程造价             |   54 | 1      | 0      | 2001-07-02 00:00:00 |
|  5 | 花木兰 | 17799990004 | 19980729@sina.com     | 软件工程             |   23 | 2      | 1      | 2001-04-22 00:00:00 |
|  6 | 大乔   | 17799990005 | daqiao666@sina.com    | 舞蹈                 |   22 | 2      | 0      | 2001-02-07 00:00:00 |
|  7 | 露娜   | 17799990006 | luna_love@sina.com    | 应用数学             |   24 | 2      | 0      | 2001-02-08 00:00:00 |
|  8 | 程咬金 | 17799990007 | chengyaojin@163.com   | 化工                 |   38 | 1      | 5      | 2001-05-23 00:00:00 |
|  9 | 项羽   | 17799990008 | xiaoyu666@qq.com      | 金属材料             |   43 | 1      | 0      | 2001-09-18 00:00:00 |
| 10 | 白起   | 17799990009 | baiqi666@sina.com     | 机械工程及其自动
化 |   27 | 1      | 2      | 2001-08-16 00:00:00 |
| 11 | 韩信   | 17799990010 | hanxin520@163.com     | 无机非金属材料工
程 |   27 | 1      | 0      | 2001-06-12 00:00:00 |
| 12 | 荆轲   | 17799990011 | jingke123@163.com     | 会计                 |   29 | 1      | 0      | 2001-05-11 00:00:00 |
| 13 | 兰陵王 | 17799990012 | lanlinwang666@126.com | 工程造价             |   44 | 1      | 1      | 2001-04-09 00:00:00 |
| 14 | 狂铁   | 17799990013 | kuangtie@sina.com     | 应用数学             |   43 | 1      | 2      | 2001-04-10 00:00:00 |
| 15 | 貂蝉   | 17799990014 | 84958948374@qq.com    | 软件工程             |   40 | 2      | 3      | 2001-02-12 00:00:00 |
| 16 | 妲己   | 17799990015 | 2783238293@qq.com     | 软件工程             |   31 | 2      | 0      | 2001-01-30 00:00:00 |
| 17 | 芈月   | 17799990016 | xiaomin2001@sina.com  | 工业经济             |   35 | 2      | 0      | 2000-05-03 00:00:00 |
| 18 | 嬴政   | 17799990017 | 8839434342@qq.com     | 化工                 |   38 | 1      | 1      | 2001-08-08 00:00:00 |
| 19 | 狄仁杰 | 17799990018 | jujiamlm8166@163.com  | 国际贸易             |   30 | 1      | 0      | 2007-03-12 00:00:00 |
| 20 | 安琪拉 | 17799990019 | jdodm1h@126.com       | 城市规划             |   51 | 2      | 0      | 2001-08-15 00:00:00 |
| 21 | 典韦   | 17799990020 | ycaunanjian@163.com   | 城市规划             |   52 | 1      | 2      | 2000-04-12 00:00:00 |
| 22 | 廉颇   | 17799990021 | lianpo321@126.com     | 土木工程             |   19 | 1      | 3      | 2002-07-18 00:00:00 |
| 23 | 后羿   | 17799990022 | altycj2000@139.com    | 城市园林             |   20 | 1      | 0      | 2002-03-10 00:00:00 |
| 24 | 姜子牙 | 17799990023 | 37483844@qq.com       | 工程造价             |   29 | 1      | 4      | 2003-05-26 00:00:00 |
+----+--------+-------------+-----------------------+----------------------+------+--------+--------+---------------------+
24 rows in set (0.00 sec)

mysql>

4、查询表的记录总数

cpp 复制代码
mysql> select count(*) from tb_user;
+----------+
| count(*) |
+----------+
|       24 |
+----------+
1 row in set (0.00 sec)

mysql>

5、计算并返回具有电子邮件地址(email)的用户的数量

cpp 复制代码
mysql> select count(email) from tb_user;
+--------------+
| count(email) |
+--------------+
|           24 |
+--------------+
1 row in set (0.00 sec)

mysql>

6、从tb_user表中计算并返回具有不同电子邮件地址的用户的数量

cpp 复制代码
mysql> select count(distinct email) from tb_user;
+-----------------------+
| count(distinct email) |
+-----------------------+
|                    24 |
+-----------------------+
1 row in set (0.00 sec)

mysql>

7、计算唯一电子邮件地址(email)的比例相对于表中的总行数

cpp 复制代码
mysql> select count(distinct email)/count(*) from tb_user;
+--------------------------------+
| count(distinct email)/count(*) |
+--------------------------------+
|                         1.0000 |
+--------------------------------+
1 row in set (0.00 sec)

mysql>
  • 其中1表示所有用户都有唯一的电子邮件地址,而0表示没有用户有电子邮件地址(尽管这在现实中不太可能)
  • 用来衡量 email 字段的去重比例,即表示不重复的 email 占总记录数的比例。
  • 用来评估数据中电子邮件地址的唯一性程度,或者说检测是否存在大量的重复邮箱账户。如果结果接近1,说明几乎每个行都有一个唯一的电子邮件地址;如果远小于1,则表示有较多的电子邮件地址重复。

8、从每个电子邮件地址中提取前10个字符,并计算这些前10个字符唯一值的数量与总用户数量的比率。

cpp 复制代码
mysql> select count(distinct substring(email,1,10))/count(*) from tb_user;
+------------------------------------------------+
| count(distinct substring(email,1,10))/count(*) |
+------------------------------------------------+
|                                         1.0000 |
+------------------------------------------------+
1 row in set (0.00 sec)

mysql>

9、电子邮件地址的前9个字符的唯一值的数量与总用户数量的比率

cpp 复制代码
mysql> select count(distinct substring(email,1,9))/count(*) from tb_user;
+-----------------------------------------------+
| count(distinct substring(email,1,9))/count(*) |
+-----------------------------------------------+
|                                        0.9583 |
+-----------------------------------------------+
1 row in set (0.00 sec)

mysql>

10、电子邮件地址的前8个字符与前9个字符在唯一性方面的表现是相似的

cpp 复制代码
mysql> select count(distinct substring(email,1,8))/count(*) from tb_user;
+-----------------------------------------------+
| count(distinct substring(email,1,8))/count(*) |
+-----------------------------------------------+
|                                        0.9583 |
+-----------------------------------------------+
1 row in set (0.00 sec)

mysql>

11、前 6 个字符的不重复数量占总行数的比例

cpp 复制代码
mysql> select count(distinct substring(email,1,6))/count(*) from tb_user;
+-----------------------------------------------+
| count(distinct substring(email,1,6))/count(*) |
+-----------------------------------------------+
|                                        0.9583 |
+-----------------------------------------------+
1 row in set (0.00 sec)

mysql>

12、前 5 个字符的不重复数量占总行数的比例

cpp 复制代码
mysql> select count(distinct substring(email,1,5))/count(*) from tb_user;
+-----------------------------------------------+
| count(distinct substring(email,1,5))/count(*) |
+-----------------------------------------------+
|                                        0.9583 |
+-----------------------------------------------+
1 row in set (0.00 sec)

mysql>

13、随着截取长度的减少,电子邮件地址前缀的唯一性也在减少

cpp 复制代码
mysql> select count(distinct substring(email,1,4))/count(*) from tb_user;
+-----------------------------------------------+
| count(distinct substring(email,1,4))/count(*) |
+-----------------------------------------------+
|                                        0.9167 |
+-----------------------------------------------+
1 row in set (0.00 sec)

mysql>

14、查看MySQL中tb_user表的索引

cpp 复制代码
mysql> show index from tb_user;
+---------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table   | Non_unique | Key_name             | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+---------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| tb_user |          0 | PRIMARY              |            1 | id          | A         |          24 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| tb_user |          1 | idx_user_pro_age_sta |            1 | profession  | A         |          16 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tb_user |          1 | idx_user_pro_age_sta |            2 | age         | A         |          22 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tb_user |          1 | idx_user_pro_age_sta |            3 | status      | A         |          24 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+---------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
4 rows in set (0.01 sec)

mysql>

15、在tb_user表的email列上创建一个前缀索引,其中只包括email列的前5个字符

cpp 复制代码
mysql> create index idx_email_5 on tb_user(email(5));
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show index from tb_user;
+---------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table   | Non_unique | Key_name             | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+---------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| tb_user |          0 | PRIMARY              |            1 | id          | A         |          24 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| tb_user |          1 | idx_user_pro_age_sta |            1 | profession  | A         |          16 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tb_user |          1 | idx_user_pro_age_sta |            2 | age         | A         |          22 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tb_user |          1 | idx_user_pro_age_sta |            3 | status      | A         |          24 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tb_user |          1 | idx_email_5          |            1 | email       | A         |          23 |        5 |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+---------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
5 rows in set (0.01 sec)

mysql>

16、查询 email='daqiao666@sina.com' 的用户

cpp 复制代码
mysql> select * from tb_user where email='daqiao666@sina.com';
+----+------+-------------+--------------------+------------+------+--------+--------+---------------------+
| id | name | phone       | email              | profession | age  | gender | status | createtime          |
+----+------+-------------+--------------------+------------+------+--------+--------+---------------------+
|  6 | 大乔 | 17799990005 | daqiao666@sina.com | 舞蹈       |   22 | 2      | 0      | 2001-02-07 00:00:00 |
+----+------+-------------+--------------------+------------+------+--------+--------+---------------------+
1 row in set (0.00 sec)

mysql>

17、执行计划 email='daqiao666@sina.com'

cpp 复制代码
mysql> explain select * from tb_user where email='daqiao666@sina.com';
+----+-------------+---------+------------+------+---------------+-------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key         | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+-------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | tb_user | NULL       | ref  | idx_email_5   | idx_email_5 | 23      | const |    1 |   100.00 | Using where |
+----+-------------+---------+------------+------+---------------+-------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql>
相关推荐
Dreamboat¿10 分钟前
SQL 注入漏洞
数据库·sql
曹牧1 小时前
Oracle数据库中,将JSON字符串转换为多行数据
数据库·oracle·json
被摘下的星星1 小时前
MySQL count()函数的用法
数据库·mysql
末央&2 小时前
【天机论坛】项目环境搭建和数据库设计
java·数据库
徒 花2 小时前
数据库知识复习07
数据库·作业
素玥2 小时前
实训5 python连接mysql数据库
数据库·python·mysql
jnrjian2 小时前
text index 查看index column index定义 index 刷新频率 index视图
数据库·oracle
瀚高PG实验室2 小时前
审计策略修改
网络·数据库·瀚高数据库
言慢行善3 小时前
sqlserver模糊查询问题
java·数据库·sqlserver
韶博雅3 小时前
emcc24ai
开发语言·数据库·python