[Performance Optimization] MySQL Production SQL Performance Tuning: A Practical Case Study

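🏗️ Background

While working on a recent project, we found that some queries behind the workflow-related endpoints in production were abnormally slow, even though the tables held only about 20,000 rows. Analysis showed that the following SQL statement was the slow one: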

SELECT *
FROM ACT_HI_TASKINST t
LEFT JOIN ACT_HI_PROCINST p ON p.PROC_INST_ID_ = t.PROC_INST_ID_
LEFT JOIN ACT_HI_COMMENT c ON c.TASK_ID_ = t.ID_;

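In particular, once LEFT JOIN ACT_HI_COMMENT was added, the query time rose sharply to ⏳ 1 minute. The query needed a closer look and an optimization.

🔍 Execution Plan Analysis

Analyzing the plan with EXPLAIN FORMAT=JSON gave these key findings: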

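  • Table c has 23,754 rows, yet rows_produced_per_join reaches about 419 million (17,679 rows from t × 23,754 rows from c), which is effectively a Cartesian product 💥.
  • data_read_per_join is estimated at 5 TB, which is why the query runs so slowly 🐌.
  • The join method is Block Nested Loop (BNL), which is inefficient here.
  • TASK_ID_ has no suitable index, so c is read with a full table scan 📜.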
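
The 419 million figure can be checked by hand: with no usable index on c, the optimizer estimates that every row reaching the join is paired with every row of c. The sketch below reruns the plan and reproduces the arithmetic. The row counts are copied from the plan in this article and will differ on other data sets.

```sql
-- Produce the JSON plan for the slow query.
EXPLAIN FORMAT=JSON
SELECT *
FROM ACT_HI_TASKINST t
LEFT JOIN ACT_HI_PROCINST p ON p.PROC_INST_ID_ = t.PROC_INST_ID_
LEFT JOIN ACT_HI_COMMENT c ON c.TASK_ID_ = t.ID_;

-- rows_produced_per_join for c = rows arriving from t/p * rows scanned in c.
SELECT 17679 * 23754 AS estimated_row_combinations;  -- 419946966

-- List the indexes that already exist on the comment table
-- to confirm that none of them starts with TASK_ID_.
SHOW INDEX FROM ACT_HI_COMMENT;
```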

Example execution plan:

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "86412925.39"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "t",
          "access_type": "ALL",
          "rows_examined_per_scan": 17679,
          "rows_produced_per_join": 17679,
          "filtered": "100.00",
          "cost_info": {
            "read_cost": "419.00",
            "eval_cost": "3535.80",
            "prefix_cost": "3954.80",
            "data_read_per_join": "567M"
          },
          "used_columns": [
            "ID_",
            "REV_",
            "PROC_DEF_ID_",
            "TASK_DEF_ID_",
            "TASK_DEF_KEY_",
            "PROC_INST_ID_",
            "EXECUTION_ID_",
            "SCOPE_ID_",
            "SUB_SCOPE_ID_",
            "SCOPE_TYPE_",
            "SCOPE_DEFINITION_ID_",
            "NAME_",
            "PARENT_TASK_ID_",
            "DESCRIPTION_",
            "OWNER_",
            "ASSIGNEE_",
            "START_TIME_",
            "CLAIM_TIME_",
            "END_TIME_",
            "DURATION_",
            "DELETE_REASON_",
            "PRIORITY_",
            "DUE_DATE_",
            "FORM_KEY_",
            "CATEGORY_",
            "TENANT_ID_",
            "LAST_UPDATED_TIME_"
          ]
        }
      },
      {
        "table": {
          "table_name": "p",
          "access_type": "eq_ref",
          "possible_keys": [
            "PROC_INST_ID_"
          ],
          "key": "PROC_INST_ID_",
          "used_key_parts": [
            "PROC_INST_ID_"
          ],
          "key_length": "194",
          "ref": [
            "work_order.t.PROC_INST_ID_"
          ],
          "rows_examined_per_scan": 1,
          "rows_produced_per_join": 17679,
          "filtered": "100.00",
          "cost_info": {
            "read_cost": "17679.00",
            "eval_cost": "3535.80",
            "prefix_cost": "25169.60",
            "data_read_per_join": "319M"
          },
          "used_columns": [
            "ID_",
            "REV_",
            "PROC_INST_ID_",
            "BUSINESS_KEY_",
            "PROC_DEF_ID_",
            "START_TIME_",
            "END_TIME_",
            "DURATION_",
            "START_USER_ID_",
            "START_ACT_ID_",
            "END_ACT_ID_",
            "SUPER_PROCESS_INSTANCE_ID_",
            "DELETE_REASON_",
            "TENANT_ID_",
            "NAME_",
            "CALLBACK_ID_",
            "CALLBACK_TYPE_"
          ]
        }
      },
      {
        "table": {
          "table_name": "c",
          "access_type": "ALL",
          "rows_examined_per_scan": 23754,
          "rows_produced_per_join": 419946966,
          "filtered": "100.00",
          "using_join_buffer": "Block Nested Loop",
          "cost_info": {
            "read_cost": "2398362.59",
            "eval_cost": "83989393.20",
            "prefix_cost": "86412925.39",
            "data_read_per_join": "5T"
          },
          "used_columns": [
            "ID_",
            "TYPE_",
            "TIME_",
            "USER_ID_",
            "TASK_ID_",
            "PROC_INST_ID_",
            "ACTION_",
            "MESSAGE_",
            "FULL_MSG_"
          ],
          "attached_condition": "<if>(is_not_null_compl(c), (work_order.c.TASK_ID_ = work_order.t.ID_), true)"
        }
      }
    ]
  }
}

⚡ Optimization Plan

✅ 1. Add an Index

Add an index on c.TASK_ID_ to avoid the full table scan on c:

ALTER TABLE ACT_HI_COMMENT ADD INDEX idx_comment_task (TASK_ID_);
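
On a production table it can be safer to request an online index build explicitly. The following is an alternative form of the statement above, not an additional step. It assumes the table uses InnoDB, and the ALGORITHM and LOCK clauses are an addition that the original fix did not use.

```sql
-- Alternative to the plain ALTER TABLE above (run one or the other, not both).
-- If InnoDB cannot build the index in place without blocking writes,
-- the statement fails with an error instead of silently taking a stronger lock.
ALTER TABLE ACT_HI_COMMENT
  ADD INDEX idx_comment_task (TASK_ID_),
  ALGORITHM=INPLACE, LOCK=NONE;

-- Confirm the index exists.
SHOW INDEX FROM ACT_HI_COMMENT WHERE Key_name = 'idx_comment_task';

-- Refresh index statistics so the optimizer sees up-to-date cardinality.
ANALYZE TABLE ACT_HI_COMMENT;
```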

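🔄 2. Re-check the Execution Plan

With the index in place, rows_produced_per_join for c drops from about 419 million to 24,421, and the access type changes to ref (index lookup), so far fewer rows are scanned 📉.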
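
A rows_examined_per_scan of 1 for c suggests that each task has roughly one comment on average. One quick way to check how selective TASK_ID_ really is, sketched here with illustrative column aliases:

```sql
-- Compare the number of comments that carry a task ID
-- with the number of distinct task IDs.
-- A ratio close to 1 means each index lookup returns about one row.
SELECT COUNT(*)                            AS comments_with_task,
       COUNT(DISTINCT TASK_ID_)            AS distinct_tasks,
       COUNT(*) / COUNT(DISTINCT TASK_ID_) AS avg_comments_per_task
FROM ACT_HI_COMMENT
WHERE TASK_ID_ IS NOT NULL;
```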

Execution plan after the change:

{
    "query_block": {
        "select_id": 1,
        "cost_info": {
            "query_cost": "54475.04"
        },
        "nested_loop": [
            {
                "table": {
                    "table_name": "t",
                    "access_type": "ALL",
                    "rows_examined_per_scan": 17679,
                    "rows_produced_per_join": 17679,
                    "filtered": "100.00",
                    "cost_info": {
                        "read_cost": "419.00",
                        "eval_cost": "3535.80",
                        "prefix_cost": "3954.80",
                        "data_read_per_join": "567M"
                    },
                    "used_columns": [
                        "ID_",
                        "REV_",
                        "PROC_DEF_ID_",
                        "TASK_DEF_ID_",
                        "TASK_DEF_KEY_",
                        "PROC_INST_ID_",
                        "EXECUTION_ID_",
                        "SCOPE_ID_",
                        "SUB_SCOPE_ID_",
                        "SCOPE_TYPE_",
                        "SCOPE_DEFINITION_ID_",
                        "NAME_",
                        "PARENT_TASK_ID_",
                        "DESCRIPTION_",
                        "OWNER_",
                        "ASSIGNEE_",
                        "START_TIME_",
                        "CLAIM_TIME_",
                        "END_TIME_",
                        "DURATION_",
                        "DELETE_REASON_",
                        "PRIORITY_",
                        "DUE_DATE_",
                        "FORM_KEY_",
                        "CATEGORY_",
                        "TENANT_ID_",
                        "LAST_UPDATED_TIME_"
                    ]
                }
            },
            {
                "table": {
                    "table_name": "p",
                    "access_type": "eq_ref",
                    "possible_keys": [
                        "PROC_INST_ID_"
                    ],
                    "key": "PROC_INST_ID_",
                    "used_key_parts": [
                        "PROC_INST_ID_"
                    ],
                    "key_length": "194",
                    "ref": [
                        "work_order.t.PROC_INST_ID_"
                    ],
                    "rows_examined_per_scan": 1,
                    "rows_produced_per_join": 17679,
                    "filtered": "100.00",
                    "cost_info": {
                        "read_cost": "17679.00",
                        "eval_cost": "3535.80",
                        "prefix_cost": "25169.60",
                        "data_read_per_join": "319M"
                    },
                    "used_columns": [
                        "ID_",
                        "REV_",
                        "PROC_INST_ID_",
                        "BUSINESS_KEY_",
                        "PROC_DEF_ID_",
                        "START_TIME_",
                        "END_TIME_",
                        "DURATION_",
                        "START_USER_ID_",
                        "START_ACT_ID_",
                        "END_ACT_ID_",
                        "SUPER_PROCESS_INSTANCE_ID_",
                        "DELETE_REASON_",
                        "TENANT_ID_",
                        "NAME_",
                        "CALLBACK_ID_",
                        "CALLBACK_TYPE_"
                    ]
                }
            },
            {
                "table": {
                    "table_name": "c",
                    "access_type": "ref",
                    "possible_keys": [
                        "idx_comment_task"
                    ],
                    "key": "idx_comment_task",
                    "used_key_parts": [
                        "TASK_ID_"
                    ],
                    "key_length": "195",
                    "ref": [
                        "work_order.t.ID_"
                    ],
                    "rows_examined_per_scan": 1,
                    "rows_produced_per_join": 24421,
                    "filtered": "100.00",
                    "cost_info": {
                        "read_cost": "24421.20",
                        "eval_cost": "4884.24",
                        "prefix_cost": "54475.04",
                        "data_read_per_join": "347M"
                    },
                    "used_columns": [
                        "ID_",
                        "TYPE_",
                        "TIME_",
                        "USER_ID_",
                        "TASK_ID_",
                        "PROC_INST_ID_",
                        "ACTION_",
                        "MESSAGE_",
                        "FULL_MSG_"
                    ]
                }
            }
        ]
    }
}

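After the change, the query time fell from ⏳ 1 minute to the millisecond range 🚀, a substantial improvement.

🔬 Local Test on MySQL 8

When we tested locally on MySQL 8, the original SQL showed no obvious performance problem. Possible reasons include: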

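  • Optimizer improvements: MySQL 8 changed how such joins are executed and reduced the use of BNL.
  • Smarter default index strategy: MySQL 8 may choose indexes more intelligently and avoid unnecessary full table scans.
  • Smaller test data set: the local environment held less data, so the production slow query could not be reproduced.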

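Even though the query ran fine on local MySQL 8, we still recommend running EXPLAIN in production to confirm that the optimization is effective.

MySQL 8.0 added hash join (in 8.0.18) and, from 8.0.20, uses it in place of Block Nested Loop for joins that cannot use an index; indexed lookups still run as nested-loop joins. The "using_join_buffer": "hash join" entry in the execution plan confirms this. Hash join is the main reason the problem did not reproduce locally; parallel query and buffer pool improvements may also have played a part. In the end we reproduced the problem on a database whose version and configuration matched production exactly.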
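
To see which join strategy a given server will pick, it helps to compare the version and the relevant optimizer settings on the local and the production instance. A sketch that uses only standard MySQL statements; the last statement needs a MySQL 8.0 release that supports the tree format:

```sql
-- Server version: hash join exists from 8.0.18; from 8.0.20 it is used
-- wherever Block Nested Loop would have been used.
SELECT VERSION();

-- Optimizer flags; block_nested_loop shown here controls whether BNL may be used.
SELECT @@optimizer_switch;

-- Size of the buffer used by BNL and by hash join.
SHOW VARIABLES LIKE 'join_buffer_size';

-- The tree format names the join algorithm directly.
EXPLAIN FORMAT=TREE
SELECT *
FROM ACT_HI_TASKINST t
LEFT JOIN ACT_HI_PROCINST p ON p.PROC_INST_ID_ = t.PROC_INST_ID_
LEFT JOIN ACT_HI_COMMENT c ON c.TASK_ID_ = t.ID_;
```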

Execution plan from the local MySQL 8 test:

{
    "query_block": {
        "select_id": 1,
        "cost_info": {
            "query_cost": "36941768.34"
        },
        "nested_loop": [
            {
                "table": {
                    "table_name": "t",
                    "access_type": "ALL",
                    "rows_examined_per_scan": 16978,
                    "rows_produced_per_join": 16978,
                    "filtered": "100.00",
                    "cost_info": {
                        "read_cost": "104.75",
                        "eval_cost": "1697.80",
                        "prefix_cost": "1802.55",
                        "data_read_per_join": "544M"
                    },
                    "used_columns": [
                        "ID_",
                        "REV_",
                        "PROC_DEF_ID_",
                        "TASK_DEF_ID_",
                        "TASK_DEF_KEY_",
                        "PROC_INST_ID_",
                        "EXECUTION_ID_",
                        "SCOPE_ID_",
                        "SUB_SCOPE_ID_",
                        "SCOPE_TYPE_",
                        "SCOPE_DEFINITION_ID_",
                        "NAME_",
                        "PARENT_TASK_ID_",
                        "DESCRIPTION_",
                        "OWNER_",
                        "ASSIGNEE_",
                        "START_TIME_",
                        "CLAIM_TIME_",
                        "END_TIME_",
                        "DURATION_",
                        "DELETE_REASON_",
                        "PRIORITY_",
                        "DUE_DATE_",
                        "FORM_KEY_",
                        "CATEGORY_",
                        "TENANT_ID_",
                        "LAST_UPDATED_TIME_"
                    ]
                }
            },
            {
                "table": {
                    "table_name": "p",
                    "access_type": "eq_ref",
                    "possible_keys": [
                        "PROC_INST_ID_"
                    ],
                    "key": "PROC_INST_ID_",
                    "used_key_parts": [
                        "PROC_INST_ID_"
                    ],
                    "key_length": "194",
                    "ref": [
                        "work_order.t.PROC_INST_ID_"
                    ],
                    "rows_examined_per_scan": 1,
                    "rows_produced_per_join": 16978,
                    "filtered": "100.00",
                    "cost_info": {
                        "read_cost": "4244.50",
                        "eval_cost": "1697.80",
                        "prefix_cost": "7744.85",
                        "data_read_per_join": "306M"
                    },
                    "used_columns": [
                        "ID_",
                        "REV_",
                        "PROC_INST_ID_",
                        "BUSINESS_KEY_",
                        "PROC_DEF_ID_",
                        "START_TIME_",
                        "END_TIME_",
                        "DURATION_",
                        "START_USER_ID_",
                        "START_ACT_ID_",
                        "END_ACT_ID_",
                        "SUPER_PROCESS_INSTANCE_ID_",
                        "DELETE_REASON_",
                        "TENANT_ID_",
                        "NAME_",
                        "CALLBACK_ID_",
                        "CALLBACK_TYPE_"
                    ]
                }
            },
            {
                "table": {
                    "table_name": "c",
                    "access_type": "ALL",
                    "rows_examined_per_scan": 21447,
                    "rows_produced_per_join": 364127166,
                    "filtered": "100.00",
                    "using_join_buffer": "hash join",
                    "cost_info": {
                        "read_cost": "521306.89",
                        "eval_cost": "36412716.60",
                        "prefix_cost": "36941768.34",
                        "data_read_per_join": "4T"
                    },
                    "used_columns": [
                        "ID_",
                        "TYPE_",
                        "TIME_",
                        "USER_ID_",
                        "TASK_ID_",
                        "PROC_INST_ID_",
                        "ACTION_",
                        "MESSAGE_",
                        "FULL_MSG_"
                    ],
                    "attached_condition": "<if>(is_not_null_compl(c), (`work_order`.`c`.`TASK_ID_` = `work_order`.`t`.`ID_`), true)"
                }
            }
        ]
    }
}

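🎯 Conclusion

  • Root cause: there was no suitable index on the join column, so MySQL fell back to a BNL join and performed a huge number of row comparisons.
  • Fix: adding an index on TASK_ID_ changed the access type of c from ALL to ref and reduced the rows scanned.
  • Result: query time dropped from ⏳ 1 minute to the millisecond range 🎉.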

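📌 Recommendations

  • 📊 Review the slow query log regularly and optimize SQL statements promptly.
  • 🛠️ Design indexes sensibly and avoid full table scans.
  • 🧐 Use EXPLAIN to analyze execution plans and make sure queries can use an index path.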
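
For the first recommendation, the slow query log can be switched on at runtime. The thresholds below are illustrative assumptions, not values from this case. Changing global variables requires the appropriate privilege, and a new global long_query_time applies only to connections opened afterwards.

```sql
-- Enable the slow query log without restarting the server.
SET GLOBAL slow_query_log = 'ON';

-- Log statements that run longer than 1 second (example threshold).
SET GLOBAL long_query_time = 1;

-- Optionally also log statements that do not use an index.
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Find out where the log file is written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```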

We hope the optimization process described here helps with your own MySQL performance tuning in production! 🎯💡

