Troubleshooting a MySQL Deadlock Problem

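1. Symptoms

  1. The application's business logic: it accepts submitted task requests and then executes a Job
  2. Tasks got stuck and stopped executing
  3. The application's error log showed the following error: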
scheduleThread error:
com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction

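2. Investigation

2.1 Tried increasing the database lock timeouts (no effect)

-- InnoDB row-lock wait timeout (default: 50 seconds)
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';
-- Metadata-lock wait timeout (default: 31536000 seconds, i.e. one year)
SHOW VARIABLES LIKE 'lock_wait_timeout';
  • Temporary adjustment (lost after a restart; SET GLOBAL only affects connections opened afterwards)
SET GLOBAL innodb_lock_wait_timeout = 120;
SET GLOBAL lock_wait_timeout = 120;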
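
A longer timeout only makes the waiters wait longer. It does not release a lock held by a transaction that never commits, which is presumably why the change had no effect here. A minimal sketch for checking which value a session actually uses, assuming a client with sufficient privileges (an already-open session keeps the value it had when it connected):

```sql
-- Compare the global default with the value in effect for this session
SELECT @@GLOBAL.innodb_lock_wait_timeout  AS global_value,
       @@SESSION.innodb_lock_wait_timeout AS session_value;

-- Change the timeout for the current session only (no reconnect needed)
SET SESSION innodb_lock_wait_timeout = 120;
```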

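2.2 Checked lock waits (2 waits found)

-- Show the lock requests that are waiting and the locks that block them
-- These two tables exist only up to MySQL 5.7; MySQL 8.0+ uses different tables, see the "Supplement" section of this article
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;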
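
On MySQL 8.0 and later the two information_schema tables above no longer exist; their replacements live in performance_schema. A sketch of the equivalent lookup, assuming performance_schema is enabled. Note that THREAD_ID in these tables is the performance_schema thread id, not the connection id shown by SHOW PROCESSLIST:

```sql
-- Locks held or requested by InnoDB transactions (replaces INNODB_LOCKS)
SELECT ENGINE_TRANSACTION_ID, THREAD_ID, OBJECT_SCHEMA, OBJECT_NAME,
       INDEX_NAME, LOCK_TYPE, LOCK_MODE, LOCK_STATUS, LOCK_DATA
FROM performance_schema.data_locks;

-- Which lock request is blocked by which held lock (replaces INNODB_LOCK_WAITS)
SELECT REQUESTING_ENGINE_TRANSACTION_ID, REQUESTING_THREAD_ID,
       BLOCKING_ENGINE_TRANSACTION_ID, BLOCKING_THREAD_ID
FROM performance_schema.data_lock_waits;
```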

2.3 Listed all current threads (many threads, each idle for more than 7000 seconds)

-- List all connection threads
SHOW PROCESSLIST;
-- Or, for more detail
SELECT * FROM information_schema.PROCESSLIST;

process_id  USER  HOST               DB              COMMAND  TIME  STATE  INFO
1744        root  10.10.20.45:62890  rouyi-vue-plus  Sleep    7325         NULL
1757        root  10.10.20.45:62937  rouyi-vue-plus  Sleep    7326         NULL
1778        root  10.10.20.45:63581  rouyi-vue-plus  Sleep    7344         NULL
1739        root  10.10.20.45:62870  rouyi-vue-plus  Sleep    7422         NULL
1759        root  10.10.20.45:62951  rouyi-vue-plus  Sleep    7344         NULL
1756        root  10.10.20.45:62929  rouyi-vue-plus  Sleep    7344         NULL
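
With this many rows, a grouped view can make it easier to see where the idle connections come from. A small sketch, assuming HOST has the usual ip:port form shown above:

```sql
-- Count connections per client address and command, with the longest TIME in each group
SELECT SUBSTRING_INDEX(HOST, ':', 1) AS client_ip,
       COMMAND,
       COUNT(*)  AS connections,
       MAX(TIME) AS max_time_seconds
FROM information_schema.PROCESSLIST
GROUP BY client_ip, COMMAND
ORDER BY connections DESC;
```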

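2.4 Looked for threads related to the lock wait (returned a pile of threads)

-- Find threads in a lock-related state, plus connections idle for more than 60 seconds
SELECT
    p.ID AS process_id,
    p.USER,
    p.HOST,
    p.DB,
    p.COMMAND,
    p.TIME,
    p.STATE,
    p.INFO
FROM information_schema.PROCESSLIST p
WHERE p.STATE LIKE '%lock%'
   OR (p.COMMAND = 'Sleep' AND p.TIME > 60);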
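
A Sleep connection is harmless unless it still has a transaction open, so the list can be narrowed to idle connections that own an InnoDB transaction. A sketch under the assumption that the account can read information_schema.innodb_trx (this needs the PROCESS privilege):

```sql
-- Idle connections that still own an open InnoDB transaction, oldest first
SELECT p.ID   AS process_id,
       p.USER,
       p.HOST,
       p.TIME AS idle_seconds,
       t.trx_id,
       t.trx_started
FROM information_schema.innodb_trx t
JOIN information_schema.PROCESSLIST p ON p.ID = t.trx_mysql_thread_id
WHERE p.COMMAND = 'Sleep'
ORDER BY t.trx_started;
```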

2.5 Pinpointed the blocking thread (one blocker: thread 1744)

-- Show who is waiting on whom (MySQL 5.7 syntax)
SELECT
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_id waiting_thread_id,
    r.trx_query waiting_query,
    b.trx_id blocking_trx_id,
    b.trx_mysql_thread_id blocking_thread_id,
    b.trx_query blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;

waiting_trx_id:      487168419
waiting_thread_id:   1998
waiting_query:       select * from xxl_job_lock where lock_name = 'schedule_lock' for update
blocking_trx_id:     486996078
blocking_thread_id:  1744
blocking_query:      NULL

2.6 Checked the details of the blocking thread

-- Show the details of blocking thread 1744
SELECT
    ID AS process_id,
    USER,
    HOST,
    DB,
    COMMAND,
    TIME AS time_seconds,
    STATE,
    INFO
FROM information_schema.PROCESSLIST
WHERE ID = 1744;
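
For an idle blocker, PROCESSLIST shows little more than Sleep and an empty INFO. The transaction view says more about what the connection is still holding. A sketch, reusing thread id 1744 from this incident:

```sql
-- What the blocking connection's open transaction looks like
SELECT trx_id,
       trx_state,
       trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS trx_age_seconds,
       trx_tables_locked,
       trx_rows_locked,
       trx_rows_modified,
       trx_isolation_level
FROM information_schema.innodb_trx
WHERE trx_mysql_thread_id = 1744;
```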

2.7 Checked the blocking thread's statement history

-- Show the statement history recorded for that thread
SELECT * FROM performance_schema.events_statements_history
WHERE THREAD_ID IN (SELECT THREAD_ID FROM performance_schema.threads WHERE PROCESSLIST_ID = 1744)
ORDER BY EVENT_ID DESC;
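
SELECT * returns many columns; the ones that matter for a timeline are the statement text, the rows examined and the duration. A narrower sketch, assuming the statement-history consumer is enabled. The history table keeps only a limited number of statements per thread (10 by default, to the best of my knowledge), so older statements may already be gone:

```sql
-- Recent statements of connection 1744, newest first
SELECT h.EVENT_ID,
       h.SQL_TEXT,
       h.ROWS_EXAMINED,
       ROUND(h.TIMER_WAIT / 1e12, 1) AS duration_seconds  -- TIMER_WAIT is in picoseconds
FROM performance_schema.events_statements_history h
JOIN performance_schema.threads t ON t.THREAD_ID = h.THREAD_ID
WHERE t.PROCESSLIST_ID = 1744
ORDER BY h.EVENT_ID DESC;

-- How many statements are kept per thread
SHOW VARIABLES LIKE 'performance_schema_events_statements_history_size';
```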
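  • Statement history of thread 1744 (summary)
Root cause: event 54 switched the session to manual commit, and event 55 took a row lock (select * from xxl_job_lock where lock_name = 'schedule_lock' for update) that was never committed.
Timeline (newest to oldest):
1. EVENT_ID 55: acquired the schedule_lock lock (still held; this is what blocks the other threads)
2. EVENT_ID 54: set autocommit=0 (starts a transaction)
3. EVENT_ID 53: ran a complex query (scanned 394,197 rows, took 644 seconds!)
4. EVENT_ID 52: ran another complex query (scanned 53,388 rows, took 243 seconds)
5. EVENT_ID 51: set autocommit=1 (commits the previous transaction)
6. EVENT_ID 50: commit
7. EVENT_ID 49: acquired the schedule_lock lock earlier
8. EVENT_ID 48: set autocommit=0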
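
The pattern in events 54 and 55 can be reproduced with two client sessions. A sketch meant for a test instance only, reusing the table and lock name from the log above:

```sql
-- Session A (plays the role of the developer's terminal)
SET autocommit = 0;
SELECT * FROM xxl_job_lock WHERE lock_name = 'schedule_lock' FOR UPDATE;
-- No COMMIT or ROLLBACK follows: the row lock stays held while the
-- connection sits in Sleep

-- Session B (plays the role of the scheduler thread)
SELECT * FROM xxl_job_lock WHERE lock_name = 'schedule_lock' FOR UPDATE;
-- Blocks, then fails after innodb_lock_wait_timeout seconds with
-- ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

-- Session A: COMMIT (or ROLLBACK) releases the lock
COMMIT;
```

Strictly speaking this is a lock wait timeout (error 1205), not a deadlock (error 1213): session A is not waiting for anything, so InnoDB has no cycle to detect and session B simply waits until it times out.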

3. Resolution

  1. Kill process 1744 (it did not come from normal service calls; it was a terminal session opened by a developer)
-- Kill the blocking process (1744)
KILL 1744;
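
KILL has two forms, and the difference matters for an idle blocker like this one. The sketch below only illustrates the choice; plain KILL as used above is already the connection form:

```sql
-- KILL QUERY only interrupts the statement that is currently running.
-- An idle connection has none, so its open transaction and locks would remain.
-- KILL QUERY 1744;

-- KILL CONNECTION (the same as plain KILL) closes the connection; the server
-- rolls back its open transaction, which releases the row lock
KILL CONNECTION 1744;
```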
  2. After process 1744 was killed, the blocking cleared
-- Verify that the blocking is gone (the query should return no rows)
SELECT
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_id waiting_thread,
    b.trx_mysql_thread_id blocking_thread
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;
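
The join above depends on information_schema.innodb_lock_waits, which exists only up to MySQL 5.7. A version-independent check is to look for transactions that are still waiting; this sketch should also return no rows once the blocker is gone:

```sql
-- Transactions currently waiting for a lock
SELECT trx_id, trx_state, trx_mysql_thread_id, trx_wait_started, trx_query
FROM information_schema.innodb_trx
WHERE trx_state = 'LOCK WAIT';
```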

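4. Supplement

4.1 Find long-running queries

-- Find threads that have been in their current state for more than 60 seconds (idle Sleep connections included)
SELECT
    'Long-running queries' AS title;
SELECT
    ID AS process_id,
    USER,
    HOST,
    DB,
    COMMAND,
    TIME,
    STATE,
    LEFT(INFO, 100) AS query_snippet
FROM information_schema.PROCESSLIST
WHERE TIME > 60
ORDER BY TIME DESC;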
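
To see only statements that are actually executing, idle and background threads can be filtered out. A sketch of that variant; the list of excluded COMMAND values is an assumption and may need adjusting for a given server:

```sql
-- Statements that have been executing for more than 60 seconds
SELECT ID AS process_id, USER, HOST, DB, COMMAND, TIME, STATE,
       LEFT(INFO, 100) AS query_snippet
FROM information_schema.PROCESSLIST
WHERE TIME > 60
  AND COMMAND NOT IN ('Sleep', 'Daemon', 'Binlog Dump')
  AND INFO IS NOT NULL
ORDER BY TIME DESC;
```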

4.2 List all threads

-- Part of a combined diagnostic script
SELECT
    'Current process state' AS title;
SHOW PROCESSLIST;
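
Recent MySQL 8.0 releases (8.0.22 or later, to the best of my knowledge) also expose the process list as a performance_schema table, which can be filtered and sorted like any other table. A sketch, assuming such a version:

```sql
-- Same information as SHOW PROCESSLIST, longest-running first
SELECT ID, USER, HOST, DB, COMMAND, TIME, STATE,
       LEFT(INFO, 100) AS query_snippet
FROM performance_schema.processlist
ORDER BY TIME DESC;
```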

4.3 List all lock waits

-- Query via the sys schema
SELECT * FROM sys.innodb_lock_waits;

-- Detailed view
-- MySQL 5.7 query
SELECT
    'Lock waits' AS title;
SELECT
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_id waiting_thread,
    r.trx_query waiting_query,
    b.trx_id blocking_trx_id,
    b.trx_mysql_thread_id blocking_thread,
    b.trx_query blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;

-- MySQL 8.4 query (performance_schema.data_lock_waits has existed since 8.0)
SELECT
    'Lock waits' AS title;
SELECT
  r.trx_id waiting_trx_id,
  r.trx_mysql_thread_id waiting_thread,
  r.trx_query waiting_query,
  b.trx_id blocking_trx_id,
  b.trx_mysql_thread_id blocking_thread,
  b.trx_query blocking_query
FROM performance_schema.data_lock_waits w
INNER JOIN information_schema.innodb_trx b
  ON b.trx_id = w.blocking_engine_transaction_id
INNER JOIN information_schema.innodb_trx r
  ON r.trx_id = w.requesting_engine_transaction_id;
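
The sys view works on both 5.7 and 8.x and already resolves waiter and blocker to connection ids. A sketch that selects only the columns that seem most useful during an incident:

```sql
-- One row per waiting lock request, with the matching KILL statement for its blocker
SELECT wait_age,
       locked_table,
       waiting_pid,
       waiting_query,
       blocking_pid,
       blocking_query,
       blocking_trx_age,
       sql_kill_blocking_connection
FROM sys.innodb_lock_waits
ORDER BY wait_age DESC;
```

Here blocking_pid is the connection id to pass to KILL, and sql_kill_blocking_connection contains that statement ready to copy.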

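4.4 View the most recent deadlock information

SHOW ENGINE INNODB STATUS;

Pay attention to the following

  1. TRANSACTIONS (transaction section)
  • LOCK WAIT: a transaction is waiting for a lock
  • lock_mode X: the wait is for an exclusive lock
  • waiting: the lock request has not been granted yet
  • TRX HAS BEEN WAITING 5 SEC: how long it has waited so far
  2. LATEST DETECTED DEADLOCK (the most recent deadlock)

  3. Lock types

    Lock types commonly seen in the output

  • lock_mode X: exclusive lock (write lock)
  • lock_mode S: shared lock (read lock)
  • locks rec but not gap: record lock
  • locks gap before rec: gap lock
  • lock_mode X or lock_mode S with neither of the two qualifiers above: next-key lock (gap lock + record lock)
  • waiting: the lock is being waited for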
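
Two server settings control how much lock detail shows up in these places. A sketch, assuming an account allowed to change global variables; both settings are dynamic and can be switched off again afterwards:

```sql
-- Include per-transaction lock details in the TRANSACTIONS section
SET GLOBAL innodb_status_output_locks = ON;

-- Write every detected deadlock to the MySQL error log, not just the latest one
SET GLOBAL innodb_print_all_deadlocks = ON;

SHOW ENGINE INNODB STATUS;
```

The LATEST DETECTED DEADLOCK section only appears after InnoDB has actually detected a deadlock. A plain lock wait timeout like the one in this article is not recorded there, so the TRANSACTIONS section is the place to look.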