Troubleshooting a MySQL Deadlock (Lock Wait Timeout) Problem

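1. Symptoms

  1. The application implements the following business logic: it accepts submitted task requests and then runs a Job for each one.
  2. Tasks got stuck and stopped executing.
  3. The application's error log showed the following error:
scheduleThread error:
com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction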

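2. Investigation

2.1 Increasing the database lock timeout (no effect)

-- Row-lock wait timeout for InnoDB transactions (default: 50 seconds)
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';
-- Metadata-lock wait timeout (default: 31536000 seconds, i.e. one year)
SHOW VARIABLES LIKE 'lock_wait_timeout';
  • Temporary adjustment (lost after a server restart)
SET GLOBAL innodb_lock_wait_timeout = 120;
SET GLOBAL lock_wait_timeout = 120;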
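
One likely reason a larger timeout changes nothing here: `SET GLOBAL` only applies to connections opened after the change, so pooled connections keep their old session value, and waiting longer does not help anyway while the blocker never commits. A minimal sketch for checking both scopes from the affected connection, using only standard MySQL system-variable syntax:

```sql
-- Compare the server-wide default with the value this session is actually using
SELECT @@GLOBAL.innodb_lock_wait_timeout  AS global_value,
       @@SESSION.innodb_lock_wait_timeout AS session_value;

-- Apply the new value to the current session only
SET SESSION innodb_lock_wait_timeout = 120;
```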

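2.2 Querying lock waits (2 waits were found)

-- List current InnoDB locks and lock waits (MySQL 5.7 and earlier)
-- Both tables were removed in MySQL 8.0; see the "Supplement" section of this article for the 8.x syntax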
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;
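
On MySQL 8.0 and later the two tables above no longer exist; the lock data lives in `performance_schema` instead. A rough equivalent, assuming the Performance Schema is enabled (it is by default):

```sql
-- Locks currently held or requested (replaces information_schema.INNODB_LOCKS)
SELECT engine_transaction_id,
       thread_id,
       object_schema,
       object_name,
       index_name,
       lock_type,
       lock_mode,
       lock_status,   -- GRANTED or WAITING
       lock_data
FROM performance_schema.data_locks;

-- Who is waiting on whom (replaces information_schema.INNODB_LOCK_WAITS)
SELECT * FROM performance_schema.data_lock_waits;
```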

2.3 Listing all current threads (many threads, idle for more than 7000 seconds)

-- List all connection threads
SHOW PROCESSLIST;
-- Or query the same data as a table, which can be filtered and sorted
SELECT * FROM information_schema.PROCESSLIST;
process_id  USER  HOST               DB              COMMAND  TIME  STATE  INFO
1744        root  10.10.20.45:62890  rouyi-vue-plus  Sleep    7325         NULL
1757        root  10.10.20.45:62937  rouyi-vue-plus  Sleep    7326         NULL
1778        root  10.10.20.45:63581  rouyi-vue-plus  Sleep    7344         NULL
1739        root  10.10.20.45:62870  rouyi-vue-plus  Sleep    7422         NULL
1759        root  10.10.20.45:62951  rouyi-vue-plus  Sleep    7344         NULL
1756        root  10.10.20.45:62929  rouyi-vue-plus  Sleep    7344         NULL

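2.4 Finding threads related to the lock wait (returned a long list of threads)

-- Find threads in a lock-related state, plus connections idle for more than 60 seconds
-- (an idle connection can still hold locks from an uncommitted transaction)
SELECT
    p.ID as process_id,
    p.USER,
    p.HOST,
    p.DB,
    p.COMMAND,
    p.TIME,
    p.STATE,
    p.INFO
FROM information_schema.PROCESSLIST p
WHERE p.STATE LIKE '%lock%'
   OR (p.COMMAND = 'Sleep' AND p.TIME > 60);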
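
Because most idle connections are harmless, the list can be narrowed to idle connections that still have an open InnoDB transaction. A sketch of that filter, joining on `information_schema.innodb_trx` (available in both 5.7 and 8.x):

```sql
-- Idle connections that still hold an open transaction, oldest first
SELECT
    t.trx_id,
    t.trx_mysql_thread_id AS process_id,
    t.trx_state,
    t.trx_started,
    TIMESTAMPDIFF(SECOND, t.trx_started, NOW()) AS trx_age_seconds,
    t.trx_rows_locked,
    p.USER,
    p.HOST,
    p.TIME AS idle_seconds
FROM information_schema.innodb_trx t
INNER JOIN information_schema.PROCESSLIST p ON p.ID = t.trx_mysql_thread_id
WHERE p.COMMAND = 'Sleep'
ORDER BY t.trx_started;
```

A row with a large `trx_age_seconds` and a non-zero `trx_rows_locked` is a strong candidate for the blocker.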

2.5 Pinpointing the blocking thread (a single blocker: thread 1744)

-- Show who is waiting on whom (MySQL 5.7 syntax)
SELECT 
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_id waiting_thread_id,
    r.trx_query waiting_query,
    b.trx_id blocking_trx_id,
    b.trx_mysql_thread_id blocking_thread_id,
    b.trx_query blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;
Result (one row, shown vertically):
waiting_trx_id:      487168419
waiting_thread_id:   1998
waiting_query:       select * from xxl_job_lock where lock_name = 'schedule_lock' for update
blocking_trx_id:     486996078
blocking_thread_id:  1744
blocking_query:      NULL

2.6 Inspecting the blocking thread

-- Show the details of blocking thread 1744
SELECT 
    ID as process_id,
    USER,
    HOST,
    DB,
    COMMAND,
    TIME as time_seconds,
    STATE,
    INFO
FROM information_schema.PROCESSLIST 
WHERE ID = 1744;
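
The process list only shows that the connection is asleep. To see what its transaction still holds, the same thread can be looked up in `information_schema.innodb_trx`; a small sketch, with the thread id taken from the result above:

```sql
-- Transaction state of the blocking connection
SELECT
    trx_id,
    trx_state,
    trx_started,
    trx_query,          -- NULL while the connection is idle between statements
    trx_tables_locked,
    trx_rows_locked,
    trx_rows_modified
FROM information_schema.innodb_trx
WHERE trx_mysql_thread_id = 1744;
```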

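2.7 Viewing the blocking thread's statement history

-- Show the most recent statements executed by this thread
-- (this table keeps only the last few statements per thread, typically 10)
SELECT * FROM performance_schema.events_statements_history 
WHERE THREAD_ID IN (SELECT THREAD_ID FROM performance_schema.threads WHERE PROCESSLIST_ID = 1744)
ORDER BY EVENT_ID DESC;
  • Statement history of thread 1744 (summary)
# Root cause: event 54 switched the session to manual commit, and event 55 acquired a row lock (select * from xxl_job_lock where lock_name = 'schedule_lock' for update) that was never committed.
Timeline (newest to oldest):
1. EVENT_ID 55: acquired the schedule_lock lock (still held; this is what blocks other threads)
2. EVENT_ID 54: set autocommit=0 (starts a transaction)
3. EVENT_ID 53: ran a complex query (scanned 394,197 rows, took 644 seconds!)
4. EVENT_ID 52: ran another complex query (scanned 53,388 rows, took 243 seconds)
5. EVENT_ID 51: set autocommit=1 (commits the previous transaction)
6. EVENT_ID 50: commit (commits the transaction)
7. EVENT_ID 49: acquired the schedule_lock lock earlier
8. EVENT_ID 48: set autocommit=0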
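
The timeline boils down to a two-session pattern that is easy to reproduce. The sketch below is an illustration, not the application's actual code: it reuses the `xxl_job_lock` statement from the history and assumes two separate connections, labelled A and B here.

```sql
-- Session A (the developer terminal): take the row lock and then go idle
SET autocommit = 0;
SELECT * FROM xxl_job_lock WHERE lock_name = 'schedule_lock' FOR UPDATE;
-- no COMMIT or ROLLBACK follows, so the row lock stays held

-- Session B (the scheduler): the same statement now blocks
SET autocommit = 0;
SELECT * FROM xxl_job_lock WHERE lock_name = 'schedule_lock' FOR UPDATE;
-- after innodb_lock_wait_timeout seconds:
-- ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

-- Session A: ending the transaction releases the lock
COMMIT;
```

Nothing in this sequence is a true deadlock: session B is waiting on a lock that session A simply never releases, which is why the error is a timeout rather than a deadlock rollback.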

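3. Resolution

  1. Kill connection 1744 (it was not created by a normal service call; it was a terminal session opened by a developer)
-- KILL the blocking connection (1744)
KILL 1744;
  2. After connection 1744 was killed, the blocking was resolved
-- Verify that the blocking is gone (the query should return no rows)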
SELECT 
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_id waiting_thread,
    b.trx_mysql_thread_id blocking_thread
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;
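
For future incidents, the `sys` schema (shipped with MySQL 5.7.7 and later) can identify the blocker and produce the matching `KILL` statement in one query. A sketch using columns of the `sys.innodb_lock_waits` view:

```sql
-- One row per lock wait, with a ready-made KILL statement for the blocker
SELECT
    waiting_pid,
    waiting_query,
    blocking_pid,
    blocking_query,
    wait_age_secs,
    sql_kill_blocking_connection
FROM sys.innodb_lock_waits;
```

The generated statement should still be reviewed before it is run, since killing a connection rolls back whatever work its transaction had done.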

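4. Supplement

4.1 Finding long-running queries

-- Find threads whose TIME exceeds 60 seconds
-- (for Sleep connections, TIME is the idle time, not a query's run time)
SELECT 
    'Long-running queries' as title;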
SELECT 
    ID as process_id,
    USER,
    HOST,
    DB,
    COMMAND,
    TIME,
    STATE,
    LEFT(INFO, 100) as query_snippet
FROM information_schema.PROCESSLIST
WHERE TIME > 60
ORDER BY TIME DESC;
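
Since idle connections also match `TIME > 60`, a variant that excludes them may be closer to what the heading describes. A sketch, keeping the same table as above:

```sql
-- Only statements that are actually executing for more than 60 seconds
SELECT
    ID AS process_id,
    USER,
    HOST,
    DB,
    COMMAND,
    TIME,
    STATE,
    LEFT(INFO, 100) AS query_snippet
FROM information_schema.PROCESSLIST
WHERE COMMAND <> 'Sleep'
  AND TIME > 60
ORDER BY TIME DESC;
```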

4.2 Listing all threads

-- Combined diagnostic script
SELECT 
    'Current process status' as title;
SHOW PROCESSLIST;

4.3 Listing all lock waits

-- Query through the sys schema
SELECT * FROM sys.innodb_lock_waits;

-- Detailed information
-- MySQL 5.7 query
SELECT 
    'Lock waits' as title;
SELECT 
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_id waiting_thread,
    r.trx_query waiting_query,
    b.trx_id blocking_trx_id,
    b.trx_mysql_thread_id blocking_thread,
    b.trx_query blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;

-- MySQL 8.4 query (performance_schema.data_lock_waits is available from 8.0)
SELECT 
    'Lock waits' as title;
SELECT
  r.trx_id waiting_trx_id,
  r.trx_mysql_thread_id waiting_thread,
  r.trx_query waiting_query,
  b.trx_id blocking_trx_id,
  b.trx_mysql_thread_id blocking_thread,
  b.trx_query blocking_query
FROM performance_schema.data_lock_waits w
INNER JOIN information_schema.innodb_trx b
  ON b.trx_id = w.blocking_engine_transaction_id
INNER JOIN information_schema.innodb_trx r
  ON r.trx_id = w.requesting_engine_transaction_id;

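4.4 Viewing the most recent deadlock

SHOW ENGINE INNODB STATUS;

Look for the following information:

  1. TRANSACTIONS (transaction section)
  • LOCK WAIT: a transaction is waiting for a lock
  • lock_mode X: the lock involved is an exclusive lock
  • waiting: the lock has been requested but not yet granted
  • TRX HAS BEEN WAITING 5 SEC: how long the transaction has been waiting
  2. LATEST DETECTED DEADLOCK (the most recent deadlock)

  3. Lock types

    Lock types commonly seen in the output:

  • lock_mode X: exclusive lock (write lock)
  • lock_mode S: shared lock (read lock)
  • locks rec but not gap: record lock
  • locks gap before rec: gap lock
  • lock_mode X or S with neither of the two qualifiers above: next-key lock (gap lock + record lock)
  • waiting: the lock is still being waited for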
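
`SHOW ENGINE INNODB STATUS` keeps only the most recent deadlock, and a lock wait timeout is not listed under LATEST DETECTED DEADLOCK because no deadlock was detected. Two standard server settings and counters can fill that gap; a brief sketch (changing the global variable requires the appropriate privilege):

```sql
-- Write every detected deadlock to the MySQL error log, not just the latest one
SHOW VARIABLES LIKE 'innodb_print_all_deadlocks';
SET GLOBAL innodb_print_all_deadlocks = ON;

-- Cumulative row-lock wait counters since server start
SHOW GLOBAL STATUS LIKE 'Innodb_row_lock%';
```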