GreatSQL 死锁案例分析

1.背景概述

客户业务发生死锁的报错,根据业务程序日志及业务流程,发现造成死锁的原因是:事务1 delete + insert ,事务2 delete + insert 2个事务交替执行导致的死锁;由于GAP锁阻塞了插入意向锁,并且当delete的数据存在时死锁不会发生,当delete的数据不存在时,会发生死锁。

2.问题复现

本次测试基于 GreatSQL-8.0.32-24,隔离级别为 RR

2.1 创建测试表

sql 复制代码
greatsql> create database test;

greatsql> create table test(c1 int unique key, c2 int, c3 int, c4 int);

greatsql> insert into test values (1,1,1,1),(3,3,3,3),(5,5,5,5),(9,9,9,9);

greatsql> select * from test;
+------+------+------+------+
| c1  | c2  | c3  | c4  |
+------+------+------+------+
|   1 |   1 |   1 |   1 |
|   3 |   3 |   3 |   3 |
|   5 |   5 |   5 |   5 |
|   9 |   9 |   9 |   9 |
+------+------+------+------+
4 rows in set (0.01 sec)

2.2 事务执行顺序

按以下事务执行顺序,如果要删除的数据存在,则不会发生死锁;如果要删除的数据不存,并且要删除的数据在同一个GAP锁的区间内则会发生死锁;

时间 事务1 事务2
T1 BEGIN; BEGIN;
T2 delete from test where c1=?;
T3 delete from test where c1=?;
T4 insert into test value(?,?,?,?);
T5 insert into test value(?,?,?,?);

2.3 当delete的数据存在时

事务1:delete

sql 复制代码
greatsql> begin;
Query OK, 0 rows affected (0.00 sec)

greatsql> delete from test where c1=3;
Query OK, 1 row affected (0.00 sec)

查看锁信息:

sql 复制代码
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE   | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
|     59 | test      | test     | NULL     | TABLE   | IX       | GRANTED   | NULL        |
|     59 | test      | test     | c1       | RECORD   | X,REC_NOT_GAP | GRANTED   | 3, 0x000000000201 |
|     59 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000201   |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
3 rows in set (0.00 sec)

此时事务1给 3, 0x000000000201 这条数据加了 记录锁 X,REC_NOT_GAP

事务2:delete

sql 复制代码
greatsql> begin;
Query OK, 0 rows affected (0.00 sec)

greatsql> delete from test where c1=5;
Query OK, 1 row affected (0.00 sec)

查看锁信息:

sql 复制代码
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE   | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
|     57 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |
|     57 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 5, 0x000000000202 |
|     57 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000202   |
|     59 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |
|     59 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 3, 0x000000000201 |
|     59 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000201   |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
6 rows in set (0.00 sec)

此时事务2给 5, 0x000000000202 这条数据加了 记录锁 X,REC_NOT_GAP

事务1:insert

sql 复制代码
greatsql> insert into test value(3,3,3,3);
Query OK, 1 row affected (0.00 sec)

查看锁信息:

sql 复制代码
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+

| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE   | LOCK_STATUS | LOCK_DATA     |

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+

|     57 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |

|     57 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 5, 0x000000000202 |

|     57 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000202   |

|     59 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |

|     59 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 3, 0x000000000201 |

|     59 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000201   |

|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 3, 0x000000000201 |

|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 5, 0x000000000202 |

|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 3, 0x000000000206 |

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
9 rows in set (0.00 sec)

此时事务1给 被delete删除的数据 3, 0x000000000201 ,插入的数据 3, 0x000000000202,以及相邻的下一条数据 5, 0x000000000202 加了 间隙锁 S,GAP

事务2:insert

sql 复制代码
greatsql> insert into test value(5,5,5,5);
Query OK, 1 row affected (0.01 sec)

查看锁信息:

sql 复制代码
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE   | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
|     57 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |
|     57 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 5, 0x000000000202 |
|     57 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000202   |
|     57 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 5, 0x000000000202 |
|     57 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 9, 0x000000000203 |
|     57 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 5, 0x000000000207 |
|     59 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |
|     59 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 3, 0x000000000201 |
|     59 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000201   |
|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 3, 0x000000000201 |
|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 5, 0x000000000202 |
|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 3, 0x000000000206 |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
12 rows in set (0.00 sec)

此时事务2给 被delete删除的数据 5, 0x000000000202 ,插入的数据 5, 0x000000000207,以及相邻的下一条数据 9, 0x000000000203 加了 间隙锁 S,GAP

由于GAP锁之间是相互兼容的,所以没有发生锁等待及死锁,此时事务1,事务2都执行完成,可以正常提交。

2.4 当delete的数据不存在时

事务1:delete

sql 复制代码
greatsql> begin;
Query OK, 0 rows affected (0.01 sec)

greatsql> delete from test where c1=6;
Query OK, 0 rows affected (0.00 sec)

查看锁信息:

sql 复制代码
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
|     61 | test      | test     | NULL    | TABLE   | IX     | GRANTED   | NULL        |
|     61 | test      | test     | c1     | RECORD   | X,GAP   | GRANTED   | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
2 rows in set (0.00 sec)

此时事务1给 9, 0x000000000203 这条数据加了 间隙锁 X,GAP

事务2:delete

sql 复制代码
greatsql> begin;
Query OK, 0 rows affected (0.00 sec)
 
greatsql> delete from test where c1=7;
Query OK, 0 rows affected (0.00 sec)

查看锁信息:

sql 复制代码
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
|     62 | test      | test     | NULL    | TABLE   | IX     | GRANTED   | NULL        |
|     62 | test      | test     | c1     | RECORD   | X,GAP   | GRANTED   | 9, 0x000000000203 |
|     61 | test      | test     | NULL    | TABLE   | IX     | GRANTED   | NULL        |
|     61 | test      | test     | c1     | RECORD   | X,GAP   | GRANTED   | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
4 rows in set (0.00 sec)

此时事务2给 9, 0x000000000203 这条数据加了 间隙锁 X,GAP 间隙锁可以相互兼容,因此没有报错

事务1:insert

sql 复制代码
greatsql> insert into test value(6,6,6,6);
---hang住,处于锁等待

查看锁信息:

sql 复制代码
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE        | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
|     62 | test      | test     | NULL    | TABLE   | IX           | GRANTED   | NULL        |
|     62 | test      | test     | c1     | RECORD   | X,GAP          | GRANTED   | 9, 0x000000000203 |
|     61 | test      | test     | NULL    | TABLE   | IX           | GRANTED   | NULL        |
|     61 | test      | test     | c1     | RECORD   | X,GAP          | GRANTED   | 9, 0x000000000203 |
|     61 | test      | test     | c1     | RECORD   | X,GAP,INSERT_INTENTION | WAITING   | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
5 rows in set (0.00 sec)

此时事务1,要申请给 9, 0x000000000203 这条数据加 间隙锁,插入意向锁 X,GAP,INSERT_INTENTION;由于事务2已经加了 间隙锁 X,GAP 间隙锁与插入意向锁并不兼容,因此事务1的insert处于锁等待状态

事务2:insert

sql 复制代码
greatsql> insert into test value(7,7,7,7);
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

查看锁信息:

sql 复制代码
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE        | LOCK_STATUS | LOCK_DATA        |
+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
|     61 | test      | test     | NULL       | TABLE   | IX           | GRANTED   | NULL          |
|     61 | test      | test     | c1        | RECORD   | X,GAP          | GRANTED   | 9, 0x000000000203    |
|     61 | test      | test     | c1        | RECORD   | X,GAP          | GRANTED   | 6, 0x000000000213    |
|     61 | test      | test     | GEN_CLUST_INDEX | RECORD   | X            | GRANTED   | supremum pseudo-record |
|     61 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,GAP          | GRANTED   | 0x000000000213     |
|     61 | test      | test     | c1        | RECORD   | X,GAP,INSERT_INTENTION | GRANTED   | 9, 0x000000000203    |
+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
6 rows in set (0.01 sec)

事务2回滚,只有事务1的加锁信息。

由于此时事务2,要申请给 9, 0x000000000203 这条数据加 间隙锁,插入意向锁 由于事务1已经加了 间隙锁 X,GAP 间隙锁与插入意向锁并不兼容,因此事务2的insert处于锁等待状态,2个事务相互等待锁导致死锁,此时事务2回滚。

3.总结

此次死锁的发生主要是GAP 锁 和 插入意向锁的冲突,建议让业务修改一下逻辑,先判断数据是否存在 select 一下,存在的话 delete 后在 insert ; 不存在的话直接 insert 不用delete了。

Enjoy GreatSQL :)

关于 GreatSQL

GreatSQL是适用于金融级应用的国内自主开源数据库,具备高性能、高可靠、高易用性、高安全等多个核心特性,可以作为MySQL或Percona Server的可选替换,用于线上生产环境,且完全免费并兼容MySQL或Percona Server。

相关链接: GreatSQL社区 Gitee GitHub Bilibili

GreatSQL社区:

社区有奖建议反馈: greatsql.cn/thread-54-1...

社区博客有奖征稿详情: greatsql.cn/thread-100-...

(对文章有疑问或者有独到见解都可以去社区官网提出或分享哦~)

技术交流群:

微信&QQ群:

QQ群:533341697

微信群:添加GreatSQL社区助手(微信号:wanlidbc )好友,待社区助手拉您进群。

相关推荐
小龙.nice14 分钟前
sqli—labs靶场 5-8关 (每日4关练习)持续更新!!!
数据库
Linux运维技术栈37 分钟前
生产环境centos8 & Red Hat8部署ansible and 一键部署mysql两主两从ansible脚本预告
运维·数据库·mysql·自动化·ansible
ZHOU西口1 小时前
MySQL系列之数据授权(privilege)
数据库·mysql·grant·授权·回收·revoke·privilege
尘浮生1 小时前
Java项目实战II基于微信小程序的课堂助手(开发文档+数据库+源码)
java·开发语言·数据库·微信·微信小程序·小程序·maven
EasyCVR1 小时前
海康大华宇视视频平台EasyCVR私有化视频平台服务器选购主要参数有哪些?
运维·服务器·数据库·音视频
Crossoads2 小时前
【汇编语言】更灵活的定位内存地址的方法(三)—— 不同的寻址方式的灵活应用
android·开发语言·数据库·人工智能·机器学习·数据挖掘·汇编语言
独自破碎E4 小时前
【1】猫眼娱乐后端开发面试题整理
数据库·oracle·娱乐
jiayouzhan28826 小时前
【SQL】mysql常用命令
数据库·sql·mysql·oracle·sql常用命令·mysql常用操作·sql基本操作
生信学习小达人7 小时前
对基因列表中批量的基因进行GO和KEGG注释
数据库
MrJson-架构师8 小时前
MongoDB自定义顺序排序
数据库·mongodb