Using triggers to keep a large table in sync online (UDI: UPDATE, DELETE, INSERT)

Applies to:

MySQL Server - Version 8.0 and later

Information in this document applies to any platform.

Goal

Modify the datatype of a column in a large table without extended app downtime.

Solution

Modifying the datatype of a column in a MySQL table cannot be done as an online operation. The table has to be completely rebuilt. For large tables, this can result in extended application downtime.

Refer to the documentation on Online DDL.

As a workaround, it's possible to create a table that will exist in parallel with the existing table, modify the column datatype, load it with the existing table's data, and keep it up to date with the existing table via triggers. Then, a brief app downtime can be taken to rename the tables so that the new table replaces the old table.

Here is a working example from the standard MySQL "employees" test database:

Given the following table definition:

Create Table: CREATE TABLE `employees` (
  `emp_no` int NOT NULL,
  `birth_date` date NOT NULL,
  `first_name` varchar(14) NOT NULL,
  `last_name` varchar(16) NOT NULL,
  `gender` enum('M','F') NOT NULL,
  `hire_date` date NOT NULL,
  PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

It's easy to create a new, empty table just like it, but with emp_no as BIGINT instead of INT, like this:

mysql> CREATE TABLE IF NOT EXISTS employees_new LIKE employees;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> ALTER TABLE employees_new MODIFY COLUMN emp_no BIGINT;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0

Now we have a new, empty table, employees_new, with the same definition as employees except that emp_no is a BIGINT.

Note that any foreign key constraints in the original table definition will have to be manually added to the new table, because foreign key constraints are not automatically created when the new table is created with CREATE TABLE ... LIKE.
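To find out which foreign keys are involved, the data dictionary can be queried. The following is a minimal sketch, assuming the current default schema is the one that holds employees; it lists constraints defined on employees as well as constraints in other tables that reference it.

```sql
-- Sketch: list foreign keys defined on, or pointing at, the employees table.
-- Assumes the session's default schema (DATABASE()) contains the table.
SELECT CONSTRAINT_NAME,
       TABLE_NAME,
       COLUMN_NAME,
       REFERENCED_TABLE_NAME,
       REFERENCED_COLUMN_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = DATABASE()
  AND REFERENCED_TABLE_NAME IS NOT NULL          -- keep only foreign keys
  AND (TABLE_NAME = 'employees'                  -- FKs defined on employees
       OR REFERENCED_TABLE_NAME = 'employees')   -- FKs that reference employees
ORDER BY TABLE_NAME, CONSTRAINT_NAME, ORDINAL_POSITION;
```

Rows where TABLE_NAME is employees are the constraints that would need to be added to employees_new by hand.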

Triggers will be needed on INSERT, UPDATE, and DELETE to keep the new table in sync with the original table. For example, an AFTER INSERT trigger would look something like this:

DROP TRIGGER IF EXISTS empnew_insert;
DELIMITER //
CREATE TRIGGER empnew_insert
AFTER INSERT ON employees
FOR EACH ROW BEGIN
  INSERT INTO employees_new
  SELECT * FROM employees WHERE emp_no = NEW.emp_no;
END;
//
DELIMITER ;

Similar triggers would be created for UPDATE and DELETE operations:

DROP TRIGGER IF EXISTS empnew_delete;
DELIMITER //
CREATE TRIGGER empnew_delete
AFTER DELETE ON employees
FOR EACH ROW BEGIN
  DELETE FROM employees_new
  WHERE emp_no = OLD.emp_no;
END;
//
DELIMITER ;

DROP TRIGGER IF EXISTS empnew_update;
DELIMITER //
CREATE TRIGGER empnew_update
AFTER UPDATE ON employees
FOR EACH ROW BEGIN
  DELETE FROM employees_new
  WHERE emp_no = OLD.emp_no;
  INSERT INTO employees_new
  SELECT * FROM employees WHERE emp_no = NEW.emp_no;
END;
//
DELIMITER ;
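As a defensive variant, the insert trigger could write the NEW row values directly and use REPLACE instead of INSERT, so that it does not fail with a duplicate-key error should the same row already be present in employees_new. This is only a sketch of an alternative, not part of the original procedure; it reuses the trigger name empnew_insert and therefore replaces the trigger shown earlier.

```sql
-- Sketch: idempotent variant of the AFTER INSERT trigger.
-- REPLACE overwrites an existing row with the same primary key
-- instead of raising a duplicate-key error.
DROP TRIGGER IF EXISTS empnew_insert;
DELIMITER //
CREATE TRIGGER empnew_insert
AFTER INSERT ON employees
FOR EACH ROW BEGIN
  REPLACE INTO employees_new
    (emp_no, birth_date, first_name, last_name, gender, hire_date)
  VALUES
    (NEW.emp_no, NEW.birth_date, NEW.first_name,
     NEW.last_name, NEW.gender, NEW.hire_date);
END;
//
DELIMITER ;
```

Listing the columns explicitly also avoids depending on both tables having the same column order.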

The initial load of the existing data must not create one huge transaction, which is very important on a large table. To achieve this, a stored procedure would be created that inserts rows into employees_new in groups, picking rows that exist in employees but are not yet loaded into employees_new, like this:

mysql> DROP PROCEDURE IF EXISTS copy_emp;
mysql> DELIMITER //
mysql> CREATE PROCEDURE copy_emp()
    -> BEGIN
    ->   REPEAT
    ->     -- Reader's note: could this join be slow on a large table?
    ->     -- A nested-loop join touches a lot of data, and a hash join
    ->     -- needs a full table scan on every iteration.
    ->     INSERT INTO employees_new SELECT employees.*
    ->     FROM employees LEFT JOIN employees_new
    ->     USING (emp_no)
    ->     WHERE employees_new.emp_no IS NULL LIMIT 10000;
    ->   UNTIL ROW_COUNT() = 0
    ->   END REPEAT;
    -> END //
Query OK, 0 rows affected (0.02 sec)

mysql> DELIMITER ;

In this example stored procedure, the table is loaded 10,000 rows at a time, and the loop repeats until there are no more rows to add. The number of rows loaded on each iteration can be changed, but should be kept small enough that huge transactions are not created.
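On very large tables, repeating the anti-join on every iteration may become expensive. One possible alternative, sketched below, walks the primary key in ascending ranges instead (keyset batching), which is a different technique from the procedure above. The procedure name copy_emp_by_range and the parameter p_batch are hypothetical, and the sketch assumes autocommit is enabled so that each batch commits on its own.

```sql
-- Sketch: copy rows in primary-key ranges of at most p_batch rows each.
-- Assumes autocommit=1 and that the sync triggers are already in place.
DROP PROCEDURE IF EXISTS copy_emp_by_range;
DELIMITER //
CREATE PROCEDURE copy_emp_by_range(IN p_batch INT)
BEGIN
  DECLARE v_low  BIGINT;   -- highest key already copied
  DECLARE v_high BIGINT;   -- highest key of the current batch

  -- Start just below the smallest key (NULL if the table is empty).
  SELECT MIN(emp_no) - 1 INTO v_low FROM employees;

  copy_loop: LOOP
    -- Upper bound of the next batch; NULL when no rows remain.
    SELECT MAX(emp_no) INTO v_high
    FROM (SELECT emp_no FROM employees
          WHERE emp_no > v_low
          ORDER BY emp_no
          LIMIT p_batch) AS batch;

    IF v_high IS NULL THEN
      LEAVE copy_loop;
    END IF;

    -- IGNORE skips rows the triggers have already copied.
    INSERT IGNORE INTO employees_new
    SELECT * FROM employees
    WHERE emp_no > v_low AND emp_no <= v_high;

    SET v_low = v_high;
  END LOOP;
END //
DELIMITER ;

CALL copy_emp_by_range(10000);
```

Rows inserted into employees behind the current position are not revisited by this loop; they are expected to arrive through the triggers. Note that INSERT IGNORE also downgrades some other errors to warnings, so the result should be verified afterwards.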

Here's an example of calling the stored procedure:

mysql> select count(*) from employees;
+----------+
| count(*) |
+----------+
|   300024 |
+----------+
1 row in set (0.02 sec)

mysql> select count(*) from employees_new;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql> call copy_emp();
Query OK, 0 rows affected (14.30 sec)

mysql> select count(*) from employees_new;
+----------+
| count(*) |
+----------+
|   300024 |
+----------+

All of the work up to this point can be done without app downtime, though it will create some additional workload from copying the data, writing the binary logs, and so on.
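Matching row counts alone do not prove that the two tables hold the same data. A few comparison queries, sketched here under the assumption that both tables fit the column list shown earlier, can give more confidence; each of them is expected to return 0. They scan both tables, so they add load of their own.

```sql
-- Rows present in the original but missing from the copy
SELECT COUNT(*) AS missing_in_new
FROM employees e
LEFT JOIN employees_new n ON n.emp_no = e.emp_no
WHERE n.emp_no IS NULL;

-- Rows present in the copy but no longer in the original
SELECT COUNT(*) AS extra_in_new
FROM employees_new n
LEFT JOIN employees e ON e.emp_no = n.emp_no
WHERE e.emp_no IS NULL;

-- Rows with the same key but different column values
SELECT COUNT(*) AS differing_rows
FROM employees e
JOIN employees_new n ON n.emp_no = e.emp_no
WHERE NOT (e.birth_date <=> n.birth_date
       AND e.first_name <=> n.first_name
       AND e.last_name  <=> n.last_name
       AND e.gender     <=> n.gender
       AND e.hire_date  <=> n.hire_date);
```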

Once the new table is in sync with the original table, the app would be taken down, and the tables would be renamed:

mysql> RENAME TABLE employees TO employees_old;

mysql> RENAME TABLE employees_new TO employees;
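The two renames can also be combined into a single RENAME TABLE statement, which MySQL performs atomically, so there is no moment at which no table named employees exists. The sketch below additionally drops the sync triggers: triggers stay attached to the renamed original table (now employees_old) and would otherwise still try to write to employees_new, which no longer exists after the swap.

```sql
-- Sketch: swap both tables in one atomic statement
RENAME TABLE employees     TO employees_old,
             employees_new TO employees;

-- The sync triggers followed the original table to employees_old;
-- they are no longer needed and would fail on any write to it.
DROP TRIGGER IF EXISTS empnew_insert;
DROP TRIGGER IF EXISTS empnew_update;
DROP TRIGGER IF EXISTS empnew_delete;
```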

Note that any foreign key constraints that referenced the original table will have to be dropped and recreated so that they reference the original table name (employees) again, because when the table is renamed, the FK references are automatically modified to follow it to its new name (employees_old). Also, if any FK constraints reference the column being modified, the referencing columns in the tables containing those constraints will also have to be converted to BIGINT. To do this, drop the constraint, modify the column definition, and add the constraint back, referencing the original table name.
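For a referencing table, the drop, modify, and re-add sequence might look like the sketch below. It assumes the dept_emp table from the same sample database and assumes its constraint is named dept_emp_ibfk_1 with ON DELETE CASCADE; the actual name and options should be taken from SHOW CREATE TABLE dept_emp. Modifying the column rebuilds dept_emp as well, which matters if that table is also large.

```sql
-- Sketch: re-point a referencing table at the new employees table.
-- Constraint name and ON DELETE action are assumptions; check
-- SHOW CREATE TABLE dept_emp for the real definition first.
ALTER TABLE dept_emp DROP FOREIGN KEY dept_emp_ibfk_1;

-- The referencing column must match the new BIGINT type.
ALTER TABLE dept_emp MODIFY COLUMN emp_no BIGINT NOT NULL;

-- Recreate the constraint against the table now named employees.
ALTER TABLE dept_emp
  ADD CONSTRAINT dept_emp_ibfk_1
  FOREIGN KEY (emp_no) REFERENCES employees (emp_no)
  ON DELETE CASCADE;
```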

Once the new table is ready and any foreign key constraints have been added or modified as needed, the app can be brought back online.

As with any operation, be sure to test the process in a non-production environment prior to implementing in production.
