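Parameter 1) skip-slave-start = 0
Parameter details: with skip-slave-start disabled (0), the slave threads start automatically when MySQL starts. Setting skip-slave-start=1 enables the option, so the slave threads are not started automatically.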
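Test: the parameter cannot be found on the current slave, and it is not set in /etc/my.cnf either. After restarting the service, the slave threads started automatically and the rows inserted on the master while the service was down were replicated, so the default appears to be skip-slave-start = 0. (Note that the LIKE pattern below uses hyphens, while system variable names use underscores, so this query returns an empty set regardless of the setting.)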
[root@localhost:mytest1]>show global variables like '%slave-start%';
Empty set (0.01 sec)
[mysql@t3-dtpoc-dtpoc-web05 bin]$ service mysql stop
Shutting down MySQL.. SUCCESS!
[mysql@t3-dtpoc-dtpoc-web05 bin]$ service mysql start
Starting MySQL. SUCCESS!
[root@localhost:mytest1]>show slave status\G;
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@localhost:mytest1]>select * from t1;
+------+------+
| id | name |
+------+------+
| 8888 | tsss |
| 9999 | jjjj |
+------+------+
2 rows in set (0.00 sec)
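We then add skip-slave-start=1 to /etc/my.cnf and restart the service. The slave threads do not start automatically, and the rows inserted on the master while the service was down are not replicated, so the parameter has taken effect.
At first glance the parameter looks useless, because we normally want the slave to start on its own when the standby restarts. But suppose a full-table UPDATE or DELETE runs on the master against a large table with no primary key, and the slave needs a month to catch up. Stopping the slave then appears to do nothing, because the slave finishes applying the current transaction before it stops, much like a CDC replication that cannot be stopped gracefully and has to be killed. Restarting the database service does not stop it either; the only option is kill -9 on the MySQL process. After the service restarts, the slave begins applying the same transaction from the start again. What can you do then? Set skip-slave-start=1 so that the slave does not start automatically.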
[root@localhost:mytest1]>show slave status\G;
Slave_IO_Running: No
Slave_SQL_Running: No
[root@localhost:mytest1]>select * from t1;
+------+------+
| id | name |
+------+------+
| 8888 | tsss |
| 9999 | jjjj |
+------+------+
2 rows in set (0.00 sec)
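Since the option cannot be inspected with the query used earlier, one way to confirm what a restart will do is to read the configuration file itself. The following is a minimal sketch, not a definitive tool: the function name `skip_slave_start_enabled` and the sample text are assumptions made for illustration, and it only handles a plain [mysqld] section (no `!include` directives or inline comments).

```python
import configparser

# Values treated as "enabled"; an option written with no value also counts.
ENABLED_VALUES = {"", "1", "on", "true"}


def skip_slave_start_enabled(cnf_text: str) -> bool:
    """Return True if the [mysqld] section of the given my.cnf text enables skip-slave-start."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # options may appear without "= value"
        strict=False,         # tolerate repeated options
        interpolation=None,   # treat '%' literally
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return False
    enabled = False
    for key, value in parser.items("mysqld"):
        # MySQL accepts '-' and '_' interchangeably in option names.
        if key.replace("_", "-") == "skip-slave-start":
            enabled = value is None or value.strip().lower() in ENABLED_VALUES
    return enabled


if __name__ == "__main__":
    sample = "[mysqld]\nserver-id = 2\nskip-slave-start = 1\n"  # hypothetical file content
    print(skip_slave_start_enabled(sample))
```

With the sample text the function reports that the slave will not start automatically; with the line removed or set to 0 it reports the opposite.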
Parameter 2) slave-parallel-type = LOGICAL_CLOCK
slave-parallel-workers = 16
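Parameter details: MySQL 5.7 added slave_parallel_type. Its default is DATABASE, and it needs to be changed to LOGICAL_CLOCK, which schedules transactions by a logical clock. slave_parallel_workers defaults to 0; adjust it as needed to match the server's capacity. Setting slave_parallel_type to 'LOGICAL_CLOCK' and slave_parallel_workers to a value greater than 0 turns on parallel replication.
The defaults are currently in use: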
[root@localhost:mytest1]>show variables like '%slave_parallel%';
+------------------------+----------+
| Variable_name | Value |
+------------------------+----------+
| slave_parallel_type | DATABASE |
| slave_parallel_workers | 0 |
+------------------------+----------+
2 rows in set (0.00 sec)
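The rule described above can be written down as a small classifier over the two variables. This is only a sketch: the function name `parallel_mode` and the labels it returns are made up for illustration and are not MySQL terminology.

```python
def parallel_mode(variables: dict) -> str:
    """Classify the applier mode from slave_parallel_type / slave_parallel_workers."""
    workers = int(variables.get("slave_parallel_workers", 0))
    ptype = str(variables.get("slave_parallel_type", "DATABASE")).upper()
    if workers == 0:
        # No worker threads: the SQL thread applies everything itself.
        return "single-threaded"
    if ptype == "LOGICAL_CLOCK":
        return "logical-clock"
    # Workers > 0 with DATABASE: parallelism only across different databases.
    return "per-database"


if __name__ == "__main__":
    # The values shown by SHOW VARIABLES above.
    print(parallel_mode({"slave_parallel_type": "DATABASE", "slave_parallel_workers": "0"}))
    # The values this section goes on to configure.
    print(parallel_mode({"slave_parallel_type": "LOGICAL_CLOCK", "slave_parallel_workers": "16"}))
```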
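We add slave-parallel-type = LOGICAL_CLOCK and slave-parallel-workers = 16 to /etc/my.cnf and restart the service; there are now 16 applier worker threads.
Note: to change DATABASE to LOGICAL_CLOCK at runtime instead, the SQL thread must be stopped first: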
stop slave sql_thread;
set global slave_parallel_type='LOGICAL_CLOCK';
start slave sql_thread;
select * from performance_schema.replication_applier_status_by_worker;
[root@localhost:mytest1]>select * from performance_schema.replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+-----------------------+-------------------+--------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+-----------------------+-------------------+--------------------+----------------------+
| | 1 | 27 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 |
| | 2 | 28 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 3 | 29 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 4 | 30 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 5 | 31 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 6 | 32 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 7 | 33 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 8 | 34 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 9 | 35 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 10 | 36 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 11 | 39 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 12 | 40 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 13 | 41 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 14 | 42 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 15 | 43 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 16 | 44 | ON | | 0 | | 0000-00-00 00:00:00 |
+--------------+-----------+-----------+---------------+-----------------------+-------------------+--------------------+----------------------+
16 rows in set (0.00 sec)
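Now let's test whether a full-table update on a single table with no primary key gets any faster. In our earlier test of updating log_test, before parallel replication was enabled, the slave updated fewer than 20,000 rows in 5 minutes (28300-8962=19338 rows).
With parallel replication enabled, the slave updated 31228-14650=16578 rows in 5 minutes, so performance did not improve. The likely reason is that a single transaction can only be applied by one thread, so no parallelism was triggered.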
[root@localhost:mytest1]>set transaction_isolation='READ-UNCOMMITTED';
Query OK, 0 rows affected (0.00 sec)
[root@localhost:mytest1]>select now();
+---------------------+
| now() |
+---------------------+
| 2023-09-12 10:34:24 |
+---------------------+
1 row in set (0.00 sec)
[root@localhost:mytest1]>select count(*) from log_test where id>80000000;
+----------+
| count(*) |
+----------+
| 14650 |
+----------+
1 row in set (2.18 sec)
[root@localhost:mytest1]>select now();
+---------------------+
| now() |
+---------------------+
| 2023-09-12 10:39:24 |
+---------------------+
1 row in set (0.00 sec)
[root@localhost:mytest1]>select count(*) from log_test where id>80000000;
+----------+
| count(*) |
+----------+
| 31228 |
+----------+
1 row in set (3.04 sec)
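To make the comparison concrete, the two five-minute measurements can be converted into rows per second. This is a rough sketch using only the counts and timestamps quoted above; the helper names are illustrative, and since each count query itself took two to three seconds the rates are approximate.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two now() values as printed by the mysql client."""
    return (datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)).total_seconds()


def rows_per_second(count_start: int, count_end: int, seconds: float) -> float:
    """Average number of rows applied per second over the interval."""
    return (count_end - count_start) / seconds


if __name__ == "__main__":
    # Earlier run without parallel replication: counts from the text, 5 minutes apart.
    baseline = rows_per_second(8962, 28300, 5 * 60)
    # Run with 16 workers: counts and timestamps from the session above.
    window = elapsed_seconds("2023-09-12 10:34:24", "2023-09-12 10:39:24")
    parallel = rows_per_second(14650, 31228, window)
    print(f"baseline: {baseline:.2f} rows/s, parallel: {parallel:.2f} rows/s")
```

Both rates are on the order of tens of rows per second, which is consistent with a single thread doing all the work in either run.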
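Next, let's test full-table changes on two tables with no primary key: will the slave replicate the two tables in parallel, and how efficient is it?
On the master we insert 4 million rows into log_test, then create table log_test1 and insert 4 million rows into it. The slave creates log_test1 and starts inserting into it only after all 4 million log_test rows have been replicated, which shows that parallel replication did not apply these transactions concurrently.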
[root@localhost:mytest1]>select count(*) from log_test1;
ERROR 1146 (42S02): Table 'mytest1.log_test1' doesn't exist
[root@localhost:mytest1]>select count(*) from log_test;
+----------+
| count(*) |
+----------+
| 8000000 |
+----------+
1 row in set (5.26 sec)
[root@localhost:mytest1]>select count(*) from log_test1;
+----------+
| count(*) |
+----------+
| 299012 |
+----------+
1 row in set (0.13 sec)
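We confirm once more that parallel replication is on. After both tables are updated on the master, the slave keeps updating only one of them, so the transactions are not being applied in parallel. This is not what we expected from parallel replication.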
[root@localhost:mytest1]>select * from performance_schema.replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+-----------------------+-------------------+--------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+-----------------------+-------------------+--------------------+----------------------+
| | 1 | 31 | ON | ANONYMOUS | 0 | | 0000-00-00 00:00:00 |
| | 2 | 32 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 3 | 33 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 4 | 34 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 5 | 35 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 6 | 36 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 7 | 37 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 8 | 38 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 9 | 39 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 10 | 40 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 11 | 41 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 12 | 42 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 13 | 43 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 14 | 44 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 15 | 45 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 16 | 46 | ON | | 0 | | 0000-00-00 00:00:00 |
+--------------+-----------+-----------+---------------+-----------------------+-------------------+--------------------+----------------------+
16 rows in set (0.00 sec)
[root@localhost:mytest1]>select count(*) from log_test where id>80000000;
+----------+
| count(*) |
+----------+
| 20970 |
+----------+
1 row in set (4.55 sec)
[root@localhost:mytest1]>select count(*) from log_test1 where id>80000000;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (2.29 sec)
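Notice that only the first worker has a non-empty LAST_SEEN_TRANSACTION, and its value is ANONYMOUS. Perhaps gtid_mode being OFF is what prevents parallelism, so we continue testing.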
[root@localhost:mytest1]>show variables like '%gtid_mode%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_mode | OFF |
+---------------+-------+
1 row in set (0.00 sec)
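We add the GTID settings to /etc/my.cnf on both the master and the slave and restart the services. Repeating the full-table update on the single table without a primary key, the transaction is still applied by a single thread: in 5 minutes the slave updated 28767-5280=23487 rows.
# GTID settings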
enforce_gtid_consistency = 1
gtid_mode = on
[root@localhost:mytest1]>show variables like '%gtid_mode%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_mode | ON |
+---------------+-------+
1 row in set (0.01 sec)
[root@localhost:mytest1]>select now();
+---------------------+
| now() |
+---------------------+
| 2023-09-12 14:07:39 |
+---------------------+
1 row in set (0.00 sec)
[root@localhost:mytest1]>select count(*) from log_test where id>80000000;
+----------+
| count(*) |
+----------+
| 5280 |
+----------+
1 row in set (2.44 sec)
[root@localhost:mytest1]>select now();
+---------------------+
| now() |
+---------------------+
| 2023-09-12 14:12:39 |
+---------------------+
1 row in set (0.00 sec)
[root@localhost:mytest1]>select count(*) from log_test where id>80000000;
+----------+
| count(*) |
+----------+
| 28767 |
+----------+
1 row in set (2.30 sec)
[root@localhost:mytest1]>select * from performance_schema.replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+----------------------------------------+-------------------+--------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+----------------------------------------+-------------------+--------------------+----------------------+
| | 1 | 32 | ON | 6797f03c-2122-11ee-842b-00505695c6d5:8 | 0 | | 0000-00-00 00:00:00 |
| | 2 | 33 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 3 | 34 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 4 | 35 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 5 | 36 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 6 | 37 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 7 | 38 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 8 | 39 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 9 | 40 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 10 | 41 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 11 | 42 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 12 | 43 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 13 | 44 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 14 | 45 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 15 | 46 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 16 | 47 | ON | | 0 | | 0000-00-00 00:00:00 |
+--------------+-----------+-----------+---------------+----------------------------------------+-------------------+--------------------+----------------------+
16 rows in set (0.00 sec)
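Reading 16 rows by eye is error-prone, so the check "which workers have actually seen a transaction" can be scripted. The sketch below parses the text form of the table as printed by the mysql client; the function names are illustrative, and the sample is an abridged copy of the output above (three workers, fewer columns).

```python
def parse_mysql_table(text: str) -> list:
    """Turn an ASCII table printed by the mysql client into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders and the "N rows in set" footer
        # Drop the empty strings produced by the leading and trailing '|'.
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def busy_workers(rows: list) -> list:
    """IDs of workers whose LAST_SEEN_TRANSACTION is not empty."""
    return [int(row["WORKER_ID"]) for row in rows if row["LAST_SEEN_TRANSACTION"]]


if __name__ == "__main__":
    sample = """
+--------------+-----------+---------------+----------------------------------------+
| CHANNEL_NAME | WORKER_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION |
+--------------+-----------+---------------+----------------------------------------+
| | 1 | ON | 6797f03c-2122-11ee-842b-00505695c6d5:8 |
| | 2 | ON | |
| | 3 | ON | |
+--------------+-----------+---------------+----------------------------------------+
"""
    print(busy_workers(parse_mysql_table(sample)))
```

On the abridged sample only worker 1 is reported, which matches what the full table shows.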
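Repeating the test with full-table changes on the two tables without primary keys, still only one thread does the work: the transaction inserting into log_test finishes before the transaction inserting into log_test1 starts, so there is no parallelism.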
[root@localhost:mytest1]>select * from performance_schema.replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+-----------------------------------------+-------------------+--------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+-----------------------------------------+-------------------+--------------------+----------------------+
| | 1 | 32 | ON | 6797f03c-2122-11ee-842b-00505695c6d5:11 | 0 | | 0000-00-00 00:00:00 |
| | 2 | 33 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 3 | 34 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 4 | 35 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 5 | 36 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 6 | 37 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 7 | 38 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 8 | 39 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 9 | 40 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 10 | 41 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 11 | 42 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 12 | 43 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 13 | 44 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 14 | 45 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 15 | 46 | ON | | 0 | | 0000-00-00 00:00:00 |
| | 16 | 47 | ON | | 0 | | 0000-00-00 00:00:00 |
+--------------+-----------+-----------+---------------+-----------------------------------------+-------------------+--------------------+----------------------+
16 rows in set (0.00 sec)
[root@localhost:mytest1]>select count(*) from log_test;
+----------+
| count(*) |
+----------+
| 872638 |
+----------+
1 row in set (0.47 sec)
[root@localhost:mytest1]>select count(*) from log_test1;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)
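One possible explanation for the results above, assuming the statements were run one after another from a single session on the master: with LOGICAL_CLOCK, each transaction in the binary log carries a last_committed and a sequence_number, and the slave may only run a transaction alongside others if its last_committed is lower than the sequence_number of every transaction still executing. Transactions committed strictly one after another on the master therefore each depend on the previous one. The sketch below is a deliberately simplified model of that rule (it groups transactions into "waves" rather than modelling the real scheduler), and the numbers are hypothetical, not read from this system's binary log.

```python
def schedule_waves(transactions):
    """Group (last_committed, sequence_number) pairs into waves that may run together.

    Simplified rule: a transaction joins the current wave only if its
    last_committed is lower than every sequence_number already in the wave,
    meaning it does not depend on a transaction that is still executing.
    """
    waves = []
    current = []
    for last_committed, sequence_number in transactions:
        if current and last_committed >= min(seq for _, seq in current):
            # Dependency on a running transaction: close the wave and start a new one.
            waves.append([seq for _, seq in current])
            current = []
        current.append((last_committed, sequence_number))
    if current:
        waves.append([seq for _, seq in current])
    return waves


if __name__ == "__main__":
    # Hypothetical: INSERT log_test, CREATE TABLE log_test1, INSERT log_test1,
    # each committed only after the previous one finished on the master.
    print(schedule_waves([(0, 1), (1, 2), (2, 3)]))
    # Hypothetical: three transactions that committed in the same group on the master.
    print(schedule_waves([(0, 1), (0, 2), (0, 3)]))
```

Under this model the serial workload yields one transaction per wave no matter how many workers exist, while transactions that committed together on the master share a wave. If that is what happened here, the test would need concurrent sessions on the master to exercise the 16 workers.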