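MySQL High Availability with MHA

Contents

[1 Introduction to MySQL-MHA](#1 Introduction to MySQL-MHA)

[1.1 Why use MHA?](#1.1 Why use MHA?)

[1.2 What is MHA?](#1.2 What is MHA?)

[1.3 Components of MHA](#1.3 Components of MHA)

[1.4 Features of MHA](#1.4 Features of MHA)

[1.5 How the candidate master is chosen during failover](#1.5 How the candidate master is chosen during failover)

[1.6 How MHA works](#1.6 How MHA works)

[2 Deploying the MHA environment](#2 Deploying the MHA environment)

[2.1 Installing MHA](#2.1 Installing MHA)

[2.2 Building a one-master, two-slave environment](#2.2 Building a one-master, two-slave environment)

[2.2.1 Installing the semi-sync plugins and editing my.cnf](#2.2.1 Installing the semi-sync plugins and editing my.cnf)

[2.2.2 Setting up MySQL semi-synchronous replication](#2.2.2 Setting up MySQL semi-synchronous replication)

[2.2.3 Verifying semi-synchronous replication](#2.2.3 Verifying semi-synchronous replication)

[3 Using MHA with semi-synchronous replication](#3 Using MHA with semi-synchronous replication)

[3.1 Tools included in the MHA packages](#3.1 Tools included in the MHA packages)

[3.2 Configuring the MHA management environment](#3.2 Configuring the MHA management environment)

[3.3 Writing the main MHA configuration file](#3.3 Writing the main MHA configuration file)

[3.4 Checking the environment with the MHA tools](#3.4 Checking the environment with the MHA tools)

[3.5 masterha_master_switch parameters](#3.5 masterha_master_switch parameters)

[3.5 Environment](#3.5 Environment)

[3.6 Manual switchover test](#3.6 Manual switchover test)

[3.6.1 Simulating a failure](#3.6.1 Simulating a failure)

[3.7 Automatic failover with MHA](#3.7 Automatic failover with MHA)

[3.7.1 Switching the MASTER back to MySQL-1](#3.7.1 Switching the MASTER back to MySQL-1)

[3.7.2 Enabling automatic failover with masterha_manager](#3.7.2 Enabling automatic failover with masterha_manager)

[3.7.3 Simulating a failure by stopping the MASTER](#3.7.3 Simulating a failure by stopping the MASTER)

[3.7.4 Checking the status](#3.7.4 Checking the status)

[3.8 Implementing a VIP with MHA](#3.8 Implementing a VIP with MHA)

[3.8.1 What is the MHA VIP feature?](#3.8.1 What is the MHA VIP feature?)

[3.8.2 VIP failover with a perl script](#3.8.2 VIP failover with a perl script)

[3.8.3 Editing the key parameters of the perl script](#3.8.3 Editing the key parameters of the perl script)

[3.8.4 Making the script executable](#3.8.4 Making the script executable)

[3.8.5 Pointing the configuration file at the script](#3.8.5 Pointing the configuration file at the script)

[3.8.6 Testing VIP failover with automatic failover](#3.8.6 Testing VIP failover with automatic failover)

[3.8.7 VIP failover with a manual switch](#3.8.7 VIP failover with a manual switch)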


1 Introduction to MySQL-MHA

1.1 Why use MHA?

To remove the master as a single point of failure.

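1.2 What is MHA?

  • MHA (Master High Availability) is a mature tool for failover and master-slave replication management in MySQL high-availability environments.
  • MHA exists to solve MySQL's single-point-of-failure problem.
  • When the master fails, MHA can complete the failover automatically, usually within 0 to 30 seconds.
  • During failover MHA preserves data consistency as far as it can, which is what makes the setup highly available in practice.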

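1.3 Components of MHA

  • MHA has two parts: MHA Manager (the management node) and MHA Node (the database node).
  • MHA Manager can run on a dedicated machine and manage several master-slave clusters, or it can run on one of the slaves.
  • MHA Manager probes the cluster's master at regular intervals.
  • When the master fails, MHA Manager promotes the slave with the most recent data to be the new master and repoints all other slaves to it.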

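1.4 Features of MHA

During automatic failover, MHA tries to save the binary logs from the crashed master so that as little data as possible is lost.

Semi-synchronous replication greatly reduces the risk of data loss. If even one slave has received the latest binary log events, MHA can apply them to all the other slaves and so keep every node consistent.

MHA currently supports a one-master, multiple-slave topology with at least three servers, that is, one master and two slaves.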

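1.5 How the candidate master is chosen during failover

1. Slaves are normally ranked by replication position or GTID. When their data differs, the slave closest to the master becomes the candidate.

2. When the slaves hold identical data, the candidate is chosen in the order the servers appear in the configuration file.

3. When a weight is set (candidate_master=1), that server is forced to be the candidate.

(1) By default, a slave that is more than 100 MB of relay logs behind the master is not chosen even if it has the weight.

(2) With check_repl_delay=0, the slave is chosen even if it is far behind.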
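The selection rules above can be sketched as a small shell function. This is a simplified model for illustration, not MHA's actual code: the function name `pick_new_master`, the five-column input format and the use of a byte count for replication lag are all assumptions.

```bash
#!/usr/bin/env bash
# Simplified model of the selection rules; hypothetical helper, not part of MHA.
# Input on stdin, one slave per line, in configuration-file order:
#   host  lag_bytes  candidate_master  no_master  check_repl_delay
pick_new_master() {
  awk -v limit=$((100 * 1024 * 1024)) '
    # A server marked no_master=1 is never promoted
    $4 == 1 { next }
    # Rule 3: the first candidate wins unless it lags more than 100 MB
    # while the delay check is still enabled
    $3 == 1 && ($2 + 0 <= limit || $5 == 0) && cand == "" { cand = $1 }
    # Rules 1 and 2: otherwise the least-lagging slave, first one on ties
    best == "" || $2 + 0 < best_lag { best = $1; best_lag = $2 + 0 }
    END { if (cand != "") print cand; else if (best != "") print best }
  '
}

# Example with the two slaves of this walkthrough:
#   printf '%s\n' '192.168.239.220 0 1 0 0' '192.168.239.230 0 0 1 1' | pick_new_master
```

Keeping the input as plain text makes it easy to try different lag values and flags and see which rule decides the outcome.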

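1.6 How MHA works

  • MHA mainly supports a one-master, multiple-slave topology. A replication cluster needs at least three database servers: one acts as the master, one as the candidate master, and one as a plain slave.
  • MHA Node runs on every MySQL server.
  • MHA Manager probes the cluster's master at regular intervals.
  • When the master fails, MHA Manager automatically promotes the slave with the most recent data to be the new master.
  • It then repoints all other slaves to the new master, and the VIP moves to the new master automatically.
  • The whole failover is transparent to applications.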

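Environment used for the experiments in this chapter

| Role    | Node    | IP address      |
|---------|---------|-----------------|
| MASTER  | MySQL-1 | 192.168.239.210 |
| SLAVE-1 | MySQL-2 | 192.168.239.220 |
| SLAVE-2 | MySQL-3 | 192.168.239.230 |
| MHA     | MHA     | 192.168.239.50  |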

2 MHA环境部署

2.1 MHA环境安装

On the MHA host:

[root@docker-rhel ~]# ls
anaconda-ks.cfg  initial-setup-ks.cfg           MHA-7.zip                         vmset.sh  模板  图片  下载  桌面
centos7.tar      memc-nginx-module-0.20.tar.gz  srcache-nginx-module-0.33.tar.gz  公共      视频  文档  音乐

[root@docker-rhel ~]# unzip MHA-7.zip
[root@docker-rhel ~]# ls
anaconda-ks.cfg  initial-setup-ks.cfg           MHA-7      srcache-nginx-module-0.33.tar.gz  公共  视频  文档  音乐
centos7.tar      memc-nginx-module-0.20.tar.gz  MHA-7.zip  vmset.sh                          模板  图片  下载  桌面
[root@docker-rhel ~]# cd MHA-7/

[root@docker-rhel MHA-7]# ls
mha4mysql-manager-0.58-0.el7.centos.noarch.rpm  perl-Email-Date-Format-1.002-15.el7.noarch.rpm  perl-MIME-Lite-3.030-1.el7.noarch.rpm
mha4mysql-manager-0.58.tar.gz                   perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm       perl-MIME-Types-1.38-2.el7.noarch.rpm
mha4mysql-node-0.58-0.el7.centos.noarch.rpm     perl-Mail-Sender-0.8.23-1.el7.noarch.rpm        perl-Net-Telnet-3.03-19.el7.noarch.rpm
perl-Config-Tiny-2.14-7.el7.noarch.rpm          perl-Mail-Sendmail-0.79-21.el7.noarch.rpm       perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm

[root@docker-rhel MHA-7]#  yum install *.rpm -y


[root@docker-rhel MHA-7]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm root@192.168.239.210:~    
[root@docker-rhel MHA-7]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm root@192.168.239.220:~
[root@docker-rhel MHA-7]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm root@192.168.239.230:~


# Install the node package on all three database servers
# (it was copied to root's home directory by the scp commands above)
[root@mysql-01 ~]#  yum install ~/mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y
[root@mysql-02 ~]#  yum install ~/mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y
[root@mysql-03 ~]#  yum install ~/mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y

2.2 Building a one-master, two-slave environment

2.2.1 Installing the semi-sync plugins and editing my.cnf

# MASTER
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

[root@mysql-01 ~]# vim /etc/my.cnf
[mysqld]
datadir=/data/mysql
socket=/data/mysql/mysql.sock
symbolic-links=0
server_id=10
log_bin=mysql-bin
gtid_mode=ON
enforce-gtid-consistency=ON
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_slave_enabled = 1

[root@mysql-01 ~]# systemctl restart mysqld

# SLAVE-1

mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

[root@mysql-02 ~]# vim /etc/my.cnf
[mysqld]
datadir=/data/mysql
socket=/data/mysql/mysql.sock
symbolic-links=0
server_id=20
gtid_mode=ON
enforce-gtid-consistency=ON
rpl_semi_sync_slave_enabled=1
rpl_semi_sync_master_enabled=1
log_bin=mysql-bin

[root@mysql-02 ~]# systemctl restart mysqld

# SLAVE-2

mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

[root@mysql-03 ~]# vim /etc/my.cnf
[mysqld]
datadir=/data/mysql
socket=/data/mysql/mysql.sock
server-id=30
log-bin=mysql-bin
gtid_mode=ON
log_slave_updates=ON
enforce-gtid-consistency=ON
symbolic-links=0
rpl_semi_sync_slave_enabled=1

[root@mysql-03 ~]# systemctl restart mysqld

2.2.2 Setting up MySQL semi-synchronous replication

################################## MASTER ################################### 

CREATE USER 'repl'@'%' IDENTIFIED BY 'Openlab123!';
GRANT REPLICATION SLAVE ON *.* TO repl@'%';
FLUSH PRIVILEGES;


################################### SLAVE-1 ################################### 

CREATE USER 'repl'@'%' IDENTIFIED BY 'Openlab123!';
GRANT REPLICATION SLAVE ON *.* TO repl@'%';
FLUSH PRIVILEGES;
stop slave;
reset slave all;

change master to
master_host='192.168.239.210',
master_user='repl',
master_password='Openlab123!',
master_auto_position=1;

start slave;


#################################### SLAVE-2 ################################### 

CREATE USER 'repl'@'%' IDENTIFIED BY 'Openlab123!';
GRANT REPLICATION SLAVE ON *.* TO repl@'%';
FLUSH PRIVILEGES;
stop slave;
reset slave all;

change master to
master_host='192.168.239.210',
master_user='repl',
master_password='Openlab123!',
master_auto_position=1;

start slave;

2.2.3 Verifying semi-synchronous replication

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.239.210
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000016
          Read_Master_Log_Pos: 194
               Relay_Log_File: mysql-02-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000016
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 194
              Relay_Log_Space: 577
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
                  Master_UUID: abe58cad-6282-11ef-a9ac-000c29a51779
             Master_Info_File: /data/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: abe58cad-6282-11ef-a9ac-000c29a51779:1-6,
d7f89afd-6282-11ef-ab48-000c299efdf0:1-5
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.239.220
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000016
          Read_Master_Log_Pos: 194
               Relay_Log_File: mysql-03-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000016
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 194
              Relay_Log_Space: 577
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 20
                  Master_UUID: d7f89afd-6282-11ef-ab48-000c299efdf0
             Master_Info_File: /data/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: abe58cad-6282-11ef-a9ac-000c29a51779:1-6,
abe65851-6282-11ef-bf19-000c298f2a8a:1,
d7f89afd-6282-11ef-ab48-000c299efdf0:1-5
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)
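The two fields that matter most in the output above are Slave_IO_Running and Slave_SQL_Running. A minimal sketch of a check that can be scripted, assuming the `SHOW SLAVE STATUS\G` text is piped in on stdin (the function name `replication_running` is made up for this example):

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads `SHOW SLAVE STATUS\G` output on stdin and
# succeeds only when both replication threads report Yes.
replication_running() {
  awk -F': ' '
    { sub(/[ \t\r]+$/, "") }                  # drop trailing whitespace
    $1 ~ /Slave_IO_Running$/  { io = $2 }
    $1 ~ /Slave_SQL_Running$/ { sql = $2 }    # does not match Slave_SQL_Running_State
    END { exit !(io == "Yes" && sql == "Yes") }
  '
}

# Possible use on a slave (credentials are placeholders):
#   mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | replication_running && echo "replication OK"
```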

Check on the master and on slave 1 that the semi-sync plugins are loaded and active:

#############################  MASTER   ################################### 
mysql> show status like 'Rpl%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 2     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+

 #############################  SLAVE-1   ################################### 

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 2     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
| Rpl_semi_sync_slave_status                 | ON    |
+--------------------------------------------+-------+
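On the master, the values to look at are Rpl_semi_sync_master_status (should be ON) and Rpl_semi_sync_master_clients (the number of semi-sync slaves currently connected). The same check can be sketched as a function, assuming the status is fetched in the tab-separated form that `mysql -N -B` prints; the function name `semi_sync_master_ok` is an assumption of this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads "name<TAB>value" lines on stdin and succeeds
# only if semi-sync is ON on the master side with at least one client.
semi_sync_master_ok() {
  awk '
    $1 == "Rpl_semi_sync_master_status"  { status = $2 }
    $1 == "Rpl_semi_sync_master_clients" { clients = $2 + 0 }
    END { exit !(status == "ON" && clients >= 1) }
  '
}

# Possible use on the master (credentials are placeholders):
#   mysql -uroot -p -N -B -e "SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_%'" \
#     | semi_sync_master_ok && echo "semi-sync is active"
```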


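3 Using MHA with semi-synchronous replication

3.1 Tools included in the MHA packages

1. The Manager package mainly includes the following tools:

| Tool                    | Description                                         |
|-------------------------|-----------------------------------------------------|
| masterha_check_ssh      | Checks MHA's SSH configuration                      |
| masterha_check_repl     | Checks MySQL replication                            |
| masterha_manager        | Starts MHA                                          |
| masterha_check_status   | Reports the current MHA running state               |
| masterha_master_monitor | Detects whether the master is down                  |
| masterha_master_switch  | Controls failover (automatic or manual)             |
| masterha_conf_host      | Adds or removes server entries in the configuration |

2. The Node package (normally invoked by the MHA Manager host, no need to run these by hand):

| Tool                  | Description                                                                  |
|-----------------------|------------------------------------------------------------------------------|
| save_binary_logs      | Saves and copies the master's binary logs                                    |
| apply_diff_relay_logs | Identifies differential relay log events and applies them to the other slaves |
| filter_mysqlbinlog    | Removes unnecessary ROLLBACK events (MHA no longer uses this tool)           |
| purge_relay_logs      | Purges relay logs (without blocking the SQL thread)                          |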

3.2 Configuring the MHA management environment

[root@docker-rhel masterha]# masterha_manager --help
Usage:
    masterha_manager --global_conf=/etc/masterha_default.cnf  # global configuration file with settings shared by all clusters
    --conf=/usr/local/masterha/conf/app1.cnf # per-cluster configuration file with that cluster's own settings
    See online reference
    (http://code.google.com/p/mysql-master-ha/wiki/masterha_manager) for
    details.

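3.3 Writing the main MHA configuration file

  • There is only one master-slave cluster here, so a single configuration file is enough.
  • The rpm package does not ship a configuration template.
  • Template files can be found under samples in the unpacked source package.
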
[root@docker-rhel ~]# mkdir /etc/masterha
[root@docker-rhel MHA-7]# tar zxf mha4mysql-manager-0.58.tar.gz
[root@docker-rhel MHA-7]# cd mha4mysql-manager-0.58/samples/conf/
[root@docker-rhel conf]# cat masterha_default.cnf app1.cnf > /etc/masterha/app1.cnf
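[root@docker-rhel masterha]# vim /etc/masterha/app1.cnf
# Keep every comment on its own line. MHA parses this file with Perl's
# Config::Tiny, which only treats whole-line "#" comments as comments,
# so text placed after a value would become part of the value.
[server default]
# MySQL administrative user; MHA needs it to reconfigure the servers automatically
user=root
# Password of that MySQL user
password=Openlab123!
# SSH user
ssh_user=root
# Directory that holds the master's binary logs
master_binlog_dir=/data/mysql
# Working directory on the remote MySQL hosts
remote_workdir=/tmp
# User and password used to authenticate MySQL replication
repl_user=repl
repl_password=Openlab123!
# Second-opinion check that runs when the manager cannot reach the master.
# In MHA's documentation each -s host is a machine from which the master is
# re-checked over another route. This walkthrough passes the master's own
# address 192.168.239.210 plus the spare address 192.168.239.211 added on the
# master below; if 192.168.239.210 does not answer, 192.168.239.211 is tried.
secondary_check_script=masterha_secondary_check -s 192.168.239.210 -s 192.168.239.211
# Heartbeat interval in seconds
ping_interval=3
# Master failover script
# master_ip_failover_script=/usr/local/bin/master_ip_failover
# Shutdown script
# shutdown_script=/script/masterha/power_manager
# Report script
# report_script=/script/masterha/send_report
# Online master change script
# master_ip_online_change_script=/usr/local/bin/master_ip_online_change
# Manager working directory
manager_workdir=/etc/masterha
# Manager log file
manager_log=/etc/masterha/manager.log

[server1]
# Host name or IP address of server 1
hostname=192.168.239.210
# This server may be promoted to master
candidate_master=1

[server2]
# Host name or IP address of server 2
hostname=192.168.239.220
# This server may be promoted to master
candidate_master=1
# By default MHA does not promote a slave that is more than 100 MB of relay
# logs behind the master, because recovering it would take a long time.
# check_repl_delay=0 makes MHA ignore replication delay when it picks the new
# master. That is useful on a host with candidate_master=1, since this
# candidate is then guaranteed to become the new master during a switch.
check_repl_delay=0

[server3]
# Host name or IP address of server 3
hostname=192.168.239.230
# Never promote this server to master
no_master=1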
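One pitfall with this file is worth a guard. MHA reads it through Perl's Config::Tiny, and to my knowledge that parser does not strip a trailing `# ...` from a `key=value` line, so an inline comment ends up inside the value. The sketch below lists such lines; the function name `find_inline_comments` is made up for this example, and a value that legitimately contains " #" would be reported as well.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints every "key=value # comment" line of a config
# file with its line number. Returns 1 if any were found, 0 if the file is
# clean, 2 if the file cannot be read. Whole-line comments are ignored.
find_inline_comments() {
  local conf=$1
  [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 2; }
  if grep -nE '^[[:space:]]*[^#;[:space:]][^=]*=.*[[:space:]]#' "$conf"; then
    return 1
  fi
  return 0
}

# Possible use on the manager host:
#   find_inline_comments /etc/masterha/app1.cnf || echo "move these comments to their own lines"
```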

Add a second IP address to the master:

[root@mysql-01 ~]# ip addr add 192.168.239.211/24 dev eth0

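Note:

The three database hosts must be able to log in to one another over SSH without a password, and the MHA host must be able to log in to each of them the same way. Without this, MHA cannot work.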
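A minimal sketch of the key distribution, assuming root logins and the four addresses used in this walkthrough. The function only prints the commands for one host so they can be reviewed before running them; the name `print_ssh_setup` and the choice of an RSA key without a passphrase are assumptions of this example.

```bash
#!/usr/bin/env bash
# Hosts from this walkthrough: the MHA host and the three MySQL nodes.
HOSTS=(192.168.239.50 192.168.239.210 192.168.239.220 192.168.239.230)

# Hypothetical helper: prints the commands to run on host $1 so that it can
# log in to every other host without a password.
print_ssh_setup() {
  local self=$1 h
  echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
  for h in "${HOSTS[@]}"; do
    [ "$h" = "$self" ] && continue
    echo "ssh-copy-id root@$h"
  done
}

# Example: print_ssh_setup 192.168.239.50
```

Running the printed commands once on each of the four hosts gives the full mesh of passwordless logins that the note above asks for.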

3.4 Checking the environment with the MHA tools

Check that the environment is suitable, first the SSH connectivity and then replication:

[root@docker-rhel ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
[root@docker-rhel ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf
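The two checks can be chained so that the replication check only runs when SSH is fine. This is a sketch that assumes both tools return a non-zero exit status on failure; the wrapper name `run_mha_checks` is made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: runs both MHA pre-flight checks against one
# configuration file and stops at the first failure.
run_mha_checks() {
  local conf=$1
  masterha_check_ssh  --conf="$conf" || return 1
  masterha_check_repl --conf="$conf" || return 1
}

# Possible use on the MHA host:
#   run_mha_checks /etc/masterha/app1.cnf && echo "environment looks ready"
```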

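3.5 masterha_master_switch parameters

| Parameter                  | Description                                                                                     |
|----------------------------|-------------------------------------------------------------------------------------------------|
| --conf                     | Location of the MHA configuration file, for example /etc/masterha/app1.cnf                      |
| --master_state             | State of the master; alive means the master is up and running                                   |
| --new_master_host          | IP address of the new master, for example 192.168.239.220                                       |
| --new_master_port          | Port of the new master, for example 3306                                                        |
| --orig_master_is_new_slave | Makes the original master a slave of the new master                                             |
| --running_updates_limit    | Threshold in seconds: the switch is aborted if a write on the master has been running longer than this or a slave lags by more than this, for example 10000 |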

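3.5 Environment

The environment was already built in the steps above:

| Role    | Node    | IP address      |
|---------|---------|-----------------|
| MASTER  | MySQL-1 | 192.168.239.210 |
| SLAVE-1 | MySQL-2 | 192.168.239.220 |
| SLAVE-2 | MySQL-3 | 192.168.239.230 |
| MHA     | MHA     | 192.168.239.50  |

3.6 Manual switchover test
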
[root@docker-rhel masterha]# masterha_master_switch \
--conf=/etc/masterha/app1.cnf \
--master_state=alive \
--new_master_host=192.168.239.220 \
--new_master_port=3306 \
--orig_master_is_new_slave \
--running_updates_limit=10000

Sun Aug 25 22:39:09 2024 - [info] MHA::MasterRotate version 0.58.
Sun Aug 25 22:39:09 2024 - [info] Starting online master switch..
Sun Aug 25 22:39:09 2024 - [info] 
Sun Aug 25 22:39:09 2024 - [info] * Phase 1: Configuration Check Phase..
Sun Aug 25 22:39:09 2024 - [info] 
Sun Aug 25 22:39:09 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Aug 25 22:39:09 2024 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sun Aug 25 22:39:09 2024 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sun Aug 25 22:39:10 2024 - [info] GTID failover mode = 1
Sun Aug 25 22:39:10 2024 - [info] Current Alive Master: 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 22:39:10 2024 - [info] Alive Slaves:
Sun Aug 25 22:39:10 2024 - [info]   192.168.239.220(192.168.239.220:3306)  Version=5.7.44-log (oldest major version between slaves) log-bin:enabled
Sun Aug 25 22:39:10 2024 - [info]     GTID ON
Sun Aug 25 22:39:10 2024 - [info]     Replicating from 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 22:39:10 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Aug 25 22:39:10 2024 - [info]   192.168.239.230(192.168.239.230:3306)  Version=5.7.44-log (oldest major version between slaves) log-bin:enabled
Sun Aug 25 22:39:10 2024 - [info]     GTID ON
Sun Aug 25 22:39:10 2024 - [info]     Replicating from 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 22:39:10 2024 - [info]     Not candidate for the new Master (no_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.239.210(192.168.239.210:3306)? (YES/no): yes
Sun Aug 25 22:39:11 2024 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Sun Aug 25 22:39:11 2024 - [info]  ok.
Sun Aug 25 22:39:11 2024 - [info] Checking MHA is not monitoring or doing failover..
Sun Aug 25 22:39:11 2024 - [info] Checking replication health on 192.168.239.220..
Sun Aug 25 22:39:11 2024 - [info]  ok.
Sun Aug 25 22:39:11 2024 - [info] Checking replication health on 192.168.239.230..
Sun Aug 25 22:39:11 2024 - [info]  ok.
Sun Aug 25 22:39:11 2024 - [info] 192.168.239.220 can be new master.
Sun Aug 25 22:39:11 2024 - [info] 
From:
192.168.239.210(192.168.239.210:3306) (current master)
 +--192.168.239.220(192.168.239.220:3306)
 +--192.168.239.230(192.168.239.230:3306)

To:
192.168.239.220(192.168.239.220:3306) (new master)
 +--192.168.239.230(192.168.239.230:3306)
 +--192.168.239.210(192.168.239.210:3306)

Starting master switch from 192.168.239.210(192.168.239.210:3306) to 192.168.239.220(192.168.239.220:3306)? (yes/NO): yes
Sun Aug 25 22:39:13 2024 - [info] Checking whether 192.168.239.220(192.168.239.220:3306) is ok for the new master..
Sun Aug 25 22:39:13 2024 - [info]  ok.
Sun Aug 25 22:39:13 2024 - [info] 192.168.239.210(192.168.239.210:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Sun Aug 25 22:39:13 2024 - [info] 192.168.239.210(192.168.239.210:3306): Resetting slave pointing to the dummy host.
Sun Aug 25 22:39:13 2024 - [info] ** Phase 1: Configuration Check Phase completed.
Sun Aug 25 22:39:13 2024 - [info] 
Sun Aug 25 22:39:13 2024 - [info] * Phase 2: Rejecting updates Phase..
Sun Aug 25 22:39:13 2024 - [info] 
master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes
Sun Aug 25 22:39:14 2024 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Sun Aug 25 22:39:14 2024 - [info] Executing FLUSH TABLES WITH READ LOCK..
Sun Aug 25 22:39:14 2024 - [info]  ok.
Sun Aug 25 22:39:14 2024 - [info] Orig master binlog:pos is mysql-bin.000016:194.
Sun Aug 25 22:39:14 2024 - [info]  Waiting to execute all relay logs on 192.168.239.220(192.168.239.220:3306)..
Sun Aug 25 22:39:14 2024 - [info]  master_pos_wait(mysql-bin.000016:194) completed on 192.168.239.220(192.168.239.220:3306). Executed 0 events.
Sun Aug 25 22:39:14 2024 - [info]   done.
Sun Aug 25 22:39:14 2024 - [info] Getting new master's binlog name and position..
Sun Aug 25 22:39:14 2024 - [info]  mysql-bin.000017:194
Sun Aug 25 22:39:14 2024 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.239.220', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sun Aug 25 22:39:14 2024 - [info] Setting read_only=0 on 192.168.239.220(192.168.239.220:3306)..
Sun Aug 25 22:39:14 2024 - [info]  ok.
Sun Aug 25 22:39:14 2024 - [info] 
Sun Aug 25 22:39:14 2024 - [info] * Switching slaves in parallel..
Sun Aug 25 22:39:14 2024 - [info] 
Sun Aug 25 22:39:14 2024 - [info] -- Slave switch on host 192.168.239.230(192.168.239.230:3306) started, pid: 13067
Sun Aug 25 22:39:14 2024 - [info] 
Sun Aug 25 22:39:16 2024 - [info] Log messages from 192.168.239.230 ...
Sun Aug 25 22:39:16 2024 - [info] 
Sun Aug 25 22:39:14 2024 - [info]  Waiting to execute all relay logs on 192.168.239.230(192.168.239.230:3306)..
Sun Aug 25 22:39:14 2024 - [info]  master_pos_wait(mysql-bin.000016:194) completed on 192.168.239.230(192.168.239.230:3306). Executed 0 events.
Sun Aug 25 22:39:14 2024 - [info]   done.
Sun Aug 25 22:39:14 2024 - [info]  Resetting slave 192.168.239.230(192.168.239.230:3306) and starting replication from the new master 192.168.239.220(192.168.239.220:3306)..
Sun Aug 25 22:39:14 2024 - [info]  Executed CHANGE MASTER.
Sun Aug 25 22:39:15 2024 - [info]  Slave started.
Sun Aug 25 22:39:16 2024 - [info] End of log messages from 192.168.239.230 ...
Sun Aug 25 22:39:16 2024 - [info] 
Sun Aug 25 22:39:16 2024 - [info] -- Slave switch on host 192.168.239.230(192.168.239.230:3306) succeeded.
Sun Aug 25 22:39:16 2024 - [info] Unlocking all tables on the orig master:
Sun Aug 25 22:39:16 2024 - [info] Executing UNLOCK TABLES..
Sun Aug 25 22:39:16 2024 - [info]  ok.
Sun Aug 25 22:39:16 2024 - [info] Starting orig master as a new slave..
Sun Aug 25 22:39:16 2024 - [info]  Resetting slave 192.168.239.210(192.168.239.210:3306) and starting replication from the new master 192.168.239.220(192.168.239.220:3306)..
Sun Aug 25 22:39:16 2024 - [info]  Executed CHANGE MASTER.
Sun Aug 25 22:39:17 2024 - [info]  Slave started.
Sun Aug 25 22:39:17 2024 - [info] All new slave servers switched successfully.
Sun Aug 25 22:39:17 2024 - [info] 
Sun Aug 25 22:39:17 2024 - [info] * Phase 5: New master cleanup phase..
Sun Aug 25 22:39:17 2024 - [info] 
Sun Aug 25 22:39:17 2024 - [info]  192.168.239.220: Resetting slave info succeeded.
Sun Aug 25 22:39:17 2024 - [info] Switching master to 192.168.239.220(192.168.239.220:3306) completed successfully.
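The last line of the log names the new master. When the switch is driven from a script, that line can be used to confirm the result. A minimal sketch, assuming the output of masterha_master_switch was saved to a file; the function name `new_master_from_log` and the file name `switch.log` are made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads an MHA switch log on stdin and prints the host
# named in the "Switching master to ... completed successfully." line.
# Prints nothing if the log contains no such line.
new_master_from_log() {
  sed -n 's/.*Switching master to \([^(]*\)(.*completed successfully\..*/\1/p'
}

# Possible use after saving the output, e.g. with `... 2>&1 | tee switch.log`:
#   new_master_from_log < switch.log
```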

3.6.1 Simulating a failure

In a real environment the master can go down unexpectedly.

To simulate that, stop MySQL on the master:

[root@mysql-01 ~]# service mysqld stop 
Shutting down MySQL. SUCCESS! 
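
| Parameter              | Description                                                                                   |
|------------------------|-----------------------------------------------------------------------------------------------|
| --master_state         | State of the master; dead means the master is down                                            |
| --conf                 | Location of the MHA configuration file, for example /etc/masterha/app1.cnf                    |
| --dead_master_host     | IP address of the dead master, for example 192.168.239.210                                    |
| --dead_master_port     | Port of the dead master, for example 3306                                                     |
| --new_master_host      | IP address of the new master, for example 192.168.239.220                                     |
| --new_master_port      | Port of the new master, for example 3306                                                      |
| --ignore_last_failover | Ignores the record (lock file) of the previous failover; by default MHA refuses to fail over again within 8 hours of the last one |
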
[root@docker-rhel masterha]# masterha_master_switch \
--master_state=dead \
--conf=/etc/masterha/app1.cnf \
--dead_master_host=192.168.239.210 \
--dead_master_port=3306 \
--new_master_host=192.168.239.220 \
--new_master_port=3306 \
--ignore_last_failover


--dead_master_ip=<dead_master_ip> is not set. Using 192.168.239.210.
Sun Aug 25 23:07:55 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Aug 25 23:07:55 2024 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sun Aug 25 23:07:55 2024 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sun Aug 25 23:07:55 2024 - [info] MHA::MasterFailover version 0.58.
Sun Aug 25 23:07:55 2024 - [info] Starting master failover.
Sun Aug 25 23:07:55 2024 - [info] 
Sun Aug 25 23:07:55 2024 - [info] * Phase 1: Configuration Check Phase..
Sun Aug 25 23:07:55 2024 - [info] 
Sun Aug 25 23:07:56 2024 - [info] GTID failover mode = 1
Sun Aug 25 23:07:56 2024 - [info] Dead Servers:
Sun Aug 25 23:07:56 2024 - [info]   192.168.239.210(192.168.239.210:3306)
Sun Aug 25 23:07:56 2024 - [info] Checking master reachability via MySQL(double check)...
Sun Aug 25 23:07:56 2024 - [info]  ok.
Sun Aug 25 23:07:56 2024 - [info] Alive Servers:
Sun Aug 25 23:07:56 2024 - [info]   192.168.239.220(192.168.239.220:3306)
Sun Aug 25 23:07:56 2024 - [info]   192.168.239.230(192.168.239.230:3306)
Sun Aug 25 23:07:56 2024 - [info] Alive Slaves:
Sun Aug 25 23:07:56 2024 - [info]   192.168.239.220(192.168.239.220:3306)  Version=5.7.44-log (oldest major version between slaves) log-bin:enabled
Sun Aug 25 23:07:56 2024 - [info]     GTID ON
Sun Aug 25 23:07:56 2024 - [info]     Replicating from 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 23:07:56 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Aug 25 23:07:56 2024 - [info]   192.168.239.230(192.168.239.230:3306)  Version=5.7.44-log (oldest major version between slaves) log-bin:enabled
Sun Aug 25 23:07:56 2024 - [info]     GTID ON
Sun Aug 25 23:07:56 2024 - [info]     Replicating from 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 23:07:56 2024 - [info]     Not candidate for the new Master (no_master is set)
Master 192.168.239.210(192.168.239.210:3306) is dead. Proceed? (yes/NO): yes
Sun Aug 25 23:07:57 2024 - [info] Starting GTID based failover.
Sun Aug 25 23:07:57 2024 - [info] 
Sun Aug 25 23:07:57 2024 - [info] ** Phase 1: Configuration Check Phase completed.
Sun Aug 25 23:07:57 2024 - [info] 
Sun Aug 25 23:07:57 2024 - [info] * Phase 2: Dead Master Shutdown Phase..
Sun Aug 25 23:07:57 2024 - [info] 
Sun Aug 25 23:07:58 2024 - [info] HealthCheck: SSH to 192.168.239.210 is reachable.
Sun Aug 25 23:07:58 2024 - [info] Forcing shutdown so that applications never connect to the current master..
Sun Aug 25 23:07:58 2024 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Sun Aug 25 23:07:58 2024 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sun Aug 25 23:07:58 2024 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sun Aug 25 23:07:58 2024 - [info] 
Sun Aug 25 23:07:58 2024 - [info] * Phase 3: Master Recovery Phase..
Sun Aug 25 23:07:58 2024 - [info] 
Sun Aug 25 23:07:58 2024 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sun Aug 25 23:07:58 2024 - [info] 
Sun Aug 25 23:07:58 2024 - [info] The latest binary log file/position on all slaves is mysql-bin.000017:194
Sun Aug 25 23:07:58 2024 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sun Aug 25 23:07:58 2024 - [info]   192.168.239.220(192.168.239.220:3306)  Version=5.7.44-log (oldest major version between slaves) log-bin:enabled
Sun Aug 25 23:07:58 2024 - [info]     GTID ON
Sun Aug 25 23:07:58 2024 - [info]     Replicating from 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 23:07:58 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Aug 25 23:07:58 2024 - [info]   192.168.239.230(192.168.239.230:3306)  Version=5.7.44-log (oldest major version between slaves) log-bin:enabled
Sun Aug 25 23:07:58 2024 - [info]     GTID ON
Sun Aug 25 23:07:58 2024 - [info]     Replicating from 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 23:07:58 2024 - [info]     Not candidate for the new Master (no_master is set)
Sun Aug 25 23:07:58 2024 - [info] The oldest binary log file/position on all slaves is mysql-bin.000017:194
Sun Aug 25 23:07:58 2024 - [info] Oldest slaves:
Sun Aug 25 23:07:58 2024 - [info]   192.168.239.220(192.168.239.220:3306)  Version=5.7.44-log (oldest major version between slaves) log-bin:enabled
Sun Aug 25 23:07:58 2024 - [info]     GTID ON
Sun Aug 25 23:07:58 2024 - [info]     Replicating from 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 23:07:58 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Aug 25 23:07:58 2024 - [info]   192.168.239.230(192.168.239.230:3306)  Version=5.7.44-log (oldest major version between slaves) log-bin:enabled
Sun Aug 25 23:07:58 2024 - [info]     GTID ON
Sun Aug 25 23:07:58 2024 - [info]     Replicating from 192.168.239.210(192.168.239.210:3306)
Sun Aug 25 23:07:58 2024 - [info]     Not candidate for the new Master (no_master is set)
Sun Aug 25 23:07:58 2024 - [info] 
Sun Aug 25 23:07:58 2024 - [info] * Phase 3.3: Determining New Master Phase..
Sun Aug 25 23:07:58 2024 - [info] 
Sun Aug 25 23:07:58 2024 - [info] 192.168.239.220 can be new master.
Sun Aug 25 23:07:58 2024 - [info] New master is 192.168.239.220(192.168.239.220:3306)
Sun Aug 25 23:07:58 2024 - [info] Starting master failover..
Sun Aug 25 23:07:58 2024 - [info] 
From:
192.168.239.210(192.168.239.210:3306) (current master)
 +--192.168.239.220(192.168.239.220:3306)
 +--192.168.239.230(192.168.239.230:3306)

To:
192.168.239.220(192.168.239.220:3306) (new master)
 +--192.168.239.230(192.168.239.230:3306)

Starting master switch from 192.168.239.210(192.168.239.210:3306) to 192.168.239.220(192.168.239.220:3306)? (yes/NO): yes
Sun Aug 25 23:07:59 2024 - [info] New master decided manually is 192.168.239.220(192.168.239.220:3306)
Sun Aug 25 23:07:59 2024 - [info] 
Sun Aug 25 23:07:59 2024 - [info] * Phase 3.3: New Master Recovery Phase..
Sun Aug 25 23:07:59 2024 - [info] 
Sun Aug 25 23:07:59 2024 - [info]  Waiting all logs to be applied.. 
Sun Aug 25 23:07:59 2024 - [info]   done.
Sun Aug 25 23:07:59 2024 - [info] Getting new master's binlog name and position..
Sun Aug 25 23:07:59 2024 - [info]  mysql-bin.000017:194
Sun Aug 25 23:07:59 2024 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.239.220', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sun Aug 25 23:07:59 2024 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000017, 194, abe58cad-6282-11ef-a9ac-000c29a51779:1-6,
d7f89afd-6282-11ef-ab48-000c299efdf0:1-5
Sun Aug 25 23:07:59 2024 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
Sun Aug 25 23:07:59 2024 - [info] Setting read_only=0 on 192.168.239.220(192.168.239.220:3306)..
Sun Aug 25 23:07:59 2024 - [info]  ok.
Sun Aug 25 23:07:59 2024 - [info] ** Finished master recovery successfully.
Sun Aug 25 23:07:59 2024 - [info] * Phase 3: Master Recovery Phase completed.
Sun Aug 25 23:07:59 2024 - [info] 
Sun Aug 25 23:07:59 2024 - [info] * Phase 4: Slaves Recovery Phase..
Sun Aug 25 23:07:59 2024 - [info] 
Sun Aug 25 23:07:59 2024 - [info] 
Sun Aug 25 23:07:59 2024 - [info] * Phase 4.1: Starting Slaves in parallel..
Sun Aug 25 23:07:59 2024 - [info] 
Sun Aug 25 23:07:59 2024 - [info] -- Slave recovery on host 192.168.239.230(192.168.239.230:3306) started, pid: 13538. Check tmp log /etc/masterha/192.168.239.230_3306_20240825230755.log if it takes time..
Sun Aug 25 23:08:01 2024 - [info] 
Sun Aug 25 23:08:01 2024 - [info] Log messages from 192.168.239.230 ...
Sun Aug 25 23:08:01 2024 - [info] 
Sun Aug 25 23:07:59 2024 - [info]  Resetting slave 192.168.239.230(192.168.239.230:3306) and starting replication from the new master 192.168.239.220(192.168.239.220:3306)..
Sun Aug 25 23:07:59 2024 - [info]  Executed CHANGE MASTER.
Sun Aug 25 23:08:00 2024 - [info]  Slave started.
Sun Aug 25 23:08:00 2024 - [info]  gtid_wait(abe58cad-6282-11ef-a9ac-000c29a51779:1-6,
d7f89afd-6282-11ef-ab48-000c299efdf0:1-5) completed on 192.168.239.230(192.168.239.230:3306). Executed 0 events.
Sun Aug 25 23:08:01 2024 - [info] End of log messages from 192.168.239.230.
Sun Aug 25 23:08:01 2024 - [info] -- Slave on host 192.168.239.230(192.168.239.230:3306) started.
Sun Aug 25 23:08:01 2024 - [info] All new slave servers recovered successfully.
Sun Aug 25 23:08:01 2024 - [info] 
Sun Aug 25 23:08:01 2024 - [info] * Phase 5: New master cleanup phase..
Sun Aug 25 23:08:01 2024 - [info] 
Sun Aug 25 23:08:01 2024 - [info] Resetting slave info on the new master..
Sun Aug 25 23:08:01 2024 - [info]  192.168.239.220: Resetting slave info succeeded.
Sun Aug 25 23:08:01 2024 - [info] Master failover to 192.168.239.220(192.168.239.220:3306) completed successfully.
Sun Aug 25 23:08:01 2024 - [info] 

----- Failover Report -----

app1: MySQL Master failover 192.168.239.210(192.168.239.210:3306) to 192.168.239.220(192.168.239.220:3306) succeeded

Master 192.168.239.210(192.168.239.210:3306) is down!

Check MHA Manager logs at docker-rhel for details.

Started manual(interactive) failover.
Selected 192.168.239.220(192.168.239.220:3306) as a new master.
192.168.239.220(192.168.239.220:3306): OK: Applying all logs succeeded.
192.168.239.230(192.168.239.230:3306): OK: Slave started, replicating from 192.168.239.220(192.168.239.220:3306)
192.168.239.220(192.168.239.220:3306): Resetting slave info succeeded.
Master failover to 192.168.239.220(192.168.239.220:3306) completed successfully.

Recover the failed node:

[root@mysql-01 ~]# systemctl start mysqld

Checking SLAVE-2 shows that the master role does not revert to MySQL-1: 192.168.239.220 has taken over as MASTER.

On MySQL-1, point replication at the current MASTER (MySQL-2) so that MySQL-1 becomes a slave:

stop slave;
reset slave all;
change master to
master_host='192.168.239.220',
master_user='repl',
master_password='Openlab123!',
master_auto_position=1;
start slave;

A failover leaves a lock file (app1.failover.complete) behind. Unless it is deleted, the following experiments cannot be run:

[root@docker-rhel ~]# rm -rf /etc/masterha/app1.failover.complete /etc/masterha/manager.log
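
The cleanup can also be wrapped in a small guard that reports whether a marker file was actually present. This is only a sketch: the function name clear_failover_lock is hypothetical (it is not part of MHA), and the directory /etc/masterha and application name app1 are the ones used in this walkthrough.

```shell
# Hypothetical helper (not part of MHA): remove the <app>.failover.complete
# marker that a previous failover left in the manager working directory.
clear_failover_lock() {
    local workdir="$1"   # e.g. /etc/masterha (assumed from this walkthrough)
    local app="$2"       # e.g. app1
    local lock="$workdir/$app.failover.complete"
    if [ -e "$lock" ]; then
        rm -f -- "$lock"
        echo "removed $lock"
    else
        echo "no lock file at $lock"
    fi
}

# Usage on the manager host:
#   clear_failover_lock /etc/masterha app1
```

Unlike the command above, this sketch leaves manager.log in place, so the record of the previous failover stays available.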

3.7 Automatic failover with MHA

3.7.1 Switch MASTER back to MySQL-1

[root@docker-rhel masterha]# masterha_master_switch \
--conf=/etc/masterha/app1.cnf \
--master_state=alive \
--new_master_host=192.168.239.210 \
--new_master_port=3306 \
--orig_master_is_new_slave \
--running_updates_limit=10000

Replication state on each node after the switch:

############################## MASTER #############################
mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000022
         Position: 194
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: abe58cad-6282-11ef-a9ac-000c29a51779:1-6,
d7f89afd-6282-11ef-ab48-000c299efdf0:1-5




############################## SLAVE-1 #############################

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.239.210
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000022
          Read_Master_Log_Pos: 194
               Relay_Log_File: mysql-02-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000022
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

############################## SLAVE-2 #############################

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.239.210
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000022
          Read_Master_Log_Pos: 194
               Relay_Log_File: mysql-03-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000022
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
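
In the slave output above, the two fields that show whether replication is healthy are Slave_IO_Running and Slave_SQL_Running. The check can be scripted; the following is a minimal sketch in which the function name replication_ok is hypothetical and the input is assumed to be the text printed by `show slave status\G`.

```shell
# Hypothetical helper: read `show slave status\G` output on stdin and
# succeed only if both replication threads report "Yes".
replication_ok() {
    awk '
        $1 == "Slave_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage on a slave (prompts for the password):
#   mysql -uroot -p -e 'show slave status\G' | replication_ok && echo "replication healthy"
```

Matching on the exact field name, including the colon, keeps the similarly named Slave_SQL_Running_State line from being picked up by mistake.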

3.7.2 Enable automatic failover with masterha_manager

  • When the master fails, masterha_manager selects a new master and repoints the remaining slaves to it, which keeps the data consistent and the service available.

[root@docker-rhel masterha]# pwd
/etc/masterha
[root@docker-rhel masterha]# masterha_manager --conf=/etc/masterha/app1.cnf &

3.7.3 Simulate a failure by stopping MASTER

[root@mysql-01 ~]# systemctl stop mysqld

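When MySQL-1 is started again, it does not take the master role back from the automatically promoted MASTER (MySQL-2), and it does not rejoin as a slave on its own either. It has to be pointed at the new master manually:

[root@mysql-01 ~]# systemctl start mysqld
[root@mysql-01 ~]# mysql -uroot -pOpenlab123!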
mysql> change master to
       master_host='192.168.239.220',
       master_user='repl',
       master_password='Openlab123!',
       master_auto_position=1;

mysql> start slave;

mysql> show slave status\G

3.7.4 Check the status
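
One way to check the manager side is masterha_check_status, which ships with MHA Manager. The sketch below assumes the tool exits with 0 while the manager is running and with a non-zero code otherwise; the wrapper name mha_is_running and the MHA_CHECK_CMD override are hypothetical and exist only to make the check scriptable.

```shell
# Hypothetical wrapper around masterha_check_status.
# MHA_CHECK_CMD can be set to substitute another command (for dry runs).
mha_is_running() {
    local conf="$1"
    "${MHA_CHECK_CMD:-masterha_check_status}" --conf="$conf" >/dev/null 2>&1
}

# Usage on the manager host:
#   if mha_is_running /etc/masterha/app1.cnf; then
#       echo "manager is running"
#   else
#       echo "manager is not running"
#   fi
```

masterha_manager exits once a failover has completed, so a "not running" result right after the test in 3.7.3 is expected. The replication state itself is checked with show slave status\G on each slave, as in 3.7.1.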

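3.8 Implementing VIP failover with MHA

3.8.1 What is MHA's VIP feature?

An application is configured with a single database IP address when it is first designed, and changing that address in every program after each failure would be far too tedious. MHA can therefore provide a keepalived-like capability: VIP (virtual IP) failover, in which the virtual IP moves to the new master.

Lab environment: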

Role      Node      IP address
MASTER    MySQL-1   192.168.239.210
SLAVE-1   MySQL-2   192.168.239.220
SLAVE-2   MySQL-3   192.168.239.230

3.8.2 VIP failover implemented with Perl scripts

The core functionality is implemented in Perl.

master_ip_failover: VIP script for automatic failover

#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

my $vip = '192.168.239.100/24';
my $ssh_start_vip = "/sbin/ip addr add $vip dev eth0";
my $ssh_stop_vip = "/sbin/ip addr del $vip dev eth0";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
     return 0  unless  ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

master_ip_online_change: VIP script for manual (online) switchover

#!/usr/bin/env perl
use strict;  
use warnings FATAL =>'all';  
  
use Getopt::Long;  
  
my $vip = '192.168.239.100/24';
my $ssh_start_vip = "/sbin/ip addr add $vip dev eth0";  
my $ssh_stop_vip = "/sbin/ip addr del $vip dev eth0";  
my $exit_code = 0;  
  
my (  
  $command,              $orig_master_is_new_slave, $orig_master_host,  
  $orig_master_ip,       $orig_master_port,         $orig_master_user,  
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,  
  $new_master_ip,        $new_master_port,          $new_master_user,  
  $new_master_password,  $new_master_ssh_user,  
);  
GetOptions(  
  'command=s'                => \$command,  
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,  
  'orig_master_host=s'       => \$orig_master_host,  
  'orig_master_ip=s'         => \$orig_master_ip,  
  'orig_master_port=i'       => \$orig_master_port,  
  'orig_master_user=s'       => \$orig_master_user,  
  'orig_master_password=s'   => \$orig_master_password,  
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,  
  'new_master_host=s'        => \$new_master_host,  
  'new_master_ip=s'          => \$new_master_ip,  
  'new_master_port=i'        => \$new_master_port,  
  'new_master_user=s'        => \$new_master_user,  
  'new_master_password=s'    => \$new_master_password,  
  'new_master_ssh_user=s'    => \$new_master_ssh_user,  
);  
  
  
exit &main();  
  
sub main {

    #print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
        # If you manage master ip address at global catalog database,
        # invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
            print "\n\n\n***************************************************************\n";
            print "Disabling the VIP - $vip on old master: $orig_master_host\n";
            print "***************************************************************\n\n\n\n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "\n\n\n***************************************************************\n";
            print "Enabling the VIP - $vip on new master: $new_master_host \n";
            print "***************************************************************\n\n\n\n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_online_change --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

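3.8.3 Adjust the key parameters in the Perl scripts

The values to adapt in both scripts are $vip and the interface name (eth0) used in $ssh_start_vip and $ssh_stop_vip.

# Script for automatic VIP failover
[root@docker-rhel masterha]# vim /usr/local/bin/master_ip_failover

# Script for manual switchover
[root@docker-rhel masterha]# vim /usr/local/bin/master_ip_online_change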

3.8.4 Make the scripts executable

[root@docker-rhel masterha]# chmod +x /usr/local/bin/master_ip_failover
[root@docker-rhel masterha]# chmod +x /usr/local/bin/master_ip_online_change

3.8.5 Edit the configuration file to specify the script paths

[root@docker-rhel masterha]# vim /etc/masterha/app1.cnf
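
The two settings this step concerns are master_ip_failover_script and master_ip_online_change_script (the earlier failover log warns when the former is not set). The sketch below shows the lines as they would look with the script paths from 3.8.3, plus a small check that both are present. Placing them in the [server default] section is an assumption, and the function name check_vip_scripts is hypothetical.

```shell
# Lines expected in /etc/masterha/app1.cnf (assumed to sit under [server default]):
#   master_ip_failover_script=/usr/local/bin/master_ip_failover
#   master_ip_online_change_script=/usr/local/bin/master_ip_online_change

# Hypothetical helper: report which VIP script settings exist in an MHA config.
# Returns 0 only if both settings are present.
check_vip_scripts() {
    local conf="$1" key missing=0
    for key in master_ip_failover_script master_ip_online_change_script; do
        if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
            echo "$key: set"
        else
            echo "$key: MISSING"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the manager host:
#   check_vip_scripts /etc/masterha/app1.cnf
```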

The VIP must be present on the MASTER from the start, because applications are configured to connect through this address:

[root@mysql-01 ~]# ip addr add 192.168.239.100/24 dev eth0

3.8.6 Test VIP failover during automatic failover

Start the automatic failover manager, which now also moves the VIP:

[root@docker-rhel masterha]# masterha_manager --conf=/etc/masterha/app1.cnf &

Stop MASTER (MySQL-1):

[root@mysql-01 ~]# systemctl stop mysqld

Check the log
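
To see which node was promoted, the manager log can be searched for the completion message. This sketch assumes the log is written to /etc/masterha/manager.log (the file removed in the cleanup step earlier) and that an automatic failover logs the same "Master failover to ... completed successfully." line that appears in the manual failover output. The function name new_master_from_log is hypothetical.

```shell
# Hypothetical helper: print the host named in the last
# "Master failover to HOST(HOST:PORT) completed successfully." line of a log file.
new_master_from_log() {
    sed -n 's/.*Master failover to \([^(]*\)(.*completed successfully\..*/\1/p' "$1" | tail -n 1
}

# Usage on the manager host:
#   new_master_from_log /etc/masterha/manager.log
```

If no such line is found, the function prints nothing, which is a hint that the failover did not finish.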

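Check SLAVE-2 (MySQL-3)

As before, even after MySQL-1 is started again it does not automatically become a slave.

Since MySQL-2 is now the master, MySQL-1 has to be configured as a slave pointing at MySQL-2:
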
mysql> change master to
       master_host='192.168.239.220',
       master_user='repl',
       master_password='Openlab123!',
       master_auto_position=1;

mysql> start slave;

3.8.7 VIP failover during a manual switchover

A manual switchover gives the same result; here the VIP is moved by the master_ip_online_change script.

The lock file has to be deleted first, otherwise the switch fails with an error:

[root@docker-rhel masterha]# rm -rf app1.failover.complete

Switch manually to MySQL-1:

[root@docker-rhel masterha]# masterha_master_switch \
--conf=/etc/masterha/app1.cnf \
--master_state=alive \
--new_master_host=192.168.239.210 \
--new_master_port=3306 \
--orig_master_is_new_slave \
--running_updates_limit=10000

Checking the addresses shows that the VIP has moved back to MySQL-1.
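
Where the VIP currently lives can be confirmed from the output of `ip -4 addr show` on each node. This is a minimal sketch: the function name has_vip is hypothetical, and the usage example assumes root SSH access from the manager host and the eth0 interface used by the scripts above.

```shell
# Hypothetical helper: read `ip -4 addr show` output on stdin and succeed
# if the given address is configured (matched as a fixed string, any prefix length).
has_vip() {
    grep -qF "inet $1/"
}

# Usage from the manager host (assumes root SSH access and interface eth0):
#   for h in 192.168.239.210 192.168.239.220 192.168.239.230; do
#       ssh root@"$h" ip -4 addr show dev eth0 | has_vip 192.168.239.100 && echo "VIP is on $h"
#   done
```

After the switch in this section, only 192.168.239.210 should be reported.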
