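1. High Availability Overview
1.1 What Is High Availability
High availability (HA) is a system's ability to keep serving requests without interruption. It is usually expressed as a number of "nines":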
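| Availability | Downtime per year | Description |
|---|---|---|
| 99% | 3.65 days | Basic availability |
| 99.9% | 8.76 hours | Higher availability |
| 99.99% | 52.56 minutes | High availability |
| 99.999% | 5.26 minutes | Very high availability |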
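The downtime figures in the table follow directly from the availability percentage. A minimal sketch of that arithmetic, assuming a 365-day year; the function name `downtime_minutes_per_year` is illustrative and not part of any tool:

```shell
# Yearly downtime allowed by an availability percentage (assumes a 365-day year).
# Prints the result in minutes, rounded to two decimals.
downtime_minutes_per_year() {
    local availability_pct="$1"
    LC_ALL=C awk -v a="$availability_pct" \
        'BEGIN { printf "%.2f\n", (100 - a) / 100 * 365 * 24 * 60 }'
}
```

For example, `downtime_minutes_per_year 99.99` prints `52.56`, which matches the 99.99% row.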
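1.2 High-Availability Metrics
| Metric | Description | Target |
|---|---|---|
| RPO | Recovery Point Objective: the amount of data loss that can be tolerated | 0 (no data loss) |
| RTO | Recovery Time Objective: the recovery time that can be tolerated | < 30 seconds |
| MTBF | Mean Time Between Failures | The longer the better |
| MTTR | Mean Time To Repair | The shorter the better |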
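MTBF and MTTR together determine steady-state availability, which is commonly estimated as MTBF / (MTBF + MTTR). The sketch below applies that formula. It assumes both values are given in the same time unit, and the function name is hypothetical:

```shell
# Steady-state availability in percent, estimated as MTBF / (MTBF + MTTR).
# Both arguments must use the same time unit (for example, hours).
availability_percent() {
    local mtbf="$1" mttr="$2"
    LC_ALL=C awk -v up="$mtbf" -v down="$mttr" \
        'BEGIN { printf "%.4f\n", up / (up + down) * 100 }'
}
```

With one failure every 999 hours and one hour to repair, `availability_percent 999 1` prints `99.9000`, that is, three nines.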
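1.3 Comparison of MySQL High-Availability Solutions
| Solution | Mechanism | Advantages | Disadvantages | Typical use |
|---|---|---|---|---|
| Master-replica replication + Keepalived | VIP failover | Simple to set up | Split-brain risk | Small to medium deployments |
| MHA | Automatic failover | Fast failover (10-30 seconds) | Requires SSH; complex to deploy | Finance, e-commerce |
| Orchestrator | Topology management | Visual topology, automatic recovery | Steep learning curve | Medium to large deployments |
| MGR | Group replication | Strong data consistency | Requires MySQL 5.7.17+ | Strong-consistency workloads |
| InnoDB Cluster | Official MySQL solution | Tightly integrated | Requires a recent MySQL version | New projects |
| Galera Cluster | Multi-master synchronous replication | Synchronous replication | Lower write throughput | High-availability workloads |
| ProxySQL/MyCat | Read/write splitting | Flexible routing control | Adds architectural complexity | Large-scale read/write splitting |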
2. Master-Replica Replication + Keepalived
2.1 Architecture
2.2 Environment Preparation
bash
# Environment plan
# Master: 192.168.1.100
# Slave: 192.168.1.101
# VIP: 192.168.1.200
# Make sure master-replica replication is already configured
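Before adding Keepalived it is worth confirming that replication is actually running on the replica. The sketch below is one possible check: it reads the output of `SHOW SLAVE STATUS\G` (or `SHOW REPLICA STATUS\G` on newer MySQL versions) from standard input and succeeds only when both the I/O and SQL threads report `Yes`. The function name `replication_healthy` is an assumption for illustration.

```shell
# Reads "SHOW SLAVE STATUS\G" output on stdin.
# Exit status 0 only if both replication threads report "Yes".
replication_healthy() {
    awk '
        $1 == "Slave_IO_Running:"  || $1 == "Replica_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" || $1 == "Replica_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

A typical call on the replica would be `mysql -e 'SHOW SLAVE STATUS\G' | replication_healthy && echo "replication OK"`, with credentials supplied however your environment requires.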
2.3 Installing and Configuring Keepalived
bash
# 1. Install Keepalived (run the command that matches your distribution)
apt-get install keepalived -y # Ubuntu
yum install keepalived -y # CentOS
# 2. Keepalived configuration on the master (/etc/keepalived/keepalived.conf)
cat > /etc/keepalived/keepalived.conf << 'EOF'
global_defs {
router_id MYSQL_HA
script_user root
enable_script_security
}
vrrp_script check_mysql {
script "/etc/keepalived/check_mysql.sh"
interval 2
timeout 3
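# The penalty must be larger than the priority gap between the two nodes (100 - 90 = 10);
# otherwise a failed MASTER only ties with the BACKUP and may keep the VIP
weight -20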
fall 2
rise 1
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 123456
}
virtual_ipaddress {
192.168.1.200/24 dev eth0
}
track_script {
check_mysql
}
}
EOF
# 3. Keepalived configuration on the replica
# Change state to BACKUP
# Change priority to 90
# Everything else stays the same
# 4. MySQL health check script (/etc/keepalived/check_mysql.sh)
cat > /etc/keepalived/check_mysql.sh << 'EOF'
#!/bin/bash
# MySQL health check script
MYSQL_HOST="127.0.0.1"
MYSQL_USER="root"
MYSQL_PASS="password"
MYSQL_PORT=3306
# Check that the mysqld process exists
if ! pgrep -x mysqld > /dev/null; then
    echo "mysqld process not found"
    exit 1
fi
# Check that MySQL accepts connections
if ! mysqladmin -h"$MYSQL_HOST" -u"$MYSQL_USER" -p"$MYSQL_PASS" -P"$MYSQL_PORT" ping > /dev/null 2>&1; then
    echo "MySQL connection failed"
    exit 1
fi
# Optional, replicas only: also fail the check when the replication I/O thread is not running
# IO_RUNNING=$(mysql -u"$MYSQL_USER" -p"$MYSQL_PASS" -e "SHOW SLAVE STATUS\G" | grep "Slave_IO_Running:" | awk '{print $2}')
# if [ "$IO_RUNNING" != "Yes" ]; then
#     exit 1
# fi
exit 0
EOF
chmod +x /etc/keepalived/check_mysql.sh
# 5. Start Keepalived
systemctl enable keepalived
systemctl start keepalived
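How `priority` interacts with the `weight` of a tracked script decides whether the VIP actually moves. The sketch below models only the simple case: a negative weight that is added to the priority while the check is failing, and a BACKUP that takes over only when its priority is strictly higher. It ignores `nopreempt` and other Keepalived details, and both function names are hypothetical.

```shell
# Effective VRRP priority of a node tracking a script with a negative weight:
# the weight is applied only while the check is failing.
effective_priority() {
    local base="$1" weight="$2" check_ok="$3"
    if [ "$check_ok" = "yes" ]; then
        echo "$base"
    else
        echo $(( base + weight ))
    fi
}

# Succeeds when the BACKUP outranks the MASTER and is therefore expected to take the VIP.
backup_takes_over() {
    local master_effective="$1" backup_effective="$2"
    [ "$backup_effective" -gt "$master_effective" ]
}
```

With priorities 100 and 90, a weight of -20 lowers a failed MASTER to 80 and the BACKUP wins, whereas a weight of -10 leaves both at 90 and this model predicts no takeover.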
2.4 Failover Test
bash
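# 1. Check which node currently holds the VIP
ip addr show | grep 192.168.1.200
# 2. Stop MySQL on the master (the unit is named mysqld on CentOS)
systemctl stop mysql
# 3. Check whether the VIP has moved to the replica
# Run this on the replica
ip addr show | grep 192.168.1.200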
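A plain `grep` matches substrings, so a stricter check can be handy when scripting the test. The following sketch reads `ip addr show` output from standard input and succeeds only when an `inet` line carries exactly the given address. The function name `holds_vip` is an assumption.

```shell
# Reads "ip addr show" output on stdin.
# Exit status 0 only if the given IPv4 address is configured on this host.
holds_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        $1 == "inet" { split($2, addr, "/"); if (addr[1] == vip) found = 1 }
        END { exit !found }
    '
}
```

On either node, `ip addr show | holds_vip 192.168.1.200 && echo "VIP is here"` reports whether that node currently owns the VIP.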
3. MHA (Master High Availability)
3.1 MHA Architecture
3.2 Environment Preparation
bash
# Environment plan
# Manager: 192.168.1.100
# Master: 192.168.1.101
# Slave1: 192.168.1.102
# Slave2: 192.168.1.103
# Make sure passwordless SSH is set up between all nodes
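Passwordless SSH has to work from every node to every other node. As a rough sketch, the helper below only prints the `ssh-copy-id` commands that one node would need to run, so they can be reviewed first. It assumes the `root` account and an existing key at `~/.ssh/id_rsa.pub` (created with `ssh-keygen`); the function name is hypothetical.

```shell
# Prints (does not run) the ssh-copy-id commands that node $1 needs
# in order to reach every other host in the list that follows.
print_key_distribution() {
    local self="$1"
    shift
    local host
    for host in "$@"; do
        if [ "$host" != "$self" ]; then
            echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@${host}"
        fi
    done
}
```

Running `print_key_distribution 192.168.1.100 192.168.1.100 192.168.1.101 192.168.1.102 192.168.1.103` on the Manager lists three commands, one per remote node; repeat on each node with its own address as the first argument.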
3.3 Installing MHA
bash
# 1. Install dependencies
# CentOS
yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch \
    perl-Parallel-ForkManager perl-Time-HiRes perl-Digest-SHA1
# Ubuntu
apt-get install -y libdbd-mysql-perl libconfig-tiny-perl \
    liblog-dispatch-perl libparallel-forkmanager-perl
# 2. Download and install MHA (RPM packages; install the node package before the manager package)
wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.noarch.rpm
wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.noarch.rpm
rpm -ivh mha4mysql-node-0.58-0.el7.noarch.rpm
rpm -ivh mha4mysql-manager-0.58-0.el7.noarch.rpm
3.4 Configuring MHA
bash
# 1. Create the MHA user on every MySQL node
mysql -u root -p -e "CREATE USER 'mha'@'%' IDENTIFIED BY 'mha_password';"
mysql -u root -p -e "GRANT ALL PRIVILEGES ON *.* TO 'mha'@'%';"
mysql -u root -p -e "FLUSH PRIVILEGES;"
# 2. Create the MHA configuration file (/etc/mha/app1.cnf)
mkdir -p /etc/mha /var/log/mha/app1
cat > /etc/mha/app1.cnf << 'EOF'
[server default]
user=mha
password=mha_password
manager_workdir=/var/log/mha/app1
manager_log=/var/log/mha/app1/manager.log
remote_workdir=/var/log/mha/app1
ssh_user=root
repl_user=repl
repl_password=repl_password
ping_interval=3
master_ip_failover_script=/etc/mha/master_ip_failover
master_ip_online_change_script=/etc/mha/master_ip_online_change
shutdown_script=""
report_script=""
[server1]
hostname=192.168.1.101
port=3306
candidate_master=1
master_binlog_dir=/var/lib/mysql
[server2]
hostname=192.168.1.102
port=3306
candidate_master=1
master_binlog_dir=/var/lib/mysql
[server3]
hostname=192.168.1.103
port=3306
no_master=1
master_binlog_dir=/var/lib/mysql
EOF
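# 3. Create the VIP failover script (/etc/mha/master_ip_failover)
cat > /etc/mha/master_ip_failover << 'EOF'
#!/usr/bin/env perl
# MHA passes its arguments as command-line options (--command=..., --new_master_host=...)
use strict;
use warnings;
use Getopt::Long qw(:config pass_through);
my $vip = '192.168.1.200/24';
my $ssh_start_vip = "/sbin/ip addr add $vip dev eth0";
my $ssh_stop_vip = "/sbin/ip addr del $vip dev eth0";
my ($command, $ssh_user, $orig_master_host, $new_master_host);
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'new_master_host=s'  => \$new_master_host,
);
$command //= '';
$ssh_user //= 'root';
# stop: the old master is not reachable over SSH, so there is nothing to do
# status: called by the health checks and must not change anything
exit 0 if ($command eq "stop" || $command eq "status");
if ($command eq "start") {
    # Bring the VIP up on the newly promoted master
    my $rc = system("ssh $ssh_user\@$new_master_host '$ssh_start_vip'");
    exit($rc == 0 ? 0 : 1);
}
if ($command eq "stopssh") {
    # Remove the VIP from the old master while it is still reachable over SSH
    system("ssh $ssh_user\@$orig_master_host '$ssh_stop_vip'");
    exit 0;
}
exit 1;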
EOF
chmod +x /etc/mha/master_ip_failover
# 4. Check SSH connectivity between the nodes
masterha_check_ssh --conf=/etc/mha/app1.cnf
# 5. Check the replication setup
masterha_check_repl --conf=/etc/mha/app1.cnf
3.5 Starting and Managing MHA
bash
# 1. Start MHA Manager
nohup masterha_manager --conf=/etc/mha/app1.cnf \
    --remove_dead_master_conf \
    --ignore_last_failover \
    < /dev/null > /var/log/mha/manager.log 2>&1 &
# 2. Check status
masterha_check_status --conf=/etc/mha/app1.cnf
# 3. Stop MHA
masterha_stop --conf=/etc/mha/app1.cnf
# 4. Manual switchover while the master is alive (stop MHA Manager first)
masterha_master_switch --conf=/etc/mha/app1.cnf \
    --master_state=alive \
    --new_master_host=192.168.1.102 \
    --interactive=0