📌 Preface
In today's data-driven applications, database performance and high availability are core challenges of system architecture. As business volume grows, a single database instance often becomes the performance bottleneck. This article walks you through building a complete MySQL master-slave replication cluster with Docker, from scratch, and putting a read-write splitting layer in front of it.
🎯 1. Architecture Design
1.1 Why Master-Slave Replication?
In real production environments, a single MySQL database can run into the following pain points:
| Pain point | Impact | Solution |
|---|---|---|
| Performance bottleneck | Mixed reads and writes, heavy I/O pressure | Read-write splitting |
| Single point of failure | Database down, service interrupted | Master-slave failover |
| Backups are disruptive | Backups degrade performance while they run | Back up from a slave |
| Poor scalability | Vertical scaling is expensive | Horizontal scaling |
1.2 Our Technology Stack
📦 Docker + Docker Compose
├── 🐳 MySQL 8.0
├── 🔀 ProxySQL 2.5
└── 📊 Custom monitoring scripts
1.3 Architecture Topology
The topology reduces to this flow: the application talks only to ProxySQL on port 6033; ProxySQL routes writes to the master and reads to the two slaves; the master ships changes to both slaves via binlog replication; each node persists to its own data volume.

Application ──────► ProxySQL:6033
ProxySQL ─ writes ─► MySQL Master  (master data volume)
ProxySQL ─ reads ──► MySQL Slave1  (slave1 data volume)
ProxySQL ─ reads ──► MySQL Slave2  (slave2 data volume)
MySQL Master ─ binlog replication ─► MySQL Slave1, MySQL Slave2
🛠️ 2. Environment Preparation
2.1 System Requirements

```bash
# Check the Docker environment
docker --version
docker-compose --version

# Recommended configuration
# CPU: 4+ cores
# Memory: 8 GB+
# Disk: 50 GB+
# Docker version: 20.10+
```
2.2 Project Directory Layout

```bash
mysql-cluster/
├── docker-compose.yml          # Compose orchestration file
├── master/                     # Master configuration
│   ├── conf/my.cnf
│   └── data/                   # Data volume mount
├── slave1/                     # Slave 1
│   ├── conf/my.cnf
│   └── data/
├── slave2/                     # Slave 2
│   ├── conf/my.cnf
│   └── data/
├── proxysql/                   # Proxy configuration
│   └── conf/proxysql.cnf
├── scripts/                    # Initialization scripts
│   ├── init-replication.sql
│   └── check_replication.sh
├── monitor/                    # Monitoring scripts
│   └── monitor.sh
└── backup/                     # Backup scripts
    └── backup.sh
```
2.3 Create a Dedicated Network

```bash
# Create a Docker network (so containers can reach each other by service name)
docker network create --subnet=172.20.0.0/16 mysql-cluster-network

# Verify the network was created
docker network ls | grep mysql-cluster
```
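Since the compose file later pins each container to a static IP inside 172.20.0.0/16, it can help to sanity-check the address plan before creating the network. A minimal sketch (the addresses mirror the ones used later in this article):

```python
import ipaddress

# Subnet created with `docker network create --subnet=172.20.0.0/16 ...`
subnet = ipaddress.ip_network("172.20.0.0/16")

# Static addresses assigned in docker-compose.yml
plan = {
    "mysql-master": "172.20.0.2",
    "mysql-slave1": "172.20.0.3",
    "mysql-slave2": "172.20.0.4",
    "proxysql":     "172.20.0.5",
}

# Every address must fall inside the subnet, and none may collide
addrs = [ipaddress.ip_address(a) for a in plan.values()]
assert all(a in subnet for a in addrs), "address outside subnet"
assert len(set(addrs)) == len(addrs), "duplicate address"
print("address plan OK:", ", ".join(f"{k}={v}" for k, v in plan.items()))
```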
🚀 3. Master-Slave Configuration in Practice
3.1 The Docker Compose File
Create docker-compose.yml:

```yaml
version: '3.8'

services:
  # Master database
  mysql-master:
    image: mysql:8.0.33
    container_name: mysql-master
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: MasterRoot123!
      MYSQL_DATABASE: app_db
      MYSQL_USER: app_user
      MYSQL_PASSWORD: AppUser123!
      TZ: Asia/Shanghai
    ports:
      - "3306:3306"
    volumes:
      - ./master/data:/var/lib/mysql
      - ./master/conf:/etc/mysql/conf.d
      # Files here run only on the very first initialization of an empty data dir
      - ./scripts:/docker-entrypoint-initdb.d
      - /etc/localtime:/etc/localtime:ro
    networks:
      mysql-cluster:
        ipv4_address: 172.20.0.2
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-uroot", "-pMasterRoot123!"]
      interval: 10s
      timeout: 5s
      retries: 3
    command:
      - --server-id=1
      - --log-bin=mysql-bin
      - --binlog-format=ROW
      - --character-set-server=utf8mb4
      - --collation-server=utf8mb4_unicode_ci
      - --default-authentication-plugin=mysql_native_password
      - --max-connections=1000

  # Slave database 1
  mysql-slave1:
    image: mysql:8.0.33
    container_name: mysql-slave1
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: SlaveRoot123!
      TZ: Asia/Shanghai
    ports:
      - "3307:3306"
    volumes:
      - ./slave1/data:/var/lib/mysql
      - ./slave1/conf:/etc/mysql/conf.d
      - /etc/localtime:/etc/localtime:ro
    networks:
      mysql-cluster:
        ipv4_address: 172.20.0.3
    depends_on:
      mysql-master:
        condition: service_healthy
    command:
      - --server-id=2
      - --relay-log=mysql-relay-bin
      - --read-only=1
      - --log-slave-updates=1

  # Slave database 2
  mysql-slave2:
    image: mysql:8.0.33
    container_name: mysql-slave2
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: SlaveRoot123!
      TZ: Asia/Shanghai
    ports:
      - "3308:3306"
    volumes:
      - ./slave2/data:/var/lib/mysql
      - ./slave2/conf:/etc/mysql/conf.d
      - /etc/localtime:/etc/localtime:ro
    networks:
      mysql-cluster:
        ipv4_address: 172.20.0.4
    depends_on:
      mysql-master:
        condition: service_healthy
    command:
      - --server-id=3
      - --relay-log=mysql-relay-bin
      - --read-only=1
      - --log-slave-updates=1

  # Read-write splitting proxy
  proxysql:
    image: proxysql/proxysql:2.5.4
    container_name: proxysql
    restart: always
    ports:
      - "6032:6032"   # admin port
      - "6033:6033"   # application port
    volumes:
      - ./proxysql/data:/var/lib/proxysql
      - ./proxysql/conf/proxysql.cnf:/etc/proxysql.cnf
    networks:
      mysql-cluster:
        ipv4_address: 172.20.0.5
    depends_on:
      - mysql-master
      - mysql-slave1
      - mysql-slave2

networks:
  mysql-cluster:
    external: true
    name: mysql-cluster-network
```
3.2 MySQL Configuration Files
master/conf/my.cnf:

```ini
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog_format = ROW
binlog_expire_logs_seconds = 604800   # 7 days (expire_logs_days is deprecated in 8.0)
max_binlog_size = 100M
sync_binlog = 1
binlog_cache_size = 1M
max_binlog_cache_size = 2M
binlog_do_db = app_db

# Performance tuning
innodb_buffer_pool_size = 512M
innodb_log_file_size = 128M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
max_connections = 1000
thread_cache_size = 100
table_open_cache = 2000
# Note: the query cache was removed in MySQL 8.0 -- setting query_cache_*
# options here would prevent the server from starting.

# Character set
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash
default-character-set = utf8mb4

[client]
default-character-set = utf8mb4
```

slave/conf/my.cnf:

```ini
[mysqld]
server-id = 2          # Note: each slave needs a distinct ID (slave2 uses 3)
relay-log = mysql-relay-bin
read_only = 1
super_read_only = 1
log_slave_updates = 1
relay_log_recovery = 1
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK

# Performance tuning (similar to the master; adjust as needed)
innodb_buffer_pool_size = 512M
max_connections = 1000
thread_cache_size = 100
table_open_cache = 2000

# Character set
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
```
3.3 Starting the Cluster

```bash
# 1. Create the directories
mkdir -p {master,slave1,slave2}/conf {master,slave1,slave2}/data
mkdir -p proxysql/{conf,data} scripts monitor backup

# 2. Put the configuration files in the matching directories

# 3. Start the services
docker-compose up -d

# 4. Check the startup status
docker-compose ps

# Expected output:
# NAME          COMMAND                  SERVICE        STATUS    PORTS
# mysql-master  "docker-entrypoint.s…"   mysql-master   running   0.0.0.0:3306->3306/tcp
# mysql-slave1  "docker-entrypoint.s…"   mysql-slave1   running   0.0.0.0:3307->3306/tcp
# mysql-slave2  "docker-entrypoint.s…"   mysql-slave2   running   0.0.0.0:3308->3306/tcp
# proxysql      "/bin/sh -c '/usr/bi…"   proxysql       running   0.0.0.0:6032-6033->6032-6033/tcp

# 5. Tail the logs
docker-compose logs -f mysql-master
```
3.4 Configuring Replication
Step 1: Create the replication account on the master

```sql
-- Run inside the master container:
--   docker exec -it mysql-master mysql -uroot -pMasterRoot123!

-- Create the replication user
CREATE USER 'replica'@'%' IDENTIFIED WITH mysql_native_password BY 'Replica123!';
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'replica'@'%';
FLUSH PRIVILEGES;

-- Check the master status; note down File and Position
SHOW MASTER STATUS\G
```

Sample output:
File: mysql-bin.000003
Position: 154
Step 2: Point the slaves at the master (use the File and Position values from your own SHOW MASTER STATUS output; the ones below come from the sample)

```bash
# Configure slave 1
docker exec mysql-slave1 mysql -uroot -pSlaveRoot123! -e "
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='mysql-master',
  MASTER_USER='replica',
  MASTER_PASSWORD='Replica123!',
  MASTER_LOG_FILE='mysql-bin.000003',
  MASTER_LOG_POS=154,
  MASTER_CONNECT_RETRY=10,
  MASTER_RETRY_COUNT=10;
START SLAVE;
"

# Configure slave 2 (only the container name changes; each server-id
# is already set per container in docker-compose.yml)
docker exec mysql-slave2 mysql -uroot -pSlaveRoot123! -e "
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='mysql-master',
  MASTER_USER='replica',
  MASTER_PASSWORD='Replica123!',
  MASTER_LOG_FILE='mysql-bin.000003',
  MASTER_LOG_POS=154,
  MASTER_CONNECT_RETRY=10,
  MASTER_RETRY_COUNT=10;
START SLAVE;
"
```
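Hard-coding MASTER_LOG_FILE and MASTER_LOG_POS is error-prone: the coordinates change whenever the binlog rotates. One way to avoid typos is to parse the master's `SHOW MASTER STATUS\G` output and generate the statement. A minimal sketch (the parsing assumes the standard `key: value` layout of `\G` output; feed it whatever `docker exec mysql-master mysql ... -e "SHOW MASTER STATUS\G"` prints):

```python
def parse_master_status(text: str) -> dict:
    """Parse the `key: value` lines of SHOW MASTER STATUS\\G output."""
    status = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            status[key.strip()] = value.strip()
    return status


def change_master_sql(status: dict, host="mysql-master",
                      user="replica", password="Replica123!") -> str:
    """Build the CHANGE MASTER TO statement from parsed master status."""
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{host}', MASTER_USER='{user}', "
        f"MASTER_PASSWORD='{password}', "
        f"MASTER_LOG_FILE='{status['File']}', "
        f"MASTER_LOG_POS={int(status['Position'])};"
    )


# Example with the values from the article
sample = """*************************** 1. row ***************************
             File: mysql-bin.000003
         Position: 154
"""
print(change_master_sql(parse_master_status(sample)))
```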
Step 3: Verify replication status

```bash
#!/bin/bash
# scripts/check_replication.sh

echo "========== MySQL replication status check =========="
echo ""

echo "1. Master status:"
docker exec mysql-master mysql -uroot -pMasterRoot123! -e "SHOW MASTER STATUS\G" | grep -E "File|Position"
echo ""

echo "2. Slave 1 replication status:"
docker exec mysql-slave1 mysql -uroot -pSlaveRoot123! -e "SHOW SLAVE STATUS\G" | grep -E "Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master|Last_Error"
echo ""

echo "3. Slave 2 replication status:"
docker exec mysql-slave2 mysql -uroot -pSlaveRoot123! -e "SHOW SLAVE STATUS\G" | grep -E "Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master|Last_Error"
echo ""

echo "4. Data sync test:"
echo "Creating test data on the master..."
docker exec mysql-master mysql -uroot -pMasterRoot123! app_db -e "
CREATE TABLE IF NOT EXISTS test_replication (
  id INT AUTO_INCREMENT PRIMARY KEY,
  data VARCHAR(100),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO test_replication (data) VALUES (CONCAT('test-data-', UUID()));
"
sleep 2

echo "Slave 1 query result:"
docker exec mysql-slave1 mysql -uroot -pSlaveRoot123! app_db -e "SELECT * FROM test_replication ORDER BY id DESC LIMIT 1;"
echo "Slave 2 query result:"
docker exec mysql-slave2 mysql -uroot -pSlaveRoot123! app_db -e "SELECT * FROM test_replication ORDER BY id DESC LIMIT 1;"
echo ""
echo "========== Check complete =========="
```
Run the check script:

```bash
chmod +x scripts/check_replication.sh
./scripts/check_replication.sh
```
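Rather than eyeballing grep output, a small parser can decide "healthy / lagging / broken" from the raw `SHOW SLAVE STATUS\G` text. A sketch, assuming you pipe the `\G` output in unmodified (the 30-second lag threshold is an arbitrary example value):

```python
def replica_health(text: str) -> str:
    """Classify SHOW SLAVE STATUS\\G output as ok / lagging / broken."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            k, _, v = line.partition(":")
            fields[k.strip()] = v.strip()
    io_ok = fields.get("Slave_IO_Running") == "Yes"
    sql_ok = fields.get("Slave_SQL_Running") == "Yes"
    if not (io_ok and sql_ok):
        return "broken: " + fields.get("Last_Error", "(no error text)")
    behind = fields.get("Seconds_Behind_Master", "NULL")
    if behind != "NULL" and int(behind) > 30:
        return f"lagging: {behind}s behind master"
    return "ok"


sample = """    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Seconds_Behind_Master: 0
    Last_Error:
"""
print(replica_health(sample))  # -> ok
```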
🔀 4. Implementing Read-Write Splitting
4.1 ProxySQL Configuration Explained
proxysql/conf/proxysql.cnf:
```ini
# ProxySQL configuration file
datadir="/var/lib/proxysql"

admin_variables=
{
    admin_credentials="admin:admin;radmin:radmin"
    mysql_ifaces="0.0.0.0:6032"
    refresh_interval=2000
    debug=false
}

mysql_variables=
{
    threads=4
    max_connections=2048
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
    default_schema="information_schema"
    stacksize=1048576
    server_version="8.0.33"
    connect_timeout_server=10000
    monitor_username="monitor"
    monitor_password="monitor"
    monitor_history=600000
    monitor_connect_interval=60000
    monitor_ping_interval=10000
    monitor_read_only_interval=1500
    monitor_read_only_timeout=500
    ping_interval_server_msec=120000
    ping_timeout_server=500
    commands_stats=true
    sessions_sort=true
    connect_retries_on_failure=10
}

# Backend MySQL server groups
mysql_servers=
(
    {
        address="172.20.0.2"        # mysql-master
        port=3306
        hostgroup=10                # writer group
        max_connections=300
        max_replication_lag=5
        use_ssl=0
        comment="master"
    },
    {
        address="172.20.0.3"        # mysql-slave1
        port=3306
        hostgroup=20                # reader group
        max_connections=300
        max_replication_lag=5
        use_ssl=0
        comment="slave1"
    },
    {
        address="172.20.0.4"        # mysql-slave2
        port=3306
        hostgroup=20                # reader group
        max_connections=300
        max_replication_lag=5
        use_ssl=0
        comment="slave2"
    }
)

# Backend users
mysql_users=
(
    {
        username="app_user"
        password="AppUser123!"
        default_hostgroup=10        # route to the writer group by default
        active=1
        transaction_persistent=0
        fast_forward=0
        backend=1
        frontend=1
        max_connections=1000
        comment="application user"
    }
)

# Query rules -- routing
mysql_query_rules=
(
    {
        rule_id=100
        active=1
        match_pattern="^SELECT.*FOR UPDATE"
        destination_hostgroup=10
        apply=1
        comment="SELECT ... FOR UPDATE goes to the master"
    },
    {
        rule_id=200
        active=1
        match_pattern="^SELECT"
        destination_hostgroup=20
        apply=1
        comment="plain SELECTs go to the slaves"
    },
    {
        rule_id=300
        active=1
        match_pattern="^INSERT"
        destination_hostgroup=10
        apply=1
        comment="INSERTs go to the master"
    },
    {
        rule_id=400
        active=1
        match_pattern="^UPDATE"
        destination_hostgroup=10
        apply=1
        comment="UPDATEs go to the master"
    },
    {
        rule_id=500
        active=1
        match_pattern="^DELETE"
        destination_hostgroup=10
        apply=1
        comment="DELETEs go to the master"
    },
    {
        rule_id=600
        active=1
        match_pattern="^CALL"
        destination_hostgroup=10
        apply=1
        comment="stored procedure calls go to the master"
    },
    {
        rule_id=700
        active=1
        match_pattern="."
        destination_hostgroup=10
        apply=1
        comment="everything else defaults to the master"
    }
)

# Replication hostgroup mapping
mysql_replication_hostgroups=
(
    {
        writer_hostgroup=10
        reader_hostgroup=20
        comment="MySQL master-slave replication group"
        check_type=read_only
    }
)
```

Note: the monitor_username/monitor_password account must actually exist on the MySQL backends, or ProxySQL's health checks will fail. Create it once on the master (it replicates to the slaves): `CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitor'; GRANT REPLICATION CLIENT ON *.* TO 'monitor'@'%';`
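The rule chain above is evaluated in rule_id order, and the first matching rule with apply=1 wins. To build intuition for where a given statement will land, here is a small Python emulation of the same regex chain (the hostgroup numbers 10/20 mirror the config above; this is an illustration of the matching order, not how ProxySQL itself is implemented):

```python
import re

WRITER, READER = 10, 20

# (rule_id, pattern, destination) -- same order and patterns as proxysql.cnf
RULES = [
    (100, r"^SELECT.*FOR UPDATE", WRITER),
    (200, r"^SELECT",             READER),
    (300, r"^INSERT",             WRITER),
    (400, r"^UPDATE",             WRITER),
    (500, r"^DELETE",             WRITER),
    (600, r"^CALL",               WRITER),
    (700, r".",                   WRITER),  # catch-all
]

def route(sql: str) -> int:
    """Return the hostgroup the first matching rule sends this query to."""
    for _rule_id, pattern, dest in RULES:
        # ProxySQL compiles match_pattern case-insensitively by default
        if re.search(pattern, sql, re.IGNORECASE):
            return dest
    return WRITER  # default_hostgroup of app_user

for q in ["SELECT * FROM t", "SELECT * FROM t FOR UPDATE",
          "INSERT INTO t VALUES (1)", "SHOW TABLES"]:
    print(f"{q!r} -> hostgroup {route(q)}")
```

Note how `SELECT ... FOR UPDATE` must come before the plain `^SELECT` rule: with the order reversed, locking reads would be sent to a slave, where the lock would be useless.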
4.2 Initializing ProxySQL

```bash
# 1. Restart ProxySQL to load the configuration
docker-compose restart proxysql

# 2. Enter the ProxySQL admin interface
docker exec -it proxysql mysql -h127.0.0.1 -P6032 -uadmin -padmin
```

Then, inside the admin interface:

```sql
-- Check server status
SELECT * FROM mysql_servers;

-- Check user configuration
SELECT * FROM mysql_users;

-- Load configuration to runtime
LOAD MYSQL SERVERS TO RUNTIME;
LOAD MYSQL USERS TO RUNTIME;
LOAD MYSQL QUERY RULES TO RUNTIME;
LOAD MYSQL VARIABLES TO RUNTIME;

-- Persist configuration to disk
SAVE MYSQL SERVERS TO DISK;
SAVE MYSQL USERS TO DISK;
SAVE MYSQL QUERY RULES TO DISK;
SAVE MYSQL VARIABLES TO DISK;

-- Check statistics
SELECT * FROM stats_mysql_connection_pool;
```
4.3 Application Connection Examples
Java Spring Boot configuration:

```yaml
# application.yml
spring:
  datasource:
    url: jdbc:mysql://localhost:6033/app_db?useUnicode=true&characterEncoding=utf8&useSSL=false&serverTimezone=Asia/Shanghai
    username: app_user
    password: AppUser123!
    driver-class-name: com.mysql.cj.jdbc.Driver
    hikari:
      maximum-pool-size: 20
      minimum-idle: 10
      connection-timeout: 30000
      idle-timeout: 600000
      max-lifetime: 1800000
```
Python Django configuration:

```python
# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'app_db',
        'USER': 'app_user',
        'PASSWORD': 'AppUser123!',
        'HOST': 'localhost',   # ProxySQL address
        'PORT': '6033',        # ProxySQL port
        'OPTIONS': {
            'charset': 'utf8mb4',
            'connect_timeout': 30,
        },
        'CONN_MAX_AGE': 300,   # persistent connection lifetime (seconds)
    }
}
```
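With ProxySQL in front, Django only needs the one connection above. If you ever bypass the proxy and point Django directly at the master and slaves instead, the same read/write split can be done in the application with a Django-style database router. A sketch under stated assumptions: the alias names `default`, `replica1`, `replica2` are hypothetical and must match keys you define in `settings.DATABASES` (the class itself has no Django dependency, which is why it can be shown standalone):

```python
import random


class ReadWriteRouter:
    """Django-style router: writes go to the primary, reads to a replica."""

    # Hypothetical aliases -- must match settings.DATABASES keys
    read_aliases = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        # Spread reads across the replicas
        return random.choice(self.read_aliases)

    def db_for_write(self, model, **hints):
        # All writes go to the primary
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases point at the same replicated data set
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations only against the primary
        return db == "default"
```

It would be registered via `DATABASE_ROUTERS = ["path.to.ReadWriteRouter"]` in settings.py. Note this approach cannot see replication lag, which is one reason the article routes through ProxySQL (whose `max_replication_lag` shuns lagging slaves) instead.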
4.4 Testing Read-Write Splitting

```python
# test_read_write.py
import time

import pymysql


def test_proxysql():
    """Test read-write splitting through ProxySQL."""
    config = {
        'host': 'localhost',
        'port': 6033,
        'user': 'app_user',
        'password': 'AppUser123!',
        'database': 'app_db',
        'charset': 'utf8mb4',
    }

    # Test writes
    print("=== Testing writes ===")
    conn = pymysql.connect(**config)
    cursor = conn.cursor()

    # Create the test table
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS proxy_test (
            id INT AUTO_INCREMENT PRIMARY KEY,
            operation VARCHAR(50),
            server_id VARCHAR(100),
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)

    # Inserts should be routed to the master
    for i in range(5):
        cursor.execute("""
            INSERT INTO proxy_test (operation, server_id)
            VALUES ('INSERT', CONCAT('Data-', UUID()))
        """)
        conn.commit()
        print(f"insert {i + 1} done")

    cursor.close()
    conn.close()

    # Wait for replication to catch up
    print("\nWaiting for data to sync...")
    time.sleep(2)

    # Test reads
    print("\n=== Testing reads ===")
    for i in range(10):
        conn = pymysql.connect(**config)
        cursor = conn.cursor()

        # SELECTs should be routed to the slaves
        cursor.execute("""
            SELECT
                COUNT(*) AS count,
                @@hostname AS server_hostname,
                @@server_id AS server_id
            FROM proxy_test
        """)
        result = cursor.fetchone()
        print(f"query {i + 1}: rows={result[0]}, server={result[1]}, server_id={result[2]}")

        cursor.close()
        conn.close()
        time.sleep(0.5)


if __name__ == "__main__":
    test_proxysql()
```
📊 5. Monitoring and Maintenance
5.1 Live Monitoring Script

```bash
#!/bin/bash
# monitor/monitor.sh

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Root passwords differ between master and slaves
root_pass() {
    if [ "$1" = "mysql-master" ]; then
        echo 'MasterRoot123!'
    else
        echo 'SlaveRoot123!'
    fi
}

log() {
    echo -e "$(date '+%Y-%m-%d %H:%M:%S') - $1"
}

check_replication() {
    log "${YELLOW}=== Replication status ===${NC}"
    local master_file=$(docker exec mysql-master mysql -uroot -pMasterRoot123! -sN -e "SHOW MASTER STATUS\G" | grep "File:" | awk '{print $2}')
    local master_pos=$(docker exec mysql-master mysql -uroot -pMasterRoot123! -sN -e "SHOW MASTER STATUS\G" | grep "Position:" | awk '{print $2}')
    echo "master: File=${master_file}, Position=${master_pos}"

    for slave in mysql-slave1 mysql-slave2; do
        echo -e "\n${slave}:"
        local io_running=$(docker exec $slave mysql -uroot -pSlaveRoot123! -sN -e "SHOW SLAVE STATUS\G" | grep "Slave_IO_Running:" | awk '{print $2}')
        local sql_running=$(docker exec $slave mysql -uroot -pSlaveRoot123! -sN -e "SHOW SLAVE STATUS\G" | grep "Slave_SQL_Running:" | awk '{print $2}')
        local behind=$(docker exec $slave mysql -uroot -pSlaveRoot123! -sN -e "SHOW SLAVE STATUS\G" | grep "Seconds_Behind_Master:" | awk '{print $2}')
        if [ "$io_running" = "Yes" ] && [ "$sql_running" = "Yes" ]; then
            echo -e "replication: ${GREEN}OK${NC}"
            echo -e "lag: ${behind}s"
        else
            echo -e "replication: ${RED}BROKEN${NC}"
            echo -e "IO thread: $io_running, SQL thread: $sql_running"
        fi
    done
}

check_connections() {
    log "${YELLOW}=== Connection usage ===${NC}"
    echo "server               | current    | max        | usage"
    echo "---------------------|------------|------------|-----------"
    for db in mysql-master mysql-slave1 mysql-slave2; do
        local pass=$(root_pass $db)
        local current=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW STATUS LIKE 'Threads_connected'" | awk '{print $2}')
        local max=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW VARIABLES LIKE 'max_connections'" | awk '{print $2}')
        local usage=$(echo "scale=2; $current * 100 / $max" | bc)
        if (( $(echo "$usage > 80" | bc -l) )); then
            usage_display="${RED}${usage}%${NC}"
        elif (( $(echo "$usage > 50" | bc -l) )); then
            usage_display="${YELLOW}${usage}%${NC}"
        else
            usage_display="${GREEN}${usage}%${NC}"
        fi
        printf "%-20s | %-10s | %-10s | %s\n" "$db" "$current" "$max" "$usage_display"
    done
}

check_performance() {
    log "${YELLOW}=== Performance metrics ===${NC}"
    echo "server               | QPS     | TPS     | BP hit   | BP usage"
    echo "---------------------|---------|---------|----------|-----------"
    for db in mysql-master mysql-slave1 mysql-slave2; do
        local pass=$(root_pass $db)
        # Query statistics
        local queries=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW GLOBAL STATUS LIKE 'Questions'" | awk '{print $2}')
        local uptime=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW GLOBAL STATUS LIKE 'Uptime'" | awk '{print $2}')
        local qps=$(echo "scale=2; $queries / $uptime" | bc)
        # Transaction statistics
        local commits=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW GLOBAL STATUS LIKE 'Com_commit'" | awk '{print $2}')
        local rollbacks=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW GLOBAL STATUS LIKE 'Com_rollback'" | awk '{print $2}')
        local tps=$(echo "scale=2; ($commits + $rollbacks) / $uptime" | bc)
        # InnoDB buffer pool hit rate (the query cache and its Qcache_*
        # counters no longer exist in MySQL 8.0)
        local read_requests=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests'" | awk '{print $2}')
        local disk_reads=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads'" | awk '{print $2}')
        local hit_rate=0
        if [ "$read_requests" -gt 0 ]; then
            hit_rate=$(echo "scale=2; ($read_requests - $disk_reads) * 100 / $read_requests" | bc)
        fi
        # InnoDB buffer pool usage
        local pool_pages=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total'" | awk '{print $2}')
        local pool_free=$(docker exec $db mysql -uroot -p"$pass" -sN -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_free'" | awk '{print $2}')
        local pool_usage=0
        if [ "$pool_pages" -gt 0 ]; then
            pool_usage=$(echo "scale=2; ($pool_pages - $pool_free) * 100 / $pool_pages" | bc)
        fi
        printf "%-20s | %-7s | %-7s | %-8s | %s%%\n" "$db" "$qps" "$tps" "${hit_rate}%" "$pool_usage"
    done
}

check_proxysql() {
    log "${YELLOW}=== ProxySQL status ===${NC}"
    # Admin interface
    if docker exec proxysql mysql -h127.0.0.1 -P6032 -uadmin -padmin -e "SELECT 1" > /dev/null 2>&1; then
        echo -e "ProxySQL admin interface: ${GREEN}OK${NC}"
    else
        echo -e "ProxySQL admin interface: ${RED}DOWN${NC}"
    fi
    # Application interface
    if mysql -h127.0.0.1 -P6033 -uapp_user -pAppUser123! -e "SELECT 1" > /dev/null 2>&1; then
        echo -e "ProxySQL application interface: ${GREEN}OK${NC}"
    else
        echo -e "ProxySQL application interface: ${RED}DOWN${NC}"
    fi
    # Connection pool status
    echo -e "\nconnection pool:"
    docker exec proxysql mysql -h127.0.0.1 -P6032 -uadmin -padmin -e "
    SELECT hostgroup, srv_host, srv_port, status,
           ConnUsed, ConnFree, ConnOK, ConnERR, Queries
    FROM stats_mysql_connection_pool
    ORDER BY hostgroup, srv_host;
    "
}

main() {
    echo -e "${GREEN}====== MySQL cluster monitoring report ======${NC}\n"
    check_replication
    echo ""
    check_connections
    echo ""
    check_performance
    echo ""
    check_proxysql
    echo -e "\n${GREEN}====== Monitoring complete ======${NC}"
}

main
```
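The QPS/TPS math in the script is just counter deltas over uptime, and on MySQL 8.0 the old Qcache_* counters no longer exist, so the InnoDB buffer pool hit rate is the useful cache metric. Pulled out as small functions for clarity (the dict keys are the real SHOW GLOBAL STATUS variable names; the numbers below are made-up sample values):

```python
def qps_tps(status: dict) -> tuple:
    """Compute queries/sec and transactions/sec from SHOW GLOBAL STATUS counters."""
    uptime = status["Uptime"]
    qps = status["Questions"] / uptime
    tps = (status["Com_commit"] + status["Com_rollback"]) / uptime
    return round(qps, 2), round(tps, 2)


def buffer_pool_hit_rate(status: dict) -> float:
    """InnoDB buffer pool hit rate: reads served from memory vs. from disk."""
    requests = status["Innodb_buffer_pool_read_requests"]
    disk = status["Innodb_buffer_pool_reads"]
    return round((requests - disk) * 100 / requests, 2) if requests else 0.0


sample = {
    "Uptime": 3600,
    "Questions": 720000,
    "Com_commit": 35000,
    "Com_rollback": 1000,
    "Innodb_buffer_pool_read_requests": 1_000_000,
    "Innodb_buffer_pool_reads": 5_000,
}
print(qps_tps(sample))               # (200.0, 10.0)
print(buffer_pool_hit_rate(sample))  # 99.5
```

Since these are cumulative counters since server start, an average over total uptime smooths out spikes; a real monitor would sample twice and divide the deltas by the interval instead.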
5.2 Automated Backup Strategy

```bash
#!/bin/bash
# backup/backup.sh

set -e

# Configuration
BACKUP_ROOT="/backup/mysql-cluster"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="$BACKUP_ROOT/$DATE"
RETENTION_DAYS=7
LOG_FILE="$BACKUP_ROOT/backup.log"

# Create directories
mkdir -p "$BACKUP_DIR"
mkdir -p "$BACKUP_ROOT/logs"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

backup_master() {
    log "Backing up the master..."
    # Record the binlog coordinates for reference
    docker exec mysql-master mysql -uroot -pMasterRoot123! -e "SHOW MASTER STATUS;" > "$BACKUP_DIR/master_status.txt"
    # --single-transaction gives a consistent InnoDB snapshot and
    # --master-data=2 embeds the binlog coordinates in the dump, so no
    # global read lock is needed. (Taking FLUSH TABLES WITH READ LOCK in a
    # separate docker exec session would be useless anyway: the lock is
    # released the moment that session exits.)
    docker exec mysql-master mysqldump -uroot -pMasterRoot123! \
        --single-transaction \
        --routines \
        --triggers \
        --events \
        --all-databases \
        --master-data=2 \
        --flush-logs \
        | gzip > "$BACKUP_DIR/master_all.sql.gz"
    # Verify the backup file
    if [ -s "$BACKUP_DIR/master_all.sql.gz" ]; then
        log "Master backup succeeded: $BACKUP_DIR/master_all.sql.gz"
    else
        log "ERROR: master backup file is empty!"
        exit 1
    fi
}

backup_slave() {
    local slave_name=$1
    local container=$2
    log "Backing up $slave_name..."
    # No table lock needed on a slave: it runs read_only, and
    # --single-transaction provides a consistent snapshot
    docker exec $container mysqldump -uroot -pSlaveRoot123! \
        --single-transaction \
        --routines \
        --triggers \
        --events \
        --all-databases \
        | gzip > "$BACKUP_DIR/${slave_name}_all.sql.gz"
    # Record the slave's replication status
    docker exec $container mysql -uroot -pSlaveRoot123! -e "SHOW SLAVE STATUS\G" > "$BACKUP_DIR/${slave_name}_status.txt"
    if [ -s "$BACKUP_DIR/${slave_name}_all.sql.gz" ]; then
        log "$slave_name backup succeeded"
    else
        log "ERROR: $slave_name backup file is empty!"
    fi
}

backup_binlogs() {
    log "Backing up binary logs..."
    # Archive the master's binlogs
    docker exec mysql-master sh -c '
        cd /var/lib/mysql
        tar -czf /tmp/binlogs.tar.gz mysql-bin.* 2>/dev/null || true
    '
    docker cp mysql-master:/tmp/binlogs.tar.gz "$BACKUP_DIR/binlogs.tar.gz"
    log "Binary log backup complete"
}

cleanup_old_backups() {
    log "Removing backups older than $RETENTION_DAYS days..."
    find "$BACKUP_ROOT" -maxdepth 1 -type d -name "202*" -mtime +$RETENTION_DAYS -exec rm -rf {} \;
    # Remove old logs
    find "$BACKUP_ROOT/logs" -name "*.log" -mtime +30 -delete
    log "Cleanup complete"
}

calculate_size() {
    du -sh "$BACKUP_DIR" | cut -f1
}

main() {
    log "====== Backup job started ======"
    # Check disk space
    local available=$(df -h "$BACKUP_ROOT" | awk 'NR==2 {print $4}')
    log "Available disk space: $available"
    # Run the backups
    backup_master
    backup_slave "slave1" "mysql-slave1"
    backup_slave "slave2" "mysql-slave2"
    backup_binlogs
    # Report the backup size
    local backup_size=$(calculate_size)
    log "Backup size: $backup_size"
    # Prune old backups
    cleanup_old_backups
    # Write a backup report
    {
        echo "Backup time: $(date)"
        echo "Backup size: $backup_size"
        echo "Backup directory: $BACKUP_DIR"
        echo "Files:"
        ls -lh "$BACKUP_DIR"
    } > "$BACKUP_DIR/backup_report.txt"
    log "====== Backup job finished ======"
    # Optional notification
    # send_notification "MySQL backup finished" "size: $backup_size"
}

# Error handling
trap 'log "Backup interrupted!"; exit 1' INT TERM

main
```
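The retention step above relies on `find -mtime`, which keys off directory mtimes rather than the timestamp encoded in each directory name, so touching an old directory would rescue it from deletion. An alternative sketch that decides purely from the `YYYYmmdd_HHMMSS` name (matching the `$DATE` format the script uses):

```python
from datetime import datetime, timedelta


def dirs_to_delete(dir_names, now, retention_days=7):
    """Return backup dirs (named YYYYmmdd_HHMMSS) older than retention_days."""
    cutoff = now - timedelta(days=retention_days)
    stale = []
    for name in dir_names:
        try:
            stamp = datetime.strptime(name, "%Y%m%d_%H%M%S")
        except ValueError:
            continue  # skip non-backup entries such as 'logs' or 'backup.log'
        if stamp < cutoff:
            stale.append(name)
    return stale


now = datetime(2024, 6, 15, 2, 0, 0)
dirs = ["20240601_020000", "20240610_020000", "20240614_020000", "logs"]
print(dirs_to_delete(dirs, now))  # ['20240601_020000']
```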
5.3 Scheduling with cron

```bash
# Edit the crontab
crontab -e

# Add the following entries:
# Daily backup at 02:00
0 2 * * * /bin/bash /path/to/mysql-cluster/backup/backup.sh >> /path/to/mysql-cluster/backup/logs/cron.log 2>&1
# Monitoring every 5 minutes
*/5 * * * * /bin/bash /path/to/mysql-cluster/monitor/monitor.sh >> /path/to/mysql-cluster/monitor/logs/monitor.log 2>&1
# Log cleanup every Monday at 03:00
0 3 * * 1 find /path/to/mysql-cluster -name "*.log" -mtime +30 -delete
```
🔧 6. Troubleshooting and Optimization
6.1 Common Problems and Solutions
Problem 1: Replication lag

```sql
-- Diagnose the lag
SHOW SLAVE STATUS\G
-- Watch: Seconds_Behind_Master, Read_Master_Log_Pos, Exec_Master_Log_Pos

-- Fix 1: more parallel apply threads on the slave
SET GLOBAL slave_parallel_workers = 8;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';

-- Fix 2: relax durability on the master (trades crash safety for
-- throughput; only if losing the most recent transactions is acceptable)
SET GLOBAL sync_binlog = 1000;
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```

```bash
# Fix 3: check the network between the containers
docker exec mysql-slave1 ping mysql-master -c 10
```
Problem 2: Replication broken

```bash
# 1. Inspect the error
docker exec mysql-slave1 mysql -uroot -pSlaveRoot123! -e "SHOW SLAVE STATUS\G" | grep Last_Error
```

```sql
-- 2. Common errors
-- Error 1062: duplicate key (skip one event -- understand the cause first,
-- since skipping events can leave the slave inconsistent)
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

-- Error 1236: bad binlog coordinates
-- On the master, re-read the current position:
SHOW MASTER STATUS;
-- Then on the slave, reconfigure with the real file/position:
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.00000X', MASTER_LOG_POS=XXX;
START SLAVE;

-- 3. Full re-sync (last resort)
-- On the master (shell):
--   mysqldump --single-transaction --master-data=2 --all-databases > full_backup.sql
-- On the slave:
STOP SLAVE;
RESET SLAVE ALL;
SOURCE full_backup.sql;
-- then reconfigure replication as in step 3.4
```
Problem 3: ProxySQL connection failures

```bash
# 1. Check the ProxySQL connection pool
docker exec proxysql mysql -h127.0.0.1 -P6032 -uadmin -padmin -e "
SELECT * FROM stats_mysql_connection_pool;
"
```

```sql
-- 2. Reload the configuration (in the admin interface)
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```

```bash
# 3. Restart ProxySQL
docker-compose restart proxysql
```
6.2 Performance Tuning Suggestions
MySQL tuning:

```sql
-- 1. Tune InnoDB
-- (The query cache was removed in MySQL 8.0, so the old query_cache_*
-- knobs no longer exist.)
-- innodb_buffer_pool_size is dynamic; target ~70-80% of RAM on a dedicated
-- host. Suffixes like '2G' are not accepted by SET GLOBAL, so give bytes:
SET GLOBAL innodb_buffer_pool_size = 2 * 1024 * 1024 * 1024;
-- innodb_log_file_size is NOT dynamic: set it in my.cnf and restart.
SET GLOBAL innodb_flush_log_at_trx_commit = 1; -- use 1 when full ACID durability is required

-- 2. Connection tuning
SET GLOBAL max_connections = 1000;
SET GLOBAL thread_cache_size = 100;
SET GLOBAL wait_timeout = 600;
SET GLOBAL interactive_timeout = 600;

-- 3. Enable the slow query log
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 2;
SET GLOBAL log_queries_not_using_indexes = ON;

-- 4. Periodically optimize tables
OPTIMIZE TABLE large_table;
ANALYZE TABLE important_table;
```
ProxySQL tuning:

```sql
-- 1. Adjust worker threads
-- (note: mysql-threads only takes effect after a ProxySQL restart)
UPDATE global_variables SET variable_value='8' WHERE variable_name='mysql-threads';

-- 2. Tune the connection pool
UPDATE mysql_servers SET max_connections=500, max_replication_lag=10;

-- 3. Enable query caching on the SELECT rule
UPDATE mysql_query_rules SET cache_ttl=30000 WHERE rule_id=200;

-- 4. Monitoring thresholds
UPDATE global_variables SET variable_value='5000' WHERE variable_name='mysql-monitor_slave_lag_warning';
UPDATE global_variables SET variable_value='10000' WHERE variable_name='mysql-monitor_slave_lag_timeout';

-- Apply each changed table to runtime and persist it
LOAD MYSQL VARIABLES TO RUNTIME;
LOAD MYSQL SERVERS TO RUNTIME;
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
SAVE MYSQL SERVERS TO DISK;
SAVE MYSQL QUERY RULES TO DISK;
```
6.3 Master Failover

```bash
#!/bin/bash
# scripts/failover.sh
# Master failover script

MASTER_CONTAINER="mysql-master"
SLAVE1_CONTAINER="mysql-slave1"
SLAVE2_CONTAINER="mysql-slave2"
PROXYSQL_CONTAINER="proxysql"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1"
}

check_master_health() {
    if docker exec $MASTER_CONTAINER mysqladmin ping -uroot -pMasterRoot123! > /dev/null 2>&1; then
        return 0
    else
        return 1
    fi
}

promote_slave_to_master() {
    local slave=$1
    log "Promoting $slave to master..."
    # Stop replication
    docker exec $slave mysql -uroot -pSlaveRoot123! -e "STOP SLAVE;"
    docker exec $slave mysql -uroot -pSlaveRoot123! -e "RESET SLAVE ALL;"
    # Leave read-only mode
    docker exec $slave mysql -uroot -pSlaveRoot123! -e "SET GLOBAL read_only = OFF;"
    docker exec $slave mysql -uroot -pSlaveRoot123! -e "SET GLOBAL super_read_only = OFF;"
    # Create the replication user (if it does not exist)
    docker exec $slave mysql -uroot -pSlaveRoot123! -e "
    CREATE USER IF NOT EXISTS 'replica'@'%' IDENTIFIED WITH mysql_native_password BY 'Replica123!';
    GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'replica'@'%';
    FLUSH PRIVILEGES;
    "
    # Record the new master's status
    docker exec $slave mysql -uroot -pSlaveRoot123! -e "SHOW MASTER STATUS;" > /tmp/new_master_status.txt
    log "$slave has been promoted to master"
}

repoint_other_slaves() {
    local new_master=$1
    local other_slave=$2
    log "Repointing $other_slave at the new master $new_master..."
    # Read the new master's binlog coordinates
    local master_file=$(docker exec $new_master mysql -uroot -pSlaveRoot123! -sN -e "SHOW MASTER STATUS\G" | grep "File:" | awk '{print $2}')
    local master_pos=$(docker exec $new_master mysql -uroot -pSlaveRoot123! -sN -e "SHOW MASTER STATUS\G" | grep "Position:" | awk '{print $2}')
    # Reconfigure replication
    docker exec $other_slave mysql -uroot -pSlaveRoot123! -e "
    STOP SLAVE;
    CHANGE MASTER TO
      MASTER_HOST='$new_master',
      MASTER_USER='replica',
      MASTER_PASSWORD='Replica123!',
      MASTER_LOG_FILE='$master_file',
      MASTER_LOG_POS=$master_pos;
    START SLAVE;
    "
    log "$other_slave has been reconfigured"
}

update_proxysql_config() {
    local new_master=$1
    log "Updating the ProxySQL configuration..."
    # Find the new master's IP
    local master_ip=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $new_master)
    docker exec $PROXYSQL_CONTAINER mysql -h127.0.0.1 -P6032 -uadmin -padmin -e "
    -- Move the new master into the writer group
    UPDATE mysql_servers SET hostgroup_id = 10 WHERE hostname = '$master_ip';
    -- Move the old master into an offline group (if still reachable)
    UPDATE mysql_servers SET hostgroup_id = 30, status = 'OFFLINE_HARD'
    WHERE hostgroup_id = 10 AND hostname != '$master_ip';
    -- Apply to runtime and persist
    LOAD MYSQL SERVERS TO RUNTIME;
    SAVE MYSQL SERVERS TO DISK;
    "
    log "ProxySQL configuration updated"
}

send_alert() {
    local message=$1
    log "Sending alert: $message"
    # Integrate email / DingTalk / WeCom notifications here
    # curl -X POST -H "Content-Type: application/json" -d "{\"text\":\"$message\"}" $WEBHOOK_URL
}

main() {
    log "Starting master failover check..."
    if check_master_health; then
        log "Master is healthy, no failover needed"
        exit 0
    fi
    log "Master failure detected, starting failover"
    send_alert "MySQL master is down, starting failover"
    # Pick the slave with the least lag as the new master
    log "Selecting the new master..."
    # Simplified health check; in production also compare replication progress
    if docker exec $SLAVE1_CONTAINER mysqladmin ping -uroot -pSlaveRoot123! > /dev/null 2>&1; then
        NEW_MASTER=$SLAVE1_CONTAINER
        OTHER_SLAVE=$SLAVE2_CONTAINER
    elif docker exec $SLAVE2_CONTAINER mysqladmin ping -uroot -pSlaveRoot123! > /dev/null 2>&1; then
        NEW_MASTER=$SLAVE2_CONTAINER
        OTHER_SLAVE=$SLAVE1_CONTAINER
    else
        log "ERROR: no slave is reachable!"
        send_alert "All MySQL slaves are unreachable -- manual intervention required!"
        exit 1
    fi
    log "Selected $NEW_MASTER as the new master"
    # Run the failover
    promote_slave_to_master $NEW_MASTER
    repoint_other_slaves $NEW_MASTER $OTHER_SLAVE
    update_proxysql_config $NEW_MASTER
    log "Failover complete"
    send_alert "MySQL failover complete: $NEW_MASTER is now the master"
    # Write a failover report
    {
        echo "Failover report"
        echo "Time: $(date)"
        echo "Old master: $MASTER_CONTAINER"
        echo "New master: $NEW_MASTER"
        echo "Remaining slave: $OTHER_SLAVE"
    } > /tmp/failover_report.txt
}

main
```
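The script promotes the first slave that answers a ping, but in a real failover you want the healthy slave with the most replication progress (the furthest Exec_Master_Log_Pos, smallest lag), or you lose transactions the other slave had already applied. A sketch of that selection step (the dict fields mirror SHOW SLAVE STATUS columns; the input data is illustrative):

```python
def pick_new_master(replicas: dict) -> str:
    """Pick the healthy slave with the most replication progress.

    `replicas` maps container name -> SHOW SLAVE STATUS fields. Candidates
    must have a running SQL thread; they are ranked by furthest executed
    binlog position, then by smallest lag.
    """
    candidates = [
        (name, s) for name, s in replicas.items()
        if s.get("Slave_SQL_Running") == "Yes"
    ]
    if not candidates:
        raise RuntimeError("no healthy slave to promote")
    candidates.sort(
        key=lambda item: (-item[1]["Exec_Master_Log_Pos"],
                          item[1].get("Seconds_Behind_Master", 0))
    )
    return candidates[0][0]


replicas = {
    "mysql-slave1": {"Slave_SQL_Running": "Yes",
                     "Exec_Master_Log_Pos": 154, "Seconds_Behind_Master": 12},
    "mysql-slave2": {"Slave_SQL_Running": "Yes",
                     "Exec_Master_Log_Pos": 500, "Seconds_Behind_Master": 0},
}
print(pick_new_master(replicas))  # -> mysql-slave2
```

Note this position comparison is only meaningful while all slaves read the same binlog file; GTID-based replication (mentioned in section 7.3) makes "who is furthest ahead" much easier to answer reliably.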
📈 7. Summary and Best Practices
7.1 Architecture Advantages
✅ High performance: read-write splitting spreads query load across multiple slaves
✅ High availability: manual or automated failover when the master goes down
✅ Easy scaling: add slaves at any time to absorb growth
✅ Maintainable: containerized deployment, straightforward configuration management
✅ Data safety: multiple copies of the data, flexible backup and restore
7.2 Production Recommendations
Capacity planning:

```yaml
# Suggested sizing
development:
  master: 2C4G
  slave: 1C2G × 2
testing:
  master: 4C8G
  slave: 2C4G × 2
production:
  master: 8C16G
  slave: 4C8G × 3
  proxy: 2C4G
```
Monitoring targets:
1. Replication lag: < 30 s
2. Connection usage: < 80%
3. CPU usage: < 70%
4. Memory usage: < 80%
5. Disk usage: < 85%
6. QPS/TPS: against your own business baseline
Security recommendations:
- Change the default ports
- Enforce a strong password policy
- Restrict access by IP
- Enable SSL connections
- Audit logs regularly
- Apply patches promptly
7.3 Where to Go Next
Advanced features:

- Semi-synchronous replication for stronger consistency:

  ```sql
  INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
  SET GLOBAL rpl_semi_sync_master_enabled = 1;
  ```

- GTID replication to simplify failover:

  ```sql
  -- note: gtid_mode can only be advanced one step at a time
  -- (OFF -> OFF_PERMISSIVE -> ON_PERMISSIVE -> ON)
  SET GLOBAL enforce_gtid_consistency = ON;
  SET GLOBAL gtid_mode = ON;
  ```

- Group Replication for multi-primary architectures
- Sharding (splitting databases and tables) for very large data sets
- Multi-site active-active deployment for disaster recovery
Tooling:
- Monitoring: Prometheus + Grafana + Percona Monitoring and Management
- Administration: Percona Toolkit, pt-query-digest
- Deployment: Ansible, Kubernetes Operators
- Backups: XtraBackup, mydumper
🎓 Learning Resources
Recommended books:
- High Performance MySQL (4th edition)
- MySQL Internals: The InnoDB Storage Engine (《MySQL技术内幕:InnoDB存储引擎》)
- Docker: Advanced Topics and Practice (《Docker进阶与实践》)
📝 Closing Words
After working through this article, you have the core skills to build a MySQL master-slave replication cluster with Docker. This architecture is widely used in production and effectively addresses database performance and availability problems.