2024-06-13-debian12安装Mariadb-Galera-Cluster+Nginx+Keepalived高可用多主集群

title: debian12安装Mariadb Galera Cluster+Nginx+Keepalived高可用多主集群

author: Loong
date: 2024-06-13

文章目录

- [title: debian12安装Mariadb Galera Cluster+Nginx+Keepalived高可用多主集群 author: Loong date: 2024-06-13](#title: debian12安装Mariadb Galera Cluster+Nginx+Keepalived高可用多主集群 author: Loong date: 2024-06-13)
- 准备工作
[一、Mariadb Galera Cluster](#一、Mariadb Galera Cluster)
二、Nginx
三、Keepalived

准备工作

主机	IP	说明
galera-node1	192.168.100.111	MariaDB Galera集群节点1
galera-node2	192.168.100.112	MariaDB Galera集群节点2
galera-node3	192.168.100.113	MariaDB Galera集群节点3
galera-nginx1	192.168.100.114	Nginx+keepalived 主节点 VIP 192.168.100.110
galera-nginx2	192.168.100.115	Nginx+keepalived 备份节点

需要5台虚拟机和6个ip

系统安装、主机名配置、网络配置等略

一、Mariadb Galera Cluster

参考：

mariadb官网安装galera集群：Getting Started with MariaDB Galera Cluster - MariaDB Knowledge Base
Debian 12 Bookworm ： MariaDB ：安装： Server World (server-world.info)
Debian 12 Bookworm ： MariaDB ： Galera 集群：服务器世界 (server-world.info)
Installing MariaDB Galera Cluster with APT - MariaDB Knowledge Base

（不推荐）centos7使用yum安装请参考
MariaDB Galera Cluster部署实战-腾讯云开发者社区-腾讯云 (tencent.com)
官网：使用 YUM 安装 MariaDB Galera 集群

MariaDB安装

安装软件包

DVD镜像里有这三个包，不配置apt source.list的话，装入dvd直接安装也好。我的debian12的带的mariadb版本是 MariaDB 10.11.4
apt install mariadb-server mariadb-client galera-4 -y

实际上apt install mariadb-server -y会安装好这三个

(可选)修改一些配置：

复制代码

更改最大连接数
sed -i  "/#max_connections/a\max_connections=10000" /etc/mysql/mariadb.conf.d/50-server.cnf

#默认数据目录/var/lib/mysql 修改为/data/mysql_data 
#sed -i  "s/#datadir=.*/datadir=\/data\/mysql_data/g" /etc/mysql/mariadb.conf.d/50-server.cnf

可以通过配置/usr/lib/systemd/system/mariadb.service来调大打开文件数目
[Service]新添加两行如下参数：
#LimitNOFILE=10000 这个配置文件里默认有 LimitNOFILE=32768
LimitNPROC=10000

启动mariadb

bash 复制代码

systemctl restart mariadb

初始化配置

运行mysql_secure_installation进行数据库初始化配置

bash 复制代码

mysql_secure_installation
Enter current password for root (enter for none): //初次运行时密码为空，直接回车
Switch to unix_socket authentication [Y/n] , so you can safely answer 'n'.//n
Change the root password? [Y/n]//设置数据库的root用户密码 y
New password:               //设置root用户的密码
Re-enter new password:      //再次输入你设置的密码
Remove anonymous users? [Y/n]   //是否删除匿名用户,生产环境建议删除，所以直接回车
Disallow root login remotely? [Y/n] //是否禁止root远程登录,根据自己的需求选择Y/n并回车,建议禁止 直接回车
Remove test database and access to it? [Y/n] //是否删除test数据库,直接回车
Reload privilege tables now? [Y/n] //是否重新加载权限表，直接回车

连接测试

bash 复制代码

mysql -uroot -p

执行一些查询命令

sql 复制代码

show databases; /*列出数据库*/
show grants for root@localhost;/*查看用户在localhost登录的权限*/
select user,host,password from mysql.user; /*列出用户以及允许登录的主机*/
\q  /*退出*/

galera配置

node1节点：

复制代码

vi /etc/mysql/mariadb.conf.d/50-server.cnf
#第27行注释掉
#bind-address           = 127.0.0.1
#末尾加上
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
#集群中所有节点
wsrep_cluster_address="gcomm://192.168.100.111,192.168.100.112,192.168.100.113"
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
# 集群名称
wsrep_cluster_name="MariaDB_Cluster"
# 本节点的ip
wsrep_node_address="192.168.100.111"

先停止数据库 systemctl stop mariadb

然后特别的方式启动mariadb来引导新集群第一个节点 （此处不一定是node1节点，都可以是第一个节点。如果重启集群时，引导节点也要执行这条命令）

在使用 systemd 的操作系统上，可以通过以下方式引导节点：

bash 复制代码

galera_new_cluster

无systemd的系统上mariadbd --wsrep-new-cluster 来引导新集群的第一个节点。(连接到现有群集时，请勿使用 --wsrep-new-cluster 选项)

node2/node3节点:

复制代码

vi /etc/mysql/mariadb.conf.d/50-server.cnf
#第27行注释掉
#bind-address           = 127.0.0.1
#末尾加上
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.100.111,192.168.100.112,192.168.100.113"
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
wsrep_cluster_name="MariaDB_Cluster"
# 本节点的ip
wsrep_node_address="192.168.100.112" #node3上wsrep_node_address="192.168.100.113"

正常方式重启mariadb即可：

复制代码

systemctl restart mariadb

查看集群状态

在任意虚拟机里执行以下命令

复制代码

mysql -e "show status like 'wsrep_%'" #查看集群全部相关状态参数
mysql -e "show status like 'wsrep_cluster_size'"  # 这里应该显示集群里有3个节点
mysql -e "show status like 'wsrep_connected'"     # 这里应该显示ON
mysql -e "show status like 'wsrep_incoming_addresses'" # 这里应该显示三个节点的地址
mysql -e "show status like 'wsrep_local_state_comment'" # 这里节点的同步状态

自启动问题

参考https://cloud.tencent.com/developer/article/1120419

以后启动都要先一个节点上galera_new_cluster 另外两个节点systemctl start mariadb

如果galera_new_cluster启动报错

Jun 25 13:20:35 node1 mariadbd $1388$ : 2024-06-25 13:20:35 0 $ERROR$ WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluste> Jun 25 13:20:35 node1 mariadbd $1388$ : 2024-06-25 13:20:35 0 $ERROR$ WSREP: wsrep::connect(gcomm://192.168.100.111,192.168.100.112,192.168.100.113) failed: 7 Jun 25 13:20:35 node1 mariadbd $1388$ : 2024-06-25 13:20:35 0 $ERROR$ Aborting Jun 25 13:20:35 node1 systemd $1$ : mariadb.service: Main process exited, code=exited, status=1/FAILURE

那就说明此节点不是最后一个离开集群的节点。（/var/lib/mysql/grastate.dat中safe_to_bootstrap为1，说明是最后一个离开集群的节点）

尝试从其他节点执行galera_new_cluster

再不行，修改其中一个节点/var/lib/mysql/grastate.dat将safe_to_bootstrap设置为1，来指示某个节点是安全的引导节点然后在这个节点里galera_new_cluster

再不行，删除/var/lib/mysql/gvwstate.dat和/var/lib/mysql/gvwstate.dat （如果存在）
很多服务默认自启的

查看服务默认的启动状态：systemctl list-unit-files | grep nginx

nginx.service enabled enabled

root@galera-nginx1:~# systemctl list-unit-files

UNIT FILE STATE PRESET

。。。

查看服务当前的自启设置 service --status-all

root@galera-nginx1:~# service --status-all

$+$ apparmor

$+$ chrony

$-$ console-setup.sh

$+$ cron

$+$ dbus

$-$ hwclock.sh

$+$ ipvsadm

$+$ keepalived

$-$ keyboard-setup.sh

$+$ kmod

$+$ networking

$+$ nginx

$+$ procps

$+$ qemu-guest-agent

$+$ ssh

$-$ sudo

$+$ udev

$+$ zabbix-agent

自制启动脚本

shell 复制代码

##host_list={galera-node1,galera-node2,galera-node3}

#注意先做好galera-node1到1、2、3的免密登录和/etc/hosts

ssh_port=22
#开始重启任务的时间
echo "task start at: $(date +%Y-%m-%d_%H:%M:%S)"
#本机上用于暂存各个数据库节点的状态的文件
bootstrap_tmpfile=/tmp/.db_stat.$(date +%Y%m%d%H%M)
#如果有未停止的，先全部停止一遍数据库服务
stop_db(){
        ssh galera-node3  'systemctl stop mariadb'
        sleep 1
        ssh galera-node2  'systemctl stop mariadb'
        sleep 1
        ssh galera-node1  'systemctl stop mariadb'
}

#检查各个数据库节点的状态，保存到bootstrap_tmpfile
check_status(){
        for host in {galera-node1,galera-node2,galera-node3};
        do
                echo ${host} $(ssh ${host} "grep safe_to_bootstrap /var/lib/mysql/grastate.dat"|awk '{print $NF}')
        done > ${bootstrap_tmpfile}
}

#检查
start_db(){
		#如果存在且仅存在一个 safe_to_bootstrap为1的节点 则引导节点为此节点
        if [ $(awk '/ 1/{print $1}' ${bootstrap_tmpfile}|wc -l) = 1 ]; #注意awk关键词里面有个空格
        then
                host1=$(awk '/ 1/{print $1}' ${bootstrap_tmpfile});
		#如果不存在safe_to_bootstrap为1的节点或者存在多个 则需要指定一个节点为引导节点 这里指定的是node1节点
        else
                host1=galera-node1
                ssh $host1 "sed -i '/bootstrap/c\safe_to_bootstrap: 1' /var/lib/mysql/grastate.dat"
				#修改其它节点的safe_to_bootstrap为0
                for host in $(awk '{print $1}' ${bootstrap_tmpfile}|grep -v ${host1});do
                        ssh $host "sed -i '/bootstrap/c\safe_to_bootstrap: 0' /var/lib/mysql/grastate.dat"
                done
        fi
		#在引导节点上执行引导集群命令
        ssh $host1  'galera_new_cluster'
        sleep 3
		#在其它节点上正常启动数据库服务
        for host in $(awk '{print $1}' ${bootstrap_tmpfile}|grep -v ${host1});do
                ssh $host 'systemctl start mariadb'
        done
}
main(){
        stop_db
        check_status
        start_db
        rm -f ${bootstrap_tmpfile}
        echo "task over  at: $(date +%Y-%m-%d_%H:%M:%S)"
}
main

各节点上systemctl disable mariadb

在node1节点上配置开机自启

防火墙配置

MariaDB/MySQL Galera服务器的防火墙规则https://blog.csdn.net/echolivia/article/details/51555051

3306，这个是MariaDB/MySQL的服务端口，这个都不开那就不用跑MariaDB/MySQL服务了。

4567，Galera做数据复制的通讯和数据传输端口，需要在防火墙放开TCP和UDP

4568，Galera做增量数据传输使用的端口（Incremental State Transfer, IST），需要防火墙放开TCP

4444，Galera做快照状态传输使用的端口（State Snapshot Transfer, SST），需要防火墙放开TCP

二、Nginx

Nginx Stream(TCP/UDP)负载均衡 (w3ccoo.com)
两种hash算法的区别
 Nginx实现四层/七层负载均衡 - 彼扬 - 博客园 (cnblogs.com)

安装软件包

bash 复制代码

apt install nginx -y

安装stream支持

bash 复制代码

apt install libnginx-mod-stream -y

复制代码

查看是否启用stream模块
ls /etc/nginx/modules-enabled/
50-mod-stream.conf

apt list --installed | grep stream
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
libnginx-mod-stream/stable,now 1.22.1-9 amd64 [installed,automatic]

也可以直接apt install nginx-full 会默认安装模块

修改配置文件

bash 复制代码

cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.bak
cat /etc/nginx/nginx.conf
user www-data;
worker_processes auto;
pid /run/nginx.pid;
error_log /var/log/nginx/error.log;
include /etc/nginx/modules-enabled/*.conf;

events {
        worker_connections 100000;#连接改大
        # multi_accept on;
}
#加入stream
stream {
        include /etc/nginx/conf.d/stream*.conf;
}
#注释掉或者删除掉http{。。。}

bash 复制代码

cat > /etc/nginx/conf.d/stream_mysql_cluster.conf <<EOF
upstream mysql_cluster {
    #least_conn;#把请求转发给连接数较少的后端
    #ip_hash;#通过ip每次把请求尽量发往同一个后端 这个是http里的写法 
    
    #参考
    #https://blog.csdn.net/kaige8312/article/details/145731636
    #https://blog.csdn.net/Tyro_java/article/details/141635599
    #https://zhuanlan.zhihu.com/p/565477967
    #https://docs.nginx.com/nginx/admin-guide/load-balancer/http-load-balancer/
    #https://docs.nginx.com/nginx/admin-guide/load-balancer/tcp-udp-load-balancer/
    #hash $remote_addr consistent;
    #虚拟节点数设置为32甚至更大，因此即使很少的服务节点也能做到相对均匀的数据分布
    hash $remote_addr consistent=1000;
    server 192.168.100.111:3306;
    server 192.168.100.112:3306;
    server 192.168.100.113:3306;
}

server {
    listen 3306;
    proxy_pass mysql_cluster;
}
EOF

重启Nginx

bash 复制代码

nginx -t #检查配置
systemctl restart nginx

测试

先在数据库里创建一个可以远程登录的用户testuser1

mysql 复制代码

grant all privileges on *.* to 'testuser1'@'%' identified by '123456' with grant option;

安装客户端通过nginx代理访问数据库

bash 复制代码

apt install mariadb-client  -qy
mysql -uroot -p -h192.168.100.114

三、Keepalived

安装软件包

bash 复制代码

apt install  keepalived
mv /etc/keepalived/keepalived.conf.sample  /etc/keepalived/keepalived.conf

如果同一网段中有其他Keepalived要先查看该网络中所存在的vrid：

复制代码

tcpdump -nn -i any net 192.168.100.0/8

注意：同一网段中virtual_router_id的值不能重复，否则会出错，相关错误信息如下。

Keepalived_vrrp $27120$ : ip address associated with VRID not present in received packet :

one or more VIP associated with VRID mismatch actual MASTER advert

bogus VRRP packet received on eth1 !!!

receive an invalid ip number count associated with VRID!

VRRP_Instance(xxx) ignoring received advertisment...

可以用这条命令来查看该网络中所存在的vrid：

tcpdump -nn -i any net 192.168.100.0/8

原文链接：https://blog.csdn.net/Kim_Weir/article/details/78991854

修改配置文件

查看网卡名ip a

Nginx1(Master节点)

bash 复制代码

vi /etc/keepalived/keepalived.conf
global_defs {
   notification_email {
        test@xxx.com
   }
   notification_email_from info@xxx.com
   smtp_server 192.168.x.x
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
#   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_instance VI_1 {
    state MASTER
    interface enp0s3#网卡改
    virtual_router_id 3
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass dbS3n2 #密码改
    }
    virtual_ipaddress {
        192.168.100.110/24
    }
}

Nginx2(Backup节点)

bash 复制代码

vi /etc/keepalived/keepalived.conf
global_defs {
   notification_email {
        test@xxx.com
   }
   notification_email_from info@xxx.com
   smtp_server 192.168.x.x
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
#   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_instance VI_1 {
    state BACKUP
    interface enp0s3#网卡改
    virtual_router_id 3 #vrid要相同
    priority 99 #优先级小
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass dbS3n2 #密码改
    }
    virtual_ipaddress {
        192.168.100.110/24
    }
}

重启Keepalived

复制代码

systemctl start keepalived

防火墙

两台nginx要互通vrrp协议不然收不到广播。

否则master重启后，当前的master节点Nginx2不会变成backup状态，出现两台都是master、都有vip的情况，即发生了脑裂现象。

VRRP协议在四层，不是tcp，故也没有端口。

笔者的虚拟机中禁用了防火墙，PVE平台上直接配置放通vrrp协议即可。

测试

数据库连接测试：

mysql 复制代码

#通过vip连接数据库
mysql -uroot -p -h192.168.100.110

VIP转移测试：

关闭Nginx1或者关闭Keepalived服务，此时业务不断,可用tcping -t 192.168.100.110 3306测试

在Nginx2上查看ip，vip浮动到了Nginx2的网卡上，nginx2成为了master

启动nginx1时，会切换回master，nginx2会切换回backup。

bash 复制代码

systemctl status keepalived -l #查看详细的日志