PostgreSQL14 +patroni+etcd+haproxy+keepalived 集群部署指南

使用postgresql + etcd + patroni + haproxy + keepalived可以实现PG的高可用集群,其中,以postgresql做数据库,Patroni监控本地的PostgreSQL状态,并将本地PostgreSQL信息/状态写入etcd来存储集群状态,所以,patroni与etcd结合可以实现数据库集群故障切换(自动或手动切换),而haproxy可以实现数据库读写分离+读负载均衡(通过不同端口实现),keepalived实现VIP跳转,对haproxy提供了高可用,防止haproxy宕机。

Etcd用于Patroni节点之间共享信息。Patroni监控本地的PostgreSQL状态。如果主库(Primary)故障,Patroni把一个从库(Standby)拉起来,作为新的主(Primary)数据库, 如果一个故障PostgreSQL被抢救过来了,能够重新自动或手动加入集群。

Patroni基于Python开发的模板,结合DCS(Distributed Configuration Store,例如 ZooKeeper, etcd, Consul )可以定制PostgreSQL高可用方案。Patroni接管PostgreSQL数据库的启停,同时监控本地的PostgreSQL数据库,并将本地的PostgreSQL数据库信息写入DCS。Patroni的主备端是通过是否能获得 leader key 来控制的,获取到了leader key的Patroni为主节点,其它的为备节点。

其中Patroni不仅简单易用而且功能非常强大。
支持自动failover和按需switchover
支持一个和多个备节点
支持级联复制
支持同步复制,异步复制
支持同步复制下备库故障时自动降级为异步复制(功效类似于MySQL的半同步,但是更加智能)
支持控制指定节点是否参与选主,是否参与负载均衡以及是否可以成为同步备机
支持通过pg_rewind自动修复旧主
支持多种方式初始化集群和重建备机,包括pg_basebackup和支持wal_e,pgBackRest,barman等备份工具的自定义脚本
支持自定义外部callback脚本
支持REST API
支持通过watchdog防止脑裂
支持k8s,docker等容器化环境部署
支持多种常见DCS(Distributed Configuration Store)存储元数据,包括etcd,ZooKeeper,Consul,Kubernetes

这个架构中,PostgreSQL 提供数据服务,Patroni 负责主从切换,etcd 提供一致性存储,HAProxy 提供访问路由,Keepalived 提供网络VIP高可用,Watchdog 提供节点存活及脑裂防护机制。 六者协同组成一个企业级高可用数据库集群

一、环境准备

软件版本:

Postgresql: 14.6

patroni: 3.1.1

etcd : 3.3.11

HAProxy :1.5.18

Keepalived :1.3.5

系统规划:

|---------|---------------|-------|-------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-------------|
| 主机 | IP | 接口 | 组件 | 系统版本 | 备注 |
| pgtest1 | 192.168.24.11 | ens33 | PostgreSQL、Patroni、Etcd,haproxy、keepalived | Centos7.9(3.10.0-1160.88.1.el7.x86_64) | 主节点Master |
| pgtest2 | 192.168.24.12 | ens33 | PostgreSQL、Patroni、Etcd,haproxy、keepalived | Centos7.9(3.10.0-1160.88.1.el7.x86_64) | 备节点1 BACKUP |
| pgtest3 | 192.168.24.13 | ens33 | PostgreSQL、Patroni、Etcd,haproxy、keepalived | Centos7.9(3.10.0-1160.88.1.el7.x86_64) | 备节点2 BACKUP |
| VIP | 192.168.24.15 | 绑定接口 || ens33 ||

关闭防火墙(比较彻底,也可以放行相应端口):

systemctl stop firewalld

systemctl disable firewalld

关闭selinux

vi /etc/selinux/config,设置selinux=disabled

配置sudo(免密)

cat >>/etc/sudoers <<EOF

postgres ALL=(ALL) NOPASSWD: ALL

EOF

修改/etc/hosts

cat >>/etc/hosts <<EOF

192.168.24.11 pgtest1

192.168.24.12 pgtest2

192.168.24.13 pgtest3

EOF

所有节点修改主机时间,确保节点间时间和时区同步,有条件的同步时间服务器

systemctl start chronyd.service

安装依赖包

yum install -y perl-ExtUtils-Embed readline zlib-devel pam-devel libxml2-devel libxslt-devel openldap-devel python-devel gcc-c++ openssl-devel cmake gcc* readline-devel zlib bison flex bison-devel flex-devel openssl openssl-devel

二、部署postgresql集群

2.1 数据库安装

-- 数据库软件安装(三个节点安装)

--创建用户、目录(root)

useradd postgres

echo "postgres" | passwd --stdin postgres

mkdir -p /postgresql/{pg14,pgdata,arch,soft}

chown -R postgres. /postgresql/

chmod -R 700 /postgresql/

--安装(postgres)

tar zxvf postgresql-14.6.tar.gz

cd postgresql-14.6/

./configure --prefix=/postgresql/pg14

make world &&make install-world

--配置环境变量(postgres)

vi /home/postgres/.bash_profile文件添加以下内容

export LANG=en_US.UTF-8

export PGHOME=/postgresql/pg14

export PGDATA=/postgresql/pgdata

export LD_LIBRARY_PATH=PGHOME/lib:LD_LIBRARY_PATH

export PATH=PGHOME/bin:PATH

source ~/.bash_profile

2.2 数据库配置

主库

--初始化数据库

postgres@pgtest1 \~ initdb -D PGDATA

--postgresql.conf文件末尾插入如下内容

cat >> /postgresql/pgdata/postgresql.conf << "EOF"

listen_addresses = '*'

archive_mode = on

archive_command = 'cp %p /postgresql/arch/%f'

log_destination = 'csvlog'

logging_collector = on

EOF

--pg_hba.conf文件插入host all all 0.0.0.0/0 scram-sha-256

sed -i '/^host\[:space:]\+all\[:space:]\+all\[:space:]\+127.0.0.1\/32\[:space:]\+trust/i\

host all all 0.0.0.0/0 scram-sha-256' /postgresql/pgdata/pg_hba.conf

--pg_hba.conf文件插入host replication all 0.0.0.0/0 scram-sha-256

sed -i '/^host\[:space:]\+replication\[:space:]\+all\[:space:]\+127.0.0.1\/32\[:space:]\+trust/i\

host replication all 0.0.0.0/0 scram-sha-256' /postgresql/pgdata/pg_hba.conf

--启动数据库

postgres@pgtest1 \~$ pg_ctl start

--修改postgres用户默认密码

postgres@pgtest1 postgresql-14.6$ psql

psql (14.6)

Type "help" for help.

postgres=# alter user postgres password 'postgres';

ALTER ROLE

备库 1

-- 复制备库 1

postgres@pgtest2 \~$ pg_basebackup -Fp -Pv -Xs -R -D /postgresql/pgdata -h 192.168.24.11 -p 5432 -Upostgres

-- 启动备库 1

postgres@pgtest2 \~$ pg_ctl start

备库 2

-- 复制备库 2

postgres@pgtest3 \~$ pg_basebackup -Fp -Pv -Xs -R -D /postgresql/pgdata -h 192.168.24.11 -p 5432 -Upostgres

-- 启动备库 2

postgres@pgtest3 \~$ pg_ctl start

2.3 主备库状态

主库状态:

postgres@pgtest1 \~$ pg_controldata |grep cluster

Database cluster state: in production

备库1状态:

postgres@pgtest2 \~$ pg_controldata |grep cluster

Database cluster state: in archive recovery

备库2状态:

postgres@pgtest3 \~$ pg_controldata |grep cluster

Database cluster state: in archive recovery

2.4 集群状态

1.通过pg_controldata输出:

postgres@pgtest1 postgresql-14.6$pg_controldata |grep state

Database cluster state: in production

Database cluster state: in archive recovery

2.通过数据字典表pg_stat_replication,主机表中能查到记录,备机表中无记录

postgres=#select pid,state,client_addr,sync_priority,sync_state from pg_stat_replication;

pid | state | client_addr | sync_priority | sync_state

-------+-----------+---------------+---------------+------------

84644 | streaming | 192.168.24.13 | 0 | async

84638 | streaming | 192.168.24.12 | 0 | async

(2 rows)

3.通过wal进程查看,显示 walsender 的是主机,显示 walreceiver 的是备机

postgres@pgtest1 \~$ ps -ef|grep wal

postgres 84435 84430 0 20:05 ? 00:00:00 postgres: walwriter

postgres 84638 84430 0 20:08 ? 00:00:00 postgres: walsender postgres 192.168.24.12(50458) streaming 0/5000060

postgres 84644 84430 0 20:08 ? 00:00:00 postgres: walsender postgres 192.168.24.13(35594) streaming 0/5000060

4. 通过自带函数判断,select pg_is_in_recovery();

#主库

postgres@pgtest1 \~$ psql -c "select pg_is_in_recovery();"

pg_is_in_recovery


f

(1 row)

#备库1

postgres@pgtest2 \~$ psql -c "select pg_is_in_recovery();"

pg_is_in_recovery


t

(1 row)

#备库2

postgres@pgtest3 \~$ psql -c "select pg_is_in_recovery();"

pg_is_in_recovery


t

(1 row)

三、部署watchdog

3 个节点安装配置

root@pgtest1 etc# modprobe softdog

root@pgtest1 etc# chown postgres:postgres /dev/watchdog

四、部署ETCD

3 个节点安装配置

4.1 添加环境变量

编辑root用户下的.bash_profile文件添加如下变量:

cat >> ~/.bash_profile <<EOF

export PATH=\$PATH:/postgresql/pg14/bin

export LD_LIBRARY_PATH=/postgresql/pg14/lib:\$LD_LIBRARY_PATH

EOF

source ~/.bash_profile,使环境变量立即生效。

4.2 安装etcd

-- 安装

yum -y install etcd

-- 查看已安装版本

root@pgtest1 \~# etcd --version

etcd Version: 3.3.11

Git SHA: 2cf9e51

Go Version: go1.10.3

Go OS/Arch: linux/amd64

-- 编辑etcd配置文件

  • pgtest1节点配置:

root@pgtest1 etcd# cat > /etc/etcd/etcd.conf <<EOF

ETCD_LISTEN_PEER_URLS="http://192.168.24.11:2380"

ETCD_LISTEN_CLIENT_URLS="http://192.168.24.11:2379,http://127.0.0.1:2379"

ETCD_NAME="pgtest1"

ETCD_DATA_DIR="/var/lib/etcd/etcd1"

ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.24.11:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://192.168.24.11:2379"

ETCD_INITIAL_CLUSTER="pgtest1=http://192.168.24.11:2380,pgtest2=http://192.168.24.12:2380,pgtest3=http://192.168.24.13:2380"

ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"

ETCD_INITIAL_CLUSTER_STATE="new"

EOF

  • pgtest2节点配置

root@pgtest2 etcd# cat > /etc/etcd/etcd.conf <<EOF

ETCD_LISTEN_PEER_URLS="http://192.168.24.12:2380"

ETCD_LISTEN_CLIENT_URLS="http://192.168.24.12:2379,http://127.0.0.1:2379"

ETCD_NAME="pgtest2"

ETCD_DATA_DIR="/var/lib/etcd/etcd2"

ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.24.12:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://192.168.24.12:2379"

ETCD_INITIAL_CLUSTER="pgtest1=http://192.168.24.11:2380,pgtest2=http://192.168.24.12:2380,pgtest3=http://192.168.24.13:2380"

ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"

ETCD_INITIAL_CLUSTER_STATE="new"

EOF

  • pgtest3节点配置

root@pgtest3 \~# cat > /etc/etcd/etcd.conf <<EOF

ETCD_LISTEN_PEER_URLS="http://192.168.24.13:2380"

ETCD_LISTEN_CLIENT_URLS="http://192.168.24.13:2379,http://127.0.0.1:2379"

ETCD_NAME="pgtest3"

ETCD_DATA_DIR="/var/lib/etcd/etcd3"

ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.24.13:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://192.168.24.13:2379"

ETCD_INITIAL_CLUSTER="pgtest1=http://192.168.24.11:2380,pgtest2=http://192.168.24.12:2380,pgtest3=http://192.168.24.13:2380"

ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"

ETCD_INITIAL_CLUSTER_STATE="new"

EOF

4.3 启动etcd

systemctl start etcd(依次同时启动 pgtest1、pgtest2、pgtest3)

systemctl enable etcd

4.4 查看etcd状态

4.1 检查etcd服务状态

#systemctl status etcd.service

  • 主节点:

root@pgtest1 \~# systemctl status etcd

  • 备节点1:

root@pgtest2 \~# systemctl status etcd

  • 备节点2:

root@pgtest3 \~# systemctl status etcd

4.2 检查etcd集群状态
  • 检查集群成员

root@pgtest1 keepalived# etcdctl member list

547cd75869727c68: name=pgtest1 peerURLs=http://192.168.24.11:2380 clientURLs=http://192.168.24.11:2379 isLeader=true

649a8a0e4650f77f: name=pgtest2 peerURLs=http://192.168.24.12:2380 clientURLs=http://192.168.24.12:2379 isLeader=false

e57d7e3537897022: name=pgtest3 peerURLs=http://192.168.24.13:2380 clientURLs=http://192.168.24.13:2379 isLeader=false

  • 检查集群状态

root@pgtest1 keepalived# etcdctl cluster-health

member 547cd75869727c68 is healthy: got healthy result from http://192.168.24.11:2379

member 649a8a0e4650f77f is healthy: got healthy result from http://192.168.24.12:2379

member e57d7e3537897022 is healthy: got healthy result from http://192.168.24.13:2379

cluster is healthy

五、部署patroni

3 个节点安装配置

5.1 安装python

  • python安装

yum install -y python3 python3-devel

查看安装情况

root@pgtest1 bin# whereis python3

python3: /usr/bin/python3.6-config /usr/bin/python3.6m-config /usr/bin/python3 /usr/bin/python3.6m /usr/bin/python3.6 /usr/bin/python3.6m-x86_64-config /usr/lib/python3.6 /usr/lib64/python3.6 /usr/include/python3.6m /usr/share/man/man1/python3.1.gz

  • python系统默认装版本查看

root@pgtest1 keepalived# python --version

Python 2.7.5

  • 替换默认python

rm -f /usr/bin/python

ln -s /usr/bin/python3 /usr/bin/python

root@pgtest1 \~# python -V

Python 3.6.8

全词匹配替换,python3安装后'yum'命令执行会报错,需要修改以下配置

root@pgtest1 \~# sed -i "s:\<python\>:python2:g" /usr/bin/yum

root@pgtest1 \~# sed -i "s:\<python\>:python2:g" /usr/libexec/urlgrabber-ext-down

5.2 安装PIP

  • 下载pip.py文件(每个节点顺序下载,不可同时下载)

#curl https://bootstrap.pypa.io/pip/3.6/get-pip.py -o get-pip.py

++如果下载很慢,则访问++ https://bootstrap.pypa.io/get-pip.py++,将这些代码复制并粘贴到文本编辑器中,再将文件保存为++++get-pip.py++++。++

++或者++

++scp++++传输++++get-pip.py++++到其他节点++

  • 安装.py文件

--使用清华园

#python3 get-pip.py -i https://pypi.tuna.tsinghua.edu.cn/simple/ --trusted-host mirrors.aliyun.com

5.3 安装patroni

pip3 install psycopg2-binary -i https://pypi.tuna.tsinghua.edu.cn/simple

pip3 install psycopg2==2.7.5 -i Simple Index

pip3 install cdiff -i https://pypi.tuna.tsinghua.edu.cn/simple

pip3 install "patronietcd,consul==3.1.1" -i Simple Index

5.4 查看已安装patroni版本

root@pgtest1 etcd# patroni --version

patroni 3.1.1

5.5 编辑patroni配置文件

root@pgtest1 # mkdir -p /app/patroni

创建/app/patroni/patroni_config.yml文件

  • pgtest1 节点配置

cat > /app/patroni/patroni_config.yml << EOF

scope: postgres_cluster

namespace: /service/

name: pgtest1

restapi:

listen: 192.168.24.11:8008

connect_address: 192.168.24.11:8008

etcd:

host: 192.168.24.11:2379

bootstrap:

dcs:

ttl: 30

loop_wait: 10

retry_timeout: 10

maximum_lag_on_failover: 1048576

master_start_timeout: 300

synchronous_mode: on

postgresql:

use_pg_rewind: true

use_slots: true

parameters:

listen_addresses: "0.0.0.0"

port: 5432

wal_level: "replica"

hot_standby: "on"

wal_keep_segments: 1000

max_wal_senders: 10

max_replication_slots: 10

wal_log_hints: "on"

initdb:

  • encoding: UTF8

  • data-checksums

postgresql:

listen: 0.0.0.0:5432

connect_address: 192.168.24.11:5432

data_dir: /postgresql/pgdata

bin_dir: /postgresql/pg14/bin

authentication:

replication:

username: postgres

password: postgres

superuser:

username: postgres

password: postgres

rewind:

username: postgres

password: postgres

tags:

nofailover: false

noloadbalance: false

clonefrom: false

nosync: false

EOF

  • pgtest2 节点配置

cat > /app/patroni/patroni_config.yml << EOF

scope: postgres_cluster

namespace: /service/

name: pgtest2

restapi:

listen: 192.168.24.12:8008

connect_address: 192.168.24.12:8008

etcd:

host: 192.168.24.12:2379

bootstrap:

dcs:

ttl: 30

loop_wait: 10

retry_timeout: 10

maximum_lag_on_failover: 1048576

master_start_timeout: 300

synchronous_mode: on

postgresql:

use_pg_rewind: true

use_slots: true

parameters:

listen_addresses: "0.0.0.0"

port: 5432

wal_level: "replica"

hot_standby: "on"

wal_keep_segments: 1000

max_wal_senders: 10

max_replication_slots: 10

wal_log_hints: "on"

initdb:

  • encoding: UTF8

  • data-checksums

postgresql:

listen: 0.0.0.0:5432

connect_address: 192.168.24.12:5432

data_dir: /postgresql/pgdata

bin_dir: /postgresql/pg14/bin

authentication:

replication:

username: postgres

password: postgres

superuser:

username: postgres

password: postgres

rewind:

username: postgres

password: postgres

tags:

nofailover: false

noloadbalance: false

clonefrom: false

nosync: false

EOF

  • pgtest3 节点配置

cat > /app/patroni/patroni_config.yml << EOF

scope: postgres_cluster

namespace: /service/

name: pgtest3

restapi:

listen: 192.168.24.13:8008

connect_address: 192.168.24.13:8008

etcd:

host: 192.168.24.13:2379

bootstrap:

dcs:

ttl: 30

loop_wait: 10

retry_timeout: 10

maximum_lag_on_failover: 1048576

master_start_timeout: 300

synchronous_mode: on

postgresql:

use_pg_rewind: true

use_slots: true

parameters:

listen_addresses: "0.0.0.0"

port: 5432

wal_level: "replica"

hot_standby: "on"

wal_keep_segments: 1000

max_wal_senders: 10

max_replication_slots: 10

wal_log_hints: "on"

initdb:

  • encoding: UTF8

  • data-checksums

postgresql:

listen: 0.0.0.0:5432

connect_address: 192.168.24.13:5432

data_dir: /postgresql/pgdata

bin_dir: /postgresql/pg14/bin

authentication:

replication:

username: postgres

password: postgres

superuser:

username: postgres

password: postgres

rewind:

username: postgres

password: postgres

tags:

nofailover: false

noloadbalance: false

clonefrom: false

nosync: false

EOF

5.6 配置systemd管理Patroni service

所有节点执行相同操作:

cat >>/usr/lib/systemd/system/patroni.service<<EOF

Unit

Description=patroni - a high-availability PostgreSQL

Documentation=https://patroni.readthedocs.io/en/latest/index.html

After=syslog.target network.target etcd.target

Wants=network-online.target

Service

Type=simple

User=postgres

Group=postgres

PermissionsStartOnly=true

ExecStart=/usr/local/bin/patroni /app/patroni/patroni_config.yml

ExecReload=/bin/kill -HUP $MAINPID

LimitNOFILE=65536

KillMode=process

KillSignal=SIGINT

Restart=on-abnormal

RestartSec=30s

TimeoutSec=0

Install

WantedBy=multi-user.target

EOF

5.7 启动patroni

  • 在所有节点依次启动

#chown -R postgres. /app/patroni

#systemctl start patroni

5.8 查看patroni状态

5.8.1 查看patroni服务状态
  • 主节点

root@pgtest1 \~# systemctl status patroni

  • 备节点1:

root@pgtest2 \~# systemctl status patroni

  • 备节点2:

root@pgtest3 \~# systemctl status patroni

注意:patroni集群启动时,数据库集群会自动重启。重启之后状态恢复正常,如下:

5.8.2 查看patroni集群状态
  • 在任意节点上查看Patroni集群状态

root@pgtest1 \~# patronictl -c /app/patroni/patroni_config.yml list

  • 在任意节点上查看ETCD信息:

root@pgtest1 \~# etcdctl ls /service/postgres_cluster

root@pgtest1 \~# etcdctl get /service/postgres_cluster/members/pgtest1

5.9 设置patroni自启动

systemctl enable patroni

六、部署keepalived

3个节点安装配置

6.1 安装keepalived

yum -y install keepalived

6.2 编辑Keepalived配置文件

  • 主服务器的keepalived.conf添加以下内容:

cat > /etc/keepalived/keepalived.conf << EOF

global_defs {

router_id LVS_DEVEL00

script_user root

enable_script_security

}

vrrp_script check_haproxy {

script "/etc/keepalived/check_haproxy.sh"

interval 2

weight 5

fall 3

rise 5

timeout 2

}

vrrp_instance VI_1 {

state Master

interface ens33

virtual_router_id 80

priority 100

advert_int 1

authentication {

auth_type PASS

auth_pass 12345

}

virtual_ipaddress {

192.168.24.15/24

}

track_script {

check_haproxy

}

}

EOF

  • 备服务器1的keepalived.conf添加以下内容:

cat > /etc/keepalived/keepalived.conf << EOF

global_defs {

router_id LVS_DEVEL01

script_user root

enable_script_security

}

vrrp_script check_haproxy {

script "/etc/keepalived/check_haproxy.sh"

interval 2

weight 5

fall 3

rise 5

timeout 2

}

vrrp_instance VI_1 {

state BACKUP

interface ens33

virtual_router_id 80

priority 90

advert_int 1

authentication {

auth_type PASS

auth_pass 12345

}

virtual_ipaddress {

192.168.24.15/24

}

track_script {

check_haproxy

}

}

EOF

  • 备服务器2的keepalived.conf添加以下内容:

cat > /etc/keepalived/keepalived.conf << EOF

global_defs {

router_id LVS_DEVEL02

script_user root

enable_script_security

}

vrrp_script check_haproxy {

script "/etc/keepalived/check_haproxy.sh"

interval 2

weight 5

fall 3

rise 5

timeout 2

}

vrrp_instance VI_1 {

state BACKUP

interface ens33

virtual_router_id 80

priority 80

advert_int 1

authentication {

auth_type PASS

auth_pass 12345

}

virtual_ipaddress {

192.168.24.15/24

}

track_script {

check_haproxy

}

}

EOF

6.3 创建check_haproxy检查脚本

所有节点执行相同操作

编辑脚本

root@pgtest1 keepalived# vi /etc/keepalived/check_haproxy.sh

#!/bin/bash

count=`ps aux | grep -v grep | grep haproxy | wc -l`

if $count -eq 0 ; then

exit 1

else

exit 0

fi

更改脚本权限

root@pgtest1 \~# chmod a+x /etc/keepalived/check_haproxy.sh

6.4 启动keepalived

systemctl start keepalived

6.5 查看keepalived状态

6.5.1 查看keepalived服务状态
  • 主节点:

root@pgtest1 keepalived# systemctl status keepalived.service

  • 备节点 1

root@pgtest2 \~# systemctl status keepalived.service

  • 备节点 2

root@pgtest3 \~# systemctl status keepalived.service

6.5.2 查看VIP绑定状态

root@pgtest1 keepalived# ip addr

6.5.3 设置patroni服务自启动

systemctl enable keepalived

七、部署haproxy

3 个节点安装配置

7.1安装haproxy

yum -y install haproxy

7.2 编辑HAProxy配置文件

所有节点执行相同操作

cat > /etc/haproxy/haproxy.cfg << EOF

global

log 127.0.0.1 local0 info

log 127.0.0.1 local1 warning

chroot /var/lib/haproxy

pidfile /var/run/haproxy.pid

maxconn 4000

user haproxy

group haproxy

daemon

stats socket /var/lib/haproxy/stats

defaults

mode tcp

log global

option tcplog

option dontlognull

option redispatch

retries 3

timeout queue 5s

timeout connect 10s

timeout client 60m

timeout server 60m

timeout check 15s

maxconn 3000

listen status

bind *:1080

mode http

log global

stats enable

stats refresh 30s

stats uri /

stats realm Private lands

stats auth admin:admin

listen master

bind *:5000

mode tcp

option tcplog

balance roundrobin

option httpchk OPTIONS /master

http-check expect status 200

default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions

server pgtest1 192.168.24.11:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2

server pgtest2 192.168.24.12:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2

server pgtest3 192.168.24.13:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2

listen replicas

bind *:5001

mode tcp

option tcplog

balance roundrobin

option httpchk OPTIONS /replica

http-check expect status 200

default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions

server pgtest1 192.168.24.11:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2

server pgtest2 192.168.24.12:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2

server pgtest3 192.168.24.13:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2

EOF

7.3 启动haproxy

systemctl start haproxy

7.4 查看haproxy服务

systemctl status haproxy

7.5 设置haproxy服务自启动

systemctl enable haproxy

7.8 HAProxy监控页面

登录地址:http://192.168.24.15:1080/ (也可以通过各个节点IP+端口登录)

默认用户密码:admin/admin

8、故障演示

8.1 在任意节点上查看Patroni集群状态

#patronictl -c /app/patroni/patroni_config.yml list

pgtest1节点为Leader节点

任意节点通过虚IP:192.168.24.15端口5000连接数据库:

psql -h 192.168.24.15 -p 5000 -Upostgres

postgres@pgtest3 pgdata$ psql -h 192.168.24.15 -p5432 -Upostgres

Password for user postgres:

psql (14.6)

Type "help" for help.

postgres=# \l

List of databases

Name | Owner | Encoding | Collate | Ctype | Access privileges

-----------+----------+----------+-------------+-------------+-----------------------

postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |

template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +

| | | | | postgres=CTc/postgres

template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +

| | | | | postgres=CTc/postgres

(3 rows)

8.2 pgtest1节点切换

--当前集群主节点是 pgtest1,将主角色切换到 pgtest2:

patronictl -c /app/patroni/patroni_config.yml switchover --master pgtest1 --candidate pgtest2

8.4 pgtest2节点切换

--当前集群主节点是 pgtest2,将主角色切换到 pgtest3:

patronictl -c /app/patroni/patroni_config.yml switchover --master pgtest2 --candidate pgtest3

8.6 pgtest3节点切换

--当前集群主节点是 pgtest3,将主角色切换到 pgtest1:

patronictl -c /app/patroni/patroni_config.yml switchover --master pgtest3 --candidate pgtest1

8.7 failover的切换

停止pgtest1节点的partroni、keepalive、haproxy服务或者关闭节点

root@pgtest1 \~# systemctl stop haproxy

root@pgtest1 \~# systemctl stop keepalived

root@pgtest1 \~# systemctl stop patroni

查看集群状态,集群leader已漂移到2节点

恢复pgtest1节点的集群服务

root@pgtest1 \~# systemctl start patroni

root@pgtest1 \~# systemctl start keepalived

root@pgtest1 \~# systemctl start haproxy

root@pgtest1 \~# patronictl -c /app/patroni/patroni_config.yml list

Pgtest1节点恢复,由leader节点变为备节点。

8.8 数据丢失恢复

--pgtest3备节点删除数据目录模拟数据丢失

删除前集群状态查看

root@pgtest1 \~# patronictl -c /app/patroni/patroni_config.yml list

  • Cluster: postgres_cluster (7501697848219523437) ---+----+-----------+

| Member | Host | Role | State | TL | Lag in MB |

+---------+---------------+--------------+-----------+----+-----------+

| pgtest1 | 192.168.24.11 | Leader | running | 7 | |

| pgtest2 | 192.168.24.12 | Sync Standby | streaming | 7 | 0 |

| pgtest3 | 192.168.24.13 | Replica | streaming | 7 | 0 |

+---------+---------------+--------------+-----------+----+-----------+

删除data目录

postgres@pgtest3 postgresql$ rm -rf pgdata

postgres@pgtest3 postgresql$ ls

arch pg14 pgdatabak scripts soft

postgres@pgtest3 postgresql$ ls -ltr

total 8

drwx------ 2 postgres postgres 6 Oct 29 12:47 scripts

drwx------ 3 postgres postgres 59 Oct 29 12:50 soft

drwx------ 6 postgres postgres 56 Oct 29 12:56 pg14

drwx------ 2 postgres postgres 4096 Oct 29 14:53 arch

drwx------ 20 postgres postgres 4096 Oct 30 09:31 pgdatabak

删除后集群状态查看

root@pgtest1 \~# patronictl -c /app/patroni/patroni_config.yml list

  • Cluster: postgres_cluster (7501697848219523437) ---+----+-----------+

| Member | Host | Role | State | TL | Lag in MB |

+---------+---------------+--------------+-----------+----+-----------+

| pgtest1 | 192.168.24.11 | Leader | running | 7 | |

| pgtest2 | 192.168.24.12 | Sync Standby | streaming | 7 | 0 |

| pgtest3 | 192.168.24.13 | Replica | streaming | 7 | 0 |

+---------+---------------+--------------+-----------+----+-----------+

集群自动恢复开始

root@pgtest1 \~# patronictl -c /app/patroni/patroni_config.yml list

  • Cluster: postgres_cluster (7501697848219523437) ---+----+-----------+

| Member | Host | Role | State | TL | Lag in MB |

+---------+---------------+--------------+-----------+----+-----------+

| pgtest1 | 192.168.24.11 | Leader | running | 7 | |

| pgtest2 | 192.168.24.12 | Sync Standby | streaming | 7 | 0 |

| pgtest3 | 192.168.24.13 | Replica | streaming | 7 | 0 |

+---------+---------------+--------------+-----------+----+-----------+

pgtest3数据目录自动恢复

流复制同步状态:

相关推荐
程序媛_6 小时前
【Python】连接PostgreSQL获取手机验证码
开发语言·python·postgresql
IvorySQL8 小时前
PostgreSQL 技术日报 (6月7日)|峰会线上通道开放
数据库·postgresql
李燚1 天前
erlang_migrate 架构拆解:behaviour 驱动的多数据库迁移引擎
数据库·postgresql·架构·erlang·migrate·behaviour·erlang_migrate
Jinkxs1 天前
PostgreSQL - 全文检索的开启与基础使用
数据库·postgresql·全文检索
l1t1 天前
DeepSeek总结的使用 Docker 对 PostgreSQL 进行 Beta 测试
docker·postgresql·容器
IvorySQL1 天前
PGv19预发布对现有生产系统的隐患思考,MySQL别看!
数据库·postgresql·开源
倒流时光三十年1 天前
PostgreSQL JOIN 大白话指南
postgresql·join
花酒锄作田2 天前
DeepAgents - 使用Postgres作为Checkpoint
postgresql·deepagents
瀚高PG实验室3 天前
pgsql-ogr-fdw
数据库·postgresql·瀚高数据库·highgo
IvorySQL3 天前
PostgreSQL 技术日报 (6月5日)|PG19 Beta1 上线,PGConf.PL 2026开启征稿
数据库·postgresql·区块链