PostgreSQL 14 + Patroni + etcd + HAProxy + Keepalived Cluster Deployment Guide

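A highly available PostgreSQL cluster can be built from PostgreSQL + etcd + Patroni + HAProxy + Keepalived. PostgreSQL is the database. Patroni monitors the local PostgreSQL instance and writes its information and state to etcd, which stores the cluster state, so Patroni and etcd together provide automatic failover and manual switchover. HAProxy provides read/write splitting and read load balancing through separate ports. Keepalived moves the VIP between nodes, which makes HAProxy itself highly available so that the failure of one HAProxy instance does not cut off access.

etcd is used to share information between the Patroni nodes. Patroni monitors the state of the local PostgreSQL instance. If the primary fails, Patroni promotes a standby to be the new primary. Once a failed PostgreSQL instance has been repaired, it can rejoin the cluster automatically or manually.

Patroni is a template written in Python that, combined with a DCS (Distributed Configuration Store, for example ZooKeeper, etcd, or Consul), lets you build a customized PostgreSQL high-availability solution. Patroni takes over starting and stopping PostgreSQL, monitors the local PostgreSQL instance, and writes its information to the DCS. Primary and standby roles are decided by the leader key: the Patroni instance that acquires the leader key is the primary, and all others are standbys.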
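The leader-key idea can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Patroni's implementation: `ToyDCS`, `LeaderKey`, `try_acquire`, and `role_of` are hypothetical names, time is passed in explicitly instead of being read from a clock, and the 30-second TTL simply mirrors the `ttl: 30` value used in the Patroni configuration in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaderKey:
    """In-memory stand-in for the leader key kept in the DCS (hypothetical)."""
    owner: Optional[str] = None
    expires_at: float = 0.0


class ToyDCS:
    """Toy DCS holding a single leader key with a time-to-live, no networking."""

    def __init__(self, ttl: float = 30.0) -> None:
        self.ttl = ttl
        self.key = LeaderKey()

    def try_acquire(self, node: str, now: float) -> bool:
        """Take the key if it is free or expired, or renew it if we already own it."""
        expired = now >= self.key.expires_at
        if self.key.owner is None or expired or self.key.owner == node:
            self.key = LeaderKey(owner=node, expires_at=now + self.ttl)
            return True
        return False

    def role_of(self, node: str, now: float) -> str:
        """The holder of an unexpired key is the primary; everyone else is a standby."""
        held = self.key.owner == node and now < self.key.expires_at
        return "primary" if held else "standby"


if __name__ == "__main__":
    dcs = ToyDCS(ttl=30)
    dcs.try_acquire("pgtest1", now=0)          # pgtest1 becomes primary
    print(dcs.try_acquire("pgtest2", now=10))  # False: the key is still held
    print(dcs.try_acquire("pgtest2", now=31))  # True: pgtest1 never renewed
```

In this model a primary that stops renewing the key (because it crashed or lost contact with the DCS) loses its role as soon as the TTL runs out, which is what allows another node to take over.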

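Patroni is easy to use and feature-rich:

* Automatic failover and on-demand switchover
* One or more standby nodes
* Cascading replication
* Synchronous and asynchronous replication
* Automatic degradation from synchronous to asynchronous replication when the synchronous standby fails (similar in effect to MySQL semi-synchronous replication, but more flexible)
* Per-node control over whether a node takes part in leader election, whether it takes part in load balancing, and whether it may become a synchronous standby
* Automatic repair of a former primary with pg_rewind
* Several ways to initialize the cluster and rebuild standbys, including pg_basebackup and custom scripts for backup tools such as WAL-E, pgBackRest, and Barman
* Custom external callback scripts
* A REST API
* Split-brain protection through a watchdog
* Deployment in containerized environments such as Kubernetes and Docker
* Several common DCS backends for storing metadata, including etcd, ZooKeeper, Consul, and Kubernetes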
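The REST API and the load-balancing tag are what a load balancer relies on to find the right node. The sketch below models that decision in a simplified way. It assumes the behaviour described in the Patroni REST API documentation (`/master` answers 200 only on the leader, `/replica` answers 200 only on a running replica without the `noloadbalance` tag), and `health_status` and `backends` are hypothetical helper names, not part of Patroni.

```python
def health_status(endpoint: str, role: str, state: str = "running",
                  noloadbalance: bool = False) -> int:
    """Return the HTTP status a Patroni-style health endpoint would give.

    Simplified model: 200 means "send traffic here", 503 means "do not".
    """
    if state != "running":
        return 503
    if endpoint in ("/master", "/primary"):
        return 200 if role == "primary" else 503
    if endpoint == "/replica":
        return 200 if role == "replica" and not noloadbalance else 503
    raise ValueError(f"unknown endpoint: {endpoint}")


def backends(endpoint: str, nodes: dict) -> list:
    """List the nodes a load balancer would keep for one listener."""
    return sorted(name for name, info in nodes.items()
                  if health_status(endpoint, **info) == 200)


if __name__ == "__main__":
    cluster = {
        "pgtest1": {"role": "primary"},
        "pgtest2": {"role": "replica"},
        "pgtest3": {"role": "replica", "noloadbalance": True},
    }
    print(backends("/master", cluster))   # write listener
    print(backends("/replica", cluster))  # read listener
```

A listener that checks `/master` therefore always has exactly one healthy backend in a working cluster, while a listener that checks `/replica` spreads reads over the remaining nodes.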

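In this architecture, PostgreSQL provides the data service, Patroni handles primary/standby switching, etcd provides consistent storage, HAProxy routes client connections, Keepalived keeps the VIP highly available, and the watchdog provides node-liveness checking and split-brain protection. Together, these six components form an enterprise-grade highly available database cluster.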
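etcd can only provide consistent storage while a majority of its members can reach each other, which is why this guide uses three nodes. A minimal sketch of the arithmetic, with hypothetical function names:

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, quorum(n), tolerated_failures(n))
```

For the three-node layout used here the quorum is 2, so the cluster keeps working with one node down. A two-node etcd cluster tolerates no failures at all, which is why an odd member count is the usual choice.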

### 1. Environment preparation

Software versions:

* PostgreSQL: 14.6
* Patroni: 3.1.1
* etcd: 3.3.11
* HAProxy: 1.5.18
* Keepalived: 1.3.5

System plan:

| Host | IP | Interface | Components | OS version | Notes |
|---------|---------------|-------|------------------------------------------------|------------------------------------------|--------------------------|
| pgtest1 | 192.168.24.11 | ens33 | PostgreSQL, Patroni, etcd, HAProxy, Keepalived | CentOS 7.9 (3.10.0-1160.88.1.el7.x86_64) | Primary node (MASTER) |
| pgtest2 | 192.168.24.12 | ens33 | PostgreSQL, Patroni, etcd, HAProxy, Keepalived | CentOS 7.9 (3.10.0-1160.88.1.el7.x86_64) | Standby node 1 (BACKUP) |
| pgtest3 | 192.168.24.13 | ens33 | PostgreSQL, Patroni, etcd, HAProxy, Keepalived | CentOS 7.9 (3.10.0-1160.88.1.el7.x86_64) | Standby node 2 (BACKUP) |
| VIP | 192.168.24.15 | ens33 | | | Bound to interface ens33 |
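The VIP in the table is held by whichever Keepalived instance currently has the highest effective VRRP priority. The sketch below works through that arithmetic for the values used in the Keepalived configuration later in this guide (priorities 100, 90, 80 and a `check_haproxy` script with `weight 5`). It assumes the rule from the keepalived.conf documentation: a positive weight is added while the script succeeds, and a negative weight is applied while it fails. The function names are hypothetical, and the model ignores preemption settings and timing.

```python
def effective_priority(base: int, weight: int, script_ok: bool) -> int:
    """Priority after applying one track_script result.

    Assumed rule: a positive weight is added while the script succeeds;
    a negative weight is added (lowering the priority) while it fails.
    """
    if weight >= 0:
        return base + weight if script_ok else base
    return base if script_ok else base + weight


def vip_holder(nodes: dict, weight: int) -> str:
    """Pick the node with the highest effective priority.

    `nodes` maps a node name to (base_priority, script_ok).
    """
    return max(
        nodes,
        key=lambda name: effective_priority(nodes[name][0], weight, nodes[name][1]),
    )


if __name__ == "__main__":
    # HAProxy is down on pgtest1 and healthy on the other two nodes.
    state = {"pgtest1": (100, False), "pgtest2": (90, True), "pgtest3": (80, True)}
    print(vip_holder(state, weight=5))
    print(vip_holder(state, weight=-20))
```

Under these assumptions a weight of 5 is smaller than the gap of 10 between priorities, so the VIP would stay on pgtest1 even when its HAProxy check fails. A weight whose magnitude is larger than the gap would move it. This is worth verifying against the Keepalived version actually deployed.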

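Disable the firewall (the blunt approach; alternatively, open only the required ports):

    systemctl stop firewalld
    systemctl disable firewalld

Disable SELinux by editing /etc/selinux/config and setting SELINUX=disabled (the change takes effect after a reboot):

    vi /etc/selinux/config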
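When the same change has to be made on three nodes, editing by hand is easy to get wrong. The following sketch shows one way to rewrite the setting programmatically. It is an illustration only: `disable_selinux` is a hypothetical helper that transforms the text of the file and leaves reading and writing /etc/selinux/config to the caller.

```python
import re


def disable_selinux(config_text: str) -> str:
    """Return config text with the SELINUX= setting forced to 'disabled'.

    Comment lines and the separate SELINUXTYPE= setting are left untouched.
    If no SELINUX= line exists, one is appended.
    """
    pattern = re.compile(r"^SELINUX=.*$", flags=re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub("SELINUX=disabled", config_text)
    suffix = "" if config_text.endswith("\n") or not config_text else "\n"
    return config_text + suffix + "SELINUX=disabled\n"


if __name__ == "__main__":
    sample = "# SELINUX= can take one of three values\nSELINUX=enforcing\nSELINUXTYPE=targeted\n"
    print(disable_selinux(sample))
```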

Configure passwordless sudo:

    cat >> /etc/sudoers << EOF
    postgres ALL=(ALL) NOPASSWD: ALL
    EOF

Update /etc/hosts:

    cat >> /etc/hosts << EOF
    192.168.24.11 pgtest1
    192.168.24.12 pgtest2
    192.168.24.13 pgtest3
    EOF
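Appending to /etc/hosts adds duplicate lines every time it is repeated. The sketch below shows an idempotent variant of the same step. `merge_hosts` is a hypothetical helper that works on the file contents as a string, so it can be tried without touching the real file.

```python
def merge_hosts(existing: str, entries: dict) -> str:
    """Append 'ip name' lines for names not already present in hosts text."""
    known = set()
    for line in existing.splitlines():
        fields = line.split("#", 1)[0].split()
        known.update(fields[1:])  # every field after the IP is a host name
    missing = [f"{ip} {name}" for name, ip in entries.items() if name not in known]
    if not missing:
        return existing
    sep = "" if existing.endswith("\n") or not existing else "\n"
    return existing + sep + "\n".join(missing) + "\n"


if __name__ == "__main__":
    cluster = {
        "pgtest1": "192.168.24.11",
        "pgtest2": "192.168.24.12",
        "pgtest3": "192.168.24.13",
    }
    once = merge_hosts("127.0.0.1 localhost\n", cluster)
    print(once)
    print(merge_hosts(once, cluster) == once)  # a second run changes nothing
```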

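On all nodes, set the system time and make sure that time and time zone are synchronized across the nodes. Synchronize against a time server if one is available.

    systemctl start chronyd.service

Install the dependency packages:

    yum install -y perl-ExtUtils-Embed readline zlib-devel pam-devel libxml2-devel libxslt-devel openldap-devel python-devel gcc-c++ openssl-devel cmake gcc* readline-devel zlib bison flex bison-devel flex-devel openssl openssl-devel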

### 2. Deploy the PostgreSQL cluster

#### 2.1 Install the database

Install the database software on all three nodes.

Create the user and directories (as root):

    useradd postgres
    echo "postgres" | passwd --stdin postgres
    mkdir -p /postgresql/{pg14,pgdata,arch,soft}
    chown -R postgres:postgres /postgresql/
    chmod -R 700 /postgresql/

Build and install (as postgres):

    tar zxvf postgresql-14.6.tar.gz
    cd postgresql-14.6/
    ./configure --prefix=/postgresql/pg14
    make world && make install-world

Configure the environment variables (as postgres) by adding the following to /home/postgres/.bash_profile:

    export LANG=en_US.UTF-8
    export PGHOME=/postgresql/pg14
    export PGDATA=/postgresql/pgdata
    export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
    export PATH=$PGHOME/bin:$PATH

Then reload the profile:

    source ~/.bash_profile

#### 2.2 Configure the database

**Primary**

Initialize the database:

    [postgres@pgtest1 ~]$ initdb -D $PGDATA

Append the following to the end of postgresql.conf:

    cat >> /postgresql/pgdata/postgresql.conf << "EOF"
    listen_addresses = '*'
    archive_mode = on
    archive_command = 'cp %p /postgresql/arch/%f'
    log_destination = 'csvlog'
    logging_collector = on
    EOF

Insert `host all all 0.0.0.0/0 scram-sha-256` into pg_hba.conf:

    sed -i '/^host[[:space:]]\+all[[:space:]]\+all[[:space:]]\+127.0.0.1\/32[[:space:]]\+trust/i\ host all all 0.0.0.0/0 scram-sha-256' /postgresql/pgdata/pg_hba.conf

Insert `host replication all 0.0.0.0/0 scram-sha-256` into pg_hba.conf:

    sed -i '/^host[[:space:]]\+replication[[:space:]]\+all[[:space:]]\+127.0.0.1\/32[[:space:]]\+trust/i\ host replication all 0.0.0.0/0 scram-sha-256' /postgresql/pgdata/pg_hba.conf

Start the database:

    [postgres@pgtest1 ~]$ pg_ctl start

Change the default password of the postgres user:

    [postgres@pgtest1 postgresql-14.6]$ psql
    psql (14.6)
    Type "help" for help.

    postgres=# alter user postgres password 'postgres';
    ALTER ROLE

**Standby 1**

Clone standby 1 from the primary:

    [postgres@pgtest2 ~]$ pg_basebackup -Fp -Pv -Xs -R -D /postgresql/pgdata -h 192.168.24.11 -p 5432 -Upostgres

Start standby 1:

    [postgres@pgtest2 ~]$ pg_ctl start

**Standby 2**

Clone standby 2 from the primary:

    [postgres@pgtest3 ~]$ pg_basebackup -Fp -Pv -Xs -R -D /postgresql/pgdata -h 192.168.24.11 -p 5432 -Upostgres

Start standby 2:

    [postgres@pgtest3 ~]$ pg_ctl start

#### 2.3 Primary and standby status

Primary status:

    [postgres@pgtest1 ~]$ pg_controldata | grep cluster
    Database cluster state: in production

Standby 1 status:

    [postgres@pgtest2 ~]$ pg_controldata | grep cluster
    Database cluster state: in archive recovery

Standby 2 status:

    [postgres@pgtest3 ~]$ pg_controldata | grep cluster
    Database cluster state: in archive recovery

#### 2.4 Cluster status

1. From the pg_controldata output. The primary reports `in production` and a standby reports `in archive recovery`:

    [postgres@pgtest1 postgresql-14.6]$ pg_controldata | grep state
    Database cluster state: in production
    Database cluster state: in archive recovery

2. From the pg_stat_replication view. The primary returns rows; a standby returns none:

    postgres=# select pid,state,client_addr,sync_priority,sync_state from pg_stat_replication;
      pid  |   state   |  client_addr  | sync_priority | sync_state
    -------+-----------+---------------+---------------+------------
     84644 | streaming | 192.168.24.13 |             0 | async
     84638 | streaming | 192.168.24.12 |             0 | async
    (2 rows)

3. From the WAL processes. A node showing walsender is the primary; a node showing walreceiver is a standby:

    [postgres@pgtest1 ~]$ ps -ef | grep wal
    postgres  84435  84430  0 20:05 ?  00:00:00 postgres: walwriter
    postgres  84638  84430  0 20:08 ?  00:00:00 postgres: walsender postgres 192.168.24.12(50458) streaming 0/5000060
    postgres  84644  84430  0 20:08 ?  00:00:00 postgres: walsender postgres 192.168.24.13(35594) streaming 0/5000060

4. With the built-in function pg_is_in_recovery().

Primary:

    [postgres@pgtest1 ~]$ psql -c "select pg_is_in_recovery();"
     pg_is_in_recovery
    -------------------
     f
    (1 row)

Standby 1:

    [postgres@pgtest2 ~]$ psql -c "select pg_is_in_recovery();"
     pg_is_in_recovery
    -------------------
     t
    (1 row)

Standby 2:

    [postgres@pgtest3 ~]$ psql -c "select pg_is_in_recovery();"
     pg_is_in_recovery
    -------------------
     t
    (1 row)

### 3. Deploy the watchdog

Install and configure on all 3 nodes:

    [root@pgtest1 etc]# modprobe softdog
    [root@pgtest1 etc]# chown postgres:postgres /dev/watchdog

### 4. Deploy etcd

Install and configure on all 3 nodes.

#### 4.1 Add environment variables

Edit the .bash_profile file of the root user and add the following variables:

    cat >> ~/.bash_profile << …
    … /etc/etcd/etcd.conf << …
    … /etc/etcd/etcd.conf << …
    … /etc/etcd/etcd.conf << …
    … :python2:g" /usr/bin/yum
    [root@pgtest1 ~]# sed -i "s:…:python2:g" /usr/libexec/urlgrabber-ext-down

#### 5.2 Install pip

Download get-pip.py (download on each node in turn, not on all nodes at the same time):

    # curl https://bootstrap.pypa.io/pip/3.6/get-pip.py -o get-pip.py

If the download is slow, open the URL above in a browser, copy the code into a text editor, and save it as get-pip.py. Alternatively, copy get-pip.py to the other nodes with scp.

Run the installer, using the Tsinghua mirror:

    # python3 get-pip.py -i https://pypi.tuna.tsinghua.edu.cn/simple/ --trusted-host pypi.tuna.tsinghua.edu.cn

#### 5.3 Install Patroni

    pip3 install psycopg2-binary -i https://pypi.tuna.tsinghua.edu.cn/simple
    pip3 install psycopg2==2.7.5 -i https://pypi.tuna.tsinghua.edu.cn/simple
    pip3 install cdiff -i https://pypi.tuna.tsinghua.edu.cn/simple
    pip3 install "patroni[etcd,consul]==3.1.1" -i https://pypi.tuna.tsinghua.edu.cn/simple

#### 5.4 Check the installed Patroni version

    [root@pgtest1 etcd]# patroni --version
    patroni 3.1.1

#### 5.5 Edit the Patroni configuration file

    [root@pgtest1 ]# mkdir -p /app/patroni

Create the file /app/patroni/patroni_config.yml.

Configuration for node pgtest1:

    cat > /app/patroni/patroni_config.yml << EOF
    scope: postgres_cluster
    namespace: /service/
    name: pgtest1

    restapi:
      listen: 192.168.24.11:8008
      connect_address: 192.168.24.11:8008

    etcd:
      host: 192.168.24.11:2379

    bootstrap:
      dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        master_start_timeout: 300
        synchronous_mode: on
        postgresql:
          use_pg_rewind: true
          use_slots: true
          parameters:
            listen_addresses: "0.0.0.0"
            port: 5432
            wal_level: "replica"
            hot_standby: "on"
            wal_keep_segments: 1000
            max_wal_senders: 10
            max_replication_slots: 10
            wal_log_hints: "on"
      initdb:
      - encoding: UTF8
      - data-checksums

    postgresql:
      listen: 0.0.0.0:5432
      connect_address: 192.168.24.11:5432
      data_dir: /postgresql/pgdata
      bin_dir: /postgresql/pg14/bin
      authentication:
        replication:
          username: postgres
          password: postgres
        superuser:
          username: postgres
          password: postgres
        rewind:
          username: postgres
          password: postgres

    tags:
      nofailover: false
      noloadbalance: false
      clonefrom: false
      nosync: false
    EOF

Note that PostgreSQL 13 and later replaced `wal_keep_segments` with `wal_keep_size`, so check how your Patroni version handles this parameter on PostgreSQL 14.

The files for pgtest2 and pgtest3 are created with the same command and are identical to the one above except for the following values:

| Setting | pgtest2 | pgtest3 |
|---------|---------|---------|
| name | pgtest2 | pgtest3 |
| restapi.listen | 192.168.24.12:8008 | 192.168.24.13:8008 |
| restapi.connect_address | 192.168.24.12:8008 | 192.168.24.13:8008 |
| etcd.host | 192.168.24.12:2379 | 192.168.24.13:2379 |
| postgresql.connect_address | 192.168.24.12:5432 | 192.168.24.13:5432 |

#### 5.6 Manage the Patroni service with systemd

Run the same steps on all nodes. The heredoc delimiter is quoted so that the shell does not expand `$MAINPID` while writing the file:

    cat > /usr/lib/systemd/system/patroni.service << 'EOF'
    [Unit]
    Description=patroni - a high-availability PostgreSQL
    Documentation=https://patroni.readthedocs.io/en/latest/index.html
    After=syslog.target network.target etcd.target
    Wants=network-online.target

    [Service]
    Type=simple
    User=postgres
    Group=postgres
    PermissionsStartOnly=true
    ExecStart=/usr/local/bin/patroni /app/patroni/patroni_config.yml
    ExecReload=/bin/kill -HUP $MAINPID
    LimitNOFILE=65536
    KillMode=process
    KillSignal=SIGINT
    Restart=on-abnormal
    RestartSec=30s
    TimeoutSec=0

    [Install]
    WantedBy=multi-user.target
    EOF
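Because the three Patroni configuration files differ only in the node name and IP address, they can be produced from one template instead of being edited by hand. The sketch below illustrates the idea. `NODES`, `node_settings`, and `render` are hypothetical names, and the `{name}` and `{ip}` placeholders are an assumption of this example rather than anything Patroni defines.

```python
NODES = {
    "pgtest1": "192.168.24.11",
    "pgtest2": "192.168.24.12",
    "pgtest3": "192.168.24.13",
}


def node_settings(name: str, nodes: dict = NODES) -> dict:
    """Return the per-node values that differ between the three config files."""
    ip = nodes[name]
    return {
        "name": name,
        "restapi.listen": f"{ip}:8008",
        "restapi.connect_address": f"{ip}:8008",
        "etcd.host": f"{ip}:2379",
        "postgresql.connect_address": f"{ip}:5432",
    }


def render(template: str, name: str, nodes: dict = NODES) -> str:
    """Fill the {name} and {ip} placeholders of a config template.

    Uses str.format, so the template must not contain other curly braces.
    """
    return template.format(name=name, ip=nodes[name])


if __name__ == "__main__":
    fragment = "name: {name}\nrestapi:\n  listen: {ip}:8008\n  connect_address: {ip}:8008\n"
    for node in NODES:
        print(render(fragment, node))
```

Generating the files this way keeps the shared settings in one place, so a change to a value such as `ttl` cannot end up differing between nodes.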

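#### 5.7 Start Patroni

Start Patroni on all nodes, one after another:

    # chown -R postgres:postgres /app/patroni
    # systemctl start patroni

#### 5.8 Check the Patroni status

##### 5.8.1 Check the Patroni service status

Primary node: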
    [root@pgtest1 ~]# systemctl status patroni

![](https://i-blog.csdnimg.cn/direct/f88787410eb44d428f08589ef3e8d1be.png)

Standby node 1:

    [root@pgtest2 ~]# systemctl status patroni

![](https://i-blog.csdnimg.cn/direct/51781c713ad442b69a6f7a572793404e.png)

Standby node 2:

    [root@pgtest3 ~]# systemctl status patroni

![](https://i-blog.csdnimg.cn/direct/6cf634ea65af4713a127468e81ce64a4.png)

Note: when the Patroni cluster starts, the database cluster is restarted automatically. After the restart the state returns to normal, as shown below:

![](https://i-blog.csdnimg.cn/direct/727994a87805451f852773b1248c1ab0.png)

![](https://i-blog.csdnimg.cn/direct/8f265d0ef462492d8d778c254622ae12.png)

##### 5.8.2 Check the Patroni cluster status

Check the Patroni cluster status on any node:

    [root@pgtest1 ~]# patronictl -c /app/patroni/patroni_config.yml list

![](https://i-blog.csdnimg.cn/direct/b1b6a189e8414d169a2f0e25be430fa6.png)

Check the etcd information on any node:

    [root@pgtest1 ~]# etcdctl ls /service/postgres_cluster
    [root@pgtest1 ~]# etcdctl get /service/postgres_cluster/members/pgtest1

![](https://i-blog.csdnimg.cn/direct/035ef387cdfb4deab14c132230b0a2c3.png)

#### 5.9 Enable Patroni at boot

    systemctl enable patroni

### 6. Deploy Keepalived

Install and configure on all 3 nodes.

#### 6.1 Install Keepalived

    yum -y install keepalived

#### 6.2 Edit the Keepalived configuration file

On the primary server, write the following keepalived.conf:

    cat > /etc/keepalived/keepalived.conf << EOF
    global_defs {
        router_id LVS_DEVEL00
        script_user root
        enable_script_security
    }

    vrrp_script check_haproxy {
        script "/etc/keepalived/check_haproxy.sh"
        interval 2
        weight 5
        fall 3
        rise 5
        timeout 2
    }

    vrrp_instance VI_1 {
        state MASTER
        interface ens33
        virtual_router_id 80
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 12345
        }
        virtual_ipaddress {
            192.168.24.15/24
        }
        track_script {
            check_haproxy
        }
    }
    EOF

On the two standby servers the file is written the same way and is identical except for the following values:

| Setting | Standby server 1 | Standby server 2 |
|---------|------------------|------------------|
| router_id | LVS_DEVEL01 | LVS_DEVEL02 |
| state | BACKUP | BACKUP |
| priority | 90 | 80 |

#### 6.3 Create the check_haproxy script

Run the same steps on all nodes.

Edit the script. It uses pidof, which matches the process name exactly, so the script's own command line (which also contains "haproxy") is not counted:

    [root@pgtest1 keepalived]# vi /etc/keepalived/check_haproxy.sh
    #!/bin/bash
    # Exit 0 while an haproxy process exists, 1 otherwise.
    if pidof haproxy > /dev/null; then
        exit 0
    else
        exit 1
    fi

Make the script executable:

    [root@pgtest1 ~]# chmod a+x /etc/keepalived/check_haproxy.sh

#### 6.4 Start Keepalived

    systemctl start keepalived

#### 6.5 Check the Keepalived status

##### 6.5.1 Check the Keepalived service status

Primary node:

    [root@pgtest1 keepalived]# systemctl status keepalived.service

![](https://i-blog.csdnimg.cn/direct/867ab4c56a68491b9fc9364b6f63116a.png)

Standby node 1:

    [root@pgtest2 ~]# systemctl status keepalived.service

![](https://i-blog.csdnimg.cn/direct/69d772b8c2c2450e80a7b5c267405c7f.png)

Standby node 2:

    [root@pgtest3 ~]# systemctl status keepalived.service

![](https://i-blog.csdnimg.cn/direct/87608e06b4eb43739c55bc24e7465f11.png)

##### 6.5.2 Check the VIP binding

    [root@pgtest1 keepalived]# ip addr

![](https://i-blog.csdnimg.cn/direct/6469335bb7c745528d9a753191c7e9ab.png)

##### 6.5.3 Enable Keepalived at boot

    systemctl enable keepalived

### 7. Deploy HAProxy

Install and configure on all 3 nodes.

#### 7.1 Install HAProxy

    yum -y install haproxy

#### 7.2 Edit the HAProxy configuration file

Run the same steps on all nodes:

    cat > /etc/haproxy/haproxy.cfg << EOF
    global
        log 127.0.0.1 local0 info
        log 127.0.0.1 local1 warning
        chroot /var/lib/haproxy
        pidfile /var/run/haproxy.pid
        maxconn 4000
        user haproxy
        group haproxy
        daemon
        stats socket /var/lib/haproxy/stats

    defaults
        mode tcp
        log global
        option tcplog
        option dontlognull
        option redispatch
        retries 3
        timeout queue 5s
        timeout connect 10s
        timeout client 60m
        timeout server 60m
        timeout check 15s
        maxconn 3000

    listen status
        bind *:1080
        mode http
        log global
        stats enable
        stats refresh 30s
        stats uri /
        stats realm Private\ lands
        stats auth admin:admin

    listen master
        bind *:5000
        mode tcp
        option tcplog
        balance roundrobin
        option httpchk OPTIONS /master
        http-check expect status 200
        default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
        server pgtest1 192.168.24.11:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
        server pgtest2 192.168.24.12:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
        server pgtest3 192.168.24.13:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2

    listen replicas
        bind *:5001
        mode tcp
        option tcplog
        balance roundrobin
        option httpchk OPTIONS /replica
        http-check expect status 200
        default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
        server pgtest1 192.168.24.11:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
        server pgtest2 192.168.24.12:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
        server pgtest3 192.168.24.13:5432 maxconn 1000 check port 8008 inter 5000 rise 2 fall 2
    EOF

#### 7.3 Start HAProxy

    systemctl start haproxy

#### 7.4 Check the HAProxy service

    systemctl status haproxy

#### 7.5 Enable HAProxy at boot

    systemctl enable haproxy

#### 7.8 HAProxy monitoring page

![](https://i-blog.csdnimg.cn/direct/e60c7ffbd5e7422dabc68f60c6a989e9.png)

Login address: http://192.168.24.15:1080/ (each node's own IP and port also work)

Default user and password: admin/admin

### 8. Failure demonstration

#### 8.1 Check the Patroni cluster status on any node

    # patronictl -c /app/patroni/patroni_config.yml list

![](https://i-blog.csdnimg.cn/direct/70cf3fa9b50840b5a6209f939bd84533.png)

pgtest1 is the Leader node.

From any node, connect to the database through the virtual IP 192.168.24.15 on port 5000:

    psql -h 192.168.24.15 -p 5000 -Upostgres

The session below connects to port 5432 instead, which reaches PostgreSQL on the node holding the VIP directly rather than through HAProxy:

    [postgres@pgtest3 pgdata]$ psql -h 192.168.24.15 -p5432 -Upostgres
    Password for user postgres:
    psql (14.6)
    Type "help" for help.

    postgres=# \l
                                      List of databases
       Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
    -----------+----------+----------+-------------+-------------+-----------------------
     postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
     template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
               |          |          |             |             | postgres=CTc/postgres
     template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
               |          |          |             |             | postgres=CTc/postgres
    (3 rows)

#### 8.2 Switchover from pgtest1

The current leader is pgtest1. Switch the primary role to pgtest2:

    # patronictl -c /app/patroni/patroni_config.yml switchover --master pgtest1 --candidate pgtest2

![](https://i-blog.csdnimg.cn/direct/56abf419069a4948a462ed97d4390a1e.png)

#### 8.4 Switchover from pgtest2

The current leader is pgtest2. Switch the primary role to pgtest3:

    # patronictl -c /app/patroni/patroni_config.yml switchover --master pgtest2 --candidate pgtest3

![](https://i-blog.csdnimg.cn/direct/d71babe8677440d6bd2ea276cac602f5.png)

#### 8.6 Switchover from pgtest3

The current leader is pgtest3. Switch the primary role to pgtest1:

    # patronictl -c /app/patroni/patroni_config.yml switchover --master pgtest3 --candidate pgtest1

![](https://i-blog.csdnimg.cn/direct/3f3c6dd69f904378a40c050459f2811d.png)

#### 8.7 Failover

Stop the Patroni, Keepalived, and HAProxy services on pgtest1, or shut the node down:

    [root@pgtest1 ~]# systemctl stop haproxy
    [root@pgtest1 ~]# systemctl stop keepalived
    [root@pgtest1 ~]# systemctl stop patroni

Check the cluster status. The leader has moved to node 2:

![](https://i-blog.csdnimg.cn/direct/37bcc70237bb497298979d67c5102ffa.png)

Restore the cluster services on pgtest1:

    [root@pgtest1 ~]# systemctl start patroni
    [root@pgtest1 ~]# systemctl start keepalived
    [root@pgtest1 ~]# systemctl start haproxy
    [root@pgtest1 ~]# patronictl -c /app/patroni/patroni_config.yml list

![](https://i-blog.csdnimg.cn/direct/17c7837c2b0e4eeeaf5d71648a275189.png)

pgtest1 is back and has changed from leader to standby.

#### 8.8 Recovery from data loss

Delete the data directory on the standby node pgtest3 to simulate data loss.

Cluster status before the deletion:

    [root@pgtest1 ~]# patronictl -c /app/patroni/patroni_config.yml list
    + Cluster: postgres_cluster (7501697848219523437) ---+----+-----------+
    | Member  | Host          | Role         | State     | TL | Lag in MB |
    +---------+---------------+--------------+-----------+----+-----------+
    | pgtest1 | 192.168.24.11 | Leader       | running   |  7 |           |
    | pgtest2 | 192.168.24.12 | Sync Standby | streaming |  7 |         0 |
    | pgtest3 | 192.168.24.13 | Replica      | streaming |  7 |         0 |
    +---------+---------------+--------------+-----------+----+-----------+

Delete the data directory:

    [postgres@pgtest3 postgresql]$ rm -rf pgdata
    [postgres@pgtest3 postgresql]$ ls
    arch pg14 pgdatabak scripts soft
    [postgres@pgtest3 postgresql]$ ls -ltr
    total 8
    drwx------  2 postgres postgres    6 Oct 29 12:47 scripts
    drwx------  3 postgres postgres   59 Oct 29 12:50 soft
    drwx------  6 postgres postgres   56 Oct 29 12:56 pg14
    drwx------  2 postgres postgres 4096 Oct 29 14:53 arch
    drwx------ 20 postgres postgres 4096 Oct 30 09:31 pgdatabak

The same `patronictl list` command, run right after the deletion and again once automatic recovery had started, printed output identical to the listing above.

The data directory on pgtest3 is restored automatically:

![](https://i-blog.csdnimg.cn/direct/3870e7a3b0e04938b858f8d7c2b9135d.png)

Streaming replication status:

![](https://i-blog.csdnimg.cn/direct/5b6ab820123347d7882c8340676da0d5.png)
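Checking the `patronictl list` table by eye after every test gets tedious. The sketch below parses that table and applies a simple health rule. It assumes the column layout shown in the listings in this guide, and both `parse_patronictl_list` and `cluster_is_healthy` are hypothetical helpers, as is the rule itself (exactly one running Leader, every other member streaming).

```python
SAMPLE = """\
+ Cluster: postgres_cluster (7501697848219523437) ---+----+-----------+
| Member  | Host          | Role         | State     | TL | Lag in MB |
+---------+---------------+--------------+-----------+----+-----------+
| pgtest1 | 192.168.24.11 | Leader       | running   |  7 |           |
| pgtest2 | 192.168.24.12 | Sync Standby | streaming |  7 |         0 |
| pgtest3 | 192.168.24.13 | Replica      | streaming |  7 |         0 |
+---------+---------------+--------------+-----------+----+-----------+
"""


def parse_patronictl_list(text: str) -> list:
    """Parse the ASCII table printed by `patronictl list` into dicts."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def cluster_is_healthy(members: list) -> bool:
    """True when there is exactly one running Leader and all others stream."""
    leaders = [m for m in members if m["Role"] == "Leader"]
    others = [m for m in members if m["Role"] != "Leader"]
    return (
        len(leaders) == 1
        and leaders[0]["State"] == "running"
        and all(m["State"] == "streaming" for m in others)
    )


if __name__ == "__main__":
    members = parse_patronictl_list(SAMPLE)
    print([m["Member"] for m in members if m["Role"] == "Leader"])
    print(cluster_is_healthy(members))
```

In practice the text would come from running the `patronictl ... list` command shown above and capturing its output. The parsing is kept separate here so that it can be tried without a running cluster.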
