### **Preface**

Install and deploy a highly available PostgreSQL 14 cluster with Pgpool-II (one primary, two standbys).

"All servers" means the step is executed on every node; "master" means it is executed on the primary node only.

Core components:

| Component | Role |
|-----------|------|
| PostgreSQL 14 | The database nodes that actually store the data (primary/standby) |
| Pgpool-II | Database proxy middleware providing connection pooling, read/write splitting, load balancing, primary/standby switchover, and failover |
| VIP (virtual IP) | The single connection entry point for clients; always points at the active Pgpool node |
### **1. Deployment Environment**

| No. | Hostname | IP Address | VIP | OS Version | Database Version | Pgpool Version |
|-----|----------|---------------|---------------|------------|------------------|-----------------|
| 1 | pgtest1 | 192.168.24.11 | 192.168.24.15 | CentOS 7.9 | PostgreSQL 14.18 | Pgpool-II 4.3.6 |
| 2 | pgtest2 | 192.168.24.12 | 192.168.24.15 | CentOS 7.9 | PostgreSQL 14.18 | Pgpool-II 4.3.6 |
| 3 | pgtest3 | 192.168.24.13 | 192.168.24.15 | CentOS 7.9 | PostgreSQL 14.18 | Pgpool-II 4.3.6 |

Software downloads:

PostgreSQL 14: PostgreSQL: Linux downloads (Red Hat family)

Pgpool-II: Index of /yum/rpms/4.3/redhat/rhel-7-x86_64
### **2. Installing PostgreSQL**

#### **2.1 Configure the hosts file and sudo privileges [all servers]**

--Configure hosts

[root@pgtest1 ~]# cat >> /etc/hosts << "EOF"
192.168.24.11 pgtest1
192.168.24.12 pgtest2
192.168.24.13 pgtest3
EOF

--Grant the postgres user passwordless sudo for the ip and arping commands invoked by the Pgpool-II watchdog

# cat >> /etc/sudoers << "EOF"
postgres ALL=NOPASSWD: /sbin/ip
postgres ALL=NOPASSWD: /usr/sbin/arping
EOF

#### **2.8 Modify postgresql.conf [master]**

# cat >> /pg14/pgdata/postgresql.conf << "EOF"
listen_addresses = '*'
port=5432
logging_collector = on
log_truncate_on_rotation = on
archive_mode = on
archive_command = 'cp %p /pg14/arch/%f'
max_wal_senders = 10
max_replication_slots = 10
wal_level = replica
hot_standby = on
wal_log_hints = on
password_encryption = 'md5'
EOF
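The archive_command above simply copies each completed WAL segment into /pg14/arch; PostgreSQL substitutes %p with the segment's path (relative to the data directory) and %f with its file name. A minimal sketch of that substitution, using temporary stand-in directories rather than the real cluster paths:

```shell
#!/bin/sh
# Simulate archive_command = 'cp %p /pg14/arch/%f' with temp dirs.
set -e
PGDATA=$(mktemp -d)          # stand-in for the data directory
ARCHIVEDIR=$(mktemp -d)      # stand-in for /pg14/arch

# Pretend a WAL segment just finished.
mkdir -p "$PGDATA/pg_wal"
WAL=000000010000000000000001
echo "wal-bytes" > "$PGDATA/pg_wal/$WAL"

p="$PGDATA/pg_wal/$WAL"      # what PostgreSQL substitutes for %p
f="$WAL"                     # what it substitutes for %f
cp "$p" "$ARCHIVEDIR/$f"     # the archive_command itself

ls "$ARCHIVEDIR"             # lists 000000010000000000000001
```

If the copy fails (non-zero exit), PostgreSQL retries the segment later, so the archive directory must exist and be writable by the postgres user.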
#### **2.9 Modify pg_hba.conf [master]**
# vi /pg14/pgdata/pg_hba.conf
host all all 0.0.0.0/0 md5
host replication all 0.0.0.0/0 md5
#### **2.10 Start the database [master]**
# systemctl start postgresql-14
#### **2.11 Create database accounts [master]**
-bash-4.2$ psql -c "alter user postgres password 'postgres'"
-bash-4.2$ psql -c "create user repl with replication password 'repl'"
-bash-4.2$ psql -c "create user pgpool password 'pgpool'"
-bash-4.2$ psql -c "grant pg_monitor to pgpool"
### **3. Installing Pgpool-II**

#### **3.1 Upload the Pgpool installation files [all servers]**

[root@pgtest1 ~]# ls -l pgpool*.rpm
-rw-r--r-- 1 root root 1789236 Oct 20 09:47 pgpool-II-pg14-4.3.6-1pgdg.rhel7.x86_64.rpm
-rw-r--r-- 1 root root 2711052 Oct 20 09:47 pgpool-II-pg14-debuginfo-4.3.6-1pgdg.rhel7.x86_64.rpm
-rw-r--r-- 1 root root 18980 Oct 20 09:47 pgpool-II-pg14-devel-4.3.6-1pgdg.rhel7.x86_64.rpm
-rw-r--r-- 1 root root 49936 Oct 20 09:47 pgpool-II-pg14-extensions-4.3.6-1pgdg.rhel7.x86_64.rpm
#### **3.2 Install Pgpool with yum [all servers]**

[root@pgtest1 ~]# yum localinstall -y pgpool*.rpm

#### **3.3 Disable automatic startup of the pgpool service [all servers]**

[root@pgtest1 ~]# systemctl disable pgpool.service
### **4. Configure SSH Trust [all servers]**

##As the root user

mkdir -p ~/.ssh
cd ~/.ssh
ssh-keygen -t rsa -f id_rsa_pgpool
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest1
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest2
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest3

##As the postgres user

cd ~/.ssh
ssh-keygen -t rsa -f id_rsa_pgpool
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest1
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest2
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest3

##Verify with ssh postgres@<host> -i ~/.ssh/id_rsa_pgpool that no password is requested

[root@pgtest1 .ssh]# ssh postgres@pgtest1 -i ~/.ssh/id_rsa_pgpool
Last login: Fri Oct 20 10:17:05 2023 from 192.168.24.11
-bash-4.2$
[root@pgtest1 .ssh]# ssh postgres@pgtest2 -i ~/.ssh/id_rsa_pgpool
Last login: Fri Oct 20 10:16:59 2023 from 192.168.24.11
-bash-4.2$ exit
[root@pgtest1 .ssh]# ssh postgres@pgtest3 -i ~/.ssh/id_rsa_pgpool
Last login: Fri Oct 20 10:16:50 2023 from pgtest1
-bash-4.2$

**If a password is still requested, run restorecon -R -v ~/.ssh for each of the directories above**
### **5. Configure Password Files**

#### **5.1 The .pgpass file [all servers]**

#cat > /var/lib/pgsql/.pgpass <<"EOF"
192.168.24.11:5432:replication:repl:repl
192.168.24.12:5432:replication:repl:repl
192.168.24.13:5432:replication:repl:repl
192.168.24.15:9999:postgres:pgpool:pgpool
192.168.24.15:9999:postgres:postgres:postgres
192.168.24.11:5432:postgres:postgres:postgres
192.168.24.12:5432:postgres:postgres:postgres
192.168.24.13:5432:postgres:postgres:postgres
EOF
#chown postgres:postgres /var/lib/pgsql/.pgpass
#chmod 600 /var/lib/pgsql/.pgpass
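libpq only honors a .pgpass file whose mode is 0600, and each line must have exactly five colon-separated fields (host:port:database:user:password). A small sanity-check sketch, run here against a temporary copy rather than the real /var/lib/pgsql/.pgpass, assuming GNU coreutils stat:

```shell
#!/bin/sh
# Validate a .pgpass file: mode must be 0600 and every line must have
# exactly 5 colon-separated fields (host:port:database:user:password).
set -e
PGPASS=$(mktemp)   # stand-in for /var/lib/pgsql/.pgpass
cat > "$PGPASS" << "EOF"
192.168.24.11:5432:replication:repl:repl
192.168.24.15:9999:postgres:pgpool:pgpool
EOF
chmod 600 "$PGPASS"

# libpq silently ignores the file if group/other have any access.
mode=$(stat -c '%a' "$PGPASS")
[ "$mode" = "600" ] || { echo "bad mode $mode"; exit 1; }

# Check the field count on every non-comment line.
bad=$(awk -F: '!/^#/ && NF != 5' "$PGPASS" | wc -l)
[ "$bad" -eq 0 ] || { echo "$bad malformed line(s)"; exit 1; }
echo "pgpass OK"
```

A malformed or world-readable .pgpass fails silently (psql simply prompts for a password), so a check like this saves debugging time.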
#### **5.2 The .pcppass file [all servers]**

#echo 'localhost:9898:pgpool:pgpool' > /pg14/pgdata/.pcppass
#chmod 600 /pg14/pgdata/.pcppass
#chown postgres:postgres /pg14/pgdata/.pcppass
#echo 'pgpool:'`pg_md5 -u=pgpool pgpool` >> /etc/pgpool-II/pcp.conf
### **6. Create pgpool_node_id**

--Run on node 192.168.24.11:

# echo "0" > /etc/pgpool-II/pgpool_node_id

--Run on node 192.168.24.12:

# echo "1" > /etc/pgpool-II/pgpool_node_id

--Run on node 192.168.24.13:

# echo "2" > /etc/pgpool-II/pgpool_node_id
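Instead of typing a different echo on each node, the id can be derived from the hostname. A sketch assuming the hostnames from the table above (pgtest1 through pgtest3); the function name is illustrative:

```shell
#!/bin/sh
# Map each node's hostname to its pgpool_node_id value so the same
# snippet can run unchanged on all three servers.
node_id_for() {
    case "$1" in
        pgtest1) echo 0 ;;
        pgtest2) echo 1 ;;
        pgtest3) echo 2 ;;
        *) echo "unknown host $1" >&2; return 1 ;;
    esac
}

# On a real node you would run:
#   node_id_for "$(hostname -s)" > /etc/pgpool-II/pgpool_node_id
node_id_for pgtest2    # prints 1
```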
### **7. Pgpool-II Parameter Configuration**

#### **7.1 The pgpool.conf file [all servers]**

#cat > /etc/pgpool-II/pgpool.conf <<"EOF"
backend_clustering_mode = 'streaming_replication'
# - pgpool Connection Settings -
listen_addresses = '*'
port=9999
socket_dir = '/pg14/pgdata/'
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/pg14/pgdata/'
# - Streaming Replication Check
sr_check_user = 'pgpool'
sr_check_password = 'pgpool'
# - Health Check
health_check_period = 5
health_check_timeout = 30
health_check_user = 'pgpool'
health_check_password = 'pgpool'
health_check_max_retries = 3
search_primary_node_timeout = 10min
# - Backend Connection Settings -
backend_hostname0 = '192.168.24.11'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/pg14/pgdata'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_application_name0 = 'pgtest1'
backend_hostname1 = '192.168.24.12'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/pg14/pgdata'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_application_name1 = 'pgtest2'
backend_hostname2 = '192.168.24.13'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/pg14/pgdata'
backend_flag2 = 'ALLOW_TO_FAILOVER'
backend_application_name2 = 'pgtest3'
# - Failover configuration
failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
follow_primary_command = '/etc/pgpool-II/follow_primary.sh %d %h %p %D %m %H %M %P %r %R'
# - Online Recovery Configurations
recovery_user = 'postgres'
recovery_password = 'postgres'
recovery_1st_stage_command = 'recovery_1st_stage'
# - Client Authentication Configuration
enable_pool_hba = on
pool_passwd = '/etc/pgpool-II/pool_passwd'
# - Watchdog Configuration
use_watchdog = on
delegate_IP = '192.168.24.15'
if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev ens33 label ens33:1'
if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev ens33'
arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I ens33'
hostname0 = '192.168.24.11'
wd_port0 = 9000
pgpool_port0 = 9999
hostname1 = '192.168.24.12'
wd_port1 = 9000
pgpool_port1 = 9999
hostname2 = '192.168.24.13'
wd_port2 = 9000
pgpool_port2 = 9999
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
heartbeat_hostname0 = '192.168.24.11'
heartbeat_port0 = 9694
heartbeat_device0 = ''
heartbeat_hostname1 = '192.168.24.12'
heartbeat_port1 = 9694
heartbeat_device1 = ''
heartbeat_hostname2 = '192.168.24.13'
heartbeat_port2 = 9694
heartbeat_device2 = ''
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
wd_escalation_command = '/etc/pgpool-II/escalation.sh'
# - Where to log -
log_destination = 'stderr'
logging_collector = on
log_directory = '/var/log/pgpool-II'
log_filename = 'pgpool-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 10MB
pid_file_name = '/var/run/postgresql/pgpool.pid'
#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
load_balance_mode = on
EOF
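In pgpool.conf every parameter belonging to backend N carries the numeric suffix N (backend_hostname0, backend_port0, and so on). A quick sketch that lists the configured backends, parsing a temporary conf fragment rather than the live /etc/pgpool-II/pgpool.conf:

```shell
#!/bin/sh
# List the backends defined in a pgpool.conf-style fragment: every
# parameter of backend N shares the numeric suffix N.
set -e
CONF=$(mktemp)    # stand-in for /etc/pgpool-II/pgpool.conf
cat > "$CONF" << "EOF"
backend_hostname0 = '192.168.24.11'
backend_port0 = 5432
backend_hostname1 = '192.168.24.12'
backend_port1 = 5432
backend_hostname2 = '192.168.24.13'
backend_port2 = 5432
EOF

# Print "index host" for each backend_hostnameN entry.
backends=$(awk -F"'" '/^backend_hostname/ {
    split($1, a, "backend_hostname"); sub(/ .*/, "", a[2]);
    print a[2], $2
}' "$CONF")
echo "$backends"
```

The same suffix convention applies to the hostnameN/wd_portN watchdog entries, which is why pgpool_node_id on each node must match its index here.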
#### **7.2 The failover.sh and follow_primary.sh scripts [all servers]**

--Create the scripts from the templates

#cp -p /etc/pgpool-II/failover.sh{.sample,}
#cp -p /etc/pgpool-II/follow_primary.sh{.sample,}
#chown postgres:postgres /etc/pgpool-II/{failover.sh,follow_primary.sh}
#chmod +x /etc/pgpool-II/{failover.sh,follow_primary.sh}

--Edit follow_primary.sh

#Adjust the parameters for your environment; in this test setup only ARCHIVEDIR needs to change

ARCHIVEDIR=/pg14/arch
#### **7.3 Pgpool-II online recovery configuration [master]**

Create the recovery_1st_stage and pgpool_remote_start scripts from the templates and place them in the PostgreSQL cluster data directory.

#cp -p /etc/pgpool-II/recovery_1st_stage.sample /pg14/pgdata/recovery_1st_stage
#cp -p /etc/pgpool-II/pgpool_remote_start.sample /pg14/pgdata/pgpool_remote_start
#chown postgres:postgres /pg14/pgdata/{recovery_1st_stage,pgpool_remote_start}
#chmod +x /pg14/pgdata/{recovery_1st_stage,pgpool_remote_start}

--Edit the recovery_1st_stage script; adjust for your environment.

ARCHIVEDIR=/pg14/arch

#Check that PGHOME is set correctly in recovery_1st_stage and pgpool_remote_start, and that PCP_USER in follow_primary.sh is pgpool.

--Install the pgpool_recovery extension

#Online recovery relies on the pgpool_recovery, pgpool_remote_start, and pgpool_switch_xlog functions, so install the pgpool_recovery extension in template1 on the primary PostgreSQL server (pgtest1); databases created afterwards will then contain the extension automatically.

#su - postgres
$psql template1 -c "CREATE EXTENSION pgpool_recovery"
$psql postgres -c "CREATE EXTENSION pgpool_recovery"
#### **7.4 Modify the pool_hba.conf file [all servers]**

#echo "host all all 0.0.0.0/0 md5" >> /etc/pgpool-II/pool_hba.conf
#### **7.5 Configure pool_passwd [all servers]**

--Check the pool_passwd file

#cat /etc/pgpool-II/pool_passwd

--Encrypt the user passwords with pg_md5

#pg_md5 --md5auth --username=pgpool "pgpool"
#pg_md5 --md5auth --username=postgres "postgres"
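pg_md5 --md5auth writes a line of the form user:md5&lt;hash&gt; into pool_passwd, where the hash is the MD5 of the password concatenated with the username (the same scheme PostgreSQL uses for md5-encrypted passwords in pg_authid). A sketch reproducing that hash with coreutils md5sum:

```shell
#!/bin/sh
# Reproduce the hash pg_md5 --md5auth stores in pool_passwd:
# "md5" + md5(password || username).
set -e
user=pgpool
password=pgpool
hash="md5$(printf '%s%s' "$password" "$user" | md5sum | cut -d' ' -f1)"
echo "$user:$hash"    # the pool_passwd entry format
```

Because the username is part of the digest, the hash in pool_passwd must be regenerated whenever the database password changes, or md5 authentication through Pgpool will fail.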
#### **7.6 Watchdog configuration [all servers]**

--Create the escalation script

#cat > /etc/pgpool-II/escalation.sh <<"EOF"
#!/bin/bash
# This script is run by wd_escalation_command to bring down the virtual IP on other pgpool nodes
# before bringing up the virtual IP on the new active pgpool node.
set -o xtrace
PGPOOLS=(192.168.24.11 192.168.24.12 192.168.24.13)
VIP=192.168.24.15
DEVICE=ens33
for pgpool in "${PGPOOLS[@]}"; do
[ "$HOSTNAME" = "$pgpool" ] && continue
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@$pgpool -i ~/.ssh/id_rsa_pgpool "
/usr/bin/sudo /sbin/ip addr del $VIP/24 dev $DEVICE
"
done
exit 0
EOF
--Change the file ownership

#chown postgres:postgres /etc/pgpool-II/escalation.sh

--Add execute permission to all the script files

#chmod +x /etc/pgpool-II/*.sh

Pgpool online recovery:

To perform online recovery with Pgpool-II, pgpool.conf specifies the PostgreSQL user and the online recovery command recovery_1st_stage. Because online recovery requires superuser privileges in PostgreSQL, recovery_user is set to the postgres user. The recovery_1st_stage and pgpool_remote_start scripts are then created in the database cluster directory on the PostgreSQL primary (pgtest1) and made executable.
### **8. Configure the PostgreSQL Primary/Standby Cluster**

#### **8.1 Start Pgpool-II [all servers]**

--Create the status file and log directory

#touch /tmp/pgpool_status
#chown -R postgres. /tmp/pgpool_status
#mkdir -p /var/log/pgpool-II
#chown -R postgres. /var/log/pgpool-II

--Start the service on each node in turn

#systemctl start pgpool.service

--Check the service status

##Node 1

[root@pgtest1 pgpool-II]# systemctl status pgpool.service
● pgpool.service - Pgpool-II
Loaded: loaded (/usr/lib/systemd/system/pgpool.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2025-05-12 13:04:54 CST; 1min 1s ago
Main PID: 17916 (pgpool)
Tasks: 45
CGroup: /system.slice/pgpool.service
├─17916 /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n
├─17919 pgpool: PgpoolLogger
├─17923 pgpool: watchdog
├─17933 pgpool: lifecheck
├─17934 pgpool: heartbeat receiver
├─17935 pgpool: heartbeat sender
├─17936 pgpool: heartbeat receiver
├─17937 pgpool: wait for connection request
├─17938 pgpool: wait for connection request
├─17939 pgpool: heartbeat sender
├─17940 pgpool: wait for connection request
├─17941 pgpool: wait for connection request
├─17942 pgpool: wait for connection request
├─17943 pgpool: wait for connection request
├─17944 pgpool: wait for connection request
├─17945 pgpool: wait for connection request
├─17946 pgpool: wait for connection request
├─17947 pgpool: wait for connection request
├─17948 pgpool: wait for connection request
├─17949 pgpool: wait for connection request
├─17950 pgpool: wait for connection request
├─17951 pgpool: wait for connection request
├─17952 pgpool: wait for connection request
├─17953 pgpool: wait for connection request
├─17954 pgpool: wait for connection request
├─17955 pgpool: wait for connection request
├─17956 pgpool: wait for connection request
├─17957 pgpool: wait for connection request
├─17958 pgpool: wait for connection request
├─17959 pgpool: wait for connection request
├─17960 pgpool: wait for connection request
├─17961 pgpool: wait for connection request
├─17962 pgpool: wait for connection request
├─17963 pgpool: wait for connection request
├─17964 pgpool: wait for connection request
├─17965 pgpool: wait for connection request
├─17966 pgpool: wait for connection request
├─17967 pgpool: wait for connection request
├─17968 pgpool: wait for connection request
├─17969 pgpool: wait for connection request
├─17973 pgpool: health check process(0)
├─17975 pgpool: health check process(1)
├─17976 pgpool: health check process(2)
├─18096 pgpool: PCP: wait for connection request
└─18097 pgpool: worker process
May 12 13:04:54 pgtest1 systemd[1]: Started Pgpool-II.
May 12 13:04:54 pgtest1 pgpool[17916]: 2025-05-12 13:04:54.188: main pid 17916: LOG: redirecting log output to logging collector process
May 12 13:04:54 pgtest1 pgpool[17916]: 2025-05-12 13:04:54.188: main pid 17916: HINT: Future log output will appear in directory "/var/log/pgpool-II".
May 12 13:05:06 pgtest1 sudo[18010]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/ip addr add 192.168.24.15/24 dev ens33 label ens33:1
May 12 13:05:07 pgtest1 sudo[18013]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/sbin/arping -U 192.168.24.15 -w 1 -I ens33
[root@pgtest1 pgpool-II]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:b0:62:cd brd ff:ff:ff:ff:ff:ff
inet 192.168.24.11/24 brd 192.168.24.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.24.15/24 scope global secondary ens33:1
valid_lft forever preferred_lft forever
inet6 fe80::a2cc:510c:af40:af87/64 scope link noprefixroute
valid_lft forever preferred_lft forever
##Node 2

[root@pgtest2 pgpool-II]# systemctl status pgpool.service
● pgpool.service - Pgpool-II
Loaded: loaded (/usr/lib/systemd/system/pgpool.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2025-05-12 13:04:59 CST; 1min 48s ago
Main PID: 17535 (pgpool)
Tasks: 45
CGroup: /system.slice/pgpool.service
├─17535 /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n
├─17538 pgpool: PgpoolLogger
├─17542 pgpool: watchdog
├─17548 pgpool: lifecheck
├─17549 pgpool: heartbeat receiver
├─17550 pgpool: heartbeat sender
├─17551 pgpool: heartbeat receiver
├─17552 pgpool: wait for connection request
├─17553 pgpool: wait for connection request
├─17554 pgpool: heartbeat sender
├─17555 pgpool: wait for connection request
├─17556 pgpool: wait for connection request
├─17558 pgpool: wait for connection request
├─17559 pgpool: wait for connection request
├─17560 pgpool: wait for connection request
├─17561 pgpool: wait for connection request
├─17562 pgpool: wait for connection request
├─17563 pgpool: wait for connection request
├─17564 pgpool: wait for connection request
├─17565 pgpool: wait for connection request
├─17566 pgpool: wait for connection request
├─17567 pgpool: wait for connection request
├─17568 pgpool: wait for connection request
├─17569 pgpool: wait for connection request
├─17570 pgpool: wait for connection request
├─17571 pgpool: wait for connection request
├─17572 pgpool: wait for connection request
├─17573 pgpool: wait for connection request
├─17574 pgpool: wait for connection request
├─17575 pgpool: wait for connection request
├─17576 pgpool: wait for connection request
├─17577 pgpool: wait for connection request
├─17578 pgpool: wait for connection request
├─17579 pgpool: wait for connection request
├─17580 pgpool: wait for connection request
├─17581 pgpool: wait for connection request
├─17582 pgpool: wait for connection request
├─17583 pgpool: wait for connection request
├─17584 pgpool: wait for connection request
├─17585 pgpool: wait for connection request
├─17586 pgpool: PCP: wait for connection request
├─17588 pgpool: health check process(0)
├─17589 pgpool: health check process(1)
├─17590 pgpool: health check process(2)
└─17610 pgpool: worker process
May 12 13:04:59 pgtest2 systemd[1]: Started Pgpool-II.
May 12 13:04:59 pgtest2 pgpool[17535]: 2025-05-12 13:04:59.645: main pid 17535: LOG: redirecting log output to logging collector process
May 12 13:04:59 pgtest2 pgpool[17535]: 2025-05-12 13:04:59.645: main pid 17535: HINT: Future log output will appear in directory "/var/log/pgpool-II".
[root@pgtest2 pgpool-II]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:c8:ee:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.24.12/24 brd 192.168.24.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::15bb:4008:354c:4f0f/64 scope link noprefixroute
valid_lft forever preferred_lft forever
##Node 3

[root@pgtest3 pgpool-II]# systemctl status pgpool.service
● pgpool.service - Pgpool-II
Loaded: loaded (/usr/lib/systemd/system/pgpool.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2025-05-12 13:05:02 CST; 2min 28s ago
Main PID: 12784 (pgpool)
Tasks: 45
CGroup: /system.slice/pgpool.service
├─12784 /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n
├─12786 pgpool: PgpoolLogger
├─12791 pgpool: watchdog
├─12795 pgpool: lifecheck
├─12796 pgpool: heartbeat receiver
├─12797 pgpool: heartbeat sender
├─12798 pgpool: heartbeat receiver
├─12799 pgpool: heartbeat sender
├─12800 pgpool: wait for connection request
├─12801 pgpool: wait for connection request
├─12802 pgpool: wait for connection request
├─12803 pgpool: wait for connection request
├─12804 pgpool: wait for connection request
├─12805 pgpool: wait for connection request
├─12806 pgpool: wait for connection request
├─12807 pgpool: wait for connection request
├─12808 pgpool: wait for connection request
├─12809 pgpool: wait for connection request
├─12810 pgpool: wait for connection request
├─12811 pgpool: wait for connection request
├─12812 pgpool: wait for connection request
├─12813 pgpool: wait for connection request
├─12814 pgpool: wait for connection request
├─12815 pgpool: wait for connection request
├─12816 pgpool: wait for connection request
├─12817 pgpool: wait for connection request
├─12818 pgpool: wait for connection request
├─12819 pgpool: wait for connection request
├─12820 pgpool: wait for connection request
├─12821 pgpool: wait for connection request
├─12822 pgpool: wait for connection request
├─12823 pgpool: wait for connection request
├─12824 pgpool: wait for connection request
├─12825 pgpool: wait for connection request
├─12826 pgpool: wait for connection request
├─12827 pgpool: wait for connection request
├─12828 pgpool: wait for connection request
├─12829 pgpool: wait for connection request
├─12830 pgpool: wait for connection request
├─12831 pgpool: wait for connection request
├─12832 pgpool: PCP: wait for connection request
├─12834 pgpool: health check process(0)
├─12835 pgpool: health check process(1)
├─12836 pgpool: health check process(2)
└─12863 pgpool: worker process
May 12 13:05:02 pgtest3 systemd[1]: Started Pgpool-II.
May 12 13:05:02 pgtest3 pgpool[12784]: 2025-05-12 13:05:02.949: main pid 12784: LOG: redirecting log output to logging collector process
May 12 13:05:02 pgtest3 pgpool[12784]: 2025-05-12 13:05:02.949: main pid 12784: HINT: Future log output will appear in directory "/var/log/pgpool-II".
[root@pgtest3 pgpool-II]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:50:60:1a brd ff:ff:ff:ff:ff:ff
inet 192.168.24.13/24 brd 192.168.24.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::6d33:e581:1ad:cc2e/64 scope link noprefixroute
valid_lft forever preferred_lft forever
#### **8.2 Configure database replication with Pgpool's online recovery**

Make sure the recovery_1st_stage and pgpool_remote_start scripts used by the pcp_recovery_node command are in the database cluster directory on the PostgreSQL primary (pgtest1).

Run on the primary node:

[root@pgtest1 pgdata]# su - postgres
Last login: Mon May 12 12:59:26 CST 2025 on pts/0
-bash-4.2$ pcp_recovery_node -h 192.168.24.15 -p 9898 -U pgpool -n 1
Password:
pcp_recovery_node -- Command Successful
-bash-4.2$ pcp_recovery_node -h 192.168.24.15 -p 9898 -U pgpool -n 2
Password:
pcp_recovery_node -- Command Successful
Verify that node 1 (pgtest2) and node 2 (pgtest3) have started as PostgreSQL standby servers.

**Method 1:**

-bash-4.2$ ps -ajxf|grep postgres

**Method 2:**

-bash-4.2$ psql -h 192.168.24.15 -p 9999 -U pgpool postgres -c "show pool_nodes"
--Check the postgresql service status

[root@pgtest2 pgpool-II]# systemctl status postgresql-14.service
● postgresql-14.service - PostgreSQL 14 database server
Loaded: loaded (/usr/lib/systemd/system/postgresql-14.service; enabled; vendor preset: disabled)
Active: inactive (dead)
Docs: https://www.postgresql.org/docs/14/static/
[root@pgtest3 pgpool-II]# systemctl status postgresql-14.service
● postgresql-14.service - PostgreSQL 14 database server
Loaded: loaded (/usr/lib/systemd/system/postgresql-14.service; enabled; vendor preset: disabled)
Active: inactive (dead)
Docs: https://www.postgresql.org/docs/14/static/
#systemctl status shows a running service only on the primary node (the standbys were started by the recovery scripts rather than systemd), but ps -ajxf|grep postgres shows that the database is running normally on every node.
### **9. Cluster Management**

#### **9.1 The Pgpool cluster [all servers]**

##### **9.1.1 View cluster configuration**

[root@pgtest1 pgpool-II]# pcp_pool_status -h 192.168.24.15 -p 9898 -U pgpool -v

##### **9.1.2 View cluster node details**

# -h is the cluster VIP, -p the PCP port (default 9898), -U the PCP user, -v verbose output

[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
##### **9.1.3 Count the nodes**

[root@pgtest1 pgpool-II]# pcp_node_count -h 192.168.24.15 -p 9898 -U pgpool
Password:
3
##### **9.1.4 View a specific node**

[root@pgtest1 pgpool-II]# pcp_node_info -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v
Password:
##### **9.1.5 Attach a node to the cluster**

# -n is the node index (starting from 0)

#pcp_attach_node -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v

##### **9.1.6 Detach a node from the cluster**

#pcp_detach_node -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v

##### **9.1.7 Promote a standby node to primary**

#pcp_promote_node -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v

##### **9.1.8 Recover an offline node back into the cluster**

#pcp_recovery_node -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v
#### **9.2 The PostgreSQL cluster [all servers]**

##### **9.2.1 Connect to the cluster**

# psql -h 192.168.24.15 -p 9999 -U postgres postgres

##### **9.2.2 View cluster status**

# the -h 192.168.24.15 option connects through the VIP
# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"
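The show pool_nodes output is easy to screen in a script. A sketch counting the backends whose status column reads "up", run over an illustrative captured sample rather than a live query (column values below are examples, not real output):

```shell
#!/bin/sh
# Count how many backends report status "up" in a captured
# "show pool_nodes" result (the sample rows are illustrative).
set -e
out=$(cat << "EOF"
 node_id |   hostname    | port | status | lb_weight
---------+---------------+------+--------+----------
 0       | 192.168.24.11 | 5432 | up     | 0.333333
 1       | 192.168.24.12 | 5432 | up     | 0.333333
 2       | 192.168.24.13 | 5432 | down   | 0.333333
EOF
)
# Skip the header and separator rows, strip padding, count "up".
up=$(echo "$out" | awk -F'|' 'NR > 2 { gsub(/ /, "", $4); if ($4 == "up") n++ } END { print n+0 }')
echo "backends up: $up"
```

A check like this can feed a monitoring alert when the count drops below the expected three.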
### **10. Switchover Tests**

#### **10.1 Pgpool leader switchover**

| **Tool or command** | **Role shown** | **Meaning** |
|---------------------------|----------------------------------|-------------------------------------------------------------------|
| pcp_watchdog_info | **Leader of the Pgpool-II nodes** | Which Pgpool node is the Watchdog leader and holds the virtual IP (VIP); unrelated to the database roles |
| psql -c "show pool_nodes" | **Roles of the PostgreSQL nodes** | The database roles (primary, standby) as seen by the current Pgpool node; used for load balancing and failover |

Check the current cluster state:
[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
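The pcp_watchdog_info output can likewise be parsed: the first line summarizes the cluster and names the leader, and each detail row carries the node's watchdog state (LEADER, STANDBY, SHUTDOWN). A sketch extracting the leader host from a captured sample (copied from the run above):

```shell
#!/bin/sh
# Pull the watchdog leader out of captured pcp_watchdog_info output.
set -e
out=$(cat << "EOF"
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
EOF
)
# The detail row containing " LEADER " names the node holding the VIP;
# its third field is the hostname.
leader=$(echo "$out" | awk '/ LEADER /{ print $3 }')
echo "watchdog leader: $leader"
```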
The current Pgpool watchdog leader is pgtest1.

--Stop and start the pgpool service on pgtest1

1) Stop the pgpool service on pgtest1

[root@pgtest1 pgpool-II]# systemctl stop pgpool.service
[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.12:9999 Linux pgtest2 192.168.24.12
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 4 LEADER 0 MEMBER
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 10 SHUTDOWN 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
The Watchdog leader is now pgtest2 (it coordinates the virtual IP).

Check the IP addresses on pgtest2:

[root@pgtest2 pgpool-II]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:c8:ee:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.24.12/24 brd 192.168.24.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.24.15/24 scope global secondary ens33:1
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fec8:ee12/64 scope link
valid_lft forever preferred_lft forever
The VIP has floated over to pgtest2 as expected.

2) Start the pgpool service on pgtest1

[root@pgtest1 pgpool-II]# systemctl start pgpool.service
[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.12:9999 Linux pgtest2 192.168.24.12
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 4 LEADER 0 MEMBER
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
#After restarting its pgpool service, pgtest1 rejoined the cluster as a Watchdog standby node.

[root@pgtest1 pgpool-II]# psql -h 192.168.24.15 -p 9999 -U pgpool postgres -c "show pool_nodes"

#Checking the database cluster status: pgtest1 is still the primary node.

--Stop and start the pgpool service on pgtest2

[root@pgtest2 pgpool-II]# systemctl stop pgpool.service
[root@pgtest2 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
\[root@pgtest2 pgpool-II\]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.13:9999 Linux pgtest3 192.168.24.13
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 4 LEADER 0 MEMBER
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 7 STANDBY 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 10 SHUTDOWN 0 MEMBER
[root@pgtest2 pgpool-II]# systemctl start pgpool.service
[root@pgtest2 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.13:9999 Linux pgtest3 192.168.24.13
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 4 LEADER 0 MEMBER
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 7 STANDBY 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
#The watchdog leader has now switched to pgtest3

--Stop and start the pgpool service on pgtest3

[root@pgtest3 pgpool-II]# systemctl stop pgpool.service
[root@pgtest3 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 10 SHUTDOWN 0 MEMBER
[root@pgtest3 pgpool-II]# systemctl start pgpool.service
[root@pgtest3 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
#The watchdog leader has now switched back to pgtest1

#### **10.2 PostgreSQL primary/standby failover**

##### **10.2.1 Stop the database on the primary node**

[root@pgtest1 pgdata]# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"

[root@pgtest1 pgpool-II]# systemctl stop postgresql-14
[root@pgtest1 pgpool-II]# ps -ajxf|grep postgres
1680 25931 25930 1680 pts/0 25930 S+ 0 0:00 | \_ grep --color=auto postgres
##### **10.2.2 Check the database cluster status**

[root@pgtest2 pgpool-II]# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"

The primary database is down, and node 2 has taken over as the new primary of the database cluster.
##### **10.2.3 Connect to the database cluster**

[root@pgtest2 pgpool-II]# psql -h 192.168.24.15 -p 9999 -U postgres postgres
Password for user postgres:
psql (14.18)
Type "help" for help.
postgres=# create table t1 (id int);
CREATE TABLE
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# select * from t1;
id
----
1
(1 row)
The cluster connection works; reads and writes succeed.

##### **10.2.4 Check the pgpool cluster status**

[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
The Pgpool watchdog leader is still pgtest1.

##### **10.2.5 Recover the old primary**

Run pcp_recovery_node to resynchronize pgtest1 from the new primary, pgtest2:

[root@pgtest2 pgpool-II]# pcp_recovery_node -h 192.168.24.15 -p 9898 -U pgpool -n 0
Password:
pcp_recovery_node -- Command Successful
This synchronizes the data from the new database primary, pgtest2, back to the database on pgtest1.

##### **10.2.6 Check the database cluster status**

[root@pgtest2 pgpool-II]# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"
The pgtest1 database has been recovered.

##### **10.2.7 Check the database status via ps**

[root@pgtest2 pgpool-II]# ps -ajxf|grep postgres

pgtest2 is now the primary database; pgtest1 and pgtest3 are standby nodes.
### **11. Troubleshooting**

#### **11.1 The pgpool_status state file**

During deployment, pgpool failed to start with the error "/tmp/pgpool_status file not exists". Fix:

1. Create the status file

#touch /tmp/pgpool_status
#chown -R postgres. /tmp/pgpool_status

2. Restart the pgpool service

#systemctl restart pgpool.service
#systemctl status pgpool.service
#### **11.2 The pgpool log directory**

During deployment, pgpool failed to start with the error could not open log file "/var/log/pgpool-II/pgpool-2023-10-20_112258.log". After an rpm installation, the configuration defaults the pgpool log directory to /var/log/pgpool-II/, but the directory is never created on the system.

Fix:

1. Create the directory

#mkdir -p /var/log/pgpool-II
#chown -R postgres. /var/log/pgpool-II

2. Restart the pgpool service

#systemctl restart pgpool.service
#systemctl status pgpool.service
#### **11.3 Cluster status wrongly shows a node as down**

During pgpool stop/start testing, an individual node's status can show "down" even though the node is actually up. Checking each node shows that the pgpool_status state file on one of them records the node as down.

Fix: stop Pgpool-II on every node, delete the state file on every node, then restart Pgpool-II.

1. Stop the pgpool service

[root@pgtest1 ~]# systemctl stop pgpool.service
[root@pgtest2 ~]# systemctl stop pgpool.service
[root@pgtest3 ~]# systemctl stop pgpool.service

2. Remove the pgpool_status file

[root@pgtest1 ~]# mv /tmp/pgpool_status /tmp/pgpool_status.bak
[root@pgtest2 ~]# mv /tmp/pgpool_status /tmp/pgpool_status.bak
[root@pgtest3 ~]# mv /tmp/pgpool_status /tmp/pgpool_status.bak

3. Start the pgpool service

[root@pgtest1 ~]# systemctl start pgpool.service
[root@pgtest2 ~]# systemctl start pgpool.service
[root@pgtest3 ~]# systemctl start pgpool.service

4. Check the database cluster status

# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"
References:
[安装部署postgresql-15 高可用(pgpool)_postgresql15 + pgpool 构建容灾高可用集群-CSDN博客](https://blog.csdn.net/yang_z_1/article/details/129178092 "安装部署postgresql-15 高可用(pgpool)_postgresql15 + pgpool 构建容灾高可用集群-CSDN博客")
[PGPool-II 4.2.1+PostgreSQL 13高可用_pgpool2无法创建vip-CSDN博客](https://blog.csdn.net/along1976/article/details/113719487 "PGPool-II 4.2.1+PostgreSQL 13高可用_pgpool2无法创建vip-CSDN博客")
[pgpool_II节点状态问题(pgpool_status) - slnngk - 博客园](https://www.cnblogs.com/hxlasky/p/16994059.html "pgpool_II节点状态问题(pgpool_status) - slnngk - 博客园")