# PostgreSQL 14 High Availability (Pgpool-II) Deployment

### **Preface**

This guide installs and deploys a PostgreSQL 14 high-availability (Pgpool-II) cluster with one primary and two standbys.

Steps marked [all servers] are executed on every node; steps marked [master] are executed on the primary node only.

#### **Core Components**

| Component | Role |
|-----------|------|
| PostgreSQL 14 | The database nodes that actually store the data (primary/standby architecture) |
| Pgpool-II | Database proxy middleware providing connection pooling, read/write splitting, load balancing, primary/standby switchover, and failover |
| VIP (virtual IP) | The single client entry point; always follows the Pgpool-II leader node |
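Clients never connect to an individual database node directly; they use the VIP and Pgpool-II's listener port. As a quick illustration (VIP and port taken from the deployment table and pgpool.conf below), a connection check through the proxy looks like this:

```bash
# Connect through the VIP (192.168.24.15) to Pgpool-II's port 9999;
# Pgpool-II routes the session to the current primary or a standby.
psql -h 192.168.24.15 -p 9999 -U pgpool postgres -c "show pool_nodes"
```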

### **1. Deployment Environment**

| No. | Hostname | IP Address | VIP | OS Version | Database Version | Software Version |
|-----|----------|---------------|---------------|------------|------------------|------------------|
| 1 | pgtest1 | 192.168.24.11 | 192.168.24.15 | CentOS 7.9 | PostgreSQL 14.18 | Pgpool-II 4.3.6 |
| 2 | pgtest2 | 192.168.24.12 | 192.168.24.15 | CentOS 7.9 | PostgreSQL 14.18 | Pgpool-II 4.3.6 |
| 3 | pgtest3 | 192.168.24.13 | 192.168.24.15 | CentOS 7.9 | PostgreSQL 14.18 | Pgpool-II 4.3.6 |
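Before going further it is worth confirming that every node resolves the cluster hostnames consistently; a minimal sketch, assuming the /etc/hosts entries configured in section 2.1:

```bash
#!/bin/bash
# Verify that each cluster hostname resolves to the IP listed in the table above.
for node in pgtest1 pgtest2 pgtest3; do
  getent hosts "$node" || echo "$node does not resolve" >&2
done
```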

Software download locations:

PostgreSQL 14: PostgreSQL: Linux downloads (Red Hat family)

Pgpool-II: Index of /yum/rpms/4.3/redhat/rhel-7-x86_64
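If the servers have internet access, the PostgreSQL packages can also be pulled straight from the PGDG yum repository instead of uploading rpms by hand. A hedged sketch (the repo rpm URL is the standard PGDG one for EL7; the Pgpool-II rpms are still installed locally in section 3):

```bash
# Install the PGDG repository definition, then the PostgreSQL 14 server packages.
yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum install -y postgresql14-server postgresql14-contrib
```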

### **2. PostgreSQL Installation**

#### **2.1 Configure the hosts file and sudo privileges [all servers]**

-- Configure hosts

```
[root@pgtest1 ~]# cat >> /etc/hosts <<EOF
192.168.24.11 pgtest1
192.168.24.12 pgtest2
192.168.24.13 pgtest3
EOF
```

-- Configure sudo (the VIP and arping commands in pgpool.conf are run through sudo by the postgres user)

```
[root@pgtest1 ~]# cat >> /etc/sudoers <<EOF
postgres ALL=NOPASSWD: /sbin/ip
postgres ALL=NOPASSWD: /usr/sbin/arping
EOF
```

#### **2.8 Edit postgresql.conf [master]**

```
# cat >> /pg14/pgdata/postgresql.conf << "EOF"
listen_addresses = '*'
port = 5432
logging_collector = on
log_truncate_on_rotation = on
archive_mode = on
archive_command = 'cp %p /pg14/arch/%f'
max_wal_senders = 10
max_replication_slots = 10
wal_level = replica
hot_standby = on
wal_log_hints = on
password_encryption = 'md5'
EOF
```

#### **2.9 Edit pg_hba.conf [master]**

```
# vi /pg14/pgdata/pg_hba.conf
host all all 0.0.0.0/0 md5
host replication all 0.0.0.0/0 md5
```

#### **2.10 Start the database [master]**

```
# systemctl start postgresql-14
```

#### **2.11 Create database accounts [master]**

```
-bash-4.2$ psql -c "alter user postgres password 'postgres'"
-bash-4.2$ psql -c "create user repl with replication password 'repl'"
-bash-4.2$ psql -c "create user pgpool password 'pgpool'"
-bash-4.2$ psql -c "grant pg_monitor to pgpool"
```

### **3. Installing Pgpool-II**

#### **3.1 Upload the Pgpool-II packages [all servers]**

```
[root@pgtest1 ~]# ls -l pgpool*.rpm
-rw-r--r-- 1 root root 1789236 Oct 20 09:47 pgpool-II-pg14-4.3.6-1pgdg.rhel7.x86_64.rpm
-rw-r--r-- 1 root root 2711052 Oct 20 09:47 pgpool-II-pg14-debuginfo-4.3.6-1pgdg.rhel7.x86_64.rpm
-rw-r--r-- 1 root root 18980 Oct 20 09:47 pgpool-II-pg14-devel-4.3.6-1pgdg.rhel7.x86_64.rpm
-rw-r--r-- 1 root root 49936 Oct 20 09:47 pgpool-II-pg14-extensions-4.3.6-1pgdg.rhel7.x86_64.rpm
```

#### **3.2 Install Pgpool-II with yum [all servers]**

```
[root@pgtest1 ~]# yum localinstall -y pgpool*.rpm
```

#### **3.3 Disable pgpool autostart [all servers]**

Autostart is disabled so that a rebooted node does not rejoin the cluster before it has been checked.

```
[root@pgtest1 ~]# systemctl disable pgpool.service
```

### **4. Configure SSH Mutual Trust [all servers]**

```
## as root
mkdir -p ~/.ssh; cd ~/.ssh
ssh-keygen -t rsa -f id_rsa_pgpool
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest1
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest2
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest3

## as postgres
cd ~/.ssh
ssh-keygen -t rsa -f id_rsa_pgpool
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest1
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest2
ssh-copy-id -i id_rsa_pgpool.pub postgres@pgtest3
```

## Use `ssh postgres@<node> -i ~/.ssh/id_rsa_pgpool` to verify that no password is required:

```
[root@pgtest1 .ssh]# ssh postgres@pgtest2 -i ~/.ssh/id_rsa_pgpool
Last login: Fri Oct 20 10:17:05 2023 from 192.168.24.11
-bash-4.2$
[root@pgtest1 .ssh]# ssh postgres@pgtest3 -i ~/.ssh/id_rsa_pgpool
Last login: Fri Oct 20 10:16:59 2023 from 192.168.24.11
-bash-4.2$ exit
[root@pgtest1 .ssh]# ssh postgres@pgtest1 -i ~/.ssh/id_rsa_pgpool
Last login: Fri Oct 20 10:16:50 2023 from pgtest1
-bash-4.2$
```

**If a password is still required, run `restorecon -R -v ~/.ssh` against each of the directories above.**

### **5. Configure Password Files**

#### **5.1 The .pgpass file [all servers]**

```
# cat > /var/lib/pgsql/.pgpass <<"EOF"
192.168.24.11:5432:replication:repl:repl
192.168.24.12:5432:replication:repl:repl
192.168.24.13:5432:replication:repl:repl
192.168.24.15:9999:postgres:pgpool:pgpool
192.168.24.15:9999:postgres:postgres:postgres
192.168.24.11:5432:postgres:postgres:postgres
192.168.24.12:5432:postgres:postgres:postgres
192.168.24.13:5432:postgres:postgres:postgres
EOF
# chown postgres:postgres /var/lib/pgsql/.pgpass
# chmod 600 /var/lib/pgsql/.pgpass
```

#### **5.2 Configure the .pcppass file [all servers]**

```
# echo 'localhost:9898:pgpool:pgpool' > /pg14/pgdata/.pcppass
# chmod 600 /pg14/pgdata/.pcppass
# chown postgres.postgres /pg14/pgdata/.pcppass
# echo 'pgpool:'`pg_md5 -u=pgpool pgpool` >> /etc/pgpool-II/pcp.conf
```

### **6. Create pgpool_node_id**

-- On node 192.168.24.11:

```
# echo "0" > /etc/pgpool-II/pgpool_node_id
```

-- On node 192.168.24.12:

```
# echo "1" > /etc/pgpool-II/pgpool_node_id
```

-- On node 192.168.24.13:

```
# echo "2" > /etc/pgpool-II/pgpool_node_id
```
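If you script the rollout, the node id can be derived from the local hostname instead of being typed on each machine. A minimal sketch, assuming the pgtest1..pgtest3 hostnames from the environment table:

```bash
#!/bin/bash
# Hypothetical helper: derive pgpool_node_id from the hostname.
# Assumes hostnames pgtest1..pgtest3 map to node ids 0..2.
case "$(hostname -s)" in
  pgtest1) echo 0 > /etc/pgpool-II/pgpool_node_id ;;
  pgtest2) echo 1 > /etc/pgpool-II/pgpool_node_id ;;
  pgtest3) echo 2 > /etc/pgpool-II/pgpool_node_id ;;
  *) echo "unknown host, refusing to write pgpool_node_id" >&2; exit 1 ;;
esac
```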
### **7. Pgpool-II Configuration**

#### **7.1 Configure pgpool.conf [all servers]**

```
# cat > /etc/pgpool-II/pgpool.conf <<"EOF"
backend_clustering_mode = 'streaming_replication'

# - pgpool Connection Settings -
listen_addresses = '*'
port = 9999
socket_dir = '/pg14/pgdata/'
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/pg14/pgdata/'

# - Streaming Replication Check -
sr_check_user = 'pgpool'
sr_check_password = 'pgpool'

# - Health Check -
health_check_period = 5
health_check_timeout = 30
health_check_user = 'pgpool'
health_check_password = 'pgpool'
health_check_max_retries = 3
search_primary_node_timeout = 10min

# - Backend Connection Settings -
backend_hostname0 = '192.168.24.11'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/pg14/pgdata'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_application_name0 = 'pgtest1'

backend_hostname1 = '192.168.24.12'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/pg14/pgdata'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_application_name1 = 'pgtest2'

backend_hostname2 = '192.168.24.13'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/pg14/pgdata'
backend_flag2 = 'ALLOW_TO_FAILOVER'
backend_application_name2 = 'pgtest3'

# - Failover configuration -
failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
follow_primary_command = '/etc/pgpool-II/follow_primary.sh %d %h %p %D %m %H %M %P %r %R'

# - Online Recovery Configurations -
recovery_user = 'postgres'
recovery_password = 'postgres'
recovery_1st_stage_command = 'recovery_1st_stage'

# - Client Authentication Configuration -
enable_pool_hba = on
pool_passwd = '/etc/pgpool-II/pool_passwd'

# - Watchdog Configuration -
use_watchdog = on
delegate_IP = '192.168.24.15'
if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev ens33 label ens33:1'
if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev ens33'
arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I ens33'

hostname0 = '192.168.24.11'
wd_port0 = 9000
pgpool_port0 = 9999
hostname1 = '192.168.24.12'
wd_port1 = 9000
pgpool_port1 = 9999
hostname2 = '192.168.24.13'
wd_port2 = 9000
pgpool_port2 = 9999

wd_lifecheck_method = 'heartbeat'
wd_interval = 10
heartbeat_hostname0 = '192.168.24.11'
heartbeat_port0 = 9694
heartbeat_device0 = ''
heartbeat_hostname1 = '192.168.24.12'
heartbeat_port1 = 9694
heartbeat_device1 = ''
heartbeat_hostname2 = '192.168.24.13'
heartbeat_port2 = 9694
heartbeat_device2 = ''
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
wd_escalation_command = '/etc/pgpool-II/escalation.sh'

# - Where to log -
log_destination = 'stderr'
logging_collector = on
log_directory = '/var/log/pgpool-II'
log_filename = 'pgpool-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 10MB
pid_file_name = '/var/run/postgresql/pgpool.pid'

#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
load_balance_mode = on
EOF
```

#### **7.2 Configure failover.sh and follow_primary.sh [all servers]**

-- Create the scripts from the bundled templates

```
# cp -p /etc/pgpool-II/failover.sh{.sample,}
# cp -p /etc/pgpool-II/follow_primary.sh{.sample,}
# chown postgres:postgres /etc/pgpool-II/{failover.sh,follow_primary.sh}
# chmod +x /etc/pgpool-II/{failover.sh,follow_primary.sh}
```

-- Edit follow_primary.sh

```
# Adjust the parameters to your environment; in this test environment only ARCHIVEDIR needs to change
ARCHIVEDIR=/pg14/arch
```
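pgpool.conf and the two scripts must be identical on all three nodes. A minimal sketch for pushing them from pgtest1 to the other nodes over the SSH trust configured in section 4 (hostnames as assumed from the environment table):

```bash
#!/bin/bash
# Push the pgpool configuration and scripts from pgtest1 to the other nodes.
# Relies on the SSH key (id_rsa_pgpool) set up in section 4.
for node in pgtest2 pgtest3; do
  scp -i ~/.ssh/id_rsa_pgpool \
      /etc/pgpool-II/pgpool.conf \
      /etc/pgpool-II/failover.sh \
      /etc/pgpool-II/follow_primary.sh \
      postgres@"$node":/etc/pgpool-II/
  # pgpool_node_id is intentionally NOT copied: it must differ per node.
done
```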
#### **7.3 Pgpool-II Online Recovery Configuration [master]**

Create the recovery_1st_stage and pgpool_remote_start scripts from the templates and place them in the PostgreSQL cluster data directory:

```
# cp -p /etc/pgpool-II/recovery_1st_stage.sample /pg14/pgdata/recovery_1st_stage
# cp -p /etc/pgpool-II/pgpool_remote_start.sample /pg14/pgdata/pgpool_remote_start
# chown postgres:postgres /pg14/pgdata/{recovery_1st_stage,pgpool_remote_start}
# chmod +x /pg14/pgdata/{recovery_1st_stage,pgpool_remote_start}
```

-- Edit the recovery_1st_stage script to match your environment:

```
ARCHIVEDIR=/pg14/arch
```

Note: check that PGHOME is set correctly in recovery_1st_stage and pgpool_remote_start, and that PCP_USER in follow_primary.sh is pgpool.

-- Install the pgpool_recovery extension

To use online recovery, the pgpool_recovery, pgpool_remote_start, and pgpool_switch_xlog functions are required, so we install pgpool_recovery on template1 of the PostgreSQL primary (pgtest1); databases created afterwards will then contain the pgpool_recovery extension automatically.

```
# su - postgres
$ psql template1 -c "CREATE EXTENSION pgpool_recovery"
$ psql postgres -c "CREATE EXTENSION pgpool_recovery"
```

#### **7.4 Edit pool_hba.conf [all servers]**

```
# echo "host all all 0.0.0.0/0 md5" >> /etc/pgpool-II/pool_hba.conf
```

#### **7.5 Configure pool_passwd [all servers]**

-- Check the pool_passwd file

```
# cat /etc/pgpool-II/pool_passwd
```

-- Encrypt the user passwords with pg_md5

```
# pg_md5 --md5auth --username=pgpool "pgpool"
# pg_md5 --md5auth --username=postgres "postgres"
```

#### **7.6 Watchdog configuration [all servers]**

-- Create the escalation script

```
# cat > /etc/pgpool-II/escalation.sh <<"EOF"
#!/bin/bash
# This script is run by wd_escalation_command to bring down the virtual IP on other pgpool nodes
# before bringing up the virtual IP on the new active pgpool node.
set -o xtrace

PGPOOLS=(192.168.24.11 192.168.24.12 192.168.24.13)
VIP=192.168.24.15
DEVICE=ens33

for pgpool in "${PGPOOLS[@]}"; do
    [ "$HOSTNAME" = "$pgpool" ] && continue
    ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@$pgpool -i ~/.ssh/id_rsa_pgpool "
        /usr/bin/sudo /sbin/ip addr del $VIP/24 dev $DEVICE
    "
done
exit 0
EOF
```

-- Change file ownership

```
# chown postgres:postgres /etc/pgpool-II/escalation.sh
```

-- Make all script files executable

```
# chmod +x /etc/pgpool-II/*.sh
```

![](https://i-blog.csdnimg.cn/direct/6becc20a9dca48dfbf2a4cb3eebd1289.png)

Pgpool-II online recovery: to perform online recovery with Pgpool-II, we specify the PostgreSQL user name and the online recovery command recovery_1st_stage in pgpool.conf. Because online recovery runs with superuser privileges in PostgreSQL, recovery_user is set to postgres. We then create recovery_1st_stage and pgpool_remote_start in the database cluster directory of the PostgreSQL primary server (pgtest1) and make them executable.

### **8. Configure the PostgreSQL Primary/Standby Cluster**

#### **8.1 Start Pgpool-II [all servers]**

-- Create the status file and log directory, then start the service

```
# touch /tmp/pgpool_status
# chown -R postgres. /tmp/pgpool_status
# mkdir -p /var/log/pgpool-II
# chown -R postgres. /var/log/pgpool-II
# systemctl start pgpool.service
```

(Start the service on each node in turn.)
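After starting the service on each node you can poll the watchdog until a leader is elected. A minimal sketch; it queries the local pgpool, since the .pcppass entry written in section 5.2 covers localhost, and -w suppresses the password prompt:

```bash
#!/bin/bash
# Poll the watchdog until a LEADER appears (give up after ~60s).
export PCPPASSFILE=/pg14/pgdata/.pcppass
for i in $(seq 1 12); do
  if pcp_watchdog_info -h localhost -p 9898 -U pgpool -w 2>/dev/null | grep -q LEADER; then
    echo "watchdog leader elected"
    exit 0
  fi
  sleep 5
done
echo "no watchdog leader after 60s" >&2
exit 1
```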

-- Check service status

## Node 1 status

```
[root@pgtest1 pgpool-II]# systemctl status pgpool.service
● pgpool.service - Pgpool-II
   Loaded: loaded (/usr/lib/systemd/system/pgpool.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2025-05-12 13:04:54 CST; 1min 1s ago
 Main PID: 17916 (pgpool)
    Tasks: 45
   CGroup: /system.slice/pgpool.service
           ├─17916 /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n
           ├─17919 pgpool: PgpoolLogger
           ├─17923 pgpool: watchdog
           ├─17933 pgpool: lifecheck
           ├─17934 pgpool: heartbeat receiver
           ├─17935 pgpool: heartbeat sender
           ├─17936 pgpool: heartbeat receiver
           ├─17939 pgpool: heartbeat sender
           ├─...   (32 × pgpool: wait for connection request)
           ├─17973 pgpool: health check process(0)
           ├─17975 pgpool: health check process(1)
           ├─17976 pgpool: health check process(2)
           ├─18096 pgpool: PCP: wait for connection request
           └─18097 pgpool: worker process

May 12 13:04:54 pgtest1 systemd[1]: Started Pgpool-II.
May 12 13:04:54 pgtest1 pgpool[17916]: 2025-05-12 13:04:54.188: main pid 17916: LOG:  redirecting log output to logging collector process
May 12 13:04:54 pgtest1 pgpool[17916]: 2025-05-12 13:04:54.188: main pid 17916: HINT:  Future log output will appear in directory "/var/log/pgpool-II".
May 12 13:05:06 pgtest1 sudo[18010]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/ip addr add 192.168.24.15/24 dev ens33 label ens33:1
May 12 13:05:07 pgtest1 sudo[18013]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/sbin/arping -U 192.168.24.15 -w 1 -I ens33
[root@pgtest1 pgpool-II]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:b0:62:cd brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.11/24 brd 192.168.24.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.24.15/24 scope global secondary ens33:1
       valid_lft forever preferred_lft forever
    inet6 fe80::a2cc:510c:af40:af87/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```

## Node 2 status

```
[root@pgtest2 pgpool-II]# systemctl status pgpool.service
● pgpool.service - Pgpool-II
   Loaded: loaded (/usr/lib/systemd/system/pgpool.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2025-05-12 13:04:59 CST; 1min 48s ago
 Main PID: 17535 (pgpool)
    Tasks: 45
   CGroup: /system.slice/pgpool.service
           ├─17535 /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n
           ├─17538 pgpool: PgpoolLogger
           ├─17542 pgpool: watchdog
           ├─17548 pgpool: lifecheck
           ├─17549 pgpool: heartbeat receiver
           ├─17550 pgpool: heartbeat sender
           ├─17551 pgpool: heartbeat receiver
           ├─17554 pgpool: heartbeat sender
           ├─...   (32 × pgpool: wait for connection request)
           ├─17586 pgpool: PCP: wait for connection request
           ├─17588 pgpool: health check process(0)
           ├─17589 pgpool: health check process(1)
           ├─17590 pgpool: health check process(2)
           └─17610 pgpool: worker process

May 12 13:04:59 pgtest2 systemd[1]: Started Pgpool-II.
May 12 13:04:59 pgtest2 pgpool[17535]: 2025-05-12 13:04:59.645: main pid 17535: LOG:  redirecting log output to logging collector process
May 12 13:04:59 pgtest2 pgpool[17535]: 2025-05-12 13:04:59.645: main pid 17535: HINT:  Future log output will appear in directory "/var/log/pgpool-II".
[root@pgtest2 pgpool-II]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:c8:ee:12 brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.12/24 brd 192.168.24.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::15bb:4008:354c:4f0f/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```

## Node 3 status

```
[root@pgtest3 pgpool-II]# systemctl status pgpool.service
● pgpool.service - Pgpool-II
   Loaded: loaded (/usr/lib/systemd/system/pgpool.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2025-05-12 13:05:02 CST; 2min 28s ago
 Main PID: 12784 (pgpool)
    Tasks: 45
   CGroup: /system.slice/pgpool.service
           ├─12784 /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n
           ├─12786 pgpool: PgpoolLogger
           ├─12791 pgpool: watchdog
           ├─12795 pgpool: lifecheck
           ├─12796 pgpool: heartbeat receiver
           ├─12797 pgpool: heartbeat sender
           ├─12798 pgpool: heartbeat receiver
           ├─12799 pgpool: heartbeat sender
           ├─...   (32 × pgpool: wait for connection request)
           ├─12832 pgpool: PCP: wait for connection request
           ├─12834 pgpool: health check process(0)
           ├─12835 pgpool: health check process(1)
           ├─12836 pgpool: health check process(2)
           └─12863 pgpool: worker process

May 12 13:05:02 pgtest3 systemd[1]: Started Pgpool-II.
May 12 13:05:02 pgtest3 pgpool[12784]: 2025-05-12 13:05:02.949: main pid 12784: LOG:  redirecting log output to logging collector process
May 12 13:05:02 pgtest3 pgpool[12784]: 2025-05-12 13:05:02.949: main pid 12784: HINT:  Future log output will appear in directory "/var/log/pgpool-II".
[root@pgtest3 pgpool-II]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:50:60:1a brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.13/24 brd 192.168.24.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::6d33:e581:1ad:cc2e/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```

#### **8.2 Use Pgpool-II online recovery to set up the standbys**

Make sure the recovery_1st_stage and pgpool_remote_start scripts used by the pcp_recovery_node command are in the database cluster directory of the PostgreSQL primary server (pgtest1).

On the primary node:

```
[root@pgtest1 pgdata]# su - postgres
Last login: Mon May 12 12:59:26 CST 2025 on pts/0
-bash-4.2$ pcp_recovery_node -h 192.168.24.15 -p 9898 -U pgpool -n 1
Password:
pcp_recovery_node -- Command Successful
-bash-4.2$ pcp_recovery_node -h 192.168.24.15 -p 9898 -U pgpool -n 2
Password:
pcp_recovery_node -- Command Successful
```

Verify that node 1 (pgtest2) and node 2 (pgtest3) are running as PostgreSQL standby servers.

**Method 1:**

```
-bash-4.2$ ps -ajxf | grep postgres
```

![](https://i-blog.csdnimg.cn/direct/d8d346e93ae743479c007eda2909d056.png)

**Method 2:**

```
-bash-4.2$ psql -h 192.168.24.15 -p 9999 -U pgpool postgres -c "show pool_nodes"
```

![](https://i-blog.csdnimg.cn/direct/e3096b1f9f68407cad3038067a5edf03.png)

-- Check the PostgreSQL service status

```
[root@pgtest2 pgpool-II]# systemctl status postgresql-14.service
● postgresql-14.service - PostgreSQL 14 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-14.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: https://www.postgresql.org/docs/14/static/
[root@pgtest3 pgpool-II]# systemctl status postgresql-14.service
● postgresql-14.service - PostgreSQL 14 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-14.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: https://www.postgresql.org/docs/14/static/
```

Note: systemctl shows the service as active only on the primary node, because the standbys were started by pgpool_remote_start rather than systemd; `ps -ajxf | grep postgres` confirms that PostgreSQL is running normally on all nodes.

### **9. Cluster Management**

#### **9.1 Pgpool cluster [all servers]**

##### **9.1.1 View cluster configuration**

```
[root@pgtest1 pgpool-II]# pcp_pool_status -h 192.168.24.15 -p 9898 -U pgpool -v
```

##### **9.1.2 View cluster node details**

```
# -h is the cluster VIP, -p the PCP port (default 9898), -U the PCP user, -v for verbose output
[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
```

![](https://i-blog.csdnimg.cn/direct/87f1fbd3f8a14664be2c35233286ed03.png)

##### **9.1.3 Count the nodes**

```
[root@pgtest1 pgpool-II]# pcp_node_count -h 192.168.24.15 -p 9898 -U pgpool
Password:
3
```

##### **9.1.4 View a specific node**

```
[root@pgtest1 pgpool-II]# pcp_node_info -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v
Password:
```

![](https://i-blog.csdnimg.cn/direct/507b8000b91045a1845f1afaec721572.png)

##### **9.1.5 Attach a node to the cluster**

```
# -n is the node index (starting from 0)
# pcp_attach_node -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v
```

##### **9.1.6 Detach a node from the cluster**

```
# pcp_detach_node -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v
```

##### **9.1.7 Promote a standby node to primary**

```
# pcp_promote_node -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v
```

##### **9.1.8 Recover an offline node back into the cluster**

```
# pcp_recovery_node -h 192.168.24.15 -p 9898 -U pgpool -n 0 -v
```

#### **9.2 PostgreSQL cluster [all servers]**

##### **9.2.1 Connect to the cluster**

```
# psql -h 192.168.24.15 -p 9999 -U postgres postgres
```

##### **9.2.2 View cluster status**

```
# Use -h 192.168.24.15 to go through the VIP
# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"
```

![](https://i-blog.csdnimg.cn/direct/ab81f13481f9453995af58baa2418d7f.png)
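All of the pcp_* commands above prompt for a password interactively. For scripting they can instead read the .pcppass file written in section 5.2; a minimal sketch, querying the local pgpool since that file's entry covers localhost:

```bash
#!/bin/bash
# Run pcp commands non-interactively using the .pcppass file from section 5.2.
# -w tells the pcp tools not to prompt; the password is looked up via PCPPASSFILE.
export PCPPASSFILE=/pg14/pgdata/.pcppass

pcp_node_count -h localhost -p 9898 -U pgpool -w
for n in 0 1 2; do
  pcp_node_info -h localhost -p 9898 -U pgpool -w -n "$n" -v
done
```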
### **10. Switchover Testing**

#### **10.1 Pgpool leader switchover**

| **Tool or command** | **Role it reports** | **Notes** |
|---------------------------|----------------------------------|-------------------------------------------------------------------|
| pcp_watchdog_info | **Leader among the Pgpool-II nodes** | Shows which Pgpool node is the Leader in the watchdog mechanism and therefore controls the virtual IP (VIP); unrelated to the database roles |
| psql -c "show pool_nodes" | **Roles of the PostgreSQL nodes** | Shows the database roles (primary, standby) as currently seen by this Pgpool node; used for load balancing and failover |

Check the current cluster state:

```
[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
```

The current pgpool cluster leader is pgtest1.

-- Stop/start pgpool on pgtest1

1) Stop the pgpool service on pgtest1

```
[root@pgtest1 pgpool-II]# systemctl stop pgpool.service
[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.12:9999 Linux pgtest2 192.168.24.12
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 4 LEADER 0 MEMBER
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 10 SHUTDOWN 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
```

The watchdog Leader is now pgtest2 (it coordinates the virtual IP).

Check pgtest2's IP addresses:

```
[root@pgtest2 pgpool-II]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:c8:ee:12 brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.12/24 brd 192.168.24.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.24.15/24 scope global secondary ens33:1
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fec8:ee12/64 scope link
       valid_lft forever preferred_lft forever
```

The VIP has failed over to pgtest2 as expected.

2) Start the pgpool service on pgtest1 again

```
[root@pgtest1 pgpool-II]# systemctl start pgpool.service
[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.12:9999 Linux pgtest2 192.168.24.12
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 4 LEADER 0 MEMBER
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
```

# After its pgpool service is restarted, pgtest1 rejoins the cluster as a watchdog standby node.

```
[root@pgtest1 pgpool-II]# psql -h 192.168.24.15 -p 9999 -U pgpool postgres -c "show pool_nodes"
```

![](https://i-blog.csdnimg.cn/direct/8999d1eaee62443487a630cf616b3d97.png)

# The database cluster status shows that pgtest1 is still the primary node.

-- Stop/start pgpool on pgtest2

```
[root@pgtest2 pgpool-II]# systemctl stop pgpool.service
[root@pgtest2 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.13:9999 Linux pgtest3 192.168.24.13
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 4 LEADER 0 MEMBER
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 7 STANDBY 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 10 SHUTDOWN 0 MEMBER
[root@pgtest2 pgpool-II]# systemctl start pgpool.service
[root@pgtest2 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.13:9999 Linux pgtest3 192.168.24.13
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 4 LEADER 0 MEMBER
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 7 STANDBY 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
```

# The pgpool watchdog leader has now switched to pgtest3.
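During these tests it is handy to see at a glance which node currently holds the VIP. A small sketch over the postgres SSH trust from section 4 (node names as assumed from the environment table; reading addresses needs no sudo):

```bash
#!/bin/bash
# Report which node currently holds the VIP 192.168.24.15.
for node in pgtest1 pgtest2 pgtest3; do
  if ssh -o ConnectTimeout=3 -i ~/.ssh/id_rsa_pgpool postgres@"$node" \
       "ip -4 addr show dev ens33" 2>/dev/null | grep -q '192\.168\.24\.15'; then
    echo "$node holds the VIP"
  fi
done
```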
-- Stop/start pgpool on pgtest3

```
[root@pgtest3 pgpool-II]# systemctl stop pgpool.service
[root@pgtest3 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 10 SHUTDOWN 0 MEMBER
[root@pgtest3 pgpool-II]# systemctl start pgpool.service
[root@pgtest3 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
```

# The pgpool watchdog leader has switched back to pgtest1.

#### **10.2 PostgreSQL primary/standby switchover**

##### **10.2.1 Stop the primary database**

```
[root@pgtest1 pgdata]# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"
```

![](https://i-blog.csdnimg.cn/direct/f61a4d9ee64145b5a5989be8a0db3d49.png)

```
[root@pgtest1 pgpool-II]# systemctl stop postgresql-14
[root@pgtest1 pgpool-II]# ps -ajxf | grep postgres
 1680 25931 25930  1680 pts/0    25930 S+       0   0:00  |   \_ grep --color=auto postgres
```

##### **10.2.2 Check the database cluster status**

```
[root@pgtest2 pgpool-II]# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"
```

![](https://i-blog.csdnimg.cn/direct/5a09baa9ac83448d9781de290f20a9b2.png)

The primary database is down, and node 2 has taken over as the new primary of the database cluster.

##### **10.2.3 Connect to the database cluster**

```
[root@pgtest2 pgpool-II]# psql -h 192.168.24.15 -p 9999 -U postgres postgres
Password for user postgres:
psql (14.18)
Type "help" for help.

postgres=# create table t1 (id int);
CREATE TABLE
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# select * from t1;
 id
----
  1
(1 row)
```

Connections, reads, and writes through the cluster all work normally.

##### **10.2.4 Check the pgpool cluster status**

```
[root@pgtest1 pgpool-II]# pcp_watchdog_info -h 192.168.24.15 -p 9898 -U pgpool
Password:
3 3 YES 192.168.24.11:9999 Linux pgtest1 192.168.24.11
192.168.24.11:9999 Linux pgtest1 192.168.24.11 9999 9000 4 LEADER 0 MEMBER
192.168.24.12:9999 Linux pgtest2 192.168.24.12 9999 9000 7 STANDBY 0 MEMBER
192.168.24.13:9999 Linux pgtest3 192.168.24.13 9999 9000 7 STANDBY 0 MEMBER
```

The pgpool watchdog leader is still pgtest1.

##### **10.2.5 Recover the old primary**

Run online recovery from the new primary node pgtest2 to rebuild pgtest1:

```
[root@pgtest2 pgpool-II]# pcp_recovery_node -h 192.168.24.15 -p 9898 -U pgpool -n 0
Password:
pcp_recovery_node -- Command Successful
```

This resynchronizes the database on pgtest1 from the new cluster primary pgtest2.

##### **10.2.6 Check the database cluster status**

```
[root@pgtest2 pgpool-II]# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"
```

![](https://i-blog.csdnimg.cn/direct/08842f2b72a94253a27e60ea8842479a.png)

The pgtest1 database is back online.

##### **10.2.7 Check database status via ps**

```
[root@pgtest2 pgpool-II]# ps -ajxf | grep postgres
```

![](https://i-blog.csdnimg.cn/direct/64f5bbdff90a425fac4b92bf0a7f2c37.png)

pgtest2 is now the primary database; pgtest1 and pgtest3 are standby nodes.
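Beyond show pool_nodes, streaming replication itself can be verified with PostgreSQL's own catalog views. A short sketch (run as postgres; pgtest2 is assumed to be the current primary, as in the test above):

```bash
# On the primary: one row per connected standby, with its replication state.
psql -c "select application_name, client_addr, state, sync_state from pg_stat_replication"
# On a standby: returns t (true) while the instance is in recovery mode.
psql -c "select pg_is_in_recovery()"
```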
### **11. Troubleshooting**

#### **11.1 The pgpool_status state file**

During deployment, starting pgpool failed with the error that /tmp/pgpool_status does not exist. Solution:

1. Create the state file

```
# touch /tmp/pgpool_status
# chown -R postgres. /tmp/pgpool_status
```

2. Restart the pgpool service

```
# systemctl restart pgpool.service
# systemctl status pgpool.service
```

#### **11.2 The pgpool log directory**

During deployment, starting pgpool failed with could not open log file "/var/log/pgpool-II/pgpool-2023-10-20_112258.log". After an rpm installation, the configuration file defaults the pgpool log directory to /var/log/pgpool-II/, but the directory is not created on the system.

Solution:

1. Create the directory

```
# mkdir -p /var/log/pgpool-II
# chown -R postgres. /var/log/pgpool-II
```

2. Restart the pgpool service

```
# systemctl restart pgpool.service
# systemctl status pgpool.service
```

#### **11.3 Cluster status wrongly shows a node as down**

During the pgpool stop/start tests, an individual node sometimes shows status "down" even though it is actually up, as below:

![](https://i-blog.csdnimg.cn/direct/3f4c55f8fc984a8d801200bf2ded1e19.png)

Checking each node, one of the pgpool_status state files reports down:

![](https://i-blog.csdnimg.cn/direct/fff793e1a6ad45fc99baa79e7a192159.png)

Solution: stop Pgpool-II on every node, delete the state file on every node, then restart Pgpool-II.

1. Stop the pgpool service

```
[root@pgtest1 ~]# systemctl stop pgpool.service
[root@pgtest2 ~]# systemctl stop pgpool.service
[root@pgtest3 ~]# systemctl stop pgpool.service
```

2. Remove the pgpool_status files

```
[root@pgtest1 ~]# mv /tmp/pgpool_status /tmp/pgpool_status.bak
[root@pgtest2 ~]# mv /tmp/pgpool_status /tmp/pgpool_status.bak
[root@pgtest3 ~]# mv /tmp/pgpool_status /tmp/pgpool_status.bak
```

3. Start the pgpool service

```
[root@pgtest1 ~]# systemctl start pgpool.service
[root@pgtest2 ~]# systemctl start pgpool.service
[root@pgtest3 ~]# systemctl start pgpool.service
```

4. Check the database cluster status

```
# psql -h 192.168.24.15 -p 9999 -U postgres postgres -c "show pool_nodes"
```

![](https://i-blog.csdnimg.cn/direct/85ebdab0fb504ec48602b798f352705d.png)

References:

[安装部署postgresql-15 高可用(pgpool)_postgresql15 + pgpool 构建容灾高可用集群-CSDN博客](https://blog.csdn.net/yang_z_1/article/details/129178092)

[PGPool-II 4.2.1+PostgreSQL 13高可用_pgpool2无法创建vip-CSDN博客](https://blog.csdn.net/along1976/article/details/113719487)

[pgpool_II节点状态问题(pgpool_status) - slnngk - 博客园](https://www.cnblogs.com/hxlasky/p/16994059.html)
