【PostgreSQL: Patroni maintenance commands】

Manual
https://patroni.readthedocs.io/en/latest/index.html

Patroni operations

1. List node information

patronictl -c /etc/patroni.yml list

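2. Reinitialize a replica

reinit first removes the member's entire data directory, then picks a suitable node to back up from and restores the replica from that backup.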

patronictl -c /etc/patroni.yml reinit patnori-test

# patronictl -c /etc/patroni.yml reinit pgsql
+ Cluster: pgsql (6972099274779350082)+------+---------+----+-----------+
|   Member    |        Host         |  Role  |  State  | TL | Lag in MB |
+-------------+---------------------+--------+---------+----+-----------+
| pgsql_node1 | 192.168.22.128:5432 |        | running | 3  |      0    |
| pgsql_node2 | 192.168.22.129:5432 | Leader | running | 3  |           |
| pgsql_node3 | 192.168.22.130:5432 |        | running | 3  |      0    |
+-------------+---------------------+--------+---------+----+-----------+
Select the following node names to add: pgsql_node3
 Are you sure you want to reinitialize members pgsql_node3?[y/N]: y
 Success: for member pgsql_node3 Perform initialization

Parameters

View parameters

patronictl -c /etc/patroni.yaml show-config

Change parameters

patronictl -c /etc/patroni.yaml edit-config

Reload parameters (takes effect on all three nodes at the same time)

patronictl -c /etc/patroni.yaml reload patnori-test

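Restart / shut down nodes

Add --force to force the operation (no confirmation prompt)

Add --scheduled 2023-09-13T18:00-03:00 to schedule it for a given time

1. Restart only a single node (node1)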

[root@pgtest1 ~]# patronictl restart pg_cluster node1

2. Restart only members that have a pending restart

[root@pgtest1 ~]# patronictl restart pg_cluster --pending

3. Restart all members

[root@pgtest1 ~]# patronictl restart pg_cluster
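The --scheduled flag takes a timestamp with a UTC offset, as in the 2023-09-13T18:00-03:00 example. A minimal sketch for producing such a value relative to now, assuming GNU date is available (-d and %:z are GNU extensions); the function name schedule_in_minutes is made up for this example:

```shell
# Print a timestamp N minutes from now in the form 2023-09-13T18:00-03:00.
# Assumption: GNU date (the -d option and the %:z format are GNU extensions).
schedule_in_minutes() {
    date -d "+$1 minutes" '+%Y-%m-%dT%H:%M%:z'
}

# Possible usage (not executed here):
#   patronictl restart pg_cluster node1 --scheduled "$(schedule_in_minutes 30)"
```

The offset comes from the local time zone of the host where the command runs.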

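Maintenance mode: detach the cluster from Patroni management

patronictl pause

patronictl pause temporarily puts the Patroni cluster into maintenance mode and disables automatic failover.

In some situations Patroni needs to step back from managing the cluster temporarily while still keeping the cluster state in the DCS. Typical cases are uncommon activities on the cluster, such as a major version upgrade or corruption recovery. During such work, nodes are often started and stopped for reasons Patroni does not know about, and some nodes may even be promoted temporarily, which violates the assumption that only one primary is running. Patroni therefore needs to be able to "detach" from the running cluster, providing the equivalent of maintenance mode in Pacemaker.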

patronictl resume

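patronictl resume takes the Patroni cluster out of maintenance mode and re-enables automatic failover.

After resume, Patroni automatically brings all the databases back up.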
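Forgetting to resume leaves the cluster without automatic failover, so it can help to wrap the maintenance step. A minimal sketch, assuming the config path /etc/patroni.yml; the names run_paused, PATRONICTL and PATRONI_CONFIG are made up for this example:

```shell
# Assumptions for this sketch: patronictl is on PATH and the config
# lives at /etc/patroni.yml. Both can be overridden via the environment.
PATRONICTL="${PATRONICTL:-patronictl}"
PATRONI_CONFIG="${PATRONI_CONFIG:-/etc/patroni.yml}"

# Run the given command while the cluster is paused, then always resume.
run_paused() {
    # Enter maintenance mode; give up if pause itself fails.
    "$PATRONICTL" -c "$PATRONI_CONFIG" pause --wait || return 1
    # Run the maintenance command and remember its exit status.
    status=0
    "$@" || status=$?
    # Leave maintenance mode even if the command failed.
    "$PATRONICTL" -c "$PATRONI_CONFIG" resume --wait || return 1
    return "$status"
}

# Possible usage (not executed here):
#   run_paused ./my_maintenance_step.sh
```

The wrapper returns the exit status of the maintenance command, so a failed step is still visible to the caller after the cluster has been resumed.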

Switchover

patronictl switchover

# Switchover
[postgres@pgtest1 ~]$ patronictl switchover
Master [pgtest1]: 
Candidate ['pgtest2', 'pgtest3'] []: pgtest2
When should the switchover take place (e.g. 2021-10-28T04:45 )  [now]: 
Current cluster topology
... ...
Are you sure you want to switchover cluster pg_cluster, demoting current master pgtest1? [y/N]: y
2021-10-28 03:45:35.91763 Successfully switched over to "pgtest2"
... ...
Switch the database over from pgtest1 to pgtest2
[root@pgtest1 ~]# curl -s http://192.168.58.10:8008/switchover -XPOST -d '{"leader":"pgtest1","candidate":"pgtest2"}'
Successfully switched over to "pgtest2"
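The REST call above switches over immediately. The /switchover endpoint also accepts a scheduled_at field. A small sketch for building the request body; the function name switchover_payload and the timestamp in the usage comment are illustrative only:

```shell
# Build the JSON body for POST /switchover.
# Arguments: leader, candidate, optional scheduled_at timestamp
# (ISO 8601 with a UTC offset).
switchover_payload() {
    if [ -n "${3:-}" ]; then
        printf '{"leader":"%s","candidate":"%s","scheduled_at":"%s"}\n' "$1" "$2" "$3"
    else
        printf '{"leader":"%s","candidate":"%s"}\n' "$1" "$2"
    fi
}

# Possible usage (not executed here; the timestamp is a made-up value):
#   curl -s http://192.168.58.10:8008/switchover -XPOST \
#        -d "$(switchover_payload pgtest1 pgtest2 2021-10-28T04:45:00+08:00)"
```

The function does no JSON escaping, which is fine for plain member names like the ones in this cluster.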

Failover

patronictl failover

# Failover
[postgres@pgtest1 ~]$ patronictl failover
Candidate ['pgtest1', 'pgtest3'] []: pgtest1
Current cluster topology
... ...
Are you sure you want to failover cluster pg_cluster, demoting current master pgtest2? [y/N]: y
2021-10-28 03:47:56.13486 Successfully failed over to "pgtest1"
... ...

Fail over to a specific node. When all nodes are healthy, running a failover is effectively the same as running a switchover.

curl -s http://192.168.58.10:8008/failover -XPOST -d '{"candidate":"pgtest2"}'

[root@pgtest1 ~]# patronictl list
+---------+---------------+---------+---------+----+-----------+
| Member  | Host          | Role    | State   | TL | Lag in MB |
+ Cluster: pg_cluster (7025023477017500881) --+----+-----------+
| pgtest1 | 192.168.58.10 | Leader  | running |  5 |           |
| pgtest2 | 192.168.58.11 | Replica | running |  5 |         0 |
| pgtest3 | 192.168.58.12 | Replica | running |  5 |         0 |
+---------+---------------+---------+---------+----+-----------+
[root@pgtest1 ~]# curl -s http://192.168.58.10:8008/failover -XPOST -d '{"candidate":"pgtest2"}'

Run database queries with patronictl

[postgres@pgtest1 ~]$ cat aa.sql
select * from test_1;
[postgres@pgtest1 ~]$ patronictl -c /enmo/app/patroni/pg_test01.yml query -f aa.sql --password
Password: 
id      create_time
1       2021-10-16 17:47:34
2       2021-10-16 17:55:06
[postgres@pgtest1 ~]$ patronictl -c /enmo/app/patroni/pg_test01.yml query -c "select * from test_1;" --password
Password: 
id      create_time
1       2021-10-16 17:47:34
2       2021-10-16 17:55:06

Get the primary node's DSN

$ ./patronictl -c postgres.yml dsn pgsql

host=192.168.1.143 port=5432

ETCD

1. Check the etcd nodes

etcdctl endpoint status --cluster -w table

 +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://192.168.58.11:2379 | 3f414532c235ce16 |   3.5.1 |   20 kB |     false |      false |         4 |         16 |                 16 |        |
| http://192.168.58.12:2379 | 41cf8a739c2e9b50 |   3.5.1 |   20 kB |     false |      false |         4 |         16 |                 16 |        |
| http://192.168.58.10:2379 | caef4208a95efee8 |   3.5.1 |   25 kB |      true |      false |         4 |         16 |                 16 |        |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

2. Move the leader role to another node

etcdctl move-leader 3f414532c235ce16

3. Save a data snapshot (run on one node)

etcdctl snapshot save etcd_bak.db

4. Inspect the snapshot

etcdctl snapshot status etcd_bak.db -w table

[root@pgtest1 ~]# etcdctl snapshot status etcd_bak.db -w table 
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| 230fea56 |        0 |          8 |      25 kB |
+----------+----------+------------+------------+

Zookeeper

1. Check ZooKeeper status

./zkServer.sh status

2. Start ZooKeeper

./zkServer.sh start

3. Stop ZooKeeper

./zkServer.sh stop

4. Connect to the ZooKeeper service

./zkCli.sh -server localhost:2181

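ls (list the children of a znode)

ls2 (list the children together with stat data such as the update count; deprecated in newer releases in favor of ls -s)

create (create a znode)

get (read a znode, including its data and stat fields such as the update count)

set (update a znode's data)

delete (delete a znode)

5. Inspect the Patroni keys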

ls

ls /service/batman

The relevant keys:

[leader, optime, failover, members, initialize, history, config, sync]
batman is the scope name from the Patroni configuration file.

leader (name of the primary node)

get /service/batman/leader

leader records the name of the primary node. It is an ephemeral znode: when its session stays unresponsive for longer than the ttl, ZooKeeper deletes it.

[zk: localhost:2181(CONNECTED) 33] get /service/batman/leader
 postgresql1

sync (synchronous replication state)

get /service/batman/sync

sync records the synchronous replication state. It is a persistent znode, so it is not deleted when a session expires.

[zk: localhost:2181(CONNECTED) 30] get /service/batman/sync  
{"leader":"postgresql0","sync_standby":"postgresql1"}

optime (last LSN position of the primary)

get /service/batman/optime/leader

optime/leader is the LSN position after the primary's last operation. It is a persistent znode, so it is not deleted when a session expires.

[zk: localhost:2181(CONNECTED) 71] get /service/batman/optime/leader      
1342177280
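The value is stored as a plain integer, while PostgreSQL prints LSNs as two hexadecimal numbers (high and low 32 bits) separated by a slash. A minimal conversion sketch; the function name lsn_from_int is made up for this example:

```shell
# Convert an integer WAL position (as stored in optime/leader) to the
# hi/lo hexadecimal form PostgreSQL uses for LSNs.
lsn_from_int() {
    # High 32 bits before the slash, low 32 bits (mask 2^32 - 1) after it.
    printf '%X/%X\n' "$(( $1 >> 32 ))" "$(( $1 & 4294967295 ))"
}

lsn_from_int 1342177280   # prints 0/50000000
```

The same conversion applies to the xlog_location field of the members keys and to the second element of each history entry.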

failover (records a scheduled switchover task)

get /service/batman/failover

failover records a scheduled switchover task. It is a persistent znode, so it is not deleted when a session expires.

 [zk: localhost:2181(CONNECTED) 117] get /service/batman/failover
 {"leader":"postgresql1","scheduled_at":"2018-11-09T14:30:00+08:00"}

members (connection info and key status of every data node)

get /service/batman/members/postgresql0

get /service/batman/members/postgresql

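members holds one entry per data node with its connection info and key status. Each entry is an ephemeral znode: when its session stays unresponsive for longer than the ttl, ZooKeeper deletes it.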

[zk: localhost:2181(CONNECTED) 120] get /service/batman/members/postgresql0
{"conn_url":"postgres://192.168.56.5:5432/postgres","api_url":"http://192.168.56.5:8008/patroni","timeline":58,"state":"running","role":"replica","xlog_location":1426063952}

initialize (cluster initialization info)

get /service/batman/initialize

initialize records the initialization info of the database cluster. It is a persistent znode, so it is not deleted when a session expires.

[zk: localhost:2181(CONNECTED) 123] get /service/batman/initialize         
6618183861621602635

// This ID is the Database system identifier shown in the control file
[postgres@node2 ~]$ pg_controldata
pg_control version number:     1100     
Catalog version number:        201809051        
Database system identifier:    6618183861621602635   

history (timeline change history of the cluster)

get /service/batman/history

history records how the cluster's timeline has changed over time. It is a persistent znode, so it is not deleted when a session expires.

[zk: localhost:2181(CONNECTED) 26] get /service/batman/history
[[1,67109464,"no recovery target specified"],
 [2,83886232,"no recovery target specified"],
 [3,100663448,"no recovery target specified"],
 [4,218103960,"no recovery target specified"],
 [5,251658392,"no recovery target specified"],
 [6,268435608,"no recovery target specified"],
 [7,318767256,"no recovery target specified"],
 [8,335544472,"no recovery target specified"],
 [9,352321688,"no recovery target specified"],
 [10,369098904,"no recovery target specified"],
 [11,402653336,"no recovery target specified"],
 [12,419430552,"no recovery target specified"],
 [13,436207768,"no recovery target specified"],
 [14,452984984,"no recovery target specified"],
 [15,469762200,"no recovery target specified"],
 [16,486539416,"no recovery target specified"],
 [17,503316632,"no recovery target specified"],
 [18,536871064,"no recovery target specified"],
 [19,553648280,"no recovery target specified"],
 [20,570425496,"no recovery target specified"],
 [21,603979928,"no recovery target specified"],
 [22,620757144,"no recovery target specified"],
 [23,637534360,"no recovery target specified"],
 [24,654311576,"no recovery target specified"],
 [25,671088792,"no recovery target specified","2018-11-05T16:44:52+08:00"],
 [26,704643224,"no recovery target specified","2018-11-05T16:49:12+08:00"],
 [27,721420440,"no recovery target specified","2018-11-05T23:21:12+08:00"],
 [28,771752088,"no recovery target specified","2018-11-06T04:32:40+08:00"],
 [29,805306520,"no recovery target specified","2018-11-06T05:22:45+08:00"],
 [30,822083736,"no recovery target 

config (the Patroni configuration)

get /service/batman/config

config records the configuration from the Patroni configuration file. It is a persistent znode, so it is not deleted when a session expires.

[zk: localhost:2181(CONNECTED) 28] get /service/batman/config 
{"retry_timeout": 10, "postgresql": {"use_slots": true, "use_pg_rewind": true, 
"parameters": {"hot_standby": "on", "wal_keep_segments": 8, "wal_level": "hot_standby", 
"archive_command": "mkdir -p ../wal_archive && test ! -f ../wal_archive/%f && cp %p ../wal_archive/%f",
"wal_log_hints": "on", "max_wal_senders": 10, "archive_timeout": 
2000, "archive_mode": "on", "max_replication_slots": 10, "max_connections": 300}, 
"recovery_conf": {"restore_command": "cp ../wal_archive/%f %p"}}, "synchronous_mode": 
true, "maximum_lag_on_failover": 1048576, "loop_wait": 10, "max_connection": 300, 
"archive_timeout": "2000s", "ttl": 30, "max_connections": 300}

6. Clean up ZooKeeper snapshots

Keep the 20 most recent snap files

./zkCleanup.sh -n 20

Use standard Linux commands to delete data or logs older than n days

find /zookeeperData/version-2/ -name "snap*" -mtime +10 | xargs rm -f
find /zookeeperDataLog/version-2/ -name "log*" -mtime +10 | xargs rm -f
find /opt/apache-zookeeper-3.7.1-bin/logs -name "zookeeper.log.*" -mtime +10 | xargs rm -f
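Piping find into xargs splits on whitespace, and deleting purely by age can remove snapshots that are still needed if no newer one exists, which zkCleanup.sh guards against. A slightly more defensive sketch of the same cleanup; the function name purge_old_files is made up for this example:

```shell
# Delete regular files under DIR whose names match PATTERN and whose
# modification time is more than DAYS days ago.
purge_old_files() {
    dir=$1
    pattern=$2
    days=$3
    # Refuse to run if the directory does not exist.
    [ -d "$dir" ] || return 1
    # -exec ... {} + hands file names to rm directly, so names containing
    # whitespace are handled correctly (unlike a plain "| xargs rm -f").
    find "$dir" -type f -name "$pattern" -mtime +"$days" -exec rm -f {} +
}

# Possible usage (not executed here):
#   purge_old_files /zookeeperData/version-2 'snap*' 10
```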

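Configure automatic log purging

Since 3.4.0 ZooKeeper can purge old snapshots and transaction logs automatically, but autopurge.purgeInterval defaults to 0 (disabled), so it has to be enabled explicitly.

Set the autopurge.snapRetainCount and autopurge.purgeInterval parameters.

autopurge.snapRetainCount is the number of snapshots to keep; the default is 3, which is also the minimum. autopurge.purgeInterval is the purge interval in hours.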

autopurge.snapRetainCount=3
autopurge.purgeInterval=1

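Snapshot and transaction log growth can also be tuned through the ZooKeeper configuration. The three parameters below are, respectively, the purge interval in hours (available from 3.4.0 on), the size in KB to preallocate for each transaction log file, and the number of transactions written to the log before a new snapshot is taken.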

autopurge.purgeInterval=1
preAllocSize=131072
snapCount=300000

7. Print the parameters ZooKeeper is started with, including the Java path

./bin/zkServer.sh print-cmd

Reference links

1.https://www.modb.pro/db/73762

2.https://www.modb.pro/db/75268

3.https://www.modb.pro/topic/152353

Upgrade

1.https://www.modb.pro/db/500077
