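There are two ways to take a base backup: use pg_basebackup or another backup tool, or issue the low-level commands yourself. Tools such as pg_basebackup are wrappers around those low-level commands, so to understand the base backup process better we use the low-level commands here and analyze the source code behind them.
Base backup process
There are several ways to back up a database: you can take an SQL dump, or stop the database instance and copy its physical files. Each has its own trade-offs and use cases. The biggest advantage of a base backup is that it is a physical backup taken without stopping the server or the workload: it acquires no table locks, so normal traffic is only lightly affected. It also enables a very powerful feature, PITR (point-in-time recovery), which is analyzed later. Here we walk through the whole base backup process.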
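The base backup process is as follows:
- Connect to the database.
- Run select pg_start_backup('label'). (This forces a checkpoint and records the checkpoint location in the backup_label file.)
- Take the backup by copying the data directory (including backup_label).
- Run select pg_stop_backup(). (This removes the backup_label file and writes an XLOG_BACKUP_END record to the WAL. When a standby replays up to that record, it knows the backup has ended, the data has reached a consistent point, and it can start serving clients.)
- Copy the WAL generated during the backup.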
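The steps above can be sketched as a small script. This is only a sketch under assumptions: it targets PostgreSQL 14 or earlier (the exclusive-mode pg_start_backup/pg_stop_backup functions were removed in PostgreSQL 15), it shells out to psql and GNU cp, and the helper names base_backup_plan and run_base_backup, as well as the port and directory arguments, are made up for illustration.

```python
import os
import subprocess


def base_backup_plan(port, pgdata, dest, label="bak1"):
    """Return the base backup steps as a list of argv lists (hypothetical helper).

    The label is pasted into the SQL text, so it must not contain quotes.
    """
    psql = ["psql", "-X", "-p", str(port), "-At", "-c"]
    return [
        # 1. Start the backup: forces a checkpoint and writes backup_label.
        psql + [f"select pg_start_backup('{label}')"],
        # 2. Copy the data directory, backup_label included.
        ["cp", "-a", pgdata, dest],
        # 3. Stop the backup: removes backup_label, writes XLOG_BACKUP_END.
        psql + ["select pg_stop_backup()"],
        # 4. Copy the WAL generated while the backup was running.
        ["cp", "-a", os.path.join(pgdata, "pg_wal", "."),
         os.path.join(dest, "pg_wal")],
    ]


def run_base_backup(port, pgdata, dest, label="bak1"):
    """Execute the plan in order, stopping at the first failing command."""
    for argv in base_backup_plan(port, pgdata, dest, label):
        subprocess.run(argv, check=True)
```

Because this is an exclusive backup, the start and stop calls may run in separate psql sessions, which is what the two separate psql invocations rely on.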
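Walking through the operations
Before analyzing the source, we first perform a base backup by hand to see what each step does.
- Run initdb to create the database cluster.
Inspect pg_control: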
postgres@slpc:~/pgsql$ pg_controldata -D pgdata/
pg_control version number: 1300
Catalog version number: 202107181
Database system identifier: 7279971345653503170
Database cluster state: shut down
pg_control last modified: 2023年09月18日 星期一 09时26分56秒
Latest checkpoint location: 0/167E598
Latest checkpoint's REDO location: 0/167E598
Latest checkpoint's REDO WAL file: 000000010000000000000001
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Inspect the WAL:
postgres@slpc:~/pgsql$ pg_waldump -p pgdata/pg_wal/ 000000010000000000000001
// omitted...
rmgr: Transaction len (rec/tot): 66/ 66, tx: 732, lsn: 0/0167E550, prev 0/0167E4B0, desc: COMMIT 2023-09-18 09:26:56.640405 CST; inval msgs: snapshot 2396
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/0167E598, prev 0/0167E550, desc: CHECKPOINT_SHUTDOWN redo 0/167E598; tli 1; prev tli 1; fpw true; xid 0:733; oid 13011; multi 1; offset 0; oldest xid 726 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown
- Start the database.
- Connect to the database, create a table, and insert data.
- Call the pg_start_backup('bak1') function:
postgres@slpc:~/pgsql/pgdata/pg_wal$ psql -p 7432
psql (14.8)
Type "help" for help.
postgres=# create table t1(a int);
CREATE TABLE
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# select pg_start_backup('bak1');
pg_start_backup
-----------------
0/2000028
(1 row)
First the WAL file is switched, and then the checkpoint runs:
postgres@slpc:~/pgsql/pgdata/pg_wal$ ls
000000010000000000000001 archive_status
postgres@slpc:~/pgsql/pgdata/pg_wal$ ls # a WAL segment switch was forced and the old segment recycled; every WAL file from 000000010000000000000002 onward must be copied into the backup, recycled WAL files need not be
000000010000000000000002 000000010000000000000003 archive_status
The server log shows that a checkpoint runs during the call:
2023-09-18 10:12:21.139 CST [417435] DEBUG: 00000: attempting to remove WAL segments older than log file 000000000000000000000001
2023-09-18 10:12:21.139 CST [417435] LOCATION: RemoveOldXlogFiles, xlog.c:4114
2023-09-18 10:12:21.141 CST [417435] DEBUG: 00000: recycled write-ahead log file "000000010000000000000001"
2023-09-18 10:12:21.141 CST [417435] LOCATION: RemoveXlogFile, xlog.c:4256
2023-09-18 10:12:21.141 CST [417435] DEBUG: 00000: SlruScanDirectory invoking callback on pg_subtrans/0000
2023-09-18 10:12:21.141 CST [417435] LOCATION: SlruScanDirectory, slru.c:1574
2023-09-18 10:12:21.141 CST [417435] LOG: 00000: checkpoint complete: wrote 31 buffers (0.2%); 0 WAL file(s) added, 0 removed, 1 recycled; write=2.846 s, sync=0.005 s, total=2.860 s; sync files=22, longest=0.004 s, average=0.001 s; distance=9734 kB, estimate=9734 kB
2023-09-18 10:12:21.141 CST [417435] LOCATION: LogCheckpointEnd, xlog.c:8925
2023-09-18 10:12:39.283 CST [417436] DEBUG: 00000: snapshot of 0+0 running transaction ids (lsn 0/2000148 oldest xid 735 latest complete 734 next xid 735)
Inspect the WAL:
postgres@slpc:~/pgsql$ pg_waldump -p pgdata/pg_wal/ 000000010000000000000002
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000028, prev 0/01696D18, desc: RUNNING_XACTS nextXid 735 latestCompletedXid 734 oldestRunningXid 735
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000060, prev 0/02000028, desc: RUNNING_XACTS nextXid 735 latestCompletedXid 734 oldestRunningXid 735
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/02000098, prev 0/02000060, desc: CHECKPOINT_ONLINE redo 0/2000028; tli 1; prev tli 1; fpw true; xid 0:735; oid 24576; multi 1; offset 0; oldest xid 726 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 735; online
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000110, prev 0/02000098, desc: RUNNING_XACTS nextXid 735 latestCompletedXid 734 oldestRunningXid 735
rmgr: Heap len (rec/tot): 54/ 150, tx: 735, lsn: 0/02000148, prev 0/02000110, desc: INSERT off 2 flags 0x00, blkref #0: rel 1663/13010/16384 blk 0 FPW
rmgr: Transaction len (rec/tot): 34/ 34, tx: 735, lsn: 0/020001E0, prev 0/02000148, desc: COMMIT 2023-09-18 10:23:57.688476 CST
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000208, prev 0/020001E0, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736
Inspect pg_control:
postgres@slpc:~/pgsql$ pg_controldata -D pgdata/
pg_control version number: 1300
Catalog version number: 202107181
Database system identifier: 7279971345653503170
Database cluster state: in production
pg_control last modified: 2023年09月18日 星期一 10时12分21秒
Latest checkpoint location: 0/2000098 -- location of the latest checkpoint record
Latest checkpoint's REDO location: 0/2000028
Latest checkpoint's REDO WAL file: 000000010000000000000002 -- REDO WAL file, i.e. the segment that contains the checkpoint's REDO location
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:735
Latest checkpoint's NextOID: 24576
A backup_label file is generated. (It is essential: when recovering from the backup, recovery starts from the location recorded here, not from the one stored in pg_control.)
postgres@slpc:~/pgsql/pgdata$ cat backup_label
START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000098
BACKUP METHOD: pg_start_backup
BACKUP FROM: primary
START TIME: 2023-09-18 10:12:21 CST
LABEL: bak1
START TIMELINE: 1
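To make the role of each field concrete, here is a minimal sketch that parses the backup_label shown above and extracts the values recovery cares about. The function names and the dictionary keys are my own choices for illustration, not anything PostgreSQL defines; only the file contents come from the session above.

```python
import re

SAMPLE = """START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000098
BACKUP METHOD: pg_start_backup
BACKUP FROM: primary
START TIME: 2023-09-18 10:12:21 CST
LABEL: bak1
START TIMELINE: 1
"""


def parse_lsn(text):
    """Convert an LSN such as '0/2000028' to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_backup_label(content):
    """Parse backup_label text into a dict (key names are illustrative)."""
    fields = {}
    for line in content.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    m = re.match(r"([0-9A-F]+/[0-9A-F]+) \(file ([0-9A-F]{24})\)",
                 fields["START WAL LOCATION"])
    if m is None:
        raise ValueError("malformed START WAL LOCATION line")
    return {
        "start_lsn": parse_lsn(m.group(1)),        # where redo begins
        "start_wal_file": m.group(2),              # first segment to keep
        "checkpoint_lsn": parse_lsn(fields["CHECKPOINT LOCATION"]),
        "label": fields["LABEL"],
        "timeline": int(fields["START TIMELINE"]),
    }
```

Calling parse_backup_label(SAMPLE) yields the redo start 0/2000028 and the checkpoint record 0/2000098 as integers, the same two locations that appear in the pg_controldata output above.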
- Copy the database instance to the backup location.
- Call pg_stop_backup() to finish the base backup:
postgres=# select pg_stop_backup();
NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
pg_stop_backup
----------------
0/2000268
(1 row)
The server log:
2023-09-18 10:47:41.095 CST [447083] DEBUG: 00000: removing WAL backup history file "000000010000000000000002.00000028.backup"
2023-09-18 10:47:41.095 CST [447083] LOCATION: CleanupBackupHistory, xlog.c:4375
2023-09-18 10:47:41.095 CST [447083] NOTICE: 00000: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
2023-09-18 10:47:41.095 CST [447083] LOCATION: do_pg_stop_backup, xlog.c:11912
2023-09-18 10:47:41.263 CST [417436] DEBUG: 00000: snapshot of 0+0 running transaction ids (lsn 0/3000060 oldest xid 736 latest complete 735 next xid 736)
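The backup history file name in the log above is derived from the backup's start location: the WAL segment name, followed by the byte offset inside that segment. The sketch below reproduces that pattern. It reflects my understanding of the BackupHistoryFileName macro in the PostgreSQL sources and assumes the default 16MB wal_segment_size; the function name is made up for illustration.

```python
def backup_history_file_name(start_lsn, timeline=1, seg_size=16 * 1024 * 1024):
    """Build '<segment name>.<offset in segment>.backup' for a start LSN.

    start_lsn is text such as '0/2000028'; seg_size is assumed to be the
    default wal_segment_size of 16MB.
    """
    hi, lo = start_lsn.split("/")
    pos = (int(hi, 16) << 32) | int(lo, 16)
    # Segment number and the byte offset of the start location inside it.
    seg_no, offset = divmod(pos, seg_size)
    # Number of segments per 4GB "xlog id".
    segs_per_id = 0x100000000 // seg_size
    return "%08X%08X%08X.%08X.backup" % (
        timeline, seg_no // segs_per_id, seg_no % segs_per_id, offset)
```

For the start location 0/2000028 returned by pg_start_backup earlier, this gives 000000010000000000000002.00000028.backup, the file named in the log.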
Inspect the WAL:
postgres@slpc:~/pgsql$ pg_waldump -p pgdata/pg_wal/ 000000010000000000000002
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000028, prev 0/01696D18, desc: RUNNING_XACTS nextXid 735 latestCompletedXid 734 oldestRunningXid 735
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000060, prev 0/02000028, desc: RUNNING_XACTS nextXid 735 latestCompletedXid 734 oldestRunningXid 735
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/02000098, prev 0/02000060, desc: CHECKPOINT_ONLINE redo 0/2000028; tli 1; prev tli 1; fpw true; xid 0:735; oid 24576; multi 1; offset 0; oldest xid 726 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 735; online
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000110, prev 0/02000098, desc: RUNNING_XACTS nextXid 735 latestCompletedXid 734 oldestRunningXid 735
rmgr: Heap len (rec/tot): 54/ 150, tx: 735, lsn: 0/02000148, prev 0/02000110, desc: INSERT off 2 flags 0x00, blkref #0: rel 1663/13010/16384 blk 0 FPW
rmgr: Transaction len (rec/tot): 34/ 34, tx: 735, lsn: 0/020001E0, prev 0/02000148, desc: COMMIT 2023-09-18 10:23:57.688476 CST
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000208, prev 0/020001E0, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736
rmgr: XLOG len (rec/tot): 34/ 34, tx: 0, lsn: 0/02000240, prev 0/02000208, desc: BACKUP_END 0/2000028
rmgr: XLOG len (rec/tot): 24/ 24, tx: 0, lsn: 0/02000268, prev 0/02000240, desc: SWITCH
postgres@slpc:~/pgsql$ pg_waldump -p pgdata/pg_wal/ 000000010000000000000003
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/03000028, prev 0/02000268, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736
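The BACKUP_END record in the dump above carries the start location of the backup it closes. As a rough sketch, the following scans pg_waldump text output for the BACKUP_END record that belongs to a given start LSN. The regular expression is written against the output format shown in this article only, and the function names are hypothetical.

```python
import re

# Matches e.g. "... lsn: 0/02000240, prev 0/02000208, desc: BACKUP_END 0/2000028"
BACKUP_END_RE = re.compile(
    r"lsn: ([0-9A-F]+/[0-9A-F]+),.*desc: BACKUP_END ([0-9A-F]+/[0-9A-F]+)")

SAMPLE = [
    "rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/02000208, prev 0/020001E0, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736",
    "rmgr: XLOG len (rec/tot): 34/ 34, tx: 0, lsn: 0/02000240, prev 0/02000208, desc: BACKUP_END 0/2000028",
    "rmgr: XLOG len (rec/tot): 24/ 24, tx: 0, lsn: 0/02000268, prev 0/02000240, desc: SWITCH",
]


def parse_lsn(text):
    """Convert an LSN such as '0/02000240' to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def find_backup_end(waldump_lines, start_lsn):
    """Return the LSN of the BACKUP_END record for start_lsn, or None."""
    target = parse_lsn(start_lsn)
    for line in waldump_lines:
        m = BACKUP_END_RE.search(line)
        # Compare numerically: the dump pads LSNs differently in each column.
        if m and parse_lsn(m.group(2)) == target:
            return parse_lsn(m.group(1))
    return None
```

With the sample lines, find_backup_end(SAMPLE, "0/2000028") locates the record at 0/02000240. A restored copy has to replay at least up to this record before it is consistent.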
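The backup_label file has been removed from the source instance. It is meant for the restored copy and has already been copied into the backup, where it waits to be used at recovery time.
- Recover from the backup. Start the instance from the backup copy; at startup it reads the backup_label file.
The server log:
2023-09-18 11:09:59.964 CST [1237713] LOG: 00000: starting PostgreSQL 14.8 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0, 64-bit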
2023-09-18 11:09:59.965 CST [1237713] LOG: 00000: listening on IPv4 address "0.0.0.0", port 7431
2023-09-18 11:09:59.965 CST [1237713] LOG: 00000: listening on IPv6 address "::", port 7431
2023-09-18 11:09:59.970 CST [1237713] LOG: 00000: listening on Unix socket "/tmp/.s.PGSQL.7431"
2023-09-18 11:09:59.976 CST [1237717] LOG: 00000: database system was interrupted; last known up at 2023-09-18 10:12:21 CST
2023-09-18 11:09:59.976 CST [1237717] LOCATION: StartupXLOG, xlog.c:6585
2023-09-18 11:09:59.976 CST [1237717] DEBUG: 00000: removing all temporary WAL segments
2023-09-18 11:09:59.976 CST [1237717] LOCATION: RemoveTempXlogFiles, xlog.c:4070
2023-09-18 11:09:59.993 CST [1237717] DEBUG: 00000: backup time 2023-09-18 10:12:21 CST in file "backup_label"
2023-09-18 11:09:59.993 CST [1237717] LOCATION: read_backup_label, xlog.c:12143
2023-09-18 11:09:59.993 CST [1237717] DEBUG: 00000: backup label bak1 in file "backup_label"
2023-09-18 11:09:59.993 CST [1237717] LOCATION: read_backup_label, xlog.c:12148
2023-09-18 11:09:59.993 CST [1237717] DEBUG: 00000: backup timeline 1 in file "backup_label"
2023-09-18 11:09:59.993 CST [1237717] LOCATION: read_backup_label, xlog.c:12165
2023-09-18 11:09:59.993 CST [1237717] DEBUG: 00000: checkpoint record is at 0/2000098
2023-09-18 11:09:59.993 CST [1237717] LOCATION: StartupXLOG, xlog.c:6729
2023-09-18 11:09:59.993 CST [1237717] DEBUG: 00000: redo record is at 0/2000028; shutdown false
2023-09-18 11:09:59.993 CST [1237717] LOCATION: StartupXLOG, xlog.c:6936
2023-09-18 11:09:59.993 CST [1237717] DEBUG: 00000: next transaction ID: 735; next OID: 24576
2023-09-18 11:09:59.993 CST [1237717] LOCATION: StartupXLOG, xlog.c:6940
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: next MultiXactId: 1; next MultiXactOffset: 0
2023-09-18 11:09:59.994 CST [1237717] LOCATION: StartupXLOG, xlog.c:6944
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: oldest unfrozen transaction ID: 726, in database 1
2023-09-18 11:09:59.994 CST [1237717] LOCATION: StartupXLOG, xlog.c:6947
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: oldest MultiXactId: 1, in database 1
2023-09-18 11:09:59.994 CST [1237717] LOCATION: StartupXLOG, xlog.c:6950
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: commit timestamp Xid oldest/newest: 0/0
2023-09-18 11:09:59.994 CST [1237717] LOCATION: StartupXLOG, xlog.c:6953
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: transaction ID wrap limit is 2147484373, limited by database with OID 1
2023-09-18 11:09:59.994 CST [1237717] LOCATION: SetTransactionIdLimit, varsup.c:427
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2023-09-18 11:09:59.994 CST [1237717] LOCATION: SetMultiXactIdLimit, multixact.c:2283
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: starting up replication slots
2023-09-18 11:09:59.994 CST [1237717] LOCATION: StartupReplicationSlots, slot.c:1394
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: xmin required by slots: data 0, catalog 0
2023-09-18 11:09:59.994 CST [1237717] LOCATION: ProcArraySetReplicationSlotXmin, procarray.c:3984
2023-09-18 11:09:59.994 CST [1237717] DEBUG: 00000: starting up replication origin progress state
2023-09-18 11:09:59.994 CST [1237717] LOCATION: StartupReplicationOrigin, origin.c:706
2023-09-18 11:09:59.996 CST [1237717] DEBUG: 00000: resetting unlogged relations: cleanup 1 init 0
2023-09-18 11:09:59.996 CST [1237717] LOCATION: ResetUnloggedRelations, reinit.c:55
2023-09-18 11:10:00.008 CST [1237717] LOG: 00000: redo starts at 0/2000028
2023-09-18 11:10:00.008 CST [1237717] LOCATION: StartupXLOG, xlog.c:7387
2023-09-18 11:10:00.008 CST [1237717] DEBUG: 00000: end of backup reached
2023-09-18 11:10:00.008 CST [1237717] CONTEXT: WAL redo at 0/2000240 for XLOG/BACKUP_END: 0/2000028
2023-09-18 11:10:00.008 CST [1237717] LOCATION: xlog_redo, xlog.c:10595
2023-09-18 11:10:00.010 CST [1237717] LOG: 00000: consistent recovery state reached at 0/2000268 -- the consistency point, i.e. the location returned by pg_stop_backup
2023-09-18 11:10:00.010 CST [1237717] LOCATION: CheckRecoveryConsistency, xlog.c:8331
Inspect the WAL files:
postgres@slpc:~/pgsql$ pg_waldump -p pgbak/pg_wal/ 000000010000000000000003
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/03000028, prev 0/02000268, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/03000060, prev 0/03000028, desc: CHECKPOINT_SHUTDOWN redo 0/3000060; tli 1; prev tli 1; fpw true; xid 0:736; oid 24576; multi 1; offset 0; oldest xid 726 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/030000D8, prev 0/03000060, desc: RUNNING_XACTS nextXid 736 latestCompletedXid 735 oldestRunningXid 736
- Check that the restored instance started successfully:
postgres@slpc:~/pgsql/pgdata$ psql -p 7431
psql (14.8)
Type "help" for help.
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+------+-------+----------
public | t1 | table | postgres
(1 row)
postgres=# select * from t1;
a
---
1
2
(2 rows)
Now we turn to the source code.
pg_start_backup
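pg_start_backup prepares the server for taking a base backup. Recovery starts from a redo point, so pg_start_backup must perform a checkpoint to explicitly create a redo point at the moment the base backup begins. The location of this checkpoint has to be stored somewhere other than pg_control, because the workload keeps running during the backup and regular checkpoints may run several times in the meantime, overwriting the value in pg_control.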
pg_start_backup ( label text [, fast boolean [, exclusive boolean ]] ) → pg_lsn
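Prepares the server to begin an on-line backup. The only required parameter is an arbitrary user-defined label for the backup. (Typically this is the name under which the backup dump file will be stored.) If the optional second parameter is true, pg_start_backup runs as quickly as possible: it forces an immediate checkpoint, which causes a spike in I/O and slows down concurrently executing queries. The optional third parameter selects an exclusive or non-exclusive backup (the default is exclusive). In exclusive mode, the function writes a backup label file (backup_label) and, if there are any links in the pg_tblspc/ directory, a tablespace map file (tablespace_map) to the database cluster's data directory, then performs a checkpoint, and then returns the backup's starting write-ahead log location. (The user can ignore this result value, but it is provided in case it is useful.) In non-exclusive mode, the contents of these files are instead returned by the pg_stop_backup function and should be copied to the backup area by the user.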
Source analysis: we skip the intermediate call path that leads to pg_start_backup and go straight to the function implementation.
pg_start_backup
--> do_pg_start_backup
pg_start_backup is implemented as follows:
/*
* pg_start_backup: set up for taking an on-line backup dump
*
* Essentially what this does is to create a backup label file in $PGDATA,
* where it will be archived as part of the backup dump. The label file
* contains the user-supplied label string (typically this would be used
* to tell where the backup dump will be stored) and the starting time and
* starting WAL location for the dump.
*/
Datum pg_start_backup(PG_FUNCTION_ARGS)
{
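text *backupid = PG_GETARG_TEXT_PP(0); // argument 1: an arbitrary string that uniquely identifies this backup
// By default pg_start_backup can take a long time to finish, because it performs a checkpoint
// whose I/O is spread over a significant period: a fraction of the checkpoint interval set by
// checkpoint_completion_target (default 0.9 since PostgreSQL 14, 0.5 before). This is usually
// what you want, since it minimizes the impact on query processing. To start the backup as
// quickly as possible, pass true as the second argument; that issues an immediate checkpoint
// which uses as much I/O as it can.
bool fast = PG_GETARG_BOOL(1);
bool exclusive = PG_GETARG_BOOL(2); // argument 3: true for an exclusive backup, false for a non-exclusive one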
char *backupidstr;
XLogRecPtr startpoint;
SessionBackupState status = get_backup_status();
backupidstr = text_to_cstring(backupid);
if (status == SESSION_BACKUP_NON_EXCLUSIVE)
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("a backup is already in progress in this session")));
if (exclusive) // exclusive backup?
{
startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL, NULL, NULL);
}
else
{
MemoryContext oldcontext;
/* Label file and tablespace map file need to be long-lived, since they are read in pg_stop_backup. */
oldcontext = MemoryContextSwitchTo(TopMemoryContext);
label_file = makeStringInfo();
tblspc_map_file = makeStringInfo();
MemoryContextSwitchTo(oldcontext);
register_persistent_abort_backup_handler();
startpoint = do_pg_start_backup(backupidstr, fast, NULL, label_file, NULL, tblspc_map_file);
}
PG_RETURN_LSN(startpoint); // return the start LSN
}
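The real work happens in do_pg_start_backup, which mainly does the following:
- Forces full_page_writes on; the setting is restored when the backup ends.
- Switches to a new WAL segment file (which makes archiving and copying the WAL easier). WAL segment files are named as follows: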
/* Generate a WAL segment file name.*/
#define XLogFileName(fname, tli, logSegNo, wal_segsz_bytes) \
snprintf(fname, MAXFNAMELEN, "%08X%08X%08X", tli, \
(uint32) ((logSegNo) / XLogSegmentsPerXLogId(wal_segsz_bytes)), \
(uint32) ((logSegNo) % XLogSegmentsPerXLogId(wal_segsz_bytes)))
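- Performs a checkpoint.
- Builds the backup_label file, which stores the checkpoint location and related information.
It returns the minimum WAL LSN together with the timeline ID. This LSN is the starting WAL location needed to restore from the backup.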
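As a cross-check of the naming rule, the following sketch redoes the XLByteToSeg and XLogFileName arithmetic outside the server. It assumes the default 16MB wal_segment_size, and the function name wal_file_name is made up for illustration.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default wal_segment_size


def wal_file_name(lsn, timeline=1, seg_size=WAL_SEGMENT_SIZE):
    """Map an LSN given as 'hi/lo' hex text to its WAL segment file name."""
    hi, lo = lsn.split("/")
    pos = (int(hi, 16) << 32) | int(lo, 16)
    seg_no = pos // seg_size                  # what XLByteToSeg computes
    segs_per_id = 0x100000000 // seg_size     # XLogSegmentsPerXLogId
    # Three 8-digit hex fields: timeline, high part, low part of the segment.
    return "%08X%08X%08X" % (
        timeline, seg_no // segs_per_id, seg_no % segs_per_id)
```

Both start locations seen in this article check out: 0/2000028 maps to 000000010000000000000002 and 7/F7000148 maps to 0000000100000007000000F7, the file names recorded in the corresponding backup_label files.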
/*
* do_pg_start_backup
*
* Utility function called at the start of an online backup. It creates the
* necessary starting checkpoint and constructs the backup label file.
* Returns the minimum WAL location that must be present to restore from this
* backup, and the corresponding timeline ID in *starttli_p.
*/
XLogRecPtr
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
StringInfo labelfile, List **tablespaces, StringInfo tblspcmapfile)
{
bool exclusive = (labelfile == NULL);
bool backup_started_in_recovery = false;
XLogRecPtr checkpointloc;
XLogRecPtr startpoint;
TimeLineID starttli;
pg_time_t stamp_time;
char strfbuf[128];
char xlogfilename[MAXFNAMELEN];
XLogSegNo _logSegNo;
struct stat stat_buf;
FILE *fp;
backup_started_in_recovery = RecoveryInProgress();
// an exclusive backup cannot be taken during recovery
/* Currently only non-exclusive backup can be taken during recovery.*/
if (backup_started_in_recovery && exclusive)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("recovery is in progress"),
errhint("WAL control functions cannot be executed during recovery.")));
/* During recovery, we don't need to check WAL level. Because, if WAL
* level is not sufficient, it's impossible to get here during recovery. */
if (!backup_started_in_recovery && !XLogIsNeeded())
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("WAL level not sufficient for making an online backup"),
errhint("wal_level must be set to \"replica\" or \"logical\" at server start.")));
// ...
/*
* Mark backup active in shared memory. We must do full-page WAL writes
* during an on-line backup even if not doing so at other times, because
* it's quite possible for the backup dump to obtain a "torn" (partially
* written) copy of a database page if it reads the page concurrently with
* our write to the same page. This can be fixed as long as the first
* write to the page in the WAL sequence is a full-page write. Hence, we
* turn on forcePageWrites and then force a CHECKPOINT, to ensure there
* are no dirty pages in shared memory that might get dumped while the
* backup is in progress without having a corresponding WAL record. (Once
* the backup is complete, we need not force full-page writes anymore,
* since we expect that any pages not modified during the backup interval
* must have been correctly captured by the backup.)
*
* Note that forcePageWrites has no effect during an online backup from
* the standby.
*
* We must hold all the insertion locks to change the value of
* forcePageWrites, to ensure adequate interlocking against
* XLogInsertRecord().
*/
WALInsertLockAcquireExclusive();
if (exclusive)
{
/* At first, mark that we're now starting an exclusive backup, to
* ensure that there are no other sessions currently running pg_start_backup() or pg_stop_backup(). */
if (XLogCtl->Insert.exclusiveBackupState != EXCLUSIVE_BACKUP_NONE)
{
WALInsertLockRelease();
ereport(ERROR,(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("a backup is already in progress"), errhint("Run pg_stop_backup() and try again.")));
}
XLogCtl->Insert.exclusiveBackupState = EXCLUSIVE_BACKUP_STARTING;
}
else
XLogCtl->Insert.nonExclusiveBackups++;
XLogCtl->Insert.forcePageWrites = true; /* force full-page writes on */
WALInsertLockRelease();
/* Ensure we release forcePageWrites if fail below */
PG_ENSURE_ERROR_CLEANUP(pg_start_backup_callback, (Datum) BoolGetDatum(exclusive));
{
bool gotUniqueStartpoint = false;
DIR *tblspcdir;
struct dirent *de;
tablespaceinfo *ti;
int datadirpathlen;
/*
* Force an XLOG file switch before the checkpoint, to ensure that the
* WAL segment the checkpoint is written to doesn't contain pages with
* old timeline IDs. That would otherwise happen if you called
* pg_start_backup() right after restoring from a PITR archive: the
* first WAL segment containing the startup checkpoint has pages in
* the beginning with the old timeline ID. That can cause trouble at
* recovery: we won't have a history file covering the old timeline if
* pg_wal directory was not included in the base backup and the WAL
* archive was cleared too before starting the backup.
*
* This also ensures that we have emitted a WAL page header that has
* XLP_BKP_REMOVABLE off before we emit the checkpoint record.
* Therefore, if a WAL archiver (such as pglesslog) is trying to
* compress out removable backup blocks, it won't remove any that
* occur after this point.
*
* During recovery, we skip forcing XLOG file switch, which means that
* the backup taken during recovery is not available for the special
* recovery case described above.
*/
if (!backup_started_in_recovery)
RequestXLogSwitch(false); // switch to a new WAL segment file now; otherwise a switch only happens once the current segment (16MB by default) fills up
do
{
bool checkpointfpw;
// force a checkpoint
/*
* Force a CHECKPOINT. Aside from being necessary to prevent torn
* page problems, this guarantees that two successive backup runs
* will have different checkpoint positions and hence different
* history file names, even if nothing happened in between.
*
* During recovery, establish a restartpoint if possible. We use
* the last restartpoint as the backup starting checkpoint. This
* means that two successive backup runs can have same checkpoint
* positions.
*
* Since the fact that we are executing do_pg_start_backup()
* during recovery means that checkpointer is running, we can use
* RequestCheckpoint() to establish a restartpoint.
*
* We use CHECKPOINT_IMMEDIATE only if requested by user (via
* passing fast = true). Otherwise this can take awhile.
*/
RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT | (fast ? CHECKPOINT_IMMEDIATE : 0));
/*
* Now we need to fetch the checkpoint record location, and also
* its REDO pointer. The oldest point in WAL that would be needed
* to restore starting from the checkpoint is precisely the REDO pointer. */
LWLockAcquire(ControlFileLock, LW_SHARED);
checkpointloc = ControlFile->checkPoint; // fetch the latest checkpoint information
startpoint = ControlFile->checkPointCopy.redo;
starttli = ControlFile->checkPointCopy.ThisTimeLineID;
checkpointfpw = ControlFile->checkPointCopy.fullPageWrites;
LWLockRelease(ControlFileLock);
// ...
/*
* If two base backups are started at the same time (in WAL sender
* processes), we need to make sure that they use different
* checkpoints as starting locations, because we use the starting
* WAL location as a unique identifier for the base backup in the
* end-of-backup WAL record and when we write the backup history
* file. Perhaps it would be better generate a separate unique ID
* for each backup instead of forcing another checkpoint, but
* taking a checkpoint right after another is not that expensive
* either because only few buffers have been dirtied yet.
*/
WALInsertLockAcquireExclusive();
if (XLogCtl->Insert.lastBackupStart < startpoint)
{
XLogCtl->Insert.lastBackupStart = startpoint;
gotUniqueStartpoint = true;
}
WALInsertLockRelease();
} while (!gotUniqueStartpoint);
XLByteToSeg(startpoint, _logSegNo, wal_segment_size); //Compute a segment number from an XLogRecPtr.
XLogFileName(xlogfilename, starttli, _logSegNo, wal_segment_size); // build the WAL segment file name
/* Construct tablespace_map file. */
if (tblspcmapfile == NULL)
tblspcmapfile = makeStringInfo();
datadirpathlen = strlen(DataDir);
/* Collect information about all tablespaces */
tblspcdir = AllocateDir("pg_tblspc");
while ((de = ReadDir(tblspcdir, "pg_tblspc")) != NULL)
{
// ...
}
FreeDir(tblspcdir);
// build the contents of the backup_label file
/* Construct backup label file. */
if (labelfile == NULL)
labelfile = makeStringInfo();
/* Use the log timezone here, not the session timezone */
stamp_time = (pg_time_t) time(NULL);
pg_strftime(strfbuf, sizeof(strfbuf),
"%Y-%m-%d %H:%M:%S %Z",
pg_localtime(&stamp_time, log_timezone));
appendStringInfo(labelfile, "START WAL LOCATION: %X/%X (file %s)\n",
LSN_FORMAT_ARGS(startpoint), xlogfilename);
appendStringInfo(labelfile, "CHECKPOINT LOCATION: %X/%X\n",
LSN_FORMAT_ARGS(checkpointloc));
appendStringInfo(labelfile, "BACKUP METHOD: %s\n",
exclusive ? "pg_start_backup" : "streamed");
appendStringInfo(labelfile, "BACKUP FROM: %s\n",
backup_started_in_recovery ? "standby" : "primary");
appendStringInfo(labelfile, "START TIME: %s\n", strfbuf);
appendStringInfo(labelfile, "LABEL: %s\n", backupidstr);
appendStringInfo(labelfile, "START TIMELINE: %u\n", starttli);
// write the backup_label file to disk
/* Okay, write the file, or return its contents to caller. */
if (exclusive)
{
/* Check for existing backup label --- implies a backup is already
* running. (XXX given that we checked exclusiveBackupState
* above, maybe it would be OK to just unlink any such label file?) */
if (stat(BACKUP_LABEL_FILE, &stat_buf) != 0)
{
if (errno != ENOENT)
ereport(ERROR, (errcode_for_file_access(), errmsg("could not stat file \"%s\": %m", BACKUP_LABEL_FILE)));
}
else
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("a backup is already in progress"),
errhint("If you're sure there is no backup in progress, remove file \"%s\" and try again.",
BACKUP_LABEL_FILE)));
fp = AllocateFile(BACKUP_LABEL_FILE, "w");
if (!fp)
ereport(ERROR,(errcode_for_file_access(), errmsg("could not create file \"%s\": %m",BACKUP_LABEL_FILE)));
if (fwrite(labelfile->data, labelfile->len, 1, fp) != 1 ||fflush(fp) != 0 ||pg_fsync(fileno(fp)) != 0 ||ferror(fp) ||FreeFile(fp))
ereport(ERROR,(errcode_for_file_access(), errmsg("could not write file \"%s\": %m",BACKUP_LABEL_FILE)));
/* Allocated locally for exclusive backups, so free separately */
pfree(labelfile->data);
pfree(labelfile);
/* Write backup tablespace_map file. */
if (tblspcmapfile->len > 0)
{
if (stat(TABLESPACE_MAP, &stat_buf) != 0)
{
if (errno != ENOENT)
ereport(ERROR,(errcode_for_file_access(), errmsg("could not stat file \"%s\": %m",TABLESPACE_MAP)));
}
else
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("a backup is already in progress"),
errhint("If you're sure there is no backup in progress, remove file \"%s\" and try again.",
TABLESPACE_MAP)));
fp = AllocateFile(TABLESPACE_MAP, "w");
if (!fp)
ereport(ERROR,(errcode_for_file_access(), errmsg("could not create file \"%s\": %m",TABLESPACE_MAP)));
if (fwrite(tblspcmapfile->data, tblspcmapfile->len, 1, fp) != 1 ||
fflush(fp) != 0 ||pg_fsync(fileno(fp)) != 0 ||ferror(fp) ||FreeFile(fp))
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write file \"%s\": %m",
TABLESPACE_MAP)));
}
/* Allocated locally for exclusive backups, so free separately */
pfree(tblspcmapfile->data);
pfree(tblspcmapfile);
}
}
PG_END_ENSURE_ERROR_CLEANUP(pg_start_backup_callback, (Datum) BoolGetDatum(exclusive));
/*
* Mark that start phase has correctly finished for an exclusive backup.
* Session-level locks are updated as well to reflect that state.
*
* Note that CHECK_FOR_INTERRUPTS() must not occur while updating backup
* counters and session-level lock. Otherwise they can be updated
* inconsistently, and which might cause do_pg_abort_backup() to fail.
*/
if (exclusive)
{
WALInsertLockAcquireExclusive();
XLogCtl->Insert.exclusiveBackupState = EXCLUSIVE_BACKUP_IN_PROGRESS;
/* Set session-level lock */
sessionBackupState = SESSION_BACKUP_EXCLUSIVE;
WALInsertLockRelease();
}
else
sessionBackupState = SESSION_BACKUP_NON_EXCLUSIVE;
/* We're done. As a convenience, return the starting WAL location.*/
if (starttli_p)
*starttli_p = starttli;
return startpoint;
}
Run the following command:
postgres=# select pg_start_backup('bak1');
pg_start_backup
-----------------
7/F7000148
(1 row)
-- contents of the generated backup_label file
postgres@slpc:~/pgsql/pgdata$ cat backup_label
START WAL LOCATION: 7/F7000148 (file 0000000100000007000000F7)
CHECKPOINT LOCATION: 7/F7000180
BACKUP METHOD: pg_start_backup
BACKUP FROM: primary
START TIME: 2023-09-15 15:05:13 CST
LABEL: bak1
START TIMELINE: 1
pg_stop_backup
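pg_stop_backup ends the backup. Its main steps are:
- Turn forced full_page_writes back off, if it was forced on.
- Write an end-of-backup XLOG record.
- Switch the WAL segment file.
- Create a backup history file.
- Remove the backup_label file. It was created in the source instance's data directory and must be removed; otherwise the source database would read it on restart and its normal recovery would be disrupted.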
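The first step deserves a closer look: forced full-page writes are only switched off once no backup of either kind is still running. The toy model below mimics that bookkeeping. It is not PostgreSQL code; the class and attribute names are invented, and it collapses the STARTING / IN_PROGRESS / STOPPING states of the exclusive backup into a single boolean.

```python
class BackupState:
    """Toy model of the shared backup counters (illustrative only)."""

    def __init__(self):
        self.exclusive_in_progress = False
        self.non_exclusive_backups = 0
        self.force_page_writes = False

    def start(self, exclusive):
        if exclusive:
            # Only one exclusive backup may run at a time.
            if self.exclusive_in_progress:
                raise RuntimeError("a backup is already in progress")
            self.exclusive_in_progress = True
        else:
            self.non_exclusive_backups += 1
        self.force_page_writes = True

    def stop(self, exclusive):
        if exclusive:
            if not self.exclusive_in_progress:
                raise RuntimeError("exclusive backup not in progress")
            self.exclusive_in_progress = False
        else:
            if self.non_exclusive_backups == 0:
                raise RuntimeError("no non-exclusive backup in progress")
            self.non_exclusive_backups -= 1
        # Stop forcing full-page writes only when no backup remains.
        if not self.exclusive_in_progress and self.non_exclusive_backups == 0:
            self.force_page_writes = False
```

If an exclusive and a non-exclusive backup overlap, stopping the first one leaves force_page_writes set; it is cleared when the second one stops.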
pg_stop_backup ( exclusive boolean [, wait_for_archive boolean ] ) → setof record ( lsn pg_lsn, labelfile text, spcmapfile text )
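Finishes an exclusive or non-exclusive on-line backup. The exclusive parameter must match the preceding pg_start_backup call. In an exclusive backup, pg_stop_backup removes the backup label file and, if it exists, the tablespace map file created by pg_start_backup. In a non-exclusive backup, the desired contents of these files are returned as part of the function result and should be written to files in the backup area (not in the data directory).
There is an optional second parameter of type boolean. If it is false, the function returns immediately after the backup is completed, without waiting for WAL to be archived. This behavior is only useful with backup software that monitors WAL archiving independently. Otherwise, WAL required to make the backup consistent might be missing, which makes the backup useless. By default, or when this parameter is true, pg_stop_backup waits for WAL to be archived when archiving is enabled. (On a standby, this means it waits only when archive_mode = always. If write activity on the primary is low, it may help to run pg_switch_wal on the primary to trigger an immediate segment switch.)
When executed on a primary, this function also creates a backup history file in the write-ahead log archive area. The history file includes the label given to pg_start_backup, the starting and ending write-ahead log locations of the backup, and the starting and ending times of the backup. After recording the ending location, the current write-ahead log insertion point is automatically advanced to the next write-ahead log file, so that the ending write-ahead log file can be archived immediately to complete the backup.
The result of the function is a single record. The lsn column holds the backup's ending write-ahead log location (which again can be ignored). The second and third columns are NULL when ending an exclusive backup; after a non-exclusive backup they hold the desired contents of the label and tablespace map files.
There is also a variant that takes no arguments: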
pg_stop_backup () → pg_lsn
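Finishes an exclusive on-line backup. This simplified version is equivalent to pg_stop_backup(true, true), except that it only returns the pg_lsn result.
The source is as follows: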
/*
* pg_stop_backup: finish taking an on-line backup dump
*
* We write an end-of-backup WAL record, and remove the backup label file
* created by pg_start_backup, creating a backup history file in pg_wal
* instead (whence it will immediately be archived). The backup history file
* contains the same info found in the label file, plus the backup-end time
* and WAL location.
*
* Note: this version is only called to stop an exclusive backup. The function
* pg_stop_backup_v2 (overloaded as pg_stop_backup in SQL) is called to stop non-exclusive backups.
*/
Datum pg_stop_backup(PG_FUNCTION_ARGS)
{
XLogRecPtr stoppoint;
SessionBackupState status = get_backup_status();
if (status == SESSION_BACKUP_NON_EXCLUSIVE)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("non-exclusive backup in progress"),
errhint("Did you mean to use pg_stop_backup('f')?")));
/*
* Exclusive backups were typically started in a different connection, so
* don't try to verify that status of backup is set to
* SESSION_BACKUP_EXCLUSIVE in this function. Actual verification that an
* exclusive backup is in fact running is handled inside
* do_pg_stop_backup.
*/
stoppoint = do_pg_stop_backup(NULL, true, NULL);
PG_RETURN_LSN(stoppoint);
}
/*
* do_pg_stop_backup
*
* Utility function called at the end of an online backup. It cleans up the
* backup state and can optionally wait for WAL segments to be archived.
*
* If labelfile is NULL, this stops an exclusive backup. Otherwise this stops
* the non-exclusive backup specified by 'labelfile'.
*
* Returns the last WAL location that must be present to restore from this
* backup, and the corresponding timeline ID in *stoptli_p.
*/
XLogRecPtr
do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p)
{
bool exclusive = (labelfile == NULL);
bool backup_started_in_recovery = false;
XLogRecPtr startpoint;
XLogRecPtr stoppoint;
TimeLineID stoptli;
pg_time_t stamp_time;
char strfbuf[128];
char histfilepath[MAXPGPATH];
char startxlogfilename[MAXFNAMELEN];
char stopxlogfilename[MAXFNAMELEN];
char lastxlogfilename[MAXFNAMELEN];
char histfilename[MAXFNAMELEN];
char backupfrom[20];
XLogSegNo _logSegNo;
FILE *lfp;
FILE *fp;
char ch;
int seconds_before_warning;
int waits = 0;
bool reported_waiting = false;
char *remaining;
char *ptr;
uint32 hi,lo;
// ...
if (exclusive)
{
/*
* At first, mark that we're now stopping an exclusive backup, to
* ensure that there are no other sessions currently running
* pg_start_backup() or pg_stop_backup().
*/
WALInsertLockAcquireExclusive();
if (XLogCtl->Insert.exclusiveBackupState != EXCLUSIVE_BACKUP_IN_PROGRESS)
{
WALInsertLockRelease();
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("exclusive backup not in progress")));
}
XLogCtl->Insert.exclusiveBackupState = EXCLUSIVE_BACKUP_STOPPING;
WALInsertLockRelease();
/*
* Remove backup_label. In case of failure, the state for an exclusive
* backup is switched back to in-progress.
*/
PG_ENSURE_ERROR_CLEANUP(pg_stop_backup_callback, (Datum) BoolGetDatum(exclusive));
{
// ...
// Delete the backup_label file
/*
* Close and remove the backup label file
*/
if (r != 1 || ferror(lfp) || FreeFile(lfp))
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not read file \"%s\": %m",
BACKUP_LABEL_FILE)));
durable_unlink(BACKUP_LABEL_FILE, ERROR);
/*
* Remove tablespace_map file if present, it is created only if
* there are tablespaces.
*/
durable_unlink(TABLESPACE_MAP, DEBUG1);
}
PG_END_ENSURE_ERROR_CLEANUP(pg_stop_backup_callback, (Datum) BoolGetDatum(exclusive));
}
/*
* OK to update backup counters, forcePageWrites and session-level lock.
*
* Note that CHECK_FOR_INTERRUPTS() must not occur while updating them.
* Otherwise they can be updated inconsistently, and which might cause
* do_pg_abort_backup() to fail.
*/
WALInsertLockAcquireExclusive();
if (exclusive)
{
XLogCtl->Insert.exclusiveBackupState = EXCLUSIVE_BACKUP_NONE;
}
else
{
/*
* The user-visible pg_start/stop_backup() functions that operate on
* exclusive backups can be called at any time, but for non-exclusive
* backups, it is expected that each do_pg_start_backup() call is
* matched by exactly one do_pg_stop_backup() call.
*/
Assert(XLogCtl->Insert.nonExclusiveBackups > 0);
XLogCtl->Insert.nonExclusiveBackups--;
}
if (XLogCtl->Insert.exclusiveBackupState == EXCLUSIVE_BACKUP_NONE &&
XLogCtl->Insert.nonExclusiveBackups == 0)
{
XLogCtl->Insert.forcePageWrites = false; // stop forcing full-page writes
}
/*
* Clean up session-level lock.
*
* You might think that WALInsertLockRelease() can be called before
* cleaning up session-level lock because session-level lock doesn't need
* to be protected with WAL insertion lock. But since
* CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be
* cleaned up before it.
*/
sessionBackupState = SESSION_BACKUP_NONE;
WALInsertLockRelease();
/*
* Read and parse the START WAL LOCATION line (this code is pretty crude,
* but we are not expecting any variability in the file format).
*/
if (sscanf(labelfile, "START WAL LOCATION: %X/%X (file %24s)%c",
&hi, &lo, startxlogfilename,
&ch) != 4 || ch != '\n')
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));
startpoint = ((uint64) hi) << 32 | lo;
remaining = strchr(labelfile, '\n') + 1; /* %n is not portable enough */
/*
* Parse the BACKUP FROM line. If we are taking an online backup from the
* standby, we confirm that the standby has not been promoted during the
* backup.
*/
ptr = strstr(remaining, "BACKUP FROM:");
if (!ptr || sscanf(ptr, "BACKUP FROM: %19s\n", backupfrom) != 1)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));
if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("the standby was promoted during online backup"),
errhint("This means that the backup being taken is corrupt "
"and should not be used. "
"Try taking another online backup.")));
/*
* During recovery, we don't write an end-of-backup record. We assume that
* pg_control was backed up last and its minimum recovery point can be
* available as the backup end location. Since we don't have an
* end-of-backup record, we use the pg_control value to check whether
* we've reached the end of backup when starting recovery from this
* backup. We have no way of checking if pg_control wasn't backed up last
* however.
*
* We don't force a switch to new WAL file but it is still possible to
* wait for all the required files to be archived if waitforarchive is
* true. This is okay if we use the backup to start a standby and fetch
* the missing WAL using streaming replication. But in the case of an
* archive recovery, a user should set waitforarchive to true and wait for
* them to be archived to ensure that all the required files are
* available.
*
* We return the current minimum recovery point as the backup end
* location. Note that it can be greater than the exact backup end
* location if the minimum recovery point is updated after the backup of
* pg_control. This is harmless for current uses.
*
* XXX currently a backup history file is for informational and debug
* purposes only. It's not essential for an online backup. Furthermore,
* even if it's created, it will not be archived during recovery because
* an archiver is not invoked. So it doesn't seem worthwhile to write a
* backup history file during recovery.
*/
if (backup_started_in_recovery)
{
// ...
}
else
{
// Write a backup-end XLOG record
/* Write the backup-end xlog record */
XLogBeginInsert();
XLogRegisterData((char *) (&startpoint), sizeof(startpoint));
stoppoint = XLogInsert(RM_XLOG_ID, XLOG_BACKUP_END);
stoptli = ThisTimeLineID;
/*
* Force a switch to a new xlog segment file, so that the backup is
* valid as soon as archiver moves out the current segment file. */
RequestXLogSwitch(false); // switch WAL segment files so the current one can be archived promptly, shortening the wait for archiving below
XLByteToPrevSeg(stoppoint, _logSegNo, wal_segment_size);
XLogFileName(stopxlogfilename, stoptli, _logSegNo, wal_segment_size);
/* Use the log timezone here, not the session timezone */
stamp_time = (pg_time_t) time(NULL);
pg_strftime(strfbuf, sizeof(strfbuf),"%Y-%m-%d %H:%M:%S %Z", pg_localtime(&stamp_time, log_timezone));
/* Write the backup history file */
XLByteToSeg(startpoint, _logSegNo, wal_segment_size);
BackupHistoryFilePath(histfilepath, stoptli, _logSegNo, startpoint, wal_segment_size);
fp = AllocateFile(histfilepath, "w");
if (!fp)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not create file \"%s\": %m",
histfilepath)));
fprintf(fp, "START WAL LOCATION: %X/%X (file %s)\n",
LSN_FORMAT_ARGS(startpoint), startxlogfilename);
fprintf(fp, "STOP WAL LOCATION: %X/%X (file %s)\n",
LSN_FORMAT_ARGS(stoppoint), stopxlogfilename);
/* Transfer remaining lines including label and start timeline to history file.*/
fprintf(fp, "%s", remaining);
fprintf(fp, "STOP TIME: %s\n", strfbuf);
fprintf(fp, "STOP TIMELINE: %u\n", stoptli);
if (fflush(fp) || ferror(fp) || FreeFile(fp))
ereport(ERROR, (errcode_for_file_access(), errmsg("could not write file \"%s\": %m", histfilepath)));
/* Clean out any no-longer-needed history files. As a side effect,
* this will post a .ready file for the newly created history file,
* notifying the archiver that history file may be archived immediately. */
CleanupBackupHistory();
}
// Wait for archiving to finish
/*
* If archiving is enabled, wait for all the required WAL files to be
* archived before returning. If archiving isn't enabled, the required WAL
* needs to be transported via streaming replication (hopefully with
* wal_keep_size set high enough), or some more exotic mechanism like
* polling and copying files from pg_wal with script. We have no knowledge
* of those mechanisms, so it's up to the user to ensure that he gets all
* the required WAL.
*
* We wait until both the last WAL file filled during backup and the
* history file have been archived, and assume that the alphabetic sorting
* property of the WAL files ensures any earlier WAL files are safely
* archived as well.
*
* We wait forever, since archive_command is supposed to work and we
* assume the admin wanted his backup to work completely. If you don't
* wish to wait, then either waitforarchive should be passed in as false,
* or you can set statement_timeout. Also, some notices are issued to
* clue in anyone who might be doing this interactively. */
if (waitforarchive && ((!backup_started_in_recovery && XLogArchivingActive()) || (backup_started_in_recovery && XLogArchivingAlways())))
{
XLByteToPrevSeg(stoppoint, _logSegNo, wal_segment_size);
XLogFileName(lastxlogfilename, stoptli, _logSegNo, wal_segment_size);
XLByteToSeg(startpoint, _logSegNo, wal_segment_size);
BackupHistoryFileName(histfilename, stoptli, _logSegNo, startpoint, wal_segment_size);
seconds_before_warning = 60;
waits = 0;
while (XLogArchiveIsBusy(lastxlogfilename) || XLogArchiveIsBusy(histfilename))
{
CHECK_FOR_INTERRUPTS();
if (!reported_waiting && waits > 5)
{
ereport(NOTICE, (errmsg("base backup done, waiting for required WAL segments to be archived")));
reported_waiting = true;
}
pgstat_report_wait_start(WAIT_EVENT_BACKUP_WAIT_WAL_ARCHIVE);
pg_usleep(1000000L);
pgstat_report_wait_end();
if (++waits >= seconds_before_warning)
{
seconds_before_warning *= 2; /* This wraps in >10 years... */
ereport(WARNING,
(errmsg("still waiting for all required WAL segments to be archived (%d seconds elapsed)",
waits),
errhint("Check that your archive_command is executing properly. "
"You can safely cancel this backup, "
"but the database backup will not be usable without all the WAL segments.")));
}
}
ereport(NOTICE,
(errmsg("all required WAL segments have been archived")));
}
else if (waitforarchive)
ereport(NOTICE,
(errmsg("WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup")));
/* We're done. As a convenience, return the ending WAL location.*/
if (stoptli_p)
*stoptli_p = stoptli;
return stoppoint;
}
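The START WAL LOCATION parsing and the XLByteToSeg / XLByteToPrevSeg calls in the function above can be tried outside the server. The following is a minimal standalone sketch, not PostgreSQL code: the helper names parse_start_location, seg_of, prev_seg_of and wal_file_name are made up for illustration, and it assumes the default 16 MB WAL segment size. It mirrors the sscanf pattern used on backup_label and the way an LSN is mapped to a WAL segment file name.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: default WAL segment size of 16 MB (it can be changed at initdb time). */
#define WAL_SEGMENT_SIZE ((uint64_t) 16 * 1024 * 1024)
/* Number of segments in each 4 GB of WAL address space. */
#define SEGMENTS_PER_XLOGID (UINT64_C(0x100000000) / WAL_SEGMENT_SIZE)

/*
 * Hypothetical helper: parse the first line of backup_label,
 * "START WAL LOCATION: hi/lo (file NAME)\n", into a 64-bit LSN.
 * Returns 0 on success, -1 if the line is malformed.
 */
static int
parse_start_location(const char *label, uint64_t *lsn, char file[25])
{
    unsigned int hi, lo;
    char ch;

    /* Same pattern as in do_pg_stop_backup: the trailing %c must read '\n'. */
    if (sscanf(label, "START WAL LOCATION: %X/%X (file %24s)%c",
               &hi, &lo, file, &ch) != 4 || ch != '\n')
        return -1;

    /* The LSN is printed as two 32-bit halves; recombine them. */
    *lsn = ((uint64_t) hi << 32) | lo;
    return 0;
}

/* Segment that contains the byte at lsn (what XLByteToSeg computes). */
static uint64_t
seg_of(uint64_t lsn)
{
    return lsn / WAL_SEGMENT_SIZE;
}

/* Segment that contains the byte just before lsn (what XLByteToPrevSeg computes). */
static uint64_t
prev_seg_of(uint64_t lsn)
{
    return (lsn - 1) / WAL_SEGMENT_SIZE;
}

/* Build the 24-character WAL file name: timeline, then the segment number in two parts. */
static void
wal_file_name(char *buf, size_t len, unsigned int tli, uint64_t segno)
{
    snprintf(buf, len, "%08X%08X%08X", tli,
             (unsigned int) (segno / SEGMENTS_PER_XLOGID),
             (unsigned int) (segno % SEGMENTS_PER_XLOGID));
}
```

With the label line `START WAL LOCATION: 0/2000028 (file 000000010000000000000002)`, the parsed LSN is 0x2000028, which falls in segment 2, and wal_file_name with timeline 1 reproduces the file name 000000010000000000000002. An end pointer of exactly 0/3000000 illustrates why the stop position uses the "previous segment" variant: seg_of returns 3, while prev_seg_of returns 2, the segment that holds the last byte written.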
Recovery Process
References:
9.27. System Administration Functions
He3DB recovery process source code analysis series