Ubuntu: Handling PostgreSQL Auto-Start Failures When the HDD Is Not Yet Readable at Boot

Prerequisite article:

Background:

After booting a physical Ubuntu machine, PostgreSQL frequently fails to start. The logs below show why.

System boot time:
root@Pine-Tree:~# uptime -s
2025-04-19 09:52:24

Check the PostgreSQL service status:
root@Pine-Tree:~# sudo systemctl status postgresql@15-main
× postgresql@15-main.service - PostgreSQL Cluster 15-main
     Loaded: loaded (/lib/systemd/system/postgresql@.service; enabled; vendor preset: enabled)
     Active: failed (Result: protocol) since Sat 2025-04-19 09:52:26 CST; 15min ago
    Process: 700 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 15-main start (code=exited, status=1/FAILURE)
        CPU: 41ms

4月 19 09:52:26 Pine-Tree systemd[1]: Starting PostgreSQL Cluster 15-main...
4月 19 09:52:26 Pine-Tree postgresql@15-main[700]: Error: /mnt/pgdata/main is not accessible or does not exist
4月 19 09:52:26 Pine-Tree systemd[1]: postgresql@15-main.service: Can't open PID file /run/postgresql/15-main.pid (yet?) after start: Operation not permitted
4月 19 09:52:26 Pine-Tree systemd[1]: postgresql@15-main.service: Failed with result 'protocol'.
4月 19 09:52:26 Pine-Tree systemd[1]: Failed to start PostgreSQL Cluster 15-main.

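The output shows that systemd tried to start PostgreSQL only 2 seconds after boot, when the data directory /mnt/pgdata/main on the mounted disk was not yet accessible, so the start failed.

From what I could find, an HDD connected over USB 3.0 needs roughly 3 to 20 seconds from a cold power-on until the system has finished detecting it.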
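
The 2-second gap can be checked by subtracting the boot time reported by uptime -s from the timestamp of the failed start. A minimal sketch, assuming GNU date (as shipped with Ubuntu); seconds_between is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# seconds_between START END
# Prints the number of seconds from START to END.
# Assumes GNU date, which accepts "YYYY-MM-DD HH:MM:SS" via -d.
seconds_between() {
    start_epoch=$(date -d "$1" +%s) || return 1
    end_epoch=$(date -d "$2" +%s) || return 1
    echo $(( end_epoch - start_epoch ))
}

# Boot time from `uptime -s`, failure time from the service status output.
seconds_between "2025-04-19 09:52:24" "2025-04-19 09:52:26"   # prints 2
```

The same helper can be reused after each change to compare the boot time with the time PostgreSQL actually came up.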

Approach:

Adjust the startup policy with a systemd drop-in override, the same file that systemctl edit manages.

Option 1: delay PostgreSQL startup by 5 seconds

Create the drop-in directory that systemctl edit would use:
sudo mkdir -p /etc/systemd/system/postgresql@15-main.service.d
Create the override file:
sudo nano /etc/systemd/system/postgresql@15-main.service.d/override.conf
Add the following content in the editor:
[Service]
ExecStartPre=/bin/sleep 5
Save and exit, then reload the systemd configuration:
sudo systemctl daemon-reload
Reboot to verify:
reboot
Check the PostgreSQL status after the reboot.

It started successfully:
root@Pine-Tree:~# sudo systemctl status postgresql@15-main
● postgresql@15-main.service - PostgreSQL Cluster 15-main
     Loaded: loaded (/lib/systemd/system/postgresql@.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/postgresql@15-main.service.d
             └─override.conf
     Active: active (running) since Sat 2025-04-19 11:20:15 CST; 2min 14s ago
    Process: 814 ExecStartPre=/bin/sleep 5 (code=exited, status=0/SUCCESS)
    Process: 1440 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 15-main start (code=exited, status=0/SUCCESS)
   Main PID: 1446 (postgres)

System boot time:
root@Pine-Tree:~# uptime -s
2025-04-19 11:19:56

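Check the start time of the PostgreSQL processes. The main postgres process started at 11:20:05, 9 seconds after boot, which confirms that the delay took effect: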
root@Pine-Tree:~# ps -eo pid,lstart,cmd | grep postgres | grep -v grep
   1446 Sat Apr 19 11:20:05 2025 /usr/lib/postgresql/15/bin/postgres -D /mnt/pgdata/main -c config_file=/etc/postgresql/15/main/postgresql.conf
   1483 Sat Apr 19 11:20:08 2025 postgres: 15/main: checkpointer 
   1484 Sat Apr 19 11:20:08 2025 postgres: 15/main: background writer 
   1486 Sat Apr 19 11:20:10 2025 postgres: 15/main: walwriter 
   1487 Sat Apr 19 11:20:10 2025 postgres: 15/main: autovacuum launcher 
   1488 Sat Apr 19 11:20:10 2025 postgres: 15/main: logical replication launcher 
   1838 Sat Apr 19 11:22:32 2025 postgres: 15/main: postgres dbname 192.168.125.2(6139) idle
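
A fixed sleep 5 is a guess: it wastes time when the disk is already up and is too short when detection takes longer. As an alternative sketch, ExecStartPre could run a small script that polls until the data directory appears, up to a timeout. The script path /usr/local/bin/wait-for-dir.sh and the 30-second timeout are assumptions for illustration and are not part of the original setup:

```shell
#!/bin/sh
# wait_for_dir DIR [TIMEOUT_SECONDS]
# Polls once per second until DIR exists as a directory.
# Returns 0 as soon as it is found, 1 if the timeout expires first.
wait_for_dir() {
    dir=$1
    timeout=${2:-30}
    waited=0
    while [ ! -d "$dir" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $dir" >&2
            return 1
        fi
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 0
}

# Run only when arguments are given, for example from a unit file
# (hypothetical install path):
#   ExecStartPre=/usr/local/bin/wait-for-dir.sh /mnt/pgdata/main 30
if [ "$#" -gt 0 ]; then
    wait_for_dir "$@"
fi
```

This only helps if /mnt/pgdata/main does not exist while the disk is unmounted, so that the directory's presence really signals that the mount is ready.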

Option 2: retry up to 2 times, 10 seconds apart, when the start at boot fails

Edit override.conf:
sudo nano /etc/systemd/system/postgresql@15-main.service.d/override.conf

Change the configuration to:
[Service]
Restart=on-failure
RestartSec=10s
StartLimitBurst=2
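Note that StartLimitBurst is only evaluated within the StartLimitIntervalSec window, which is 10 seconds by default. With RestartSec=10s the attempts are at least 10 seconds apart, so this setting on its own probably does not cap the number of retries, and systemd may keep retrying until the start succeeds. Current systemd also documents the StartLimit settings under [Unit] rather than [Service].
Save and exit, then reload the systemd configuration: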
sudo systemctl daemon-reload
Reboot to verify:
reboot
Check the PostgreSQL status after the reboot.

It started successfully:
root@Pine-Tree:~# sudo systemctl status postgresql@15-main
● postgresql@15-main.service - PostgreSQL Cluster 15-main
     Loaded: loaded (/lib/systemd/system/postgresql@.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/postgresql@15-main.service.d
             └─override.conf
     Active: active (running) since Sat 2025-04-19 12:30:06 CST; 7min ago
    Process: 1479 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 15-main start (code=exited, status=0/SUCCESS)
   Main PID: 1487 (postgres)

System boot time:
root@Pine-Tree:~# uptime -s
2025-04-19 12:29:47

Review the PostgreSQL start history. The first attempt at 12:29:50 failed, and the retry 10 seconds later succeeded:
 root@Pine-Tree:~# sudo journalctl -u postgresql@15-main --no-pager -n 50
 -- Boot 0ba0937613c14ba8b47c6bb17de28bcd --
4月 19 12:29:50 Pine-Tree systemd[1]: Starting PostgreSQL Cluster 15-main...
4月 19 12:29:50 Pine-Tree postgresql@15-main[794]: Error: /mnt/pgdata/main is not accessible or does not exist
4月 19 12:29:50 Pine-Tree systemd[1]: postgresql@15-main.service: Can't open PID file /run/postgresql/15-main.pid (yet?) after start: Operation not permitted
4月 19 12:29:50 Pine-Tree systemd[1]: postgresql@15-main.service: Failed with result 'protocol'.
4月 19 12:29:50 Pine-Tree systemd[1]: Failed to start PostgreSQL Cluster 15-main.
4月 19 12:30:00 Pine-Tree systemd[1]: postgresql@15-main.service: Scheduled restart job, restart counter is at 1.
4月 19 12:30:00 Pine-Tree systemd[1]: Stopped PostgreSQL Cluster 15-main.
4月 19 12:30:00 Pine-Tree systemd[1]: Starting PostgreSQL Cluster 15-main...
4月 19 12:30:06 Pine-Tree systemd[1]: Started PostgreSQL Cluster 15-main.
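
To see how many automatic restarts a boot needed, the restart counter messages can be pulled out of the journal. A minimal sketch that reads journal text on standard input; max_restart_counter is a hypothetical helper name, and it relies on the exact message wording shown in the log above:

```shell
#!/bin/sh
# max_restart_counter
# Reads journal lines on stdin and prints the highest value seen in
# "restart counter is at N." messages. Prints nothing if there are none.
max_restart_counter() {
    sed -n 's/.*restart counter is at \([0-9][0-9]*\)\..*/\1/p' | sort -n | tail -n 1
}

# Sample line copied from the journal output above.
printf '%s\n' \
    '4月 19 12:30:00 Pine-Tree systemd[1]: postgresql@15-main.service: Scheduled restart job, restart counter is at 1.' \
    | max_restart_counter   # prints 1
```

On the real machine the input would come from the journal of the current boot, for example: sudo journalctl -u postgresql@15-main -b --no-pager | max_restart_counter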

Option 3: delay startup by 5 seconds and also retry up to 2 times, 10 seconds apart, on failure

Edit override.conf and verify again:

sudo nano /etc/systemd/system/postgresql@15-main.service.d/override.conf

Change the configuration to:
[Service]
ExecStartPre=/bin/sleep 5
Restart=on-failure
RestartSec=10s
StartLimitBurst=2
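Save and exit. In most cases the 5-second delay alone is enough for the start to succeed, so the retry logic is never reached.

Then reload the systemd configuration: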
sudo systemctl daemon-reload
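
If the goal is to stop after a bounded number of attempts, the rate-limit window has to be long enough to contain all of them. StartLimitBurst counts starts inside StartLimitIntervalSec, which defaults to 10 seconds, while a 5-second pre-start sleep plus RestartSec=10s puts the attempts roughly 15 seconds apart. The sketch below prints a candidate drop-in that moves the limit settings to [Unit] and widens the window. The 120-second interval and the burst of 3 (the initial start plus two retries) are assumed values, not taken from the original configuration:

```shell
#!/bin/sh
# render_override
# Prints a candidate override.conf to stdout so it can be reviewed
# before being copied into the drop-in directory.
# Assumed values: StartLimitIntervalSec=120, StartLimitBurst=3.
render_override() {
    cat <<'EOF'
[Unit]
StartLimitIntervalSec=120
StartLimitBurst=3

[Service]
ExecStartPre=/bin/sleep 5
Restart=on-failure
RestartSec=10s
EOF
}

render_override
```

After three failed starts inside the window, systemd should refuse further automatic restarts until the unit is started manually or the failed state is reset.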

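Troubleshooting

sudo systemctl edit postgresql@15-main does not save the override. The edit is reported as canceled because the temporary file is empty, and override.conf is never created: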
root@Pine-Tree:~# sudo systemctl edit postgresql@15-main
Editing "/etc/systemd/system/postgresql@15-main.service.d/override.conf" canceled: temporary file is empty.

Workaround:

Create the drop-in directory that systemctl edit would use:
sudo mkdir -p /etc/systemd/system/postgresql@15-main.service.d

Create the override file and edit it directly:
sudo nano /etc/systemd/system/postgresql@15-main.service.d/override.conf
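
The override can also be written without an interactive editor, which sidesteps the empty temporary file case. A minimal sketch; install_dropin is a hypothetical helper that takes the drop-in directory as its argument, reads the content from standard input, and refuses to write an empty file:

```shell
#!/bin/sh
# install_dropin DIR
# Writes stdin to DIR/override.conf, creating DIR if needed.
# Returns 1 without creating the target when stdin is empty.
install_dropin() {
    dir=$1
    tmpfile=$(mktemp) || return 1
    cat > "$tmpfile"
    if [ ! -s "$tmpfile" ]; then
        echo "refusing to install an empty override" >&2
        rm -f "$tmpfile"
        return 1
    fi
    mkdir -p "$dir" \
        && mv "$tmpfile" "$dir/override.conf" \
        && chmod 644 "$dir/override.conf"
}

# Example, run as root:
#   printf '[Service]\nExecStartPre=/bin/sleep 5\n' \
#       | install_dropin /etc/systemd/system/postgresql@15-main.service.d
#   systemctl daemon-reload
```

Taking the directory as a parameter makes it possible to try the helper against a scratch directory before pointing it at /etc/systemd/system.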