PostgreSQL Troubleshooting: pgBackRest
/etc/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
[main]
pg1-path=/var/lib/postgresql
postgresql.conf:
archive_command = 'pgbackrest --stanza=main archive-push %p'
Error output:
Dec 27 17:15:37 x podman[1707]: 2025-12-27 09:15:37.386 P00 ERROR: [103]: unable to find a valid repository:
Dec 27 17:15:37 x podman[1707]: repo1: [FileMissingError] unable to load info file '/var/lib/pgbackrest/archive/main/archive.info' or '/var/lib/pgbackrest/archive/main/archive.info.copy':
Dec 27 17:15:37 x podman[1707]: FileMissingError: unable to open missing file '/var/lib/pgbackrest/archive/main/archive.info' for read
Dec 27 17:15:37 x podman[1707]: FileMissingError: unable to open missing file '/var/lib/pgbackrest/archive/main/archive.info.copy' for read
Dec 27 17:15:37 x podman[1707]: HINT: archive.info cannot be opened but is required to push/get WAL segments.
Dec 27 17:15:37 x podman[1707]: HINT: is archive_command configured correctly in postgresql.conf?
Dec 27 17:15:37 x podman[1707]: HINT: has a stanza-create been performed?
Dec 27 17:15:37 x podman[1707]: HINT: use --no-archive-check to disable archive checks during backup if you have an alternate archiving scheme.
Dec 27 17:15:37 x podman[1707]: 2025-12-27 09:15:37.387 PRC [29] LOG: archive command failed with exit code 103
Dec 27 17:15:37 x podman[1707]: 2025-12-27 09:15:37.387 PRC [29] DETAIL: The failed archive command was: pgbackrest --stanza=main archive-push pg_wal/000000010000000000000001
Dec 27 17:15:37 x podman[1707]: 2025-12-27 09:15:37.387 PRC [29] WARNING: archiving write-ahead log file "000000010000000000000001" failed too many times, will try again later
Dec 27 17:15:37 x postgres[1787]: 2025-12-27 09:15:37.387 PRC [29] DETAIL: The failed archive command was: pgbackrest --stanza=main archive-push pg_wal/000000010000000000000001
Dec 27 17:15:37 x postgres[1787]: 2025-12-27 09:15:37.387 PRC [29] WARNING: archiving write-ahead log file "000000010000000000000001" failed too many times, will try again later
Quick fix:
podman exec -it postgres bash
chown -R postgres:root /var/lib/pgbackrest /var/lib/postgresql
su - postgres -c 'pgbackrest --stanza=main stanza-create'
su - postgres -c 'pgbackrest --stanza=main check'
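This error is a textbook case, and pgBackRest's hints already spell out the answer. The rest of this note walks through the diagnosis and the fix step by step, the way you would on a production system.
You configured archive_command but never ran stanza-create for the stanza main, so archive.info does not exist and WAL cannot be archived.
As a result, pgBackRest has no valid repository to receive WAL.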
ERROR: [103]: unable to find a valid repository
👉 pgBackRest could not find a usable repo
unable to load info file
'/var/lib/pgbackrest/archive/main/archive.info'
pgBackRest needs at least one of the following files to exist before it can archive WAL:
repo1-path/archive/<stanza>/archive.info
repo1-path/archive/<stanza>/archive.info.copy
In your case the path is:
/var/lib/pgbackrest/archive/main/
👉 The directory and its files have not been created yet
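To confirm this diagnosis before changing anything, you can test for the two info files directly. The following is a minimal sketch in POSIX shell; the function name archive_info_present is hypothetical and not part of pgBackRest, and the repo path and stanza name are the values from this note.

```shell
# Hypothetical helper (not a pgBackRest command): succeeds if either info file
# exists for the stanza. $1 = repo path (repo1-path), $2 = stanza name.
archive_info_present() {
  aip_dir="${1%/}/archive/$2"
  [ -f "$aip_dir/archive.info" ] || [ -f "$aip_dir/archive.info.copy" ]
}

# Example with the values used in this note.
if archive_info_present /var/lib/pgbackrest main; then
  echo "info file found: the stanza has been created"
else
  echo "info file missing: stanza-create has not been run for this repo"
fi
```

Run it as the same user and in the same container as PostgreSQL, so that it looks at the same filesystem the archive_command sees.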
archive command failed with exit code 103
archiving write-ahead log file failed too many times
PostgreSQL is simply reporting what happened:
it ran archive_command, and the command failed.
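The correct order is:
1️⃣ Configure pgbackrest.conf
2️⃣ Run stanza-create ← the step you skipped
3️⃣ Configure archive_command
4️⃣ Start PostgreSQL
Note that stanza-create connects to PostgreSQL by default. To run it before the first start as listed here, pass --no-online; otherwise start PostgreSQL first and run stanza-create right afterwards.
What you did instead:
1️⃣ Configured pgbackrest.conf
2️⃣ Enabled archive_command right away ❌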
⚠️ Run the commands below as the OS user that PostgreSQL runs as (usually postgres)
ls -ld /var/lib/pgbackrest
ls -ld /var/lib/postgresql
Make sure that:
- pgBackRest and PostgreSQL see the same data directory
- both run as the same user (usually postgres inside the container)
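The ownership part of this check can be scripted rather than read off ls output. This is a rough sketch that assumes GNU stat (for the -c option); owner_of and owned_by are hypothetical helper names introduced here for illustration.

```shell
# Hypothetical helper: prints the owning user name of a path (GNU stat).
owner_of() {
  stat -c '%U' "$1"
}

# Hypothetical helper: succeeds only if every listed path is owned by the
# expected user. $1 = expected user, remaining arguments = paths to check.
owned_by() {
  ob_user="$1"
  shift
  for ob_target in "$@"; do
    if [ "$(owner_of "$ob_target")" != "$ob_user" ]; then
      echo "$ob_target is not owned by $ob_user" >&2
      return 1
    fi
  done
}
```

With the paths from this note, the call would be: owned_by postgres /var/lib/pgbackrest /var/lib/postgresql. Only the top-level directories are checked; files below them can still have a different owner.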
pgbackrest --stanza=main stanza-create
On success you should see:
INFO: stanza-create for stanza 'main' completed successfully
ls /var/lib/pgbackrest/archive/main/
At a minimum, the directory should contain:
archive.info
archive.info.copy
pgbackrest --stanza=main check
Normal output looks like:
INFO: check command end: completed successfully
SELECT pg_switch_wal();
Then check the log for any remaining archive errors.
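Besides reading the log, PostgreSQL's pg_stat_archiver view records the time of the last successful and the last failed archive attempt. The sketch below compares the two; archiving_recovered is a hypothetical helper name, and it assumes psql can connect as the current user without further options.

```shell
# Hypothetical helper: succeeds if the most recent archive attempt recorded in
# pg_stat_archiver was a success. Returns 2 if psql itself fails.
# Assumption: psql connects with default settings (for example as postgres).
archiving_recovered() {
  ar_result=$(psql -X -At -c "SELECT last_archived_time IS NOT NULL AND (last_failed_time IS NULL OR last_archived_time > last_failed_time) FROM pg_stat_archiver") || return 2
  [ "$ar_result" = "t" ]
}
```

The comparison uses timestamps rather than failed_count because that counter is cumulative and stays above zero after the earlier failures, even once archiving works again. Give the archiver a moment after pg_switch_wal() before calling the helper.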
[global]
repo1-path=/var/lib/pgbackrest
[main]
pg1-path=/var/lib/postgresql
✔ The configuration is correct
archive_command = 'pgbackrest --stanza=main archive-push %p'
✔ The syntax is correct
📌 Recommended additions (essential in production):
archive_mode = on
wal_level = replica
You are running PostgreSQL in a podman container, so check the following in particular:
podman exec -it <pg_container> ls /var/lib/pgbackrest
podman exec -it <pg_container> ls /var/lib/postgresql
👉 stanza-create must be run inside the container where PostgreSQL runs
Recommended layout (example):
-v /data/pgdata:/var/lib/postgresql
-v /data/pgbackrest:/var/lib/pgbackrest
Otherwise:
- recreating the container (not merely restarting it) discards the repo
- the archived WAL cannot be recovered
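Whether a path inside the container is actually backed by a volume can be checked from the host. This sketch assumes that podman inspect exposes a Mounts list with a Destination field for each entry; container_mounts and is_mounted are hypothetical helper names, and postgres is the container name used in the quick fix above.

```shell
# Hypothetical helper: prints one mount destination per line for a container.
# Assumption: podman inspect provides .Mounts entries with a .Destination field.
container_mounts() {
  podman inspect --format '{{range .Mounts}}{{println .Destination}}{{end}}' "$1"
}

# Hypothetical helper: succeeds if the path is itself a mount destination.
# $1 = container name, $2 = absolute path inside the container.
is_mounted() {
  container_mounts "$1" | grep -Fxq "$2"
}
```

For example: is_mounted postgres /var/lib/pgbackrest || echo "repo is not on a volume". The match is exact, so a path that only sits below a mounted parent directory is reported as not mounted.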
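| Pitfall | Consequence |
|---|---|
| stanza-create never run | archive.info does not exist |
| archive_command enabled first | PostgreSQL reports archive failures right after startup |
| repo directory not persisted | archiving appears to succeed, but the data is lost |
| stanza name misspelled | repo cannot be found |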
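The misspelled-stanza case can be caught mechanically by comparing the name in archive_command with the section headers in pgbackrest.conf. The following is a simplified sketch: stanza_from_command and conf_has_stanza are hypothetical helpers, they only handle the --stanza=NAME form, and they only match a header written exactly as [NAME] on its own line.

```shell
# Hypothetical helper: prints the NAME from --stanza=NAME in a command string.
stanza_from_command() {
  printf '%s\n' "$1" | sed -n 's/.*--stanza=\([^ ]*\).*/\1/p'
}

# Hypothetical helper: succeeds if the config file contains a [NAME] header.
# $1 = path to pgbackrest.conf, $2 = stanza name.
conf_has_stanza() {
  grep -Fxq "[$2]" "$1"
}
```

For example, after stanza=$(stanza_from_command "pgbackrest --stanza=main archive-push %p"), the call conf_has_stanza /etc/pgbackrest.conf "$stanza" should succeed for the configuration shown in this note.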
pgBackRest archiving is not plug-and-play: stanza-create is the starting point for WAL archiving.