
Mercredi 12 février 2025 à 23:05
Suite de ma note 2025-02-12_1511.
J'ai passé 5h40 sur un POC de barman, mais je n'ai pas eu beaucoup plus de succès qu'avec pgBackRest. Décidément, ces outils ne m'aiment pas 😔.
Repository du POC : https://github.com/stephane-klein/poc-barman
La commande barman check streaming-server
retourne le message WAL archive: FAILED (please make sure WAL shipping is setup)
. Pour fixer cette erreur, je dois faire les manipulations suivantes que je trouve bizarre :
$ ./scripts/enter-in-pg1.sh
postgres=# select pg_switch_wal();
pg_switch_wal
---------------
0/206A330
(1 row)
et ensuite :
$ docker compose exec barman bash
root@5482aa5f8420:/# su barman
barman@5482aa5f8420:/$ barman cron
Starting WAL archiving for server streaming-server
barman@5482aa5f8420:/$ barman check streaming-server
Server streaming-server:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
PostgreSQL streaming: OK
wal_level: OK
replication slot: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
backup minimum size: OK (0 B)
wal maximum age: OK (no last_wal_maximum_age provided)
wal size: OK (0 B)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 0 backups, expected at least 0)
pg_basebackup: OK
pg_basebackup compatible: OK
pg_basebackup supports tablespaces mapping: OK
systemid coherence: OK (no system Id stored on disk)
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: OK
archiver errors: OK
Je ne sais pas pourquoi je dois lancer select pg_switch_wal();
.
J'ai pourtant configuré checkpoint_timeout='60s'
:
command: >
postgres
-c wal_level=replica
-c summarize_wal=on
-c checkpoint_timeout='60s'
-c max_wal_size='100MB'
Je pensais que ce paramètre effectuait la même action que pg_switch_wal();
mais je constate que non.
Aussi, je constate que je dois aussi lancer pg_switch_wal();
pour que la commande suivante se termine :
barman@5482aa5f8420:/$ barman backup streaming-server --wait
Starting backup using postgres method for server streaming-server in /var/lib/barman/streaming-server/base/20250212T221703
Backup start at LSN: 0/5000B40 (000000010000000000000005, 00000B40)
Starting backup copy via pg_basebackup for 20250212T221703
Copy done (time: 1 second)
Finalising the backup.
Backup size: 22.3 MiB
Backup end at LSN: 0/7000000 (000000010000000000000007, 00000000)
Backup completed (start time: 2025-02-12 22:17:03.190492, elapsed time: 1 second)
Waiting for the WAL file 000000010000000000000007 from server 'streaming-server'
Processing xlog segments from streaming for streaming-server
000000010000000000000005
Processing xlog segments from streaming for streaming-server
000000010000000000000006
Je ne comprends pas non plus pourquoi.
Journaux liées à cette note :
Journal du jeudi 13 février 2025 à 14:09
Suite de mes notes 2025-02-09_1705, 2025-02-12_1044, 2025-02-12_1511, 2025-02-12_1534 et 2025-02-12_2305 au sujet de barman pour sauvegarder des bases de données PostgreSQL
Je ne sais pas pourquoi je dois lancer
select pg_switch_wal();
.
J'ai découvert dans ce commentaire qu'il existe une commande nommée : barman switch-wal
.
Je pense avoir compris qu'avant d'exécuter barman backup…
il est nécessaire d'exécuter :
$ barman switch-wal
$ barman cron
$ barman check postgres1
Server postgres1:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
PostgreSQL streaming: OK
wal_level: OK
replication slot: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
backup minimum size: OK (0 B)
wal maximum age: OK (no last_wal_maximum_age provided)
wal size: OK (0 B)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 0 backups, expected at least 0)
pg_basebackup: OK
pg_basebackup compatible: OK
pg_basebackup supports tablespaces mapping: OK
systemid coherence: OK (no system Id stored on disk)
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: OK
archiver errors: OK
$ barman backup postgres1 --immediate-checkpoint
Starting backup using postgres method for server postgres1 in /var/lib/barman/postgres1/base/20250213T100353
Backup start at LSN: 0/4000000 (000000010000000000000004, 00000000)
Starting backup copy via pg_basebackup for 20250213T100353
Copy done (time: 1 second)
Finalising the backup.
This is the first backup for server postgres1
WAL segments preceding the current backup have been found:
000000010000000000000002 from server postgres1 has been removed
Backup size: 22.3 MiB
Backup end at LSN: 0/6000000 (000000010000000000000006, 00000000)
Backup completed (start time: 2025-02-13 10:03:53.072228, elapsed time: 1 second)
Processing xlog segments from streaming for postgres1
000000010000000000000003
000000010000000000000004
WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet all the required WAL files for the backup consistency.
This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
Hint: execute the backup command with '--wait'
total 4.0K
$ ls /var/lib/barman/postgres1/base/ -lha
total 8.0K
drwxr-xr-x 1 barman barman 60 Feb 13 10:00 .
drwxr-xr-x 1 barman barman 88 Feb 13 09:59 ..
drwxr-xr-x 1 barman barman 30 Feb 13 09:59 20250213T095917
$ barman list-backups postgres1
postgres1 20250213T103723 - F - Thu Feb 13 10:37:24 2025 - Size: 22.3 MiB - WAL Size: 0 B - WAITING_FOR_WALS
J'ai réussi dans le POC https://github.com/stephane-klein/poc-barman à dérouler toutes les étapes du backup complet jusqu'à la restauration d'une base de données.
Toutefois, pour le moment, je n'ai toujours pas réussi à restaurer un backup incrémental 🙁.
À cet endroit, j'ai l'erreur suivante :
$ docker compose up postgres2
postgres2-1 | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres2-1 |
postgres2-1 | 2025-02-13 13:20:07.594 UTC [1] LOG: starting PostgreSQL 17.2 (Debian 17.2-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
postgres2-1 | 2025-02-13 13:20:07.594 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
postgres2-1 | 2025-02-13 13:20:07.594 UTC [1] LOG: listening on IPv6 address "::", port 5432
postgres2-1 | 2025-02-13 13:20:07.596 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres2-1 | 2025-02-13 13:20:07.598 UTC [1] LOG: could not open directory "pg_tblspc": No such file or directory
postgres2-1 | 2025-02-13 13:20:07.600 UTC [29] LOG: database system was interrupted; last known up at 2025-02-13 13:20:03 UTC
postgres2-1 | 2025-02-13 13:20:07.643 UTC [29] LOG: could not open directory "pg_tblspc": No such file or directory
postgres2-1 | 2025-02-13 13:20:07.643 UTC [29] LOG: starting backup recovery with redo LSN 0/8000028, checkpoint LSN 0/8000080, on timeline ID 1
postgres2-1 | 2025-02-13 13:20:07.643 UTC [29] LOG: could not open directory "pg_tblspc": No such file or directory
postgres2-1 | 2025-02-13 13:20:07.649 UTC [29] FATAL: could not open directory "pg_tblspc": No such file or directory
postgres2-1 | 2025-02-13 13:20:07.651 UTC [1] LOG: startup process (PID 29) exited with exit code 1
postgres2-1 | 2025-02-13 13:20:07.651 UTC [1] LOG: aborting startup due to startup process failure
postgres2-1 | 2025-02-13 13:20:07.652 UTC [1] LOG: database system is shut down