Wednesday, February 12, 2025 at 23:05

Follow-up to my note 2025-02-12_1511.

I spent 5h40 on a barman POC, but I had little more success than with pgBackRest. These tools really don't like me 😔.

POC repository: https://github.com/stephane-klein/poc-barman

The command barman check streaming-server returns the message WAL archive: FAILED (please make sure WAL shipping is setup). To fix this error, I have to go through the following steps, which I find odd:

$ ./scripts/enter-in-pg1.sh
postgres=# select pg_switch_wal();
 pg_switch_wal
---------------
 0/206A330
(1 row)

and then:

$ docker compose exec barman bash
root@5482aa5f8420:/# su barman
barman@5482aa5f8420:/$ barman cron
Starting WAL archiving for server streaming-server
barman@5482aa5f8420:/$ barman check streaming-server
Server streaming-server:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        PostgreSQL streaming: OK
        wal_level: OK
        replication slot: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        backup minimum size: OK (0 B)
        wal maximum age: OK (no last_wal_maximum_age provided)
        wal size: OK (0 B)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        systemid coherence: OK (no system Id stored on disk)
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: OK
        archiver errors: OK
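
To spot a failing check such as WAL archive: FAILED from a script, the output of barman check can be filtered. This is only a minimal sketch: failed_checks is a hypothetical helper name (not a barman command), and it assumes the "name: STATUS" line format shown in the output above.

```shell
# Print the name of every check reported as FAILED.
# Reads `barman check` output on stdin, one "<name>: <status> ..." line per check.
failed_checks() {
  awk -F': ' '/: FAILED/ { sub(/^[ \t]+/, "", $1); print $1 }'
}

# Example fed with the failing line quoted in this note and two non-failing lines.
printf '%s\n' \
  'Server streaming-server:' \
  '        PostgreSQL: OK' \
  '        WAL archive: FAILED (please make sure WAL shipping is setup)' \
  | failed_checks
```

The example prints WAL archive. Against a live server, the equivalent would be barman check streaming-server | failed_checks.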

I don't know why I have to run select pg_switch_wal();.

Yet I did configure checkpoint_timeout='60s':

    command: >
      postgres
      -c wal_level=replica
      -c summarize_wal=on
      -c checkpoint_timeout='60s'
      -c max_wal_size='100MB'

I thought this parameter performed the same action as pg_switch_wal();, but I can see that it does not.

I also notice that I have to run pg_switch_wal(); for the following command to finish:

barman@5482aa5f8420:/$ barman backup streaming-server --wait
Starting backup using postgres method for server streaming-server in /var/lib/barman/streaming-server/base/20250212T221703
Backup start at LSN: 0/5000B40 (000000010000000000000005, 00000B40)
Starting backup copy via pg_basebackup for 20250212T221703
Copy done (time: 1 second)
Finalising the backup.
Backup size: 22.3 MiB
Backup end at LSN: 0/7000000 (000000010000000000000007, 00000000)
Backup completed (start time: 2025-02-12 22:17:03.190492, elapsed time: 1 second)
Waiting for the WAL file 000000010000000000000007 from server 'streaming-server'
Processing xlog segments from streaming for streaming-server
        000000010000000000000005
Processing xlog segments from streaming for streaming-server
        000000010000000000000006

I don't understand why either.
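
A possible explanation, which I have not confirmed in this POC: PostgreSQL names each WAL file after the timeline and the segment that contains a given LSN. The backup ends at LSN 0/7000000, which falls in 000000010000000000000007, the file barman is waiting for. That segment is probably only handed over as complete once the server switches to the next one, which is what pg_switch_wal() forces. checkpoint_timeout only bounds the time between checkpoints; a checkpoint does not close the current segment. The sketch below reproduces the LSN-to-file-name arithmetic. wal_segment_name is a hypothetical helper, and the 16 MiB segment size is an assumption (it is the PostgreSQL default).

```shell
# wal_segment_name TIMELINE LSN
# Print the name of the WAL file that contains LSN on the given timeline.
# Assumption: default wal_segment_size of 16 MiB.
wal_segment_name() {
  local timeline="$1" lsn="$2"
  local hi="${lsn%%/*}" lo="${lsn##*/}"   # an LSN is written as "high/low" in hex
  local segment_size=$((16 * 1024 * 1024))
  # File name = timeline, high 32 bits of the LSN, segment index within them.
  printf '%08X%08X%08X\n' "$timeline" "$((0x$hi))" "$((0x$lo / segment_size))"
}

wal_segment_name 1 0/5000B40   # backup start LSN from the output above
wal_segment_name 1 0/7000000   # backup end LSN from the output above
```

The two calls print 000000010000000000000005 and 000000010000000000000007, matching the file names barman shows next to the start and end LSN.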


Journals linked to this note:

Journal of Thursday, February 13, 2025 at 14:09 #postgresql, #backup

Follow-up to my notes 2025-02-09_1705, 2025-02-12_1044, 2025-02-12_1511, 2025-02-12_1534 and 2025-02-12_2305 about using barman to back up PostgreSQL databases.


I don't know why I have to run select pg_switch_wal();.

source

I discovered in this comment that there is a command named barman switch-wal.

I think I have understood that before running barman backup… it is necessary to run:

$ barman switch-wal postgres1
$ barman cron
$ barman check postgres1
Server postgres1:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        PostgreSQL streaming: OK
        wal_level: OK
        replication slot: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        backup minimum size: OK (0 B)
        wal maximum age: OK (no last_wal_maximum_age provided)
        wal size: OK (0 B)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        systemid coherence: OK (no system Id stored on disk)
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: OK
        archiver errors: OK

$ barman backup postgres1 --immediate-checkpoint
Starting backup using postgres method for server postgres1 in /var/lib/barman/postgres1/base/20250213T100353
Backup start at LSN: 0/4000000 (000000010000000000000004, 00000000)
Starting backup copy via pg_basebackup for 20250213T100353
Copy done (time: 1 second)
Finalising the backup.
This is the first backup for server postgres1
WAL segments preceding the current backup have been found:
        000000010000000000000002 from server postgres1 has been removed
Backup size: 22.3 MiB
Backup end at LSN: 0/6000000 (000000010000000000000006, 00000000)
Backup completed (start time: 2025-02-13 10:03:53.072228, elapsed time: 1 second)
Processing xlog segments from streaming for postgres1
        000000010000000000000003
        000000010000000000000004
WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet all the required WAL files for the backup consistency.
This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
Hint: execute the backup command with '--wait'
total 4.0K
$ ls /var/lib/barman/postgres1/base/ -lha
total 8.0K
drwxr-xr-x 1 barman barman 60 Feb 13 10:00 .
drwxr-xr-x 1 barman barman 88 Feb 13 09:59 ..
drwxr-xr-x 1 barman barman 30 Feb 13 09:59 20250213T095917

$ barman list-backups postgres1
postgres1 20250213T103723 - F - Thu Feb 13 10:37:24 2025 - Size: 22.3 MiB - WAL Size: 0 B - WAITING_FOR_WALS
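
To check from a script whether a backup is still stuck in WAITING_FOR_WALS, the list-backups output can be filtered. Again only a sketch: waiting_backups is a hypothetical helper name, and it assumes the line format shown above, with the backup ID in the second column and the status at the end of the line.

```shell
# Print the ID of every backup still reported as WAITING_FOR_WALS.
# Reads `barman list-backups` output on stdin.
waiting_backups() {
  awk '/ - WAITING_FOR_WALS[ \t]*$/ { print $2 }'
}

# Example fed with the line shown above.
printf '%s\n' \
  'postgres1 20250213T103723 - F - Thu Feb 13 10:37:24 2025 - Size: 22.3 MiB - WAL Size: 0 B - WAITING_FOR_WALS' \
  | waiting_backups
```

The example prints 20250213T103723. Against a live server, the equivalent would be barman list-backups postgres1 | waiting_backups; an empty result means no backup is waiting for WAL files.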

In the POC https://github.com/stephane-klein/poc-barman, I managed to walk through every step of a full backup, up to restoring a database.

However, for the moment, I still haven't managed to restore an incremental backup 🙁.

At this point, I get the following error:

$ docker compose up postgres2
postgres2-1  | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres2-1  |
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  starting PostgreSQL 17.2 (Debian 17.2-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  listening on IPv6 address "::", port 5432
postgres2-1  | 2025-02-13 13:20:07.596 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres2-1  | 2025-02-13 13:20:07.598 UTC [1] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.600 UTC [29] LOG:  database system was interrupted; last known up at 2025-02-13 13:20:03 UTC
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  starting backup recovery with redo LSN 0/8000028, checkpoint LSN 0/8000080, on timeline ID 1
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.649 UTC [29] FATAL:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.651 UTC [1] LOG:  startup process (PID 29) exited with exit code 1
postgres2-1  | 2025-02-13 13:20:07.651 UTC [1] LOG:  aborting startup due to startup process failure
postgres2-1  | 2025-02-13 13:20:07.652 UTC [1] LOG:  database system is shut down