Mercredi 12 février 2025 à 15:11

Je pense comprendre que pgBackRest ne permet pas d'utiliser des INET sockets pour communiquer avec PostgreSQL.

Toutefois, je me dis que je pourrais partager le volume PGDATA avec le sidecar pgBackRest pour lui donner accès à l'Unix Socket du Streaming Replication Protocol 🤔.

source

Je viens de me rappeler que pgBackRest a une seconde contrainte qui semble l'empêcher de fonctionner en Docker sidecar :

Backing up a running PostgreSQL cluster requires WAL archiving to be enabled. Note that at least one WAL segment will be created during the backup process even if no explicit writes are made to the cluster.

pg-primary:/etc/postgresql/15/demo/postgresql.conf ⇒ Configure archive settings :

archive_command = 'pgbackrest --stanza=demo archive-push %p'
archive_mode = on
max_wal_senders = 3
wal_level = replica 

source

Cela signifie que l'exécutable pgbackrest doit être installé dans l'image Docker PostgreSQL.

Cela me pose un problème parce que mon objectif est de pouvoir utiliser un système de sauvegarde en Docker sidecar sans avoir à utiliser une image Docker PostgreSQL modifiée.

Cette contrainte ne semble pas présente avec barman qui propose 3 méthodes de backup :

La méthode postgres utilise pg_basebackup et je pense qu'elle peut fonctionner en Docker sidecar.

#JaiDécidé d'explorer cette piste.


Journaux liées à cette note :

Journal du jeudi 13 février 2025 à 14:09 #postgresql, #backup

Suite de mes notes 2025-02-09_1705, 2025-02-12_1044, 2025-02-12_1511, 2025-02-12_1534 et 2025-02-12_2305 au sujet de barman pour sauvegarder des bases de données PostgreSQL


Je ne sais pas pourquoi je dois lancer select pg_switch_wal();.

source

J'ai découvert dans ce commentaire qu'il existe une commande nommée : barman switch-wal.

Je pense avoir compris qu'avant d'exécuter barman backup… il est nécessaire d'exécuter :

$ barman switch-wal
$ barman cron
$ barman check postgres1
Server postgres1:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        PostgreSQL streaming: OK
        wal_level: OK
        replication slot: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        backup minimum size: OK (0 B)
        wal maximum age: OK (no last_wal_maximum_age provided)
        wal size: OK (0 B)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        systemid coherence: OK (no system Id stored on disk)
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: OK
        archiver errors: OK

$ barman backup postgres1 --immediate-checkpoint
Starting backup using postgres method for server postgres1 in /var/lib/barman/postgres1/base/20250213T100353
Backup start at LSN: 0/4000000 (000000010000000000000004, 00000000)
Starting backup copy via pg_basebackup for 20250213T100353
Copy done (time: 1 second)
Finalising the backup.
This is the first backup for server postgres1
WAL segments preceding the current backup have been found:
        000000010000000000000002 from server postgres1 has been removed
Backup size: 22.3 MiB
Backup end at LSN: 0/6000000 (000000010000000000000006, 00000000)
Backup completed (start time: 2025-02-13 10:03:53.072228, elapsed time: 1 second)
Processing xlog segments from streaming for postgres1
        000000010000000000000003
        000000010000000000000004
WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet all the required WAL files for the backup consistency.
This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
Hint: execute the backup command with '--wait'
total 4.0K
$ ls /var/lib/barman/postgres1/base/ -lha
total 8.0K
drwxr-xr-x 1 barman barman 60 Feb 13 10:00 .
drwxr-xr-x 1 barman barman 88 Feb 13 09:59 ..
drwxr-xr-x 1 barman barman 30 Feb 13 09:59 20250213T095917

$ barman list-backups postgres1
postgres1 20250213T103723 - F - Thu Feb 13 10:37:24 2025 - Size: 22.3 MiB - WAL Size: 0 B - WAITING_FOR_WALS

J'ai réussi dans le POC https://github.com/stephane-klein/poc-barman à dérouler toutes les étapes du backup complet jusqu'à la restauration d'une base de données.

Toutefois, pour le moment, je n'ai toujours pas réussi à restaurer un backup incrémental 🙁.

À cet endroit, j'ai l'erreur suivante :

$ docker compose up postgres2
postgres2-1  | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres2-1  |
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  starting PostgreSQL 17.2 (Debian 17.2-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  listening on IPv6 address "::", port 5432
postgres2-1  | 2025-02-13 13:20:07.596 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres2-1  | 2025-02-13 13:20:07.598 UTC [1] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.600 UTC [29] LOG:  database system was interrupted; last known up at 2025-02-13 13:20:03 UTC
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  starting backup recovery with redo LSN 0/8000028, checkpoint LSN 0/8000080, on timeline ID 1
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.649 UTC [29] FATAL:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.651 UTC [1] LOG:  startup process (PID 29) exited with exit code 1
postgres2-1  | 2025-02-13 13:20:07.651 UTC [1] LOG:  aborting startup due to startup process failure
postgres2-1  | 2025-02-13 13:20:07.652 UTC [1] LOG:  database system is shut down

Journal du mercredi 12 février 2025 à 23:05 #backup, #postgresql

Suite de ma note 2025-02-12_1511.

J'ai passé 5h40 sur un POC de barman, mais je n'ai pas eu beaucoup plus de succès qu'avec pgBackRest. Décidément, ces outils ne m'aiment pas 😔.

Repository du POC : https://github.com/stephane-klein/poc-barman

La commande barman check streaming-server retourne le message WAL archive: FAILED (please make sure WAL shipping is setup). Pour fixer cette erreur, je dois faire les manipulations suivantes que je trouve bizarre :

$ ./scripts/enter-in-pg1.sh
postgres=# select pg_switch_wal();
 pg_switch_wal
---------------
 0/206A330
(1 row)

et ensuite :

$ docker compose exec barman bash
root@5482aa5f8420:/# su barman
barman@5482aa5f8420:/$ barman cron
Starting WAL archiving for server streaming-server
barman@5482aa5f8420:/$ barman check streaming-server
Server streaming-server:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        PostgreSQL streaming: OK
        wal_level: OK
        replication slot: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        backup minimum size: OK (0 B)
        wal maximum age: OK (no last_wal_maximum_age provided)
        wal size: OK (0 B)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        systemid coherence: OK (no system Id stored on disk)
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: OK
        archiver errors: OK

Je ne sais pas pourquoi je dois lancer select pg_switch_wal();.

J'ai pourtant configuré checkpoint_timeout='60s' :

    command: >
      postgres
      -c wal_level=replica
      -c summarize_wal=on
      -c checkpoint_timeout='60s'
      -c max_wal_size='100MB'

Je pensais que ce paramètre effectuait la même action que pg_switch_wal(); mais je constate que non.

Aussi, je constate que je dois aussi lancer pg_switch_wal(); pour que la commande suivante se termine :

barman@5482aa5f8420:/$ barman backup streaming-server --wait
Starting backup using postgres method for server streaming-server in /var/lib/barman/streaming-server/base/20250212T221703
Backup start at LSN: 0/5000B40 (000000010000000000000005, 00000B40)
Starting backup copy via pg_basebackup for 20250212T221703
Copy done (time: 1 second)
Finalising the backup.
Backup size: 22.3 MiB
Backup end at LSN: 0/7000000 (000000010000000000000007, 00000000)
Backup completed (start time: 2025-02-12 22:17:03.190492, elapsed time: 1 second)
Waiting for the WAL file 000000010000000000000007 from server 'streaming-server'
Processing xlog segments from streaming for streaming-server
        000000010000000000000005
Processing xlog segments from streaming for streaming-server
        000000010000000000000006

Je ne comprends pas non plus pourquoi.