Jeudi 13 février 2025 à 14:09

Suite de mes notes 2025-02-09_1705, 2025-02-12_1044, 2025-02-12_1511, 2025-02-12_1534 et 2025-02-12_2305 au sujet de barman pour sauvegarder des bases de données PostgreSQL

Je ne sais pas pourquoi je dois lancer select pg_switch_wal();.

source

J'ai découvert dans ce commentaire qu'il existe une commande nommée : barman switch-wal.

Je pense avoir compris qu'avant d'exécuter barman backup… il est nécessaire d'exécuter :

$ barman switch-wal
$ barman cron
$ barman check postgres1
Server postgres1:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        PostgreSQL streaming: OK
        wal_level: OK
        replication slot: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        backup minimum size: OK (0 B)
        wal maximum age: OK (no last_wal_maximum_age provided)
        wal size: OK (0 B)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        systemid coherence: OK (no system Id stored on disk)
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: OK
        archiver errors: OK

$ barman backup postgres1 --immediate-checkpoint
Starting backup using postgres method for server postgres1 in /var/lib/barman/postgres1/base/20250213T100353
Backup start at LSN: 0/4000000 (000000010000000000000004, 00000000)
Starting backup copy via pg_basebackup for 20250213T100353
Copy done (time: 1 second)
Finalising the backup.
This is the first backup for server postgres1
WAL segments preceding the current backup have been found:
        000000010000000000000002 from server postgres1 has been removed
Backup size: 22.3 MiB
Backup end at LSN: 0/6000000 (000000010000000000000006, 00000000)
Backup completed (start time: 2025-02-13 10:03:53.072228, elapsed time: 1 second)
Processing xlog segments from streaming for postgres1
        000000010000000000000003
        000000010000000000000004
WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet all the required WAL files for the backup consistency.
This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
Hint: execute the backup command with '--wait'
total 4.0K
$ ls /var/lib/barman/postgres1/base/ -lha
total 8.0K
drwxr-xr-x 1 barman barman 60 Feb 13 10:00 .
drwxr-xr-x 1 barman barman 88 Feb 13 09:59 ..
drwxr-xr-x 1 barman barman 30 Feb 13 09:59 20250213T095917

$ barman list-backups postgres1
postgres1 20250213T103723 - F - Thu Feb 13 10:37:24 2025 - Size: 22.3 MiB - WAL Size: 0 B - WAITING_FOR_WALS

J'ai réussi dans le POC https://github.com/stephane-klein/poc-barman à dérouler toutes les étapes du backup complet jusqu'à la restauration d'une base de données.

Toutefois, pour le moment, je n'ai toujours pas réussi à restaurer un backup incrémental 🙁.

À cet endroit, j'ai l'erreur suivante :

$ docker compose up postgres2
postgres2-1  | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres2-1  |
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  starting PostgreSQL 17.2 (Debian 17.2-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
postgres2-1  | 2025-02-13 13:20:07.594 UTC [1] LOG:  listening on IPv6 address "::", port 5432
postgres2-1  | 2025-02-13 13:20:07.596 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres2-1  | 2025-02-13 13:20:07.598 UTC [1] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.600 UTC [29] LOG:  database system was interrupted; last known up at 2025-02-13 13:20:03 UTC
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  starting backup recovery with redo LSN 0/8000028, checkpoint LSN 0/8000080, on timeline ID 1
postgres2-1  | 2025-02-13 13:20:07.643 UTC [29] LOG:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.649 UTC [29] FATAL:  could not open directory "pg_tblspc": No such file or directory
postgres2-1  | 2025-02-13 13:20:07.651 UTC [1] LOG:  startup process (PID 29) exited with exit code 1
postgres2-1  | 2025-02-13 13:20:07.651 UTC [1] LOG:  aborting startup due to startup process failure
postgres2-1  | 2025-02-13 13:20:07.652 UTC [1] LOG:  database system is shut down

Journaux liées à cette note :

Je me relance sur mes sujets de backup de PostgreSQL.

Au mois de février dernier, j'ai initié le « Projet 23 - "Ajouter le support pg_basebackup incremental à restic-pg_dump-docker" ».

J'ai ensuite publié les notes suivantes à ce sujet :

À ce jour, je n'ai pas fini mes POC suivants :

poc-pg_basebackup_incremental est la seule méthode que j'ai réussi à faire fonctionner totalement.

#JaimeraisUnJour terminer ces POC.

Aujourd'hui, je m'interroge sur les motivations qui m'ont conduit en 2020 à intégrer restic dans mon projet restic-pg_dump-docker. Avec le recul, l'utilisation de cet outil pour la simple sauvegarde d'archives pg_dump me semble désormais moins évidente qu'à l'époque.

J'ai fait ce choix peut-être pour bénéficier directement du support des fonctionnalités suivantes :

Uploader vers différents Object Storage : S3-compatible Storage
Le système de rétention : Removing snapshots according to a policy
Le chiffrement : Encryption
Et naïvement, je pensais peut-être pouvoir utiliser le système de déduplication des données : Backups and Deduplication

Après réflexion, je pense que pour la sauvegarde d'archives pg_dump, les fonctionnalités de déduplication et de sauvegarde incrémentale offertes par restic génèrent en réalité une surconsommation d'espace disque et de ressources CPU sans apporter aucun bénéfice.

J'ai ensuite effectué quelques recherches pour savoir s'il existait un système de sauvegarde PostgreSQL basé sur pg_dump et un système d'upload vers Object Storage et #JaiDécouvert pg_back (https://github.com/orgrim/pg_back/).

En 2020, quand j'ai créé restic-pg_dump-docker, je pense que je n'avais pas retenu pg_back car celui-ci était minimaliste et ne supportait pas encore l'upload vers de l'Object Storage.

En 2025, pg_back supporte toutes les fonctionnalités dont j'ai besoin :

pg_back is a dump tool for PostgreSQL. The goal is to dump all or some databases with globals at once in the format you want, because a simple call to pg_dumpall only dumps databases in the plain SQL format.

Behind the scene, pg_back uses pg_dumpall to dump roles and tablespaces definitions, pg_dump to dump all or each selected database to a separate file in the custom format. ...

Features

...

Choose the format of the dump for each database

...

Dump databases concurrently

...

Purge based on age and number of dumps to keep

Dump from a hot standby by pausing replication replay

Encrypt and decrypt dumps and other files

Upload and download dumps to S3, GCS, Azure, B2 or a remote host with SFTP

source

Je souhaite :

Créer et publier un playground pour tester pg_back
Si le résultat est positif, alors je souhaite ajouter une note en introduction de restic-pg_dump-docker pour inviter à ne pas utiliser ce projet et renvoyer les lecteurs vers le projet pg_back.