Talk:Backup-Script for Pubman

Using logical volume manager (LVM) snapshots for consistent backups seems like a good idea. But if the LV snapshot is what ensures consistency, why shut everything down first? If you shut down all the services involved, and in particular the Postgres server, you might as well copy the files directly off the disk, without the overhead of the snapshot.

Generally though, shutting down services for backup or maintenance doesn't meet 21st-century expectations of 24/7 service and ought not to be necessary. If I read the lvcreate documentation correctly, the snapshot method should be safe as long as buffers and transactions are continuously written to disk and not just held in memory. The Postgres manual says:

An alternative file-system backup approach is to make a "consistent snapshot" of the data directory, if the file system supports that functionality (and you are willing to trust that it is implemented correctly). The typical procedure is to make a "frozen snapshot" of the volume containing the database, then copy the whole data directory (not just parts, see above) from the snapshot to a backup device, then release the frozen snapshot. This will work even while the database server is running. However, a backup created in this way saves the database files in a state as if the database server was not properly shut down; therefore, when you start the database server on the backed-up data, it will think the previous server instance crashed and will replay the WAL log. This is not a problem; just be aware of it (and be sure to include the WAL files in your backup). You can perform a CHECKPOINT before taking the snapshot to reduce recovery time. (from http://developer.postgresql.org/pgdocs/postgres/backup-file.html).

No support for any particular snapshot technology is mentioned - and take note of the disclaimer "and you are willing to trust that it is implemented correctly"! It would seem worth testing whether lvcreate snapshots work reliably enough with Postgres, and building a script around that, with no service shutdown. Even better if the Postgres people could directly express support for the lvcreate method...
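To make the idea concrete, here is a minimal sketch of what such a script could look like, following the order of steps in the manual excerpt above (CHECKPOINT, frozen snapshot, copy, release). It is untested against a real Pubman installation and rests on several assumptions: the volume group, logical volume, snapshot name, snapshot size, mount point and destination are all hypothetical placeholders, and the whole Postgres data directory including the WAL files is assumed to live on one logical volume. By default it only prints the commands instead of running them.

```python
import shlex
import subprocess


def snapshot_backup_commands(vg, lv, dest, snap="pgsnap", size="1G",
                             mnt="/mnt/pgsnap"):
    """Return the commands for one snapshot backup run, in execution order.

    All names and paths are illustrative placeholders, not values taken
    from an actual Pubman setup.
    """
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Optional: checkpoint first to shorten WAL replay after a restore.
        ["psql", "-U", "postgres", "-c", "CHECKPOINT"],
        # Create the frozen snapshot of the volume holding the data directory.
        ["lvcreate", "--snapshot", "--size", size, "--name", snap, origin],
        # Mount the snapshot read-only and copy the whole data directory.
        ["mount", "-o", "ro", snap_dev, mnt],
        ["rsync", "-a", f"{mnt}/", dest],
        # Release the snapshot again.
        ["umount", mnt],
        ["lvremove", "-f", snap_dev],
    ]


def run_backup(commands, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on first error."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_backup(snapshot_backup_commands("vg0", "pgdata", "/backup/pgdata/"))
```

Note that this sketch stops at the first failing command, so a failed copy would leave the snapshot mounted; a real script would need cleanup handling, and the snapshot size must be large enough to absorb all writes made while the copy runs. Some file systems may also need extra mount options for a snapshot.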

The Postgres documentation mentions two alternatives. The first is pg_dump, a logical backup that could possibly be combined with a file system snapshot. I would not recommend it as a backup method, because the recovery procedure is difficult and time-consuming. The second is the Postgres archiving method, which seems to be the standard Postgres approach for 24/7 production. I have not tried it, but I assume it would work when combined with some sort of file system synchronisation, possibly snapshots. To me, the larger issue remains how to synchronise the entire range of services involved in Pubman without a total service shutdown.

Do virtual server snapshots perhaps offer different possibilities for the backup/recovery scenarios?