July 22, 2016

Backing up tablespaces and streaming WAL with PGHoard

We’ve just released a new version of PGHoard, the PostgreSQL cloud backup tool we initially developed for Aiven and later open sourced.

Version 1.4.0 comes with the following new features:

  • Support for PostgreSQL 9.6 beta3
  • Support for backing up multiple tablespaces
  • Support for StatsD and Datadog metrics collection
  • Basebackup restoration now shows download progress
  • Experimental new WAL streaming mode walreceiver, which reads the write-ahead log data directly from the PostgreSQL server using the streaming replication protocol
  • New status API in the internal REST HTTP server

Please see our previous blog post about PGHoard for more information about the tool and a guide for deploying it.

Backing up multiple tablespaces

This is the first version of PGHoard capable of backing up multiple tablespaces. Doing so requires the new local-tar backup option, which reads files directly from disk instead of streaming them with pg_basebackup, because pg_basebackup doesn’t currently allow streaming multiple tablespaces without writing them to the local filesystem.

The current version of PGHoard can use the local-tar backup mode only on a PostgreSQL master server, because PostgreSQL versions prior to 9.6 don’t allow users to run the necessary control commands on a standby server without the pgespresso extension. pgespresso also needed fixes to support multiple tablespaces, which we contributed; once a fixed version has been released, we’ll add support for it to PGHoard.
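As a rough illustration of selecting the local-tar mode, the sketch below builds a PGHoard-style JSON configuration in Python. The key names (backup_sites, basebackup_mode, pg_data_directory, object_storage and so on) are assumptions based on the PGHoard README rather than anything stated in this post, and the site name, paths and credentials are made up, so check them against the documentation for your version before use.

```python
import json

# All key names below are assumptions taken from the PGHoard README, and every
# value (site name, paths, credentials) is a hypothetical placeholder.
config = {
    "backup_location": "/var/lib/pghoard",
    "backup_sites": {
        "example-site": {
            # local-tar reads files from disk, so it must run on the master host
            "nodes": [
                {"host": "127.0.0.1", "port": 5432,
                 "user": "pghoard", "password": "secret"},
            ],
            "basebackup_mode": "local-tar",
            "pg_data_directory": "/var/lib/pgsql/9.6/data",
            "object_storage": {
                "storage_type": "local",
                "directory": "/mnt/pghoard-backups",
            },
        },
    },
}


def render_config(cfg):
    """Serialize the configuration dictionary as pretty-printed JSON text."""
    return json.dumps(cfg, indent=4, sort_keys=True)


if __name__ == "__main__":
    print(render_config(config))
```

The printed JSON could be saved to a file and passed to PGHoard as its configuration, assuming the key names match the installed version.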

The next version of PGHoard, due out by the time of the PostgreSQL 9.6 final release, will support local-tar backups from standby servers: natively when running 9.6, and through the latest version of the pgespresso extension when running older versions.

A future version of PGHoard will support backing up and restoring PostgreSQL basebackups in parallel when using the local-tar mode. This will greatly reduce the time required for setting up a new standby server or restoring a system from backups.

Streaming replication support

This version adds experimental support for reading PostgreSQL’s write-ahead log directly from the server using the streaming replication protocol, which is also used by PostgreSQL’s native replication and related tools such as pg_basebackup and pg_receivexlog. The functionality currently depends on an unmerged psycopg2 pull request, which we hope to see land in a psycopg2 release soon.

While the walreceiver mode is still experimental, it has a number of benefits over other methods of backing up the WAL and allows implementing new features in the future. The temporary, uncompressed files written by pg_receivexlog are no longer needed, which saves disk space and I/O, and incomplete WAL segments can be archived at specified intervals or, for example, whenever a new COMMIT appears in the WAL stream.
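To make the idea of archiving incomplete segments without temporary files more concrete, here is a minimal sketch of how streamed WAL bytes could be grouped by segment and compressed in memory. It is not PGHoard's actual code: SegmentCompressor is a hypothetical helper, zlib stands in for whatever compression a real tool would use, and the default 16 MiB WAL segment size is assumed.

```python
import zlib

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # PostgreSQL's default WAL segment size
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE  # 256 with 16 MiB segments


def parse_lsn(lsn_text):
    """Convert an LSN such as '0/3000000' to an integer byte position."""
    high, low = lsn_text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segment_name(timeline, lsn):
    """Return the name of the WAL segment file containing byte position lsn."""
    segment_number = lsn // WAL_SEGMENT_SIZE
    return "{:08X}{:08X}{:08X}".format(
        timeline,
        segment_number // SEGMENTS_PER_XLOGID,
        segment_number % SEGMENTS_PER_XLOGID,
    )


class SegmentCompressor:
    """Hypothetical helper: compresses streamed WAL bytes per segment in memory."""

    def __init__(self, timeline):
        self.timeline = timeline
        self.segments = {}  # segment name -> (compress object, compressed chunks)

    def feed(self, start_lsn, payload):
        """Add payload, which starts at start_lsn, splitting at segment boundaries."""
        while payload:
            room = WAL_SEGMENT_SIZE - (start_lsn % WAL_SEGMENT_SIZE)
            piece, payload = payload[:room], payload[room:]
            name = wal_segment_name(self.timeline, start_lsn)
            if name not in self.segments:
                self.segments[name] = (zlib.compressobj(), [])
            compressor, chunks = self.segments[name]
            chunks.append(compressor.compress(piece))
            start_lsn += len(piece)

    def snapshot(self, name):
        """Return the compressed bytes received so far for a segment."""
        compressor, chunks = self.segments[name]
        # Flushing a copy yields a complete stream without ending the live one.
        return b"".join(chunks) + compressor.copy().flush()
```

Calling snapshot() on a timer, or whenever the stream reaches a point of interest, would produce an archivable copy of a partially filled segment while the compressor keeps accepting further data for it.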

New contributors

The following people contributed their first patches to PGHoard in this release:

  • Brad Durrow
  • Tarvi Pillessaar

PGHoard in Aiven.io

We’re happy to talk more about PGHoard and help you set up your backups with it.  You can also sign up for a free trial of our aiven.io PostgreSQL service where PGHoard will take care of your backups.