Aiven Blog

Jul 22, 2016

Backing up tablespaces and streaming WAL with PGHoard

Our open-source PGHoard’s latest version is now available. Find out what new features were added and how to get the most out of it.

oskari-saarenmaa

Oskari Saarenmaa

|RSS Feed

Chief Executive Officer at Aiven

We've just released a new version of PGHoard, the PostgreSQL cloud backup tool we initially developed for Aiven and later open sourced.

Version 1.4.0 comes with the following new features:

  • Support for PostgreSQL 9.6 beta3
  • Support for backing up multiple tablespaces
  • Support for StatsD and Datadog metrics collection
  • Basebackup restoration now shows download progress
  • Experimental new WAL streaming mode walreceiver, which reads the write-ahead log data directly from the PostgreSQL server using the streaming replication protocol
  • New status API in the internal REST HTTP server

Please see our previous blog post about PGHoard for more information about the tool and a guide for deploying it.

Backing up multiple tablespaces

This is the first version of PGHoard capable of backing up multiple tablespaces. Multiple tablespaces require using the new local-tar backup option for reading files directly from the disk instead of streaming them using pg_basebackup as pg_basebackup doesn't currently allow streaming multiple tablespaces without writing them to the local filesystem.

The current version of PGHoard can utilize the local-tar backup mode only on a PG master server, PostgreSQL versions prior to 9.6 don't allow users to run the necessary control commands on a standby server without using the pgespresso extension. pgespresso also required fixes which we contributed to support multiple tablespaces - once a fixed version has been released we'll add support for it to PGHoard.

The next version of PGHoard, due out by the time of PostgreSQL 9.6 final release, will support local-tar backups from standby servers, natively when running 9.6 and using the pgespresso extension when running older versions with the latest version of the extension.

A future version of PGHoard will support backing up and restoring PostgreSQL basebackups in parallel mode when using the local-tar mode.  This will greatly reduce the time required for setting up a new standby server or restoring a system from backups.

Streaming replication support

This version adds experimental support for reading PostgreSQL's write-ahead log directly from the server using the streaming replication protocol which is also used by PostgreSQL's native replication and related tools such as pg_basebackup and pg_receivexlog. The functionality currently depends on an unmerged psycopg2 pull request which we hope to see land in a psycopg2 release soon.

While the walreceiver mode is still experimental it has a number of benefits over other methods of backing up the WAL and allows implementing new features in the future: temporary, uncompressed, files as written by pg_receivexlog are no longer needed saving disk space and I/O and incomplete WAL segments can be archived at specified intervals or, for example, whenever a new COMMIT appears in the WAL stream.

New contributors

The following people contributed their first patches to PGHoard in this release:

  • Brad Durrow
  • Tarvi Pillessaar

PGHoard in Aiven.io

We're happy to talk more about PGHoard and help you set up your backups with it.  You can also sign up for a free trial of our aiven.io PostgreSQL service where PGHoard will take care of your backups.


Related resources