Aiven Blog

Jan 13, 2022

Future PostgreSQL: Improvement to the Replication Protocol

Check out the latest improvement to PostgreSQL replication to which our OSPO team contributed to help solve reliability problems.

ronan-dunklau

Ronan Dunklau

|RSS Feed

Senior Software Engineer

Today's post is about a new PostgreSQL feature (coming in PostgreSQL 15)
developed by Aiven to solve a reliability problem we sometimes run into with
replication. The READ_REPLICATION_SLOT command makes a physical replication
client, particularly pg_receivewal, more reliable.

At Aiven's OSPO office, we dedicate our time to the open source products Aiven
offers or develops, and we were very happy to collaborate on this feature.

PostgreSQL replication surprises

This story starts with our own PgHoard, a
PITR
backup tool for PostgreSQL. PgHoard offers several methods to archive the WAL
(Write Ahead Log), including
pg_receivewal,
a small application shipping with PostgreSQL which connects to a PostgreSQL
cluster using the physical replication protocol to stream WAL as they are
produced, optionally keeping track of the position on the server using a
replication slot.

We noticed that we could occasionally lose some WAL when PgHoard is restarted
on another machine. This happens because pg_receivewal determines its start
position like this:

  • Look at what segments are present in the archive folder, and resume from the
    latest archived one.
  • If the archive folder is empty, start from the current server LSN position.

The problem is, in our case, since pg_receivewal was started on another
server, it didn't have access to the archive folder directly so it resumed from
the current LSN (Log Sequence Number) position.

You can observe this in action by launching pg_receivewal, stopping it after
some WAL have been archived, then relaunching it with its directory empty:

❯ sudo -u postgres pg_receivewal -D /tmp/wals -v -s pg_receivewal_slot pg_receivewal: starting log streaming at 47/E5000000 (timeline 1) pg_receivewal: finished segment at 47/E6000000 (timeline 1) pg_receivewal: received interrupt signal, exiting pg_receivewal: not renaming "0000000100000047000000E6.partial", segment is not complete pg_receivewal: received interrupt signal, exiting ❯ sudo rm /tmp/wals/* -rf # Wait for some WAL to be generated on the server, and relaunch pg_receivewal ❯ sudo -u postgres pg_receivewal -D /tmp/wals -v -s pg_receivewal_slot pg_receivewal: starting log streaming at 47/E9000000 (timeline 1)

We notice here that the segments 0000000100000047000000E7 and
0000000100000047000000E8 have not been archived at all, jumping from
47/E6000000 to 47/E9000000

What should happen here is resuming from the replication slot's restart_lsn, which
is kept on the server.

New feature: READ_REPLICATION_SLOT command

Prior to version 15 of PostgreSQL, the only way to know a replication slot's position
was to query the pg_replication_slots view, which means regular SQL queries need
to be issued. This is fine for logical replication, as logical replication
connection are bound to a database and are allowed to perform SQL queries. But in
the case of physical replication connections, we do not have this possibility

That meant that a physical replication client had no way to know the state of
it's replication slot, except by opening a separate, non-replication connection,
which is not something we can do with the pg_receivewal application.

The READ_REPLICATION_SLOT command aims to fulfill this gap, by allowing a
replication connection to read the current restart_lsn and timeline associated
to a replication slot:

❯ psql postgres -d "replication=1" psql (15devel) Type "help" for help. postgres=# READ_REPLICATION_SLOT slot1; slot_type | restart_lsn | restart_tli -----------+-------------+------------- physical | 0/1501000 | 1 (1 row)

With that new command, pg_receivewal can now determine its starting LSN
slightly differently:

  • Look at what segments are present in the archive folder, and resume from the
    latest archived one.
  • If the archive folder is empty and a replication slot is used,
    issue a READ_REPLICATION_SLOT command to read the state from the server.
  • Otherwise, start streaming from the server's current LSN.

This new command is currently only supported for physical replication slots,
and returns limited information but that could evolve in future if needed.
Already the feature will help with the "surprises" we encountered when using
pg_receivewal, and will also benefit other replication clients by making them
more reliable. It has been accepted into the project and will be available in
the version 15 release.

Next steps


Related resources