How to stand up multiregion PostgreSQL with Pulumi

To improve the performance of your PostgreSQL database, consider setting up a multiregion service that will allow you to isolate read-intensive workloads to replicas while keeping write workloads on the primary. To learn how, read this tutorial where Trevor Kennedy explains the process using Aiven and Pulumi.

17 February 2021
Trevor KennedySenior Solution Architect at Aiven

UPDATE: Be sure to catch this webinar on 21 March 2021!

This tutorial uses Python and Pulumi to stand up a multiregion managed PostgreSQL service on Aiven. I default to using GCP as the underlying cloud provider, but this can be easily changed.

You could even use multiple public clouds with N number read replicas by changing a couple of variables in Pulumi. Let’s begin!

Requirements

The following must be installed on your local computer before you can follow along with this tutorial:

  • Python 3.7+
  • Pulumi 2.16+
  • Register for accounts with Pulumi & Aiven

Time and cost estimate

You'll need no more than 15 minutes for this.

The costs to temporarily run this tutorial should be negligible or zero.

You need:

  • Free Pulumi trial account
  • Free credits for new Aiven users (less than $1/hr for existing customers)

Why multiregion PostgreSQL?

Using Aiven, you can accelerate application development on open source data infrastructure and quickly deploy to public clouds (including AWS, GCP, and Azure).

Database read replicas help improve the performance and scale of read-intensive workloads. These workloads can be isolated to the replicas, while write workloads can be directed to the primary. This is good for analytical workloads, such as interactive dashboards that can use a read replica as the data source for reporting. Please note that data is replicated asynchronously, so there could be a delay when copying it to the replica. You can also create a read replica in a different region from your primary server: this is called cross-region replication. It can be helpful for scenarios like disaster recovery. For example, in the event that the primary server fails (or becomes unreachable), one of the read replicas can be promoted to replace it.

Aiven makes it trivial to use multiple public cloud providers to mitigate the risk of downtime even further. On our platform, you will not pay any network charges for synchronizing data between clouds, which is an important concern for read replica scenarios that leverage multiple public cloud providers. This matters more than you might think, as these transit costs can add up quickly!

Why Pulumi?

Pulumi enables you to create, deploy, and manage infrastructure on any cloud using your favorite programming languages such as TypeScript, Go, C# or, in this case, Python. Simply, import the appropriate Pulumi libraries for your language and cloud to begin defining infrastructure as code.

Set up the environment

Your first step is to install Python. Follow the directions on this page to install Python on your local system.

Next, get the Aiven examples by cloning this repo to your local machine.

Create the Python virtual environment and install the required dependencies:

$ cd aiven-examples/pulumi/postgres_read_replicas
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

Set your Aiven API token using the Pulumi CLI:

$ pulumi config set aiven:apiToken XXXXXXXXXXXXXXXXXXXX --secret

Note that you can generate an API Token on your profile page.

Configure the project settings for Pulumi:

$ pulumi config set aiven_project sa-demo
$ pulumi config set plan_size startup-4
$ pulumi config set pg_version 12

Note that I am using a project named ‘sa-demo’ and I want PostgreSQL v12.x instances.

pulumi1

Set your preferred cloud regions and number of replicas, as well as instance names in the __main__.py file:

# Edit instance names, regions and # of read replicas as needed
primary = {'name': "pulumi-pg-master", 'region': "google-us-east1"}
replicas = [
    {'name': "pulumi-pg-replica1", 'region': "google-us-west1"},
    {'name': "pulumi-pg-replica2", 'region': "google-europe-west1"}
]

This will create a primary PG instance in the GCP us-east1 region, with a total of two read replicas (us-west1 and europe-west1).

*Supported regions/clouds for each service are listed on the pricing page. To create a 3rd read replica, add it to the replicas list. *

Use Pulumi to deploy the service with Aiven

Now we are ready to spin up our source on Aiven. Run the following in your terminal:

$ pulumi up

Select Yes to confirm the deployment. After about 5 minutes or so, you should see that all of the resources have been created. The URLs and username/password combinations listed are needed to access the services. The influxdb_uri shown here is also added as an environment variable in the Lambda function, so that the Python client can connect to InfluxDB.

Aiven console

To see the new services, log into the Aiven [console]=(https://console.aiven.io/) and verify that our three newly created resources are displayed:

pulumi2

Test

Now we can load some data. Issue the following commands to run and test:

export postgres_master_uri=$(pulumi stack output pulumi-pg-master_uri --show-secrets)
export postgres_replica_uri=$(pulumi stack output pulumi-pg-replica1_uri --show-secrets)
python3 run.py

The commands above do two things: First we export the connection strings for our primary database and the first read replica (which are automatically assigned at deployment time) to local environmental variables using the pulumi stack output command. Then we can run a Python script which will load 1000 fake ecommerce orders into the primary instance. Then it will read the row counts from both the primary and read replica, to demonstrate that both databases contain the same number of records.

Clean up

Let’s tidy up and tear down our resources to save on costs (we can always deploy this again later).

$ deactivate
$ pulumi destroy
$ pulumi stack rm <my-stack-name-here>

Python code details

Create Aiven infrastructure using Pulumi

All of the Pulumi specific code resides in main.py.The key to provisioning infrastructure on Aiven using Pulumi is to include the two Python libraries below:

import pulumi
import pulumi_aiven as aiven

Once you have done that, you can create a Postgres service by utilizing the Pg method with the associated parameters like this:

avn = aiven.Pg(instance_info['name'],
                project=conf.require('aiven_project'),
                cloud_name=instance_info['region'],
                service_name=instance_info['name'],
                plan=conf.require('plan_size'),
                pg_user_config=user_config_args,
                service_integrations=integrations
                )

I have chosen to leverage a Pulumi Config to preclude having to hardcode settings.

To create the read replicas, we have to create a service integration by specifying the read_replica integration type and providing the name of the primary database instance.

# Specify the read replica service integration
aiven.ServiceServiceIntegrationArgs(
        integration_type="read_replica",
        source_service_name=instance.service_name
    )

Generating test data using Python

Mock data is generated using the Faker package. The main.py program contains just two functions: one to insert some fake data into the master instance and the other to read the row count (on both the primary and a read replica). The generate_data routine also creates a table for us in the event that it does not exist. Check the file called schema.sql for details.

if __name__ == "__main__":
    generate_data("postgres_master_uri")
    get_row_count("postgres_master_uri")
get_row_count("postgres_replica_uri")

I connect to our instance using the popular Psycopg database adapter. This is a standard pattern for making database connections in Python. The only thing that I have to do differently is read the connection string from the environment variables that we previously set.

conn_string = os.getenv(service_uri)
conn = pg_connect(conn_string)
cursor = conn.cursor()
cursor.execute('select count(*) from fake_orders;')
count = cursor.fetchone()[0]
cursor.close()
print(f'Current row count in {service_uri}: {count}')

Here is the output displaying the row counts in both the primary and the first read replica:

pulumi3

Wrapping up

Pulumi is a great option to provision cloud resources using the language of your choice. And Aiven makes it trivial to deploy Postgres instances in your preferred cloud and region. When you put these together, you have a powerful combination.

This same approach can be utilized for any Aiven’s services, including managed Kafka and MySQL.

You are also invited to the webinar Multicloud OSS database deployments with Zero downtime - Pulumi and Aiven on 21 March 2021.

Not using Aiven services yet? Sign up now for your free trial at https://console.aiven.io/signup!

In the meantime, make sure you follow our changelog and blog RSS feeds or our LinkedIn and Twitter accounts to stay up-to-date with product and feature-related news.

tipsdatapostgresqltutorials

Start your free 30 day trial

Test the whole platform for 30 days with no ifs, ands, or buts.

Aiven logo

Let‘s connect

Apache Kafka, Apache Kafka Connect, Apache Kafka MirrorMaker 2, M3, M3 Aggregator, Apache Cassandra, Elasticsearch, PostgreSQL, MySQL, Redis, InfluxDB, Grafana are trademarks and property of their respective owners. All product and service names used in this website are for identification purposes only and do not imply endorsement.