Skip to main content

Sample dataset for PostgreSQL®: Pagila

Use an Aiven-provided sample database that you can import in your Aiven for PostgreSQL® service.

Pagila is a PostgreSQL® port of the Sakila Sample Database. The examples use one from devrimgunduz, version 2.1.0.

Sakila/Pagila is a database representing a DVD rental store, containing information about films, rental stores and rentals, where a customer rents a film from a store through its staff.

With all this relational information, Pagila is a perfect fit to play around with PostgreSQL and the SQL language.

Load Pagila to your Aiven for PostgreSQL service

Before exploring the Pagila database, follow the create a service to start a PostgreSQL instance.

  1. Download the pagila-data.sql from our GitHub repository.

    tip

    You may use the following command on your terminal:

    wget https://raw.githubusercontent.com/aiven/devportal/main/code/products/postgresql/pagila/pagila-data.sql
  2. Connect to the PostgreSQL instance using the following command. The SERVICE_URI value can be found in the Aiven Console dashboard.

    psql 'SERVICE_URI'
  3. Within the psql shell, create a database named pagila and connect to it with the command below:

    CREATE DATABASE pagila;
    \c pagila;
  4. Populate the database with the command below. This might take some time.

    \i pagila-data.sql;
  5. Once the command finishes, make sure to reconnect to the database to access the imported data:

    \c pagila;

Entity-relationship model diagram

The image below shows an overview of the Pagila database tables and views, generated by DBeaver.

For example, the film table has string columns like title and description. It also relates to the table language with the columns language_id and original_language_id. With that information, you know that you can join both tables to get the language of each film, or to list all films for a specific language.

A entity-relation model diagram for the Pagila databases, containing all the tables, fields and views.

Sample queries

Let's explore the dataset with a few queries. All the queries results were limited by the first 10 items.

List all the films by ordered by their length

select
film_id,
title,
length
from
film
order by
length desc;
| film_id | title              | length |
| ------- | ------------------ | ------ |
| 426 | HOME PITY | 185 |
| 690 | POND SEATTLE | 185 |
| 609 | MUSCLE BRIGHT | 185 |
| 991 | WORST BANGER | 185 |
| 182 | CONTROL ANTHEM | 185 |
| 141 | CHICAGO NORTH | 185 |
| 349 | GANGS PRIDE | 185 |
| 212 | DARN FORRESTER | 185 |
| 817 | SOLDIERS EVOLUTION | 185 |
| 872 | SWEET BROTHERHOOD | 185 |

List how many films there are in each film category

select
category.name,
count(category.name) category_count
from
category
left join film_category on
category.category_id = film_category.category_id
left join film on
film_category.film_id = film.film_id
group by
category.name
order by
category_count desc;
| name        | category_count |
| ----------- | -------------- |
| Sports | 74 |
| Foreign | 73 |
| Family | 69 |
| Documentary | 68 |
| Animation | 66 |
| Action | 64 |
| New | 63 |
| Drama | 62 |
| Sci-Fi | 61 |
| Games | 61 |

Show the actors and actresses ordered by how many movies they are featured in

select
actor.first_name,
actor.last_name,
count(actor.first_name) featured_count
from
actor
left join film_actor on
actor.actor_id = film_actor.actor_id
group by
actor.first_name,
actor.last_name
order by
featured_count desc;
| first_name | last_name | featured_count |
| ---------- | --------- | -------------- |
| SUSAN | DAVIS | 54 |
| GINA | DEGENERES | 42 |
| WALTER | TORN | 41 |
| MARY | KEITEL | 40 |
| MATTHEW | CARREY | 39 |
| SANDRA | KILMER | 37 |
| SCARLETT | DAMON | 36 |
| VIVIEN | BASINGER | 35 |
| VAL | BOLGER | 35 |
| GROUCHO | DUNST | 35 |

Get a list of all active customers, ordered by their first name

select
first_name,
last_name
from
customer
where
active = 1
order by first_name asc;
| first_name | last_name |
| ---------- | --------- |
| MARY | SMITH |
| PATRICIA | JOHNSON |
| LINDA | WILLIAMS |
| BARBARA | JONES |
| ELIZABETH | BROWN |
| JENNIFER | DAVIS |
| MARIA | MILLER |
| SUSAN | WILSON |
| MARGARET | MOORE |
| DOROTHY | TAYLOR |

See who rented most DVDs - and how many times

select
customer.first_name,
customer.last_name,
count(customer.first_name) rentals_count
from
customer
left join rental on
customer.customer_id = rental.customer_id
group by
customer.first_name,
customer.last_name
order by rentals_count desc;
| first_name | last_name | rentals_count |
| ---------- | --------- | ------------- |
| ELEANOR | HUNT | 46 |
| KARL | SEAL | 45 |
| CLARA | SHAW | 42 |
| MARCIA | DEAN | 42 |
| TAMMY | SANDERS | 41 |
| WESLEY | BULL | 40 |
| SUE | PETERS | 40 |
| MARION | SNYDER | 39 |
| RHONDA | KENNEDY | 39 |
| TIM | CARY | 39 |

Challenge

After playing around with the sample queries, can you use SQL statements to answer some of these questions?

  1. What is the total revenue of each rental store?

    See answer

    select
    store.store_id,
    sum(payment.amount) as "total revenue"
    from
    store
    left join inventory on
    inventory.store_id = store.store_id
    left join rental on
    rental.inventory_id = inventory.inventory_id
    left join payment on
    payment.rental_id = rental.rental_id
    where
    payment.amount is not null
    group by
    store.store_id
    order by
    sum(payment.amount) desc;
    | store_id | total revenue |
    | -------- | ------------- |
    | 2 | 33726.77 |
    | 1 | 33689.74 |
  2. Can you list the top 5 film genres by their gross revenue?

    See answer

    select
    category.name,
    film.title,
    sum(payment.amount) as "gross revenue"
    from
    film
    left join film_category on
    film_category.film_id = film.film_id
    left join category on
    film_category.category_id = category.category_id
    left join inventory on
    inventory.film_id = film.film_id
    left join rental on
    rental.inventory_id = inventory.inventory_id
    left join payment
    on payment.rental_id = rental.rental_id
    where
    payment.amount is not null
    group by
    category.name,
    film.title
    order by
    sum(payment.amount) desc
    limit 5;
    | name        | title             | gross revenue |
    | ----------- | ----------------- | ------------- |
    | Music | TELEGRAPH VOYAGE | 231.73 |
    | Documentary | WIFE TURN | 223.69 |
    | Comedy | ZORRO ARK | 214.69 |
    | Sci-Fi | GOODFELLAS SALUTE | 209.69 |
    | Sports | SATURDAY LAMBS | 204.72 |
  3. The film.description has the text type, allowing for full text search queries, what will you search for?

    See answer

    -- Select all descriptions with the words "documentary" and "robot"

    select
    film.title,
    film.description
    from
    film
    where
    to_tsvector(film.description) @@ to_tsquery('documentary & robot');
    | title            | description                                                                                                        |
    | ---------------- | ------------------------------------------------------------------------------------------------------------------ |
    | CASPER DRAGONFLY | A Intrepid Documentary of a Boat And a Crocodile who must Chase a Robot in The Sahara Desert |
    | CHAINSAW UPTOWN | A Beautiful Documentary of a Boy And a Robot who must Discover a Squirrel in Australia |
    | CONTROL ANTHEM | A Fateful Documentary of a Robot And a Student who must Battle a Cat in A Monastery |
    | CROSSING DIVORCE | A Beautiful Documentary of a Dog And a Robot who must Redeem a Womanizer in Berlin |
    | KANE EXORCIST | A Epic Documentary of a Composer And a Robot who must Overcome a Car in Berlin |
    | RUNNER MADIGAN | A Thoughtful Documentary of a Crocodile And a Robot who must Outrace a Womanizer in The Outback |
    | SOUTH WAIT | A Amazing Documentary of a Car And a Robot who must Escape a Lumberjack in An Abandoned Amusement Park |
    | SWEDEN SHINING | A Taut Documentary of a Car And a Robot who must Conquer a Boy in The Canadian Rockies |
    | VIRGIN DAISY | A Awe-Inspiring Documentary of a Robot And a Mad Scientist who must Reach a Database Administrator in A Shark Tank |

Clean up

To clean up the environment and destroy the database, run the following commands:

\c defaultdb;
DROP DATABASE pagila;

Source

The source code for the Pagila database is available from our repository.