DevOps

Restoring a Postgres database to AWS RDS using Docker

By 23 May 2017No Comments

In this post I look at using Docker to restore a Postgres dump file to a Postgres database running in the cloud on AWS RDS.

Keep it clean

One of the big selling points of docker, for me, is that I can have lots of apps and utils running in nice containers on my dev laptop, without having to install them locally.  This ensures my laptop stays nice and responsive and I don’t clutter/break my laptop with lots of weird dependencies and running processes that I’m then too scared to delete.

Postgres is a good example – I don’t want to install it locally, but I do need access to the command line tools like psql and pg_restore, to be able to work with my databases effectively.

One way of accessing these tools would be to ssh onto the AWS cloud instances, but there’s a bunch of reasons most pertinently security (not to mention the faff) why you’d want to avoid that every time you want to run some sql.  So let’s look at how we use Docker to ease the pain instead.

Start Me Up

With Docker installed you can build this simple Dockerfile to create a local Postgres container.  The User and Password env vars aren’t strictly required, however, if you want to actually connect to the containerised DB, it’s pretty handy

FROM postgres
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD password
view rawDockerfile hosted with ❤ by GitHub

You can build, run and connect to the container as follows (assumes you are on Mac)

mkdir data-load
vi Dockerfile # Copy Docker commands listed above into your local Dockerfile
docker build -t postgres-db .
docker run -d -v /Users/iainhunter/dev/docker/postgres/data-load:/data-loader -p 5432:5432 postgres-db
docker ps
docker exec -it <imageId> bash
view rawdocker-commands.sh hosted with ❤ by GitHub

Note line 4 where I map the data-load dir I created at line 1 to a new directory called data-loader inside my container.  This means that when I copy the Postgres dump file into my local data-load directory, it will be available to the postgres tools available in the container.

Line 6  allows me to connect to the container, swap the imageId  for your locally running containerID.

Restoring your database with pg_restore

I’ll assume you already have a Postgres database set up within the AWS cloud.  So now we have connected to our container, we can use pg_restore to use restore our dumpfile into AWS (note this command will prompt you for the admin password)

pg_restore –host=<youHost>.eu-west-1.rds.amazonaws.com –port=5432 –username=<yourAdminUser> –password –dbname=<yourDB> /data-loader/dumpfile.dmp
view rawpg_restore_to_aws.sql hosted with ❤ by GitHub

A note on schemas

If you’re doing a partial restore, you may want to restore your dumpfile to a separate schema.  Unfortunately there appears to be no way to do this from the command line.  What you have to do is to rename the public schema, create a new public schema and restore into that, then reverse the process.

This StackOverflow answer outlines the process.

Restore Complete

You should now have a complete restore of your dumpfile in the cloud.  Please add comments if anything is unclear.