This is the second blog post in a three-part series about building, testing, and deploying a Clojure web application. You can find the first post here and the third post here.
In this post, we will focus on how to add a production database (PostgreSQL, in this instance) to an application, how to package the application as a Docker image, and how to run the application and the database inside Docker. To follow along, I recommend going through the first post and following the steps to create the app. Alternatively, you can get the source by forking this repository and checking out the master branch. If you choose this method, you will also need to set up your CircleCI account as described in the first post.
Although we are building a Clojure application, not much Clojure knowledge is required to follow along with this part of the series.
Prerequisites
In order to build this web application you need to install the following:
- Java JDK 8 or greater - Clojure runs on the Java Virtual Machine and is, in fact, just a Java library (JAR). I built this using version 8, but a later version should work fine, too.
- Leiningen - Leiningen, usually referred to as lein (pronounced ‘line’) is the most commonly used Clojure build tool.
- Git - The ubiquitous distributed version control tool.
- Docker - A tool designed to make it easier to create, deploy, and run applications by using containers.
- Docker Compose - A tool for defining and running multi-container Docker applications.
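A quick way to confirm that the command-line tools are installed and on your path is to ask each one for its version (the exact output will vary by platform and version):
$ java -version
$ lein version
$ git --version
$ docker --version
$ docker-compose --version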
You will also need to sign up for:
- CircleCI account - CircleCI is a continuous integration and delivery platform.
- GitHub account - GitHub is a web-based hosting service for version control using Git.
- Docker Hub account - Docker Hub is a cloud-based repository in which Docker users and partners create, test, store and distribute container images.
Running a PostgreSQL database
In this section, we will walk through how to run a PostgreSQL database that we will connect to from the web application built in part one of this blog series.
We are going to be using Docker to ‘package’ our application for deployment. There have been many articles written on why and how to use Docker in more detail than I plan to discuss here. The reason I have decided to use it is to provide a level of isolation from the physical machine we are deploying to and, more importantly, to ensure consistent runtime behavior whether the application is running locally or in a remote environment.
For this reason, the end game will be to run the web application we built in the last blog and the PostgreSQL database in Docker and get the two to communicate.
The application currently uses SQLite when it runs in development mode. In part one of the blog series, we were only running in development mode, either by running the server from a REPL using lein repl or by running the unit tests using lein test. If we try to run the application in production mode by issuing lein run from our project directory, we will get an error because the production database connection is not specified.
$ lein run
Exception in thread "main" clojure.lang.ExceptionInfo: Error on key :duct.database.sql/hikaricp when building system {:reason :integrant.core/build-threw-exception ...
We are going to run the database inside a Docker container using the official postgres Docker image (alpine version). To do this we can issue the following command:
$ docker run -p 5432:5432 -e POSTGRES_USER=filmuser -e POSTGRES_DB=filmdb -e POSTGRES_PASSWORD=password postgres:alpine
...
2019-01-20 10:08:32.064 UTC [1] LOG: database system is ready to accept connections
This command runs the postgres Docker image (pulling it down from Docker Hub, if required) with the database listening on TCP port 5432, sets up a default user called filmuser, sets the password for that user to password, and creates an empty database called filmdb. If you already have PostgreSQL installed as a service on your machine, you may get a message about port 5432 being in use. If this happens, either stop the local PostgreSQL service or change the -p 5432:5432 entry to expose a different port, e.g., -p 5500:5432 for port 5500.
In order to check that you can connect to the database, issue the following command in a different terminal window:
$ psql -h localhost -p 5432 -U filmuser filmdb
Password for user filmuser:
psql (11.1 (Ubuntu 11.1-1.pgdg16.04+1))
Type "help" for help.
filmdb=#
Although you have now connected to the database, there's not a lot you can do with it at this point, as we have not created any tables, views, etc. (relations).
filmdb=# \d
Did not find any relations.
So let’s close the psql utility.
filmdb=# exit
Next, let’s leave the Docker container for postgres running and change our application so that it has a production configuration that can connect to the database.
Open up the resources/film_ratings/config.edn file in the film-ratings project directory. Then find the :duct.module/sql entry and add the following below it:
:duct.database.sql/hikaricp {:adapter "postgresql"
                             :port-number #duct/env [ "DB_PORT" :or "5432" ]
                             :server-name #duct/env [ "DB_HOST" ]
                             :database-name "filmdb"
                             :username "filmuser"
                             :password #duct/env [ "DB_PASSWORD" ]}
This entry defines the config for a Hikari connection pool using PostgreSQL. Note that we are picking up the server name and the password from the environment variables DB_HOST and DB_PASSWORD. We have also allowed for an optional DB_PORT environment variable that can be used to make the application connect on a different port than 5432, if needed.
You also need to add a dependency on the PostgreSQL database driver and the hikaricp library in the project.clj file, so that the dependencies section looks like this:
:dependencies [[org.clojure/clojure "1.9.0"]
               [duct/core "0.6.2"]
               [duct/module.logging "0.3.1"]
               [duct/module.web "0.6.4"]
               [duct/module.ataraxy "0.2.0"]
               [duct/module.sql "0.4.2"]
               [org.xerial/sqlite-jdbc "3.21.0.1"]
               [org.postgresql/postgresql "42.1.4"]
               [duct/database.sql.hikaricp "0.3.3"]
               [hiccup "1.0.5"]]
Also, we want the id column to be automatically assigned a unique number when we insert a new film, so we need to change the migration slightly: the id column type is no longer an integer (which worked for SQLite) but is of type serial in PostgreSQL. This means you need to change the ragtime migrator entry in resources/film_ratings/config.edn to:
[:duct.migrator.ragtime/sql :film-ratings.migrations/create-film]
{:up ["CREATE TABLE film (id SERIAL PRIMARY KEY, name TEXT UNIQUE, description TEXT, rating INTEGER)"]
:down ["DROP TABLE film"]}
In order to test this config, you first need to set the environment variables. So open a separate terminal window from the one running the Docker postgres instance and set the two environment variables like so:
$ export DB_HOST=localhost
$ export DB_PASSWORD=password
Note: If you had to change the port number that the postgres Docker instance is using, you will also need to set the DB_PORT environment variable to the same port number.
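For example, if you exposed the database on host port 5500 as above:
$ export DB_PORT=5500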
Once you have set these environment variables, you can run the application in the production profile like so (change to your project root directory first):
$ lein run
19-01-21 07:19:51 chris-XPS-13-9370 REPORT [duct.server.http.jetty:13] - :duct.server.http.jetty/starting-server {:port 3000}
As you can see from the output, our migration, defined in the first part of the blog, is not being run to create the film table. By default, migrations are not run by Duct in the production profile, but we will fix that later. In order to create the film table, we can run our migration manually by opening another terminal session and executing the following command (after setting the environment variables and changing directory to the project root):
$ lein run :duct/migrator
19-01-21 07:48:59 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["CREATE TABLE ragtime_migrations (id varchar(255), created_at varchar(32))"], :elapsed 4}
19-01-21 07:48:59 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["SELECT id FROM ragtime_migrations ORDER BY created_at"], :elapsed 6}
19-01-21 07:48:59 chris-XPS-13-9370 REPORT [duct.migrator.ragtime:14] - :duct.migrator.ragtime/applying :film-ratings.migrations/create-film#11693a5d
19-01-21 07:48:59 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["CREATE TABLE film (id SERIAL PRIMARY KEY, name TEXT UNIQUE, description TEXT, rating INTEGER)"], :elapsed 10}
19-01-21 07:48:59 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["INSERT INTO ragtime_migrations ( id, created_at ) VALUES ( ?, ? )" ":film-ratings.migrations/create-film#11693a5d" "2019-01-21T07:48:59.960"], :elapsed 4}
19-01-21 07:48:59 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:31] - :duct.database.sql/batch-query {:queries [], :elapsed 0}
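As an optional sanity check, you can connect with psql again and insert a row directly, confirming that PostgreSQL assigns the serial id column automatically. The film values here are purely illustrative (you can DELETE the row afterwards if you want to start clean):
filmdb=# INSERT INTO film (name, description, rating) VALUES ('Alien', 'Classic sci-fi horror', 5);
INSERT 0 1
filmdb=# SELECT id, name, rating FROM film;
 id | name  | rating
----+-------+--------
  1 | Alien |      5
(1 row)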
You can now try out the running application by opening a browser and pointing it at http://localhost:3000.
If you try adding a film, you will see an error message indicating that the column rating is of type integer, but the supplied value is of type character. This is because all the values in the add film form are strings, but the rating column of the film table expects an INTEGER. This didn't happen in development mode because the SQLite database driver coerced the string value for the rating to an integer for us, but the PostgreSQL driver doesn't. This illustrates that using a different database for production and development can be problematic!
For now, let’s just fix this bug in the handler. Edit the src/film_ratings/handler/film.clj
file to refactor the code that deals with the film form to a separate function that also deals with coercing the ratings
to an integer:
(defn- film-form->film
  [film-form]
  (as-> film-form film
    (dissoc film "__anti-forgery-token")
    (reduce-kv (fn [m k v] (assoc m (keyword k) v))
               {}
               film)
    (update film :rating #(Integer/parseInt %))))

(defmethod ig/init-key :film-ratings.handler.film/create [_ {:keys [db]}]
  (fn [{[_ film-form] :ataraxy/result :as request}]
    (let [film (film-form->film film-form)
          result (boundary.film/create-film db film)
          alerts (if (:id result)
                   {:messages ["Film added"]}
                   result)]
      [::response/ok (views.film/film-view film alerts)])))
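To see what the new helper does, here is an illustrative REPL interaction. The form values are made up, but the shape matches what the handler receives from the form post:
(film-form->film {"__anti-forgery-token" "abc123"
                  "name" "Alien"
                  "description" "Classic sci-fi horror"
                  "rating" "5"})
;; => {:name "Alien", :description "Classic sci-fi horror", :rating 5}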
To test this, stop the server running the application and restart it (CTRL-C to stop, lein run to restart). Then point your browser at http://localhost:3000 and try out the application.
Once you are happy that everything works, stop the server with CTRL-C and stop the Docker postgres instance by issuing CTRL-C in the terminal session running it.
Just to ensure we haven't broken anything, we can try to run the server in development mode, which should use the SQLite database instance.
$ lein repl
nREPL server started on port 38553 on host 127.0.0.1 - nrepl://127.0.0.1:38553
REPL-y 0.3.7, nREPL 0.2.12
Clojure 1.9.0
Java HotSpot(TM) 64-Bit Server VM 1.8.0_191-b12
Docs: (doc function-name-here)
(find-doc "part-of-name-here")
Source: (source function-name-here)
Javadoc: (javadoc java-object-or-class-here)
Exit: Control+D or (exit) or (quit)
Results: Stored in vars *1, *2, *3, an exception in *e
user=> (dev)
:loaded
dev=> (go)
Jan 20, 2019 12:16:20 PM org.postgresql.Driver connect
SEVERE: Connection error:
org.postgresql.util.PSQLException: Connection to localhost:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
Something is wrong with the way our config is applied. Instead of using the config in dev/resources/dev.edn here:
:duct.module/sql
{:database-url "jdbc:sqlite:db/dev.sqlite"}
We seem to be picking up at least some of the postgres config in resources/film_ratings/config.edn. What is actually happening is that the dev config and the production config maps are merged, so the postgres adapter config and the rest of the key-value pairs are added alongside the :database-url. We need to fix this, so let's add some code to the dev/src/dev.clj file to remove these production database attributes. Let's add a function to do this and invoke it from the call to set-prep!:
(defn remove-prod-database-attributes
  "The prepared config is a merge of dev and prod config, and the prod attributes for
  everything except :jdbc-url need to be dropped, or the sqlite db is
  configured with postgres attributes."
  [config]
  (update config :duct.database.sql/hikaricp
          (fn [db-config] (->> (find db-config :jdbc-url) (apply hash-map)))))

(integrant.repl/set-prep!
 (comp remove-prod-database-attributes duct/prep read-config))
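To make the effect concrete, here is an illustrative before-and-after on a merged config map (the map is abbreviated; only the :jdbc-url entry survives):
(remove-prod-database-attributes
 {:duct.database.sql/hikaricp {:jdbc-url "jdbc:sqlite:db/dev.sqlite" ; from dev.edn
                               :adapter "postgresql"                 ; unwanted prod keys
                               :database-name "filmdb"}})
;; => {:duct.database.sql/hikaricp {:jdbc-url "jdbc:sqlite:db/dev.sqlite"}}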
Now, if we run the server in development mode like so (use quit to exit the previously running REPL):
$ lein repl
nREPL server started on port 44721 on host 127.0.0.1 - nrepl://127.0.0.1:44721
REPL-y 0.3.7, nREPL 0.2.12
Clojure 1.9.0
Java HotSpot(TM) 64-Bit Server VM 1.8.0_191-b12
Docs: (doc function-name-here)
(find-doc "part-of-name-here")
Source: (source function-name-here)
Javadoc: (javadoc java-object-or-class-here)
Exit: Control+D or (exit) or (quit)
Results: Stored in vars *1, *2, *3, an exception in *e
user=> (dev)
:loaded
dev=> (go)
:duct.server.http.jetty/starting-server {:port 3000}
:initiated
dev=>
Note: if you’ve cloned my repo rather than following the previous blog, you may see an error like this: SQLException path to 'db/dev.sqlite': '.../blog-film-ratings/db' does not exist org.sqlite.core.CoreConnection.open (CoreConnection.java:192). In this case, manually create an empty ‘db’ directory below the project root directory using mkdir db and retry the REPL commands.
We can see that the server now starts successfully and is connecting to the SQLite database. At this point let’s commit our changes.
$ git add --all .
$ git commit -m "Add production database config, dependencies & fix dev db config"
$ git push
Running our application in Docker
Having learned how to run a PostgreSQL database in Docker and how to connect to it from our application, the next stage is to run our application in Docker, too.
We are going to add a Dockerfile to build our application in Docker and we are going to use Docker Compose to run both our application Docker instance and the postgres Docker instance together in their own network.
Before we create the Dockerfile, let’s examine how our web application will actually run inside the Docker container. Up to this point, we have been running our application either in development mode using the REPL or in production mode via lein.
Usually, a Clojure application is run in production in the same way a Java application would be. It’s most common to compile the application into .class files and then package these in an Uberjar, which is an archive file that contains all the .class files and all the dependent libraries (jar files) required by our application along with any resource files (e.g. our config files). This Uberjar would then be run using the JVM (Java Virtual Machine).
Let’s try out running our application in this way locally, before we have to do it inside Docker. Firstly, we can compile and package our application in an Uberjar by using the command:
$ lein uberjar
Compiling film-ratings.main
Compiling film-ratings.views.template
Compiling film-ratings.views.film
Compiling film-ratings.views.index
Compiling film-ratings.boundary.film
Compiling film-ratings.handler.film
Compiling film-ratings.handler.index
Created /home/chris/circleciblogs/repos/film-ratings/target/film-ratings-0.1.0-SNAPSHOT.jar
Created /home/chris/circleciblogs/repos/film-ratings/target/film-ratings.jar
This creates two JAR files. The one labelled as a snapshot contains just the application without the library dependencies; the one called film-ratings.jar is our Uberjar with all the dependencies. We can now run our application from this Uberjar. First, make sure that your Docker instance for postgres is running and that the DB_HOST and DB_PASSWORD environment variables are set in your terminal session, then issue this command:
$ java -jar target/film-ratings.jar
19-01-22 07:26:20 chris-XPS-13-9370 REPORT [duct.server.http.jetty:13] - :duct.server.http.jetty/starting-server {:port 3000}
What we need to do now is write a Dockerfile that will execute this same command. This means that we need the Docker instance to have a Java Runtime Environment (in this case we are actually using a Java Development Kit environment). Create the Dockerfile in the project base directory and add the following:
FROM openjdk:8u181-alpine3.8
WORKDIR /
COPY target/film-ratings.jar film-ratings.jar
EXPOSE 3000
CMD java -jar film-ratings.jar
You can now build the Docker instance like so:
$ docker build . -t film-ratings-app
Sending build context to Docker daemon 23.71MB
Step 1/5 : FROM openjdk:8u181-alpine3.8
---> 04060a9dfc39
Step 2/5 : WORKDIR /
---> Using cache
---> 2752489e606e
Step 3/5 : COPY target/film-ratings.jar film-ratings.jar
---> Using cache
---> b282e93eff39
Step 4/5 : EXPOSE 3000
---> Using cache
---> 15d2e1b9197e
Step 5/5 : CMD java -jar film-ratings.jar
---> Using cache
---> 2fe0b1e058e5
Successfully built 2fe0b1e058e5
Successfully tagged film-ratings-app:latest
This creates a new Docker image and tags it as film-ratings-app:latest. We can now run our dockerized application like so:
$ docker run --network host -e DB_HOST=localhost -e DB_PASSWORD=password film-ratings-app
19-01-22 09:12:20 chris-XPS-13-9370 REPORT [duct.server.http.jetty:13] - :duct.server.http.jetty/starting-server {:port 3000}
However, we still have the problem we experienced earlier in that the migrations have not run. You can demonstrate this if you open a browser to http://localhost:3000/list-films.
You will see an internal server error in the browser, and a huge stack trace in the terminal session running the app in Docker that ends with:
serverErrorMessage: #object[org.postgresql.util.ServerErrorMessage 0x5661cc86 "ERROR: relation \"film\" does not exist\n Position: 15"]
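You can confirm the cause from a psql session against the running container; the film relation was never created:
filmdb=# \d film
Did not find any relation named "film".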
To get around this, we are going to do something that’s not really recommended for a scalable production server and make the migrations run on application startup. A better approach would be to have a separate Docker instance in the production environment that is spun up on demand, probably by the CI pipeline, to run the migrations when something changes. For the purposes of this blog, let’s take the simpler approach and change our main function to invoke the migrator. Open and edit the src/film_ratings/main.clj file like so:
(defn -main [& args]
  (let [keys (or (duct/parse-keys args) [:duct/migrator :duct/daemon])]
    (-> (duct/read-config (io/resource "film_ratings/config.edn"))
        (duct/prep keys)
        (duct/exec keys))))
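This works because duct/parse-keys converts any command-line arguments into Integrant keys and, as the or implies, returns nil when no arguments are given. An illustrative REPL session, based on how -main uses it:
(duct/parse-keys [":duct/migrator"]) ;; => [:duct/migrator]
(duct/parse-keys [])                 ;; => nil, so we fall back to
                                     ;; [:duct/migrator :duct/daemon]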
This ensures the :duct/migrator Integrant key is invoked to run the migrations before the daemon starts the server. In order to get this change into our Docker image, we need to rerun lein uberjar and docker build . -t film-ratings-app. Then we can spin up our application in Docker:
$ docker run --network=host -e DB_HOST=localhost -e DB_PASSWORD=password film-ratings-app
19-01-22 18:08:23 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["CREATE TABLE ragtime_migrations (id varchar(255), created_at varchar(32))"], :elapsed 2}
19-01-22 18:08:23 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["SELECT id FROM ragtime_migrations ORDER BY created_at"], :elapsed 11}
19-01-22 18:08:23 chris-XPS-13-9370 REPORT [duct.migrator.ragtime:14] - :duct.migrator.ragtime/applying :film-ratings.migrations/create-film#11693a5d
19-01-22 18:08:23 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["CREATE TABLE film (id SERIAL PRIMARY KEY, name TEXT UNIQUE, description TEXT, rating INTEGER)"], :elapsed 13}
19-01-22 18:08:23 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["INSERT INTO ragtime_migrations ( id, created_at ) VALUES ( ?, ? )" ":film-ratings.migrations/create-film#11693a5d" "2019-01-22T18:08:23.146"], :elapsed 3}
19-01-22 18:08:23 chris-XPS-13-9370 INFO [duct.database.sql.hikaricp:31] - :duct.database.sql/batch-query {:queries [], :elapsed 0}
19-01-22 18:08:23 chris-XPS-13-9370 REPORT [duct.server.http.jetty:13] - :duct.server.http.jetty/starting-server {:port 3000}
This time, we can see the migrations running to add the film table before the server starts. Now, if we open a browser pointing at http://localhost:3000/list-films, we see the “No films found.” message. You can try out the application by adding some films.
Let’s commit these changes before moving on.
$ git add --all .
$ git commit -m "Add Dockerfile & call migrations on startup."
$ git push
Persisting data
We still have some issues. Currently, if we stop the postgres Docker container process and restart it, we lose all of the films. Also, we are communicating between the two Docker containers via our host network, which means that if you also ran a PostgreSQL server locally, you would get port clashes.
Let’s fix both of those issues by creating a Docker Compose file. Our Docker Compose file will build our application’s Dockerfile, set the appropriate environment variables, and spin up the postgres instance. Docker Compose will facilitate communication between the application and database via a bridge network, so we don’t get port clashes on localhost except for the application’s exposed port 3000.
Create a docker-compose.yml file in the root project directory and add the following:
version: '3.0'
services:
  postgres:
    restart: 'always'
    environment:
      - "POSTGRES_USER=filmuser"
      - "POSTGRES_DB=filmdb"
      - "POSTGRES_PASSWORD=${DB_PASSWORD}"
    volumes:
      - /tmp/postgresdata:/var/lib/postgresql/data
    image: 'postgres:alpine'
  filmapp:
    restart: 'always'
    ports:
      - '3000:3000'
    environment:
      - "DB_PASSWORD=${DB_PASSWORD}"
      - "DB_HOST=postgres"
    build:
      context: .
      dockerfile: Dockerfile
This file registers two services in docker-compose: one called postgres, which uses the postgres:alpine image and sets the appropriate postgres environment variables, and one called filmapp, which builds our Dockerfile, runs it exposing port 3000, and sets its environment variables.
You can also see that we have defined a volume that maps the directory /tmp/postgresdata on your local machine to /var/lib/postgresql/data in the container, which is the data directory for postgres.
This means that when we run our docker-compose process, any data stored in the database will be written to /tmp/postgresdata locally and will persist even after we restart the docker-compose process.
Let’s try this out. First, we build the docker-compose images (make sure you have set the DB_PASSWORD environment variable first).
$ docker-compose build
postgres uses an image, skipping
Building filmapp
Step 1/5 : FROM openjdk:8u181-alpine3.8
---> 04060a9dfc39
Step 2/5 : WORKDIR /
---> Using cache
---> 2752489e606e
Step 3/5 : COPY target/film-ratings.jar film-ratings.jar
---> b855626e4a45
Step 4/5 : EXPOSE 3000
---> Running in 7721d74eee62
Removing intermediate container 7721d74eee62
---> f7caccf63c3b
Step 5/5 : CMD java -jar film-ratings.jar
---> Running in 89b75d045897
Removing intermediate container 89b75d045897
---> 48303637af01
Successfully built 48303637af01
Successfully tagged film-ratings_filmapp:latest
Now, let’s start docker-compose (quit any running Docker postgres or filmapp instances first).
$ docker-compose up
Starting film-ratings_filmapp_1 ... done
Starting film-ratings_postgres_1 ... done
Attaching to film-ratings_filmapp_1, film-ratings_postgres_1
...
postgres_1 | 2019-01-22 18:35:58.117 UTC [1] LOG: database system is ready to accept connections
filmapp_1 | 19-01-22 18:36:06 5d9729ccfdd0 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["CREATE TABLE ragtime_migrations (id varchar(255), created_at varchar(32))"], :elapsed 4}
filmapp_1 | 19-01-22 18:36:06 5d9729ccfdd0 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["SELECT id FROM ragtime_migrations ORDER BY created_at"], :elapsed 11}
filmapp_1 | 19-01-22 18:36:06 5d9729ccfdd0 REPORT [duct.migrator.ragtime:14] - :duct.migrator.ragtime/applying :film-ratings.migrations/create-film#11693a5d
filmapp_1 | 19-01-22 18:36:06 5d9729ccfdd0 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["CREATE TABLE film (id SERIAL PRIMARY KEY, name TEXT UNIQUE, description TEXT, rating INTEGER)"], :elapsed 12}
filmapp_1 | 19-01-22 18:36:06 5d9729ccfdd0 INFO [duct.database.sql.hikaricp:30] - :duct.database.sql/query {:query ["INSERT INTO ragtime_migrations ( id, created_at ) VALUES ( ?, ? )" ":film-ratings.migrations/create-film#11693a5d" "2019-01-22T18:36:06.120"], :elapsed 3}
filmapp_1 | 19-01-22 18:36:06 5d9729ccfdd0 INFO [duct.database.sql.hikaricp:31] - :duct.database.sql/batch-query {:queries [], :elapsed 0}
filmapp_1 | 19-01-22 18:36:06 5d9729ccfdd0 REPORT [duct.server.http.jetty:13] - :duct.server.http.jetty/starting-server {:port 3000}
As you can see from the messages, we have started the postgres service followed by the filmapp service. The filmapp service has connected to the postgres service (via the DB_HOST=postgres environment variable, which is picked up in config.edn), has run the migration to add the film table, and is now listening on port 3000.
You can now try adding some films. Then, stop the docker-compose process, either by running docker-compose down in the project root directory in another terminal session or by pressing CTRL-C in the running session. If you then bring the docker-compose services back up using docker-compose up -d, you should find that any film data you’ve added has persisted.
Note: The -d flag runs docker-compose detached, so you will have to run docker-compose logs to see the log output and docker-compose down to bring down the services.
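While the services are up (or after a clean shutdown), you should also see PostgreSQL’s data files on the host. The exact contents vary by PostgreSQL version, but will look something like:
$ ls /tmp/postgresdata
base  global  pg_hba.conf  pg_stat  pg_wal  postgresql.conf  ...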
If you ever want to start with an empty database again, simply delete the /tmp/postgresdata directory and bring up the docker-compose services again.
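For example (the sudo is my assumption: the data files are written by the container’s postgres user, so an unprivileged rm may be refused):
$ docker-compose down
$ sudo rm -rf /tmp/postgresdata
$ docker-compose up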
Again, let’s commit our docker-compose file before carrying on.
$ git add --all .
$ git commit -m "Added docker-compose"
$ git push
Publishing the Docker image to Docker Hub
We have almost finished what we set out to accomplish in this blog. One last thing we’d like to do is push our Docker image to Docker Hub, preferably as a part of our continuous integration system.
First, let’s do this manually.
Manual publish to Docker Hub
If you haven’t already done so, create an account on Docker Hub. Then, create a new repository called film-ratings-app in your account.
We want to publish just the Docker image for the application, as we won’t be using the docker-compose file for production. First, let’s rebuild the Docker image and tag it with our Docker Hub repository id (chrishowejones/film-ratings-app in my case):
$ docker build . -t chrishowejones/film-ratings-app
Sending build context to Docker daemon 23.72MB
Step 1/5 : FROM openjdk:8u181-alpine3.8
---> 04060a9dfc39
Step 2/5 : WORKDIR /
---> Using cache
---> 2752489e606e
Step 3/5 : COPY target/film-ratings.jar film-ratings.jar
---> Using cache
---> 60fe31dc32e4
Step 4/5 : EXPOSE 3000
---> Using cache
---> 672aa852b89a
Step 5/5 : CMD java -jar film-ratings.jar
---> Using cache
---> 1fdfcd0dc843
Successfully built 1fdfcd0dc843
Then, we need to push that image to Docker Hub like so (remember to use your own Docker Hub repository, not chrishowejones!):
$ docker push chrishowejones/film-ratings-app:latest
The push refers to repository [docker.io/chrishowejones/film-ratings-app]
25a31ca3ed23: Pushed
...
latest: digest: sha256:bcd2d24f7cdb927b4f1bc79c403a33beb43ab5b2395cbb389fb04ea4fa701db2 size: 1159
Add a CircleCI job to build Docker images
OK, we’ve proved that we can push manually. Now, let’s look at setting up CircleCI to do it for us whenever we tag a version of our repository.
First, we want to use a feature of CircleCI called executors to reduce duplication. As this feature was only introduced in version 2.1, we need to open our .circleci/config.yml file and change the version reference from 2 to 2.1.
version: 2.1
jobs:
...
We need to add a job to build the Docker image for the application and another job to publish the Docker image. We will add two workflows to control when our various build steps are run.
Let’s start by adding an executor at the top of the file which will declare some useful stuff we want to reuse in our two new jobs.
version: 2.1
executors:
  docker-publisher:
    working_directory: ~/cci-film-ratings # directory where steps will run
    environment:
      IMAGE_NAME: chrishowejones/film-ratings-app
    docker:
      - image: circleci/buildpack-deps:stretch
jobs:
  ...
This executor declares the working directory, a local environment variable IMAGE_NAME, and a Docker image that has the buildpack dependencies we need to support executing Docker commands. As before, you will need to change the image name value to be prefixed with your Docker Hub user, not mine.
Before we add the job to build our Docker image, we need to ensure that the Uberjar of the application, which was built in the build job, persists into the new job we are about to write. So, at the bottom of the build job, we need to add the CircleCI command to persist it to the workspace.
      - run: lein do test, uberjar
      - persist_to_workspace:
          root: ~/cci-film-ratings
          paths:
            - target
This persists the target directory so that we can re-attach it in the new job. At this point, let’s add that job after our build job:
  build-docker:
    executor: docker-publisher
    steps:
      - checkout
      - attach_workspace:
          at: .
      - setup_remote_docker
      - run:
          name: Build latest Docker image
          command: docker build . -t $IMAGE_NAME:latest
      - run:
          name: Build tagged Docker image
          command: docker build . -t $IMAGE_NAME:${CIRCLE_TAG}
      - run:
          name: Archive Docker image
          command: docker save -o image.tar $IMAGE_NAME
      - persist_to_workspace:
          root: ~/cci-film-ratings
          paths:
            - ./image.tar
OK, let’s walk through what this build-docker job is doing. First, we reference our docker-publisher executor so we have the working directory, the IMAGE_NAME variable, and the correct Docker image available.
Next, we check out our project so that we have the Dockerfile for the application available. After that, we attach the saved workspace at our current working directory so that we have the target directory, with its jars, available on the right path for the Dockerfile.
The setup_remote_docker command creates an environment, remote from our primary container, that runs a Docker engine; we can use this engine to execute Docker build/publish commands (although other Docker events occur in our primary container as well).
The next two run commands execute Docker builds on the Dockerfile and tag the resulting images as chrishowejones/film-ratings-app:latest and chrishowejones/film-ratings-app:${CIRCLE_TAG}, respectively. In your case, you should have changed IMAGE_NAME to the appropriate prefix for your Docker Hub account, not chrishowejones.
The ${CIRCLE_TAG} variable will be interpolated from the Git tag associated with this build. The idea here is to trigger this job when we tag a commit and push it to GitHub. For argument’s sake, if we tag a commit as 0.1.0 and push that to GitHub, then when our build-docker job runs, it will build a Docker image tagged latest, and will also build the same image tagged 0.1.0.
The docker save command saves all of the Docker images for IMAGE_NAME to a tar archive file, which we then persist in the persist_to_workspace command so we can use it in the next job.
Add a CircleCI job to publish to Docker Hub
We now have a job that builds two Docker images and persists them so that we can use them in the next job which will push these two images to Docker Hub for us.
Let’s add that job after the build-docker one in config.yml:
  publish-docker:
    executor: docker-publisher
    steps:
      - attach_workspace:
          at: .
      - setup_remote_docker
      - run:
          name: Load archived Docker image
          command: docker load -i image.tar
      - run:
          name: Publish Docker Image to Docker Hub
          command: |
            echo "${DOCKERHUB_PASS}" | docker login -u "${DOCKERHUB_USERNAME}" --password-stdin
            docker push $IMAGE_NAME:latest
            docker push $IMAGE_NAME:${CIRCLE_TAG}
Let’s examine what this job does. First, it reuses our executor with the workspace, environment variable, and image. Next, it attaches the persisted workspace to the working directory. Then, we use the setup_remote_docker command to get the remote Docker engine so we can push images.
After that, we run the Docker command to load the previously stored Docker images from the persisted tar file. The next run command logs in to Docker Hub using two, as yet unset, environment variables, DOCKERHUB_USERNAME and DOCKERHUB_PASS, and then pushes the two Docker images built in the previous job to Docker Hub.
Before we go any further, let’s set those two environment variables in CircleCI so that we don’t forget. Log in to https://circleci.com/ and go to your dashboard (https://circleci.com/dashboard). Select the project settings for your project (click the cog icon next to the project listed in the Jobs sidebar).
Next, select Environment Variables from the BUILD SETTINGS section and add the two new variables, DOCKERHUB_USERNAME and DOCKERHUB_PASS, setting their values to your Docker Hub username and password, respectively.
Add a workflow to run the jobs when a new version is published
We now have the jobs that we need to build and publish the Docker images, but we have not specified how to execute these jobs. By default, CircleCI will run a job called build on every push to GitHub, but additional jobs will be ignored unless we set up workflows to run them. Let’s add workflows to the bottom of the .circleci/config.yml file.
workflows:
  version: 2.1
  main:
    jobs:
      - build
  build_and_deploy:
    jobs:
      - build:
          filters:
            branches:
              ignore: /.*/
            tags:
              only: /^\d+\.\d+\.\d+$/
      - build-docker:
          requires:
            - build
          filters:
            branches:
              ignore: /.*/
            tags:
              only: /^\d+\.\d+\.\d+$/
      - publish-docker:
          requires:
            - build-docker
          filters:
            branches:
              ignore: /.*/
            tags:
              only: /^\d+\.\d+\.\d+$/
The main workflow specifies that the build job should execute, and sets no filters on when this will happen, so the build job executes on every push to GitHub.
The build_and_deploy workflow is a bit more complex. In this workflow, we specify that the build job will run when we push to GitHub any tag that conforms to the regular expression /^\d+\.\d+\.\d+$/. This regular expression is a simplistic match on tags that conform to semantic versioning, e.g., 1.0.1. Therefore, if we push a tag like 1.0.1 to GitHub, the build job will run.
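CircleCI evaluates these filters as full-match, Java-style regular expressions, so you can sanity-check the pattern in a Clojure REPL, which uses the same regex engine:
(re-matches #"^\d+\.\d+\.\d+$" "1.0.1")  ;; => "1.0.1" (triggers the workflow)
(re-matches #"^\d+\.\d+\.\d+$" "v1.0.1") ;; => nil (the leading "v" does not match)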
This workflow next specifies that the build-docker job will run under the same conditions, but requires the build job to have completed successfully first (this is what the requires clause does). The last step in the workflow is to run the publish-docker job under the same tag-push conditions, but only if the previous build-docker job completes successfully.
Our completed .circleci/config.yml file should look like this (remember to change the references to chrishowejones to your own Docker Hub account):
version: 2.1
executors:
  docker-publisher:
    working_directory: ~/cci-film-ratings # directory where steps will run
    environment:
      IMAGE_NAME: chrishowejones/film-ratings-app
    docker:
      - image: circleci/buildpack-deps:stretch
jobs:
  build:
    working_directory: ~/cci-film-ratings # directory where steps will run
    docker:
      - image: circleci/clojure:lein-2.8.1
    environment:
      LEIN_ROOT: nbd
      JVM_OPTS: -Xmx3200m # limit the maximum heap size to prevent out of memory errors
    steps:
      - checkout
      - restore_cache:
          key: film-ratings-{{ checksum "project.clj" }}
      - run: lein deps
      - save_cache:
          paths:
            - ~/.m2
          key: film-ratings-{{ checksum "project.clj" }}
      - run: lein do test, uberjar
      - persist_to_workspace:
          root: ~/cci-film-ratings
          paths:
            - target
  build-docker:
    executor: docker-publisher
    steps:
      - checkout
      - attach_workspace:
          at: .
      - setup_remote_docker
      - run:
          name: Build latest Docker image
          command: docker build . -t $IMAGE_NAME:latest
      - run:
          name: Build tagged Docker image
          command: docker build . -t $IMAGE_NAME:${CIRCLE_TAG}
      - run:
          name: Archive Docker images
          command: docker save -o image.tar $IMAGE_NAME
      - persist_to_workspace:
          root: ~/cci-film-ratings
          paths:
            - ./image.tar
  publish-docker:
    executor: docker-publisher
    steps:
      - attach_workspace:
          at: .
      - setup_remote_docker
      - run:
          name: Load archived Docker image
          command: docker load -i image.tar
      - run:
          name: Publish Docker Image to Docker Hub
          command: |
            echo "${DOCKERHUB_PASS}" | docker login -u "${DOCKERHUB_USERNAME}" --password-stdin
            docker push $IMAGE_NAME:latest
            docker push $IMAGE_NAME:${CIRCLE_TAG}
workflows:
  version: 2.1
  main:
    jobs:
      - build
  build_and_deploy:
    jobs:
      - build:
          filters:
            branches:
              ignore: /.*/
            tags:
              only: /^\d+\.\d+\.\d+$/
      - build-docker:
          requires:
            - build
          filters:
            branches:
              ignore: /.*/
            tags:
              only: /^\d+\.\d+\.\d+$/
      - publish-docker:
          requires:
            - build-docker
          filters:
            branches:
              ignore: /.*/
            tags:
              only: /^\d+\.\d+\.\d+$/
You can now commit this config to GitHub.
$ git add --all .
$ git commit -m "Added workflow & jobs to publish to Docker Hub."
$ git push
You can now tag the current build and check that the new workflow runs to build and push the Docker images to Docker Hub:
$ git tag -a 0.1.0 -m "v0.1.0"
$ git push --tags
This will create an annotated tag and push it to your GitHub repository. That will trigger the build_and_deploy workflow to build the application Uberjar, build the Docker images (latest and 0.1.0), and push both of these images to your Docker Hub repository for film-ratings-app. You can check this in the CircleCI dashboard and by browsing to your film-ratings-app repository in Docker Hub.
Summary
Congratulations! If you’ve gotten this far in the series, you have created a simple Clojure web application and packaged it using Docker and Docker Compose so that you can run it in a production-like environment locally. You have also learned how to get CircleCI to build, test, package, and publish your application as a Docker image.
In the next blog post in this series, I will step through how to set up a fairly complex AWS environment to run our application in the cloud using Terraform.