Build Rails images efficiently with Docker and GitLab CI: Complete Tutorial in 2019 (Part 1)

29 Jun, 2019

Background

Docker is an amazing tool for DevOps. It can save a considerable amount of time, even for a personal project. However, it is not very easy to adopt Docker in a Rails project effectively. We usually run into several problems while dockerizing a Rails project:

The final Docker image size is too large.
The build process takes a very long time. Installing the project dependencies requires many build tools and libraries to be installed first, which is slow.
It is not easy to reuse the Docker image for testing.

In this blog post, you will find a relatively complete, step-by-step tutorial for dockerizing a Rails project with the help of GitLab CI. It is not a 100% perfect setup for Rails DevOps, but I believe it is a very good start for you 😊.

tl;dr

If you are very familiar with Rails and Docker, or you have already tried this stack a number of times, please skip Sections 1 and 2, or jump to the GitHub repository (https://github.com/imWildCat/rails-docker-demo) or the GitLab repository (https://gitlab.com/imWildCat/docker-rails-demo). It is also a good choice to read the Summary directly if you have enough experience with Docker and GitLab CI.

Section 1: Get started with Rails and Docker

Please note that Rails 6.0.0 stable has not been released at the time of writing (2019-06-29). Please install Rails 6.0.0.rc1 using gem install --pre rails for this tutorial.

It is always good to start with a simple and clean project. We can use the following command to create a Rails project:

rails new -d postgresql --webpack react rails_docker_demo

At this stage, we do not need to write any Rails code, so let’s jump to the Docker part first.

So what should the Dockerfile look like?

FROM ruby:2.6.3-alpine3.8

# Install alpine packages
RUN apk add --no-cache \
  build-base \
  busybox \
  ca-certificates \
  cmake \
  curl \
  git \
  tzdata \
  gnupg1 \
  graphicsmagick \
  libffi-dev \
  libsodium-dev \
  nodejs \
  yarn \
  openssh-client \
  postgresql-dev

# Define WORKDIR
WORKDIR /app

# Use Bundler 2 to avoid "exit with code 1" bugs during integration tests
RUN gem install bundler -v 2 --no-doc

# Copy dependency manifest
COPY Gemfile Gemfile.lock /app/

# Install Ruby dependencies
RUN bundle update --bundler
RUN bundle install --jobs $(nproc) --retry 3 --without development test \
      && rm -rf /usr/local/bundle/bundler/gems/*/.git /usr/local/bundle/cache/

# Copy JavaScript dependencies
COPY package.json yarn.lock /app/

# Install JavaScript dependencies
RUN yarn install

# Define basic environment variables
ENV NODE_ENV production
ENV RAILS_ENV production
ENV RAILS_LOG_TO_STDOUT true

# Copy source code
COPY . /app/

# Build front-end assets
RUN bundle exec rails webpacker:verify_install
RUN SECRET_KEY_BASE=nein bundle exec rails assets:precompile

RUN chmod +x ./bin/entrypoint.sh

# Define entrypoint
ENTRYPOINT ["./bin/entrypoint.sh"]

You can also find the content of bin/entrypoint.sh in the project repository on the v1-base-case branch (GitHub or GitLab).

Now it comes to the part for .gitlab-ci.yml:

variables:
  # Prevent any locale errors
  LC_ALL: C.UTF-8
  LANG: en_US.UTF-8
  LANGUAGE: en_US.UTF-8

build_image:
  stage: build
  image: docker:stable
  services:
    - docker:dind
  variables:
    IMAGE_TAG: $CI_REGISTRY_IMAGE:latest
  before_script:
    - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
  script:
    - docker pull $IMAGE_TAG || echo "No pre-built image found."
    - docker build --cache-from $IMAGE_TAG -t $IMAGE_TAG . || docker build -t $IMAGE_TAG . # Use cache for building if possible
    - docker push $IMAGE_TAG

This CI config tries to reuse the layers of the previous image as a build cache (using --cache-from), but it does not help much: once we change the Alpine packages, the image is built from scratch again. Before improving the cache, we’d like to deploy this Rails image first, because the Docker image built at this stage cannot be deployed directly.
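
The pull-then-build fallback used in the script section can also be tried outside CI. The following is only a minimal sketch, not the script from the repository: the function name build_with_cache and the DOCKER override variable are assumptions introduced here so that the commands can be dry-run (for example with DOCKER=echo).

```shell
#!/bin/sh
# Hypothetical helper: DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# Pull the previous image and reuse its layers as build cache when it exists;
# otherwise fall back to a plain build without --cache-from.
build_with_cache() {
  image_tag="$1"
  if $DOCKER pull "$image_tag"; then
    $DOCKER build --cache-from "$image_tag" -t "$image_tag" .
  else
    echo "No pre-built image found; building without cache." >&2
    $DOCKER build -t "$image_tag" .
  fi
}
```

Unlike the CI script, this sketch decides on the cache based on whether the pull succeeded, which keeps the two build commands easier to read.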

Time for the build in this section:

Rails app: 5:51 from scratch (approx.)

For details, please read the Merge Request: https://gitlab.com/imWildCat/docker-rails-demo/merge_requests/1.

Section 1.1: Improvements for deployment

What do we need for deployment?

A reverse proxy, such as nginx
Database configuration
Secrets or master.key
(Optional) Task queue, such as sidekiq

Since this topic is not the main focus of this article, I prefer to show you the merge request with the changes instead:

https://gitlab.com/imWildCat/docker-rails-demo/merge_requests/3

Here are the steps for deployment:

Download the docker-compose.yml file to your host machine and adjust it to your needs.
(Optional) Copy master.key to the path on the host machine designated in docker-compose.yml.
Pull the images: sudo docker-compose pull.
Start the services: sudo docker-compose up.
(Optional) After stopping these services, you can run sudo docker-compose down to remove all the resources allocated before.
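
The steps above can be wrapped in a small script. This is a sketch under assumptions: the function names and the COMPOSE override variable are hypothetical, and it assumes docker-compose.yml is already in the current directory. It also starts the services in detached mode (-d), which the steps above do not do.

```shell
#!/bin/sh
# Hypothetical helper: COMPOSE can be overridden (e.g. COMPOSE=echo) for a dry run.
COMPOSE="${COMPOSE:-sudo docker-compose}"

# Pull the latest images, then start the services in the background.
# Note: -d (detached mode) is an addition of this sketch.
deploy_stack() {
  $COMPOSE pull && $COMPOSE up -d
}

# Stop the services and remove the containers and networks created by `up`.
teardown_stack() {
  $COMPOSE down
}
```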

There are still several things we should know:

We may need to set the SECRET_KEY_BASE environment variable for the Rails server.
The port of the nginx server can be published on the host, or a host-level reverse proxy such as Traefik can be used.

Time for builds in this section:

Rails app: 5:09 from scratch (approx.), 1:47 with cache (approx.)
Reverse proxy: 0:43 (approx.)

After the deployment, you may see a 404 on the home page. That is fine because we have not added any business logic yet.

Section 2: Multi-stage building

The most obvious issue of the Docker build in the previous section is that the image size is too large (about 617MB). So what can we do to reduce the Docker image size?

Firstly, let’s have a look at the Dockerfile. It has roughly 3 parts:

Installation of the system dependencies
Installation of the project dependencies specified in Gemfile and package.json
Front-end resource generating and environment setup

Luckily, Docker has multi-stage builds, which we can leverage. In this tutorial, I won’t go too deep into this feature, but it has great potential that we can explore.

In this section, we should separate the original Dockerfile into 2 files:

Dockerfile.builder: For installation of dependencies
Dockerfile: For building the final Docker image

We should move all lines above the final Rails app setup from Dockerfile to Dockerfile.builder, creating a builder:

# Stage for dependencies installation
FROM ruby:2.6.3-alpine3.8 as builder

# Install alpine packages
RUN apk add --no-cache \
  build-base \
  busybox \
  ca-certificates \
  cmake \
  curl \
  git \
  tzdata \
  gnupg1 \
  graphicsmagick \
  libffi-dev \
  libsodium-dev \
  nodejs \
  yarn \
  openssh-client \
  postgresql-dev

# Define WORKDIR
WORKDIR /app

# Use Bundler 2 to avoid "exit with code 1" bugs during integration tests
RUN gem install bundler -v 2 --no-doc

# Copy dependency manifest
COPY Gemfile Gemfile.lock /app/

# Install Ruby dependencies
RUN bundle update --bundler
RUN bundle install --jobs $(nproc) --retry 3 --without development test

# Copy JavaScript dependencies
COPY package.json yarn.lock /app/

# Install JavaScript dependencies
RUN yarn install

We also need to update the Dockerfile, using 2 stages so that we can minimize the size of the final Docker image.

# ARG: https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
ARG BUILDER_IMAGE_TAG
FROM $BUILDER_IMAGE_TAG as builder

# Define basic environment variables
ENV NODE_ENV production
ENV RAILS_ENV production
ENV RAILS_LOG_TO_STDOUT true

# Copy source code
COPY . /app/

# Build front-end assets
RUN bundle exec rails webpacker:verify_install
RUN SECRET_KEY_BASE=nein bundle exec rails assets:precompile

RUN rm -rf node_modules

FROM ruby:2.6.3-alpine3.8 as deploy

RUN apk add --no-cache \
  ca-certificates \
  curl \
  tzdata \
  gnupg1 \
  graphicsmagick \
  libsodium-dev \
  nodejs \
  postgresql-dev \
  bash

# Define basic environment variables
ENV NODE_ENV production
ENV RAILS_ENV production
ENV RAILS_LOG_TO_STDOUT true
# Defined for future testing
ENV RAILS_SERVE_STATIC_FILES true

WORKDIR /var/www/app

COPY --from=builder /usr/local/bundle/ /usr/local/bundle/
COPY --from=builder /app/ /var/www/app/
# The entrypoint copies these files into public/ when the app starts.
# Otherwise, the asset files may not be updated if we use a named volume.
COPY --from=builder /app/public /var/www/app/public_temp

RUN chmod +x ./bin/entrypoint.sh

# Define entrypoint
ENTRYPOINT ["./bin/entrypoint.sh"]

After that, we should also update the CI config .gitlab-ci.yml:

# ...
build_image:
  stage: build
  image: docker:stable
  services:
    - docker:dind
  variables:
    IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
    BUILDER_IMAGE_TAG: $CI_REGISTRY_IMAGE/builder:latest
  before_script:
    - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
  script:
    - docker pull $BUILDER_IMAGE_TAG || echo "No pre-built image found."
    - docker build --cache-from $BUILDER_IMAGE_TAG -t $BUILDER_IMAGE_TAG -f Dockerfile.builder . || docker build -t $BUILDER_IMAGE_TAG -f Dockerfile.builder . 
    - docker push $BUILDER_IMAGE_TAG
    - docker pull $IMAGE_TAG || echo "No pre-built image found."
    - docker build --cache-from $IMAGE_TAG --build-arg BUILDER_IMAGE_TAG=${BUILDER_IMAGE_TAG} -t $IMAGE_TAG . || docker build --build-arg BUILDER_IMAGE_TAG=${BUILDER_IMAGE_TAG} -t $IMAGE_TAG . # Use cache for building if possible
    - docker push $IMAGE_TAG
# ...

Please note that we use build-time variables (--build-arg) here.
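
To see how the build-time variable connects the two Dockerfiles, the two builds can be run locally in sequence. This is a minimal sketch: the function name build_two_step, the DOCKER override variable and any tags passed to it are hypothetical and only serve as an illustration.

```shell
#!/bin/sh
# Hypothetical helper: DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# Build the builder image first, then pass its tag to the main Dockerfile,
# whose `ARG BUILDER_IMAGE_TAG` feeds the first `FROM` line.
build_two_step() {
  builder_tag="$1"
  app_tag="$2"
  $DOCKER build -t "$builder_tag" -f Dockerfile.builder . &&
    $DOCKER build --build-arg BUILDER_IMAGE_TAG="$builder_tag" -t "$app_tag" .
}
```

For example, build_two_step demo/builder:latest demo/app:dev would build both images with locally chosen tags.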

In addition, we also applied a small trick in bin/entrypoint.sh to handle the assets:

rm -rf public/* # Remove assets in named volume
cp -r public_temp/* public/ # Copy new files from new image
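
The same trick can be written as a function that is easy to try on a scratch directory. This is only a sketch and not the content of the real bin/entrypoint.sh: the function name and its app_dir parameter are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: replace the contents of <app_dir>/public with the
# assets baked into the image at <app_dir>/public_temp.
refresh_public_assets() {
  app_dir="$1"
  mkdir -p "$app_dir/public"
  # Remove stale files left in the named volume (dotfiles are kept, as with public/*).
  rm -rf "$app_dir"/public/*
  # Copy the fresh assets; the trailing /. copies the directory contents.
  cp -R "$app_dir/public_temp/." "$app_dir/public/"
}
```

Copying public_temp/. instead of public_temp/* avoids an error when the source directory happens to be empty.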

With these improvements, we can reduce the final image size from about 617MB to about 221MB (roughly 64% smaller), but the build time should be similar.
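
The size reduction can be checked with a one-line calculation. The helper name percent_saved is hypothetical; the two numbers are the image sizes reported in this section.

```shell
#!/bin/sh
# Print the percentage saved when going from <before> to <after>, with one decimal place.
percent_saved() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}
```

Calling percent_saved 617 221 prints 64.2.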

For details, please read the Merge Request: https://gitlab.com/imWildCat/docker-rails-demo/merge_requests/1.

Section 3: On-demand multi-stage building

After introducing multi-stage builds, you may wonder whether we can build the builder image only when necessary. Most commits or PRs do not change the dependencies; only the business logic is updated frequently. We can leverage the only keyword of the GitLab CI configuration (https://docs.gitlab.com/ee/ci/yaml/#onlyexcept-basic) together with the stages feature. They are really nice.

Firstly, we should add the stages definition to the top of the .gitlab-ci.yml file:

stages:
  - prebuild
  # - test # For future work
  - build
  # - deploy # For future work

Secondly, constructing the builder should be moved to the new prebuild stage:

construct_builder:
  stage: prebuild
  image: docker:stable
  services:
    - docker:dind
  only:
    changes:
      - Dockerfile
      - Dockerfile.builder
      - Gemfile
      - Gemfile.lock
      - package.json
      - yarn.lock
  variables:
    BUILDER_IMAGE_TAG: $CI_REGISTRY_IMAGE/builder:latest
  before_script:
    - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
  script:
    - docker pull $BUILDER_IMAGE_TAG || echo "No pre-built image found."
    - docker build --cache-from $BUILDER_IMAGE_TAG -t $BUILDER_IMAGE_TAG -f Dockerfile.builder . || docker build -t $BUILDER_IMAGE_TAG -f Dockerfile.builder . 
    - docker push $BUILDER_IMAGE_TAG

Please note that we keep using the latest tag for the latest image for simplicity. You might change it to suit your needs.
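
Conceptually, the only: changes rule asks whether any of the listed files was touched. As a rough local approximation (this is not how GitLab implements it, and the function name is hypothetical), the same question can be asked of a list of changed paths:

```shell
#!/bin/sh
# Hypothetical helper: read changed file paths from stdin (one per line) and
# succeed if any of them is a file that should trigger a builder rebuild.
needs_builder_rebuild() {
  while IFS= read -r path; do
    case "$path" in
      Dockerfile|Dockerfile.builder|Gemfile|Gemfile.lock|package.json|yarn.lock)
        return 0 ;;
    esac
  done
  return 1
}
```

It could be fed from Git, for example: git diff --name-only HEAD~1 HEAD | needs_builder_rebuild && echo "builder needs a rebuild".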

As a result, we can simplify the job build_image:

build_image:
  # ...
  script:
    - docker pull $BUILDER_IMAGE_TAG
    - docker build --build-arg BUILDER_IMAGE_TAG=${BUILDER_IMAGE_TAG} -t $IMAGE_TAG .
    - docker push $IMAGE_TAG

Similarly, we can also apply this to the build_reverse_proxy job. Its stage should be changed to prebuild as well, to cover the case where the construct_builder job fails: jobs in a later stage do not run after a failed stage, so in that edge case a change to the reverse proxy code would not trigger build_reverse_proxy.

build_reverse_proxy:
  stage: prebuild
  # ...
  only:
    changes:
      - misc/reverse_proxy/**/*
      - misc/reverse_proxy/Dockerfile
  # ...

Finally, we can build the base images only when necessary. The build process for the builder image takes a lot of time, so doing this saves both the GitLab CI monthly quota and our own time.

Time for builds in this section:

Builder (only runs while dependencies change): 4:38 (approx.)
Rails app: 2:14 (approx.), more than 50% faster than the from-scratch builds in the previous sections.
Reverse proxy (only runs when its own files change): No change.
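
The timings reported in this tutorial can be compared with a small helper. The function names are hypothetical; the durations are the approximate figures quoted in the sections above.

```shell
#!/bin/sh
# Convert a duration written as m:ss (e.g. 5:09) into seconds.
to_seconds() {
  echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# Print how much shorter <after> is than <before>, as a percentage with one decimal place.
time_saved_percent() {
  awk -v b="$(to_seconds "$1")" -v a="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", (b - a) / b * 100 }'
}
```

For instance, time_saved_percent 5:09 2:14 compares the from-scratch build of Section 1.1 with the Rails app build of this section and prints 56.6.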

For details, please read the Merge Request: https://gitlab.com/imWildCat/docker-rails-demo/merge_requests/1.

Summary

Basically, this tutorial uses the following technologies to accelerate the build process and minimize the final image size:

The --cache-from option: https://docs.docker.com/engine/reference/commandline/build/
Multi-stage build: https://docs.docker.com/develop/develop-images/multistage-build/
Build-time variables (--build-arg): https://docs.docker.com/engine/reference/commandline/build/#set-build-time-variables---build-arg
GitLab CI: only: https://docs.gitlab.com/ee/ci/yaml/#onlyexcept-basic

There are also some tricks (mostly mentioned above):

We may need to manually remove and add front-end assets if we use named volume.
If we do not want to provide secrets through the master.key used by Encrypted Credentials, Docker environment variables or secrets (https://docs.docker.com/engine/swarm/secrets/) can be used.

By using these techniques, we can not only save time, but also reduce the final image size so that we will be happy to develop and deploy the Rails app.

Future topics

There should have been two more sections about testing and linting, but I decided to defer these parts because the first 3 sections took me too much time. Actually, we can fully automate the DevOps process by leveraging Traefik and Portainer webhooks on Docker Swarm. I’d like to write another blog post about that in my spare time.

Tips

If you don’t want to run Docker on your development machine, VS Code Remote Development is an awesome way to work with it on a remote machine instead.