Deploying a Hugo Static Site Using GitLab, CI/CD, & SSH

Recently (April 2018), I redeployed grh.am using Hugo after running the site with Jekyll for a number of years (which I never really blogged about; the last blog post talks about Pelican!).

Previously, deployment felt a little clunky, requiring a git push to a remote repository set up on a server, where a git hook picks up on the new commit and runs Jekyll (as per the Jekyll documentation).

This time round, I fancied doing it a little differently & automating all of the deployment whilst also removing a lot of the moving parts to limit areas for things to go wrong. I’ve been using GitLab for a while now thanks to their free private repositories & various other wonderful offerings, one of which is CI/CD which we’ll be using a little later in this post.

Workflow

The process for publishing a new post or making any changes is as follows:

  1. Write and/or draft a blog post, using hugo serve -D locally to ensure formatting etc. is correct.
  2. git commit & git push to a private GitLab repository
  3. GitLab CI/CD picks up this new commit, and runs two sequential jobs.
  4. Job 1 - Essentially runs hugo -d public_html to build the site into a folder called public_html, which GitLab then stores (as ‘artifacts’) to be used by the next job.
  5. Job 2 - Picks up the artifacts (i.e. the deploy output from Hugo), and runs rsync over SSH into the grh.am server, where it overwrites the public_html folder where all the HTML & static files are kept to then be served by nginx.

And that’s it. There are a few extra details, such as drafts being ignored and the CI/CD pipeline within GitLab only running on the master branch of the repository, but that is essentially it.

Let’s take a closer look at those two jobs, and see how they work…

Job 1 - Hugo

To go from commit and push to seeing the changes live on grh.am, we first need to make hugo build the static site, and this is what we get from Job 1.

GitLab CI can be configured from a .gitlab-ci.yml file within your repository. Below is the config for the first job, formatted in YAML:

stages:
  - build
  - deploy
build:
  stage: build
  image: jojomi/hugo
  script:
  - hugo version
  - git submodule update --init --recursive
  - hugo -d public_html
  artifacts:
    paths:
    - public_html
  only:
  - master

At the top of the file, we outline the stages of our pipeline. In this instance, we only have two - build (job 1) & deploy (job 2). We will look at the config for deploy shortly.

Next, we define the build job, tell GitLab which stage it belongs to, and set a Docker image to use. I found the best Docker image for the job to be jojomi/hugo, which is regularly updated & well maintained. We can then tell GitLab CI what commands to run within this Docker container. In this instance, I have made it print out the version of hugo that is running (which we can see in the CI logs), update any submodules (in my case, a theme), and then run hugo -d public_html.

We then tell GitLab CI to treat this new folder public_html as an artifact which allows us to reuse this output in other jobs, as well as simply making it available for download from the CI web front-end.

[Screenshot: GitLab CI artifacts]

Lastly, we tell GitLab CI to only run on the master branch.

Job 2 - rsync & SSH

So we have our static source files as artifacts, but how do we now get these onto our server for nginx to then serve? The answer is of course another Docker image. Here is the config for the deploy stage of our CI pipeline:

deploy:
  stage: deploy
  image: alpine:latest
  tags:
  - private
  before_script:
    - apk update && apk add openssh-client bash rsync
  script:
  - eval $(ssh-agent -s)
  - bash -c 'ssh-add <(echo "${SSH_PRIVATE_KEY}")'
  - mkdir -p ~/.ssh
  - echo "${SSH_HOST_KEY}" > ~/.ssh/known_hosts
  - rsync -hrvz --delete --exclude=_ -e 'ssh -p 2222' public_html/ "${SSH_USER_HOST_LOCATION}"
  only:
  - master

Here we are using a bog-standard Alpine Linux Docker image, where we then load our SSH private key, add our server key to known_hosts, and fire up rsync! GitLab very kindly outline most of this in their documentation: Using SSH keys with GitLab CI/CD, plus a very useful example repository with a .gitlab-ci.yml ready to go.
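A note on the known_hosts step: because the deploy connects on a non-default port, OpenSSH looks the host up in the [host]:port form, so SSH_HOST_KEY has to hold a complete line in that shape. The sketch below uses a hypothetical helper (known_hosts_line) with a placeholder host, port, and key, purely to show the format. In practice the line is usually captured with ssh-keyscan -p <port> <host> and then checked against the server's real key.

```shell
# Hypothetical helper: build a known_hosts line for a host on a given port.
# Port 22 uses the bare hostname; any other port uses the [host]:port form.
known_hosts_line() {
  host=$1
  port=$2
  key_type=$3
  key_data=$4
  if [ "$port" = "22" ]; then
    printf '%s %s %s\n' "$host" "$key_type" "$key_data"
  else
    printf '[%s]:%s %s %s\n' "$host" "$port" "$key_type" "$key_data"
  fi
}

# Placeholder values only; the key data here is not a real host key
known_hosts_line example.com 2222 ssh-ed25519 PLACEHOLDERKEYDATA
```

If the stored line uses the wrong form, SSH should simply refuse the connection in a non-interactive job, which is the safe failure mode.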

From the top, we outline which stage we are configuring, and set the Docker image we wish to use. I chose Alpine as it is tiny, which keeps build times down, and it is well regarded in the Docker community as a good base OS/image.

We then come across tags, which tell GitLab CI to only run this job on machines that are associated with that tag. Because we are handling SSH private keys here, I have set up a Docker instance of gitlab-runner on my home lab, limiting the possibility of that key going walkabout or being gleaned somehow from other GitLab CI jobs.

Before the main script runs, we run a command to ensure we have the right packages installed & up to date. In this instance, we install or update openssh-client (for SSH), bash, and rsync (for.. well, rsync).
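The bash package is presumably there for the ssh-add line: it relies on process substitution, <(...), which bash provides but which Alpine's default BusyBox shell, as far as I'm aware, does not. Here is a minimal sketch of what that construct does, with a dummy value (FAKE_KEY, a made-up name) standing in for the real key:

```shell
# Dummy stand-in for the CI variable; never echo a real key into job logs
FAKE_KEY='not-a-real-key'

# <(...) expands to a path whose contents are the inner command's output
fd_path=$(FAKE_KEY="$FAKE_KEY" bash -c 'echo <(echo "${FAKE_KEY}")')
contents=$(FAKE_KEY="$FAKE_KEY" bash -c 'cat <(echo "${FAKE_KEY}")')

echo "the program is handed a path: ${fd_path}"
echo "reading that path yields: ${contents}"
```

The upshot is that ssh-add reads the key as if it were a file, without the key ever being written to disk inside the container.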

And then finally we run a number of commands on our Docker instance:

  • eval $(ssh-agent -s) to ensure SSH-Agent is running
  • bash -c 'ssh-add <(echo "${SSH_PRIVATE_KEY}")' to add our private key into the ssh-agent key store
  • mkdir -p ~/.ssh & echo "${SSH_HOST_KEY}" > ~/.ssh/known_hosts to add our remote server as a known host when using SSH. This means the deploy will fail if we get man-in-the-middled, and it allows the job to run without interaction (no need to approve the remote host key for rsync)
  • rsync -hrvz --delete --exclude=_ -e 'ssh -p 2222' public_html/ "${SSH_USER_HOST_LOCATION}" which runs rsync with a number of options… A full overview can be seen on explainshell.com, but essentially it removes any files on the remote server that aren’t part of public_html, skips anything named _ (as written, the pattern only matches that exact name; a pattern such as _* would be needed to skip everything beginning with an underscore), and sets some SSH options in the form of a port.
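Since --delete is unforgiving, it can be worth trying the rsync flags locally before pointing them at a real server. The sketch below swaps the remote location for two scratch directories (so there is no -e option), and the file names are made up for the demonstration. It also shows that an exclude pattern of _ matches only an entry named exactly _, not everything that begins with an underscore.

```shell
# Scratch directories stand in for public_html/ and the remote location
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/_drafts"
echo "new"   > "$src/index.html"
echo "draft" > "$src/_drafts/note.html"   # begins with _, but is not named exactly _
echo "skip"  > "$src/_"                   # named exactly _, so the exclude applies
echo "old"   > "$dest/stale.html"         # exists only on the "server" side

if command -v rsync >/dev/null 2>&1; then
  # Same flags as the deploy job, minus the SSH transport
  rsync -hrvz --delete --exclude=_ "$src/" "$dest/"
  ls -A "$dest"
else
  echo "rsync is not installed; nothing to demonstrate"
fi
```

After the run, stale.html is gone from the destination, index.html and the _drafts folder have been copied, and the file named _ has been left behind.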

Because public_html is an artifact from our previous job, GitLab CI shares that with our deploy job seamlessly.

Clearly there are some variables being used here, hence the ${SSH_PRIVATE_KEY} etc. These are set within GitLab CI/CD settings, so that no credentials have to be hardcoded into the git repository.

[Screenshot: GitLab CI variables]
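Because these variables live outside the repository, a typo on the settings page tends to surface as a confusing SSH or rsync failure. One optional guard, which is not part of the config in this post, is to check for them before doing anything else. In this sketch missing_vars is a hypothetical helper, and the values assigned are placeholders for a local dry run:

```shell
# Hypothetical helper: print the names of required variables that are unset or empty
missing_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable called $name, defaulting to empty
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  echo "${missing# }"
}

# Placeholder values for a local dry run; in CI these come from the project settings
SSH_PRIVATE_KEY='dummy-key'
SSH_HOST_KEY=''
unset SSH_USER_HOST_LOCATION

result=$(missing_vars SSH_PRIVATE_KEY SSH_HOST_KEY SSH_USER_HOST_LOCATION)
if [ -n "$result" ]; then
  echo "Missing CI variables: $result"
fi
```

In a pipeline, only the function and the final check would be needed, with the job exiting early when the result is non-empty.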

And once again, this will only run on the master branch of our repository.

And all together now…

Finally, we end up with the following .gitlab-ci.yml file (plus some variables set on the GitLab CI/CD settings page):

stages:
  - build
  - deploy
build:
  stage: build
  image: jojomi/hugo
  script:
  - hugo version
  - git submodule update --init --recursive
  - hugo -d public_html
  artifacts:
    paths:
    - public_html
  only:
  - master
deploy:
  stage: deploy
  image: alpine:latest
  tags:
  - private
  before_script:
    - apk update && apk add openssh-client bash rsync
  script:
  - eval $(ssh-agent -s)
  - bash -c 'ssh-add <(echo "${SSH_PRIVATE_KEY}")'
  - mkdir -p ~/.ssh
  - echo "${SSH_HOST_KEY}" > ~/.ssh/known_hosts
  - rsync -hrvz --delete --exclude=_ -e 'ssh -p 2222' public_html/ "${SSH_USER_HOST_LOCATION}" 
  only:
  - master

As soon as that config file is pushed to our repository, GitLab picks it up, and runs it. Assuming everything else is set up correctly (SSH connectivity works, public key authentication works etc.), you’ll see the contents of public_html deployed to the location specified in SSH_USER_HOST_LOCATION.

Things to consider

  • Please, please, please, don’t use your standard server account for these deployments – set up a specific gitlabci account with locked down permissions and access.
  • Everything here is set up to be relatively opinionated, such as port numbers & job tags - please consider whether you even need these to be set for your deployments to work.
  • If you need to troubleshoot GitLab CI jobs, GitLab provides the logs from the Docker containers so you can follow them through and figure out where stuff is going pear-shaped. You can see an example of this logging below, where each command is prefixed with $; it shows the output of hugo version etc.
Fetching changes...
Removing public_html/
HEAD is now at 4a467a7 Remove reliance on an unknown docker image to handle SSH and rsync
From https://gitlab.com/graystevens/grh.am-hugo
   4a467a7..a33f4dc  master     -> origin/master
Checking out a33f4dc1 as master...
Skipping Git submodules setup
$ hugo version
Hugo Static Site Generator v0.40.2 linux/amd64 BuildDate: 2018-04-30T06:47:44Z
$ git submodule update --init --recursive
$ hugo -d public_html
Building sites …