Tutorials · Last Updated Dec 20, 2024 · 13 min read

Deploying documentation to GitHub Pages with continuous integration


Continuous integration (CI) tools have evolved beyond running tests and reporting results into flexible, general-purpose computing environments. You can use CI to run full builds and send artifacts to external systems. If you’re already using a CI system, consider building and deploying your documentation on the same platform; it can be much more convenient than adopting another tool or service.

This tutorial provides an overview of some options available for building and deploying documentation. You will then explore the details of using CircleCI to deploy documentation to GitHub Pages. This workflow has proven to be convenient for teams already using those tools for hosting code and running automated tests.

Options for deploying documentation

API documentation is usually rendered from a codebase using a language-specific documentation tool (sphinx for Python or javadoc for Java). The documentation can be built on the developer’s local machine, in a CI environment, or by a documentation-specific hosting service.

Services for hosting documentation are usually language-specific and can be a great low-friction option for a team that writes projects in a single language. For example, Read the Docs has been the standard for the Python community. Read the Docs uses webhooks to watch commits to a hosted repository and automatically builds and renders documentation for each code update. It offers some benefits that could be difficult to replicate in your own pipeline, such as deploying multiple versions of documentation and maintaining links from rendered docs to source code.

However, Read the Docs’ limitations can pop up if your team needs to deploy docs for additional languages or if builds require uncommon system dependencies that can’t be installed via the pip or conda package managers. Using a documentation-specific service also means maintaining another set of user accounts and permissions.

The least infrastructure-dependent workflow for building documentation is for developers to build docs locally and check the results into the project repository. Most teams prefer to keep generated content out of source control to keep code reviews simpler and to lessen developer responsibility for building and committing the content, but some may enjoy seeing the revision history of documentation alongside the code. GitHub has developed support for this workflow by offering the option to render contents of a docs directory to GitHub Pages. Other setups may still need a separate deploy step for documentation in a CI system.

Instead, if a team decides to build documentation as part of a CI flow, content could be deployed to a variety of destinations: a locally maintained server, an object store like Amazon S3, GitHub Pages, or some other external hosting service. In most cases, the CI job will need some form of credentials to authenticate with the destination, which can be the most complex part of the flow. One of the advantages of GitHub Pages as a documentation host is the consolidation of permissions; any developer with admin access on a repository can set up deploys to GitHub Pages and provision the deploy keys needed for a CI service to commit content.

Options for deploying to GitHub Pages

GitHub offers three options for deploying a site to GitHub Pages, with different implications for workflows and credentials.

The oldest option, which you will use in this tutorial, triggers a deploy whenever you push to a special gh-pages branch. That branch can be maintained as an “orphan” branch with a completely separate revision history from main, though doing so by hand can be error-prone. In this case, you’ll build a CircleCI workflow that builds documentation, commits changes to the gh-pages branch using a purpose-built library, and then pushes the branch to GitHub using a deploy key that you will provision.
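
To make “orphan” concrete, here is roughly what a manual deploy to such a branch involves. This is a sketch in a throwaway repository (the branch and directory names mirror this tutorial); the tooling you’ll adopt later automates these steps for you:

```shell
set -e
# Work in a throwaway repository so this sketch is safe to run anywhere
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main
git config user.email "ci-build@example.com"
git config user.name "CI Build"
mkdir -p docs/_build/html
echo '<h1>mylib docs</h1>' > docs/_build/html/index.html
git add -A && git commit -qm "initial commit"

# Stash the rendered docs, since switching branches will clear the tree
out=$(mktemp -d) && cp -r docs/_build/html/. "$out"

# --orphan starts a branch with no parent commits: a separate history
git checkout -q --orphan gh-pages
git rm -rfq .                     # clear the inherited index and tree
cp -r "$out"/. .                  # the rendered HTML becomes the branch root
git add -A && git commit -qm "Deploy docs"
# (against a real remote you would now run: git push origin gh-pages)
git checkout -qf main
```

Note how easy it would be, mid-sequence, to commit the docs onto the wrong branch or lose uncommitted work, which is why a dedicated tool is attractive.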

The second option is to have GitHub Pages render the main branch. This can be useful for a repository that exists only to host documentation. It doesn’t help much if your goal is to benefit from keeping code and rendered documentation close together with a single permissions model.

Finally, GitHub Pages can render a docs directory on the main branch, which supports workflows where developers are expected to generate and commit documentation as part of their local workflows. This requires no CI platform and no additional credentials, but most teams prefer not to include generated content in their main branch.

Creating a basic Python project

You will start by building a small Python package that uses standard Python ecosystem tools for tests (pytest) and documentation (sphinx). You’ll then configure CircleCI to run tests, build documentation, and finally deploy to GitHub Pages via a gh-pages branch. Full code for the project is available in CIRCLECI-GWP/docs-on-gh-pages.

In a fresh directory, create a simple package called mylib with a single hello function. mylib/__init__.py looks like:

def hello():
    return 'Hello'

You also need to create a test directory with an empty __init__.py file and test_hello.py containing:

import mylib

def test_hello():
    assert mylib.hello() == 'Hello'

To run the tests, you’ll need pytest, so specify that in a requirements.txt file. You’ll also need to request sphinx, the documentation tool you’ll be using in the next section:

sphinx
pytest

Before installing the dependencies, you need to create a virtual environment. Use the following commands to create a virtual environment and activate it:

python3 -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`

Install the dependencies:

pip install -r requirements.txt
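
Before handing anything to CI, it’s worth confirming the test passes locally. The sketch below recreates this section’s files in a throwaway directory so it runs anywhere; in your real checkout you would simply run pytest from the project root:

```shell
set -e
# Recreate the tutorial's layout in a throwaway directory, then run pytest
tmp=$(mktemp -d) && cd "$tmp"
mkdir mylib test
printf 'def hello():\n    return "Hello"\n' > mylib/__init__.py
: > test/__init__.py            # the empty __init__.py from above
printf 'import mylib\n\ndef test_hello():\n    assert mylib.hello() == "Hello"\n' > test/test_hello.py
python3 -m pytest -q            # collects test/ and runs test_hello
```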

At this point, you can write a very simple CircleCI workflow containing a single job that runs your tests. Create a .circleci/config.yml file:

version: 2.1

executors:
  python-executor:
    docker:
      - image: python:3.7
    working_directory: ~/project

jobs:
  test:
    executor: python-executor
    steps:
      - checkout
      - run:
          name: Install dependencies
          command: pip install -r requirements.txt
      - run:
          name: Run tests
          command: pytest

workflows:
  version: 2
  build-and-deploy:
    jobs:
      - test

This code has a python-executor that uses the python:3.7 Docker image and sets the working directory to ~/project. The test job uses this executor and runs the steps to install dependencies and run tests. The workflows section specifies that the test job should be run.

Next, commit the changes and push your project to GitHub. Log into CircleCI and search for your project.

Search for project

Click the Set Up Project button. CircleCI should generate an initial build for the main branch which should come back green.

CircleCI build

Add docs

Now that you have a basic library with tests, you can set up the documentation framework. Since you have already installed sphinx in the previous step, you can now run sphinx-quickstart to generate a skeleton documentation project. Run:

sphinx-quickstart docs/ --project 'mylib' --author 'J. Doe'
# accept defaults at all the interactive prompts

The flags you used are:

  • --project 'mylib': This sets the project name in the documentation, which will appear in various places, including the generated HTML.
  • --author 'J. Doe': This sets the author’s name in the generated documentation.

sphinx-quickstart generates a Makefile, so building docs is as simple as calling make html from the docs/ directory. Codify that in a new job in your CircleCI flow. Add this under jobs:

docs-build:
  executor: python-executor
  steps:
    - checkout
    - run:
        name: Install dependencies
        command: pip install -r requirements.txt
    - run:
        name: Build documentation
        command: |
          cd docs
          make html
    - persist_to_workspace:
        root: docs/_build
        paths: html

Invoking make html populates a docs/_build/html directory containing the content that you want to deploy. The final persist_to_workspace step of your new docs-build job saves the contents of that directory to an intermediate location that will be accessible to later jobs in your workflow. For now, add this new job to your workflow:

workflows:
  version: 2
  build:
    jobs:
      - test
      - docs-build

and commit the results.

Even without deploying the rendered content, this job is now serving as a check on the integrity of your docs. If sphinx is unable to run successfully, this job will fail, letting you know something is wrong.

Deploying rendered docs to a gh-pages branch

At this point, you’re ready to build the final piece of your CI workflow: a job that deploys the built documentation by pushing it to the gh-pages branch of your repository.

You want gh-pages to be an “orphan” branch that tracks only the rendered docs and has a separate timeline from the source code in main. It’s possible to create such a branch and copy content into it using bare git command-line invocations, but the process is full of edge cases and can easily leave you with a corrupted working tree if anything goes wrong. Pulling in a purpose-built tool is a reasonable choice here, and there are several available as open source projects. The most popular among these at the moment is actually a Node.js module called gh-pages that includes a command-line interface, which is what you’ll use here.

You would be completely justified in questioning why we’d choose an application requiring a JavaScript environment for deploying Python docs. It seems like added complexity at first glance, but it actually fits in your workflow fairly seamlessly since your CI environment supports Docker containers natively and you can choose independent base images for each of your jobs. You get to build the documentation inside a container with a Python runtime, then share the output with a new container with a Node.js runtime.

Now, write a first version of a docs-deploy job underneath the jobs section of your config.yml file and walk through the steps:

docs-deploy:
  docker:
    - image: cimg/node:16.17
  steps:
    - checkout
    - attach_workspace:
        at: docs/_build
    - run:
        name: Disable Jekyll for GitHub Pages
        command: touch docs/_build/html/.nojekyll
    - run:
        name: Install gh-pages tool
        command: npm install gh-pages@3.2.3 --silent
    - run:
        name: Configure Git for deployment
        command: |
          git config --global user.email "ci-build@example.com"
          git config --global user.name "CI Build"
    - run:
        name: Deploy documentation to GitHub Pages
        command: npx gh-pages --dotfiles --message "[skip ci] Updates" --dist docs/_build/html

You are using a node base image so that the npm package manager and Node.js runtime are available. The attach_workspace step mounts the rendered documentation from the docs-build step into your container, then you call npm install to download the target module, which includes a command-line utility, gh-pages, that we’ll invoke in the next step. The git config commands are required per the module documentation. Finally, the invocation of gh-pages --dist docs/_build/html copies the contents of the html directory into the root of the gh-pages branch and pushes the results to GitHub.

Let’s add this new step to your workflow. The workflows section now looks like:

workflows:
  version: 2
  build-and-deploy:
    jobs:
      - test
      - docs-build
      - docs-deploy:
          requires:
            - test
            - docs-build
          filters:
            branches:
              only: main

You made the docs-deploy job dependent on the other two steps, meaning that it won’t run until both those steps complete successfully. This ensures you don’t accidentally publish docs for a state of the repository that doesn’t pass tests. You also set a filter to specify that the docs-deploy job should be skipped except for builds of the main branch. That way, you don’t overwrite the published docs for changes that are still in flight on other branches.

If you check in all these changes and let CircleCI run your workflow, the new job will fail:

ERROR: The key you are authenticating with has been marked as read only.

Failed build

So there’s a bit more work you need to do to clean this up and make sure your CI job has the necessary credentials.

Provisioning a deploy key

As mentioned, GitHub provides a few options for giving a job access to change a repository. Generally, GitHub permissions are tied to users, so a credential must either be tied to a single human user account or a special machine user account must be provisioned. There’s a lot of flexibility there for granting access across repositories, but it can become somewhat complex.

Opt instead to provision a read/write deploy key. This is an ssh key pair tied to a single repository rather than to a user. This is nice for teams, because access doesn’t disappear if the user who provisioned the key leaves the organization or deletes their account. It also means that any user with administrator access on the repository can follow the steps below to set up the integration.

Follow the instructions in the CircleCI docs and apply them to this project.

Start by creating an ssh key pair on your local machine:

ssh-keygen -t rsa -b 4096 -C "ci-build@example.com"
# Accept the default of no password for the key (This is a special case!)
# Choose a destination such as 'docs_deploy_key_rsa'

You end up with a private key docs_deploy_key_rsa and a public key docs_deploy_key_rsa.pub. Give the private key to CircleCI by going to https://circleci.com/gh/CIRCLECI-GWP/docs-on-gh-pages/edit#ssh. Click Add SSH Key, enter “github.com” as the hostname, and paste in the contents of the private key file. At this point, you can delete the private key from your system, as only your CircleCI project should have access. Run:

rm docs_deploy_key_rsa

The https://app.circleci.com/settings/project/github/CIRCLECI-GWP/docs-on-gh-pages/ssh page shows the fingerprint for your key, which is a unique identifier that’s safe to expose publicly (unlike the private key itself, which could give an attacker write access to your repository). Add a step in your docs-deploy job to grant the job access to the key with this fingerprint:

- add_ssh_keys:
    fingerprints:
      - "59:ad:fd:64:71:eb:81:01:6a:d7:1a:c9:0c:19:39:af"

Note: Your fingerprint will be different from the one shown here. Make sure to use the one shown in your CircleCI settings.
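
If you want to double-check which key a fingerprint belongs to, ssh-keygen can compute fingerprints locally. The sketch below generates a throwaway key pair so it is self-contained; against your real key you would point it at docs_deploy_key_rsa.pub. CircleCI has shown the MD5 colon-hex form in older UIs and the SHA256 form in newer ones, so both are printed:

```shell
set -e
# Generate a throwaway key pair so this sketch is safe to run anywhere;
# substitute docs_deploy_key_rsa.pub to check your real deploy key
tmp=$(mktemp -d)
ssh-keygen -t rsa -b 2048 -N '' -q -f "$tmp/demo_key"
ssh-keygen -lf "$tmp/demo_key.pub"          # SHA256 form (modern default)
ssh-keygen -E md5 -lf "$tmp/demo_key.pub"   # MD5 colon-hex form
```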

Next, go to https://app.circleci.com/settings/project/github/CIRCLECI-GWP/docs-on-gh-pages/advanced and check that “Pass secrets to builds from forked pull requests” is set to its default of “Off”.

SSH keys are one of the types of secrets that you should make available only if you trust the code being run. If you allowed this key to be available to forks, an attacker could craft a pull request that prints the contents of your private key to the CircleCI logs.

Upload the public key to GitHub so that it knows to trust connections from CircleCI initiated with your private key. Go to Settings > Deploy keys in https://github.com/CIRCLECI-GWP/docs-on-gh-pages. Enter “CircleCI write key” as the title and paste in the contents of docs_deploy_key_rsa.pub, making sure to check Allow write access, since deploy keys are read-only by default. If you haven’t already deleted the private key, be extra careful you’re not accidentally copying from docs_deploy_key_rsa!

Some final fixups

Before you test that your CircleCI workflow can successfully push changes to GitHub, you need to address a few final details.

First, your built documentation contains directories starting with _, which have special meaning to Jekyll, the static site engine built into GitHub Pages. You don’t want Jekyll to alter your content, so you add a .nojekyll file and pass the --dotfiles flag to gh-pages, since that utility would otherwise ignore all dotfiles.

Second, you need to provide a custom commit message that includes [skip ci], which tells CircleCI not to initiate a new build when you push this content to the gh-pages branch. The gh-pages branch contains only rendered HTML content, not the source code and config.yml, so such a build would have nothing to do and would simply show up as failing in CircleCI. The full docs-deploy job is:

docs-deploy:
  docker:
    - image: cimg/node:16.17
  steps:
    - checkout
    - attach_workspace:
        at: docs/_build
    - run:
        name: Disable Jekyll for GitHub Pages
        command: touch docs/_build/html/.nojekyll
    - run:
        name: Install gh-pages tool
        command: npm install gh-pages@3.2.3 --silent
    - add_ssh_keys:
        fingerprints:
          - "SHA256:BPNOXVtVDIPb0Bll3FkCj12Vv9K7HSsup5jNx/XJrMs"
    - run:
        name: Configure Git for deployment
        command: |
          git config --global user.email "ci-build@example.com"
          git config --global user.name "CI Build"
    - run:
        name: Deploy documentation to GitHub Pages
        command: npx gh-pages --dotfiles --message "[skip ci] Updates" --dist docs/_build/html

You’re ready to commit your updated configuration and let CircleCI run the workflow.

GitHub page published

Once it shows green, you should notice that your repository now has a gh-pages branch and that the rendered content is now available at https://circleci-gwp.github.io/docs-on-gh-pages/.

Conclusion

There is no one obvious “best way” to build and deploy documentation. The path of least resistance for your team is going to depend on the particular mix of workflows, tools, and infrastructure that you are already familiar with. Your organizational structure is important as well, as it will have implications for who needs to be involved to provision credentials and get systems talking to one another.

The particular solution presented here is currently a good fit for the data platform team at Mozilla (see an example in practice at mozilla/python_moztelemetry) because it is adaptable to different languages (the team also maintains projects in Java and Scala), it minimizes the number of tools to be familiar with (they are already invested in GitHub and CircleCI), the permissions model gives the team autonomy in setting up and controlling the documentation workflow, and they haven’t seen a need for any of the more advanced features available from documentation-specific hosting providers.


Jeff Klukas has a background in experimental particle physics, working both as a teacher and as a researcher helping discover the Higgs boson. He now works remotely from Columbus, Ohio on the Firefox data platform at Mozilla and was previously the technical lead for the data platform at Simple, a branchless bank in the cloud.
