CircleCI server container architecture
CircleCI Server version 2.x is no longer a supported release. Please consult your account team for help in upgrading to a supported release.
This document outlines the containerized services that run on the Services machine within a CircleCI server v2.x installation. It is provided both to give an overview of service operation and to help with troubleshooting in the event of service outages. Supplementary notes and a key to the failure severity ratings precede the table.
Notes
- Database migrator services are listed here with a low failure severity because they only run at startup. However, if migrator services are down at startup, connected services will fail.
- With a Premium support contract, some services can be externalized (marked with * here) and managed to suit your requirements. Externalization provides higher data security and allows for redundancy to be built into your system.
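The first note above describes a startup ordering constraint: a migrator must finish before the service that depends on it can start. The following is a minimal sketch of that idea, not a description of how CircleCI server or Replicated actually orders startup. The dependency map is hypothetical: `cron-service` is named in this document, but the migrator name `cron-service-migrator` and both helper functions are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical map: each key starts only after every service in its set.
# "cron-service-migrator" is an assumed name, used here for illustration only.
STARTUP_DEPENDENCIES = {
    "cron-service": {"cron-service-migrator"},
    "cron-service-migrator": set(),
}


def startup_order(dependencies):
    """Return service names in an order that respects startup dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def blocked_by(dependencies, failed):
    """Return the services that cannot start because `failed` did not come up."""
    blocked = set()
    changed = True
    while changed:
        changed = False
        unavailable = blocked | {failed}
        for service, needs in dependencies.items():
            if service in unavailable:
                continue
            # A service is blocked if anything it needs is unavailable.
            if needs & unavailable:
                blocked.add(service)
                changed = True
    return sorted(blocked)


if __name__ == "__main__":
    print(startup_order(STARTUP_DEPENDENCIES))
    print(blocked_by(STARTUP_DEPENDENCIES, "cron-service-migrator"))
```

In this sketch a migrator that is down at startup blocks the service that depends on it, even though the migrator itself carries a low failure severity once the system is running.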
Key
Icon | Description |
---|---|
Failure has a minor effect on production - no loss of data or functioning. | |
Failure might cause issues with some jobs, but no loss of data. | |
Failure can cause loss of data, corruption of jobs/workflows, major loss of functionality. |
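When several services are down at once, the three severity levels in the key can be used to decide which to restore first. The following sketch assumes you record the severity of each service yourself. The `Severity` names, the `triage` helper, and the placeholder service names are all hypothetical and are not part of CircleCI server.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The three levels from the key above; the names are assumptions."""
    MINOR = 1     # minor effect on production, no loss of data or functioning
    DEGRADED = 2  # might cause issues with some jobs, but no loss of data
    CRITICAL = 3  # loss of data, corrupted jobs/workflows, major loss of functionality


def triage(down):
    """Sort (service, severity) pairs most severe first, then by service name."""
    return sorted(down, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Placeholder service names, for illustration only.
    down = [
        ("service-b", Severity.MINOR),
        ("service-c", Severity.CRITICAL),
        ("service-a", Severity.DEGRADED),
    ]
    for name, severity in triage(down):
        print(f"{severity.name:<9} {name}")
```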
Containers, Roles, Failure Modes and Startup Dependencies
Container / Image | Role | What happens if it fails? | Failure severity | Startup dependencies |
---|---|---|---|---|
| Provides a GraphQL API that provides much of the data to render the web frontend. | Many parts of the UI (e.g. Contexts) will fail completely. |
| |
| Persists audit log events to blob storage for long term storage. | Some events may not be recorded. |
| |
| Stores and provides encrypted contexts. | All builds using Contexts will fail. |
| |
| Runs PostgreSQL migrations for the | Only runs at startup. |
| |
| Triggers scheduled workflows. | Scheduled workflows will not run. |
| |
| Runs PostgreSQL migrations for the cron-service. | Only runs at startup. |
| |
| Stores and provides information about our domain model. | Workflows will fail to start and some REST API calls may fail causing |
| |
| Runs PostgreSQL migrations for the | Only runs at startup. |
| |
| Mail Transfer Agent (MTA) used to send all outbound SMTP. | No email notifications will be sent. | None | |
| Stores user identities (LDAP). | If LDAP authentication is in use, all logins will fail and some REST API calls might fail. | only if LDAP in use |
|
| Runs PostgreSQL migrations for the | Only runs at startup. |
| |
| File storage service used as a replacement for S3 when CircleCI server v2.x is run outside of AWS. Not used if server is configured to use S3. Stores step output logs, artifacts, test results, caches and workspaces. | If not using S3, builds will produce no output and some REST API calls might fail. | if not using S3 | None |
| CircleCI web app and www-api proxy. | The UI and REST API will be unavailable and no jobs will be triggered by GitHub/Enterprise. Running builds will be OK but no updates will be seen. |
| |
| Mongo data store. | Potential total data loss. All running builds will fail and the UI will not work. |
| |
| Queries the nomad server for stats and sends them to statsd. | Nomad metrics will be lost, but everything else should run as normal. | None | |
| Receives job output and status updates and writes them to MongoDB. Also provides an API that running jobs use to access caches and workspaces, and to store caches, workspaces, artifacts, and test results. | All running builds will either fail or be left in an unfixable, inconsistent state. There will also be data loss in terms of step output, test results and artifacts. | None | |
| Provides the CircleCI permissions interface. | Workflows will fail to start and some REST API calls may fail, causing 500 errors in the UI. |
| |
| Runs PostgreSQL migrations for the | Only runs at startup. |
| |
| Splits a job into tasks and sends them to | No jobs will be sent to Nomad. The run queue will increase in size, but there should be no meaningful loss of data. | None | |
| Basic | Potential total data loss. All running builds will fail and the UI will not work. | None | |
| Runs the RabbitMQ server. Most of our services use RabbitMQ for queueing. | Potential total data loss. All running builds will fail and the UI will not work. | None | |
| The Redis key/value store. | Lose output from currently-running job steps. API calls out to GitHub may also fail. | None | |
| Sends tasks to | No jobs will be sent to Nomad. The run queue will increase in size, but there should be no meaningful loss of data. | None | |
| Used to run any mongo conversion/upgrade scripts during mongo version upgrade. | Not required to run all the time. | None | |
| Nomad primary service. | No 2.0 build jobs will run. | None | |
| Called by Replicated to check whether other containers are ready. | Only required on startup. If unavailable on startup the whole system will fail. | None | |
| Sends the user count to the internal CircleCI “phone home” endpoint. | CircleCI will not receive usage stats for your install, but there is no effect on operation. | None | |
| Checks the | 1.0 Builder lifecycles will not be properly managed, but jobs will continue to run. | None | |
| Provides real-time events to the CircleCI app. | Live UI updates will stop but hard refreshes will still work. | None | |
| The statsd forwarding agent that local services write to. It can be configured to forward to an external metrics service. | Metrics will stop working but jobs will continue to run. | None | |
| Used to manage log rotations for all containers on the services machine. | If this stays down for a long period the Services machine disk will eventually run out of space and other services will fail. | None | |
| Parses test result files and stores data. | There will be no test failure or timing data for jobs, but this will be back-filled once the service is restarted. | None | |
| Instance of Hashicorp’s Vault – an encryption service that provides key-management, secure storage, and other encryption related services. Used to handle the encryption and key store for the |
| None | |
| Periodically checks for stale | Old vm-service instances might not be destroyed until this service is restarted. |
| |
| Periodically requests that | VM instances for |
| |
| Inventory of available | Jobs that use |
| |
| Used to run database migrations for | Only runs at startup. | None | |
| Coordinates and provides information about workflows. | No new workflows will start, currently running workflows might end up in an inconsistent state, and some REST and GraphQL API requests will fail. |
| |
| Runs PostgreSQL migrations for the | Only runs at startup. |
|
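When troubleshooting an outage, a useful first step is to see which containers on the Services machine are not running, then look them up in the table above. The following is a minimal sketch that assumes the Docker CLI is available on the Services machine and that the current user is allowed to run it. The helper names are hypothetical. The only external command used is `docker ps --all --format`, which prints one container per line.

```python
import subprocess


def parse_docker_ps(output):
    """Map container name to status from tab-separated `docker ps` output."""
    statuses = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, status = line.partition("\t")
        statuses[name.strip()] = status.strip()
    return statuses


def not_running(statuses):
    """Return names of containers whose status does not start with 'Up'."""
    return sorted(
        name for name, status in statuses.items() if not status.startswith("Up")
    )


def list_containers():
    """Run `docker ps` for all containers, including stopped ones."""
    result = subprocess.run(
        ["docker", "ps", "--all", "--format", "{{.Names}}\t{{.Status}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_docker_ps(result.stdout)


if __name__ == "__main__":
    for name in not_running(list_containers()):
        print(f"not running: {name}")
```

Migrator containers are expected to appear in this list during normal operation, because they only run at startup.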
CircleCI Documentation by CircleCI is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.