Configuring Nomad Client Metrics
CircleCI Server version 2.x is no longer a supported release. Please consult your account team for help in upgrading to a supported release. |
Nomad Metrics is a helper service used to collect metrics data from the Nomad server and clients running on the Services and Nomad instances respectively. Metrics are collected and sent using the DogStatsD protocol and sent to the Services machine.
Nomad Metrics Server
The Nomad Metrics container is run on the services host using the server flag and is installed as part of the CircleCI server installation process, requiring no additional configuration.
Nomad Metrics Client
The Nomad Metrics client is installed and run on all Nomad client instances. You will need to update your AWS Launch Configuration in order to install and configure it. Additionally, you will need to modify the AWS security group to ensure that UDP port 8125 is open on the Services machine. Steps for both configuration changes are explained below.
Before proceeding, you should be logged into the EC2 Service section of the AWS Console. Make sure that you logged into the region you use to run CircleCI server. |
Updating the Services machine Security Group
-
Select the
Instances
link located under the Instances group in the left sidebar. -
Select the Services Box Instance. The name tag typically resembles
circleci_services
. -
In the description box at the bottom, select the users security group link located next to the
Security Groups
section. It typically resembles*_users_sg
. -
This will take you straight to the Security Group page highlighting the users security group. In the description box at the bottom, select the
Inbound
tab followed by theEdit
button. -
Select the
Add a Rule
button. From the drop-down, selectCustom UDP Rule
. In the Port Range field enter8125
. -
The source field gives you a few options. However, this ultimately depends on how you have configured the VPC and subnet. Below are some more common scenarios.
-
(Suggested) Allow traffic from the nomad client subnet. You can usually match the entries used for ports 4647 or 3001. For example,
10.0.0.0/24
. -
Allow all traffic to UDP port 8125 using
0.0.0.0/0
.
-
-
Press the
Save Button
Updating the AWS Launch Configuration
Prerequisites
AWS EC2 Launch Configuration ID
-
Select the
Auto Scaling Groups
(ASG) link in the the sidebar on the left. -
Locate the ASG with a name tag similar to`*_nomad_clients_asg`
-
The Launch Configuration name is next to the ASG name IE
terraform-20180814231555427200000001
AWS EC2 Services Box Private IP Address
-
Select the
Instances
link located under the Instances group in the left sidebar -
Select the Services Box Instance. The name tag typically resembles
circleci_services
-
In the description box at the bottom of the page, make note of the private IP address.
Updating the Launch Configuration
-
Select the
Launch Configurations
link located underAuto Scaling
in the sidebar to the left. Select the Launch Configuration you retrieved in the previous steps. -
In the description pane at the bottom, select the
Copy launch configuration
button. -
Once the configuration page opens, select
3. Configure details
link located at the top of the page. -
Update the
Name
field to something meaningful IEnomad-builder-with-metrics-lc-DATE
. -
Select the
Advanced Details
drop down. -
Copy and paste the launch configuration script from below in the text field next to
User data
. -
IMPORTANT: Enter the private IP address of the services box at Line 10. For example,
export SERVICES_PRIVATE_IP="192.168.1.2"
. -
Select the
Skip to review
button and then theCreate launch configuration
button.
#!/bin/bash
set -exu
export http_proxy=""
export https_proxy=""
export no_proxy=""
export aws_instance_metadata_url="http://169.254.169.254"
export PUBLIC_IP="$(curl $aws_instance_metadata_url/latest/meta-data/public-ipv4)"
export PRIVATE_IP="$(curl $aws_instance_metadata_url/latest/meta-data/local-ipv4)"
export DEBIAN_FRONTEND=noninteractive
UNAME="$(uname -r)"
export CONTAINER_NAME="nomad_metrics"
export CONTAINER_IMAGE="circleci/nomad-metrics:0.1.198-5f5befe"
export SERVICES_PRIVATE_IP=""
export NOMAD_METRICS_PORT="8125"
echo "-------------------------------------------"
echo " Performing System Updates"
echo "-------------------------------------------"
apt-get update && apt-get -y upgrade
echo "--------------------------------------"
echo " Installing NTP"
echo "--------------------------------------"
apt-get install -y ntp
# Use AWS NTP config for EC2 instances and default for non-AWS
if [ -f /sys/hypervisor/uuid ] && [ `head -c 3 /sys/hypervisor/uuid` == ec2 ]; then
cat <<EOT > /etc/ntp.conf
driftfile /var/lib/ntp/ntp.drift
disable monitor
restrict default ignore
restrict 127.0.0.1 mask 255.0.0.0
restrict 169.254.169.123 nomodify notrap
server 169.254.169.123 prefer iburst
EOT
else
echo "USING DEFAULT NTP CONFIGURATION"
fi
service ntp restart
echo "--------------------------------------"
echo " Installing Docker"
echo "--------------------------------------"
apt-get install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
apt-get install -y "linux-image-$UNAME"
apt-get update
apt-get -y install docker-ce=5:18.09.9~3-0~ubuntu-xenial
# force docker to use userns-remap to mitigate CVE 2019-5736
apt-get -y install jq
mkdir -p /etc/docker
[ -f /etc/docker/daemon.json ] || echo '{}' > /etc/docker/daemon.json
tmp=$(mktemp)
cp /etc/docker/daemon.json /etc/docker/daemon.json.orig
jq '.["userns-remap"]="default"' /etc/docker/daemon.json > "$tmp" && mv "$tmp" /etc/docker/daemon.json
sudo echo 'export http_proxy="${http_proxy}"' >> /etc/default/docker
sudo echo 'export https_proxy="${https_proxy}"' >> /etc/default/docker
sudo echo 'export no_proxy="${no_proxy}"' >> /etc/default/docker
sudo service docker restart
sleep 5
echo "--------------------------------------"
echo " Populating /etc/circleci/public-ipv4"
echo "--------------------------------------"
if ! (echo $PUBLIC_IP | grep -qP "^[\d.]+$")
then
echo "Setting the IPv4 address below in /etc/circleci/public-ipv4."
echo "This address will be used in builds with \"Rebuild with SSH\"."
mkdir -p /etc/circleci
echo $PRIVATE_IP | tee /etc/circleci/public-ipv4
fi
echo "--------------------------------------"
echo " Installing nomad"
echo "--------------------------------------"
apt-get install -y zip
curl -o nomad.zip https://releases.hashicorp.com/nomad/0.9.3/nomad_0.9.3_linux_amd64.zip
unzip nomad.zip
mv nomad /usr/bin
echo "--------------------------------------"
echo " Creating config.hcl"
echo "--------------------------------------"
export INSTANCE_ID="$(curl $aws_instance_metadata_url/latest/meta-data/instance-id)"
mkdir -p /etc/nomad
cat <<EOT > /etc/nomad/config.hcl
log_level = "DEBUG"
name = "$INSTANCE_ID"
data_dir = "/opt/nomad"
datacenter = "default"
advertise {
http = "$PRIVATE_IP"
rpc = "$PRIVATE_IP"
serf = "$PRIVATE_IP"
}
client {
enabled = true
# Expecting to have DNS record for nomad server(s)
servers = ["$SERVICES_PRIVATE_IP:4647"]
node_class = "linux-64bit"
options = {"driver.raw_exec.enable" = "1"}
}
telemetry {
publish_node_metrics = true
statsd_address = "$SERVICES_PRIVATE_IP:8125"
}
EOT
echo "--------------------------------------"
echo " Creating nomad.conf"
echo "--------------------------------------"
cat <<EOT > /etc/systemd/system/nomad.service
[Unit]
Description="nomad"
[Service]
Restart=always
RestartSec=30
TimeoutStartSec=1m
ExecStart=/usr/bin/nomad agent -config /etc/nomad/config.hcl
[Install]
WantedBy=multi-user.target
EOT
echo "--------------------------------------"
echo " Creating ci-privileged network"
echo "--------------------------------------"
docker network create --driver=bridge --opt com.docker.network.bridge.name=ci-privileged ci-privileged
echo "--------------------------------------"
echo " Starting Nomad service"
echo "--------------------------------------"
service nomad restart
echo "--------------------------------------"
echo " Setting up Nomad metrics"
echo "--------------------------------------"
docker pull $CONTAINER_IMAGE
docker rm -f $CONTAINER_NAME || true
docker run -d --name $CONTAINER_NAME \
--rm \
--net=host \
--userns=host \
$CONTAINER_IMAGE \
start --nomad-uri=http://localhost:4646 --statsd-host=$SERVICES_PRIVATE_IP --statsd-port=$NOMAD_METRICS_PORT --client
Updating the Auto Scaling Group
-
Select the
Auto Scaling Groups
(ASG) link in the the sidebar on the left. -
Select the ASG with a name tag similar to
*_nomad_clients_asg
. -
In the description box at the bottom, select the
Edit
button. -
Select the newly created Launch Configuration from the drop-down.
-
Press the
Save
button. -
At this point, the older Nomad client instances will begin shutting down. They will be replaced with newer Nomad clients running Nomad Metrics.
StatsD Metrics
Metrics sent via StatsD will be updated every 10s. |
--server
The number of jobs in a terminal state (complete and dead ) will typically increase until Nomad garbage-collects the jobs from its state. |
Name | Type | Description |
---|---|---|
| Gauge | 1 if the last poll of the Nomad agent failed; 0 otherwise. This gauge is set independent of |
| Gauge | Total number of pending jobs across the cluster. |
| Gauge | Total number of running jobs across the cluster. |
| Gauge | Total number of complete jobs across the cluster. |
| Gauge | Total number of dead jobs across the cluster. |
--client
Name | Type | Description |
---|---|---|
| Gauge | 1 if the last poll of the Nomad agent failed; 0 otherwise. |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
| Gauge | (See below) |
|
Help make this document better
This guide, as well as the rest of our docs, are open source and available on GitHub. We welcome your contributions.
- Suggest an edit to this page (please read the contributing guide first).
- To report a problem in the documentation, or to submit feedback and comments, please open an issue on GitHub.
- CircleCI is always seeking ways to improve your experience with our platform. If you would like to share feedback, please join our research community.
Need support?
Our support engineers are available to help with service issues, billing, or account related questions, and can help troubleshoot build configurations. Contact our support engineers by opening a ticket.
You can also visit our support site to find support articles, community forums, and training resources.
CircleCI Documentation by CircleCI is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.