Introduction
In previous articles I introduced Packer and Terraform, but I did not discuss the files that were copied to and executed on the remote machines. Those files are the focus of this article, along with configuring Nomad jobs. In short, I will briefly cover the installation and configuration of Consul, Vault and Nomad to create a simple Nomad cluster on DigitalOcean. I still recommend going through the “Get Started” tutorials for each product to gain a better understanding, and I emphasize that I do not use each product to its full capabilities.
Configuration
Running tree on the packer directory, minus some folders, gives the following:
packer
├── consul
│ ├── configure_consul.sh
│ ├── consul-client.service
│ ├── consul-connect-enable.hcl
│ └── consul-server.service
├── nomad
│ ├── configure_nomad.sh
│ ├── jobs
│ │ ├── jessequinn.nomad
│ │ ├── scidoc.nomad
│ │ └── traefik.nomad
│ ├── nomad-client.hcl
│ ├── nomad-client.service
│ ├── nomad-server.hcl
│ └── nomad-server.service
└── vault
├── enable_vault.sh
├── init_vault.sh
├── vault-config.hcl
└── vault-server.service
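For context, Packer copies these directories into the snapshot with file provisioners along the lines of the following sketch (the actual template was covered in the earlier Packer article; the source name and destination paths here are my assumptions):
# Hypothetical Packer (HCL2) build block: copy each config directory into the image
build {
  sources = ["source.digitalocean.snapshot"]

  provisioner "file" {
    source      = "consul/"
    destination = "/root/"
  }

  provisioner "file" {
    source      = "nomad/"
    destination = "/root/"
  }

  provisioner "file" {
    source      = "vault/"
    destination = "/root/"
  }
}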
I will go through each folder starting with consul.
Consul
With Packer, I copied four (4) files over to a snapshot. Rather than creating separate snapshots for the server and client nodes, I placed all files into a single snapshot. The first file:
#!/bin/bash

echo "Configuring Consul"

mkdir /tmp/consul

# $1 is the node type; clients also receive the server address to join as $2
if [ "$1" == "server" ]; then
  systemctl enable consul-server.service
  systemctl start consul-server.service
else
  systemctl enable consul-client.service
  systemctl start consul-client.service
  sleep 30
  consul join "$2"
fi

echo "Configuration of Consul complete"

exit 0
is simply a bash script that enables the appropriate Consul service depending on the type of node. The next file:
[Unit]
Description=Consul client
Wants=network-online.target
After=network-online.target
[Service]
ExecStart= /bin/sh -c "consul agent -data-dir=/tmp/consul -node=agent-c-node_number -bind=$(ip -f inet addr show eth1 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p') -enable-script-checks=true -config-dir=/etc/consul.d"
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
is just the client service for systemd. A good tutorial on systemd can be found on DigitalOcean. The important line is the ExecStart: we tell Consul to run as a worker node, provide a node name, bind the private IP, enable script checks, and finally point to the configuration directory.
The systemd service for servers:
[Unit]
Description=Consul server
Wants=network-online.target
After=network-online.target
[Service]
ExecStart= /bin/sh -c "consul agent -server -bootstrap-expect=server_count -data-dir=/tmp/consul -node=agent-s-node_number -bind=$(ip -f inet addr show eth1 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p') -enable-script-checks=true -config-file=/root/consul-connect-enable.hcl -config-dir=/etc/consul.d"
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
runs the Consul agent in server mode, bootstrapping with an expected server count (I use one (1), although three (3) or more servers should be used in production), and provides a config file:
{
  "connect": {
    "enabled": true
  }
}
Within this configuration we simply enable Consul Connect.
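Once both droplets are up and configure_consul.sh has run on each, membership can be verified from either node:
# both the server and the client agent should be listed as "alive"
consul members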
Just a quick note: several files contain placeholders that are later replaced with sed during Terraform provisioning, roughly as sketched below.
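The actual commands live in the Terraform provisioners from the previous article, but as a rough sketch (the file paths and variable names here are my assumptions, not the exact setup) the substitution looks something like this inside the droplet resource:
# Hypothetical remote-exec provisioner filling in the placeholders baked into the snapshot
provisioner "remote-exec" {
  inline = [
    "sed -i 's/node_number/${count.index + 1}/g' /etc/systemd/system/consul-server.service",
    "sed -i 's/server_count/${var.server_count}/g' /etc/systemd/system/consul-server.service /root/nomad-server.hcl",
    "sed -i 's/server_ip/${self.ipv4_address_private}/g' /root/nomad-server.hcl",
    "systemctl daemon-reload",
  ]
}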
Vault
Just like Consul, I utilize a bash script to enable and start the Vault server. Unlike Consul, I only install Vault on the server:
#!/bin/bash

echo "Enabling Vault on server"

systemctl enable vault-server.service
systemctl start vault-server.service

export VAULT_ADDR=http://127.0.0.1:8200

echo "Enabled Vault complete"

exit 0
We need to initialize Vault:
#!/bin/bash

echo "Initializing Vault on server"

export VAULT_ADDR=http://127.0.0.1:8200

# Only initialize and unseal on the first server node
if [ "$1" == "0" ]; then
  vault operator init -address=http://127.0.0.1:8200 > /root/startupOutput.txt
  vault operator unseal -address=http://127.0.0.1:8200 `grep "Unseal Key 1" /root/startupOutput.txt | cut -d' ' -f4`
  vault operator unseal -address=http://127.0.0.1:8200 `grep "Unseal Key 2" /root/startupOutput.txt | cut -d' ' -f4`
  vault operator unseal -address=http://127.0.0.1:8200 `grep "Unseal Key 3" /root/startupOutput.txt | cut -d' ' -f4`
fi

echo "Initialized Vault complete"

exit 0
Terraform actually uses a local execution to back up this output. Why? You will need these unseal keys (and the root token) whenever Vault has to be unsealed again, for instance after a restart.
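Something along these lines inside the server droplet resource does the job (a minimal sketch; the scp options and destination path are my assumptions):
# Hypothetical local-exec provisioner: copy the init output (unseal keys and root token)
# back to the machine running Terraform
provisioner "local-exec" {
  command = "scp -o StrictHostKeyChecking=no root@${self.ipv4_address}:/root/startupOutput.txt ./startupOutput.txt"
}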
The systemd service:
[Unit]
Description=Vault server
Wants=network-online.target
After=network-online.target
[Service]
ExecStart= /bin/sh -c "/usr/bin/vault server -config=/root/vault-config.hcl"
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
Vault is run in server mode with the following configuration:
storage "consul" {
address = "127.0.0.1:8500"
path = "vault/"
}
listener "tcp" {
address = "127.0.0.1:8200"
tls_disable = 1
}
Consul is used as the KV storage backend. I also disable TLS and define the listener address and port.
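A quick way to confirm everything is wired together, assuming the defaults used above:
export VAULT_ADDR=http://127.0.0.1:8200
vault status                # should report Vault as initialized and unsealed once init_vault.sh has run
consul kv get -keys vault/  # Vault's encrypted data sits under this Consul KV prefix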
Nomad
Once again I have a simple bash script to enable either the server or the client service:
#!/bin/bash

echo "Configuring Nomad"

# $1 is the node type (server or client)
if [ "$1" == "server" ]; then
  systemctl enable nomad-server.service
  systemctl start nomad-server.service
else
  systemctl enable nomad-client.service
  systemctl start nomad-client.service
fi

echo "Configuration of Nomad complete"

exit 0
The server service:
[Unit]
Description=Nomad server
Wants=network-online.target
After=network-online.target
[Service]
Environment="VAULT_TOKEN=replace_vault_token"
ExecStart= /bin/sh -c "/usr/bin/nomad agent -config /root/nomad-server.hcl"
Restart=always
RestartSec=10000
[Install]
WantedBy=multi-user.target
is configured with the following:
# Increase log verbosity
log_level = "DEBUG"

# Setup data dir
data_dir = "/tmp/server"

bind_addr = "server_ip" # edit to private network

advertise {
  # Edit to the private IP address.
  http = "server_ip:4646"
  rpc  = "server_ip:4647"
  serf = "server_ip:4648" # non-default ports may be specified
}

# Enable the server
server {
  enabled = true

  # Self-elect, should be 3 or 5 for production
  bootstrap_expect = server_count
}

# Enable a client on the same node
client {
  enabled = true

  options = {
    "driver.raw_exec.enable" = "1"
  }
}

consul {
  address             = "127.0.0.1:8500"
  server_service_name = "nomad"
  client_service_name = "nomad-client"
  auto_advertise      = true
  server_auto_join    = true
  client_auto_join    = true
}

vault {
  enabled = true
  address = "http://127.0.0.1:8200"
}
so my Nomad server has Vault and Consul integrated on their respective localhost ports. I only use bootstrap_expect = 1 as I am using just one server. All other options should be self-explanatory.
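Once the services are running, a couple of sanity checks against the server's HTTP API confirm the cluster is healthy (server_ip being the same placeholder as above):
# lists the server(s) and shows which one is the leader
nomad server members -address=http://server_ip:4646
# lists the client nodes that have registered
nomad node status -address=http://server_ip:4646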
The client service:
[Unit]
Description=Nomad client
Wants=network-online.target
After=network-online.target
[Service]
ExecStart= /bin/sh -c "/usr/bin/nomad agent -config /root/nomad-client.hcl"
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
is configured with:
# Increase log verbosity
log_level = "DEBUG"

# Setup data dir
data_dir = "/tmp/client"

# Enable the client
client {
  enabled = true
}

ports {
  http = 5657
}

consul {
  address             = "127.0.0.1:8500"
  server_service_name = "nomad"
  client_service_name = "nomad-client"
  auto_advertise      = true
  server_auto_join    = true
  client_auto_join    = true
}

# Disable the dangling container cleanup to avoid interaction with other clients
plugin "docker" {
  config {
    gc {
      dangling_containers {
        enabled = false
      }
    }
  }
}
If I run netstat -tulpn on the server, the following is returned:
root@server-1:~# netstat -tupln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 xxxx:8300 0.0.0.0:* LISTEN 594/consul
tcp 0 0 xxxx:8301 0.0.0.0:* LISTEN 594/consul
tcp 0 0 xxxx:8302 0.0.0.0:* LISTEN 594/consul
tcp 0 0 127.0.0.1:8200 0.0.0.0:* LISTEN 592/vault
tcp6 0 0 :::8500 :::* LISTEN 594/consul
tcp6 0 0 :::8600 :::* LISTEN 594/consul
tcp6 0 0 :::5657 :::* LISTEN 614677/nomad
tcp6 0 0 :::4646 :::* LISTEN 583/nomad
tcp6 0 0 :::4647 :::* LISTEN 583/nomad
tcp6 0 0 :::4648 :::* LISTEN 583/nomad
udp 0 0 xxxx:8301 0.0.0.0:* 594/consul
udp 0 0 xxxx:8302 0.0.0.0:* 594/consul
udp6 0 0 :::8600 :::* 594/consul
udp6 0 0 :::4648 :::* 583/nomad
Job Scheduling
Before beginning here I would suggest reading more about Nomad jobs.
I could use nomad job init jessequinn.nomad to create a Nomad job file; however, I have already written one:
job "jessequinn" {
datacenters = ["dc1"]
name = "jessequinn"
update {
stagger = "10s"
max_parallel = 1
}
group "jessequinn" {
count = 2
task "jessequinn" {
env {
PORT = 3000
}
driver = "docker"
config {
image = "xxxx/xxxx:xxxx"
network_mode = "host"
port_map = {
http = 3000
}
}
service {
name = "jessequinn"
tags = [
"traefik.enable=true",
"traefik.http.routers.jessequinn.rule=Host(`jessequinn.info`)",
"traefik.http.routers.jessequinn.entrypoints=websecure",
"traefik.http.routers.jessequinn.service=jessequinn",
"traefik.http.services.jessequinn.loadbalancer.server.port=3000",
"traefik.http.routers.jessequinn.tls=true",
"traefik.http.routers.jessequinn.tls.certresolver=myresolver",
"traefik.http.routers.jessequinn.tls.domains[0].main=jessequinn.info",
"traefik.http.routers.jessequinn.tls.domains[0].sans=*.jessequinn.info",
"jessequinn"
]
port = "http"
check {
type = "http"
path = "/"
interval = "2s"
timeout = "2s"
}
}
resources {
cpu = 500
memory = 500
network {
mbits = 10
port "http" {
static = 3000
}
}
}
}
}
}
The jessequinn.nomad job has a single task that pulls an image of my site, designates Docker as the driver, creates a service that includes tags for Traefik, and finally specifies the resources for each replica. The idea here is very similar to that of any orchestration system such as Docker Swarm, K8s, etc. Without getting into a debate about performance and such, I decided to use Traefik for its native support of Let's Encrypt; however, Nomad has examples for HAProxy, Fabio, NGINX, etc. I also decided not to use a load balancer from DO due to its price and SSL/TLS certificate limitations.
The following is the Traefik job:
job "traefik" {
region = "global"
datacenters = ["dc1"]
type = "service"
constraint {
attribute = "${node.unique.name}"
operator = "="
value = "server-1"
}
group "traefik" {
count = 1
task "traefik" {
env {
DO_AUTH_TOKEN = "xxxx"
}
driver = "docker"
config {
image = "traefik:v2.3"
network_mode = "host"
volumes = [
"local/traefik.toml:/etc/traefik/traefik.toml",
"local/acme.json:/acme.json",
"local/dyn/:/dyn/",
]
}
template {
data = <<EOF
{
"myresolver": {
"Account": {
"Email": "xxxx",
"Registration": {
"body": {
"status": "valid",
"contact": [
"mailto:xxxx"
]
},
"uri": "xxxx"
},
"PrivateKey": "xxxx",
"KeyType": "4096"
},
"Certificates": [
{
"domain": {
"main": "jessequinn.info"
},
"certificate": "xxxx",
"key": "xxxx",
"Store": "default"
},
{
"domain": {
"main": "scidoc.dev"
},
"certificate": "xxxx",
"key": "xxxx",
"Store": "default"
},
{
"domain": {
"main": "*.jessequinn.info"
},
"certificate": "xxxx",
"key": "xxxx",
"Store": "default"
},
{
"domain": {
"main": "*.scidoc.dev"
},
"certificate": "xxxx",
"key": "xxxx",
"Store": "default"
}
]
}
}
EOF
destination = "local/acme.json"
perms = "600"
}
template {
data = <<EOF
# Global redirection: http to https
[http.routers.http-catchall]
rule = "HostRegexp(`{host:(www\\.)?.+}`)"
entryPoints = ["web"]
middlewares = ["wwwtohttps"]
service = "noop"
# Global redirection: https (www.) to https
[http.routers.wwwsecure-catchall]
rule = "HostRegexp(`{host:(www\\.).+}`)"
entryPoints = ["websecure"]
middlewares = ["wwwtohttps"]
service = "noop"
[http.routers.wwwsecure-catchall.tls]
# middleware: http(s)://(www.) to https://
[http.middlewares.wwwtohttps.redirectregex]
regex = "^https?://(?:www\\.)?(.+)"
replacement = "https://${1}"
permanent = true
# NOOP service
[http.services.noop]
[[http.services.noop.loadBalancer.servers]]
url = "http://192.168.0.1:666"
EOF
destination = "local/dyn/global_redirection.toml"
}
template {
data = <<EOF
[entryPoints]
[entryPoints.web]
address = ":80"
[entryPoints.web.http]
[entryPoints.web.http.redirections]
[entryPoints.web.http.redirections.entryPoint]
to = "websecure"
scheme = "https"
[entryPoints.websecure]
address = ":443"
[entryPoints.traefik]
address = ":8081"
[api]
dashboard = true
insecure = true
[providers.file]
directory = "dyn/"
# Enable ACME (Let's Encrypt): automatic SSL.
[certificatesResolvers.myresolver.acme]
email = "xxxx"
storage = "acme.json"
[certificatesResolvers.myresolver.acme.dnsChallenge]
provider = "digitalocean"
delayBeforeCheck = 0
# Enable Consul Catalog configuration backend.
[providers.consulCatalog]
prefix = "traefik"
exposedByDefault = false
[providers.consulCatalog.endpoint]
address = "127.0.0.1:8500"
scheme = "http"
EOF
destination = "local/traefik.toml"
}
resources {
cpu = 100
memory = 128
network {
mbits = 10
port "http" {
static = 80
}
port "https" {
static = 443
}
port "api" {
static = 8081
}
}
}
service {
name = "traefik"
tags = [
"traefik"
]
check {
name = "alive"
type = "tcp"
port = "http"
interval = "10s"
timeout = "2s"
}
}
}
}
}
I basically include the toml configuration file for Traefik and acme.json for Let's Encrypt to utilize. I suggest reading up on Traefik if you do not understand what is happening here, but in short I am redirecting www to jessequinn.info and http to https, amongst other things. One other thing: I set a constraint to server-1 because I want that droplet to act as the load balancer. This way I know which IP the A records for www and @ should point to, and I can set them with Terraform. As an example:
## Add an A record to the domain for www.jessequinn.info ##
resource "digitalocean_record" "www-jessequinn" {
  domain = "jessequinn.info"
  type   = "A"
  name   = "www"
  value  = digitalocean_droplet.server[0].ipv4_address
}

## Add an A record to the domain for jessequinn.info ##
resource "digitalocean_record" "jessequinn" {
  domain = "jessequinn.info"
  type   = "A"
  name   = "@"
  value  = digitalocean_droplet.server[0].ipv4_address
}
Now we can schedule the job:
nomad job plan -address=http://private_ip:4646 jessequinn.nomad
nomad job run -address=http://private_ip:4646 jessequinn.nomad
nomad job status -address=http://private_ip:4646 jessequinn
Quick tip: you can also tunnel into the Nomad and Traefik UIs to see what is happening:
# nomad ui
ssh -L 4646:localhost:4646 username@server_ip
# traefik ui
ssh -L 8081:localhost:8081 username@server_ip
Final Words
I cannot say I prefer Nomad over Kubernetes or Kubernetes over Nomad, but I do find Nomad more and more interesting to work with. I like the fact that we can use other drivers. HOWEVER, I do not like how overly complicated it is to configure HAProxy with Nomad. In fact, I could not get it to work correctly!
In the end, the suite of products that HashiCorp offers is just fantastic. At work, I implemented Vault/Consul and I just love it! I also test Ansible with Vagrant on a regular basis. Now that Boundary has been released, I will definitely consider using it.