Marco Santos

Nomad CSI with Scaleway


I’ve been an early adopter of Nomad and Scaleway, using them for both personal and professional workloads, and I’ve been very happy with both so far. However, there was a limitation until recently…

Running stateful services with persistent volumes was not possible until the release of Nomad 0.11 with CSI (Container Storage Interface) support and Scaleway’s support for Block Storage.

Using CSI with Nomad is still in beta. As I am writing, Nomad is at version 1.1.0, and improvements and bug fixes for the integration keep being released; some of the limitations and headaches that I faced while integrating the two are now solved.

How does CSI work on Nomad?

Nomad CSI works by registering a controller plugin (which coordinates block storage volumes) and a node plugin (which mounts block storage on the nodes). Nomad already provides documentation on how to do it on AWS.
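In practice, both plugins are ordinary Nomad tasks that declare a csi_plugin stanza; only the type (and the driver’s --mode argument) differs between them. A minimal sketch of that stanza, using the plugin ID chosen later in this post (the full job files follow in the next section):

csi_plugin {
  id        = "csi.scaleway.com" # must match between the controller, the node plugins and the volumes
  type      = "controller"       # "node" for the node plugin
  mount_dir = "/csi"             # directory where the plugin exposes its unix socket
}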

Nomad with Scaleway

Scaleway also implements its own version of the CSI driver; the major changes are:

Environment variables must be populated with the following values (a placeholder sketch follows this list):

  • SCW_ACCESS_KEY, SCW_SECRET_KEY

    Available after the generation of an account API Key

  • SCW_DEFAULT_PROJECT_ID

    Account Project ID

  • SCW_DEFAULT_ZONE

    Zone (fr-par-1, fr-par-2, …)
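For reference, a sketch of how these variables look when exported in a shell, using the same placeholders as the job specs below (the real key pair and project ID come from the Scaleway console):

# Placeholder values only, not real credentials
export SCW_ACCESS_KEY="REPLACE_WITH_SCALEWAY_GENERATED_ACCESS_KEY"
export SCW_SECRET_KEY="REPLACE_WITH_SCALEWAY_GENERATED_SECRET_KEY"
export SCW_DEFAULT_PROJECT_ID="SCALEWAY_PROJECT_ID"
export SCW_DEFAULT_ZONE="fr-par-1"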

The following templates are based on the Nomad documentation with a few tweaks. Running the commands below will schedule both plugins in the Nomad cluster.

nomad job run plugin-scaleway-bs-controller.hcl

job "plugin-scaleway-bs-controller" {
  datacenters = ["scaleway-fr1"]

  group "controller" {
    task "plugin" {
      driver = "docker"

      env {
        SCW_ACCESS_KEY = "REPLACE_WITH_SCALEWAY_GENERATED_ACCESS_KEY"
        SCW_SECRET_KEY = "REPLACE_WITH_SCALEWAY_GENERATED_SECRET_KEY"
        # Project ID could also be an Organization ID
        SCW_DEFAULT_PROJECT_ID = "SCALEWAY_PROJECT_ID"
        # The default zone where the block volumes will be created, ex: fr-par-1
        SCW_DEFAULT_ZONE = "SCALEWAY_DEFAULT_ZONE"
      }

      config {
        image = "scaleway/scaleway-csi:master"

        args = [
          "--mode=controller",
          "--endpoint=unix://csi/csi.sock",
          "--logtostderr",
          "--v=5",
        ]
      }

      csi_plugin {
        id        = "csi.scaleway.com"
        type      = "controller"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 64
        memory = 64
      }
    }
  }
}
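Once the controller job is submitted, it’s worth confirming that its allocation is running and that the plugin started cleanly (credential errors show up in the task logs); for example:

# check that the controller allocation is running
nomad job status plugin-scaleway-bs-controller

# tail the plugin task logs of one allocation of the job
nomad alloc logs -job plugin-scaleway-bs-controller plugin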

nomad job run plugin-scaleway-bs-nodes.hcl

job "plugin-scaleway-bs-nodes" {
  datacenters = ["scaleway-fr1"]

  # you can run node plugins as service jobs as well, but this ensures
  # that all nodes in the DC have a copy.
  type = "system"

  # only one plugin of a given type and ID should be deployed on
  # any given client node
  #constraint {
  #  operator = "distinct_hosts"
  #  value = true
  #}

  group "nodes" {
    task "plugin" {
      driver = "docker"

      env {
        SCW_ACCESS_KEY = "REPLACE_WITH_SCALEWAY_GENERATED_ACCESS_KEY"
        SCW_SECRET_KEY = "REPLACE_WITH_SCALEWAY_GENERATED_SECRET_KEY"
        # Project ID could also be an Organization ID
        SCW_DEFAULT_PROJECT_ID = "SCALEWAY_PROJECT_ID"
        # The default zone where the block volumes will be created, ex: fr-par-1
        SCW_DEFAULT_ZONE = "SCALEWAY_DEFAULT_ZONE"
      }

      config {
        image = "scaleway/scaleway-csi:master"

        args = [
          "--mode=node",
          "--endpoint=unix://csi/csi.sock",
          "--logtostderr",
          "--v=5",
        ]

        # node plugins must run as privileged jobs because they
        # mount disks to the host
        privileged = true
      }

      csi_plugin {
        id        = "csi.scaleway.com"
        type      = "node"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 64
        memory = 64
      }
    }
  }
}

The plugin should then be visible in the plugins section of the Nomad UI, as shown in the picture, or from the CLI as sketched below.

Nomad CSI Scaleway Plugin
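The same information is available from the CLI, which is handy when the UI isn’t reachable:

# list registered CSI plugins and their controller/node health
nomad plugin status -type csi

# details for the Scaleway plugin registered above
nomad plugin status csi.scaleway.com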

Now it’s time for the volume registration. There are two ways to do it. If you already have a block storage volume, you must populate the template below with an external_id entry (it can be fetched from the Scaleway website, in the details section of the existing block storage volume); the result should be similar to the following excerpt: external_id = "fr-par-1/6142201f-6902-4aba-bcc7-e49c15234206". If it’s a volume created from scratch, just copy the template below to a file named block-storage.hcl, or any name you desire.

# volume specification (used with nomad volume create below)
type = "csi"
id = "block-storage"
name = "block-storage"
plugin_id = "csi.scaleway.com"
zone = "fr-par-1"
capacity_max = "25G"
capacity_min = "25G"
access_mode = "single-node-writer"
attachment_mode = "file-system"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}

nomad volume create block-storage.hcl

The volume’s external_id will be returned and can be kept in case the volume ever needs to be registered again.
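For the other path described above (an already existing block storage volume), the spec carries the external_id and is submitted with nomad volume register instead of nomad volume create. A sketch, reusing the example ID from above (replace it with the ID of your own volume):

# volume registration of an existing Scaleway block storage volume
type        = "csi"
id          = "block-storage"
name        = "block-storage"
plugin_id   = "csi.scaleway.com"
external_id = "fr-par-1/6142201f-6902-4aba-bcc7-e49c15234206"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}

nomad volume register block-storage.hcl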

Running a job using CSI

Now it’s time to run a stateful job.
The following template will schedule a container using the mongodb Docker image, listening on port 27017.

job "mongodb-scw-fr1" {
  datacenters = ["scaleway-fr1"]
  type        = "service"

  update {
    stagger      = "10s"
    max_parallel = 1
  }

  group "nosql" {
    count = 1

    network {
      port "db" {
        to = 27017
        static = 27017
      }
    }

    volume "MongoDB" {
      type      = "csi"
      read_only = false
      source    = "block-storage-v1"
      attachment_mode = "file-system"
      access_mode     = "single-node-writer"
    }

    restart {
      attempts = 10
      interval = "5m"
      delay    = "25s"
      mode     = "delay"
    }

    task "mongo" {
      driver = "docker"

      volume_mount {
        volume      = "MongoDB"
        destination = "/data/db"
        read_only   = false
      }

      config {
        image = "mongo:4.4"
        ports = ["db"]
      }

      resources {
        cpu    = 512
        memory = 768
      }

      service {
        name = "mongodb-scw-fr1"
        tags = ["db", "nosql", "mongodb"]
        port = "db"

        check {
          type     = "tcp"
          interval = "10s"
          timeout  = "4s"
        }
      }
    }
  }
}

nomad job run mongodb.hcl

If the deployment succeeds, a MongoDB service should start, and it can be migrated between servers (Nomad clients) without losing data.
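The attachment can also be checked from the CLI, for example:

# shows the volume, its capacity and which allocation/node currently claims it
nomad volume status block-storage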

Known and detected problems

  • When migrating volume(s) between nodes, there’s downtime while Nomad attempts to migrate the volume(s).
  • During a volume migration it might happen that the volume won’t schedule properly; when that happens, manual remediation is needed, such as deregistering the volume with nomad volume deregister -force volume_name and registering it again (see the sketch after this list).
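A sketch of that remediation, assuming the volume ID and spec file used earlier in this post:

# force the volume out of Nomad’s state (the Scaleway volume itself is not deleted)
nomad volume deregister -force block-storage

# re-register the existing volume (spec with external_id, as sketched earlier)
nomad volume register block-storage.hcl

# re-run the job so it can claim the volume again
nomad job run mongodb.hcl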