Resource quotas
Enterprise Only
The functionality described here is available only in Nomad Enterprise with the Governance & Policy module. To explore Nomad Enterprise features, you can sign up for a free 30-day trial from here.
Nomad Enterprise provides support for resource quotas, which allow operators to restrict the aggregate resource usage of namespaces. Once a quota specification is attached to a namespace, the Nomad cluster will count all resource usage by jobs in that namespace toward the quota limits. If the resource is exhausted, allocations within the namespaces will be queued until resources become available—by other jobs finishing or the quota being expanded.
In this guide, you'll create a quota that limits resources in the global region. You will then assign the quota to a namespace where you will deploy a job. Finally, you'll learn how to secure the quota with ACLs.
Define and create a quota
You can manage resource quotas by using the nomad quota
subcommand. To
get started with creating a quota specification, run nomad quota init
which
produces an example quota specification.
$ nomad quota initExample quota specification written to spec.hcl
The file spec.hcl
defines a limit for the global region. Additional limits may
be specified in order to limit other regions.
# spec.hclname = "default-quota"description = "Limit the shared default namespace" # Create a limit for the global region. Additional limits may# be specified in-order to limit other regions.limit { region = "global" region_limit { cores = 0 cpu = 2500 memory = 1000 memory_max = 1000 device "nvidia/gpu/1080ti" { count = 1 } } variables_limit = 1000}
Resource limits
When specifying resource limits the following enforcement behaviors are defined:
limit < 0
: A limit less than zero disallows any access to the resource.limit == 0
: A limit of zero allows unlimited access to the resource.limit > 0
: A limit greater than zero enforces that the consumption is less than or equal to the given limit.
Note: Limits that specify the count of devices work differently. A count > 0
enforces that the consumption is less than or equal to the given limit, and setting
it to 0
disallows any usage of the given device. Unlike other limits, device count
cannot be set to a negative value. To allow unlimited access to a device
resource, remove it from the region limit.
A quota specification is composed of one or more resource limits. Each limit applies to a particular Nomad region. Within the limit object, operators can specify the allowed CPU and memory usage.
Create the quota
To create the quota, run nomad quota apply
with the filename of your quota specification.
$ nomad quota apply spec.hclSuccessfully applied quota specification "default-quota"!
Check for success with nomad quota list
.
$ nomad quota listName Descriptiondefault-quota Limit the shared default namespace
Attach a quota to a namespace
In order for a quota to be enforced, you have to attach the quota specification
to a namespace. This can be done using the nomad namespace apply
command. Add
the new quota specification to the default
namespace as follows:
$ nomad namespace apply -quota default-quota defaultSuccessfully applied namespace "default"!
View quota information
Now that you have attached a quota to a namespace, you can run a job in the default namespace.
Initialize the job.
$ nomad job initExample job file written to example.nomad.hcl
Run the job in the default namespace.
$ nomad job run -detach example.nomad.hclJob registration successfulEvaluation ID: 985a1df8-0221-b891-5dc1-4d31ad4e2dc3
Check the status.
$ nomad quota status default-quotaName = default-quotaDescription = Limit the shared default namespaceLimits = 1 Quota LimitsRegion CPU Usage Core Usage Memory Usage Memory Max Usage Variables Usageglobal 0 / 2500 0 / inf 0 / 1000 0 / 1000 0 / 1000 Quota Device LimitsRegion Device Name Device Usageglobal nvidia/gpu/1080ti 1 / 1
Notice that the newly created job is accounted against the quota specification since it is being run in the namespace attached to the "default-quota" quota.
$ nomad job run -detach example.nomad.hclJob registration successfulEvaluation ID: ce8e1941-0189-b866-3dc4-7cd92dc38a69
Now, add more instances of the job by changing count = 1
to count = 4
and
check to see if additional allocations were added.
$ nomad status exampleID = exampleName = exampleSubmit Date = 10/16/17 10:51:32 PDTType = servicePriority = 50Datacenters = dc1Status = runningPeriodic = falseParameterized = false SummaryTask Group Queued Starting Running Failed Complete Lostcache 1 0 3 0 0 0 Placement FailureTask Group "cache": * Quota limit hit "memory exhausted (1024 needed > 1000 limit)" Latest DeploymentID = 7cd98a69Status = runningDescription = Deployment is running DeployedTask Group Desired Placed Healthy Unhealthycache 4 3 0 0 AllocationsID Node ID Task Group Version Desired Status Created At6d735236 81f72d90 cache 1 run running 10/16/17 10:51:32 PDTce8e1941 81f72d90 cache 1 run running 10/16/17 10:51:32 PDT9b8e185e 81f72d90 cache 1 run running 10/16/17 10:51:24 PDT
The output indicates that Nomad created two more allocations but did not place the fourth allocation, which would have caused the quota to be oversubscribed on memory.
Secure quota with ACLs
Access to quotas can be restricted using ACLs. As an example, you could create an ACL policy that allows read-only access to quotas.
# Allow read only access to quotas.quota { policy = "read"}
Proper ACLs are necessary to prevent users from bypassing quota enforcement by increasing or removing the quota specification.
Design for federated clusters
Nomad makes working with quotas in a federated cluster simple by replicating quota specifications from the authoritative Nomad region. This allows operators to interact with a single cluster but create quota specifications that apply to all Nomad clusters.
As an example, you can create a quota specification that applies to two regions:
name = "federated-example"description = "A single quota spec affecting multiple regions" # Create a limits for two regionslimit { region = "europe" region_limit { cpu = 20000 memory = 10000 }} limit { region = "asia" region_limit { cpu = 10000 memory = 5000 }}
Apply the specification.
$ nomad quota apply spec.hclSuccessfully applied quota specification "federated-example"!
Now that the specification is applied and attached to a namespace with jobs in each region,
use the nomad quota status
command to observe how the enforcement applies
across federated clusters.
$ nomad quota status federated-exampleName = federated-exampleDescription = A single quota spec affecting multiple regionsLimits = 2 Quota LimitsRegion CPU Usage Memory Usageasia 2500 / 10000 1000 / 5000europe 8800 / 20000 6000 / 10000
Learn more about quotas
For specific details about working with quotas, consult the quota commands and HTTP API documentation.