What is Maelstrom?

Maelstrom is a suite of tools for running tests in isolated micro-containers locally on your machine or distributed across arbitrarily large clusters. Maelstrom currently has test runners for Rust, Go, and Python, with more on the way. You might use Maelstrom to run your tests because:

  • It's easy. Maelstrom provides drop-in replacements for cargo test, go test, and pytest. In most cases, it just works with your existing tests with minimal configuration.
  • It's reliable. Maelstrom runs every test isolated in its own lightweight container, eliminating confusing errors caused by inter-test or implicit test-environment dependencies.
  • It's scalable. Maelstrom can be run as a cluster. You can add more worker machines to linearly increase test throughput.
  • It's clean. Maelstrom has built a rootless container implementation (not relying on Docker or RunC) from scratch, in Rust, optimized to be low-overhead and start quickly.
  • It's fast. In most cases, Maelstrom is faster than cargo test or go test, even without using clustering. Maelstrom's test-per-process model is inherently slower than Pytest's shared-process model, but Maelstrom provides test isolation at a low performance cost.

While our focus thus far has been on running tests, Maelstrom's underlying job execution system is general-purpose. We provide a command line utility to run arbitrary commands, as well as a gRPC-based API and Rust bindings for programmatic access and control.

The project is currently Linux-only (x86 and ARM), as it relies on namespaces to implement containers.

Structure of This Book

This book will start out covering how to install Maelstrom. Next, it will cover common concepts that are applicable to all Maelstrom components, and other concepts that are specific to all Maelstrom clients. After that, there are in-depth chapters for each of the six binaries: cargo-maelstrom, maelstrom-go-test, maelstrom-pytest, maelstrom-run, maelstrom-broker, and maelstrom-worker.

There is no documentation yet for the gRPC API or the Rust bindings. Contact us if you're interested in using them, and we'll help get you started.

Installation

Maelstrom consists of a number of different programs. These are covered in more depth in this chapter. If you just want to give Maelstrom a test ride, you'll probably only want to install a test runner like cargo-maelstrom, maelstrom-go-test, or maelstrom-pytest.

The installation process is virtually identical for all programs. We'll demonstrate how to install all the binaries in the following sections. You can pick and choose which ones you actually want to install.

Maelstrom currently only supports Linux.

Installing From Pre-Built Binaries

The easiest way to install Maelstrom binaries is to use cargo-binstall, which allows you to pick and choose the binaries you want to install:

cargo binstall maelstrom-run
cargo binstall cargo-maelstrom
cargo binstall maelstrom-go-test
cargo binstall maelstrom-pytest
cargo binstall maelstrom-broker
cargo binstall maelstrom-worker

These commands retrieve the pre-built binaries from the Maelstrom GitHub release page. If you don't have cargo-binstall, you can directly install the pre-built binaries by simply untarring the release artifacts. For example:

wget -q -O - https://github.com/maelstrom-software/maelstrom/releases/latest/download/cargo-maelstrom-x86_64-unknown-linux-gnu.tgz | tar xzf -

This will download and extract the latest release of cargo-maelstrom for Linux on the x86-64 architecture.

Installing Using Nix

Maelstrom includes a flake.nix file, so you can install all Maelstrom binaries with nix profile install:

nix profile install github:maelstrom-software/maelstrom

The Nix flake doesn't currently support installing individual binaries.

Installing From Source With cargo install

Maelstrom binaries can be built from source using cargo install:

cargo install maelstrom-run
cargo install cargo-maelstrom
cargo install maelstrom-go-test
cargo install maelstrom-pytest
cargo install maelstrom-worker

However, maelstrom-broker requires some extra dependencies to be installed before it can be built from source:

rustup target add wasm32-unknown-unknown
cargo install wasm-opt
cargo install maelstrom-broker

Common Concepts

This chapter covers concepts that are common to Maelstrom as a whole. Later chapters will cover individual programs and program classes in greater detail.

Jobs

The fundamental execution unit in Maelstrom is a job. A job is a program that is run in its own container, either locally or on a cluster. Jobs are intended to be programs that terminate on their own after doing some fixed amount of work. They aren't intended to be interactive or to run indefinitely. However, Maelstrom does provide a mechanism for running a job interactively for troubleshooting purposes.

Test runners will usually translate each test case into its own standalone job. So, in the context of a test runner, we use the terms "job" and "test" interchangeably.

Programs

Maelstrom comprises a number of different programs, split into two main categories: clients and daemons.

Clients

Clients include the test runners — cargo-maelstrom, maelstrom-go-test, and maelstrom-pytest — plus the CLI tool maelstrom-run.

Test Runners

Test runners are the glue between their associated test frameworks (Cargo, Go, Pytest, etc.) and the Maelstrom system. Test runners know how to build test binaries (if applicable), what dependencies to package up into containers to run the test binaries, and how to execute individual tests using the test binaries. They then use this information to build tests, execute them on the Maelstrom system, and then collect and present the results.

We currently have three test runners, but we're working to add more quickly. Please let us know if there is a specific test framework you are interested in, and we'll work to prioritize it.

maelstrom-run

In addition to the test runners, there is also a general-purpose CLI for running arbitrary jobs in the Maelstrom environment: maelstrom-run. This program can be useful in a few different scenarios.

First, it can be used to explore and debug containers used by tests. Sometimes, if you can't figure out why a test fails in a container but succeeds otherwise, it's useful to enter the container environment and poke around. You can try running tests manually, exploring the directory structure, etc. This is done by running maelstrom-run --tty. In this scenario, it's very similar in feel to docker exec -it.

Second, it can be used in a script to execute arbitrary programs on the cluster. Let's say you had a Monte Carlo simulation program and you wanted to run it thousands of times, in parallel, on your Maelstrom cluster. You could use maelstrom-run to do this easily, either from the command-line or from a script.

Daemons

Unless configured otherwise, Maelstrom clients will execute jobs locally, in standalone mode. Put another way: clustering is completely optional. While standalone mode can be useful in certain applications, Maelstrom becomes even more powerful when jobs are executed on a cluster. The Maelstrom daemon programs are used to create Maelstrom clusters.

maelstrom-worker

Each Maelstrom cluster must have at least one worker. Workers are where jobs are actually executed. The worker executes each job in its own rootless container. Our custom-built container implementation ensures that there is very little overhead for container startup or teardown.

maelstrom-broker

Each Maelstrom cluster has exactly one broker. The broker coordinates between clients and workers. It caches artifacts, and schedules jobs on the workers.

The broker should be run on a machine that has good network connectivity with both the workers and clients, and which has a reasonably large amount of disk space available for caching artifacts.

Summary

You'll probably be mostly interested in a specific Maelstrom client: the test runner for your test framework. You may also be interested in the general-purpose maelstrom-run client, either for scripting against the cluster, or for exploring the containers used by your tests.

If you have access to multiple machines and want to build a Maelstrom cluster, you'll need to install one instance of the broker daemon and as many instances of the worker daemon as you have available machines.

Job States

Jobs transition through a number of states in their journey. This chapter explains those states.

Waiting for Artifacts

If a broker doesn't have all the required artifacts (i.e. container layers) for a job when it is submitted, the job enters the Waiting-for-Artifacts state. The broker will notify the client of the missing artifacts, and wait for the client to transfer them. Once all artifacts have been received from the client, the job will proceed to the next state.

All jobs initially enter this state, though some then immediately transition to Pending if the broker has all of the required artifacts. Local jobs also immediately transition out of this state, since the worker is co-located with the client and has immediate access to all of the artifacts.

Pending

In the Pending state, the broker has the job and all of its artifacts, but hasn't yet found a free worker to execute the job. Jobs in this state are stored in a queue. Once a job reaches the front of the queue, and a worker becomes free, the job will be sent to the worker for execution.

Local jobs aren't technically sent to the broker. However, they still do enter a queue waiting to be submitted to the local worker, which is similar to the situation for remote jobs. For that reason, we lump local and remote jobs together in this state.

Running

A Running job has been sent to the worker for execution. The worker could be executing the job, or it could be transferring some artifacts from the broker. In the future, we will likely split this state into its various sub-states. If a worker disconnects from the broker, the broker moves all jobs that were assigned to that worker back to the Pending state.

Completed

Jobs in this state have been executed to completion by a worker.
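The lifecycle described in this chapter can be summarized as a small state machine. The sketch below is purely illustrative; the state keys and the can_transition helper are ours, not Maelstrom's actual code.

```python
# Illustrative model of the job-state transitions described in this chapter.
# Not Maelstrom's actual implementation.

TRANSITIONS = {
    "waiting-for-artifacts": {"pending"},  # all artifacts received by the broker
    "pending": {"running"},                # job assigned to a free worker
    "running": {"completed", "pending"},   # finished, or its worker disconnected
    "completed": set(),                    # terminal state
}

def can_transition(state, next_state):
    """Return True if a job may move directly from state to next_state."""
    return next_state in TRANSITIONS.get(state, set())
```

Note that "running" can move back to "pending": that models the broker re-queuing jobs when a worker disconnects.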

Configuration Values

All Maelstrom programs are configured through "configuration values". Configuration values can be set through command-line options, environment variables, or configuration files.

Each configuration value has a type: string, number, boolean, or list.

Imagine a hypothetical configuration value named config-value in a hypothetical program called maelstrom-prog. This configuration value could be specified via:

  • The --config-value command-line option.
  • The MAELSTROM_PROG_CONFIG_VALUE environment variable.
  • The config-value key in a configuration file.

Command-Line Options

Configuration values set on the command line override settings from environment variables or configuration files.

Type      Example
string    --frob-name=string
string    --frob-name string
number    --frob-size=42
number    --frob-size 42
boolean   --enable-frobs
list      --frob-name=a --frob-name=b or -- a b c

There is currently no way to set a boolean configuration value to false from the command-line.

Environment Variables

Configuration values set via environment variables override settings from configuration files, but are overridden by command-line options.

The environment variable name is created by converting the configuration-value name to "screaming snake case", and prepending a program-specific prefix. Imagine that we're evaluating configuration values for a program called maelstrom-prog:

Type      Example
string    MAELSTROM_PROG_FROB_NAME=string
number    MAELSTROM_PROG_FROB_SIZE=42
boolean   MAELSTROM_PROG_ENABLE_FROBS=true
boolean   MAELSTROM_PROG_ENABLE_FROBS=false
list      MAELSTROM_PROG_FROBS="a b c"

Note that you don't put quotation marks around string values. You can also set boolean values to either true or false.

List values are white-space delimited. Any extra white-space between entries is ignored.
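The naming rule above is mechanical, so it can be sketched in a few lines. This is an illustrative helper (the function name is ours, not part of Maelstrom):

```python
def env_var_name(program, config_value):
    """Convert a program name and a configuration-value name to the
    corresponding environment variable: "screaming snake case" with a
    program-specific prefix. Illustrative sketch of the rule above."""
    def to_snake(s):
        return s.upper().replace("-", "_")
    return f"{to_snake(program)}_{to_snake(config_value)}"
```

For example, the config-value setting of maelstrom-prog maps to MAELSTROM_PROG_CONFIG_VALUE.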

Configuration Files

Configuration files are in TOML format. In configuration files, configuration values map to keys of the same name. Value types map to the corresponding TOML types. For example:

frob-name = "string"
frob-size = 42
enable-frobs = true
enable-qux = false
frobs = ["a", "b"]

Maelstrom programs support multiple configuration files. In this case, the program will read each one in preference order, with the settings from higher-preference files overriding those from lower-preference files.

By default, Maelstrom programs will use the XDG Base Directory Specification for searching for configuration files.

Specifically, any configuration file found in XDG_CONFIG_HOME has the highest preference, followed by those found in XDG_CONFIG_DIRS. If XDG_CONFIG_HOME is not set, or is empty, then ~/.config/ is used. Similarly, if XDG_CONFIG_DIRS is not set, or is empty, then /etc/xdg/ is used.

Each program has a program-specific suffix that it appends to the directory it gets from XDG. This has the form maelstrom/<prog>, where <prog> is program-specific.

Finally, the program looks for a file named config.toml in these directories.

More concretely, these are where Maelstrom programs will look for configuration files:

Program            Configuration File
cargo-maelstrom    <xdg-config-dir>/maelstrom/cargo-maelstrom/config.toml
maelstrom-go-test  <xdg-config-dir>/maelstrom/go-test/config.toml
maelstrom-pytest   <xdg-config-dir>/maelstrom/pytest/config.toml
maelstrom-run      <xdg-config-dir>/maelstrom/run/config.toml
maelstrom-broker   <xdg-config-dir>/maelstrom/broker/config.toml
maelstrom-worker   <xdg-config-dir>/maelstrom/worker/config.toml

For example, if neither XDG_CONFIG_HOME nor XDG_CONFIG_DIRS is set, then cargo-maelstrom will look for two configuration files:

  • ~/.config/maelstrom/cargo-maelstrom/config.toml
  • /etc/xdg/maelstrom/cargo-maelstrom/config.toml
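The search rules above can be sketched in a few lines of Python. This is an illustrative model only; the function names are ours, and env stands in for the process environment:

```python
import os

def config_search_dirs(env):
    """XDG-based configuration directories, highest preference first.
    Unset or empty variables fall back to the defaults described above."""
    home = env.get("XDG_CONFIG_HOME") or os.path.expanduser("~/.config")
    dirs = env.get("XDG_CONFIG_DIRS") or "/etc/xdg"
    return [home] + dirs.split(":")

def config_files(env, prog):
    """Append the program-specific suffix maelstrom/<prog>/config.toml
    to each search directory."""
    return [os.path.join(d, "maelstrom", prog, "config.toml")
            for d in config_search_dirs(env)]
```

With neither variable set, config_files({}, "cargo-maelstrom") yields exactly the two paths listed above.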

Overriding Configuration File Location

Maelstrom programs support the --config-file (-c) command-line option. If this option is provided, the specified configuration file, and only that file, will be used.

If --config-file is given "-" as an argument, then no configuration file is used.

Here is a summary of which configuration files will be used for a given value of --config-file:

Command Line                                  Configuration File(s)
maelstrom-prog --config-file config.toml ...  only config.toml
maelstrom-prog --config-file - ...            none
maelstrom-prog ...                            search results
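The selection logic is simple enough to state as code. This is an illustrative sketch (the function name is ours); None models the option not being given:

```python
def select_config_files(config_file_option, search_results):
    """Which configuration files are used for a given --config-file value,
    per the summary above. Illustrative sketch, not Maelstrom's code."""
    if config_file_option is None:
        return search_results      # fall back to the XDG search path
    if config_file_option == "-":
        return []                  # explicitly no configuration file
    return [config_file_option]    # only the named file
```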

Common Configuration Values

Every Maelstrom program supports the log-level configuration value.

Log Level

The log-level configuration value specifies which log messages should be output. The program will output log messages of the given severity or higher. This string configuration value must be one of the following:

Level      Meaning
"error"    indicates an unexpected and severe problem
"warning"  indicates an unexpected problem that may degrade functionality
"info"     is purely informational
"debug"    is mostly for developers

The default value is "info".

Most programs output log messages to standard output or standard error, though the maelstrom-client background process will log them to a file in the state directory.

Common Command-Line Options

Every Maelstrom program supports the following command-line options, in addition to the command-line options for the common configuration values.

--help

The --help (or -h) command-line option will print out the program's command-line options, configuration values (including their associated environment variables), and configuration-file search path, then exit.

--version

The --version (or -v) command-line option will cause the program to print its software version, then exit.

--print-config

The --print-config (or -P) command-line option will print out all of the program's configuration values, then exit. This can be useful for validating configuration.

--config-file

The --config-file (or -c) command-line option is used to specify a specific configuration file, or specify that no configuration file should be used. See here for more details.

Client-Specific Concepts

This chapter covers concepts common to all clients. It is important to understand these concepts before reading the client-specific chapters.

Local Worker

Every client has a built-in local worker. The local worker is used to run jobs in two scenarios.

First, if no broker configuration value is specified, then the client runs in standalone mode. In this mode, all jobs are executed by the local worker.

Second, some jobs are considered local-only. These jobs must be run on the local machine because they utilize some resource that is only available locally. These jobs are always run on the local worker, even if the client is connected to a broker.

Currently, when a client is connected to a broker, it will only use the local worker for local-only jobs. This will change in the future so that the local worker is utilized even when the client is connected to a cluster.

Clients have the following configuration values to configure their local workers:

Value         Type    Description                                           Default
cache-size    string  target cache disk space usage                         "1 GB"
inline-limit  string  maximum amount of captured standard output and error  "1 MB"
slots         number  job slots available                                   1 per CPU

cache-size

The cache-size configuration value specifies a target size for the cache. Its default value is 1 GB. When the cache consumes more than this amount of space, the worker will remove unused cache entries until the size is below this value.

It's important to note that this isn't a hard limit, and the worker will go above this amount in two cases. First, the worker always needs all of the currently-executing jobs' layers in cache. Second, the worker currently first downloads an artifact in its entirety, then adds it to the cache, then removes old values if the cache has grown too large. In this scenario, the combined size of the downloading artifact and the cache may exceed cache-size.

For these reasons, it's important to leave some wiggle room in the cache-size setting.
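The eviction behavior described above can be sketched as follows. This is an illustrative model only (the function name and data layout are ours): entries is a list of (name, size) pairs, oldest first, and names in in_use are never evicted.

```python
def evict(entries, in_use, target_size):
    """Remove unused cache entries, oldest first, until the total size is
    at or below target_size. Entries still in use by executing jobs are
    kept, so the cache may remain above the target. Illustrative sketch."""
    total = sum(size for _, size in entries)
    kept = []
    for name, size in entries:  # oldest entries come first
        if total > target_size and name not in in_use:
            total -= size       # evict this unused entry
        else:
            kept.append((name, size))
    return kept
```

Note how in-use entries can keep the total above the target, which is one reason to leave wiggle room in the cache-size setting.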

inline-limit

The inline-limit configuration value specifies how many bytes of stdout or stderr will be captured from jobs. Its default value is 1 MB. If stdout or stderr grows larger, the client will be given inline-limit bytes and told that the rest of the data was truncated.

In the future we will add support for the worker storing all of stdout and stderr if they exceed inline-limit. The client would then be able to download it "out of band".

slots

The slots configuration value specifies how many jobs the worker will run concurrently. Its default value is the number of CPU cores on the machine. In the future, we will add support for jobs consuming more than one slot.

Specifying the Broker

Every client has a broker configuration value that specifies the socket address of the broker. This configuration value is optional. If not provided, the client will run in standalone mode.

Here are some example socket address values:

  • broker.example.org:1234
  • 192.0.2.3:1234
  • [2001:db8::3]:1234

Directories

There are a number of directories that Maelstrom clients use. This chapter documents them.

Project Directory

The project directory is used to resolve local relative paths. It's also where the client will put the container tags lock file.

For maelstrom-pytest and maelstrom-run, the project directory is just the current working directory.

For cargo-maelstrom, the Maelstrom project directory is the same as the Cargo project directory. This is where the top-level Cargo.toml file is for the project. For simple Cargo projects with a single package, this will be the package's root directory. For more complex Cargo projects that use workspaces, this will be the workspace root directory.

For maelstrom-go-test, the Maelstrom project directory is the root of the main package. This is the directory containing the closest go.mod file.

Container Depot Directory

The container depot directory is where clients cache container images that they download from image registries. It's usually desirable to share this directory across all clients and all projects. It's specified by the container-image-depot-root configuration value. See here for details.

State Directory

The state directory concept comes from the XDG Base Directory Specification. It's where the client will put things that should persist between restarts, but aren't important enough to be stored elsewhere, and which can be removed safely.

Maelstrom clients use this directory for two purposes. First, every client spawns a program called maelstrom-client which it speaks to using gRPC messages. The log output for this program goes to the client-process.log file in the state directory.

Second, test runners keep track of test counts and test timings between runs. This lets them estimate how long a test will take, and how many tests still need to be built or run. Without this information, test runners will just give inaccurate estimates until they've rebuilt the state.

This state is project- and client-specific, so it is stored within the project in a client-specific directory:

Client             State Directory
maelstrom-run      state-root configuration value or the XDG specification
cargo-maelstrom    maelstrom/state in the target subdirectory of the project directory
maelstrom-go-test  .maelstrom-go-test/state in the project directory
maelstrom-pytest   .maelstrom-pytest/state in the current directory

Cache Directory

The cache directory concept comes from the XDG Base Directory Specification. It's where non-essential files that are easily rebuilt are stored.

Maelstrom clients use this directory for their local worker. The size of the local-worker part of the cache is maintained by the cache-size configuration value.

In addition, clients use this directory to store other cached data like layers created with layer specifications. The size of this part of the cache directory isn't actively managed. If it grows too large, the user can safely delete the directory.

This cache is project- and client-specific, so it is stored within the project in a client-specific directory:

Client             Cache Directory
maelstrom-run      cache-root configuration value or the XDG specification
cargo-maelstrom    maelstrom/cache in the target subdirectory of the project directory
maelstrom-go-test  .maelstrom-go-test/cache in the project directory
maelstrom-pytest   .maelstrom-pytest/cache in the current directory

Container Images

Maelstrom runs each job in its own container. By default, these containers are minimal and are built entirely from files copied from the local machine. This works well for compiled languages (like Rust), where jobs don't depend on executing other programs. However, for interpreted languages (like Python), or in situations where the job needs to execute other programs, Maelstrom provides a way to create containers based on standard OCI container images. These container images can be retrieved from a container registry like Docker Hub, or provided from the local machine.

Container Image URIs

A container image is specified using the URI format defined by the Containers project. In particular, here is their specification of the URI format. We currently support the following URI schemes.

docker://docker_reference

This scheme indicates that the container image should be retrieved from a container registry using the Docker Registry HTTP API V2. The container registry, container name, and optional tag or digest are specified in docker_reference.

docker-reference has the format: name[:tag | @digest]. If name does not contain a slash, it is treated as docker.io/library/name. Otherwise, the component before the first slash is checked to see if it is recognized as a hostname[:port] (i.e., it contains either a . or a :, or the component is exactly localhost). If the first component of name is not recognized as a hostname[:port], name is treated as docker.io/name.

Here is how the Containers project specifies this scheme.
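The name-normalization rules above are easy to get wrong by hand, so here is a sketch of them in code. This is illustrative only (the function name is ours), not a full reference parser:

```python
def normalize_name(name):
    """Normalize the name part of a docker reference per the rules above.
    Illustrative sketch; tags and digests are assumed already stripped."""
    if "/" not in name:
        return f"docker.io/library/{name}"
    first, _, _ = name.partition("/")
    # The first component is a registry hostname[:port] if it contains
    # a '.' or a ':', or is exactly "localhost".
    if "." in first or ":" in first or first == "localhost":
        return name
    return f"docker.io/{name}"
```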

oci:/path[:reference]

This scheme indicates that the container images should be retrieved from a local directory at path in the format specified by the Open Container Image Layout Specification.

Any characters after the first : are considered to be part of reference, which is used to match an org.opencontainers.image.ref.name annotation in the top-level index. If reference is not specified, the directory must contain exactly one image.

Here is how the Containers project specifies this scheme.

oci-archive:/path[:reference]

This scheme indicates that the container images should be retrieved from a tar file at path with contents in the format specified by the Open Container Image Layout Specification.

Any characters after the first : are considered to be part of reference, which is used to match an org.opencontainers.image.ref.name annotation in the top-level index. If reference is not specified when reading an archive, the archive must contain exactly one image.

Here is how the Containers project specifies this scheme.

Cached Container Images

When a container image is specified, the client will first download or copy the image into its cache directory. It will then use the internal bits of the OCI image — most importantly the file-system layers — to create the container for the job.

The cache directory can be set with the container-image-depot-root configuration value. If this value isn't set, but XDG_CACHE_HOME is set and non-empty, then $XDG_CACHE_HOME/maelstrom/containers will be used. Otherwise, ~/.cache/maelstrom/containers will be used. See the XDG Base Directories specification for more information.
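The precedence described above can be sketched as follows. This is an illustrative model (the function name is ours); env stands in for the process environment:

```python
import os

def depot_dir(config_value, env):
    """Resolve the container-image cache directory using the precedence
    described above: explicit configuration value, then XDG_CACHE_HOME,
    then the ~/.cache fallback. Illustrative sketch."""
    if config_value:
        return config_value
    xdg = env.get("XDG_CACHE_HOME")
    if xdg:
        return os.path.join(xdg, "maelstrom", "containers")
    return os.path.expanduser("~/.cache/maelstrom/containers")
```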

Lock File

When a client first resolves a container registry tag, it stores the result in a local lock file. Subsequently, it will use the exact image specified in the lock file instead of resolving the tag again. This guarantees that subsequent runs use the same images as previous runs.

The lock file is maelstrom-container-tags.lock, stored in the project directory. It is recommended that this file be committed to revision control, so that others in the project, and CI, use the same images when running tests.

To update a tag to the latest version, remove the corresponding line from the lock file and then run the client. This will force it to re-evaluate the tag and store the new results.

Authentication

When a client connects to a container registry, it may need to authenticate. Maelstrom currently only supports registries with no authentication or with anonymous authentication. Password authentication will be added in the future.

Image Registry TLS Certificates

Maelstrom always uses HTTPS to connect to container registries. By default, it will reject self-signed or otherwise invalid certificates. However, clients can be told to accept these certificates with the accept-invalid-remote-container-tls-certs configuration value.

Configuration Values

All clients support the following container-image-related configuration values:

Value                                      Type     Description                                    Default
container-image-depot-root                 string   container images cache directory               $XDG_CACHE_HOME/maelstrom/containers
accept-invalid-remote-container-tls-certs  boolean  allow invalid container registry certificates  false

The Job Specification

Each job run by Maelstrom is defined by a job specification, or "job spec" for short. Understanding job specifications is important for understanding what's going on with your tests, for troubleshooting failing tests, and for understanding test runners' configuration directives and maelstrom-run's input format.

This chapter shows the specification and its related types in Rust. This is done because it's a convenient format to use for documentation. You won't have to interact with Maelstrom at this level. Instead, clients will have analogous configuration options for providing the job specification.

This is what the JobSpec looks like:

pub struct JobSpec {
    pub container: ContainerRef,
    pub program: Utf8PathBuf,
    pub arguments: Vec<String>,
    pub timeout: Option<Timeout>,
    pub estimated_duration: Option<Duration>,
    pub allocate_tty: Option<JobTty>,
}

This is what ContainerRef and ContainerSpec look like:

pub enum ContainerRef {
    Name(String),
    Inline(ContainerSpec),
}

pub struct ContainerSpec {
    pub image: Option<ImageSpec>,
    pub environment: Vec<EnvironmentSpec>,
    pub layers: Vec<LayerSpec>,
    pub devices: EnumSet<JobDevice>,
    pub mounts: Vec<JobMount>,
    pub network: JobNetwork,
    pub root_overlay: JobRootOverlay,
    pub working_directory: Option<Utf8PathBuf>,
    pub user: UserId,
    pub group: GroupId,
}

A JobSpec needs the information defined by a ContainerSpec to run. This information can be provided "inline" with the JobSpec, or via the name of a previously saved ContainerSpec.

program

pub struct JobSpec {
    pub program: Utf8PathBuf,
    // ...
}

This is the path of the program to run, relative to the working_directory. The job will complete when this program terminates, regardless of any other processes that have been started.

The path must be valid UTF-8. Maelstrom doesn't support arbitrary binary strings for the path.

If program does not contain a /, then the program will be searched for in every directory specified in the PATH environment variable, similar to how execlp, execvp, and execvpe work.
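The execvp-style lookup described above can be sketched as follows. This is an illustrative model only (the function name and the injectable is_executable predicate are ours, not Maelstrom's):

```python
import os

def find_program(program, path, is_executable=os.path.isfile):
    """Sketch of PATH lookup: a program name containing no '/' is searched
    for in each PATH entry in order; anything with a '/' is used as-is."""
    if "/" in program:
        return program
    for d in path.split(":"):
        candidate = os.path.join(d, program)
        if is_executable(candidate):
            return candidate
    return None  # not found in any PATH entry
```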

The program is run as PID 1 in its own PID namespace, which means that it acts as the init process for the container. This shouldn't matter for most use cases, but if the program starts a lot of subprocesses, it may need to explicitly clean up after them.

The program is run as both session and process group leader. It will not have a controlling terminal unless allocate_tty is provided.

arguments

pub struct JobSpec {
    // ...
    pub arguments: Vec<String>,
    // ...
}

These are the arguments to pass to program, excluding the name of the program itself. For example, to run cat foo bar, you would set program to "cat" and arguments to ["foo", "bar"].

image

pub struct ContainerSpec {
    // ...
    pub image: Option<ImageSpec>,
    // ...
}

pub struct ImageSpec {
    pub name: String,
    pub use_layers: bool,
    pub use_environment: bool,
    pub use_working_directory: bool,
}

The optional image field lets one define a job's container based on an OCI container image. See here for more information.

name

The name field is a URI in the format defined here.

use_layers

A use_layers value of true indicates that the job specification should use the image's layers as the bottom of its layer stack. More layers can be added with the job specification's layers field.

use_environment

A use_environment value of true indicates that the job specification should use the image's environment variables as a base. These can be modified with the job specification's environment field.

use_working_directory

A use_working_directory value of true indicates that the job specification should use the image's working directory instead of one provided in the job specification's working_directory field. If this flag is set, it is an error to also provide a working_directory field. It is also an error to set this flag with an image that doesn't provide a working directory.

environment

pub struct ContainerSpec {
    // ...
    pub environment: Vec<EnvironmentSpec>,
    // ...
}

pub struct EnvironmentSpec {
    pub vars: BTreeMap<String, String>,
    pub extend: bool,
}

The environment field specifies the environment variables passed to program. This will be a map of key-value pairs of strings.

To compute the environment-variable map for a job, the client starts with either an empty map or, if the use_environment flag is set, with the environment variables provided by the image given in the image field. This map is called the candidate map.

Then, for each map provided by an EnvironmentSpec element, it performs parameter expansion and then merges the resulting map into the candidate map. Once this has been done for every EnvironmentSpec element, the candidate map is used for the job.

Merging Environment-Variable Maps

If the extend flag is false, then the element's newly-computed environment-variable map will overwrite the candidate map.

If the extend flag is true, then the element's newly-computed environment-variable map will be merged into the candidate map: All variables specified in the element's map will overwrite the old values in the candidate map, but values not specified in the element's map will be left unchanged.
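The merge rule above can be sketched in a few lines of Rust. This is an illustrative model, not Maelstrom's actual implementation; `merge_element` is a hypothetical helper that takes an element's already-expanded variable map:

```rust
use std::collections::BTreeMap;

// Illustrative sketch of the merge rule described above (not Maelstrom's
// actual implementation). `vars` is an element's newly-computed map,
// after parameter expansion has been applied.
fn merge_element(
    candidate: &mut BTreeMap<String, String>,
    vars: BTreeMap<String, String>,
    extend: bool,
) {
    if !extend {
        // extend = false: the element's map overwrites the candidate map.
        candidate.clear();
    }
    // The element's variables overwrite matching keys and add new ones;
    // with extend = true, all other keys are left unchanged.
    candidate.extend(vars);
}
```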

Environment-Variable Parameter Expansion

Parameter substitution is applied to the values provided in the EnvironmentSpec maps. A parameter has one of two forms:

  • $env{FOO} evaluates to the value of the client's FOO environment variable.
  • $prev{FOO} evaluates to the value of the FOO environment variable in the partially-computed map, without this map applied.

It is an error if the referenced variable doesn't exist. However, :- can be used to provide a default value, like $env{VAR:-default}.
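The lookup-with-default rule can be sketched as follows. `resolve` is a hypothetical helper, not Maelstrom's API; it takes the text between the braces (e.g. `VAR` or `VAR:-default`) and the map being referenced — the client's environment for `$env{}`, or the partially-computed map for `$prev{}`:

```rust
use std::collections::BTreeMap;

// Illustrative sketch of the lookup rule described above. `param` is the
// text between the braces; `map` is the client environment ($env) or the
// partially-computed map ($prev). Not Maelstrom's actual implementation.
fn resolve(param: &str, map: &BTreeMap<String, String>) -> Result<String, String> {
    // An optional ":-" separates the variable name from its default value.
    let (name, default) = match param.split_once(":-") {
        Some((name, default)) => (name, Some(default)),
        None => (param, None),
    };
    match (map.get(name), default) {
        (Some(value), _) => Ok(value.clone()),
        (None, Some(default)) => Ok(default.to_string()),
        (None, None) => Err(format!("environment variable {name} doesn't exist")),
    }
}
```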

Environment-Variable Examples

Here are some examples of specifying the environment variable map.

The simplest example is when a single element is provided and either the image field is not provided or the use_environment flag is false. In this case, the variables specified will be provided to the job.

[{ "vars": { "FOO": "foo", "BAR": "bar" }, "extend": false }]
[{ "vars": { "FOO": "foo", "BAR": "bar" }, "extend": true }]

Both of these will result in the job being given two environment variables: FOO=foo and BAR=bar.

We can use the $env{} syntax to import variables from the client's environment. It can be useful to use :- in these cases to provide a default.

[{
    "vars": {
        "FOO": "$env{FOO}",
        "RUST_BACKTRACE": "$env{RUST_BACKTRACE:-0}"
    },
    "extend": false
}]

This will pass the client's value of $FOO to the job, and will produce an error if the client has no $FOO. The job will also be passed the client's $RUST_BACKTRACE, but if the client doesn't have that variable, RUST_BACKTRACE=0 will be provided instead.

We can use the $prev{} syntax to extract values from earlier in the array or from the specified image. For example, assume image is provided, and use_environment is set:

[{ "vars": { "PATH": "/my-bin:$prev{PATH}" }, "extend": true }]

This will prepend /my-bin to the PATH environment variable provided by the image. All other environment variables specified by the image will be preserved.

On the other hand, if we just wanted to use the image's FOO and BAR environment variables, but not any other ones it provided, we could do this:

[{ "vars": { "FOO": "$prev{FOO}", "BAR": "$prev{BAR}" }, "extend": false }]

Because extend is false, only FOO and BAR will be provided to the job.

It's possible to provide multiple environment elements. This feature is mostly aimed at programs, not humans, but it lets us provide an example that illustrates multiple features:

[
    { "vars": { "FOO": "foo1", "BAR": "bar1" }, "extend": false },
    { "vars": { "FOO": "foo2", "BAZ": "$env{BAZ}" }, "extend": true },
    { "vars": { "FOO": "$prev{BAZ}", "BAR": "$prev{BAR}" }, "extend": false },
]

The first element sets up an initial map with FOO=foo1 and BAR=bar1.

The second element sets FOO=foo2 and BAZ=<client-baz>, where <client-baz> is the client's value for $BAZ. Because extend is true, BAR isn't changed.

The third element sets FOO=<client-baz> and BAR=bar1. Because extend is false, the value for BAZ is removed. The end result is:

{ "FOO": "<client-baz>", "BAR": "bar1" }
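As a sanity check, the three steps above can be replayed directly. This sketch assumes the client's $BAZ is "client-baz" and inlines the `$prev{}` expansions by hand; it is an illustration, not Maelstrom code:

```rust
use std::collections::BTreeMap;

// Small helper to build an owned map from string pairs.
fn map(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

// Replays the three-element example above. The client's $BAZ is assumed
// to be "client-baz"; $prev{} expansions are inlined by hand.
fn compute_example() -> BTreeMap<String, String> {
    let client_baz = "client-baz";

    // Element 1: extend = false, so its map becomes the candidate map.
    let mut candidate = map(&[("FOO", "foo1"), ("BAR", "bar1")]);

    // Element 2: extend = true, so FOO is overwritten, BAZ is added from
    // the client's environment, and BAR is left unchanged.
    candidate.extend(map(&[("FOO", "foo2"), ("BAZ", client_baz)]));

    // Element 3: $prev{} is expanded against the candidate map first;
    // then, because extend = false, the result replaces the candidate map.
    let prev_baz = candidate["BAZ"].clone();
    let prev_bar = candidate["BAR"].clone();
    map(&[("FOO", prev_baz.as_str()), ("BAR", prev_bar.as_str())])
}
```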

PATH Environment Variable

If the program field doesn't include a / character, then the PATH environment variable is used when searching for program, as is done by execlp, execvp, and execvpe.

layers

#![allow(unused)]
fn main() {
pub struct JobSpec {
    // ...
    pub layers: Vec<LayerSpec>,
    // ...
}
}

The file system layers specify what file system the program will be run with. They are stacked on top of each other, starting with the first layer, with later layers overriding earlier layers.

These Layer objects are described in the next chapter. Test runners and maelstrom-run provide ways to conveniently specify these, as described in their respective chapters.

If the image field is provided, and it contains use_layers, then the layers provided in this field are stacked on top of the layers provided by the image.
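Modeling each layer as a map from path to file contents, the stacking rule can be sketched as follows. This is only an illustration; the real mechanism is a union file system, not a map merge:

```rust
use std::collections::BTreeMap;

// Illustrative model of layer stacking: each layer maps paths to file
// contents, and later layers override earlier ones. The real mechanism
// is a union file system, not a map merge.
fn stack_layers(layers: &[BTreeMap<String, String>]) -> BTreeMap<String, String> {
    let mut fs = BTreeMap::new();
    for layer in layers {
        // Later layers overwrite matching paths from earlier layers.
        fs.extend(layer.clone());
    }
    fs
}
```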

mounts

#![allow(unused)]
fn main() {
pub struct ContainerSpec {
    // ...
    pub mounts: Vec<JobMount>,
    // ...
}

pub enum JobMount {
    Bind {
        mount_point: Utf8PathBuf,
        local_path: Utf8PathBuf,
        read_only: bool,
    },
    Devices {
        devices: EnumSet<JobDevice>,
    },
    Devpts {
        mount_point: Utf8PathBuf,
    },
    Mqueue {
        mount_point: Utf8PathBuf,
    },
    Proc {
        mount_point: Utf8PathBuf,
    },
    Sys {
        mount_point: Utf8PathBuf,
    },
    Tmp {
        mount_point: Utf8PathBuf,
    },
}
}

These are extra file-system mounts put into the job's environment. They are applied in order, and each mount_point must already exist in the file system. The mount point also must not be "/". Providing the mount point is one of the use cases for the "stubs" layer type.

Every mount type except Devices has a mount_point field, which is relative to the root of the file system, even if there is a working_directory specified.

Bind

#![allow(unused)]
fn main() {
pub enum JobMount {
    Bind {
        mount_point: Utf8PathBuf,
        local_path: Utf8PathBuf,
        read_only: bool,
    },
    // ...
}
}

If the job has a bind mount, it becomes a local-only job: it can only be run on the local worker, not distributed across a cluster. However, bind mounts can be a useful "escape valve" for certain jobs that are tricky to containerize.

With a bind mount, the directory at mount_point in the job's container will refer to the directory at local_path in the client. In other words, local_path is made available to the job at mount_point within the container. local_path is evaluated relative to the client's project directory.

The read_only flag specifies whether or not the job can write to the directory. NOTE: the mount isn't "locked", which means that if the job really wants to, it can remount mount_point read-write and modify the contents of the directory. We may consider locking mount points in a future version of Maelstrom.

Devices

#![allow(unused)]
fn main() {
pub enum JobMount {
    // ...
    Devices {
        devices: EnumSet<JobDevice>,
    },
    // ...
}

pub enum JobDevice {
    Full,
    Fuse,
    Null,
    Random,
    Shm,
    Tty,
    Urandom,
    Zero,
}
}

This mount type specifies the set of device files from /dev to add to the job's environment. Any subset can be specified.

Any specified device will be mounted in /dev based on its name. For example, Null would be mounted at /dev/null. For this to work, there must be a file located at the expected location in the container file system. In other words, if your job is going to specify Null, it also needs to have an empty file at /dev/null for the system to mount the device onto. This is one of the use cases for the "stubs" layer type.

Devpts

#![allow(unused)]
fn main() {
pub enum JobMount {
    // ...
    Devpts {
        mount_point: Utf8PathBuf,
    },
    // ...
}
}

This provides a devpts file system at the provided mount point. Maelstrom will always specify a ptmxmode=0666 option.

If this file system is mounted, it usually makes sense to also add a symlink from /dev/pts/ptmx (or wherever the file system is mounted) to /dev/ptmx. This can be done with the symlinks layer type.

Mqueue

#![allow(unused)]
fn main() {
pub enum JobMount {
    // ...
    Mqueue {
        mount_point: Utf8PathBuf,
    },
    // ...
}
}

This provides an mqueue file system at the provided mount point.

Proc

#![allow(unused)]
fn main() {
pub enum JobMount {
    // ...
    Proc {
        mount_point: Utf8PathBuf,
    },
    // ...
}
}

This provides a proc file system at the provided mount point.

Sys

#![allow(unused)]
fn main() {
pub enum JobMount {
    // ...
    Sys {
        mount_point: Utf8PathBuf,
    },
    // ...
}
}

This provides a sysfs file system at the provided mount point.

Linux disallows this mount type when using local networking. Jobs that specify both will receive an execution error and fail to run.

Tmp

#![allow(unused)]
fn main() {
pub enum JobMount {
    // ...
    Tmp {
        mount_point: Utf8PathBuf,
    },
}
}

This provides a tmpfs file system at the provided mount point.

network

#![allow(unused)]
fn main() {
pub struct ContainerSpec {
    // ...
    pub network: JobNetwork,
    // ...
}

pub enum JobNetwork {
    Disabled,
    Loopback,
    Local,
}
}

By default, jobs are run with Disabled, which means they are completely disconnected from the network, without even a loopback interface. This means that they cannot communicate on localhost/127.0.0.1/::1.

If this field is set to Loopback, then the job will have a loopback interface and will be able to communicate on localhost/127.0.0.1/::1, but otherwise will be disconnected from the network.

If this field is set to Local, the job becomes a local-only job, which can only be run on the local worker, not distributed across a cluster. The job will then be run without a network namespace, meaning that it will have access to all of the local machine's network devices. Note: if the job also specifies a Sys file system mount, Linux will fail to execute the job.

In the future, we plan to add more network options that will allow clustered jobs to communicate with the network. Until that time, if a job really has to communicate on the network, it must use Local.

root_overlay

#![allow(unused)]
fn main() {
pub struct ContainerSpec {
    // ...
    pub root_overlay: JobRootOverlay,
    // ...
}

pub enum JobRootOverlay {
    None,
    Tmp,
    Local {
        upper: Utf8PathBuf,
        work: Utf8PathBuf,
    },
}
}

The root_overlay field controls whether the job's root file system should be mounted read-only or read-write, and if it is mounted read-write, whether to capture the changes the job made.

Note that this field doesn't affect any file systems specified in mounts: those will be writable.

The None value is the default, which means the job's root file system will be read-only.

The Tmp value means that / will be an overlayfs file system, with "lower" being the file system specified by the layers, and "upper" being a tmpfs file system. This yields a writable root file system. The contents of "upper" (i.e. the changes made by the job to the root file system) are thrown away when the job terminates.

The Local value means that / will be an overlayfs file system, with "lower" being the file system specified by the layers, and "upper" being a local directory on the client. This value implies that the job will be a local-only job. When the job completes, the client can read the changes the job made to the root file system in upper. The work field must be a directory on the same file system as upper, and is used internally by overlayfs.

working_directory

#![allow(unused)]
fn main() {
pub struct ContainerSpec {
    // ...
    pub working_directory: Option<Utf8PathBuf>,
    // ...
}
}

This specifies the directory that program is run in. The path provided in program will also be evaluated relative to this directory. If this isn't provided, / is used.

user

#![allow(unused)]
fn main() {
pub struct ContainerSpec {
    // ...
    pub user: UserId,
    // ...
}

pub struct UserId(u32);
}

This specifies the UID the program is run as.

Maelstrom runs all of its jobs in rootless containers, meaning that they don't require any elevated permissions on the host machines. All containers will be run on the host machine as the user running cargo-maelstrom, maelstrom-run, or maelstrom-worker, regardless of this field's value.

However, if this field is set to 0, the program will have some elevated permissions within the container, which may be undesirable for some jobs.

group

#![allow(unused)]
fn main() {
pub struct ContainerSpec {
    // ...
    pub group: GroupId,
    // ...
}

pub struct GroupId(u32);
}

This specifies the GID the program is run as. See user for more information.

Jobs don't have any supplemental GIDs, nor is there any way to provide them.

timeout

#![allow(unused)]
fn main() {
pub struct JobSpec {
    // ...
    pub timeout: Option<Timeout>,
    // ...
}

pub struct Timeout(NonZeroU32);
}

This specifies an optional timeout for the job, in seconds. If the job takes longer than the timeout, Maelstrom will terminate it and return the partial results. A value of 0 indicates an infinite timeout.

estimated_duration

#![allow(unused)]
fn main() {
pub struct JobSpec {
    // ...
    pub estimated_duration: Option<Duration>,
    // ...
}
}

The estimated_duration field is used to allow Maelstrom to use longest-processing-time-first scheduling (LPT). It's up to clients to provide a best guess of how long a job will take. If they can't provide one, they leave this field empty.

Test runners keep track of how long previous instances of a test took and use that information to fill in this field.

allocate_tty

#![allow(unused)]
fn main() {
pub struct JobSpec {
    // ...
    pub allocate_tty: Option<JobTty>,
}

pub struct JobTty {
    // ...
}
}

The allocate_tty field is used by maelstrom-run with the --tty command-line option to run the job interactively with a pseudo-terminal attached. Jobs run this way will have standard input, output, and error all associated with the allocated tty.

This can be useful for inspecting the container environment for a job.

Job Specification Layers

At the lowest level, a layer is just a tar file or a manifest. A manifest is a Maelstrom-specific file format that allows for file data to be transferred separately from file metadata. But for our purposes here, they're essentially the same.

As a user, having to specify every layer as a tar file would be very painful. For this reason, Maelstrom provides some conveniences for creating layers based on specifications. A layer specification looks like this:

#![allow(unused)]
fn main() {
pub enum LayerSpec {
    Tar {
        path: Utf8PathBuf,
    },
    Glob {
        glob: String,
        prefix_options: PrefixOptions,
    },
    Paths {
        paths: Vec<Utf8PathBuf>,
        prefix_options: PrefixOptions,
    },
    Stubs { stubs: Vec<String> },
    Symlinks { symlinks: Vec<SymlinkSpec> },
    SharedLibraryDependencies {
        binary_paths: Vec<Utf8PathBuf>,
        prefix_options: PrefixOptions,
    },
}
}

We will cover each layer type below.

Tar

#![allow(unused)]
fn main() {
pub enum LayerSpec {
    Tar {
        path: Utf8PathBuf,
    },
    // ...
}
}

The Tar layer type is very simple: The provided tar file will be used as a layer. The path is specified relative to the project directory.

PrefixOptions

#![allow(unused)]
fn main() {
pub struct PrefixOptions {
    pub strip_prefix: Option<Utf8PathBuf>,
    pub prepend_prefix: Option<Utf8PathBuf>,
    pub canonicalize: bool,
    pub follow_symlinks: bool,
}
}

The Paths and Glob layer types support some options that can be used to control how the resulting layer is created. They apply to all paths included in the layer. These options can be combined, and in such a scenario you can think of them taking effect in the given order:

  • follow_symlinks: Don't include symlinks; instead, use what they point to.
  • canonicalize: Use the absolute form of the path, with components normalized and symlinks resolved.
  • strip_prefix: Remove the given prefix from paths.
  • prepend_prefix: Add the given prefix to paths.

Here are some examples.

follow_symlinks

If test/d/symlink is a symlink which points to the file test/d/target, and is specified with follow_symlinks, then Maelstrom will put a regular file in the container at /test/d/symlink with the contents of test/d/target.

canonicalize

If the client is executing in the directory /home/bob/project, and the layers/c/*.bin glob is specified with canonicalize, then Maelstrom will put files in the container at /home/bob/project/layers/c/*.bin.

Additionally, if /home/bob/project/layers/py is a symlink pointing to /var/py, and the layers/py/*.py glob is specified with canonicalize, then Maelstrom will put files in the container at /var/py/*.py.

strip_prefix

If layers/a/a.bin is specified with strip_prefix = "layers/", then Maelstrom will put the file in the container at /a/a.bin.

prepend_prefix

If layers/a/a.bin is specified with prepend_prefix = "test/", then Maelstrom will put the file in the container at /test/layers/a/a.bin.
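Assuming follow_symlinks and canonicalize have already been applied, the two prefix transformations can be sketched as follows; `apply_prefixes` is a hypothetical helper for illustration only:

```rust
// Illustrative sketch of strip_prefix and prepend_prefix, applied in
// that order. Layer paths are rooted at the container's /, so the
// result here corresponds to the container path minus its leading slash.
fn apply_prefixes(path: &str, strip: Option<&str>, prepend: Option<&str>) -> String {
    // strip_prefix: remove the given prefix if the path starts with it.
    let stripped = match strip {
        Some(prefix) => path.strip_prefix(prefix).unwrap_or(path),
        None => path,
    };
    // prepend_prefix: add the given prefix to the (possibly stripped) path.
    match prepend {
        Some(prefix) => format!("{prefix}{stripped}"),
        None => stripped.to_string(),
    }
}
```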

Glob

#![allow(unused)]
fn main() {
pub enum LayerSpec {
    // ...
    Glob {
        glob: String,
        prefix_options: PrefixOptions,
    },
    // ...
}
}

The Glob layer type will include the files specified by the glob pattern in the layer. The glob pattern is executed by the client relative to the project directory. The glob pattern must use relative paths. The globset crate is used for glob pattern matching.

The prefix_options are applied to every matching path, as described above.

Paths

#![allow(unused)]
fn main() {
pub enum LayerSpec {
    // ...
    Paths {
        paths: Vec<Utf8PathBuf>,
        prefix_options: PrefixOptions,
    },
    // ...
}
}

The Paths layer type will include each file referenced by the specified paths. This is executed by the client relative to the project directory. Relative and absolute paths may be used.

The prefix_options are applied to every matching path, as described above.

If a path points to a file, the file is included in the layer. If the path points to a symlink, either the symlink or the pointed-to-file gets included, depending on prefix_options.follow_symlinks. If the path points to a directory, an empty directory is included.

To include a directory and all of its contents, use the Glob layer type.

Stubs

#![allow(unused)]
fn main() {
pub enum LayerSpec {
    // ...
    Stubs { stubs: Vec<String> },
    // ...
}
}

The Stubs layer type is used to create empty files and directories, usually so that they can be mount points for devices or mounts.

If a string contains the { character, the bracoxide crate is used to perform brace expansion, transforming the single string into multiple strings.

If a string ends in /, an empty directory will be added to the layer. Otherwise, an empty file will be added to the layer. Any parent directories will also be created as necessary.

For example, the set of stubs ["/dev/{null,zero}", "/{proc,tmp}/", "/usr/bin/"] would result in a layer with the following files and directories:

  • /dev/
  • /dev/null
  • /dev/zero
  • /proc/
  • /tmp/
  • /usr/
  • /usr/bin/
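The expansion step can be sketched as follows. This handles only a single `{a,b,...}` group per string; the real implementation uses the bracoxide crate and is more general:

```rust
// Minimal sketch of the brace expansion used for stubs, handling a
// single {a,b,...} group per string. The real implementation uses the
// bracoxide crate and supports more general patterns.
fn expand(stub: &str) -> Vec<String> {
    match (stub.find('{'), stub.find('}')) {
        (Some(open), Some(close)) if open < close => {
            let (head, rest) = (&stub[..open], &stub[close + 1..]);
            // Each comma-separated alternative yields one expanded string.
            stub[open + 1..close]
                .split(',')
                .map(|alt| format!("{head}{alt}{rest}"))
                .collect()
        }
        // No braces: the string expands to itself.
        _ => vec![stub.to_string()],
    }
}
```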
Symlinks

#![allow(unused)]
fn main() {
pub enum LayerSpec {
    // ...
    Symlinks { symlinks: Vec<SymlinkSpec> },
}

pub struct SymlinkSpec {
    pub link: Utf8PathBuf,
    pub target: Utf8PathBuf,
}
}

The Symlinks layer is used to create symlinks. The specified links will be created, pointing to the specified targets. Any parent directories will also be created, as necessary.

SharedLibraryDependencies

#![allow(unused)]
fn main() {
pub enum LayerSpec {
    // ...
    SharedLibraryDependencies {
        binary_paths: Vec<Utf8PathBuf>,
        prefix_options: PrefixOptions,
    },
}
}

The SharedLibraryDependencies layer is used to include the shared libraries required to run some binaries. The given paths to local binaries are inspected, and the closure of shared libraries they rely on is included in the layer. This set of libraries includes libc and the dynamic linker. Layer creation will fail if any of the binaries have missing libraries. The layer does not include the binaries themselves; to include them, use a separate layer.

The prefix_options are applied to the paths to the shared libraries, as described above.

cargo-maelstrom

cargo-maelstrom is a replacement for cargo test which runs tests in lightweight containers, either locally or on a distributed cluster. Since each test runs in its own container, it is isolated from the computer it is running on and from other tests.

cargo-maelstrom is designed to be run as a custom Cargo subcommand. One can either run it as cargo-maelstrom or as cargo maelstrom.

For a lot of projects, cargo-maelstrom will run all tests successfully right out of the box. Some tests, though, have external dependencies that cause them to fail when run in cargo-maelstrom's default, stripped-down containers. When this happens, it's usually pretty easy to configure cargo-maelstrom so that it invokes the test in a container that contains all of the necessary dependencies. The Job Specification chapter goes into detail about how to do so.

Test Filter Patterns

There are times when a user needs to concisely specify a set of tests to cargo-maelstrom. One of those is on the command line: cargo-maelstrom can be told to only run a certain set of tests, or to exclude some tests. Another is the filter field of cargo-maelstrom.toml directives. This is used to choose which tests a directive applies to.

In order to allow users to easily specify a set of tests to cargo-maelstrom, we created the domain-specific pattern language described here.

If you are a fan of formal explanations check out the BNF. Otherwise, this page will attempt to give a more informal explanation of the language.

Simple Selectors

The most basic patterns are "simple selectors". These are only sometimes useful on their own, but they become more powerful when combined with other patterns. Simple selectors consist solely of one of these identifiers:

Simple Selector    What it Matches
true, any, all     any test
false, none        no test
library            any test in a library crate
binary             any test in a binary crate
benchmark          any test in a benchmark crate
example            any test in an example crate
test               any test in a test crate

Simple selectors can optionally be followed by (). That is, library() and library are equivalent patterns.

Compound Selectors

"Compound selector patterns" are patterns like package.equals(foo). They combine "compound selectors" with "matchers" and "arguments". In our example, package is the compound selector, equals is the matcher, and foo is the argument.

These are the possible compound selectors:

Compound Selector    Selected Name
name                 the name of the test
package              the name of the test's package
binary               the name of the test's binary target
benchmark            the name of the test's benchmark target
example              the name of the test's example target
test                 the name of the test's (integration) test target

Documentation on the various types of targets in cargo can be found here.

These are the possible matchers:

Matcher        Matches If Selected Name...
equals         exactly equals argument
contains       contains argument
starts_with    starts with argument
ends_with      ends with argument
matches        matches argument evaluated as regular expression
globs          matches argument evaluated as glob pattern

Compound selectors and matchers are separated by . characters. Arguments are contained within delimiters, which must be a matched pair:

Left    Right
(       )
[       ]
{       }
<       >
/       /

The compound selectors binary, benchmark, example, and test will only match if the test is from a target of the specified type and the target's name matches. In other words, binary.equals(foo) can be thought of as shorthand for the compound pattern (binary && binary.equals(foo)).

Let's put this all together with some examples:

Pattern                             What it Matches
name.equals(foo::tests::my_test)    Any test named "foo::tests::my_test".
binary.contains/maelstrom/          Any test in a binary crate, where the executable's name contains the substring "maelstrom".
package.matches{(foo)*bar}          Any test whose package name matches the regular expression (foo)*bar.

Compound Expressions

Selectors can be joined together with operators to create compound expressions. These operators are:

Operators      Action
!, ~, not      Logical Not
&, &&, and     Logical And
|, ||, or      Logical Or
\, -, minus    Logical Difference
(, )           Grouping

The "logical difference" action is defined as follows: A - B == A && !B.

As an example, to select tests named foo or bar in package baz:

(name.equals(foo) || name.equals(bar)) && package.equals(baz)

As another example, to select tests named bar in package baz or tests named foo from any package:

name.equals(foo) || (name.equals(bar) && package.equals(baz))

Abbreviations

Selector and matcher names can be shortened to any unambiguous prefix.

For example, the following are all the same:

name.equals(foo)
name.eq(foo)
n.eq(foo)

We can abbreviate name to n since no other selector starts with "n", but we can't abbreviate equals to e because there is another matcher, ends_with, that also starts with an "e".
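The unambiguous-prefix rule can be sketched as follows; `resolve_abbreviation` is a hypothetical helper for illustration, not part of Maelstrom's API:

```rust
// Sketch of unambiguous-prefix resolution as described above: a prefix
// resolves only if exactly one candidate name starts with it.
fn resolve_abbreviation<'a>(prefix: &str, names: &[&'a str]) -> Option<&'a str> {
    let mut matches = names.iter().filter(|name| name.starts_with(prefix));
    match (matches.next(), matches.next()) {
        // Exactly one match: the abbreviation is unambiguous.
        (Some(name), None) => Some(*name),
        // Zero matches or more than one: the abbreviation is invalid.
        _ => None,
    }
}
```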

Test Pattern DSL BNF

Included on this page is the Backus-Naur form notation for the DSL.

pattern                := or-expression
or-expression          := and-expression
                       |  or-expression or-operator and-expression
or-operator            := "|" | "||" | "or"
and-expression         := not-expression
                       |  and-expression and-operator not-expression
                       |  and-expression diff-operator not-expression
and-operator           := "&" | "&&" | "and" | "+"
diff-operator          := "\" | "-" | "minus"
not-expression         := simple-expression
                       |  not-operator not-expression
not-operator           := "!" | "~" | "not"
simple-expression      := "(" or-expression ")"
                       |  simple-selector
                       |  compound-selector
simple-selector        := simple-selector-name
                       |  simple-selector-name "(" ")"
simple-selector-name   := "all" | "any" | "true"
                       |  "none" | "false"
                       |  "library"
                       |  compound-selector-name
compound-selector      := compound-selector-name "." matcher-name matcher-parameter
compound-selector-name := "name" | "binary" | "benchmark" | "example" |
                          "test" | "package"
matcher-name           := "equals" | "contains" | "starts_with" | "ends_with" |
                          "matches" | "globs"
matcher-parameter      := <punctuation mark followed by characters followed by
                           matching punctuation mark>

Job Specification: cargo-maelstrom.toml

The file cargo-maelstrom.toml in the workspace root is used to specify to cargo-maelstrom what job specifications are used for which tests.

This chapter describes the format of that file and how it is used to set the job spec fields described here.

Default Configuration

If there is no cargo-maelstrom.toml in the workspace root, then cargo-maelstrom will run with the following defaults:

# Because it has no `filter` field, this directive applies to all tests.
[[directives]]

# This layer just includes files and directories for mounting the following
# file-systems and devices.
layers = [
    { stubs = [ "/{proc,sys,tmp}/", "/dev/{full,null,random,urandom,zero}" ] },
]

# Provide /tmp, /proc, /sys, and some devices in /dev/. These are used pretty
# commonly by tests.
mounts = [
    { type = "tmp", mount_point = "/tmp" },
    { type = "proc", mount_point = "/proc" },
    { type = "sys", mount_point = "/sys" },
    { type = "devices", devices = ["full", "null", "random", "urandom", "zero"] },
]

# Forward the RUST_BACKTRACE and RUST_LIB_BACKTRACE environment variables.
# Later directives can override the `environment` key, but the `added_environment` key is only
# additive. By using it here we ensure it applies to all tests regardless of other directives.
[directives.added_environment]
RUST_BACKTRACE = "$env{RUST_BACKTRACE:-0}"
RUST_LIB_BACKTRACE = "$env{RUST_LIB_BACKTRACE:-0}"

Initializing cargo-maelstrom.toml

It's likely that at some point you'll need to adjust the job specs for some tests. At that point, you're going to need an actual cargo-maelstrom.toml. Instead of starting from scratch, you can have cargo-maelstrom create one for you:

cargo maelstrom --init

This will create a cargo-maelstrom.toml file and then exit; if one already exists, it is left unchanged. The resulting cargo-maelstrom.toml will match the default configuration. It will also include some commented-out examples that may be useful.

Directives

The cargo-maelstrom.toml file consists of a list of "directives" which are applied in order. Each directive has some optional fields, one of which may be filter. To compute the job spec for a test, cargo-maelstrom starts with a default spec, then iterates over all the directives in order. If a directive's filter matches the test, the directive is applied to the test's job spec. Directives without a filter apply to all tests. When it reaches the end of the configuration, it pushes one or two more layers containing the test executable, and optionally all shared library dependencies (see here for details). The job spec is then used for the test.

There is no way to short-circuit the application of directives. Instead, filters can be used to limit the scope of a given directive.

To specify a list of directives in TOML, we use the [[directives]] syntax. Each [[directives]] line starts a new directive. For example, this snippet specifies two directives:

[[directives]]
include_shared_libraries = true

[[directives]]
filter = "package.equals(maelstrom-util) && name.equals(io::splicer)"
added_mounts = [{ type = "proc", mount_point = "/proc" }]
added_layers = [{ stubs = [ "proc/" ] }]

The first directive applies to all tests, since it has no filter. It sets the include_shared_libraries pseudo-field in the job spec. The second directive only applies to a single test named io::splicer in the maelstrom-util package. It adds a layer and a mount to that test's job spec.

Directive Fields

This chapter specifies all of the possible fields for a directive. Most, but not all, of these fields have an obvious mapping to job-spec fields.

filter

This field must be a string, which is interpreted as a test filter pattern. The directive only applies to tests that match the filter. If there is no filter field, the directive applies to all tests.

Sometimes it is useful to use multi-line strings for long patterns:

[[directives]]
filter = """
package.equals(maelstrom-client) ||
package.equals(maelstrom-client-process) ||
package.equals(maelstrom-container) ||
package.equals(maelstrom-fuse) ||
package.equals(maelstrom-util)"""
layers = [{ stubs = ["/tmp/"] }]
mounts = [{ type = "tmp", mount_point = "/tmp" }]

include_shared_libraries

[[directives]]
include_shared_libraries = true

This boolean field sets the include_shared_libraries job spec pseudo-field. We call it a pseudo-field because it's not a real field in the job spec, but instead determines how cargo-maelstrom will do its post-processing after computing the job spec from directives.

In post-processing, if the include_shared_libraries pseudo-field is false, cargo-maelstrom will only push a single layer onto the job spec. This layer will contain the test executable, placed in the root directory.

On the other hand, if the pseudo-field is true, then cargo-maelstrom will push two layers onto the job spec. The first will be a layer containing all of the shared-library dependencies for the test executable. The second will contain the test executable, placed in the root directory. (Two layers are used so that the shared-library layer can be cached and used by other tests.)

If the pseudo-field is never set one way or the other, then cargo-maelstrom will choose a value based on the image field of the job spec. In this case, include_shared_libraries will be true if and only if image is not specified.

You usually want this pseudo-field to be true, unless you're using a container image for your tests. In that case, you probably want to use the shared libraries included with the container image, not those from the system running the tests.

image

Sometimes it makes sense to build your test's container from an OCI container image. For example, when we do integration tests of cargo-maelstrom, we want to run in an environment with cargo installed.

This is what the image field is for. It is used to set the job spec's image field.

[[directives]]
filter = "package.equals(cargo-maelstrom)"
image.name = "docker://rust"
image.use = ["layers", "environment"]

The image field may either be a string or a table. If it's a string, then it's assumed to be the URI of the image to use, as documented here. In this case, the job spec will have use_layers and use_environment both set to true.

If the image field is a table, then it must have a name subfield and optionally may have a use subfield.

The name sub-field must be a string. It specifies the URI of the image to use, as documented here.

The use sub-field must be a list of strings specifying what parts of the container image to use for the job spec. It must contain a non-empty subset of:

  • layers: This sets the use_layers field in the job spec's image value.
  • environment: This sets the use_environment field in the job spec's image value.
  • working_directory: This sets the use_working_directory field in the job spec's image value.

If the use sub-field isn't specified, then the job spec will have use_layers and use_environment both set to true.

For example, the following directives all have semantically equivalent image fields:

[[directives]]
filter = "package.equals(package-1)"
image.name = "docker://rust"
image.use = ["layers", "environment"]

[[directives]]
filter = "package.equals(package-2)"
image = { name = "docker://rust", use = ["layers", "environment"] }

[[directives]]
filter = "package.equals(package-3)"
image.name = "docker://rust"

[[directives]]
filter = "package.equals(package-4)"
image = { name = "docker://rust" }

[[directives]]
filter = "package.equals(package-5)"
image = "docker://rust"

layers

[[directives]]
layers = [
    { tar = "layers/foo.tar" },
    { paths = ["layers/a/b.bin", "layers/a/c.bin"], strip_prefix = "layers/a/" },
    { glob = "layers/b/**", strip_prefix = "layers/b/" },
    { stubs = ["/dev/{null,full}", "/proc/"] },
    { symlinks = [{ link = "/dev/stdout", target = "/proc/self/fd/1" }] },
    { shared-library-dependencies = ["/bin/bash"], prepend_prefix = "/usr" }
]

This field provides an ordered list of layers for the job spec's layers field.

Each element of the list must be a table with one of the following keys:

  • tar: The value must be a string, indicating the local path of the tar file. This is used to create a tar layer.
  • paths: The value must be a list of strings, indicating the local paths of the files and directories to include to create a paths layer. It may also include fields from prefix_options (see below).
  • glob: The value must be a string, indicating the glob pattern to use to create a glob layer. It may also include fields from prefix_options (see below).
  • stubs: The value must be a list of strings. These strings are optionally brace-expanded and used to create a stubs layer.
  • symlinks: The value must be a list of tables of link/target pairs. These strings are used to create a symlinks layer.
  • shared-library-dependencies: The value must be a list of strings, indicating local paths of binaries. This layer includes the set of shared libraries the binaries depend on. This includes libc and the dynamic linker. This doesn't include the binaries themselves.

If the layer is a paths, glob, or shared-library-dependencies layer, then the table can also include fields from prefix_options, such as the strip_prefix, prepend_prefix, and canonicalize fields used in the examples in this chapter.

For example:

[[directives]]
layers = [
    { paths = ["layers"], strip_prefix = "layers/", prepend_prefix = "/usr/share/" },
]

This would create a layer containing all of the files and directories (recursively) in the local layers subdirectory, mapping local file layers/example to /usr/share/example in the test's container.

This field can't be set in the same directive as image if the image.use contains "layers".

Path Templating

Anywhere a path is accepted in a layer, certain template variables can be used. These variables are replaced with their corresponding values wherever they appear in a path. Template variables are surrounded by < and >. A leading < can be escaped by doubling it (<<) in cases where it precedes a valid template-variable expression but no substitution is desired.

The following template variables are valid for cargo-maelstrom:

  • <build-path> The path to the directory where cargo stores build output for the current profile.

As an example, suppose you have an integration test for a binary named foo and want access to that binary when running the test. Cargo will provide the CARGO_BIN_EXE_foo environment variable at compile time which expands to the absolute path to foo. If we want to execute it in the test though, we have to include foo in a layer at the right place.

[[directives]]
filter = "test.equals(foo_integration_test)"
layers = [
    { paths = ["<build-path>/foo"], canonicalize = true }
]

added_layers

This field is like layers, except it appends to the job spec's layers field instead of replacing it.

This field can be used in the same directive as an image.use that contains "layers". For example:

[[directives]]
image.name = "cool-image"
image.use = ["layers"]
added_layers = [
    { paths = [ "extra-layers" ], strip_prefix = "extra-layers/" },
]

This directive uses the layers from "cool-image", but with the contents of the local extra-layers directory added in as well.

environment

[[directives]]
environment = { USER = "bob", RUST_BACKTRACE = "$env{RUST_BACKTRACE:-0}" }

This field sets the environment field of the job spec. It must be a table with string values. It supports two forms of $ expansion within those string values:

  • $env{FOO} evaluates to the value of cargo-maelstrom's FOO environment variable.
  • $prev{FOO} evaluates to the previous value of FOO for the job spec.

It is an error if the referenced variable doesn't exist. However, you can use :- to provide a default value:

FOO = "$env{FOO:-bar}"

This will set FOO to whatever cargo-maelstrom's FOO environment variable is, or to "bar" if cargo-maelstrom doesn't have a FOO environment variable.

This field can't be set in the same directive as image if the image.use contains "environment".

added_environment

This field is like environment, except it updates the job spec's environment field instead of replacing it.

When this is provided in the same directive as the environment field, the added_environment gets evaluated after the environment field. For example:

[[directives]]
environment = { VAR = "foo" }

[[directives]]
environment = { VAR = "bar" }
added_environment = { VAR = "$prev{VAR}" }

In this case, VAR will be "bar", not "foo".

This field can be used in the same directive as an image.use that contains "environment". For example:

[[directives]]
image = { name = "my-image", use = [ "layers", "environment" ] }
added_environment = { PATH = "/scripts:$prev{PATH}" }

This prepends "/scripts" to the PATH provided by the image without changing any of the other environment variables.

mounts

[[directives]]
mounts = [
    { type = "bind", mount_point = "/mnt", local_path = "data-for-job", read_only = true },
    { type = "devices", devices = [ "full", "fuse", "null", "random", "shm", "tty", "urandom", "zero" ] },
    { type = "devpts", mount_point = "/dev/pts" },
    { type = "mqueue", mount_point = "/dev/mqueue" },
    { type = "proc", mount_point = "/proc" },
    { type = "sys", mount_point = "/sys" },
    { type = "tmp", mount_point = "/tmp" },
]

This field sets the mounts field of the job spec. It must be a list of tables, each of which is a TOML translation of the corresponding job-spec type.

added_mounts

This field is like mounts, except it appends to the job spec's mounts field instead of replacing it.
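For example, the following hypothetical pair of directives gives every test a /tmp mount, then adds a /proc mount for a single test without discarding the /tmp mount (the test name in the filter is illustrative):

[[directives]]
mounts = [{ type = "tmp", mount_point = "/tmp" }]

[[directives]]
filter = "test.equals(needs_proc)"
added_mounts = [{ type = "proc", mount_point = "/proc" }]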

working_directory

[[directives]]
working_directory = "/home/root/"

This field sets the working_directory field of the job spec. It must be a string.

This field can't be set in the same directive as image if the image.use contains "working_directory".

network

[[directives]]
network = "loopback"

This field sets the network field of the job spec. It must be a string. It defaults to "disabled".

enable_writable_file_system

[[directives]]
enable_writable_file_system = true

This field sets the enable_writable_file_system field of the job spec. It must be a boolean.

user

[[directives]]
user = 1000

This field sets the user field of the job spec. It must be an unsigned, 32-bit integer.

group

[[directives]]
group = 1000

This field sets the group field of the job spec. It must be an unsigned, 32-bit integer.

timeout

[[directives]]
timeout = 60

This field sets the timeout field of the job spec. It must be an unsigned, 32-bit integer.
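For example, a longer timeout can be given to just the tests matching a filter (the package name here is illustrative):

[[directives]]
filter = "package.equals(integration-tests)"
timeout = 600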

ignore

[[directives]]
ignore = true

This field specifies that any tests matching the directive should not be run.
When tests are run, ignored tests are displayed with a special "ignored" state.
When tests are listed, ignored tests are listed normally.
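For example, a hypothetical known-broken test can be skipped while still showing up with the "ignored" state in the results:

[[directives]]
filter = "test.equals(known_broken_test)"
ignore = true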

Files in Target Directory

cargo-maelstrom stores a number of files in the workspace's target directory, under the maelstrom subdirectory. This chapter lists them and explains what they're for.

It is safe to remove this directory whenever cargo-maelstrom isn't running.

Except in the case of the local worker, cargo-maelstrom doesn't currently make any effort to clean up these files. However, the total space consumed by these files should be pretty small.

cargo-maelstrom also uses the container-images cache. That cache is not stored in the target directory, as it can be shared by different Maelstrom clients.

Local Worker

The local worker stores its cache in maelstrom/cache/local-worker/ in the target directory. The cache-size configuration value indicates the target size of this cache directory.

Manifest Files

cargo-maelstrom uses "manifest files" for non-tar layers. These are like tar files, but without the actual data contents. These files are stored in maelstrom/cache/manifests/ in the target directory.

File Digests

Files uploaded to the broker are identified by a hash of their file contents. Calculating these hashes can be time-consuming, so cargo-maelstrom caches this information. This cache is stored in maelstrom/cache/cached-digests.toml in the target directory.

Client Log File

The local client process — the one that cargo-maelstrom talks to, and that contains the local worker — has a log file that is stored at maelstrom/state/client-process.log in the target directory.

Test Listing

When cargo-maelstrom finishes, it updates a list of all of the tests in the workspace, and how long they took to run. This is used to predict the number of tests that will be run in subsequent invocations, as well as how long they will take. This is stored in the maelstrom/state/test-listing.toml file in the target directory.

Test Execution Order

Maelstrom doesn't execute tests in a random order. Instead, it tries to execute them in an order that will be helpful to the user.

Understanding Priorities

Maelstrom assigns every test a priority. When there is a free slot and available tests, Maelstrom will choose the available test with the highest priority.

Tests may not be available for a variety of reasons. The test's binary may not have been compiled yet, the test's required container image may not have been downloaded yet, or the test's required artifacts may not have been uploaded yet.

If a test becomes available when there are no free slots, it does not preempt any running tests, even if it has a higher priority than all of them. Instead, it will be chosen first the next time a slot becomes available.

New Tests and Tests that Failed Previously

A test's priority consists of two parts. The first, and more important, part is whether the test is new or failed the last time it was run. The logic here is that the user is probably most interested in finding out the outcomes of these tests. New tests and tests that failed the last time they were run have the same priority.

A test is considered new if Maelstrom has no record of it executing. If the test listing file in the state directory has been removed, then Maelstrom will consider every test to be new. If a test, its artifacts, or its package is renamed, it is also considered new.

A test is considered to have failed the last time it was run if there was even one failure. This is relevant when the --repeat configuration value is set. For example, if --repeat=1000 is passed, and the test passes 999 times and fails just once, it is still considered to have failed.

Estimated Duration and LPT Scheduling

The second part of a test's priority is its estimated duration. In the test listing file in the state directory, Maelstrom keeps track of the recent running times of the test. It uses this to guess how long the test will take to execute. Tests that are expected to run the longest are scheduled first.

Using the estimated duration to set a test's priority means that Maelstrom uses longest-processing-time-first (LPT) scheduling. Finding optimal scheduling orders is an NP-hard problem, but LPT scheduling provides a good approximation. With LPT scheduling, the overall runtime will never be more than 133% of the optimal runtime.

Configuration Values

cargo-maelstrom supports the following configuration values:

Value                                      Type     Description                                           Default
cache-size                                 string   target cache disk space usage                         "1 GB"
inline-limit                               string   maximum amount of captured standard output and error  "1 MB"
slots                                      number   job slots available                                   1 per CPU
container-image-depot-root                 string   container images cache directory                      $XDG_CACHE_HOME/maelstrom/containers
accept-invalid-remote-container-tls-certs  boolean  allow invalid container registry certificates         false
broker                                     string   address of broker                                     standalone mode
log-level                                  string   minimum log level                                     "info"
quiet                                      boolean  don't output per-test information                     false
ui                                         string   UI style to use                                       "auto"
repeat                                     number   how many times to run each test                       1
timeout                                    string   override timeout value for tests                      don't override
features                                   string   comma-separated list of features to activate          Cargo's default
all-features                               boolean  activate all available features                       Cargo's default
no-default-features                        boolean  do not activate the default feature                   Cargo's default
profile                                    string   build artifacts with the specified profile            Cargo's default
target                                     string   build for the target triple                           Cargo's default
target-dir                                 string   directory for all generated artifacts                 Cargo's default
manifest-path                              string   path to Cargo.toml                                    Cargo's default
frozen                                     boolean  require Cargo.lock and cache are up to date           Cargo's default
locked                                     boolean  require Cargo.lock is up to date                      Cargo's default
offline                                    boolean  run without Cargo accessing the network               Cargo's default
extra-test-binary-args                     list     pass arbitrary arguments to test binary               no args
stop-after                                 number   stop after given number of failures                   never stop

cache-size

This is a local-worker setting, common to all clients. See here for details.

inline-limit

This is a local-worker setting, common to all clients. See here for details.

slots

This is a local-worker setting, common to all clients. See here for details.

container-image-depot-root

This is a container-image setting, common to all clients. See here for details.

accept-invalid-remote-container-tls-certs

This is a container-image setting, common to all clients. See here for details.

broker

The broker configuration value specifies the socket address of the broker. This configuration value is optional. If not provided, cargo-maelstrom will run in standalone mode.

Here are some example value socket addresses:

  • broker.example.org:1234
  • 192.0.2.3:1234
  • [2001:db8::3]:1234

log-level

This is a setting common to all Maelstrom programs. See here for details.

cargo-maelstrom always prints log messages to stdout. It also passes the log level to maelstrom-client, which will log its output in a file named client-process.log in the state directory.

quiet

The quiet configuration value, if set to true, causes cargo-maelstrom to be more succinct with its output. If cargo-maelstrom is outputting to a terminal, it will display a single-line progress bar indicating all test state, then print a summary at the end. If not outputting to a terminal, it will only print a summary at the end.

ui

The ui configuration value controls the UI style used. It must be one of auto, fancy, quiet, or simple. The default value is auto.

Style   Description
simple  This is our original UI. It prints one line per test result (unless quiet is true), and will display some progress bars if standard output is a TTY.
fancy   This is our new UI. It has a rich TTY experience with a lot of status updates. It is incompatible with quiet or with non-TTY standard output.
quiet   Minimal UI with only a single progress bar.
auto    Will choose fancy if standard output is a TTY and quiet isn't true. Otherwise, it will choose simple.

repeat

The repeat configuration value specifies how many times each test will be run. It must be a nonnegative integer. On the command line, --loop can be used as an alias for --repeat.

timeout

The optional timeout configuration value provides the timeout value to use for all tests. This will override any value set in cargo-maelstrom.toml.

Cargo Settings

cargo-maelstrom shells out to cargo to get metadata about tests and to build the test artifacts. For the former, it uses cargo metadata. For the latter, it uses cargo test --no-run.

cargo-maelstrom supports a number of command-line options that are passed through directly to cargo. It does not inspect these values at all.

Command-Line Option  Cargo Grouping     Passed To
features             feature selection  test and metadata
all-features         feature selection  test and metadata
no-default-features  feature selection  test and metadata
profile              compilation        test
target               compilation        test
target-dir           output             test
manifest-path        manifest           test and metadata
frozen               manifest           test and metadata
locked               manifest           test and metadata
offline              manifest           test and metadata

cargo-maelstrom doesn't accept multiple instances of the --features command-line option. Instead, combine the features into a single, comma-separated argument like this: --features=feat1,feat2,feat3.

cargo-maelstrom doesn't accept the --release alias. Use --profile=release instead.

extra-test-binary-args

This allows passing of arbitrary command-line arguments to the Rust test binary. See the help text for a test binary for what options are accepted normally.

These arguments are added last, after --exact and --nocapture but before the name of the test case to run. It is possible for these flags to change which test actually runs, rather than the one cargo-maelstrom intended, producing confusing results.

When provided on the command line, these arguments are positional and come after any other arguments. To avoid ambiguity, -- should be used to denote the end of the normal command-line arguments and the beginning of these arguments, as follows:

cargo maelstrom -- --force-run-in-process

stop-after

This optional configuration value, if provided, gives a limit on the number of failures to tolerate. If the limit is reached, cargo-maelstrom exits prematurely.

Command-Line Options

In addition to the command-line options used to specify configuration values, described in the previous chapter, cargo-maelstrom supports these command-line options:

--include and --exclude

The --include (-i) and --exclude (-x) command-line options control which tests cargo-maelstrom runs or lists.

These options take a test filter pattern. The --include option includes any test that matches its pattern. Similarly, the --exclude option excludes any test that matches its pattern. Both options may be repeated arbitrarily.

The tests that are selected are the set which match any --include pattern but don't match any --exclude pattern. In other words, --excludes have precedence over --includes, regardless of the order they are specified.

If no --include option is provided, cargo-maelstrom acts as if an --include all option was provided.

--init

The --init command-line option is used to create a starter cargo-maelstrom.toml file. See here for more information.

--list-tests or --list

The --list-tests (or --list) command-line option causes cargo-maelstrom to build all required test binaries, then print the tests that would normally be run, without actually running them.

This option can be combined with --include and --exclude.

--list-binaries

The --list-binaries command-line option causes cargo-maelstrom to print the names and types of the crates that it would run tests from, without actually building any binaries or running any tests.

This option can be combined with --include and --exclude.

--list-packages

The --list-packages command-line option causes cargo-maelstrom to print the packages from which it would run tests, without actually building any binaries or running any tests.

This option can be combined with --include and --exclude.

Working with Workspaces

When you specify a filter with a package, cargo-maelstrom will only build the matching packages. This can be a useful tip to remember when trying to run a single test.

If we were to run something like:

cargo maelstrom --include "name.equals(foobar)"

cargo-maelstrom would run any test which has the name "foobar". A test with this name could be found in any of the packages in the workspace, so it is forced to build all of them. But if we happened to know that only one package has this test — the baz package — it would be faster to instead run:

cargo maelstrom --include "package.equals(baz) && name.equals(foobar)"

Since we specified that we only care about the baz package, cargo-maelstrom will only bother to build that package.

Abbreviations

As discussed here, unambiguous prefixes can be used in patterns. This can come in handy when doing one-offs on the command line. For example, the example above could be written like this instead:

cargo maelstrom -i 'p.eq(baz) & n.eq(foobar)'

maelstrom-go-test

maelstrom-go-test is a replacement for go test which runs tests in lightweight containers, either locally or on a distributed cluster. Since each test runs in its own container, it is isolated from the computer it is running on and from other tests.

For a lot of projects, maelstrom-go-test will run all tests successfully right out of the box. Some tests, though, have external dependencies that cause them to fail when run in maelstrom-go-test's default, stripped-down containers. When this happens, it's usually pretty easy to configure maelstrom-go-test so that it invokes the test in a container that contains all of the necessary dependencies. The Job Specification chapter goes into detail about how to do so.

Running Tests

As described here, maelstrom-go-test finds the project directory by proceeding up the directory tree until a go.mod file is found. This means that maelstrom-go-test will run all the tests for the main module, even if it is invoked in a subdirectory of the module. If you want to restrict which tests are run, use the --include and --exclude options.

Similar to go test, maelstrom-go-test won't run any tests from nested modules. To run those tests, invoke it from the nested module's root directory or one of its subdirectories.

Tests which call t.Skip are labeled as IGNORED.

Currently we don't support go's test caching, coverage instrumentation, profiling, or benchmarking.

We run fuzz tests and examples in the same way that go test does (without any special options). If you have corpus files, make sure you copy them into the test container using the Job Specification.

maelstrom-go-test doesn't currently do fuzzing beyond the included corpus entries.

Test Filter Patterns

There are times when a user needs to concisely specify a set of tests to maelstrom-go-test. One of those is on the command line: maelstrom-go-test can be told to only run a certain set of tests, or to exclude some tests. Another is the filter field of maelstrom-go-test.toml directives. This is used to choose which tests a directive applies to.

In order to allow users to easily specify a set of tests to maelstrom-go-test, we created the domain-specific pattern language described here.

If you are a fan of formal explanations, check out the BNF. Otherwise, this page will attempt to give a more informal explanation of the language.

Simple Selectors

The most basic patterns are "simple selectors". These are only sometimes useful on their own, but they become more powerful when combined with other patterns. Simple selectors consist solely of one of these identifiers:

Simple Selector  What it Matches
true, any, all   any test
false, none      no test

Simple selectors can optionally be followed by (). That is, all() and all are equivalent patterns.

Compound Selectors

"Compound selector patterns" are patterns like package.equals(foo). They combine "compound selectors" with "matchers" and "arguments". In our example, package is the compound selector, equals is the matcher, and foo is the argument.

These are the possible compound selectors:

Compound Selector    Selected Name
name                 the name of the test
package_import_path  the import-path of the test's package
package_path         trailing part of the import-path after the module's name
package_name         name the package uses in its package declaration

See below for more details on the various ways to specify a package.

These are the possible matchers:

Matcher      Matches If Selected Name...
equals       exactly equals argument
contains     contains argument
starts_with  starts with argument
ends_with    ends with argument
matches      matches argument evaluated as regular expression
globs        matches argument evaluated as glob pattern

Compound selectors and matchers are separated by . characters. Arguments are contained within delimiters, which must be a matched pair:

Left  Right
(     )
[     ]
{     }
<     >
/     /

Let's put this all together with some examples:

Pattern                     What it Matches
name.equals(foo_test)       Any test named "foo_test".
package.matches{(foo)*bar}  Any test whose package import-path matches the regular expression (foo)*bar.

Compound Expressions

Selectors can be joined together with operators to create compound expressions. These operators are:

Operators    Action
!, ~, not    Logical Not
&, &&, and   Logical And
|, ||, or    Logical Or
\, -, minus  Logical Difference
(, )         Grouping

The "logical difference" action is defined as follows: A - B == A && !B.
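For example, by that definition, either of the following equivalent patterns selects every test except those in the (hypothetical) package baz:

all - package.equals(baz)
all && !package.equals(baz)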

As an example, to select tests named foo or bar in package baz:

(name.equals(foo) || name.equals(bar)) && package.equals(baz)

As another example, to select tests named bar in package baz or tests named foo from any package:

name.equals(foo) || (name.equals(bar) && package.equals(baz))

Abbreviations

Selector and matcher names can be shortened to any unambiguous prefix.

For example, the following are all the same

name.equals(foo)
name.eq(foo)
n.eq(foo)

We can abbreviate name to n since no other selector starts with "n", but we can't abbreviate equals to e because there is another matcher, ends_with, that also starts with an "e".

The package_import_path selector name has a special case. Any prefix of package will resolve to package_import_path instead of package_path or package_name.

For example, all of the following resolve to package_import_path

package_import_path.equals(foo)
package_i.equals(foo)
package.equals(foo)
p.equals(foo)

Specifying Packages

What exactly counts as a "package name" in Go can be a bit confusing. We therefore support three different ways to specify a package. To illustrate the various ways, imagine that we have a module called github.org/maelstrom-software/maelstrom. Inside of that module there is a subdirectory called client, and inside of that, there is another subdirectory called rpc. To confuse things, all the .go files in rpc start with package client_rpc. In this case, the tests in this directory would all have the following values:

Selector             Value
package_import_path  github.org/maelstrom-software/maelstrom/client/rpc
package_path         client/rpc
package_name         client_rpc

The package_path for the root of a module will be the empty string.
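For example, either of the following patterns would select all the tests in that directory:

package_path.equals(client/rpc)
package_name.equals(client_rpc)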

Test Pattern DSL BNF

Included on this page is the Backus-Naur form notation for the DSL:

pattern                := or-expression
or-expression          := and-expression
                       |  or-expression or-operator and-expression
or-operator            := "|" | "||" | "or"
and-expression         := not-expression
                       |  and-expression and-operator not-expression
                       |  and-expression diff-operator not-expression
and-operator           := "&" | "&&" | "and" | "+"
diff-operator          := "\" | "-" | "minus"
not-expression         := simple-expression
                       |  not-operator not-expression
not-operator           := "!" | "~" | "not"
simple-expression      := "(" or-expression ")"
                       |  simple-selector
                       |  compound-selector
simple-selector        := simple-selector-name
                       |  simple-selector-name "(" ")"
simple-selector-name   := "all" | "any" | "true"
                       |  "none" | "false"
compound-selector      := compound-selector-name "." matcher-name matcher-parameter
compound-selector-name := "name" | "package_import_path" | "package_path" | "package_name"
matcher-name           := "equals" | "contains" | "starts_with" | "ends_with" |
                          "matches" | "globs"
matcher-parameter      := <punctuation mark followed by characters followed by
                           matching punctuation mark>

Job Specification: maelstrom-go-test.toml

The file maelstrom-go-test.toml in the current directory is used to specify to maelstrom-go-test what job specifications are used for which tests.

This chapter describes the format of that file and how it is used to set the job spec fields described here.

Default Configuration

If there is no maelstrom-go-test.toml in the current directory, then maelstrom-go-test will run with the following defaults:

# Because it has no `filter` field, this directive applies to all tests.
[[directives]]

# This layer just includes files and directories for mounting the following
# file-systems and devices.
layers = [
    { stubs = [ "/{proc,sys,tmp}/", "/dev/{full,null,random,urandom,zero}" ] },
]

# Provide /tmp, /proc, /sys, and some devices in /dev/. These are used pretty
# commonly by tests.
mounts = [
    { type = "tmp", mount_point = "/tmp" },
    { type = "proc", mount_point = "/proc" },
    { type = "sys", mount_point = "/sys" },
    { type = "devices", devices = ["full", "null", "random", "urandom", "zero"] },
]

Initializing maelstrom-go-test.toml

It's likely that at some point you'll need to adjust the job specs for some tests. At that point, you're going to need an actual maelstrom-go-test.toml. Instead of starting from scratch, you can have maelstrom-go-test create one for you:

maelstrom-go-test --init

This will create a maelstrom-go-test.toml file, unless one already exists, then exit. The resulting maelstrom-go-test.toml will match the default configuration. It will also include some commented-out examples that may be useful.

Directives

The maelstrom-go-test.toml file consists of a list of "directives" which are applied in order. Each directive has some optional fields, one of which may be filter. To compute the job spec for a test, maelstrom-go-test starts with a default spec, then iterates over all the directives in order. If a directive's filter matches the test, the directive is applied to the test's job spec. Directives without a filter apply to all tests. When it reaches the end of the configuration, it pushes one layer containing the test executable. The job spec is then used for the test.

There is no way to short-circuit the application of directives. Instead, filters can be used to limit scope of a given directive.

To specify a list of directives in TOML, we use the [[directives]] syntax. Each [[directives]] line starts a new directive. For example, this snippet specifies two directives:

[[directives]]
network = "loopback"

[[directives]]
filter = "package.equals(maelstrom-software.com/maelstrom/util) && name.equals(TestIoSplicer)"
added_mounts = [{ type = "proc", mount_point = "/proc" }]
added_layers = [{ stubs = [ "proc/" ] }]

The first directive applies to all tests, since it has no filter. It sets the network field in the job spec. The second directive only applies to the single test named TestIoSplicer in the maelstrom-software.com/maelstrom/util package. It adds a layer and a mount to that test's job spec.

Directive Fields

This chapter specifies all of the possible fields for a directive. Most, but not all, of these fields have an obvious mapping to job-spec fields.

filter

This field must be a string, which is interpreted as a test filter pattern. The directive only applies to tests that match the filter. If there is no filter field, the directive applies to all tests.

Sometimes it is useful to use multi-line strings for long patterns:

[[directives]]
filter = """
package.equals(maelstrom-client) ||
package.equals(maelstrom-client-process) ||
package.equals(maelstrom-container) ||
package.equals(maelstrom-fuse) ||
package.equals(maelstrom-util)"""
layers = [{ stubs = ["/tmp/"] }]
mounts = [{ type = "tmp", mount_point = "/tmp" }]

image

Sometimes it makes sense to build your test's container from an OCI container image. For example, when we do integration tests of cargo-maelstrom, we want to run in an environment with cargo installed.

This is what the image field is for. It is used to set the job spec's image field.

[[directives]]
filter = "package.equals(cargo-maelstrom)"
image.name = "docker://rust"
image.use = ["layers", "environment"]

The image field may either be a string or a table. If it's a string, then it's assumed to be the URI of the image to use, as documented here. In this case, the job spec will have use_layers and use_environment both set to true.

If the image field is a table, then it must have a name subfield and optionally may have a use subfield.

The name sub-field specifies the name of the image. It must be a string. It specifies the URI of the image to use, as documented here.

The use sub-field must be a list of strings specifying what parts of the container image to use for the job spec. It must contain a non-empty subset of:

  • layers: This sets the use_layers field in the job spec's image value.
  • environment: This sets the use_environment field in the job spec's image value.
  • working_directory: This sets the use_working_directory field in the job spec's image value.

If the use sub-field isn't specified, then the job spec will have use_layers and use_environment both set to true.

For example, the following directives all have semantically equivalent image fields:

[[directives]]
filter = "package.equals(package-1)"
image.name = "docker://rust"
image.use = ["layers", "environment"]

[[directives]]
filter = "package.equals(package-2)"
image = { name = "docker://rust", use = ["layers", "environment"] }

[[directives]]
filter = "package.equals(package-3)"
image.name = "docker://rust"

[[directives]]
filter = "package.equals(package-4)"
image = { name = "docker://rust" }

[[directives]]
filter = "package.equals(package-5)"
image = "docker://rust"

layers

[[directives]]
layers = [
    { tar = "layers/foo.tar" },
    { paths = ["layers/a/b.bin", "layers/a/c.bin"], strip_prefix = "layers/a/" },
    { glob = "layers/b/**", strip_prefix = "layers/b/" },
    { stubs = ["/dev/{null, full}", "/proc/"] },
    { symlinks = [{ link = "/dev/stdout", target = "/proc/self/fd/1" }] },
    { shared-library-dependencies = ["/bin/bash"], prepend_prefix = "/usr" }
]

This field provides an ordered list of layers for the job spec's layers field.

Each element of the list must be a table with one of the following keys:

  • tar: The value must be a string, indicating the local path of the tar file. This is used to create a tar layer.
  • paths: The value must be a list of strings, indicating the local paths of the files and directories to include to create a paths layer. It may also include fields from prefix_options (see below).
  • glob: The value must be a string, indicating the glob pattern to use to create a glob layer. It may also include fields from prefix_options (see below).
  • stubs: The value must be a list of strings. These strings are optionally brace-expanded and used to create a stubs layer.
  • symlinks: The value must be a list of tables of link/target pairs. These strings are used to create a symlinks layer.
  • shared-library-dependencies: The value must be a list of strings, indicating local paths of binaries. This layer includes the set of shared libraries the binaries depend on. This includes libc and the dynamic linker. This doesn't include the binary itself.
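The stubs strings above are "optionally brace-expanded". As a rough sketch, single-level shell-style expansion looks like this in Python (Maelstrom's exact expansion rules may differ in edge cases):

```python
import itertools
import re

def brace_expand(s):
    """Expand {a,b,c} groups, shell-style (illustration only; Maelstrom's
    actual expansion rules may differ in edge cases)."""
    parts = re.split(r"\{([^{}]*)\}", s)
    # Even indices are literal text; odd indices are comma-separated options.
    choices = [
        [p] if i % 2 == 0 else p.split(",")
        for i, p in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]

print(brace_expand("/dev/{null,zero}"))
print(brace_expand("/{proc,sys,tmp}/"))
```

So a single entry such as /dev/{null,zero} stands for several stub files.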

If the layer is a paths, glob, or shared-library-dependencies layer, then the table can also include extra fields that provide the prefix_options, such as the strip_prefix and prepend_prefix fields shown below.

For example:

[[directives]]
layers = [
    { paths = ["layers"], strip_prefix = "layers/", prepend_prefix = "/usr/share/" },
]

This would create a layer containing all of the files and directories (recursively) in the local layers subdirectory, mapping local file layers/example to /usr/share/example in the test's container.
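The path rewriting performed by strip_prefix and prepend_prefix can be sketched in a few lines of Python (an illustration of the mapping described above, not Maelstrom's code):

```python
# Sketch of how strip_prefix and prepend_prefix rewrite a local path
# when building a layer (illustration only).

def map_path(local_path, strip_prefix="", prepend_prefix=""):
    # First strip the leading prefix, then prepend the new one.
    if strip_prefix and local_path.startswith(strip_prefix):
        local_path = local_path[len(strip_prefix):]
    return prepend_prefix + local_path

# Local file layers/example lands at /usr/share/example in the container.
print(map_path("layers/example",
               strip_prefix="layers/",
               prepend_prefix="/usr/share/"))
```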

This field can't be set in the same directive as image if the image.use contains "layers".

added_layers

This field is like layers, except it appends to the job spec's layers field instead of replacing it.

This field can be used in the same directive as an image.use that contains "layers". For example:

[[directives]]
image.name = "cool-image"
image.use = ["layers"]
added_layers = [
    { paths = [ "extra-layers" ], strip_prefix = "extra-layers/" },
]

This directive uses the layers from "cool-image", but with the contents of the local extra-layers directory added in as well.

environment

[[directives]]
environment = { USER = "bob", RUST_BACKTRACE = "$env{RUST_BACKTRACE:-0}" }

This field sets the environment field of the job spec. It must be a table with string values. It supports two forms of $ expansion within those string values:

  • $env{FOO} evaluates to the value of maelstrom-go-test's FOO environment variable.
  • $prev{FOO} evaluates to the previous value of FOO for the job spec.

It is an error if the referenced variable doesn't exist. However, you can use :- to provide a default value:

FOO = "$env{FOO:-bar}"

This will set FOO to whatever maelstrom-go-test's FOO environment variable is, or to "bar" if maelstrom-go-test doesn't have a FOO environment variable.

This field can't be set in the same directive as image if the image.use contains "environment".

added_environment

This field is like environment, except it updates the job spec's environment field instead of replacing it.

When this is provided in the same directive as the environment field, the added_environment gets evaluated after the environment field. For example:

[[directives]]
environment = { VAR = "foo" }

[[directives]]
environment = { VAR = "bar" }
added_environment = { VAR = "$prev{VAR}" }

In this case, VAR will be "bar", not "foo".
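A simplified Python model of this substitution scheme may make the evaluation order concrete. The regular expression and error handling here are assumptions for illustration; only the $env/$prev/:- syntax comes from the text above:

```python
import os
import re

def expand(value, prev_env, outer_env=os.environ):
    """Evaluate $env{VAR}/$prev{VAR} references with optional :-defaults
    (a simplified model, not Maelstrom's implementation)."""
    def sub(match):
        kind, name, _, default = match.groups()
        table = outer_env if kind == "env" else prev_env
        if name in table:
            return table[name]
        if default is not None:          # the `:-` fallback
            return default
        raise KeyError(f"${kind}{{{name}}} is not set")
    return re.sub(r"\$(env|prev)\{([^}:]+)(:-([^}]*))?\}", sub, value)

prev = {"VAR": "bar", "PATH": "/usr/bin"}
print(expand("$prev{VAR}", prev))            # previous value wins
print(expand("/scripts:$prev{PATH}", prev))  # prepend to the inherited PATH
print(expand("$env{NO_SUCH_VAR:-fallback}", prev, outer_env={}))
```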

This field can be used in the same directive as an image.use that contains "environment". For example:

[[directives]]
image = { name = "my-image", use = [ "layers", "environment" ] }
added_environment = { PATH = "/scripts:$prev{PATH}" }

This prepends "/scripts" to the PATH provided by the image without changing any of the other environment variables.

mounts

[[directives]]
mounts = [
    { type = "bind", mount_point = "/mnt", local_path = "data-for-job", read_only = true },
    { type = "devices", devices = [ "full", "fuse", "null", "random", "shm", "tty", "urandom", "zero" ] },
    { type = "devpts", mount_point = "/dev/pts" },
    { type = "mqueue", mount_point = "/dev/mqueue" },
    { type = "proc", mount_point = "/proc" },
    { type = "sys", mount_point = "/sys" },
    { type = "tmp", mount_point = "/tmp" },
]

This field sets the mounts field of the job spec. It must be a list of tables, each of which is a TOML translation of the corresponding job-spec type.

added_mounts

This field is like mounts, except it appends to the job spec's mounts field instead of replacing it.

working_directory

[[directives]]
working_directory = "/home/root/"

This field sets the working_directory field of the job spec. It must be a string.

This field can't be set in the same directive as image if the image.use contains "working_directory".

network

[[directives]]
network = "loopback"

This field sets the network field of the job spec. It must be a string. It defaults to "disabled".

enable_writable_file_system

[[directives]]
enable_writable_file_system = true

This field sets the enable_writable_file_system field of the job spec. It must be a boolean.

user

[[directives]]
user = 1000

This field sets the user field of the job spec. It must be an unsigned, 32-bit integer.

group

[[directives]]
group = 1000

This field sets the group field of the job spec. It must be an unsigned, 32-bit integer.

timeout

[[directives]]
timeout = 60

This field sets the timeout field of the job spec. It must be an unsigned, 32-bit integer.

ignore

[[directives]]
ignore = true

This field specifies that any tests matching the directive should not be run.
When tests are run, ignored tests are displayed with a special "ignored" state.
When tests are listed, ignored tests are listed normally.

Files in Project Directory

maelstrom-go-test stores a number of files in the project directory, under the .maelstrom-go-test subdirectory. This chapter lists them and explains what they're for.

It is safe to remove this directory whenever maelstrom-go-test isn't running.

Except in the case of the local worker, maelstrom-go-test doesn't currently make any effort to clean up these files. However, the total space consumed by these files should be pretty small.

maelstrom-go-test also uses the container-images cache. That cache is not stored in the project directory, as it can be shared by different Maelstrom clients.

Local Worker

The local worker stores its cache in .maelstrom-go-test/cache/local-worker/ in the project directory. The cache-size configuration value indicates the target size of this cache directory.

Manifest Files

maelstrom-go-test uses "manifest files" for non-tar layers. These are like tar files, but without the actual data contents. These files are stored in .maelstrom-go-test/cache/manifests/ in the project directory.

File Digests

Files uploaded to the broker are identified by a hash of their file contents. Calculating these hashes can be time consuming so maelstrom-go-test caches this information. This cache is stored in .maelstrom-go-test/cache/cached-digests.toml in the project directory.

Client Log File

The local client process — the one that maelstrom-go-test talks to, and that contains the local worker — has a log file that is stored at .maelstrom-go-test/state/client-process.log in the project directory.

Test Listing

When maelstrom-go-test finishes, it updates a list of all of the tests in the workspace, and how long they took to run. This is used to predict the number of tests that will be run in subsequent invocations, as well as how long they will take. This is stored in the .maelstrom-go-test/state/test-listing.toml file in the project directory.

Test Binaries

maelstrom-go-test builds Go test binaries and puts them in .maelstrom-go-test/cache/test-binaries. These need to exist here so we can either run them or upload them. They are organized in a directory structure that mirrors your project directory.

Test Execution Order

Maelstrom doesn't execute tests in a random order. Instead, it tries to execute them in an order that will be helpful to the user.

Understanding Priorities

Maelstrom assigns every test a priority. When there is a free slot and available tests, Maelstrom will choose the available test with the highest priority.

Tests may not be available for a variety of reasons. The test's binary may not have been compiled yet, the test's required container image may not have been downloaded yet, or the test's required artifacts may not have been uploaded yet.

If a test becomes available when there are no free slots, it does not preempt any running test, even if it has a higher priority. Instead, it will be chosen first the next time a slot becomes available.

New Tests and Tests that Failed Previously

A test's priority consists of two parts. The first, and more important, part is whether the test is new or failed the last time it was run. The logic here is that the user is probably most interested in finding out the outcomes of these tests. New tests and tests that failed the last time they were run have the same priority.

A test is considered new if Maelstrom has no record of it executing. If the test listing file in the state directory has been removed, then Maelstrom will consider every test to be new. If a test, its artifacts, or its package is renamed, it is also considered new.

A test is considered to have failed the last time it was run if there was even one failure. This is relevant when the --repeat configuration value is set. For example, if --repeat=1000 is passed, and the test passes 999 times and fails just once, it is still considered to have failed.

Estimated Duration and LPT Scheduling

The second part of a test's priority is its estimated duration. In the test listing file in the state directory, Maelstrom keeps track of the recent running times of the test. It uses this to guess how long the test will take to execute. Tests that are expected to run the longest are scheduled first.

Using the estimated duration to set a test's priority means that Maelstrom uses longest-processing-time-first (LPT) scheduling. Finding optimal scheduling orders is an NP-hard problem, but LPT scheduling provides a good approximation. With LPT scheduling, the overall runtime will never be more than about 133% of the optimal runtime.
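A minimal sketch of this two-part priority using a Python heap (the field names are invented for this example; real Maelstrom tracks this information in the test listing file):

```python
import heapq

# Sketch of the two-part priority: (new-or-failed, estimated duration),
# longest first. Illustration only; field names are made up.

def priority(test):
    # heapq is a min-heap, so negate both parts to pop the highest priority.
    return (-int(test["new_or_failed"]), -test["estimated_duration"])

tests = [
    {"name": "fast_old", "new_or_failed": False, "estimated_duration": 1},
    {"name": "slow_old", "new_or_failed": False, "estimated_duration": 60},
    {"name": "fast_new", "new_or_failed": True, "estimated_duration": 2},
]
heap = [(priority(t), t["name"]) for t in tests]
heapq.heapify(heap)
order = [heapq.heappop(heap)[1] for _ in range(len(heap))]
print(order)  # new/failed tests first, then longest-running first
```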

Configuration Values

maelstrom-go-test supports the following configuration values:

cache-size (string; default "1 GB"): target cache disk space usage
inline-limit (string; default "1 MB"): maximum amount of captured standard output and error
slots (number; default 1 per CPU): job slots available
container-image-depot-root (string; default $XDG_CACHE_HOME/maelstrom/containers): container images cache directory
accept-invalid-remote-container-tls-certs (boolean; default false): allow invalid container registry certificates
broker (string; default standalone mode): address of the broker
log-level (string; default "info"): minimum log level
quiet (boolean; default false): don't output per-test information
ui (string; default "auto"): UI style to use
repeat (number; default 1): how many times to run each test
timeout (string; default don't override): override the timeout value for tests
vet (string; default go test's default): control the go test -vet flag
short (boolean; default false): tell long-running tests to shorten their run times
fullpath (boolean; default false): show the full file name in error messages
extra-test-binary-args (list; default no args): pass arbitrary arguments to the test binaries
stop-after (number; default never stop): stop after a given number of failures

cache-size

This is a local-worker setting, common to all clients. See here for details.

inline-limit

This is a local-worker setting, common to all clients. See here for details.

slots

This is a local-worker setting, common to all clients. See here for details.

container-image-depot-root

This is a container-image setting, common to all clients. See here for details.

accept-invalid-remote-container-tls-certs

This is a container-image setting, common to all clients. See here for details.

broker

The broker configuration value specifies the socket address of the broker. This configuration value is optional. If not provided, maelstrom-go-test will run in standalone mode.

Here are some example value socket addresses:

  • broker.example.org:1234
  • 192.0.2.3:1234
  • [2001:db8::3]:1234

log-level

This is a setting common to all Maelstrom programs. See here for details.

maelstrom-go-test always prints log messages to stdout. It also passes the log level to maelstrom-client, which will log its output in a file named client-process.log in the state directory.

quiet

The quiet configuration value, if set to true, causes maelstrom-go-test to be more succinct with its output. If maelstrom-go-test is outputting to a terminal, it will display a single-line progress bar indicating overall test state, then print a summary at the end. If not outputting to a terminal, it will only print a summary at the end.

ui

The ui configuration value controls the UI style used. It must be one of auto, fancy, quiet, or simple. The default value is auto.

simple: This is our original UI. It prints one line per test result (unless quiet is true), and will display some progress bars if standard output is a TTY.
fancy: This is our new UI. It has a rich TTY experience with a lot of status updates. It is incompatible with quiet or with non-TTY standard output.
quiet: Minimal UI with only a single progress bar.
auto: Chooses fancy if standard output is a TTY and quiet isn't true; otherwise chooses simple.

repeat

The repeat configuration value specifies how many times each test will be run. It must be a nonnegative integer. On the command line, --loop can be used as an alias for --repeat.

timeout

The optional timeout configuration value provides the timeout value to use for all tests. This will override any value set in maelstrom-go-test.toml.

vet

This configuration value controls the value of the -vet flag that is passed to go test. If not provided, the flag isn't passed to go test. See go help test for more information.

short

Tells long-running tests to shorten their run time. This flag is forwarded to test binaries. See go help testflag for more information.

fullpath

Shows the full file name in error messages. This flag is forwarded to test binaries. See go help testflag for more information.

extra-test-binary-args

This allows passing arbitrary command-line arguments to the Go test binary. See go help testflag for the arguments that are normally accepted. Since these arguments are passed directly and not interpreted, this can also be used for custom command-line arguments interpreted by the test itself.

These arguments are added last, after short and fullpath, so it is possible for certain arguments to interfere with the operation of those flags.

When provided on the command line, these arguments are positional and come after any other arguments. To avoid ambiguity, -- should be used to denote the end of normal command-line arguments and the beginning of these arguments, as follows:

maelstrom-go-test -- -test.parallel 12

stop-after

This optional configuration value, if provided, gives a limit on the number of failures to tolerate. If the limit is reached, maelstrom-go-test exits prematurely.

Command-Line Options

In addition to the command-line options used to specify configuration values, described in the previous chapter, maelstrom-go-test supports these command-line options:

--include and --exclude

The --include (-i) and --exclude (-x) command-line options control which tests maelstrom-go-test runs or lists.

These options take a test filter pattern. The --include option includes any test that matches the pattern. Similarly, --exclude excludes any test that matches the pattern. Both options may be repeated arbitrarily.

The tests that are selected are the set which match any --include pattern but don't match any --exclude pattern. In other words, --excludes have precedence over --includes, regardless of the order they are specified.

If no --include option is provided, maelstrom-go-test acts as if an --include all option was provided.
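The selection rule can be sketched like this. Real --include/--exclude options take test filter patterns; this hypothetical Python version uses glob patterns as stand-ins:

```python
from fnmatch import fnmatchcase

# Sketch of --include/--exclude selection. Glob patterns stand in for
# the real test filter pattern language.

def selected(test, includes, excludes):
    if not includes:
        includes = ["*"]  # no --include acts like --include all
    return (any(fnmatchcase(test, p) for p in includes)
            and not any(fnmatchcase(test, p) for p in excludes))

# --exclude wins over --include regardless of order:
print(selected("TestFoo", includes=["Test*"], excludes=["*Foo"]))
print(selected("TestBar", includes=["Test*"], excludes=["*Foo"]))
```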

--init

The --init command-line option is used to create a starter maelstrom-go-test.toml file. See here for more information.

--list-tests or --list

The --list-tests (or --list) command-line option causes maelstrom-go-test to build all required test binaries, then print the tests that would normally be run, without actually running them.

This option can be combined with --include and --exclude.

--list-packages

The --list-packages command-line option causes maelstrom-go-test to print the packages from which it would potentially run tests, without actually building any binaries or running any tests. This command may have to wait for the go binary to download dependencies, however.

Because maelstrom-go-test won't attempt to build the tests in the given packages, it may include packages that don't actually have any tests.

This option can be combined with --include and --exclude.

Abbreviations

As discussed here, unambiguous prefixes can be used in patterns. This can come in handy when doing one-offs on the command line. For example, to run all tests in package foo with a name that includes bar:

maelstrom-go-test -i 'p.eq(foo) & n.c(bar)'

maelstrom-pytest

Choosing a Python Image

Before we start running tests, we need to choose a Python image.

First, generate a maelstrom-pytest.toml file:

maelstrom-pytest --init

Then update the image in the file to have the version of Python you desire.

[[directives]]
image = "docker://python:3.11-slim"

The default configuration and our example use an image from Docker.

Including Your Project Python Files

So that your tests can be run from the container, your project's Python files must be included. Update the added_layers in the file to make sure it includes your project's Python files.

added_layers = [ { glob = "**.py" } ]

This example just adds all files with a .py extension. You may also need to include .pyi files or other files.

Including pip Packages

If you have an image named "python", maelstrom-pytest will automatically include pip packages for you as part of the container. It expects to read these packages from a test-requirements.txt file in your project directory. At a minimum, this needs to include the pytest package:

test-requirements.txt

pytest==8.1.1

Running Tests

Once you have finished the configuration, you need only invoke maelstrom-pytest to run all the tests in your project. It must be run from an environment where pytest is in the Python path. If you are using virtualenv for your project, make sure to source that first.

Test Filter Patterns

There are times when a user needs to concisely specify a set of tests to maelstrom-pytest. One of those is on the command line: maelstrom-pytest can be told to only run a certain set of tests, or to exclude some tests. Another is the filter field of maelstrom-pytest.toml directives. This is used to choose which tests a directive applies to.

In order to allow users to easily specify a set of tests to maelstrom-pytest, we created the domain-specific pattern language described here.

If you are a fan of formal explanations check out the BNF. Otherwise, this page will attempt to give a more informal explanation of the language.

Simple Selectors

The most basic patterns are "simple selectors". These are only sometimes useful on their own, but they become more powerful when combined with other patterns. Simple selectors consist solely of one of these identifiers:

Simple Selector    What it Matches
true, any, all     any test
false, none        no test

Simple selectors can optionally be followed by (). That is, all() and all are equivalent patterns.

Compound Selectors

"Compound selector patterns" are patterns like name.equals(foo). They combine "compound selectors" with "matchers" and "arguments". In our example, name is the compound selector, equals is the matcher, and foo is the argument.

These are the possible compound selectors:

Compound Selector    Selected Name
name                 the name of the test
file                 the name of the test's file
package              the name of the test's package
node_id              the test's Pytest "nodeid"

These are the possible matchers:

Matcher       Matches If Selected Name...
equals        exactly equals argument
contains      contains argument
starts_with   starts with argument
ends_with     ends with argument
matches       matches argument evaluated as regular expression
globs         matches argument evaluated as glob pattern
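The six matchers correspond to simple string predicates. The Python sketch below assumes unanchored search semantics for matches, which is an assumption on our part; the rest follows the table directly:

```python
import re
from fnmatch import fnmatchcase

# The six matchers as Python predicates (a sketch of their meaning, not
# Maelstrom's implementation; `matches` is shown with unanchored search
# semantics, which is an assumption).
MATCHERS = {
    "equals":      lambda name, arg: name == arg,
    "contains":    lambda name, arg: arg in name,
    "starts_with": lambda name, arg: name.startswith(arg),
    "ends_with":   lambda name, arg: name.endswith(arg),
    "matches":     lambda name, arg: re.search(arg, name) is not None,
    "globs":       lambda name, arg: fnmatchcase(name, arg),
}

print(MATCHERS["matches"]("foofoobar", r"(foo)*bar"))
print(MATCHERS["globs"]("test_mod.py", "test_*.py"))
```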

Compound selectors and matchers are separated by . characters. Arguments are contained within delimiters, which must be a matched pair:

Left   Right
(      )
[      ]
{      }
<      >
/      /

Let's put this all together with some examples:

Pattern                                                What it Matches
node_id.equals(test_mod.py::TestClass::test_method)    any test whose node ID is "test_mod.py::TestClass::test_method"
file.contains/maelstrom/                               any test in a file whose name contains the substring "maelstrom"
package.matches{(foo)*bar}                             any test whose package name matches the regular expression (foo)*bar

Markers Selector

The Pytest markers can be used to select tests as well. Each test has a set of markers associated with it. Using a markers selector, one can select tests that contain a specific marker. This is basically a compound selector named markers with only one matcher: contains.

In this case, contains doesn't do substring matching, but instead matches if the argument is one of the markers for the test.

For example:

Pattern                  What it Matches
markers.contains(foo)    any test that has a foo marker
markers.contains/bar/    any test that has a bar marker
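The difference between membership and substring matching can be sketched as follows (hypothetical Python, with made-up marker names):

```python
# `markers.contains` is membership in the test's marker set, not
# substring matching (sketch only).

def markers_contains(test_markers, arg):
    return arg in set(test_markers)

test_markers = ["slow", "integration"]
print(markers_contains(test_markers, "slow"))  # marker present
print(markers_contains(test_markers, "low"))   # a substring alone doesn't count
```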

Compound Expressions

Selectors can be joined together with operators to create compound expressions. These operators are:

Operators      Action
!, ~, not      Logical Not
&, &&, and     Logical And
|, ||, or      Logical Or
\, -, minus    Logical Difference
(, )           Grouping

The "logical difference" action is defined as follows: A - B == A && !B.

As an example, to select tests named foo or bar in package baz:

(name.equals(foo) || name.equals(bar)) && package.equals(baz)

As another example, to select tests named bar in package baz or tests named foo from any package:

name.equals(foo) || (name.equals(bar) && package.equals(baz))
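Treating each pattern as the set of tests it selects makes the operators easy to reason about. This Python sketch checks the logical-difference identity A - B == A && !B with plain sets (the test names are made up):

```python
# Set view of the operators: each pattern selects a set of tests.
all_tests = {"baz/foo", "baz/bar", "qux/foo", "qux/quux"}

name_foo = {t for t in all_tests if t.endswith("/foo")}   # name.equals(foo)
pkg_baz = {t for t in all_tests if t.startswith("baz/")}  # package.equals(baz)

difference = pkg_baz - name_foo                      # A - B
and_not = {t for t in pkg_baz if t not in name_foo}  # A && !B
print(difference == and_not)
print(sorted(difference))
```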

Abbreviations

Selector and matcher names can be shortened to any unambiguous prefix.

For example, the following are all equivalent:

name.equals(foo)
name.eq(foo)
n.eq(foo)

We can abbreviate name to n since no other selector starts with "n", but we can't abbreviate equals to e because there is another matcher, ends_with, that also starts with an "e".
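Resolving an abbreviation amounts to finding a unique name with the given prefix. A hypothetical Python sketch, using the matcher names from the table above:

```python
# Sketch of resolving an abbreviation to a unique full name by prefix
# (illustration only, not Maelstrom's implementation).

MATCHER_NAMES = ["equals", "contains", "starts_with", "ends_with",
                 "matches", "globs"]

def resolve(prefix, names):
    hits = [n for n in names if n.startswith(prefix)]
    if len(hits) != 1:
        raise ValueError(f"{prefix!r} matches {hits or 'nothing'}")
    return hits[0]

print(resolve("eq", MATCHER_NAMES))  # unambiguous: only "equals"
try:
    resolve("e", MATCHER_NAMES)      # ambiguous: "equals" and "ends_with"
except ValueError as err:
    print(err)
```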

Test Pattern DSL BNF

Included on this page is the Backus-Naur form notation for the DSL.

pattern                := or-expression
or-expression          := and-expression
                       |  or-expression or-operator and-expression
or-operator            := "|" | "||" | "or"
and-expression         := not-expression
                       |  and-expression and-operator not-expression
                       |  and-expression diff-operator not-expression
and-operator           := "&" | "&&" | "and" | "+"
diff-operator          := "\" | "-" | "minus"
not-expression         := simple-expression
                       |  not-operator not-expression
not-operator           := "!" | "~" | "not"
simple-expression      := "(" or-expression ")"
                       |  simple-selector
                       |  compound-selector
                       |  markers-selector
simple-selector        := simple-selector-name
                       |  simple-selector-name "(" ")"
simple-selector-name   := "all" | "any" | "true" | "none" | "false"
compound-selector      := compound-selector-name "." matcher-name matcher-parameter
compound-selector-name := "name" | "node_id" | "package" | "file"
matcher-name           := "equals" | "contains" | "starts_with" | "ends_with" |
                          "matches" | "globs"
matcher-parameter      := <punctuation mark followed by characters followed by
                           matching punctuation mark>
markers-selector       := "markers" "." "contains" matcher-parameter

Job Specification: maelstrom-pytest.toml

The file maelstrom-pytest.toml in the project directory is used to specify to maelstrom-pytest what job specifications are used for which tests.

This chapter describes the format of that file and how it is used to set the job spec fields described here.

Default Configuration

If there is no maelstrom-pytest.toml in the project directory, then maelstrom-pytest will run with the following defaults:

# Because it has no `filter` field, this directive applies to all tests.
[[directives]]

image = "docker://python:3.12.3-slim"

# Use `added_layers` here since we want to add to what is provided by the image.
added_layers = [
    # This layer includes all the Python files from our project.
    { glob = "**.{py,pyc,pyi}" },
    # Include pyproject.toml if it exists.
    { glob = "pyproject.toml" },
    # This layer just includes files and directories for mounting the following
    # file-systems and devices.
    { stubs = [ "/{proc,sys,tmp}/", "/dev/{full,null,random,urandom,zero}" ] },
]

# Provide /tmp, /proc, /sys, and some devices in /dev/. These are used pretty
# commonly by tests.
mounts = [
    { type = "tmp", mount_point = "/tmp" },
    { type = "proc", mount_point = "/proc" },
    { type = "sys", mount_point = "/sys" },
    { type = "devices", devices = ["full", "null", "random", "urandom", "zero"] },
]

# Later directives can override the `environment` key, but the `added_environment` key is only
# additive. By using it here we ensure it applies to all tests regardless of other directives.
[directives.added_environment]
# If we are using the xdist plugin, we need to tell it we don't want any parallelism since we are
# only running one test per process.
PYTEST_XDIST_AUTO_NUM_WORKERS = "1"

Initializing maelstrom-pytest.toml

It's likely that at some point you'll need to adjust the job specs for some tests. At that point, you're going to need an actual maelstrom-pytest.toml. Instead of starting from scratch, you can have maelstrom-pytest create one for you:

maelstrom-pytest --init

This will create a maelstrom-pytest.toml file, unless one already exists, then exit. The resulting maelstrom-pytest.toml will match the default configuration. It will also include some commented-out examples that may be useful.

Directives

The maelstrom-pytest.toml file consists of a list of "directives" which are applied in order. Each directive has some optional fields, one of which may be filter. To compute the job spec for a test, maelstrom-pytest starts with a default spec, then iterates over all the directives in order. If a directive's filter matches the test, the directive is applied to the test's job spec. Directives without a filter apply to all tests. The job spec is then used for the test.

There is no way to short-circuit the application of directives. Instead, filters can be used to limit the scope of a given directive.

To specify a list of directives in TOML, we use the [[directives]] syntax. Each [[directives]] line starts a new directive. For example, this snippet specifies two directives:

[[directives]]
include_shared_libraries = true

[[directives]]
filter = "package.equals(maelstrom) && name.equals(io::splicer)"
added_mounts = [{ type = "proc", mount_point = "/proc" }]
added_layers = [{ stubs = [ "proc/" ] }]

The first directive applies to all tests, since it has no filter. The second directive only applies to a single test named io::splicer in the maelstrom package. It adds a layer and a mount to that test's job spec.

Directive Fields

This chapter specifies all of the possible fields for a directive. Most, but not all, of these fields have an obvious mapping to job-spec fields.

filter

This field must be a string, which is interpreted as a test filter pattern. The directive only applies to tests that match the filter. If there is no filter field, the directive applies to all tests.

Sometimes it is useful to use multi-line strings for long patterns:

[[directives]]
filter = """
package.equals(maelstrom-client) ||
package.equals(maelstrom-client-process) ||
package.equals(maelstrom-container) ||
package.equals(maelstrom-fuse) ||
package.equals(maelstrom-util)"""
layers = [{ stubs = ["/tmp/"] }]
mounts = [{ type = "tmp", mount_point = "/tmp" }]

image

To use maelstrom-pytest, your tests' containers will need a Python interpreter. The best way to provide one is to build their containers from an OCI container image.

This is what the image field is for. It is used to set the job spec's image field.

[[directives]]
image.name = "docker://python:3.11-slim"
image.use = ["layers", "environment"]

[[directives]]
filter = "package.equals(foo)"
image = { name = "docker://python:3.11", use = ["layers", "environment"] }

In the example above, we specified a TOML table in two different, equivalent ways for illustrative purposes.

The image field must be a table with two subfields: name and use.

The name sub-field must be a string. It specifies the URI of the image to use, as documented here.

The use sub-field must be a list of strings specifying what parts of the container image to use for the job spec. It must contain a non-empty subset of:

  • layers: This sets the use_layers field in the job spec's image value.
  • environment: This sets the use_environment field in the job spec's image value.
  • working_directory: This sets the use_working_directory field in the job spec's image value.

layers

[[directives]]
layers = [
    { tar = "layers/foo.tar" },
    { paths = ["layers/a/b.bin", "layers/a/c.bin"], strip_prefix = "layers/a/" },
    { glob = "layers/b/**", strip_prefix = "layers/b/" },
    { stubs = ["/dev/{null, full}", "/proc/"] },
    { symlinks = [{ link = "/dev/stdout", target = "/proc/self/fd/1" }] },
    { shared-library-dependencies = ["/bin/bash"], prepend_prefix = "/usr" }
]

This field provides an ordered list of layers for the job spec's layers field.

Each element of the list must be a table with one of the following keys:

  • tar: The value must be a string, indicating the local path of the tar file. This is used to create a tar layer.
  • paths: The value must be a list of strings, indicating the local paths of the files and directories to include to create a paths layer. It may also include fields from prefix_options (see below).
  • glob: The value must be a string, indicating the glob pattern to use to create a glob layer. It may also include fields from prefix_options (see below).
  • stubs: The value must be a list of strings. These strings are optionally brace-expanded and used to create a stubs layer.
  • symlinks: The value must be a list of tables of link/target pairs. These strings are used to create a symlinks layer.
  • shared-library-dependencies: The value must be a list of strings, indicating local paths of binaries. This layer includes the set of shared libraries the binaries depend on, including libc and the dynamic linker, but not the binaries themselves.

If the layer is a paths, glob, or shared-library-dependencies layer, then the table can also include extra fields from prefix_options, such as strip_prefix and prepend_prefix.

For example:

[[directives]]
layers = [
    { paths = ["layers"], strip_prefix = "layers/", prepend_prefix = "/usr/share/" },
]

This would create a layer containing all of the files and directories (recursively) in the local layers subdirectory, mapping local file layers/example to /usr/share/example in the test's container.
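The path rewriting that strip_prefix and prepend_prefix perform can be sketched as a small function. This is an illustrative model of the mapping, not Maelstrom's implementation; the function name is hypothetical.

```python
# Illustrative sketch: rewrite a local path into its in-container destination
# by stripping strip_prefix and then prepending prepend_prefix.
def map_path(local_path, strip_prefix="", prepend_prefix=""):
    if not local_path.startswith(strip_prefix):
        raise ValueError(f"{local_path!r} does not start with {strip_prefix!r}")
    return prepend_prefix + local_path[len(strip_prefix):]

# layers/example becomes /usr/share/example, matching the directive above.
print(map_path("layers/example", strip_prefix="layers/", prepend_prefix="/usr/share/"))
```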

This field can't be set in the same directive as image if the image.use contains "layers".

added_layers

This field is like layers, except it appends to the job spec's layers field instead of replacing it.

This field can be used in the same directive as an image.use that contains "layers". For example:

[[directives]]
image.name = "cool-image"
image.use = ["layers"]
added_layers = [
    { paths = [ "extra-layers" ], strip_prefix = "extra-layers/" },
]

This directive uses the layers from "cool-image", but with the contents of the local extra-layers directory added in as well.

environment

[[directives]]
environment = { USER = "bob", RUST_BACKTRACE = "$env{RUST_BACKTRACE:-0}" }

This field sets the environment field of the job spec. It must be a table with string values. It supports two forms of $ expansion within those string values:

  • $env{FOO} evaluates to the value of maelstrom-pytest's FOO environment variable.
  • $prev{FOO} evaluates to the previous value of FOO for the job spec.

It is an error if the referenced variable doesn't exist. However, you can use :- to provide a default value:

FOO = "$env{FOO:-bar}"

This will set FOO to whatever maelstrom-pytest's FOO environment variable is, or to "bar" if maelstrom-pytest doesn't have a FOO environment variable.
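The $env{FOO:-bar} lookup-with-default rule can be modeled with a short Python sketch. This is an illustration of the documented semantics only, not Maelstrom's implementation, and it covers just the $env form with an optional :- default.

```python
import re

# Illustrative sketch of $env{NAME} and $env{NAME:-default} expansion.
def expand_env(value, env):
    def repl(match):
        name, _, default = match.group(1).partition(":-")
        if name in env:
            return env[name]
        if match.group(1) != name:  # a ":-" default was supplied
            return default
        raise KeyError(name)        # no value and no default: an error
    return re.sub(r"\$env\{([^}]*)\}", repl, value)

print(expand_env("$env{FOO:-bar}", env={}))              # bar
print(expand_env("$env{FOO:-bar}", env={"FOO": "foo"}))  # foo
```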

This field can't be set in the same directive as image if the image.use contains "environment".

added_environment

This field is like environment, except it updates the job spec's environment field instead of replacing it.

When this is provided in the same directive as the environment field, the added_environment gets evaluated after the environment field. For example:

[[directives]]
environment = { VAR = "foo" }

[[directives]]
environment = { VAR = "bar" }
added_environment = { VAR = "$prev{VAR}" }

In this case, VAR will be "bar", not "foo".

This field can be used in the same directive as an image.use that contains "environment". For example:

[[directives]]
image = { name = "my-image", use = [ "layers", "environment" ] }
added_environment = { PATH = "/scripts:$prev{PATH}" }

This prepends "/scripts" to the PATH provided by the image without changing any of the other environment variables.

mounts

[[directives]]
mounts = [
    { type = "bind", mount_point = "/mnt", local_path = "data-for-job", read_only = true },
    { type = "devices", devices = [ "full", "fuse", "null", "random", "shm", "tty", "urandom", "zero" ] },
    { type = "devpts", mount_point = "/dev/pts" },
    { type = "mqueue", mount_point = "/dev/mqueue" },
    { type = "proc", mount_point = "/proc" },
    { type = "sys", mount_point = "/sys" },
    { type = "tmp", mount_point = "/tmp" },
]

This field sets the mounts field of the job spec. It must be a list of tables, each of which is a TOML translation of the corresponding job-spec type.

added_mounts

This field is like mounts, except it appends to the job spec's mounts field instead of replacing it.

working_directory

[[directives]]
working_directory = "/home/root/"

This field sets the working_directory field of the job spec. It must be a string.

This field can't be set in the same directive as image if the image.use contains "working_directory".

network

[[directives]]
network = "loopback"

This field sets the network field of the job spec. It must be a string. It defaults to "disabled".

enable_writable_file_system

[[directives]]
enable_writable_file_system = true

This field sets the enable_writable_file_system field of the job spec. It must be a boolean.

user

[[directives]]
user = 1000

This field sets the user field of the job spec. It must be an unsigned, 32-bit integer.

group

[[directives]]
group = 1000

This field sets the group field of the job spec. It must be an unsigned, 32-bit integer.

timeout

[[directives]]
timeout = 60

This field sets the timeout field of the job spec. It must be an unsigned, 32-bit integer.

ignore

[[directives]]
ignore = true

This field specifies that any tests matching the directive should not be run.
When tests are run, ignored tests are displayed with a special "ignored" state.
When tests are listed, ignored tests are listed normally.

Files in Project Directory

maelstrom-pytest stores a number of files in the project directory, under the .maelstrom-pytest subdirectory. This chapter lists them and explains what they're for.

It is safe to remove this directory whenever maelstrom-pytest isn't running.

Except in the case of the local worker, maelstrom-pytest doesn't currently make any effort to clean up these files. However, the total space consumed by these files should be pretty small.

maelstrom-pytest also uses the container-images cache. That cache is not stored in the target directory, as it can be shared by different Maelstrom clients.

Local Worker

The local worker stores its cache in .maelstrom-pytest/cache/local-worker/ in the project directory. The cache-size configuration value indicates the target size of this cache directory.

Manifest Files

maelstrom-pytest uses "manifest files" for non-tar layers. These are like tar files, but without the actual data contents. These files are stored in .maelstrom-pytest/cache/manifests/ in the project directory.

File Digests

Files uploaded to the broker are identified by a hash of their file contents. Calculating these hashes can be time consuming so maelstrom-pytest caches this information. This cache is stored in .maelstrom-pytest/cache/cached-digests.toml in the project directory.

Client Log File

The local client process — the one that maelstrom-pytest talks to, and that contains the local worker — has a log file that is stored at .maelstrom-pytest/state/client-process.log in the project directory.

Test Listing

When maelstrom-pytest finishes, it updates a list of all of the tests in the workspace, and how long they took to run. This is used to predict the number of tests that will be run in subsequent invocations, as well as how long they will take. This is stored in the .maelstrom-pytest/state/test-listing.toml file in the project directory.

Test Execution Order

Maelstrom doesn't execute tests in a random order. Instead, it tries to execute them in an order that will be helpful to the user.

Understanding Priorities

Maelstrom assigns every test a priority. When there is a free slot and available tests, Maelstrom will choose the available test with the highest priority.

Tests may not be available for a variety of reasons. The test's binary may not have been compiled yet, the test's required container image may not have been downloaded yet, or the test's required artifacts may not have been uploaded yet.

If a test becomes available when there are no free slots, and it has a higher priority than any currently running test, it does not preempt any running test. Instead, it will be chosen first the next time a slot becomes available.

New Tests and Tests that Failed Previously

A test's priority consists of two parts. The first, and more important, part is whether the test is new or failed the last time it was run. The logic here is that the user is probably most interested in finding out the outcomes of these tests. New tests and tests that failed the last time they were run have the same priority.

A test is considered new if Maelstrom has no record of it executing. If the test listing file in the state directory has been removed, then Maelstrom will consider every test to be new. If a test, its artifacts, or its package is renamed, it is also considered new.

A test is considered to have failed the last time it was run if there was even one failure. This is relevant when the --repeat configuration value is set. For example, if --repeat=1000 is passed, and the test passes 999 times and fails just once, it is still considered to have failed.

Estimated Duration and LPT Scheduling

The second part of a test's priority is its estimated duration. In the test listing file in the state directory, Maelstrom keeps track of the recent running times of the test. It uses this to guess how long the test will take to execute. Tests that are expected to run the longest are scheduled first.

Using the estimated duration to set a test's priority means that Maelstrom uses longest-processing-time-first (LPT) scheduling. Finding an optimal schedule is an NP-hard problem, but LPT scheduling provides a good approximation: the overall runtime will never be more than 133% of the optimal runtime.
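A minimal Python sketch of LPT scheduling makes the approximation concrete: sort jobs by estimated duration, longest first, and always assign the next job to the slot that frees up earliest. This is an illustration of the algorithm, not Maelstrom's code. With two slots and durations [5, 4, 3, 3, 3], LPT produces a makespan of 10, slightly above the optimal 9 ({5, 4} on one slot, {3, 3, 3} on the other), but well within the 4/3 bound.

```python
import heapq

# Longest-processing-time-first (LPT) scheduling sketch: longest jobs first,
# each assigned to the slot that becomes free earliest.
def lpt_makespan(durations, slots):
    finish_times = [0.0] * slots  # when each slot next becomes free
    heapq.heapify(finish_times)
    for d in sorted(durations, reverse=True):
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + d)
    return max(finish_times)

print(lpt_makespan([5, 4, 3, 3, 3], slots=2))  # 10.0 (optimal is 9)
```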

Configuration Values

maelstrom-pytest supports the following configuration values:

| Value | Type | Description | Default |
|-------|------|-------------|---------|
| cache-size | string | target cache disk space usage | "1 GB" |
| inline-limit | string | maximum amount of captured standard output and error | "1 MB" |
| slots | number | job slots available | 1 per CPU |
| container-image-depot-root | string | container images cache directory | $XDG_CACHE_HOME/maelstrom/containers |
| accept-invalid-remote-container-tls-certs | boolean | allow invalid container registry certificates | false |
| broker | string | address of broker | standalone mode |
| log-level | string | minimum log level | "info" |
| quiet | boolean | don't output per-test information | false |
| ui | string | UI style to use | "auto" |
| repeat | number | how many times to run each test | 1 |
| timeout | string | override timeout value for tests | don't override |
| collect-from-module | string | collect tests from the specified module | don't override |
| extra-pytest-args | list | pass arbitrary arguments to pytest | no args |
| extra-pytest-collect-args | list | pass arbitrary arguments to pytest when collecting | no args |
| extra-pytest-test-args | list | pass arbitrary arguments to pytest when running a test | no args |
| stop-after | number | stop after given number of failures | never stop |

cache-size

This is a local-worker setting, common to all clients. See here for details.

inline-limit

This is a local-worker setting, common to all clients. See here for details.

slots

This is a local-worker setting, common to all clients. See here for details.

container-image-depot-root

This is a container-image setting, common to all clients. See here for details.

accept-invalid-remote-container-tls-certs

This is a container-image setting, common to all clients. See here for details.

broker

The broker configuration value specifies the socket address of the broker. This configuration value is optional. If not provided, maelstrom-pytest will run in standalone mode.

Here are some example socket addresses:

  • broker.example.org:1234
  • 192.0.2.3:1234
  • [2001:db8::3]:1234

log-level

This is a setting common to all Maelstrom programs. See here for details.

maelstrom-pytest always prints log messages to stdout. It also passes the log level to maelstrom-client, which will log its output in a file named client-process.log in the state directory.

quiet

The quiet configuration value, if set to true, causes maelstrom-pytest to be more succinct with its output. If maelstrom-pytest is outputting to a terminal, it will display a single-line progress bar indicating overall test state, then print a summary at the end. If not outputting to a terminal, it will only print a summary at the end.

ui

The ui configuration value controls the UI style used. It must be one of auto, fancy, quiet, or simple. The default value is auto.

| Style | Description |
|-------|-------------|
| simple | This is our original UI. It prints one line per test result (unless quiet is true), and will display some progress bars if standard output is a TTY. |
| fancy | This is our new UI. It has a rich TTY experience with a lot of status updates. It is incompatible with quiet or with non-TTY standard output. |
| quiet | Minimal UI with only a single progress bar. |
| auto | Will choose fancy if standard output is a TTY and quiet isn't true. Otherwise, it will choose simple. |

repeat

The repeat configuration value specifies how many times each test will be run. It must be a nonnegative integer. On the command line, --loop can be used as an alias for --repeat.

timeout

The optional timeout configuration value provides the timeout value to use for all tests. This will override any value set in maelstrom-pytest.toml.

collect-from-module

Collect tests from the provided module instead of using pytest's default collection algorithm. This will pass the provided module to pytest along with the --pyargs flag.

extra-pytest-args

This allows passing of arbitrary command-line arguments to pytest both when collecting tests and when running a test. These arguments are prepended to both extra-pytest-collect-args and extra-pytest-test-args. See those individual configuration values for details.

extra-pytest-collect-args

This allows passing of arbitrary command-line arguments to pytest when collecting tests. See pytest --help for what arguments are accepted normally. Since these arguments are passed directly and not interpreted, this can be used to interact with arbitrary pytest plugins.

These arguments are added after --co and --pyargs.

extra-pytest-test-args

This allows passing of arbitrary command-line arguments to pytest when running a test. See pytest --help for what arguments are accepted normally. Since these arguments are passed directly and not interpreted, this can be used to interact with arbitrary pytest plugins.

These arguments are added after --verbose but before the nodeid of the test to run. It is possible to use these flags in a way that prevents pytest from running the test maelstrom-pytest intended to run, producing confusing results.

When provided on the command line, these arguments are positional and come after any other arguments. They must always be preceded by --, as follows:

maelstrom-pytest -- -n1

stop-after

This optional configuration value, if provided, gives a limit on the number of failures to tolerate. If the limit is reached, maelstrom-pytest exits prematurely.

Command-Line Options

In addition to the command-line options used to specify configuration values, described in the previous chapter, maelstrom-pytest supports these command-line options:

--include and --exclude

The --include (-i) and --exclude (-x) command-line options control which tests maelstrom-pytest runs or lists.

These options take a test filter pattern. The --include option includes any test that matches its pattern; similarly, --exclude excludes any test that matches its pattern. Both options may be repeated.

The tests that are selected are the set which match any --include pattern but don't match any --exclude pattern. In other words, --excludes have precedence over --includes, regardless of the order they are specified.

If no --include option is provided, maelstrom-pytest acts as if an --include all option was provided.
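The selection rule above can be captured in a one-line predicate. This Python sketch is illustrative (the patterns here are stand-in callables, not Maelstrom's pattern language): a test is selected if it matches any include pattern and no exclude pattern, so excludes win regardless of order.

```python
# Sketch of --include/--exclude selection: any include must match,
# and no exclude may match.
def selected(test, includes, excludes):
    return any(p(test) for p in includes) and not any(p(test) for p in excludes)

includes = [lambda t: t.startswith("foo")]
excludes = [lambda t: "slow" in t]
print(selected("foo::fast", includes, excludes))  # True
print(selected("foo::slow", includes, excludes))  # False
```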

--init

The --init command-line option is used to create a starter maelstrom-pytest.toml file. See here for more information.

--list

The --list command-line option causes maelstrom-pytest to print the tests that would normally be run, without actually running them.

This option can be combined with --include and --exclude.

Abbreviations

As discussed here, unambiguous prefixes can be used in patterns. This can come in handy when doing one-offs on the command line. For example, to run all tests in package foo with the marker mark:

maelstrom-pytest -i 'p.eq(foo) & m.c(mark)'

maelstrom-run

maelstrom-run is a program for running arbitrary commands on a Maelstrom cluster.

Default Mode

There are three modes that maelstrom-run can be run in. In the default mode, maelstrom-run reads a stream of JSON maps from standard input or a file, where each JSON map describes a job specification.

The jobs are run as soon as they are read, and their results are outputted as soon as they complete. It's possible to keep an instance of maelstrom-run around for an arbitrary amount of time, feeding it individual job specifications and waiting for results. maelstrom-run won't exit until it has read an end-of-file and all of the pending jobs have completed.

If any job terminates abnormally or exits with a non-zero exit code, then maelstrom-run will eventually exit with a code of 1. Otherwise, it will exit with a code of 0.

Output from jobs is printed when jobs complete. The standard output from jobs is printed to maelstrom-run's standard output, and the standard error from jobs is printed to maelstrom-run's standard error. All of the output for a single job will be printed atomically, before the output from any other job.

"One" Mode

The second mode for maelstrom-run is "one" mode. This is specified with the --one command-line option. This mode differs from the default mode in two ways. First, maelstrom-run can optionally take more positional command-line arguments. If that's the case, then they will replace the program and the arguments in the job specification. This can be useful to run various commands in a job specification saved as a JSON file.

Second, in "one" mode, maelstrom-run tries to terminate itself in the same way that the job terminated. So, if the job received a SIGHUP, then maelstrom-run will terminate by being killed by a SIGHUP.

TTY Mode

The third mode for maelstrom-run is TTY mode. This mode is an extension of "one" mode where the job's standard input, output, and error are connected to maelstrom-run's terminal. This is very useful for interacting with a job specification, to debug a failing test or to verify some aspect of its container.

Configuration Values

maelstrom-run supports the following configuration values:

| Value | Type | Description | Default |
|-------|------|-------------|---------|
| log-level | string | minimum log level | "info" |
| cache-size | string | target cache disk space usage | "1 GB" |
| inline-limit | string | maximum amount of captured standard output and error | "1 MB" |
| slots | number | job slots available | 1 per CPU |
| container-image-depot-root | string | container images cache directory | $XDG_CACHE_HOME/maelstrom/containers |
| accept-invalid-remote-container-tls-certs | boolean | allow invalid container registry certificates | false |
| broker | string | address of broker | standalone mode |
| state-root | string | directory for client process's log file | $XDG_STATE_HOME/maelstrom/run |
| cache-root | string | directory for local worker's cache and cached layers | $XDG_CACHE_HOME/maelstrom/run |
| escape-char | string | TTY escape character for --tty mode | "^]" |

log-level

This is a setting common to all Maelstrom programs. See here for details.

maelstrom-run always prints log messages to standard error. It also passes the log level to maelstrom-client, which will log its output in a file named client-process.log in the state directory.

cache-size

This is a local-worker setting, common to all clients. See here for details.

inline-limit

This is a local-worker setting, common to all clients. See here for details.

slots

This is a local-worker setting, common to all clients. See here for details.

container-image-depot-root

This is a container-image setting, common to all clients. See here for details.

accept-invalid-remote-container-tls-certs

This is a container-image setting, common to all clients. See here for details.

broker

This is a setting common to all clients. See here for details.

state-root

This is a directory setting common to all clients. See here for more details.

cache-root

This is a directory setting common to all clients. See here for more details.

escape-char

This configuration value specifies the terminal escape character when maelstrom-run is run in TTY mode (with the --tty command-line option).

In TTY mode, all key presses are sent to the job's terminal. So, a ^C typed at the user's terminal will be transmitted to the job's terminal, where it will be interpreted however the job interprets it. This is true for all control characters. As a result, typing ^C will not kill maelstrom-run, nor will typing ^Z suspend maelstrom-run.

This is what the terminal escape character is for. Assuming the escape character is ^], typing ^]^C will kill maelstrom-run, and typing ^]^Z will suspend maelstrom-run. To transmit the escape character to the job, it must be typed twice: ^]^] will send ^] to the job.

If the escape character is followed by any character other than ^C, ^Z, or itself, then it will have no special meaning. The escape character and the following character will both be transmitted to the job. Similarly, if no character follows the escape character for 1.5 seconds, then the escape character will be transmitted to the job.
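The escape-character rules can be summarized as a tiny decision function. This Python sketch is purely illustrative (the function and action names are hypothetical, and it ignores the 1.5-second timeout case): ^]^C kills, ^]^Z suspends, a doubled escape sends one escape through, and anything else sends both characters through.

```python
# Illustrative sketch of the TTY escape-character rules, assuming the
# default escape character ^] (0x1d).
ESC, CTRL_C, CTRL_Z = "\x1d", "\x03", "\x1a"

def handle_pair(first, second):
    if first != ESC:
        return ("send", first + second)   # no escape: pass both through
    if second == CTRL_C:
        return ("kill", "")               # ^]^C kills maelstrom-run
    if second == CTRL_Z:
        return ("suspend", "")            # ^]^Z suspends maelstrom-run
    if second == ESC:
        return ("send", ESC)              # doubled escape sends one escape
    return ("send", first + second)       # no special meaning: send both

print(handle_pair(ESC, CTRL_C))  # ('kill', '')
print(handle_pair(ESC, ESC))     # ('send', '\x1d')
```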

The escape character must be specified as a string in one of three forms:

  • As a one-character string. In this case, the one character will be the escape character.
  • As a two-character string in caret notation. For example: ^C, ^], etc.
  • As a Rust byte literal starting with \. For example: \n, \x77, etc. Note that TOML and the shell may perform their own backslash escaping before Maelstrom sees the string. Be sure to either double the backslash or use a string form that isn't subject to backslash escaping.

Whatever form it takes, the character must be an ASCII character: it can't have a numeric value larger than 127.
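A sketch of parsing the three forms may help; this is an illustrative model (the function name is hypothetical), not Maelstrom's parser. Caret notation maps ^X to the control character whose code is X's code with bit 0x40 flipped, which is the standard convention.

```python
# Illustrative sketch: parse an escape character given as a one-character
# string, caret notation (^C, ^]), or a backslash escape (\n, \x77).
def parse_escape_char(s):
    if len(s) == 1:
        c = s
    elif len(s) == 2 and s[0] == "^":
        c = chr(ord(s[1].upper()) ^ 0x40)        # caret notation: ^C -> 0x03
    elif s.startswith("\\"):
        c = s.encode().decode("unicode_escape")  # e.g. "\\n" -> "\n"
    else:
        raise ValueError(f"bad escape character: {s!r}")
    if len(c) != 1 or ord(c) > 127:
        raise ValueError("escape character must be a single ASCII character")
    return c

print(hex(ord(parse_escape_char("^]"))))  # 0x1d
```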

Command-Line Options

In addition to the command-line options used to specify configuration values, described in the previous chapter, maelstrom-run supports these command-line options:

--file

Read job specifications from the provided file instead of from standard input.

--one

Run in "one" mode.

This flag conflicts with --tty.

The job specifications to run can be provided with --file or on standard input. If provided on standard input, maelstrom-run will stop reading once it has read one complete job specification. If multiple job specifications are provided with --file, only the first one is used: the rest are discarded.

If any positional command-line arguments are provided, they will replace the program and arguments fields of the provided job specification.

--tty

Run in TTY mode.

This flag conflicts with --one.

The job specifications to run can be provided with --file or on standard input. If provided on standard input, maelstrom-run will stop reading once it has read one complete job specification. If multiple job specifications are provided with --file, only the first one is used: the rest are discarded.

If any positional command-line arguments are provided, they will replace the program and arguments fields of the provided job specification.

After the job specification has been read, maelstrom-run will start the job and attempt to connect to its TTY. Once that happens, the program will take over the local terminal in the same way SSH does, forwarding data between the local terminal and the job's terminal.

Job Specification Format

Jobs in maelstrom-run are specified in JSON, either via standard input or from a file.

The format is a stream of individual JSON objects. The objects aren't separated by any specific character, and they aren't part of a larger object or list.

We chose this format over TOML or other JSON representations because we wanted something that would allow us to start executing individual jobs as soon as they were read, without having to read all of the job specifications first.
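This stream-of-objects format is easy to consume incrementally. As a sketch (in Python, purely for illustration; Maelstrom itself is written in Rust), json.JSONDecoder.raw_decode can peel one object at a time off the stream:

```python
import json

# Sketch of reading a stream of concatenated JSON objects with no separator,
# yielding each one as soon as it is complete.
def iter_job_specs(text):
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        while pos < len(text) and text[pos].isspace():
            pos += 1  # skip whitespace between objects
        if pos == len(text):
            break
        obj, end = decoder.raw_decode(text, pos)
        yield obj
        pos = end

stream = '{"program": "echo"} {"program": "true"}'
print([spec["program"] for spec in iter_job_specs(stream)])  # ['echo', 'true']
```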

Here is a simple example:

{
        "image": "docker://alpine",
        "program": "echo",
        "arguments": ["Hello", "world!"]
}
{
        "image": "docker://alpine",
        "program": "echo",
        "arguments": ["¡Hola", "mundo!"]
}

This will print out Hello world! and ¡Hola mundo! on two separate lines. If you run maelstrom-run without --file, and type in the first job specification, you will see that the job is run in the background and Hello world! is outputted, even before you specify the end of input. You can then input the second job specification and have it run.

The fields in the job specification JSON object are documented in the next chapter.

Job Specification Fields

This chapter specifies all of the possible fields for a job specification. Most, but not all, of these fields have an obvious mapping to job-spec fields.

image

This field specifies the image field of the job spec, as described below.

The field can either be a string or an object. If it's a string, then it specifies the URI of the image to use, as documented here.

If it's an object, then it must have a string name field and it may have an optional use field. The name field specifies the URI of the image to use, as documented here.

The use field must be a list of strings specifying what parts of the container image to use for the job spec. It must contain a non-empty subset of:

  • layers
  • environment
  • working_directory

If no use field is provided, or if the first form is used where only a URI is specified, then the job will use the layers and environment from the image.

For example, the following three are identical job specifications:

{
        "image": {
                "name": "docker://ubuntu",
                "use": [ "layers", "environment" ]
        },
        "program": "echo",
        "arguments": [ "hello", "world" ]
}
{
        "image": { "name": "docker://ubuntu" },
        "program": "echo",
        "arguments": [ "hello", "world" ]
}
{
        "image": "docker://ubuntu",
        "program": "echo",
        "arguments": [ "hello", "world" ]
}

program

This field must be a string, and it specifies the program to be run. It sets the program field of the job spec. It must be provided.

arguments

This field must be a list of strings, and it specifies the program's arguments. It sets the arguments field of the job spec. If not provided, the job spec will have an empty arguments vector.

environment

This field sets the environment field of the job spec. If not provided, the job spec will have an empty environment vector.

This field can be specified in one of two ways: implicit mode and explicit mode.

Implicit Mode

In this mode, a JSON map from string to string is expected. The job spec's environment will have a single element in it, containing the provided map. This is usually what you want.

% BAR=bar maelstrom-run --one
{
        "image": {
                "name": "docker://ubuntu",
                "use": [ "layers" ]
        },
        "program": "/usr/bin/env",
        "environment": {
                "FOO": "foo",
                "BAR": "$env{BAR}"
        }
}
BAR=bar
FOO=foo
%

This mode is incompatible with a use of environment in the image. It's ambiguous whether the desired environment is the one provided with environment or the one from the image:

% maelstrom-run --one
{
        "image": {
                "name": "docker://ubuntu",
                "use": [ "layers", "environment" ]
        },
        "program": "/usr/bin/env",
        "environment": {
                "FOO": "foo",
                "BAR": "bar"
        }
}
Error: field `environment` must provide `extend` flags if [`image` with a `use`
of `environment`](#image-use-environment) is also set at line 11 column 1
%

Explicit Mode

In this mode, a list of EnvironmentSpec is provided, and they are used verbatim in the job spec.

For example:

% BAR=bar maelstrom-run --one
{
        "image": {
                "name": "docker://ubuntu",
                "use": [ "layers", "environment" ]
        },
        "program": "/usr/bin/env",
        "environment": [
                { "vars": { "PATH": "$prev{PATH}", "FOO": "foo" }, "extend": false },
                { "vars": { "BAR": "$env{BAR}" }, "extend": true }
        ]
}
BAR=bar
FOO=foo
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
%

layers

This field sets the layers field of the job spec. Either this field, or an image with a use of layers must be provided, but not both.

To add additional layers beyond those provided by an image, use added_layers.

Here is an example of specifying layers:

% wget -O busybox https://busybox.net/downloads/binaries/1.35.0-x86_64-linux-musl/busybox
...
% chmod +x ./busybox
% maelstrom-run --one
{
        "layers": [
                { "paths": [ "busybox" ] },
                { "symlinks": [{ "link": "/ls", "target": "/busybox" }] }
        ],
        "program": "/ls"
}
busybox
ls
%

added_layers

This is just like layers, except it can only be used with an image that has a use of layers. The provided layers will be appended to the layers provided by the image when creating the job spec.

Here's an example:

% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "added_layers": [
                { "stubs": [ "/foo/{bar,baz}" ] }
        ],
        "program": "/bin/ls",
        "arguments": [ "/foo" ]
}
bar
baz
%

mounts

This field sets the mounts field of the job spec. If this field isn't specified, an empty mounts list will be set in the job spec.

The field must be a list of objects, where the format is the direct JSON translation of the corresponding job-spec type.

For example:

% maelstrom-run
{
        "image": "docker://ubuntu",
        "added_layers": [
                { "stubs": [ "/dev/{null,zero}" ] }
        ],
        "mounts": [
                { "type": "proc", "mount_point": "/proc" },
                { "type": "tmp", "mount_point": "/tmp" },
                { "type": "devices", "devices": [ "null", "zero" ] }
        ],
        "program": "mount"
}
Maelstrom LayerFS on / type fuse (ro,nosuid,nodev,relatime,user_id=0,group_id=0)
none on /proc type proc (rw,relatime)
none on /tmp type tmpfs (rw,relatime,uid=1000,gid=1000,inode64)
udev on /dev/null type devtmpfs (rw,nosuid,relatime,size=8087700k,nr_inodes=2021925,mode=755,inode64)
udev on /dev/zero type devtmpfs (rw,nosuid,relatime,size=8087700k,nr_inodes=2021925,mode=755,inode64)
%

Bind mounts can be used to transfer data out of the job:

% touch output
% cat output
% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "added_layers": [
                { "stubs": [ "/output" ] }
        ],
        "mounts": [
                {
                        "type": "bind",
                        "mount_point": "/output",
                        "local_path": "output",
                        "read_only": false
                }
        ],
        "program": "bash",
        "arguments": [ "-c", "echo foo >output" ]
}
% cat output
foo
%

network

This field must be a string with one of the values "disabled", "loopback", or "local". It sets the network field of the job spec to the provided value. If this field isn't provided, the default of "disabled" is used.
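As a sketch (this particular spec is illustrative, not taken from the official examples), a job that inspects the loopback interface might look like the following. Whether the ip utility is present in the image, and at what path, depends on the image:

```json
{
        "image": "docker://ubuntu",
        "network": "loopback",
        "program": "/usr/sbin/ip",
        "arguments": [ "addr", "show", "lo" ]
}
```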

enable_writable_file_system

This field must be a boolean value. If it's true, it sets the root_overlay field of the job spec to Tmp. If it's not specified, or is set to false, the root_overlay field will be None.
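As an illustrative sketch (not from the official examples), a job with a writable root file system can create files outside of any mount. Since the overlay is a temporary file system, such changes should not persist after the job completes:

```json
{
        "image": "docker://ubuntu",
        "enable_writable_file_system": true,
        "program": "bash",
        "arguments": [ "-c", "echo foo >/foo && cat /foo" ]
}
```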

working_directory

This field must be a string, and it specifies the working directory of the program to be run. It sets the working_directory field of the job spec. If not provided, / will be used.

This field is incompatible with an image that has a use of working_directory.

For example:

% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "program": "pwd"
}
/
% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "program": "pwd",
        "working_directory": "/root"
}
/root
%

user

This field must be an integer, and it specifies the UID of the program to be run. It sets the user field of the job spec. If not provided, 0 will be used.

For example:

% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "program": "id"
}
uid=0(root) gid=0(root) groups=0(root),65534(nogroup)
% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "program": "id",
        "user": 1234
}
uid=1234 gid=0(root) groups=0(root),65534(nogroup)
%

group

This field must be an integer, and it specifies the GID of the program to be run. It sets the group field of the job spec. If not provided, 0 will be used.

For example:

% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "program": "id"
}
uid=0(root) gid=0(root) groups=0(root),65534(nogroup)
% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "program": "id",
        "group": 4321
}
uid=0(root) gid=4321 groups=4321,65534(nogroup)
%

timeout

This field must be an integer, and it specifies a timeout for the job in seconds. It sets the timeout field of the job spec. If not provided, the job will have no timeout.

For example:

% maelstrom-run --one
{
        "image": "docker://ubuntu",
        "program": "sleep",
        "arguments": [ "1d" ],
        "timeout": 1
}
timed out
%

maelstrom-broker

The maelstrom-broker is the coordinator for a Maelstrom cluster. It is responsible for scheduling work onto nodes in the cluster. The broker must be started before clients and workers, as the clients and workers connect to the broker, and will exit if they can't establish a connection.

The broker doesn't consume much CPU, so it can be run on any machine, including a worker machine. Ideally, whatever machine it runs on should have good throughput with the clients and workers, as all artifacts are first transferred from the clients to the broker, and then from the broker to workers.

Clients can be run in standalone mode where they don't need access to a cluster. In that case, there is no need to run a broker.

Cache

The broker maintains a cache of artifacts that are used as file system layers for jobs. If a client submits a job, and there are required artifacts for the job that the broker doesn't have in its cache, it will ask the client to transfer them. Later, when the broker submits the job to a worker, the worker may turn around and request missing artifacts from the broker.

A lot of artifacts are reused between jobs, and also between client invocations. So, the larger the broker's cache, the better. Ideally, it should be at least a few multiples of the working set size.

Command-Line Options

maelstrom-broker supports the standard command-line options, as well as a number of configuration values, which are covered in the next chapter.

Configuration Values

maelstrom-broker supports the following configuration values:

Value       Type    Description                    Default
log-level   string  minimum log level              "info"
cache-root  string  cache directory                $XDG_CACHE_HOME/maelstrom/broker/
cache-size  string  target cache disk space usage  "1 GB"
port        number  port for clients and workers   0
http-port   number  port for web UI                0

log-level

See here.

The broker always prints log messages to stderr.

cache-root

The cache-root configuration value specifies where the cache data will go. It defaults to $XDG_CACHE_HOME/maelstrom/broker, or ~/.cache/maelstrom/broker if XDG_CACHE_HOME isn't set. See the XDG spec for information.

cache-size

The cache-size configuration value specifies a target size for the cache. Its default value is 1 GB. When the cache consumes more than this amount of space, the broker will remove unused cache entries until the size is below this value.

It's important to note that this isn't a hard limit, and the broker will go above this amount in two cases. First, the broker always needs all of the currently-executing jobs' layers in cache. Second, the broker currently first downloads an artifact from the client in its entirety, then adds it to the cache, then removes old values if the cache has grown too large. In this scenario, the combined size of the downloading artifact and the cache may exceed cache-size.

For these reasons, it's important to leave some wiggle room in the cache-size setting.
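For example, a dedicated broker machine might raise the target in the broker's configuration file (the value shown is illustrative):

```toml
# Target cache size; the broker may temporarily exceed this, so leave headroom.
cache-size = "16 GB"
```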

port

The port configuration value specifies the port the broker will listen on for connections from clients and workers. It must be an integer value in the range 0–65535. A value of 0 indicates that the operating system should choose an unused port. The broker will always listen on all IP addresses of the host.

http-port

The http-port configuration value specifies the port the broker will serve the web UI on. A value of 0 indicates that the operating system should choose an unused port. The broker will always listen on all IP addresses of the host.
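For example, to pin both ports to fixed values in the broker's configuration file:

```toml
port = 9000       # clients and workers connect here
http-port = 9001  # web UI served here
```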

Running as systemd Service

You may choose to run maelstrom-broker in the background as a systemd service. This chapter covers one way to do that.

The maelstrom-broker does not need to run as root. Given this, we can create a non-privileged user to run the service:

sudo adduser --disabled-login --gecos "Maelstrom Broker User" maelstrom-broker
sudo -u maelstrom-broker mkdir ~maelstrom-broker/cache
sudo -u maelstrom-broker touch ~maelstrom-broker/config.toml
sudo cp ~/.cargo/bin/maelstrom-broker ~maelstrom-broker/

This assumes the maelstrom-broker binary is installed in ~/.cargo/bin/.

Next, create a service file at /etc/systemd/system/maelstrom-broker.service and fill it with the following contents:

[Unit]
Description=Maelstrom Broker

[Service]
User=maelstrom-broker
WorkingDirectory=/home/maelstrom-broker
ExecStart=/home/maelstrom-broker/maelstrom-broker \
    --config-file /home/maelstrom-broker/config.toml
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

Next, edit the file at /home/maelstrom-broker/config.toml and fill it with the following contents:

port = 9000
http-port = 9001
cache-root = "/home/maelstrom-broker/cache"

You can add other configuration values as you please.

Finally, enable and start the broker:

sudo systemctl enable maelstrom-broker
sudo systemctl start maelstrom-broker

The broker should be running now. If you want, you can verify this by attempting to pull up the web UI, or by checking the log messages with journalctl.

Web UI

The broker has a web UI that is available by connecting via the configured HTTP port.

The following is an explanation of the various elements on the web UI.

Connected Machines

The web UI contains information about the number of clients and the number of workers connected to the broker. The web UI itself is counted as a client.

Slots

Each running job consumes one slot. The more workers connected, the more slots are available. The broker shows the total number of available slots as well as the number currently used.

Job Statistics

The web UI contains information about current and past jobs. This includes the current number of jobs and graphs containing historical information about jobs and their states. There is a graph per connected client as well as an aggregate graph at the top. The graphs are all stacked line-charts. See Job States for information about what the various states mean.

maelstrom-worker

The maelstrom-worker is used to execute jobs in a Maelstrom cluster. In order to do any work, a cluster must have at least one worker.

The system is designed to require only one worker per node in the cluster. The worker will then run as many jobs in parallel as it has "slots". By default, it will have one slot per CPU, but it can be configured otherwise.

Clients can be run in standalone mode where they don't need access to a cluster. In that case, they will have an internal, local copy of the worker.

All jobs are run inside of containers. In addition to providing isolation to the jobs, this provides some amount of security for the worker.

Cache

Each job requires a file system for its containers. The worker provides these file systems via FUSE. It keeps the artifacts necessary to implement these file systems in its cache directory. Artifacts are reused if possible.

The worker will strive to keep the size of the cache under the configurable limit. It's important to size the cache properly. Ideally, it should be several times larger than the largest working set.

Command-Line Options

maelstrom-worker supports the standard command-line options, as well as a number of configuration values, which are covered in the next chapter.

Configuration Values

maelstrom-worker supports the following configuration values:

Value         Type    Description                                           Default
broker        string  address of broker                                     must be provided
log-level     string  minimum log level                                     "info"
cache-root    string  cache directory                                       $XDG_CACHE_HOME/maelstrom/worker/
cache-size    string  target cache disk space usage                         "1 GB"
inline-limit  string  maximum amount of captured standard output and error  "1 MB"
slots         number  job slots available                                   1 per CPU

broker

The broker configuration value specifies the socket address of the broker. This configuration value must be provided. The worker will exit if it fails to connect to the broker, or when its connection to the broker terminates.

Here are some example socket addresses:

  • broker.example.org:1234
  • 192.0.2.3:1234
  • [2001:db8::3]:1234
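In the worker's configuration file, this looks like:

```toml
broker = "broker.example.org:1234"
```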

log-level

See here.

The worker always prints log messages to stderr.

cache-root

The cache-root configuration value specifies where the cache data will go. It defaults to $XDG_CACHE_HOME/maelstrom/worker, or ~/.cache/maelstrom/worker if XDG_CACHE_HOME isn't set. See the XDG spec for information.

cache-size

The cache-size configuration value specifies a target size for the cache. Its default value is 1 GB. When the cache consumes more than this amount of space, the worker will remove unused cache entries until the size is below this value.

It's important to note that this isn't a hard limit, and the worker will go above this amount in two cases. First, the worker always needs all of the currently-executing jobs' layers in cache. Second, the worker currently first downloads an artifact in its entirety, then adds it to the cache, then removes old values if the cache has grown too large. In this scenario, the combined size of the downloading artifact and the cache may exceed cache-size.

For these reasons, it's important to leave some wiggle room in the cache-size setting.

inline-limit

The inline-limit configuration value specifies how many bytes of stdout or stderr will be captured from jobs. Its default value is 1 MB. If stdout or stderr grows larger, the client will be given inline-limit bytes and told that the rest of the data was truncated.

In the future we will add support for the worker storing all of stdout and stderr if they exceed inline-limit. The client would then be able to download it "out of band".

slots

The slots configuration value specifies how many jobs the worker will run concurrently. Its default value is the number of CPU cores on the machine. In the future, we will add support for jobs consuming more than one slot.
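For example, a worker configuration file overriding both inline-limit and slots might look like this (the values shown are illustrative):

```toml
inline-limit = "10 MB"  # capture up to 10 MB of stdout/stderr per job
slots = 4               # run at most 4 jobs concurrently
```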

Running as systemd Service

You may choose to run maelstrom-worker in the background as a systemd service. This chapter covers one way to do that.

The maelstrom-worker does not need to run as root. Given this, we can create a non-privileged user to run the service:

sudo adduser --disabled-login --gecos "Maelstrom Worker User" maelstrom-worker
sudo -u maelstrom-worker mkdir ~maelstrom-worker/cache
sudo -u maelstrom-worker touch ~maelstrom-worker/config.toml
sudo cp ~/.cargo/bin/maelstrom-worker ~maelstrom-worker/

This assumes the maelstrom-worker binary is installed in ~/.cargo/bin/.

Next, create a service file at /etc/systemd/system/maelstrom-worker.service and fill it with the following contents:

[Unit]
Description=Maelstrom Worker

[Service]
User=maelstrom-worker
WorkingDirectory=/home/maelstrom-worker
ExecStart=/home/maelstrom-worker/maelstrom-worker \
    --config-file /home/maelstrom-worker/config.toml
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

Next, edit the file at /home/maelstrom-worker/config.toml and fill it with the following contents:

broker = "<broker-machine-address>:<broker-port>"
cache-root = "/home/maelstrom-worker/cache"

The <broker-machine-address> and <broker-port> need to be substituted with their actual values. You can add other configuration values as you please.

Finally, enable and start the worker:

sudo systemctl enable maelstrom-worker
sudo systemctl start maelstrom-worker

The worker should be running now. If you want, you can verify this by pulling up the broker web UI and checking the worker count, or by looking at the broker's log messages.