Skip to main content

run_experiment()

run_experiment(name, run, parallelizable=False, args=[], options={}, deps=[])

A run_experiment() task runs the command specified in the run argument. The task's output files are versioned and archivable.

All files that this task produces should be written to the path given in the COND_OUT environment variable. This ensures that (i) your experiment results can be versioned and archived correctly, and (ii) that other tasks that depend on this task can find this task's produced files.

Arguments

name

Type: String (required)

The task's name. This name must be unique within the task's COND file. A task name can only contain letters, numbers, hyphens (-), and underscores (_).

run

Type: String (required)

The command to execute when running this task. This command will be executed using bash, with the location of the task's COND file as the shell's working directory. In other words, any relative paths in the command will be interpreted as relative to the directory containing the task's COND file.

Conductor uses the exit code of the command to determine whether it succeeded or failed. If the command succeeds, it should exit with an exit code of 0. Conductor interprets any non-zero exit code as a failure. If a task fails, it prevents any other tasks that depend on it from executing (see the reference for cond run).

parallelizable

Type: Boolean (default: False)

If set to True, Conductor may launch this task while other parallelizable tasks are running. You should set this argument to True if it is okay for this task to execute while other tasks are also running.

By default, tasks are not parallelizable, and so Conductor will not launch a new task until the previously launched task has completed (or failed).

args

Type: List of primitive types (default: [])

A list of ordered arguments that should be passed to the command string specified in run. The arguments will be passed to the command in the order they are listed in args. The primitive types supported in args are strings, Booleans, integers, and floating point numbers.

Example

run_experiment(
name="example",
run="./run.sh",
args=["arg1", "arg2", 123, True, 0.3],
)

Conductor will execute the task shown above by running ./run.sh arg1 arg2 123 true 0.3 in bash.

options

Type: Dictionary mapping string keys to primitive values (default: {})

A map of string keys to primitive values that should be passed to the command string specified in run. Conductor treats these options as command line "flags" and will pass them to the run command using --key=value syntax. Like args, the primitive types supported in options are strings, Booleans, integers, and floating point numbers. When args and options are both non-empty, args are always passed first before options.

Example

run_experiment(
name="example",
run="./run.sh",
args=["arg1"],
options={
"foo": 3,
"bar": True,
},
)

Conductor will execute the task shown above by running ./run.sh arg1 --foo=3 --bar=true in bash.

deps

Type: List of task identifiers (default: [])

A list of task identifiers that this task should depend on. Conductor will ensure that all dependencies execute successfully before launching this task.

When depending on tasks defined in the same COND file, you can just specify the task's name prefixed by a colon (e.g., :compile would refer to a task named compile defined in the same file). If you need to depend on a task defined in a different COND file, you must specify the fully qualified task identifier (e.g., //experiments:benchmark would refer to a task named benchmark defined in the COND file in the experiments directory).

Reserved File Names

Conductor records additional metadata about run_experiment() tasks in special files under the path given by $COND_OUT. Your executable should not produce files with the same names because they will be overwritten.

File NameDescription
args.jsonThe arguments passed to the run command using args, serialized to JSON.
options.jsonThe options passed to the run command using options, serialized to JSON.
stderr.logA log of the run command's output to standard error.
stdout.logA log of the run command's output to standard out.

Versioning and Caching Semantics

Conductor versions run_experiment() tasks' outputs. When running a run_experiment() task, Conductor will check to see if a previous compatible output version exists. If so, it will use the cached results of the "most compatible" version instead of running the task again (unless otherwise specified, see cond run). This section describes the semantics of output version "compatibility."

Projects Using Git

Whenever a run_experiment() task executes, Conductor records the repository's current commit hash and associates it with the outputs. Conductor then determines a task output's compatibility based on the task output's commit hash and the repository's current commit (i.e., the value of HEAD).

The most compatible version is the version whose commit is both (i) an ancestor of the current commit (i.e., an ancestor of HEAD), and (ii) "closest" to the current commit. Conductor defines closeness as the number of commits separating the repository's current commit (HEAD) and the task version's commit. If there are multiple closest task versions, Conductor selects the most recent one as determined by execution timestamp. If there are no "compatible" outputs, Conductor will execute the task.

Projects Without Git

If your project is not managed using Git (or has Conductor's Git integration explicitly disabled), Conductor will always select the most recent version of the task's outputs. Recency is determined by when the task was executed (i.e., a timestamp). If there are no outputs available, Conductor will execute the task.

Usage Example

COND
run_experiment(
name="benchmark",
run="./run_benchmark.sh",
parallelizable=False,
args=["my_dataset.csv"],
options={
"threads": 3,
},
deps=[
":compile",
],
)