
Overview

Conductor is a simple and elegant tool that orchestrates your research computing. It automates your pipeline end to end, from running experiments to producing the figures in your paper.

Installing

Conductor requires Python 3.8+ and is currently only supported on macOS and Linux machines. It has been tested on macOS 10.14 and Ubuntu 20.04.

Conductor is available on PyPI and can be installed with pip.

pip install conductor-cli

After installation, the cond executable should be available in your shell.

cond --help

Getting Started

A quick way to get started is to look at Conductor's example projects. Below is a brief overview of a few important Conductor concepts.

Project Root

When using Conductor with your project, you first need to add a cond_config.toml file to your project's root directory. This file tells Conductor where your project files are located and is important because all task identifiers (defined below) are relative to your project root.

Tasks

Conductor works with "tasks", which are jobs (arbitrary shell commands or scripts) that it should run. You define tasks in COND files using Python syntax. Every task has a predefined "type" (e.g., run_experiment()); the available types are listed in the task types reference documentation.

Conductor's tasks are very similar to (and inspired by) Bazel's and Buck's build rules.

Task Identifiers

A task is identified using the path to the COND file where it is defined (relative to your project's root directory), followed by its name. For example, a task named run_benchmark defined in a COND file located in experiments/COND would have the task identifier //experiments:run_benchmark. To have Conductor run the task, you run cond run //experiments:run_benchmark in your shell.
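To make the naming rule concrete, here is a minimal Python sketch that builds an identifier from a COND file path and a task name. The helper task_identifier is hypothetical and is not part of Conductor. The handling of a COND file placed directly in the project root is an assumption, since the text above only covers the subdirectory case.

```python
from pathlib import PurePosixPath


def task_identifier(cond_file: str, task_name: str) -> str:
    """Build a task identifier from a COND file path and a task name.

    `cond_file` is the path to the COND file relative to the project root.
    This is an illustrative helper, not a Conductor API.
    """
    directory = PurePosixPath(cond_file).parent
    # Assumption: a COND file in the project root maps to an empty path part.
    path_part = "" if str(directory) == "." else directory.as_posix()
    return f"//{path_part}:{task_name}"


# The example from the text: a task defined in experiments/COND.
print(task_identifier("experiments/COND", "run_benchmark"))
# → //experiments:run_benchmark
```

The resulting string is what you would pass to cond run.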

Dependencies

Tasks can depend on other tasks. To specify a dependency, use the deps keyword argument when defining a task. When you run a task that has dependencies, Conductor ensures that all of its dependencies are executed before the task itself. This lets you build a dependency graph of tasks that automates your entire research computing pipeline.
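The "dependencies first" ordering can be illustrated with a short Python sketch based on a depth-first traversal. This is not Conductor's implementation. The function execution_order and the task identifiers //experiments:build and //figures:plot are hypothetical, chosen only to show how a dependency graph determines run order.

```python
from typing import Dict, List


def execution_order(target: str, deps: Dict[str, List[str]]) -> List[str]:
    """Return the tasks needed for `target`, with dependencies listed first."""
    order: List[str] = []
    done = set()
    in_progress = set()

    def visit(task: str) -> None:
        if task in done:
            return
        if task in in_progress:
            raise ValueError(f"dependency cycle detected at {task}")
        in_progress.add(task)
        # Visit every dependency before recording the task itself.
        for dep in deps.get(task, []):
            visit(dep)
        in_progress.discard(task)
        done.add(task)
        order.append(task)

    visit(target)
    return order


# Hypothetical graph: a figure depends on a benchmark, which depends on a build.
graph = {
    "//figures:plot": ["//experiments:run_benchmark"],
    "//experiments:run_benchmark": ["//experiments:build"],
    "//experiments:build": [],
}
print(execution_order("//figures:plot", graph))
# → ['//experiments:build', '//experiments:run_benchmark', '//figures:plot']
```

Running the final task in such a graph is enough to trigger everything it depends on, which is what makes a single cond run invocation able to drive a whole pipeline.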

Task Outputs

Most tasks (but not all) produce output files (e.g., measurements, figures). When Conductor runs a task, it sets the COND_OUT environment variable to a path where the task should write its outputs. See the example projects for how this is used. All task outputs are stored under the cond-out directory.

Similarly, Conductor sets the COND_DEPS environment variable to a colon-separated (:) list of paths to the outputs of the task's dependencies. If the task has no dependencies, COND_DEPS is set to an empty string.

It's important to write task outputs to the path specified by COND_OUT. This ensures other tasks can find the current task's outputs, and also allows Conductor to archive your tasks' outputs.
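As a rough sketch of how a task script might use these two variables, the Python below reads COND_DEPS and writes a file under COND_OUT. The function names and the results.csv filename are hypothetical. The sketch assumes COND_OUT points to a directory and creates it if it is missing, which may be unnecessary in practice.

```python
import os
from pathlib import Path
from typing import List, Mapping


def dependency_output_dirs(environ: Mapping[str, str] = os.environ) -> List[Path]:
    """Split COND_DEPS into a list of paths (empty if there are no dependencies)."""
    raw = environ.get("COND_DEPS", "")
    # An empty string means the task has no dependencies.
    return [Path(part) for part in raw.split(":") if part]


def write_result(
    text: str,
    environ: Mapping[str, str] = os.environ,
    filename: str = "results.csv",  # hypothetical output file name
) -> Path:
    """Write `text` to a file under the path given by COND_OUT."""
    out_dir = Path(environ["COND_OUT"])
    # Assumption: COND_OUT is a directory; create it defensively.
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / filename
    out_file.write_text(text)
    return out_file
```

A task script run by Conductor would call dependency_output_dirs() to locate its inputs and write_result(...) to store its own output, relying on the default os.environ in both cases.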