Commands

Commands are at the heart of how Raster Vision turns configuration into actions that can run in various environments (e.g. locally or on AWS Batch). When a user runs an Experiment through Raster Vision, every ExperimentConfig is transformed into one or more command configurations, which are then tied together through their inputs and outputs and used to generate the commands to be run. Without commands, experiments are simply configuration.

Command Generation and Execution

Commands are generated from CommandConfigs in the runner environment. Commands follow the same Configuration vs Entity differentiation that ExperimentConfig elements do - they are only created when and where they are to be executed. For example, if you are running Raster Vision against AWS Batch, the Commands themselves are only created in the AWS Batch task that is going to run the command.

Each CommandConfig is initially generated in the client environment. A CommandConfig can be created directly from a CommandConfigBuilder, or generated as part of an internal Raster Vision process that generates CommandConfigs from ExperimentConfigs. The flowchart below shows how all configurations are eventually decomposed into CommandConfigs, and then executed in the runner environment as Commands:

_images/command-execution-workflow.png

Command Architecture

_images/command-hierarchy.png

Every command derives from the Command abstract class, and is associated with a CommandConfig and CommandConfigBuilder. Every command must implement methods that describe the input and output of the command; this is how commands are structured in the Directed Acyclic Graph (DAG) of commands - if command B declares an input that is declared as output from command A, then there will be an edge (Command A)->(Command B) in the DAG of commands. This ensures that commands are run in the proper order. Commands often will declare their inputs implicitly based on configuration, so that you do not have to specify full URIs for inputs and outputs. However, this is command specific; e.g. Aux Commands are often more explicitly configured.
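As a rough illustration of the ordering rule described above, the following sketch builds the edges of a command DAG from declared inputs and outputs and derives a run order from them. It does not use Raster Vision's own classes: SimpleCommand, build_edges and run_order are hypothetical names chosen for this example, and the URIs are placeholders.

```python
from collections import namedtuple
from graphlib import TopologicalSorter

# Hypothetical stand-in for a command: a name plus declared input/output URIs.
SimpleCommand = namedtuple('SimpleCommand', ['name', 'inputs', 'outputs'])


def build_edges(commands):
    """Return a set of (upstream, downstream) command name pairs.

    An edge A -> B exists when B declares an input that A declares as output.
    """
    producers = {}
    for cmd in commands:
        for uri in cmd.outputs:
            producers[uri] = cmd.name

    edges = set()
    for cmd in commands:
        for uri in cmd.inputs:
            upstream = producers.get(uri)
            if upstream is not None and upstream != cmd.name:
                edges.add((upstream, cmd.name))
    return edges


def run_order(commands):
    """Return command names in an order that respects every edge."""
    sorter = TopologicalSorter({cmd.name: set() for cmd in commands})
    for upstream, downstream in build_edges(commands):
        sorter.add(downstream, upstream)
    return list(sorter.static_order())


# Commands are listed out of order on purpose; the DAG restores the order.
commands = [
    SimpleCommand('train', ['/exp/chips'], ['/exp/model']),
    SimpleCommand('chip', ['/exp/stats.json'], ['/exp/chips']),
    SimpleCommand('analyze', ['/data/scene.tif'], ['/exp/stats.json']),
]
order = run_order(commands)
```

Because the edges are derived purely from matching URIs, a command that declares no overlapping inputs or outputs simply has no ordering constraint relative to the others.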

Commands are further differentiated between standard commands and auxiliary commands. Auxiliary commands are a simplified version of commands that are less flexible when it comes to implicit configuration, but are often easier to use and implement for explicitly configured commands such as those used for preprocessing data.

Standard Commands

Several commands are at the core of most machine learning workflows, and these are implemented as standard commands in Raster Vision:

_images/commands-chain-workflow.png

ANALYZE

The ANALYZE command is used to analyze scenes that are part of an experiment and produce some output that can be consumed by later commands. Geospatial raster sources such as GeoTIFFs often contain 16- and 32-bit pixel color values, but many deep learning libraries expect 8-bit values. In order to perform this transformation, we need to know the distribution of pixel values. So one usage of the ANALYZE command is to compute statistics of the raster sources and save them to a JSON file which is later used by the StatsTransformer (one of the available Raster Transformers) to do the conversion.
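The idea of computing statistics once and reusing them for the 8-bit conversion can be sketched as follows. This is a minimal illustration, assuming the conversion clips values to a few standard deviations around the mean and rescales them linearly to 0-255; the actual StatsTransformer may work differently. The function names and the JSON layout are hypothetical.

```python
import json
import statistics


def compute_stats(pixel_values):
    """Compute the mean and standard deviation of one channel's pixel values."""
    return {
        'mean': statistics.fmean(pixel_values),
        'std': statistics.pstdev(pixel_values),
    }


def to_uint8(value, stats, max_stds=3.0):
    """Map a raw pixel value into the 0-255 range.

    Values are clipped to mean +/- max_stds * std and then scaled linearly.
    """
    low = stats['mean'] - max_stds * stats['std']
    high = stats['mean'] + max_stds * stats['std']
    if high == low:
        return 0  # constant channel: nothing to stretch
    clipped = min(max(value, low), high)
    return int(round(255 * (clipped - low) / (high - low)))


# Placeholder 16-bit values for a single channel.
channel = [1200, 1500, 1800, 2100, 2400, 60000]
stats = compute_stats(channel)

# The statistics can be serialized to JSON and read back by a later step.
stats_json = json.dumps(stats)
restored = json.loads(stats_json)
converted = [to_uint8(v, restored) for v in channel]
```

Separating the statistics from the conversion means the expensive pass over the imagery happens once, while the cheap per-pixel mapping can be repeated wherever the JSON file is available.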

CHIP

Scenes are composed of large geospatial raster sources (e.g. GeoTIFFs) and geospatial label sources (e.g. GeoJSONs), but models can only consume small images (i.e. chips) and labels in pixel-based coordinates. In addition, each backend has its own dataset format. The CHIP command solves this problem by converting scenes into training chips and into a format the backend can use for training.
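The core of the chipping step, cropping a small window and expressing labels relative to it, can be sketched in plain Python. This is only an illustration under simplifying assumptions: the raster is a list of rows, labels are already in scene pixel coordinates, and crop_chip and box_to_chip_coords are hypothetical helpers, not Raster Vision functions.

```python
def crop_chip(raster, row_off, col_off, size):
    """Crop a size x size chip from a raster stored as a list of rows."""
    return [row[col_off:col_off + size]
            for row in raster[row_off:row_off + size]]


def box_to_chip_coords(box, row_off, col_off, size):
    """Translate a (ymin, xmin, ymax, xmax) box from scene pixel coordinates
    into chip pixel coordinates, clipping it to the chip.

    Returns None when the box does not overlap the chip.
    """
    ymin, xmin, ymax, xmax = box
    ymin, ymax = max(ymin - row_off, 0), min(ymax - row_off, size)
    xmin, xmax = max(xmin - col_off, 0), min(xmax - col_off, size)
    if ymin >= ymax or xmin >= xmax:
        return None
    return (ymin, xmin, ymax, xmax)


# A tiny 6x6 "scene" whose pixel value encodes its position (row * 10 + col).
scene = [[r * 10 + c for c in range(6)] for r in range(6)]

chip = crop_chip(scene, row_off=2, col_off=3, size=3)
label = box_to_chip_coords((1, 4, 4, 6), row_off=2, col_off=3, size=3)
```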

TRAIN

The TRAIN command is used to train a model using the dataset generated by the CHIP command. The command is a thin wrapper around the train method in the backend that synchronizes files with the cloud, configures and calls the training routine provided by the associated third-party machine learning library, and sets up a log visualization server in some cases (e.g. TensorBoard). The output is a trained model that can be used to make predictions or be fine-tuned on another dataset.

PREDICT

The PREDICT command makes predictions for a set of scenes using a model produced by the TRAIN command. To do this, a sliding window is used to feed small images into the model, and the predictions are transformed from image-centric, pixel-based coordinates into scene-centric, map-based coordinates.
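The two steps described above, generating sliding windows and mapping window-relative predictions back to map coordinates, might look roughly like the sketch below. It assumes a six-coefficient affine geotransform in the ordering used by GDAL; the function names, scene size, and coordinate values are hypothetical and chosen only for illustration.

```python
def sliding_windows(height, width, chip_size, stride):
    """Yield (row_off, col_off) offsets of chip_size windows covering a scene.

    The last window in each direction is shifted back so it stays in bounds.
    Assumes the scene is at least chip_size pixels in both dimensions.
    """
    def offsets(extent):
        last = extent - chip_size
        steps = list(range(0, last + 1, stride))
        if steps[-1] != last:
            steps.append(last)
        return steps

    for row_off in offsets(height):
        for col_off in offsets(width):
            yield row_off, col_off


def pixel_to_map(row, col, geotransform):
    """Convert scene pixel coordinates to map coordinates.

    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    """
    gt = geotransform
    return (gt[0] + col * gt[1] + row * gt[2],
            gt[3] + col * gt[4] + row * gt[5])


def window_box_to_map(box, row_off, col_off, geotransform):
    """Convert a (ymin, xmin, ymax, xmax) box predicted inside a window
    into (upper_left, lower_right) map coordinates."""
    ymin, xmin, ymax, xmax = box
    upper_left = pixel_to_map(ymin + row_off, xmin + col_off, geotransform)
    lower_right = pixel_to_map(ymax + row_off, xmax + col_off, geotransform)
    return upper_left, lower_right


windows = list(sliding_windows(height=5, width=7, chip_size=3, stride=2))

# Placeholder geotransform: 10 map units per pixel, north-up imagery.
geotransform = (500000.0, 10.0, 0.0, 4000000.0, 0.0, -10.0)
map_box = window_box_to_map((0, 1, 2, 3), row_off=2, col_off=4,
                            geotransform=geotransform)
```

Using a stride smaller than the chip size makes neighboring windows overlap, which is one common way to avoid missing objects that straddle a window boundary.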

EVAL

The EVAL command evaluates the quality of models by comparing the predictions generated by the PREDICT command to ground truth labels. A variety of metrics including F1, precision, and recall are computed for each class (as well as overall) and are written to a JSON file.
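For reference, the per-class metrics mentioned above follow directly from counts of true positives, false positives, and false negatives. The sketch below is a minimal illustration and not the EVAL command's implementation: the function names, the class names, the counts, and the choice of a micro-averaged overall entry are all assumptions made for this example.

```python
import json


def class_metrics(true_pos, false_pos, false_neg):
    """Compute precision, recall and F1 from raw counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {'precision': precision, 'recall': recall, 'f1': f1}


def evaluate(counts_by_class):
    """Return per-class metrics plus an 'overall' entry.

    The overall entry is micro-averaged (computed from the summed counts),
    which is one choice among several possible averaging schemes.
    """
    report = {name: class_metrics(*counts)
              for name, counts in counts_by_class.items()}
    totals = [sum(column) for column in zip(*counts_by_class.values())]
    report['overall'] = class_metrics(*totals)
    return report


# Placeholder (true_pos, false_pos, false_neg) counts per class.
counts = {'building': (8, 2, 2), 'road': (3, 1, 3)}
report = evaluate(counts)

# The report serializes cleanly to JSON.
report_json = json.dumps(report, indent=2)
```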

BUNDLE

The BUNDLE command gathers files necessary to create a prediction package from the output of the previous commands. A prediction package contains a model file plus associated configuration data, and can be used to make predictions on new imagery in a deployed application.

Auxiliary (Aux) Commands

Raster Vision utilizes auxiliary commands for things like data preparation. These are commands that do not run in the normal ML pipeline (e.g., if one were to run rastervision run without a command specified). Auxiliary commands normally do not have the same type of implicit configuration setting as normal commands; because of this, file paths are often set explicitly, and these commands are often configured and returned from an ExperimentSet method directly, instead of being implicitly created through the ExperimentConfig.

Configuring Aux Commands

There are two ways to configure an Aux command: one is through custom configuration set on an ExperimentConfig, and the other is to directly return a CommandConfig instance from an experiment method. Normally Aux Commands are run separately from the normal experiment workflow, so we suggest returning command configurations as a default.

Configuring an Aux Command from an ExperimentConfig

To pass an Aux Command configuration through the experiment, set it in the experiment's custom configuration as a dictionary of aux command configuration values, stored under a key that matches the command name.

The aux command configuration dict must either have a root_uri property set, which will determine the root URI to store command configuration, or a key property, which will be used to implicitly construct the root URI based on the Experiment’s overall root URI.

The aux command configuration must also have a config key, which holds the configuration values for that particular command as a dict.

For example, to set the configuration for the CogifyCommand on your experiment, you would do the following:

import rastervision as rv

class ExampleExperiments(rv.ExperimentSet):
    def exp_example(self):
        # NOTE: tmp_dir, src_path and cog_path are assumed to be defined
        # elsewhere (the experiment root URI, the source image URI and the
        # destination COG URI, respectively).

        # Full experiment configuration builder generated elsewhere...
        experiment_builder = get_experiment_builder()

        # Before building the ExperimentConfig, set custom configuration
        # for the COGIFY Aux Command.
        e = experiment_builder \
            .with_root_uri(tmp_dir) \
            .with_custom_config({
                'cogify': {
                    'key': 'test',
                    'config': {
                        'uris': [(src_path, cog_path)],
                        'block_size': 128
                    }
                }
            }) \
            .build()

        return e
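The requirements on the aux command configuration dictionary described in this section can be checked with a small helper like the one sketched here. It is a hypothetical function, not part of Raster Vision, and it only mirrors the rules stated above: the entry is stored under the command name, has either a root_uri or a key, and holds a config dict. The paths in the sample dictionary are placeholders.

```python
def validate_aux_config(command_name, custom_config):
    """Check the shape of one aux command entry in a custom config dict.

    Returns the entry when it is well formed and raises ValueError otherwise.
    """
    entry = custom_config.get(command_name)
    if not isinstance(entry, dict):
        raise ValueError(
            'no configuration dict found for "{}"'.format(command_name))
    if 'root_uri' not in entry and 'key' not in entry:
        raise ValueError('either "root_uri" or "key" must be set')
    if not isinstance(entry.get('config'), dict):
        raise ValueError('"config" must be a dict of command values')
    return entry


# Same shape as the custom configuration in the example above.
custom_config = {
    'cogify': {
        'key': 'test',
        'config': {
            'uris': [('/data/some.tif', '/data/some-cog.tif')],
            'block_size': 128
        }
    }
}
entry = validate_aux_config('cogify', custom_config)
```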

Configuring an Aux Command directly

You can also build the command configuration directly using the builder pattern. Aux Command builders all have the with_root_uri method, which sets the root URI that will store command configuration, as well as the with_config method. The with_config method accepts **kwargs for configuration values.

You can return one or more command configurations directly from an experiment method, either as a single command configuration or as a list of configurations.

Below is an example of an ExperimentSet with one experiment method, which returns a configuration for a cogify command.

import rastervision as rv

class Preprocess(rv.ExperimentSet):
    def exp_cogify(self):
        root_uri = 's3://my-bucket/cogify'
        uris = [('s3://my-bucket/original/some.tif', 's3://my-bucket/cogs/some-cog.tif')]

        cmd_config = rv.CommandConfig.builder(rv.COGIFY) \
                                     .with_root_uri(root_uri) \
                                     .with_config(uris=uris,
                                                  resample_method='bilinear',
                                                  compression='jpeg') \
                                     .build()

        return cmd_config

Running Aux Commands

By default, Aux Commands will not run unless they are explicitly specified on the command line. That means

> rastervision -p example run local -e example.Preprocess

will not run the above Cogify command. However, this will:

> rastervision -p example run local -e example.Preprocess cogify

Aux Commands included with Raster Vision

COGIFY

The COGIFY command turns GDAL-readable images into Cloud Optimized GeoTIFFs.

See the CogifyCommand entry in the Aux Commands API docs for configuration options.

Custom Commands

Custom Commands allow advanced Raster Vision users to implement their own commands using the Plugins architecture.

To create a standard custom command, you will need to create implementations of the Command, CommandConfig, and CommandConfigBuilder interfaces. You then need to register the CommandConfigBuilder using the register_command_config_builder method of the plugin registry.

Custom commands that are built as standard commands will by default always be run - that is, if you run rastervision run … without any specific command, your custom command will be run by default. The order in which it is run will be determined by how the inputs and outputs it declares are connected with other command definitions. One detail to note is the update_for_command method of custom commands will be called after it is called for the standard commands, in the order in which the custom commands were registered with Raster Vision.

Custom Aux Commands

Custom Aux Commands are simpler to write than standard custom commands. For instance, the following example creates and registers a custom AuxCommand that copies a file from one location to another, with a no-op processing step:

import rastervision as rv
from rastervision.utils.files import (download_or_copy, upload_or_copy)

def process_file(local_file_path, options):
    # Do something
    local_output_path = local_file_path
    return local_output_path

class ExampleCommand(rv.AuxCommand):
    command_type = "EXAMPLE"
    options = rv.AuxCommandOptions(
        split_on='uris',
        inputs=lambda conf: map(lambda tup: tup[0], conf['uris']),
        outputs=lambda conf: map(lambda tup: tup[1], conf['uris']),
        required_fields=['uris', 'options'])

    def run(self, tmp_dir=None):
        if not tmp_dir:
            tmp_dir = self.get_tmp_dir()

        options = self.command_config['options']
        for src, dest in self.command_config['uris']:
            src_local = download_or_copy(src, tmp_dir)
            output_local = process_file(src_local, options)
            upload_or_copy(output_local, dest)

def register_plugin(plugin_registry):
    plugin_registry.register_aux_command("EXAMPLE",
                                         ExampleCommand)

Notice there is only one class to implement: the rv.AuxCommand class.

When creating a custom AuxCommand, be sure to set the options correctly - see the Aux Command Options API docs for more information about options.

To use a custom command, refer to it by the command_type in the rv.CommandConfig.builder(...) method, like so:

import rastervision as rv

class Preprocess(rv.ExperimentSet):
    def exp_example_command(self):
        root_uri = 's3://my-bucket/example'
        uris = [('s3://my-bucket/original/some.tif', 's3://my-bucket/processed/some.tif')]
        options = {'something_useful': 'yes'}

        cmd_config = rv.CommandConfig.builder("EXAMPLE") \
                                     .with_root_uri(root_uri) \
                                     .with_config(uris=uris,
                                                  options=options) \
                                     .build()

        return cmd_config

To run the command, use the command_type name on the command line, e.g.:

> rastervision -p example run local -e example.Preprocess example