At a high level, a typical machine learning workflow for geospatial data involves the following steps:
Read geospatial data
Train a model
Write predictions (as geospatial data)
Below, we describe various Raster Vision components that can be used to perform these steps.
Reading geospatial data
Raster Vision internally uses the following pipeline for reading geo-referenced data and coaxing it into a form suitable for training computer vision models.
When using Raster Vision as a library, users generally do not need to deal with all the individual components to arrive at a working GeoDataset (see the tutorial on Sampling training data), but certainly can if needed.
Below, we briefly describe each of the components shown in the diagram above.
Tutorial: Reading raster data
A RasterSource represents a source of raster data for a scene. It is used to retrieve small windows of raster data (or chips) from larger scenes. It can also subset image channels (i.e. bands) and apply more complex transformations using RasterTransformers. You can even combine bands from multiple sources using a MultiRasterSource or stack images from sources in a time-series using a
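To make the idea of windows and band subsetting concrete, here is a minimal NumPy sketch. It does not use Raster Vision's API: the `read_chip` function, its parameters, and the synthetic scene are all hypothetical stand-ins for what a raster reader does conceptually.

```python
import numpy as np


def read_chip(scene, window, channel_order=None):
    """Return the pixels inside `window`, optionally keeping only some bands.

    scene: array of shape (height, width, channels).
    window: (ymin, xmin, ymax, xmax) in pixel coordinates.
    channel_order: indices of the bands to keep, in the desired order.
    """
    ymin, xmin, ymax, xmax = window
    chip = scene[ymin:ymax, xmin:xmax]
    if channel_order is not None:
        # Select (and possibly reorder) bands along the last axis.
        chip = chip[..., channel_order]
    return chip


# A synthetic 4-band array stands in for a large geo-referenced image.
scene = np.arange(100 * 120 * 4).reshape(100, 120, 4)

# Read a 32x32 chip and keep bands 3, 0 and 1, in that order.
chip = read_chip(scene, window=(10, 20, 42, 52), channel_order=[3, 0, 1])
print(chip.shape)
```

A real raster reader would also handle geo-referencing, data types, and reading from disk lazily; the sketch only shows the windowing and channel-selection logic.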
Tutorial: Reading vector data
Annotations for geospatial data are often represented as vector data such as polygons and lines. A VectorSource is Raster Vision’s abstraction for a vector data reader. Just like
VectorSources also allow transforming the data using
Tutorial: Reading labels
A LabelSource interprets the data read by raster or vector sources into a form suitable for machine learning. Label sources can be queried for the labels that lie within a window. They are used for creating training chips and for providing ground truth labels against which model predictions are evaluated. There are different implementations available for
semantic segmentation, and
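Querying labels by window can be pictured with a small pure-Python sketch. It assumes box-shaped labels stored as `(box, class_name)` pairs; the `boxes_in_window` helper and this data layout are hypothetical and are not how Raster Vision's label sources are actually implemented.

```python
def boxes_in_window(labels, window):
    """Return the labeled boxes that overlap `window`.

    Boxes and the window are (ymin, xmin, ymax, xmax) tuples in pixel
    coordinates; each label is a (box, class_name) pair.
    """
    wymin, wxmin, wymax, wxmax = window
    hits = []
    for (ymin, xmin, ymax, xmax), class_name in labels:
        # Two axis-aligned boxes overlap if they overlap on both axes.
        overlaps = (ymin < wymax and ymax > wymin
                    and xmin < wxmax and xmax > wxmin)
        if overlaps:
            hits.append(((ymin, xmin, ymax, xmax), class_name))
    return hits


labels = [
    ((5, 5, 20, 20), "building"),      # fully inside the window
    ((90, 90, 110, 110), "building"),  # straddles the window edge
    ((200, 10, 220, 30), "car"),       # outside the window
]
window_labels = boxes_in_window(labels, window=(0, 0, 100, 100))
print(window_labels)
```

Whether a box that only partially overlaps the window should be kept, clipped, or dropped is a design choice; this sketch keeps it unchanged.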
Tutorial: Scenes and AOIs
It can also hold a LabelStore; this is useful for evaluating predictions against ground truth labels.
Tutorial: Sampling training data
A GeoDataset (provided by Raster Vision’s pytorch_learner plugin) is a PyTorch-compatible dataset that can readily be wrapped in a DataLoader and used by any PyTorch training code. Raster Vision provides a Learner class for training models, but you can also use GeoDatasets with your own custom training code or with a third-party library like PyTorch Lightning.
AlbumentationsDataset (base dataset class)
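What "PyTorch-compatible" means in practice is that the dataset implements `__len__` and `__getitem__` (a map-style dataset). The sketch below shows that protocol with a sliding-window sampler over in-memory NumPy arrays. The `SlidingWindowChips` class is hypothetical and much simpler than Raster Vision's own dataset classes; it only illustrates the general shape of such a dataset.

```python
import numpy as np


class SlidingWindowChips:
    """Map-style dataset that yields (chip, label_chip) pairs."""

    def __init__(self, image, labels, size, stride):
        self.image = image
        self.labels = labels
        height, width = image.shape[:2]
        # Enumerate every window that fits entirely inside the image.
        self.windows = [
            (y, x, y + size, x + size)
            for y in range(0, height - size + 1, stride)
            for x in range(0, width - size + 1, stride)
        ]

    def __len__(self):
        return len(self.windows)

    def __getitem__(self, idx):
        ymin, xmin, ymax, xmax = self.windows[idx]
        x = self.image[ymin:ymax, xmin:xmax]
        y = self.labels[ymin:ymax, xmin:xmax]
        return x, y


# Synthetic image and per-pixel labels stand in for a real scene.
image = np.zeros((256, 256, 3), dtype=np.uint8)
labels = np.zeros((256, 256), dtype=np.int64)
dataset = SlidingWindowChips(image, labels, size=64, stride=32)
print(len(dataset))
```

Because it follows the map-style protocol, an object like this can be passed to `torch.utils.data.DataLoader` for batching and shuffling.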
Training a model
Tutorial: Training a model
The pytorch_learner plugin provides a Learner class that encapsulates the entire training process. It is highly configurable. You can either fill out a LearnerConfig and have the Learner set everything up (datasets, model, loss, optimizers, etc.) for you, or you can pass in your own models, datasets, etc. and have the Learner use them instead.
The main output of the Learner is a trained model. This is available as a last-model.pth file, which is a serialized dictionary of model weights that can be loaded into a model via
You can also make the Learner output a “model-bundle” (via save_model_bundle()), which outputs a zip file containing the model weights as well as a config file that can be used to re-create the
Learners are not limited to GeoDatasets and can work with any PyTorch-compatible image dataset. In fact, pytorch_learner also provides an ImageDataset class for dealing with non-geospatial datasets.
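The model-bundle described above, a zip file holding weights plus a config, can be pictured with the standard library alone. In this sketch the file names `model.pth` and `config.json`, the config keys, and the `save_bundle` and `load_bundle` helpers are all assumptions made for illustration; they are not Raster Vision's actual bundle layout or API.

```python
import json
import os
import tempfile
import zipfile


def save_bundle(path, weights_bytes, config):
    """Write weights and a JSON config into a single zip archive."""
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("model.pth", weights_bytes)        # hypothetical name
        zf.writestr("config.json", json.dumps(config))  # hypothetical name


def load_bundle(path):
    """Read back the weights and config written by save_bundle."""
    with zipfile.ZipFile(path) as zf:
        weights_bytes = zf.read("model.pth")
        config = json.loads(zf.read("config.json"))
    return weights_bytes, config


config = {"num_classes": 2, "img_size": 256}  # illustrative values only
weights = b"\x00\x01\x02"  # placeholder for serialized model weights
bundle_path = os.path.join(tempfile.mkdtemp(), "bundle.zip")

save_bundle(bundle_path, weights, config)
restored_weights, restored_config = load_bundle(bundle_path)
print(restored_config)
```

Keeping the config next to the weights means a single file is enough to rebuild the training or prediction setup on another machine.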
Making predictions and saving them
Tutorial: Prediction and Evaluation
Having trained a model, you would naturally want to use it to make predictions on new scenes. The usual workflow for this is:
The Labels class is an in-memory representation of labels. It can represent both ground truth labels and model predictions.