Learner#
- class Learner[source]#
Bases: ABC
Abstract training and prediction routines for a model.
This can be subclassed to handle different computer vision tasks.
The datasets, model, optimizer, and schedulers will be generated from the cfg if not specified in the constructor.
If instantiated with training=False, the training apparatus (loss, optimizer, scheduler, logging, etc.) will not be set up and the model will be put into eval mode.
Note that various training and prediction methods have the side effect of putting Learner.model into training or eval mode. No attempt is made to put the model back into the mode it was previously in.
Methods
- __init__(cfg[, output_dir, train_ds, ...]) – Constructor.
- build_dataloaders() – Set the DataLoaders for train, validation, and test sets.
- build_epoch_scheduler([start_epoch]) – Returns an LR scheduler that changes the LR each epoch.
- build_loss([loss_def_path]) – Build a loss Callable.
- build_model([model_def_path]) – Build a PyTorch model.
- build_optimizer – Returns optimizer.
- build_step_scheduler([start_epoch]) – Returns an LR scheduler that changes the LR each step.
- eval_model(split) – Evaluate model using a particular dataset split.
- export_to_onnx(path[, model, sample_input, ...]) – Export model to ONNX format via torch.onnx.export().
- from_model_bundle(model_bundle_uri[, ...]) – Create a Learner from a model bundle.
- get_collate_fn() – Returns a custom collate_fn to use in DataLoader.
- get_dataloader(split) – Get the DataLoader for a split.
- get_start_epoch() – Get start epoch.
- get_train_sampler(train_ds) – Return an optional sampler for the training dataloader.
- get_visualizer_class() – Returns a Visualizer class object for plotting data samples.
- load_checkpoint – Load last weights from previous run if available.
- load_init_weights([model_weights_path]) – Load the weights to initialize model.
- load_onnx_model(model_path) – Load an ONNX model.
- load_weights(uri, **kwargs) – Load model weights from a file.
- log_data_stats – Log stats about each DataSet.
- main() – Main training sequence.
- normalize_input(x) – Normalize x to [0, 1].
- on_epoch_end(curr_epoch, metrics) – Hook that is called at end of epoch.
- on_overfit_start – Hook that is called at start of overfit routine.
- on_train_start – Hook that is called at start of train routine.
- output_to_numpy(out) – Convert output of model to numpy format.
- overfit() – Optimize model using the same batch repeatedly.
- plot_dataloader(dl, output_path[, ...]) – Plot images and ground truth labels for a DataLoader.
- plot_dataloaders([batch_limit, show]) – Plot images and ground truth labels for all DataLoaders.
- plot_predictions(split[, batch_limit, show]) – Plot predictions for a split.
- post_forward(x) – Post-process output of call to model().
- predict(x[, raw_out]) – Make prediction for an image or batch of images.
- predict_dataloader(dl[, batched_output, ...]) – Returns an iterator over predictions on the given dataloader.
- predict_dataset(dataset[, return_format, ...]) – Returns an iterator over predictions on the given dataset.
- predict_onnx(x[, raw_out]) – Alternative to predict() for ONNX inference.
- prob_to_pred(x) – Convert a Tensor with prediction probabilities to class ids.
- run_tensorboard – Run TB server serving logged stats.
- save_model_bundle([export_onnx]) – Save a model bundle.
- setup_data – Set datasets and DataLoaders for train, validation, and test sets.
- setup_loss([loss_def_path]) – Setup self.loss.
- setup_model([model_weights_path, model_def_path]) – Setup self.model.
- setup_tensorboard – Setup for logging stats to TB.
- setup_training([loss_def_path]) – Set up the training apparatus (loss, optimizer, scheduler, logging, etc.).
- stop_tensorboard – Stop TB logging and server if it's running.
- sync_from_cloud – Sync any previous output in the cloud to output_dir.
- sync_to_cloud – Sync any output to the cloud at output_uri.
- to_batch(x) – Ensure that image array has batch dimension.
- to_device(x, device) – Load Tensors onto a device.
- train([epochs]) – Training loop that will attempt to resume training if appropriate.
- train_end(outputs) – Aggregate the output of train_step at the end of the epoch.
- train_epoch(optimizer[, step_scheduler]) – Train for a single epoch.
- train_step(batch, batch_ind) – Compute loss for a single training batch.
- validate_end(outputs) – Aggregate the output of validate_step at the end of the epoch.
- validate_epoch(dl) – Validate for a single epoch.
- validate_step(batch, batch_ind) – Compute metrics on validation batch.
- __init__(cfg: LearnerConfig, output_dir: Optional[str] = None, train_ds: Optional[Dataset] = None, valid_ds: Optional[Dataset] = None, test_ds: Optional[Dataset] = None, model: Optional[torch.nn.Module] = None, loss: Optional[Callable] = None, optimizer: Optional[Optimizer] = None, epoch_scheduler: Optional[_LRScheduler] = None, step_scheduler: Optional[_LRScheduler] = None, tmp_dir: Optional[str] = None, model_weights_path: Optional[str] = None, model_def_path: Optional[str] = None, loss_def_path: Optional[str] = None, training: bool = True)[source]#
Constructor.
- Parameters
cfg (LearnerConfig) – The learner configuration.
train_ds (Optional[Dataset], optional) – The dataset to use for training. If None, will be generated from cfg.data. Defaults to None.
valid_ds (Optional[Dataset], optional) – The dataset to use for validation. If None, will be generated from cfg.data. Defaults to None.
test_ds (Optional[Dataset], optional) – The dataset to use for testing. If None, will be generated from cfg.data. Defaults to None.
model (Optional[nn.Module], optional) – The model. If None, will be generated from cfg.model. Defaults to None.
loss (Optional[Callable], optional) – The loss function. If None, will be generated from cfg.solver. Defaults to None.
optimizer (Optional[Optimizer], optional) – The optimizer. If None, will be generated from cfg.solver. Defaults to None.
epoch_scheduler (Optional[_LRScheduler], optional) – The scheduler that updates after each epoch. If None, will be generated from cfg.solver. Defaults to None.
step_scheduler (Optional[_LRScheduler], optional) – The scheduler that updates after each optimizer-step. If None, will be generated from cfg.solver. Defaults to None.
tmp_dir (Optional[str], optional) – A temporary directory to use for downloads etc. If None, will be auto-generated. Defaults to None.
model_weights_path (Optional[str], optional) – URI of model weights to initialize the model with. Defaults to None.
model_def_path (Optional[str], optional) – A local path to a directory with a hubconf.py. If provided, the model definition is imported from here. This is used when loading an external model from a model-bundle. Defaults to None.
loss_def_path (Optional[str], optional) – A local path to a directory with a hubconf.py. If provided, the loss function definition is imported from here. This is used when loading an external loss function from a model-bundle. Defaults to None.
training (bool, optional) – If False, the training apparatus (loss, optimizer, scheduler, logging, etc.) will not be set up and the model will be put into eval mode. If True, the training apparatus will be set up and the model will be put into training mode. Defaults to True.
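A minimal usage sketch (not from the official docs): it assumes a concrete subclass such as SemanticSegmentationLearner from rastervision.pytorch_learner and an existing config object cfg; since no datasets, model, optimizer, or schedulers are passed, they are all generated from cfg.

    from rastervision.pytorch_learner import SemanticSegmentationLearner

    # `cfg` is assumed to be a SemanticSegmentationLearnerConfig built elsewhere.
    learner = SemanticSegmentationLearner(cfg=cfg, output_dir='out/', training=True)

    # Run the full training sequence: plot data, train/validate (resuming if
    # interrupted), log stats, plot predictions, and sync outputs.
    learner.main()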
- build_dataloaders() Tuple[torch.utils.data.DataLoader, torch.utils.data.DataLoader, torch.utils.data.DataLoader] [source]#
Set the DataLoaders for train, validation, and test sets.
- build_datasets() Tuple[Dataset, Dataset, Dataset] [source]#
- Return type
Tuple[Dataset, Dataset, Dataset]
- build_epoch_scheduler(start_epoch: int = 0) _LRScheduler [source]#
Returns an LR scheduler that changes the LR each epoch.
- Parameters
start_epoch (int) –
- Return type
_LRScheduler
- build_model(model_def_path: Optional[str] = None) torch.nn.Module [source]#
Build a PyTorch model.
- Parameters
model_def_path (Optional[str], optional) – A local path to a directory with a hubconf.py. If provided, the model definition is imported from here. Defaults to None.
- Return type
torch.nn.Module
- build_step_scheduler(start_epoch: int = 0) _LRScheduler [source]#
Returns an LR scheduler that changes the LR each step.
- Parameters
start_epoch (int) –
- Return type
_LRScheduler
- eval_model(split: str)[source]#
Evaluate model using a particular dataset split.
Gets validation metrics and saves them along with prediction plots.
- Parameters
split (str) – the dataset split to use: train, valid, or test.
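For illustration, a minimal call (assuming learner is a trained Learner instance):

    learner.eval_model('valid')  # compute metrics on the validation split and save prediction plots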
- export_to_onnx(path: str, model: Optional[torch.nn.Module] = None, sample_input: Optional[torch.Tensor] = None, validate_export: bool = True, **kwargs) None [source]#
Export model to ONNX format via torch.onnx.export().
- Parameters
path (str) – File path to save to.
model (Optional[nn.Module]) – The model to export. If None, self.model will be used. Defaults to None.
sample_input (Optional[Tensor]) – Sample input to the model. If None, a single batch from any available DataLoader in this Learner will be used. Defaults to None.
validate_export (bool) – If True, use onnx.checker.check_model() to validate exported model. An exception is raised if the check fails. Defaults to True.
**kwargs (dict) – Keyword args to pass to torch.onnx.export(). These override the default values used in the function definition.
- Raises
ValueError – If sample_input is None and the Learner has no valid DataLoaders.
- Return type
None
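For illustration, a hedged sketch of exporting a trained Learner's model (learner is assumed to already exist); opset_version is a standard torch.onnx.export() keyword forwarded via **kwargs:

    # Export self.model to ONNX. With sample_input=None, a batch from one of the
    # Learner's DataLoaders is used to trace the model.
    learner.export_to_onnx(
        'model.onnx',
        validate_export=True,  # run onnx.checker.check_model() on the exported file
        opset_version=17,      # forwarded to torch.onnx.export() via **kwargs
    )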
- classmethod from_model_bundle(model_bundle_uri: str, tmp_dir: Optional[str] = None, cfg: Optional[LearnerConfig] = None, training: bool = False, use_onnx_model: Optional[bool] = None, **kwargs) Learner [source]#
Create a Learner from a model bundle.
Note
This is the bundle saved in train/model-bundle.zip and not bundle/model-bundle.zip.
- Parameters
model_bundle_uri (str) – URI of the model bundle.
tmp_dir (Optional[str], optional) – Optional temporary directory. Will be used for unzipping bundle and also passed to the default constructor. If None, will be auto-generated. Defaults to None.
cfg (Optional[LearnerConfig], optional) – If None, will be read from the bundle. Defaults to None.
training (bool, optional) – If False, the training apparatus (loss, optimizer, scheduler, logging, etc.) will not be set up and the model will be put into eval mode. If True, the training apparatus will be set up and the model will be put into training mode. Defaults to False.
use_onnx_model (Optional[bool]) – If True and training=False and a model.onnx file is available in the bundle, use that for inference rather than the PyTorch weights. Defaults to the boolean environment variable RASTERVISION_USE_ONNX if set, False otherwise.
**kwargs – Extra args for __init__().
- Raises
FileNotFoundError – If using custom Albumentations transforms and definition file is not found in bundle.
- Returns
Object of the Learner subclass on which this was called.
- Return type
Learner
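A sketch of the typical inference workflow (the bundle URI is hypothetical): load a bundle produced by a previous training run with training=False, so only the model is set up and it is put into eval mode.

    from rastervision.pytorch_learner import SemanticSegmentationLearner

    learner = SemanticSegmentationLearner.from_model_bundle(
        model_bundle_uri='s3://my-bucket/train/model-bundle.zip',  # hypothetical URI
        training=False,  # inference only: no loss/optimizer/logging; model in eval mode
    )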
- get_collate_fn() Optional[callable] [source]#
Returns a custom collate_fn to use in DataLoader.
None is returned if default collate_fn should be used.
See https://pytorch.org/docs/stable/data.html#working-with-collate-fn
- Return type
Optional[callable]
- get_dataloader(split: str) torch.utils.data.DataLoader [source]#
Get the DataLoader for a split.
- Parameters
split (str) – a split name which can be train, valid, or test
- Return type
torch.utils.data.DataLoader
- get_start_epoch() int [source]#
Get start epoch.
If training was interrupted, this returns the last complete epoch + 1.
- Return type
int
- get_train_sampler(train_ds: Dataset) Optional[Sampler] [source]#
Return an optional sampler for the training dataloader.
- Parameters
train_ds (Dataset) –
- Return type
Optional[Sampler]
- abstract get_visualizer_class() Type[Visualizer] [source]#
Returns a Visualizer class object for plotting data samples.
- Return type
Type[Visualizer]
- load_init_weights(model_weights_path: Optional[str] = None) None [source]#
Load the weights to initialize model.
- load_onnx_model(model_path: str) ONNXRuntimeAdapter [source]#
- Parameters
model_path (str) –
- Return type
ONNXRuntimeAdapter
- load_weights(uri: str, **kwargs) None [source]#
Load model weights from a file.
- Parameters
uri (str) – URI.
**kwargs – Extra args for nn.Module.load_state_dict().
- Return type
None
- main()[source]#
Main training sequence.
This plots the dataset, runs a training and validation loop (which will resume if interrupted), logs stats, plots predictions, and syncs results to the cloud.
- normalize_input(x: ndarray) ndarray [source]#
Normalize x to [0, 1].
If x.dtype is a subtype of np.unsignedinteger, normalize it to [0, 1] using the max possible value of that dtype. Otherwise, assume it is in [0, 1] already and do nothing.
- Parameters
x (np.ndarray) – an image or batch of images
- Returns
the same array scaled to [0, 1].
- Return type
np.ndarray
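A small sketch of the behavior described above (assuming learner already exists): uint8 input is divided by the dtype's maximum value, 255, while float input is assumed to already be in [0, 1] and is returned unchanged.

    import numpy as np

    x = np.array([[0, 128, 255]], dtype=np.uint8)
    learner.normalize_input(x)          # ~array([[0.0, 0.502, 1.0]])
    learner.normalize_input(x / 255.0)  # float input: returned as-is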
- on_epoch_end(curr_epoch: int, metrics: Dict[str, float]) None [source]#
Hook that is called at end of epoch.
Writes metrics to CSV and TensorBoard, and saves model.
- output_to_numpy(out: torch.Tensor) ndarray [source]#
Convert output of model to numpy format.
- Parameters
out (torch.Tensor) – the output of the model in PyTorch format
- Returns
the output of the model in numpy format
- Return type
np.ndarray
- plot_dataloader(dl: torch.utils.data.DataLoader, output_path: str, batch_limit: Optional[int] = None, show: bool = False)[source]#
Plot images and ground truth labels for a DataLoader.
- Parameters
dl (torch.utils.data.DataLoader) –
output_path (str) –
batch_limit (Optional[int]) –
show (bool) –
- plot_dataloaders(batch_limit: Optional[int] = None, show: bool = False)[source]#
Plot images and ground truth labels for all DataLoaders.
- plot_predictions(split: str, batch_limit: Optional[int] = None, show: bool = False)[source]#
Plot predictions for a split.
Uses the first batch for the corresponding DataLoader.
- post_forward(x: Any) Any [source]#
Post-process the output of a call to model().
Useful when the predictions are inside a structure returned by model().
- predict(x: torch.Tensor, raw_out: bool = False) Any [source]#
Make prediction for an image or batch of images.
- predict_dataloader(dl: torch.utils.data.DataLoader, batched_output: bool = True, return_format: Literal['xyz', 'yz', 'z'] = 'z', raw_out: bool = True, predict_kw: dict = {}) Union[Iterator[Any], Iterator[Tuple[Any, ...]]] [source]#
Returns an iterator over predictions on the given dataloader.
- Parameters
dl (DataLoader) – The dataloader to make predictions on.
batched_output (bool, optional) – If True, return batches of x, y, z as defined by the dataloader. If False, unroll the batches into individual items. Defaults to True.
return_format (Literal['xyz', 'yz', 'z'], optional) – Format of the elements of the returned iterator. Must be one of ‘xyz’, ‘yz’, or ‘z’. If ‘xyz’, elements are 3-tuples of x, y, and z. If ‘yz’, elements are 2-tuples of y and z. If ‘z’, elements are (non-tuple) values of z. Here, x = input image, y = ground truth, and z = prediction. Defaults to ‘z’.
raw_out (bool, optional) – If true, return raw predicted scores. Defaults to True.
predict_kw (dict) – Dict with keywords passed to Learner.predict(). Useful if a Learner subclass implements a custom predict() method.
- Raises
ValueError – If return_format is not one of the allowed values.
- Returns
If return_format is ‘z’, the returned value is an iterator of whatever type the predictions are. Otherwise, the returned value is an iterator of tuples.
- Return type
Union[Iterator[Any], Iterator[Tuple[Any, …]]]
- predict_dataset(dataset: Dataset, return_format: Literal['xyz', 'yz', 'z'] = 'z', raw_out: bool = True, numpy_out: bool = False, predict_kw: dict = {}, dataloader_kw: dict = {}, progress_bar: bool = True, progress_bar_kw: dict = {}) Union[Iterator[Any], Iterator[Tuple[Any, ...]]] [source]#
Returns an iterator over predictions on the given dataset.
- Parameters
dataset (Dataset) – The dataset to make predictions on.
return_format (Literal['xyz', 'yz', 'z'], optional) – Format of the elements of the returned iterator. Must be one of ‘xyz’, ‘yz’, or ‘z’. If ‘xyz’, elements are 3-tuples of x, y, and z. If ‘yz’, elements are 2-tuples of y and z. If ‘z’, elements are (non-tuple) values of z. Here, x = input image, y = ground truth, and z = prediction. Defaults to ‘z’.
raw_out (bool, optional) – If true, return raw predicted scores. Defaults to True.
numpy_out (bool, optional) – If True, convert predictions to numpy arrays before returning. Defaults to False.
predict_kw (dict) – Dict with keywords passed to Learner.predict(). Useful if a Learner subclass implements a custom predict() method.
dataloader_kw (dict) – Dict with keywords passed to the DataLoader constructor.
progress_bar (bool, optional) – If True, display a progress bar. Since this function returns an iterator, the progress bar won’t be visible until the iterator is consumed. Defaults to True.
progress_bar_kw (dict) – Dict with keywords passed to tqdm.
- Raises
ValueError – If return_format is not one of the allowed values.
- Returns
If return_format is ‘z’, the returned value is an iterator of whatever type the predictions are. Otherwise, the returned value is an iterator of tuples.
- Return type
Union[Iterator[Any], Iterator[Tuple[Any, ...]]]
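A hedged example of consuming the iterator (learner and a PyTorch Dataset ds are assumed to exist); note that predictions are computed lazily as the iterator is consumed:

    preds = learner.predict_dataset(
        ds,
        return_format='xyz',  # yield (input image, ground truth, prediction) triples
        raw_out=False,        # post-processed predictions instead of raw scores
        numpy_out=True,       # convert predictions to numpy arrays
    )
    for x, y, z in preds:
        ...  # process one sample at a time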
- predict_onnx(x: torch.Tensor, raw_out: bool = False) torch.Tensor [source]#
Alternative to predict() for ONNX inference.
- Parameters
x (torch.Tensor) –
raw_out (bool) –
- Return type
torch.Tensor
- prob_to_pred(x: torch.Tensor) torch.Tensor [source]#
Convert a Tensor with prediction probabilities to class ids.
The class ids should be the classes with the maximum probability.
- Parameters
x (torch.Tensor) –
- Return type
torch.Tensor
- save_model_bundle(export_onnx: bool = True)[source]#
Save a model bundle.
This is a zip file with the model weights in .pth format and a serialized copy of the LearnerConfig, which allows for making predictions in the future.
- Parameters
export_onnx (bool) –
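A one-line sketch, assuming learner has finished training; with export_onnx=True the resulting bundle can later be loaded for ONNX inference via from_model_bundle(..., use_onnx_model=True):

    # Writes a model-bundle.zip containing the weights (.pth), an ONNX export,
    # and the serialized config.
    learner.save_model_bundle(export_onnx=True)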
- setup_loss(loss_def_path: Optional[str] = None) None [source]#
Setup self.loss.
- Parameters
loss_def_path (Optional[str], optional) – Loss definition path. Will be available when loading from a bundle. Defaults to None.
- Return type
None
- setup_model(model_weights_path: Optional[str] = None, model_def_path: Optional[str] = None) None [source]#
Setup self.model.
- to_batch(x: torch.Tensor) torch.Tensor [source]#
Ensure that image array has batch dimension.
- Parameters
x (torch.Tensor) – assumed to be either image or batch of images
- Returns
x with extra batch dimension of length 1 if needed
- Return type
torch.Tensor
- train(epochs: Optional[int] = None)[source]#
Training loop that will attempt to resume training if appropriate.
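A brief sketch (assuming learner was constructed with training=True): train for a fixed number of epochs; if a previous run was interrupted, training resumes from the last completed epoch (see get_start_epoch()).

    learner.train(epochs=10)  # resumes automatically if a previous run was interrupted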
- train_end(outputs: List[Dict[str, Union[float, torch.Tensor]]]) Dict[str, float] [source]#
Aggregate the output of train_step at the end of the epoch.
- train_epoch(optimizer: Optimizer, step_scheduler: Optional[_LRScheduler] = None) Dict[str, float] [source]#
Train for a single epoch.
- abstract train_step(batch: Any, batch_ind: int) Dict[str, float] [source]#
Compute loss for a single training batch.
- validate_end(outputs: List[Dict[str, Union[float, torch.Tensor]]]) Dict[str, float] [source]#
Aggregate the output of validate_step at the end of the epoch.
- validate_epoch(dl: torch.utils.data.DataLoader) Dict[str, float] [source]#
Validate for a single epoch.
- Parameters
dl (torch.utils.data.DataLoader) –
- Return type
Dict[str, float]