LearnerPipelineConfig

Note

All Configs are derived from rastervision.pipeline.config.Config, which itself is a pydantic Model.

pydantic model LearnerPipelineConfig

Configure a LearnerPipeline.

JSON schema:
{
   "title": "LearnerPipelineConfig",
   "description": "Configure a :class:`.LearnerPipeline`.",
   "type": "object",
   "properties": {
      "root_uri": {
         "title": "Root Uri",
         "description": "The root URI for output generated by the pipeline",
         "type": "string"
      },
      "rv_config": {
         "title": "Rv Config",
         "description": "Used to store serialized RVConfig so pipeline can run in remote environment with the local RVConfig. This should not be set explicitly by users -- it is only used by the runner when running a remote pipeline.",
         "type": "object"
      },
      "plugin_versions": {
         "title": "Plugin Versions",
         "description": "Used to store a mapping of plugin module paths to the latest version number. This should not be set explicitly by users -- it is set automatically when serializing and saving the config to disk.",
         "type": "object",
         "additionalProperties": {
            "type": "integer"
         }
      },
      "type_hint": {
         "title": "Type Hint",
         "default": "learner_pipeline",
         "enum": [
            "learner_pipeline"
         ],
         "type": "string"
      },
      "learner": {
         "$ref": "#/definitions/LearnerConfig"
      }
   },
   "required": [
      "learner"
   ],
   "additionalProperties": false,
   "definitions": {
      "Backbone": {
         "title": "Backbone",
         "description": "An enumeration.",
         "enum": [
            "alexnet",
            "densenet121",
            "densenet169",
            "densenet201",
            "densenet161",
            "googlenet",
            "inception_v3",
            "mnasnet0_5",
            "mnasnet0_75",
            "mnasnet1_0",
            "mnasnet1_3",
            "mobilenet_v2",
            "resnet18",
            "resnet34",
            "resnet50",
            "resnet101",
            "resnet152",
            "resnext50_32x4d",
            "resnext101_32x8d",
            "wide_resnet50_2",
            "wide_resnet101_2",
            "shufflenet_v2_x0_5",
            "shufflenet_v2_x1_0",
            "shufflenet_v2_x1_5",
            "shufflenet_v2_x2_0",
            "squeezenet1_0",
            "squeezenet1_1",
            "vgg11",
            "vgg11_bn",
            "vgg13",
            "vgg13_bn",
            "vgg16",
            "vgg16_bn",
            "vgg19_bn",
            "vgg19"
         ]
      },
      "ExternalModuleConfig": {
         "title": "ExternalModuleConfig",
         "description": "Config describing an object to be loaded via Torch Hub.",
         "type": "object",
         "properties": {
            "uri": {
               "title": "Uri",
               "description": "Local URI of a zip file, local URI of a directory, or remote URI of a zip file.",
               "minLength": 1,
               "type": "string"
            },
            "github_repo": {
               "title": "Github Repo",
               "description": "<repo-owner>/<repo-name>[:tag]",
               "pattern": ".+/.+",
               "type": "string"
            },
            "name": {
               "title": "Name",
               "description": "Name of the folder in which to extract/copy the definition files.",
               "minLength": 1,
               "type": "string"
            },
            "entrypoint": {
               "title": "Entrypoint",
               "description": "Name of a callable present in hubconf.py. See docs for torch.hub for details.",
               "minLength": 1,
               "type": "string"
            },
            "entrypoint_args": {
               "title": "Entrypoint Args",
               "description": "Args to pass to the entrypoint. Must be serializable.",
               "default": [],
               "type": "array",
               "items": {}
            },
            "entrypoint_kwargs": {
               "title": "Entrypoint Kwargs",
               "description": "Keyword args to pass to the entrypoint. Must be serializable.",
               "default": {},
               "type": "object"
            },
            "force_reload": {
               "title": "Force Reload",
               "description": "Force reload of module definition.",
               "default": false,
               "type": "boolean"
            },
            "type_hint": {
               "title": "Type Hint",
               "default": "external-module",
               "enum": [
                  "external-module"
               ],
               "type": "string"
            }
         },
         "required": [
            "entrypoint"
         ],
         "additionalProperties": false
      },
      "ModelConfig": {
         "title": "ModelConfig",
         "description": "Config related to models.",
         "type": "object",
         "properties": {
            "backbone": {
               "description": "The torchvision.models backbone to use.",
               "default": "resnet18",
               "allOf": [
                  {
                     "$ref": "#/definitions/Backbone"
                  }
               ]
            },
            "pretrained": {
               "title": "Pretrained",
               "description": "If True, use ImageNet weights. If False, use random initialization.",
               "default": true,
               "type": "boolean"
            },
            "init_weights": {
               "title": "Init Weights",
               "description": "URI of PyTorch model weights used to initialize the model. If set, this supersedes the pretrained option.",
               "type": "string"
            },
            "load_strict": {
               "title": "Load Strict",
               "description": "If True, the keys in the state dict referenced by init_weights must match exactly. Setting this to False can be useful if you just want to load the backbone of a model.",
               "default": true,
               "type": "boolean"
            },
            "external_def": {
               "title": "External Def",
               "description": "If specified, the model will be built from the definition from this external source, using Torch Hub.",
               "allOf": [
                  {
                     "$ref": "#/definitions/ExternalModuleConfig"
                  }
               ]
            },
            "type_hint": {
               "title": "Type Hint",
               "default": "model",
               "enum": [
                  "model"
               ],
               "type": "string"
            }
         },
         "additionalProperties": false
      },
      "SolverConfig": {
         "title": "SolverConfig",
         "description": "Config related to solver aka optimizer.",
         "type": "object",
         "properties": {
            "lr": {
               "title": "Lr",
               "description": "Learning rate.",
               "default": 0.0001,
               "exclusiveMinimum": 0,
               "type": "number"
            },
            "num_epochs": {
               "title": "Num Epochs",
               "description": "Number of epochs (i.e., sweeps through the whole training set).",
               "default": 10,
               "exclusiveMinimum": 0,
               "type": "integer"
            },
            "test_num_epochs": {
               "title": "Test Num Epochs",
               "description": "Number of epochs to use in test mode.",
               "default": 2,
               "exclusiveMinimum": 0,
               "type": "integer"
            },
            "test_batch_sz": {
               "title": "Test Batch Sz",
               "description": "Batch size to use in test mode.",
               "default": 4,
               "exclusiveMinimum": 0,
               "type": "integer"
            },
            "overfit_num_steps": {
               "title": "Overfit Num Steps",
               "description": "Number of optimizer steps to use in overfit mode.",
               "default": 1,
               "exclusiveMinimum": 0,
               "type": "integer"
            },
            "sync_interval": {
               "title": "Sync Interval",
               "description": "The interval in epochs for each sync to the cloud.",
               "default": 1,
               "exclusiveMinimum": 0,
               "type": "integer"
            },
            "batch_sz": {
               "title": "Batch Sz",
               "description": "Batch size.",
               "default": 32,
               "exclusiveMinimum": 0,
               "type": "integer"
            },
            "one_cycle": {
               "title": "One Cycle",
               "description": "If True, use triangular LR scheduler with a single cycle across all epochs with start and end LR being lr/10 and the peak being lr.",
               "default": true,
               "type": "boolean"
            },
            "multi_stage": {
               "title": "Multi Stage",
               "description": "List of epoch indices at which to divide LR by 10.",
               "default": [],
               "type": "array",
               "items": {}
            },
            "class_loss_weights": {
               "title": "Class Loss Weights",
               "description": "Class weights for weighted loss.",
               "type": "array",
               "items": {
                  "type": "number"
               }
            },
            "ignore_class_index": {
               "title": "Ignore Class Index",
               "description": "If specified, this index is ignored when computing the loss. See pytorch documentation for nn.CrossEntropyLoss for more details. This can also be negative, in which case it is treated as a negative slice index i.e. -1 = last index, -2 = second-last index, and so on.",
               "type": "integer"
            },
            "external_loss_def": {
               "title": "External Loss Def",
               "description": "If specified, the loss will be built from the definition from this external source, using Torch Hub.",
               "allOf": [
                  {
                     "$ref": "#/definitions/ExternalModuleConfig"
                  }
               ]
            },
            "type_hint": {
               "title": "Type Hint",
               "default": "solver",
               "enum": [
                  "solver"
               ],
               "type": "string"
            }
         },
         "additionalProperties": false
      },
      "PlotOptions": {
         "title": "PlotOptions",
         "description": "Config related to plotting.",
         "type": "object",
         "properties": {
            "transform": {
               "title": "Transform",
               "description": "An Albumentations transform serialized as a dict that will be applied to each image before it is plotted. Mainly useful for undoing any data transformation that you do not want included in the plot, such as normalization. The default value will shift and scale the image so the values range from 0.0 to 1.0 which is the expected range for the plotting function. This default is useful for cases where the values after normalization are close to zero which makes the plot difficult to see.",
               "default": {
                  "__version__": "1.3.0",
                  "transform": {
                     "__class_fullname__": "rastervision.pytorch_learner.utils.utils.MinMaxNormalize",
                     "always_apply": false,
                     "p": 1.0,
                     "min_val": 0.0,
                     "max_val": 1.0,
                     "dtype": 5
                  }
               },
               "type": "object"
            },
            "channel_display_groups": {
               "title": "Channel Display Groups",
               "description": "Groups of image channels to display together as a subplot when plotting the data and predictions. Can be a list or tuple of groups (e.g. [(0, 1, 2), (3,)]) or a dict containing title-to-group mappings (e.g. {\"RGB\": [0, 1, 2], \"IR\": [3]}), where each group is a list or tuple of channel indices and title is a string that will be used as the title of the subplot for that group.",
               "anyOf": [
                  {
                     "type": "object",
                     "additionalProperties": {
                        "type": "array",
                        "items": {
                           "type": "integer",
                           "minimum": 0
                        }
                     }
                  },
                  {
                     "type": "array",
                     "items": {
                        "type": "array",
                        "items": {
                           "type": "integer",
                           "minimum": 0
                        }
                     }
                  }
               ]
            },
            "type_hint": {
               "title": "Type Hint",
               "default": "plot_options",
               "enum": [
                  "plot_options"
               ],
               "type": "string"
            }
         },
         "additionalProperties": false
      },
      "DataConfig": {
         "title": "DataConfig",
         "description": "Config related to dataset for training and testing.",
         "type": "object",
         "properties": {
            "class_names": {
               "title": "Class Names",
               "description": "Names of classes.",
               "default": [],
               "type": "array",
               "items": {
                  "type": "string"
               }
            },
            "class_colors": {
               "title": "Class Colors",
               "description": "Colors used to display classes. Can be color 3-tuples in list form.",
               "type": "array",
               "items": {
                  "anyOf": [
                     {
                        "type": "string"
                     },
                     {
                        "type": "array",
                        "minItems": 3,
                        "maxItems": 3,
                        "items": [
                           {
                              "type": "integer"
                           },
                           {
                              "type": "integer"
                           },
                           {
                              "type": "integer"
                           }
                        ]
                     }
                  ]
               }
            },
            "img_channels": {
               "title": "Img Channels",
               "description": "The number of channels of the training images.",
               "exclusiveMinimum": 0,
               "type": "integer"
            },
            "img_sz": {
               "title": "Img Sz",
               "description": "Length of a side of each image in pixels. This is the size to transform it to during training, not the size in the raw dataset.",
               "default": 256,
               "exclusiveMinimum": 0,
               "type": "integer"
            },
            "train_sz": {
               "title": "Train Sz",
               "description": "If set, the number of training images to use. If fewer images exist, then an exception will be raised.",
               "type": "integer"
            },
            "train_sz_rel": {
               "title": "Train Sz Rel",
               "description": "If set, the proportion of training images to use.",
               "type": "number"
            },
            "num_workers": {
               "title": "Num Workers",
               "description": "Number of workers to use when DataLoader makes batches.",
               "default": 4,
               "type": "integer"
            },
            "augmentors": {
               "title": "Augmentors",
               "description": "Names of albumentations augmentors to use for training batches. Choices include: ['Blur', 'RandomRotate90', 'HorizontalFlip', 'VerticalFlip', 'GaussianBlur', 'GaussNoise', 'RGBShift', 'ToGray']. Alternatively, a custom transform can be provided via the aug_transform option.",
               "default": [
                  "RandomRotate90",
                  "HorizontalFlip",
                  "VerticalFlip"
               ],
               "type": "array",
               "items": {
                  "type": "string"
               }
            },
            "base_transform": {
               "title": "Base Transform",
               "description": "An Albumentations transform serialized as a dict that will be applied to all datasets: training, validation, and test. This transformation is in addition to the resizing due to img_sz. This is useful for, for example, applying the same normalization to all datasets.",
               "type": "object"
            },
            "aug_transform": {
               "title": "Aug Transform",
               "description": "An Albumentations transform serialized as a dict that will be applied as data augmentation to the training dataset. This transform is applied before base_transform. If provided, the augmentors option is ignored.",
               "type": "object"
            },
            "plot_options": {
               "title": "Plot Options",
               "description": "Options to control plotting.",
               "default": {
                  "transform": {
                     "__version__": "1.3.0",
                     "transform": {
                        "__class_fullname__": "rastervision.pytorch_learner.utils.utils.MinMaxNormalize",
                        "always_apply": false,
                        "p": 1.0,
                        "min_val": 0.0,
                        "max_val": 1.0,
                        "dtype": 5
                     }
                  },
                  "channel_display_groups": null,
                  "type_hint": "plot_options"
               },
               "allOf": [
                  {
                     "$ref": "#/definitions/PlotOptions"
                  }
               ]
            },
            "preview_batch_limit": {
               "title": "Preview Batch Limit",
               "description": "Optional limit on the number of items in the preview plots produced during training.",
               "type": "integer"
            },
            "type_hint": {
               "title": "Type Hint",
               "default": "data",
               "enum": [
                  "data"
               ],
               "type": "string"
            }
         },
         "additionalProperties": false
      },
      "LearnerConfig": {
         "title": "LearnerConfig",
         "description": "Config for Learner.",
         "type": "object",
         "properties": {
            "model": {
               "$ref": "#/definitions/ModelConfig"
            },
            "solver": {
               "$ref": "#/definitions/SolverConfig"
            },
            "data": {
               "$ref": "#/definitions/DataConfig"
            },
            "predict_mode": {
               "title": "Predict Mode",
               "description": "If True, skips training, loads model, and does final eval.",
               "default": false,
               "type": "boolean"
            },
            "test_mode": {
               "title": "Test Mode",
               "description": "If True, uses test_num_epochs, test_batch_sz, truncated datasets with only a single batch, an img_sz that is cut in half, and num_workers = 0. This is useful for testing that code runs correctly on CPU without multithreading before running the full job on GPU.",
               "default": false,
               "type": "boolean"
            },
            "overfit_mode": {
               "title": "Overfit Mode",
               "description": "If True, uses half image size, and instead of doing epoch-based training, optimizes the model using a single batch repeatedly for overfit_num_steps number of steps.",
               "default": false,
               "type": "boolean"
            },
            "eval_train": {
               "title": "Eval Train",
               "description": "If True, runs final evaluation on training set (in addition to test set). Useful for debugging.",
               "default": false,
               "type": "boolean"
            },
            "save_model_bundle": {
               "title": "Save Model Bundle",
               "description": "If True, saves a model bundle at the end of training. The bundle is a zip file containing the model and this LearnerConfig, and can be used to make predictions on new images at a later time.",
               "default": true,
               "type": "boolean"
            },
            "log_tensorboard": {
               "title": "Log Tensorboard",
               "description": "Save Tensorboard log files at the end of each epoch.",
               "default": true,
               "type": "boolean"
            },
            "run_tensorboard": {
               "title": "Run Tensorboard",
               "description": "Run a Tensorboard server during training.",
               "default": false,
               "type": "boolean"
            },
            "output_uri": {
               "title": "Output Uri",
               "description": "URI where output is saved.",
               "type": "string"
            },
            "type_hint": {
               "title": "Type Hint",
               "default": "learner",
               "enum": [
                  "learner"
               ],
               "type": "string"
            }
         },
         "required": [
            "solver",
            "data"
         ],
         "additionalProperties": false
      }
   }
}
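
As a rough illustration of the schema above, the following sketch builds a minimal JSON-compatible dict and checks the "required" keys by hand. It uses only the standard library and does not call Raster Vision itself; the helper name missing_required, the example path, and the class names are hypothetical.

```python
import json

# Required keys as listed in the schema above.
PIPELINE_REQUIRED = ["learner"]
LEARNER_REQUIRED = ["solver", "data"]


def missing_required(obj: dict, required: list) -> list:
    """Return the required keys that are absent from obj (hypothetical helper)."""
    return [key for key in required if key not in obj]


# A minimal document. Fields left out here would take the defaults that the
# schema defines for them (for example, batch_sz = 32).
minimal_config = {
    "type_hint": "learner_pipeline",
    "root_uri": "/tmp/learner-output",  # example path, not from the docs
    "learner": {
        "type_hint": "learner",
        "solver": {"type_hint": "solver", "lr": 1e-4, "num_epochs": 10},
        "data": {"type_hint": "data", "class_names": ["background", "building"]},
    },
}

missing_top = missing_required(minimal_config, PIPELINE_REQUIRED)
missing_learner = missing_required(minimal_config["learner"], LEARNER_REQUIRED)
serialized = json.dumps(minimal_config, indent=2)
```

Both missing_top and missing_learner come back empty for this document. Dropping the "learner" key, or "solver" or "data" inside it, would make the corresponding list non-empty.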

Config
  • extra: str = forbid

  • validate_assignment: bool = True
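
Taken together, these two settings mean that unknown field names are rejected and that assigning to a field re-runs validation. The sketch below imitates that behaviour with a plain Python class. It is a stand-in for illustration only, not pydantic or Raster Vision code: StrictConfig is a hypothetical name, and it raises ValueError and TypeError where pydantic would raise its own validation error.

```python
class StrictConfig:
    """Stand-in that mimics extra='forbid' and validate_assignment=True."""

    # Allowed field names mapped to their expected types.
    _fields = {"root_uri": str, "type_hint": str}

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            setattr(self, name, value)  # routed through __setattr__ below

    def __setattr__(self, name, value):
        if name not in self._fields:
            # extra='forbid': unknown fields are an error.
            raise ValueError(f"extra field not permitted: {name}")
        if not isinstance(value, self._fields[name]):
            # validate_assignment=True: every assignment is checked.
            raise TypeError(f"{name} must be {self._fields[name].__name__}")
        super().__setattr__(name, value)


cfg = StrictConfig(root_uri="/tmp/out")
cfg.root_uri = "/tmp/other"  # valid assignment, accepted
```

Assigning cfg.unknown = 1 or cfg.root_uri = 5 would raise in this sketch, which is the practical effect of the two options.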

Fields
field learner: LearnerConfig [Required]

field plugin_versions: Optional[Dict[str, int]] = None

Used to store a mapping of plugin module paths to the latest version number. This should not be set explicitly by users – it is set automatically when serializing and saving the config to disk.

field root_uri: str = None

The root URI for output generated by the pipeline.

field rv_config: dict = None

Used to store serialized RVConfig so pipeline can run in remote environment with the local RVConfig. This should not be set explicitly by users – it is only used by the runner when running a remote pipeline.

field type_hint: Literal['learner_pipeline'] = 'learner_pipeline'

build(tmp_dir)

Return a pipeline based on this configuration.

Subclasses should override this to return an instance of the corresponding subclass of Pipeline.

Parameters

tmp_dir – root of any temporary directory to pass to pipeline

get_config_uri() → str

Get URI of serialized version of this PipelineConfig.

Return type

str

recursive_validate_config()

Recursively validate hierarchies of Configs.

This uses reflection to call validate_config on a hierarchy of Configs using a depth-first pre-order traversal.
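
The traversal order described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: Node and recursive_validate are hypothetical stand-ins, and "reflection" is approximated here with vars() over instance attributes.

```python
visited = []  # records the order in which validate_config is called


class Node:
    """Hypothetical stand-in for a Config with nested Config fields."""

    def __init__(self, name, **children):
        self.name = name
        for key, child in children.items():
            setattr(self, key, child)

    def validate_config(self):
        visited.append(self.name)


def recursive_validate(config):
    # Pre-order: validate the parent first, then descend into its children.
    config.validate_config()
    for value in vars(config).values():  # inspect instance attributes
        children = value if isinstance(value, (list, tuple)) else [value]
        for child in children:
            if isinstance(child, Node):
                recursive_validate(child)


pipeline = Node(
    "pipeline",
    learner=Node(
        "learner",
        model=Node("model"),
        solver=Node("solver"),
        data=Node("data"),
    ),
)
recursive_validate(pipeline)
# visited is now ["pipeline", "learner", "model", "solver", "data"]
```

Each parent is validated before any of its nested configs, which is what distinguishes pre-order from post-order traversal.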

revalidate()

Re-validate an instantiated Config.

Runs all Pydantic validators plus self.validate_config().

Adapted from: https://github.com/samuelcolvin/pydantic/issues/1864#issuecomment-679044432

update()

Update any fields before validation.

Subclasses should override this to provide complex default behavior, for example, setting default values as a function of the values of other fields. The arguments to this method will vary depending on the type of Config.
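
A minimal sketch of the kind of dependent default described above, using standard-library dataclasses rather than Raster Vision classes. The names PipelineStandIn and LearnerStandIn are hypothetical, and the rule shown (the learner's output_uri falls back to the pipeline's root_uri) is an assumption chosen for illustration, not a statement of what the library does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LearnerStandIn:
    output_uri: Optional[str] = None


@dataclass
class PipelineStandIn:
    learner: LearnerStandIn
    root_uri: Optional[str] = None

    def update(self) -> None:
        # Assumed rule for illustration: inherit the pipeline's root URI
        # when the learner has no output URI of its own.
        if self.learner.output_uri is None:
            self.learner.output_uri = self.root_uri


cfg = PipelineStandIn(learner=LearnerStandIn(), root_uri="/tmp/out")
cfg.update()

# An explicitly set value is left untouched.
explicit = PipelineStandIn(
    learner=LearnerStandIn(output_uri="/data/custom"), root_uri="/tmp/out")
explicit.update()
```

Because the default depends on another field, it cannot be expressed as a static field default, which is why this logic sits in update() and runs before validation.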

validate_config()

Validate fields that should be checked after update is called.

This is to complement the builtin validation that Pydantic performs at the time of object construction.

validate_list(field: str, valid_options: List[str])

Validate a list field.

Parameters
  • field (str) – name of field to validate

  • valid_options (List[str]) – values that field is allowed to take

Raises

ConfigError – if field is invalid
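
The behaviour documented for validate_list can be approximated as follows. This is a sketch under stated assumptions, not the library's code: ConfigError is redefined locally as a stand-in, AugmentorConfig is a hypothetical class, and the error messages are invented. The list of valid options is taken from the augmentors description in the schema.

```python
from typing import List


class ConfigError(ValueError):
    """Local stand-in for the ConfigError named above."""


class AugmentorConfig:
    """Hypothetical config holding a single list field."""

    def __init__(self, augmentors: List[str]):
        self.augmentors = augmentors

    def validate_list(self, field: str, valid_options: List[str]) -> None:
        values = getattr(self, field)  # look the field up by name
        if not isinstance(values, list):
            raise ConfigError(f"{field} should be a list")
        for value in values:
            if value not in valid_options:
                raise ConfigError(f"{value} is not a valid option for {field}")


VALID_AUGMENTORS = [
    "Blur", "RandomRotate90", "HorizontalFlip", "VerticalFlip",
    "GaussianBlur", "GaussNoise", "RGBShift", "ToGray",
]

ok = AugmentorConfig(["RandomRotate90", "HorizontalFlip"])
ok.validate_list("augmentors", VALID_AUGMENTORS)  # passes silently

bad = AugmentorConfig(["RandomRotate90", "Sharpen"])
try:
    bad.validate_list("augmentors", VALID_AUGMENTORS)
    error_message = None
except ConfigError as exc:
    error_message = str(exc)
```

In this sketch the second call fails on "Sharpen", since it is not among the listed options.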