RegressionGeoDataConfig
Note
All Configs are derived from rastervision.pipeline.config.Config, which itself is a pydantic Model.
- pydantic model RegressionGeoDataConfig[source]#
Configure regression GeoDatasets.
See rastervision.pytorch_learner.dataset.regression_dataset.
JSON schema:
{ "title": "RegressionGeoDataConfig", "description": "Configure regression :class:`GeoDatasets <.GeoDataset>`.\n\nSee :mod:`rastervision.pytorch_learner.dataset.regression_dataset`.", "type": "object", "properties": { "class_names": { "title": "Class Names", "description": "Names of classes.", "default": [], "type": "array", "items": { "type": "string" } }, "class_colors": { "title": "Class Colors", "description": "Colors used to display classes. Can be color 3-tuples in list form.", "type": "array", "items": { "anyOf": [ { "type": "string" }, { "type": "array", "minItems": 3, "maxItems": 3, "items": [ { "type": "integer" }, { "type": "integer" }, { "type": "integer" } ] } ] } }, "img_channels": { "title": "Img Channels", "description": "The number of channels of the training images.", "exclusiveMinimum": 0, "type": "integer" }, "img_sz": { "title": "Img Sz", "description": "Length of a side of each image in pixels. This is the size to transform it to during training, not the size in the raw dataset.", "default": 256, "exclusiveMinimum": 0, "type": "integer" }, "train_sz": { "title": "Train Sz", "description": "If set, the number of training images to use. If fewer images exist, then an exception will be raised.", "type": "integer" }, "train_sz_rel": { "title": "Train Sz Rel", "description": "If set, the proportion of training images to use.", "type": "number" }, "num_workers": { "title": "Num Workers", "description": "Number of workers to use when DataLoader makes batches.", "default": 4, "type": "integer" }, "augmentors": { "title": "Augmentors", "description": "Names of albumentations augmentors to use for training batches. Choices include: ['Blur', 'RandomRotate90', 'HorizontalFlip', 'VerticalFlip', 'GaussianBlur', 'GaussNoise', 'RGBShift', 'ToGray']. 
Alternatively, a custom transform can be provided via the aug_transform option.", "default": [ "RandomRotate90", "HorizontalFlip", "VerticalFlip" ], "type": "array", "items": { "type": "string" } }, "base_transform": { "title": "Base Transform", "description": "An Albumentations transform serialized as a dict that will be applied to all datasets: training, validation, and test. This transformation is in addition to the resizing due to img_sz. This is useful for, for example, applying the same normalization to all datasets.", "type": "object" }, "aug_transform": { "title": "Aug Transform", "description": "An Albumentations transform serialized as a dict that will be applied as data augmentation to the training dataset. This transform is applied before base_transform. If provided, the augmentors option is ignored.", "type": "object" }, "plot_options": { "title": "Plot Options", "description": "Options to control plotting.", "default": { "transform": { "__version__": "1.3.0", "transform": { "__class_fullname__": "rastervision.pytorch_learner.utils.utils.MinMaxNormalize", "always_apply": false, "p": 1.0, "min_val": 0.0, "max_val": 1.0, "dtype": 5 } }, "channel_display_groups": null, "type_hint": "regression_plot_options", "max_scatter_points": 5000, "hist_bins": 30 }, "allOf": [ { "$ref": "#/definitions/RegressionPlotOptions" } ] }, "preview_batch_limit": { "title": "Preview Batch Limit", "description": "Optional limit on the number of items in the preview plots produced during training.", "type": "integer" }, "type_hint": { "title": "Type Hint", "default": "regression_geo_data", "enum": [ "regression_geo_data" ], "type": "string" }, "scene_dataset": { "$ref": "#/definitions/DatasetConfig" }, "window_opts": { "title": "Window Opts", "default": {}, "anyOf": [ { "$ref": "#/definitions/GeoDataWindowConfig" }, { "type": "object", "additionalProperties": { "$ref": "#/definitions/GeoDataWindowConfig" } } ] }, "pos_class_names": { "title": "Pos Class Names", "default": [], 
"type": "array", "items": { "type": "string" } }, "prob_class_names": { "title": "Prob Class Names", "default": [], "type": "array", "items": { "type": "string" } } }, "additionalProperties": false, "definitions": { "RegressionPlotOptions": { "title": "RegressionPlotOptions", "description": "Config related to plotting.", "type": "object", "properties": { "transform": { "title": "Transform", "description": "An Albumentations transform serialized as a dict that will be applied to each image before it is plotted. Mainly useful for undoing any data transformation that you do not want included in the plot, such as normalization. The default value will shift and scale the image so the values range from 0.0 to 1.0 which is the expected range for the plotting function. This default is useful for cases where the values after normalization are close to zero which makes the plot difficult to see.", "default": { "__version__": "1.3.0", "transform": { "__class_fullname__": "rastervision.pytorch_learner.utils.utils.MinMaxNormalize", "always_apply": false, "p": 1.0, "min_val": 0.0, "max_val": 1.0, "dtype": 5 } }, "type": "object" }, "channel_display_groups": { "title": "Channel Display Groups", "description": "Groups of image channels to display together as a subplot when plotting the data and predictions. Can be a list or tuple of groups (e.g. [(0, 1, 2), (3,)]) or a dict containing title-to-group mappings (e.g. 
{\"RGB\": [0, 1, 2], \"IR\": [3]}), where each group is a list or tuple of channel indices and title is a string that will be used as the title of the subplot for that group.", "anyOf": [ { "type": "object", "additionalProperties": { "type": "array", "items": { "type": "integer", "minimum": 0 } } }, { "type": "array", "items": { "type": "array", "items": { "type": "integer", "minimum": 0 } } } ] }, "type_hint": { "title": "Type Hint", "default": "regression_plot_options", "enum": [ "regression_plot_options" ], "type": "string" }, "max_scatter_points": { "title": "Max Scatter Points", "description": "Maximum number of datapoints to use in scatter plot. Useful to avoid running out of memory and cluttering.", "default": 5000, "type": "integer" }, "hist_bins": { "title": "Hist Bins", "description": "Number of bins to use for histogram.", "default": 30, "type": "integer" } }, "additionalProperties": false }, "ClassConfig": { "title": "ClassConfig", "description": "Configure class information for a machine learning task.", "type": "object", "properties": { "names": { "title": "Names", "description": "Names of classes. The i-th class in this list will have class ID = i.", "type": "array", "items": { "type": "string" } }, "colors": { "title": "Colors", "description": "Colors used to visualize classes. Can be color strings accepted by matplotlib or RGB tuples. If None, a random color will be auto-generated for each class.", "type": "array", "items": { "anyOf": [ { "type": "string" }, { "type": "array", "items": {} } ] } }, "null_class": { "title": "Null Class", "description": "Optional name of class in `names` to use as the null class. This is used in semantic segmentation to represent the label for imagery pixels that are NODATA or that are missing a label. If None and the class names include \"null\", it will automatically be used as the null class. 
If None, and this Config is part of a SemanticSegmentationConfig, a null class will be added automatically.", "type": "string" }, "type_hint": { "title": "Type Hint", "default": "class_config", "enum": [ "class_config" ], "type": "string" } }, "required": [ "names" ], "additionalProperties": false }, "RasterTransformerConfig": { "title": "RasterTransformerConfig", "description": "Configure a :class:`.RasterTransformer`.", "type": "object", "properties": { "type_hint": { "title": "Type Hint", "default": "raster_transformer", "enum": [ "raster_transformer" ], "type": "string" } }, "additionalProperties": false }, "RasterSourceConfig": { "title": "RasterSourceConfig", "description": "Configure a :class:`.RasterSource`.", "type": "object", "properties": { "channel_order": { "title": "Channel Order", "description": "The sequence of channel indices to use when reading imagery.", "type": "array", "items": { "type": "integer" } }, "transformers": { "title": "Transformers", "default": [], "type": "array", "items": { "$ref": "#/definitions/RasterTransformerConfig" } }, "bbox": { "title": "Bbox", "description": "User-specified bbox in pixel coords in the form (ymin, xmin, ymax, xmax). 
Useful for cropping the raster source so that only part of the raster is read from.", "type": "array", "minItems": 4, "maxItems": 4, "items": [ { "type": "integer" }, { "type": "integer" }, { "type": "integer" }, { "type": "integer" } ] }, "type_hint": { "title": "Type Hint", "default": "raster_source", "enum": [ "raster_source" ], "type": "string" } }, "additionalProperties": false }, "LabelSourceConfig": { "title": "LabelSourceConfig", "description": "Configure a :class:`.LabelSource`.", "type": "object", "properties": { "type_hint": { "title": "Type Hint", "default": "label_source", "enum": [ "label_source" ], "type": "string" } }, "additionalProperties": false }, "LabelStoreConfig": { "title": "LabelStoreConfig", "description": "Configure a :class:`.LabelStore`.", "type": "object", "properties": { "type_hint": { "title": "Type Hint", "default": "label_store", "enum": [ "label_store" ], "type": "string" } }, "additionalProperties": false }, "SceneConfig": { "title": "SceneConfig", "description": "Configure a :class:`.Scene` comprising raster data & labels for an AOI.\n ", "type": "object", "properties": { "id": { "title": "Id", "type": "string" }, "raster_source": { "$ref": "#/definitions/RasterSourceConfig" }, "label_source": { "$ref": "#/definitions/LabelSourceConfig" }, "label_store": { "$ref": "#/definitions/LabelStoreConfig" }, "aoi_uris": { "title": "Aoi Uris", "description": "List of URIs of GeoJSON files that define the AOIs for the scene. Each polygon defines an AOI which is a piece of the scene that is assumed to be fully labeled and usable for training or validation. 
The AOIs are assumed to be in EPSG:4326 coordinates.", "type": "array", "items": { "type": "string" } }, "type_hint": { "title": "Type Hint", "default": "scene", "enum": [ "scene" ], "type": "string" } }, "required": [ "id", "raster_source" ], "additionalProperties": false }, "DatasetConfig": { "title": "DatasetConfig", "description": "Configure train, validation, and test splits for a dataset.", "type": "object", "properties": { "class_config": { "$ref": "#/definitions/ClassConfig" }, "train_scenes": { "title": "Train Scenes", "type": "array", "items": { "$ref": "#/definitions/SceneConfig" } }, "validation_scenes": { "title": "Validation Scenes", "type": "array", "items": { "$ref": "#/definitions/SceneConfig" } }, "test_scenes": { "title": "Test Scenes", "default": [], "type": "array", "items": { "$ref": "#/definitions/SceneConfig" } }, "scene_groups": { "title": "Scene Groups", "description": "Groupings of scenes. Should be a dict of the form: {<group-name>: Set(scene_id_1, scene_id_2, ...)}. Three groups are added by default: \"train_scenes\", \"validation_scenes\", and \"test_scenes\"", "default": {}, "type": "object", "additionalProperties": { "type": "array", "items": { "type": "string" }, "uniqueItems": true } }, "type_hint": { "title": "Type Hint", "default": "dataset", "enum": [ "dataset" ], "type": "string" } }, "required": [ "class_config", "train_scenes", "validation_scenes" ], "additionalProperties": false }, "GeoDataWindowMethod": { "title": "GeoDataWindowMethod", "description": "An enumeration.", "enum": [ "sliding", "random" ] }, "GeoDataWindowConfig": { "title": "GeoDataWindowConfig", "description": "Configure a :class:`.GeoDataset`.\n\nSee :mod:`rastervision.pytorch_learner.dataset.dataset`.", "type": "object", "properties": { "method": { "default": "sliding", "allOf": [ { "$ref": "#/definitions/GeoDataWindowMethod" } ] }, "size": { "title": "Size", "description": "If method = sliding, this is the size of sliding window. 
If method = random, this is the size that all the windows are resized to before they are returned. If method = random and neither size_lims nor h_lims and w_lims have been specified, then size_lims is set to (size, size + 1).", "anyOf": [ { "type": "integer", "exclusiveMinimum": 0 }, { "type": "array", "minItems": 2, "maxItems": 2, "items": [ { "type": "integer", "exclusiveMinimum": 0 }, { "type": "integer", "exclusiveMinimum": 0 } ] } ] }, "stride": { "title": "Stride", "description": "Stride of sliding window. Only used if method = sliding.", "anyOf": [ { "type": "integer", "exclusiveMinimum": 0 }, { "type": "array", "minItems": 2, "maxItems": 2, "items": [ { "type": "integer", "exclusiveMinimum": 0 }, { "type": "integer", "exclusiveMinimum": 0 } ] } ] }, "padding": { "title": "Padding", "description": "How many pixels are windows allowed to overflow the edges of the raster source.", "anyOf": [ { "type": "integer", "minimum": 0 }, { "type": "array", "minItems": 2, "maxItems": 2, "items": [ { "type": "integer", "minimum": 0 }, { "type": "integer", "minimum": 0 } ] } ] }, "pad_direction": { "title": "Pad Direction", "description": "If \"end\", only pad ymax and xmax (bottom and right). If \"start\", only pad ymin and xmin (top and left). If \"both\", pad all sides. Has no effect if paddiong is zero. Defaults to \"end\".", "default": "end", "enum": [ "both", "start", "end" ], "type": "string" }, "size_lims": { "title": "Size Lims", "description": "[min, max) interval from which window sizes will be uniformly randomly sampled. The upper limit is exclusive. To fix the size to a constant value, use size_lims = (sz, sz + 1). Only used if method = random. Specify either size_lims or h_lims and w_lims, but not both. 
If neither size_lims nor h_lims and w_lims have been specified, then this will be set to (size, size + 1).", "type": "array", "minItems": 2, "maxItems": 2, "items": [ { "type": "integer", "exclusiveMinimum": 0 }, { "type": "integer", "exclusiveMinimum": 0 } ] }, "h_lims": { "title": "H Lims", "description": "[min, max] interval from which window heights will be uniformly randomly sampled. Only used if method = random.", "type": "array", "minItems": 2, "maxItems": 2, "items": [ { "type": "integer", "exclusiveMinimum": 0 }, { "type": "integer", "exclusiveMinimum": 0 } ] }, "w_lims": { "title": "W Lims", "description": "[min, max] interval from which window widths will be uniformly randomly sampled. Only used if method = random.", "type": "array", "minItems": 2, "maxItems": 2, "items": [ { "type": "integer", "exclusiveMinimum": 0 }, { "type": "integer", "exclusiveMinimum": 0 } ] }, "max_windows": { "title": "Max Windows", "description": "Max allowed reads from a GeoDataset. Only used if method = random.", "default": 10000, "minimum": 0, "type": "integer" }, "max_sample_attempts": { "title": "Max Sample Attempts", "description": "Max attempts when trying to find a window within the AOI of a scene. Only used if method = random and the scene has aoi_polygons specified.", "default": 100, "exclusiveMinimum": 0, "type": "integer" }, "efficient_aoi_sampling": { "title": "Efficient Aoi Sampling", "description": "If the scene has AOIs, sampling windows at random anywhere in the extent and then checking if they fall within any of the AOIs can be very inefficient. This flag enables the use of an alternate algorithm that only samples window locations inside the AOIs. Only used if method = random and the scene has aoi_polygons specified. Defaults to True", "default": true, "type": "boolean" }, "type_hint": { "title": "Type Hint", "default": "geo_data_window", "enum": [ "geo_data_window" ], "type": "string" } }, "required": [ "size" ], "additionalProperties": false } } }
- Config
extra: str = forbid
validate_assignment: bool = True
- Fields
- Validators
ensure_class_colors
»all fields
get_class_info_from_class_config_if_needed
»all fields
validate_augmentors
»augmentors
validate_plot_options
»all fields
validate_window_opts
»window_opts
- field aug_transform: Optional[dict] = None#
An Albumentations transform serialized as a dict that will be applied as data augmentation to the training dataset. This transform is applied before base_transform. If provided, the augmentors option is ignored.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field augmentors: List[str] = ['RandomRotate90', 'HorizontalFlip', 'VerticalFlip']#
Names of albumentations augmentors to use for training batches. Choices include: [‘Blur’, ‘RandomRotate90’, ‘HorizontalFlip’, ‘VerticalFlip’, ‘GaussianBlur’, ‘GaussNoise’, ‘RGBShift’, ‘ToGray’]. Alternatively, a custom transform can be provided via the aug_transform option.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_augmentors
validate_plot_options
- field base_transform: Optional[dict] = None#
An Albumentations transform serialized as a dict that will be applied to all datasets: training, validation, and test. This transformation is in addition to the resizing due to img_sz. This is useful, for example, for applying the same normalization to all datasets.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field class_colors: Optional[List[Union[str, RGBTuple]]] = None#
Colors used to display classes. Can be color 3-tuples in list form.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field class_names: List[str] = []#
Names of classes.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field img_channels: Optional[PosInt] = None#
The number of channels of the training images.
- Constraints
exclusiveMinimum = 0
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field img_sz: PosInt = 256#
Length of a side of each image in pixels. This is the size to transform it to during training, not the size in the raw dataset.
- Constraints
exclusiveMinimum = 0
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field num_workers: int = 4#
Number of workers to use when DataLoader makes batches.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field plot_options: Optional[RegressionPlotOptions] = RegressionPlotOptions(transform={'__version__': '1.3.0', 'transform': {'__class_fullname__': 'rastervision.pytorch_learner.utils.utils.MinMaxNormalize', 'always_apply': False, 'p': 1.0, 'min_val': 0.0, 'max_val': 1.0, 'dtype': 5}}, channel_display_groups=None, max_scatter_points=5000, hist_bins=30)#
Options to control plotting.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
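The default plot transform is described in the schema as shifting and scaling image values into the range 0.0 to 1.0 before plotting. The following is a minimal sketch of that idea in plain Python, assuming a simple per-input min-max rescale. The function name min_max_normalize and the handling of constant inputs are assumptions for illustration, not the library's MinMaxNormalize implementation.

```python
from typing import List


def min_max_normalize(values: List[float],
                      min_val: float = 0.0,
                      max_val: float = 1.0) -> List[float]:
    """Hypothetical helper: rescale values so they span [min_val, max_val]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant input maps to min_val to avoid dividing by zero.
        return [min_val for _ in values]
    scale = (max_val - min_val) / (hi - lo)
    return [min_val + (v - lo) * scale for v in values]


# Values clustered around zero (e.g. after normalization) are hard to see
# in a plot; rescaling spreads them over the full display range.
normalized = min_max_normalize([-0.25, 0.0, 0.25])
```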
- field pos_class_names: List[str] = []#
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field preview_batch_limit: Optional[int] = None#
Optional limit on the number of items in the preview plots produced during training.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field prob_class_names: List[str] = []#
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field scene_dataset: Optional['SceneDatasetConfig'] = None#
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field train_sz: Optional[int] = None#
If set, the number of training images to use. If fewer images exist, then an exception will be raised.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field train_sz_rel: Optional[float] = None#
If set, the proportion of training images to use.
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
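To make the difference between train_sz (an absolute count) and train_sz_rel (a proportion) concrete, here is a minimal sketch in plain Python. The helper pick_subset_indices, the rule that only one of the two may be given, and the rounding of the fraction are all assumptions for illustration; only the "raise if fewer images exist" behavior comes from the field description.

```python
import random
from typing import List, Optional


def pick_subset_indices(num_items: int,
                        size: Optional[int] = None,
                        fraction: Optional[float] = None,
                        seed: int = 0) -> List[int]:
    """Hypothetical helper: choose a random subset of dataset indices."""
    if size is not None and fraction is not None:
        # Assumption: treat the two options as mutually exclusive.
        raise ValueError('Specify size or fraction, not both.')
    if size is None and fraction is None:
        return list(range(num_items))
    if fraction is not None:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError('fraction must be in [0, 1].')
        # Assumption: round the requested proportion down.
        size = int(num_items * fraction)
    if size > num_items:
        # Documented for train_sz: too few images is an error.
        raise ValueError(
            f'Requested {size} items but only {num_items} exist.')
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_items), size))


subset_abs = pick_subset_indices(100, size=10)        # like train_sz=10
subset_rel = pick_subset_indices(100, fraction=0.25)  # like train_sz_rel=0.25
```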
- field type_hint: Literal['regression_geo_data'] = 'regression_geo_data'#
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
- field window_opts: Union[GeoDataWindowConfig, Dict[str, GeoDataWindowConfig]] = {}#
- Validated by
ensure_class_colors
get_class_info_from_class_config_if_needed
validate_plot_options
validate_window_opts
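The GeoDataWindowConfig schema above describes two windowing methods: sliding (controlled by size and stride) and random (with sizes drawn from the half-open interval size_lims). The sketch below illustrates both ideas in plain Python. The function names are hypothetical, padding and AOI handling are ignored, and this is not the library's sampling code.

```python
import random
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (ymin, xmin, ymax, xmax) in pixel coords


def sliding_windows(height: int, width: int, size: int,
                    stride: int) -> Iterator[Box]:
    """Yield square windows that fit fully inside the extent (no padding)."""
    for ymin in range(0, height - size + 1, stride):
        for xmin in range(0, width - size + 1, stride):
            yield (ymin, xmin, ymin + size, xmin + size)


def sample_window_size(size_lims: Tuple[int, int],
                       rng: random.Random) -> int:
    """Sample a size uniformly from the half-open interval [min, max)."""
    lo, hi = size_lims
    return rng.randrange(lo, hi)  # the upper limit is exclusive


windows = list(sliding_windows(height=512, width=512, size=256, stride=256))

rng = random.Random(0)
# size_lims = (sz, sz + 1) leaves exactly one possible size.
fixed = sample_window_size((256, 257), rng)
varied = [sample_window_size((100, 200), rng) for _ in range(50)]
```

With a stride equal to the window size the sliding windows tile the extent without overlap; a smaller stride produces overlapping windows.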
- build(tmp_dir: str, overfit_mode: bool = False, test_mode: bool = False) Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset] #
Build an instance of the corresponding type of object using this config.
For example, BackendConfig will build a Backend object. The arguments to this method will vary depending on the type of Config.
- Parameters
tmp_dir (str) –
overfit_mode (bool) –
test_mode (bool) –
- Return type
Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset]
- build_scenes(tmp_dir: str) Tuple[List[Scene], List[Scene], List[Scene]] #
Build training, validation, and test scenes.
- get_bbox_params() Optional[BboxParams] #
Returns BboxParams used by albumentations for data augmentation.
- Return type
Optional[BboxParams]
- validator get_class_info_from_class_config_if_needed » all fields#
- get_custom_albumentations_transforms() List[dict] #
Returns all custom transforms found in this config.
This should return all serialized albumentations transforms with a ‘lambda_transforms_path’ field contained in this config or in any of its members, no matter how deeply nested.
The purpose is to make it easier to adjust their paths all at once while saving to or loading from a bundle.
- get_data_transforms() Tuple[BasicTransform, BasicTransform] #
Get albumentations transform objects for data augmentation.
Returns a 2-tuple of a “base” transform and an augmentation transform. The base transform comprises a resize transform based on img_sz followed by the transform specified in base_transform. The augmentation transform comprises the base transform followed by either the transform in aug_transform (if specified) or the transforms in the augmentors field.
The augmentation transform is intended to be used for training data, and the base transform for all other data where data augmentation is not desirable, such as validation or prediction.
- Returns
base transform and augmentation transform.
- Return type
Tuple[BasicTransform, BasicTransform]
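The composition described for get_data_transforms can be sketched without albumentations by using stand-in transforms that only record their names. The sketch follows the ordering stated in this method's description (resize, then base_transform, then augmentation); all names here are hypothetical and the real method returns albumentations objects.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Transform = Callable[[List[str]], List[str]]


def named(name: str) -> Transform:
    """Stand-in transform: appends its name to a trace instead of editing pixels."""
    return lambda trace: trace + [name]


def compose(transforms: Sequence[Transform]) -> Transform:
    """Chain transforms so that they run left to right."""
    def apply(trace: List[str]) -> List[str]:
        for tf in transforms:
            trace = tf(trace)
        return trace
    return apply


def get_data_transforms_sketch(
        base_transform: Optional[Transform] = None,
        aug_transform: Optional[Transform] = None,
        augmentors: Sequence[str] = ()) -> Tuple[Transform, Transform]:
    """Return (base, augmentation) transforms as described in the text."""
    base_steps = [named('resize')]  # resize driven by img_sz
    if base_transform is not None:
        base_steps.append(base_transform)
    base = compose(base_steps)

    if aug_transform is not None:
        aug_steps = [aug_transform]  # the augmentors option is ignored
    else:
        aug_steps = [named(a) for a in augmentors]
    return base, compose([base] + aug_steps)


base_tf, aug_tf = get_data_transforms_sketch(
    base_transform=named('normalize'),
    augmentors=['HorizontalFlip', 'VerticalFlip'])
base_trace = base_tf([])
aug_trace = aug_tf([])
```

Here base_tf would be used for validation and prediction data, and aug_tf for training data.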
- make_datasets(tmp_dir: str, train_tf: Optional[BasicTransform] = None, val_tf: Optional[BasicTransform] = None, test_tf: Optional[BasicTransform] = None, **kwargs) Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset] #
Make training, validation, and test datasets.
- Parameters
tmp_dir (str) – Temporary directory to be used for building scenes.
train_tf (Optional[A.BasicTransform], optional) – Transform for the training dataset. Defaults to None.
val_tf (Optional[A.BasicTransform], optional) – Transform for the validation dataset. Defaults to None.
test_tf (Optional[A.BasicTransform], optional) – Transform for the test dataset. Defaults to None.
**kwargs – Kwargs to pass to scene_to_dataset().
- Returns
PyTorch-compatible training, validation, and test datasets.
- Return type
Tuple[Dataset, Dataset, Dataset]
- random_subset_dataset(ds: torch.utils.data.Dataset, size: Optional[int] = None, fraction: Optional[ConstrainedFloatValue] = None) torch.utils.data.Subset #
- Parameters
ds (torch.utils.data.Dataset) –
size (Optional[int]) –
fraction (Optional[ConstrainedFloatValue]) –
- Return type
torch.utils.data.Subset
- recursive_validate_config()#
Recursively validate hierarchies of Configs.
This uses reflection to call validate_config on a hierarchy of Configs using a depth-first pre-order traversal.
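A depth-first pre-order traversal validates a config before any of the configs nested inside it. The sketch below shows that visiting order with a hypothetical SketchConfig class that stores its children in an explicit list; the real method discovers nested Configs by reflection, which is not reproduced here.

```python
from typing import List, Optional


class SketchConfig:
    """Hypothetical stand-in for a Config that may contain nested Configs."""

    def __init__(self, name: str,
                 children: Optional[List['SketchConfig']] = None):
        self.name = name
        self.children = children or []

    def validate_config(self, visited: List[str]) -> None:
        # A real config would check its fields here; we only record the visit.
        visited.append(self.name)

    def recursive_validate_config(self, visited: List[str]) -> None:
        # Pre-order: validate this node first, then descend into each child.
        self.validate_config(visited)
        for child in self.children:
            child.recursive_validate_config(visited)


tree = SketchConfig('data', [
    SketchConfig('scene_dataset', [
        SketchConfig('scene_a'),
        SketchConfig('scene_b'),
    ]),
    SketchConfig('window_opts'),
])
visit_order: List[str] = []
tree.recursive_validate_config(visit_order)
```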
- revalidate()#
Re-validate an instantiated Config.
Runs all Pydantic validators plus self.validate_config().
Adapted from: https://github.com/samuelcolvin/pydantic/issues/1864#issuecomment-679044432
- scene_to_dataset(scene: Scene, transform: Optional[BasicTransform] = None) Union[RegressionSlidingWindowGeoDataset, RegressionRandomWindowGeoDataset] [source]#
Make a dataset from a single scene.
- Parameters
scene (Scene) –
transform (Optional[BasicTransform]) –
- Return type
Union[RegressionSlidingWindowGeoDataset, RegressionRandomWindowGeoDataset]
- update(*args, **kwargs)#
Update any fields before validation.
Subclasses should override this to provide complex default behavior, for example, setting default values as a function of the values of other fields. The arguments to this method will vary depending on the type of Config.
- validator validate_augmentors » augmentors#
- validate_config()#
Validate fields that should be checked after update is called.
This is to complement the builtin validation that Pydantic performs at the time of object construction.
- validate_list(field: str, valid_options: List[str])#
Validate a list field.
- Parameters
field (str) –
valid_options (List[str]) –
- Raises
ConfigError – if field is invalid
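As a rough illustration of list validation, the sketch below checks a list of augmentor names against the choices listed for the augmentors field. The ConfigError class here is a local stand-in, validate_list_sketch is a hypothetical free function rather than the method itself, and the choices list may not be exhaustive (the field description says "Choices include").

```python
from typing import List


class ConfigError(ValueError):
    """Local stand-in for the library's ConfigError."""


# Choices copied from the augmentors field description.
AUGMENTOR_CHOICES = [
    'Blur', 'RandomRotate90', 'HorizontalFlip', 'VerticalFlip',
    'GaussianBlur', 'GaussNoise', 'RGBShift', 'ToGray'
]


def validate_list_sketch(field: str, values: List[str],
                         valid_options: List[str]) -> None:
    """Raise ConfigError if any entry of values is not a valid option."""
    for value in values:
        if value not in valid_options:
            raise ConfigError(
                f'{value} is not a valid option for {field}')


# The default augmentors pass the check.
validate_list_sketch(
    'augmentors', ['RandomRotate90', 'HorizontalFlip', 'VerticalFlip'],
    AUGMENTOR_CHOICES)

try:
    validate_list_sketch('augmentors', ['Sharpen'], AUGMENTOR_CHOICES)
    rejected = False
except ConfigError:
    rejected = True
```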
- validator validate_window_opts » window_opts#
- Parameters
v (Union[GeoDataWindowConfig, Dict[str, GeoDataWindowConfig]]) –
values (dict) –
- Return type
- property num_classes#