ClassificationImageDataConfig#
Note
All Configs are derived from rastervision.pipeline.config.Config, which itself is a pydantic Model.
- pydantic model ClassificationImageDataConfig[source]#
Configure ClassificationImageDatasets.
{ "title": "ClassificationImageDataConfig", "description": "Configure :class:`ClassificationImageDatasets <.ClassificationImageDataset>`.", "type": "object", "properties": { "class_names": { "title": "Class Names", "description": "Names of classes.", "default": [], "type": "array", "items": { "type": "string" } }, "class_colors": { "title": "Class Colors", "description": "Colors used to display classes. Can be color 3-tuples in list form.", "type": "array", "items": { "anyOf": [ { "type": "string" }, { "type": "array", "minItems": 3, "maxItems": 3, "items": [ { "type": "integer" }, { "type": "integer" }, { "type": "integer" } ] } ] } }, "img_channels": { "title": "Img Channels", "description": "The number of channels of the training images.", "exclusiveMinimum": 0, "type": "integer" }, "img_sz": { "title": "Img Sz", "description": "Length of a side of each image in pixels. This is the size to transform it to during training, not the size in the raw dataset.", "default": 256, "exclusiveMinimum": 0, "type": "integer" }, "train_sz": { "title": "Train Sz", "description": "If set, the number of training images to use. If fewer images exist, then an exception will be raised.", "type": "integer" }, "train_sz_rel": { "title": "Train Sz Rel", "description": "If set, the proportion of training images to use.", "type": "number" }, "num_workers": { "title": "Num Workers", "description": "Number of workers to use when DataLoader makes batches.", "default": 4, "type": "integer" }, "augmentors": { "title": "Augmentors", "description": "Names of albumentations augmentors to use for training batches. Choices include: ['Blur', 'RandomRotate90', 'HorizontalFlip', 'VerticalFlip', 'GaussianBlur', 'GaussNoise', 'RGBShift', 'ToGray']. 
Alternatively, a custom transform can be provided via the aug_transform option.", "default": [ "RandomRotate90", "HorizontalFlip", "VerticalFlip" ], "type": "array", "items": { "type": "string" } }, "base_transform": { "title": "Base Transform", "description": "An Albumentations transform serialized as a dict that will be applied to all datasets: training, validation, and test. This transformation is in addition to the resizing due to img_sz. This is useful for, for example, applying the same normalization to all datasets.", "type": "object" }, "aug_transform": { "title": "Aug Transform", "description": "An Albumentations transform serialized as a dict that will be applied as data augmentation to the training dataset. This transform is applied before base_transform. If provided, the augmentors option is ignored.", "type": "object" }, "plot_options": { "title": "Plot Options", "description": "Options to control plotting.", "default": { "transform": { "__version__": "1.3.0", "transform": { "__class_fullname__": "rastervision.pytorch_learner.utils.utils.MinMaxNormalize", "always_apply": false, "p": 1.0, "min_val": 0.0, "max_val": 1.0, "dtype": 5 } }, "channel_display_groups": null, "type_hint": "plot_options" }, "allOf": [ { "$ref": "#/definitions/PlotOptions" } ] }, "preview_batch_limit": { "title": "Preview Batch Limit", "description": "Optional limit on the number of items in the preview plots produced during training.", "type": "integer" }, "type_hint": { "title": "Type Hint", "default": "classification_image_data", "enum": [ "classification_image_data" ], "type": "string" }, "data_format": { "default": "image_folder", "allOf": [ { "$ref": "#/definitions/ClassificationDataFormat" } ] }, "uri": { "title": "Uri", "description": "One of the following:\n(1) a URI of a directory containing \"train\", \"valid\", and (optinally) \"test\" subdirectories;\n(2) a URI of a zip file containing (1);\n(3) a list of (2);\n(4) a URI of a directory containing zip files containing 
(1).", "anyOf": [ { "type": "string" }, { "type": "array", "items": { "type": "string" } } ] }, "group_uris": { "title": "Group Uris", "description": "This can be set instead of uri in order to specify groups of chips. Each element in the list is expected to be an object of the same form accepted by the uri field. The purpose of separating chips into groups is to be able to use the group_train_sz field.", "type": "array", "items": { "anyOf": [ { "type": "string" }, { "type": "array", "items": { "type": "string" } } ] } }, "group_train_sz": { "title": "Group Train Sz", "description": "If group_uris is set, this can be used to specify the number of chips to use per group. Only applies to training chips. This can either be a single value that will be used for all groups or a list of values (one for each group).", "anyOf": [ { "type": "integer" }, { "type": "array", "items": { "type": "integer" } } ] }, "group_train_sz_rel": { "title": "Group Train Sz Rel", "description": "Relative version of group_train_sz. Must be a float in [0, 1]. If group_uris is set, this can be used to specify the proportion of the total chips in each group to use per group. Only applies to training chips. This can either be a single value that will be used for all groups or a list of values (one for each group).", "anyOf": [ { "type": "number", "minimum": 0, "maximum": 1 }, { "type": "array", "items": { "type": "number", "minimum": 0, "maximum": 1 } } ] } }, "additionalProperties": false, "definitions": { "PlotOptions": { "title": "PlotOptions", "description": "Config related to plotting.", "type": "object", "properties": { "transform": { "title": "Transform", "description": "An Albumentations transform serialized as a dict that will be applied to each image before it is plotted. Mainly useful for undoing any data transformation that you do not want included in the plot, such as normalization. 
The default value will shift and scale the image so the values range from 0.0 to 1.0 which is the expected range for the plotting function. This default is useful for cases where the values after normalization are close to zero which makes the plot difficult to see.", "default": { "__version__": "1.3.0", "transform": { "__class_fullname__": "rastervision.pytorch_learner.utils.utils.MinMaxNormalize", "always_apply": false, "p": 1.0, "min_val": 0.0, "max_val": 1.0, "dtype": 5 } }, "type": "object" }, "channel_display_groups": { "title": "Channel Display Groups", "description": "Groups of image channels to display together as a subplot when plotting the data and predictions. Can be a list or tuple of groups (e.g. [(0, 1, 2), (3,)]) or a dict containing title-to-group mappings (e.g. {\"RGB\": [0, 1, 2], \"IR\": [3]}), where each group is a list or tuple of channel indices and title is a string that will be used as the title of the subplot for that group.", "anyOf": [ { "type": "object", "additionalProperties": { "type": "array", "items": { "type": "integer", "minimum": 0 } } }, { "type": "array", "items": { "type": "array", "items": { "type": "integer", "minimum": 0 } } } ] }, "type_hint": { "title": "Type Hint", "default": "plot_options", "enum": [ "plot_options" ], "type": "string" } }, "additionalProperties": false }, "ClassificationDataFormat": { "title": "ClassificationDataFormat", "description": "An enumeration.", "enum": [ "image_folder" ] } } }
- Config
extra: str = forbid
validate_assignment: bool = True
- Fields
- Validators
ensure_class_colors
»all fields
validate_augmentors
»augmentors
validate_group_uris
»all fields
validate_plot_options
»all fields
- field aug_transform: Optional[dict] = None#
An Albumentations transform serialized as a dict that will be applied as data augmentation to the training dataset. This transform is applied before base_transform. If provided, the augmentors option is ignored.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field augmentors: List[str] = ['RandomRotate90', 'HorizontalFlip', 'VerticalFlip']#
Names of albumentations augmentors to use for training batches. Choices include: [‘Blur’, ‘RandomRotate90’, ‘HorizontalFlip’, ‘VerticalFlip’, ‘GaussianBlur’, ‘GaussNoise’, ‘RGBShift’, ‘ToGray’]. Alternatively, a custom transform can be provided via the aug_transform option.
- Validated by
ensure_class_colors
validate_augmentors
validate_group_uris
validate_plot_options
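The documented augmentor choices can be checked before building the config. Below is a hypothetical sketch (the `check_augmentors` helper is not part of Raster Vision; the valid names are taken from the field description above):

```python
# Hypothetical sketch: validate an `augmentors` list against the
# documented choices before building the config.
VALID_AUGMENTORS = [
    'Blur', 'RandomRotate90', 'HorizontalFlip', 'VerticalFlip',
    'GaussianBlur', 'GaussNoise', 'RGBShift', 'ToGray',
]

def check_augmentors(augmentors):
    """Raise ValueError for any name not in the documented choices."""
    bad = [a for a in augmentors if a not in VALID_AUGMENTORS]
    if bad:
        raise ValueError(f'Unknown augmentors: {bad}')
    return augmentors

augmentors = check_augmentors(['RandomRotate90', 'HorizontalFlip'])
```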
- field base_transform: Optional[dict] = None#
An Albumentations transform serialized as a dict that will be applied to all datasets: training, validation, and test. This transformation is in addition to the resizing due to img_sz. This is useful for, for example, applying the same normalization to all datasets.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
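For illustration, here is the dict shape that base_transform and aug_transform expect. In practice such a dict is typically produced by albumentations' `to_dict` serialization; the values below mirror the MinMaxNormalize transform used as the plot_options default in this config:

```python
# An Albumentations transform serialized as a dict, the form accepted by
# base_transform / aug_transform. Values mirror the plot_options default
# shown in this config (MinMaxNormalize).
base_transform = {
    '__version__': '1.3.0',
    'transform': {
        '__class_fullname__':
            'rastervision.pytorch_learner.utils.utils.MinMaxNormalize',
        'always_apply': False,
        'p': 1.0,
        'min_val': 0.0,
        'max_val': 1.0,
        'dtype': 5,
    },
}
```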
- field class_colors: Optional[List[Union[str, RGBTuple]]] = None#
Colors used to display classes. Can be color 3-tuples in list form.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field class_names: List[str] = []#
Names of classes.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field data_format: ClassificationDataFormat = ClassificationDataFormat.image_folder#
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field group_train_sz: Optional[Union[int, List[int]]] = None#
If group_uris is set, this can be used to specify the number of chips to use per group. Only applies to training chips. This can either be a single value that will be used for all groups or a list of values (one for each group).
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field group_train_sz_rel: Optional[Union[Proportion, List[Proportion]]] = None#
Relative version of group_train_sz. Must be a float in [0, 1]. If group_uris is set, this can be used to specify the proportion of the total chips in each group to use per group. Only applies to training chips. This can either be a single value that will be used for all groups or a list of values (one for each group).
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
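The broadcast behavior described above (a scalar applies to all groups; a list supplies one value per group) can be sketched as follows. This is a hypothetical illustration of the semantics, not Raster Vision's actual implementation:

```python
# Hypothetical sketch of resolving group_train_sz / group_train_sz_rel to
# one training size per group. A scalar is broadcast to all groups; a list
# must have one entry per group.
def resolve_group_sizes(group_uris, group_train_sz=None,
                        group_train_sz_rel=None, group_totals=None):
    n = len(group_uris)
    if group_train_sz is not None:
        sizes = (group_train_sz if isinstance(group_train_sz, list)
                 else [group_train_sz] * n)
    elif group_train_sz_rel is not None:
        rels = (group_train_sz_rel if isinstance(group_train_sz_rel, list)
                else [group_train_sz_rel] * n)
        # Proportions in [0, 1] are applied to each group's total chip count.
        sizes = [int(r * t) for r, t in zip(rels, group_totals)]
    else:
        return None
    assert len(sizes) == n, 'one size per group required'
    return sizes

# Use 50% of each group's chips:
sizes = resolve_group_sizes(['g1.zip', 'g2.zip'], group_train_sz_rel=0.5,
                            group_totals=[100, 40])
```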
- field group_uris: Optional[List[Union[str, List[str]]]] = None#
This can be set instead of uri in order to specify groups of chips. Each element in the list is expected to be an object of the same form accepted by the uri field. The purpose of separating chips into groups is to be able to use the group_train_sz field.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field img_channels: Optional[PosInt] = None#
The number of channels of the training images.
- Constraints
exclusiveMinimum = 0
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field img_sz: PosInt = 256#
Length of a side of each image in pixels. This is the size to transform it to during training, not the size in the raw dataset.
- Constraints
exclusiveMinimum = 0
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field num_workers: int = 4#
Number of workers to use when DataLoader makes batches.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field plot_options: Optional[PlotOptions] = PlotOptions(transform={'__version__': '1.3.0', 'transform': {'__class_fullname__': 'rastervision.pytorch_learner.utils.utils.MinMaxNormalize', 'always_apply': False, 'p': 1.0, 'min_val': 0.0, 'max_val': 1.0, 'dtype': 5}}, channel_display_groups=None)#
Options to control plotting.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
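PlotOptions also accepts a channel_display_groups setting: groups of channel indices to plot together, given either as a list of groups or as a dict of title-to-group mappings. A hypothetical value (the channel names and indices are illustrative):

```python
# Hypothetical channel_display_groups value for PlotOptions: a dict mapping
# subplot titles to lists of channel indices.
channel_display_groups = {'RGB': [0, 1, 2], 'IR': [3]}

# Every channel index referenced across the groups:
all_channels = sorted(
    i for group in channel_display_groups.values() for i in group)
```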
- field preview_batch_limit: Optional[int] = None#
Optional limit on the number of items in the preview plots produced during training.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field train_sz: Optional[int] = None#
If set, the number of training images to use. If fewer images exist, then an exception will be raised.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field train_sz_rel: Optional[float] = None#
If set, the proportion of training images to use.
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field type_hint: Literal['classification_image_data'] = 'classification_image_data'#
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
- field uri: Optional[Union[str, List[str]]] = None#
One of the following: (1) a URI of a directory containing “train”, “valid”, and (optionally) “test” subdirectories; (2) a URI of a zip file containing (1); (3) a list of (2); (4) a URI of a directory containing zip files containing (1).
- Validated by
ensure_class_colors
validate_group_uris
validate_plot_options
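Layout (1) above is the standard image_folder format: split subdirectories, each containing one folder per class. The sketch below builds such a layout with the standard library (the class names are made up for illustration):

```python
# Hypothetical sketch of layout (1) from the uri docs: a directory with
# "train", "valid", and (optionally) "test" subdirectories, each holding
# one folder of images per class.
import os
import tempfile

root = tempfile.mkdtemp()
for split in ('train', 'valid', 'test'):
    for class_name in ('building', 'no_building'):  # illustrative classes
        os.makedirs(os.path.join(root, split, class_name))

splits = sorted(os.listdir(root))
```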
- build(tmp_dir: str, overfit_mode: bool = False, test_mode: bool = False) Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset] #
Build an instance of the corresponding type of object using this config.
For example, BackendConfig will build a Backend object. The arguments to this method will vary depending on the type of Config.
- Parameters
tmp_dir (str) –
overfit_mode (bool) –
test_mode (bool) –
- Return type
Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset]
- dir_to_dataset(data_dir: str, transform: BasicTransform) ClassificationImageDataset [source]#
- Parameters
data_dir (str) –
transform (BasicTransform) –
- Return type
ClassificationImageDataset
- get_bbox_params() Optional[BboxParams] #
Returns BboxParams used by albumentations for data augmentation.
- Return type
Optional[BboxParams]
- get_custom_albumentations_transforms() List[dict] #
Returns all custom transforms found in this config.
This should return all serialized albumentations transforms with a ‘lambda_transforms_path’ field contained in this config or in any of its members, no matter how deeply nested.
The purpose is to make it easier to adjust their paths all at once while saving to or loading from a bundle.
- get_data_dirs(uri: Union[str, List[str]], unzip_dir: str) List[str] #
Extract data dirs from uri.
Data dirs are directories containing “train”, “valid”, and (optionally) “test” subdirectories.
- Parameters
uri (Union[str, List[str]]) –
unzip_dir (str) –
- Returns
paths to directories that each contain contents of one zip file
- Return type
List[str]
- get_data_transforms() Tuple[BasicTransform, BasicTransform] #
Get albumentations transform objects for data augmentation.
- Returns
1st tuple arg: a transform that doesn’t do any data augmentation; 2nd tuple arg: a transform with data augmentation.
- Return type
Tuple[BasicTransform, BasicTransform]
- get_datasets_from_group_uris(uris: Union[str, List[str]], tmp_dir: str, group_train_sz: Optional[int] = None, group_train_sz_rel: Optional[float] = None, overfit_mode: bool = False, test_mode: bool = False) Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset] #
- Parameters
uris (Union[str, List[str]]) –
tmp_dir (str) –
group_train_sz (Optional[int]) –
group_train_sz_rel (Optional[float]) –
overfit_mode (bool) –
test_mode (bool) –
- Return type
Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset]
- get_datasets_from_uri(uri: Union[str, List[str]], tmp_dir: str, overfit_mode: bool = False, test_mode: bool = False) Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset] #
Get image train, validation, & test datasets from a single zip file.
- make_datasets(train_dirs: Iterable[str], val_dirs: Iterable[str], test_dirs: Iterable[str], train_tf: Optional[BasicTransform] = None, val_tf: Optional[BasicTransform] = None, test_tf: Optional[BasicTransform] = None) Tuple[torch.utils.data.Dataset, torch.utils.data.Dataset, torch.utils.data.Dataset] #
Make training, validation, and test datasets.
- Parameters
train_dirs (Iterable[str]) – Directories where training data is located.
val_dirs (Iterable[str]) – Directories where validation data is located.
test_dirs (Iterable[str]) – Directories where test data is located.
train_tf (Optional[A.BasicTransform], optional) – Transform for the training dataset. Defaults to None.
val_tf (Optional[A.BasicTransform], optional) – Transform for the validation dataset. Defaults to None.
test_tf (Optional[A.BasicTransform], optional) – Transform for the test dataset. Defaults to None.
- Returns
PyTorch-compatible training, validation, and test datasets.
- Return type
Tuple[Dataset, Dataset, Dataset]
- random_subset_dataset(ds: torch.utils.data.Dataset, size: Optional[int] = None, fraction: Optional[ConstrainedFloatValue] = None) torch.utils.data.Subset #
- Parameters
ds (torch.utils.data.Dataset) –
size (Optional[int]) –
fraction (Optional[ConstrainedFloatValue]) –
- Return type
torch.utils.data.Subset
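The selection logic behind random_subset_dataset can be sketched with the standard library: pick a random subset of dataset indices either by absolute size or by a fraction in [0, 1]. This is a hypothetical illustration, not the actual implementation:

```python
# Hypothetical stdlib sketch of random subset selection by size or fraction.
import random

def random_subset_indices(n_items, size=None, fraction=None, seed=0):
    """Return `size` (or round(n_items * fraction)) distinct random indices."""
    if size is None:
        size = round(n_items * fraction)
    rng = random.Random(seed)  # seeded for reproducibility
    return rng.sample(range(n_items), size)

idxs = random_subset_indices(100, fraction=0.1)
```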
- recursive_validate_config()#
Recursively validate hierarchies of Configs.
This uses reflection to call validate_config on a hierarchy of Configs using a depth-first pre-order traversal.
- revalidate()#
Re-validate an instantiated Config.
Runs all Pydantic validators plus self.validate_config().
Adapted from: https://github.com/samuelcolvin/pydantic/issues/1864#issuecomment-679044432
- update(*args, **kwargs)#
Update any fields before validation.
Subclasses should override this to provide complex default behavior, for example, setting default values as a function of the values of other fields. The arguments to this method will vary depending on the type of Config.
- validator validate_augmentors » augmentors#
- validate_config()#
Validate fields that should be checked after update is called.
This is to complement the builtin validation that Pydantic performs at the time of object construction.
- validate_list(field: str, valid_options: List[str])#
Validate a list field.
- Parameters
field (str) –
valid_options (List[str]) –
- Raises
ConfigError – if field is invalid
- property num_classes#