super_gradients.training.datasets package

Subpackages

Submodules

super_gradients.training.datasets.all_datasets module

exception super_gradients.training.datasets.all_datasets.DataSetDoesNotExistException[source]

Bases: Exception

The requested dataset does not exist, or is not implemented.

class super_gradients.training.datasets.all_datasets.SgLibraryDatasets[source]

Bases: object

Holds all of the library's dataset dictionaries, mapped by deep learning task.

Attributes:

CLASSIFICATION – dictionary of classification datasets
OBJECT_DETECTION – dictionary of object detection datasets
SEMANTIC_SEGMENTATION – dictionary of semantic segmentation datasets

CLASSIFICATION = {'cifar_10': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.Cifar10DatasetInterface'>, 'cifar_100': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.Cifar100DatasetInterface'>, 'classification_dataset': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.ClassificationDatasetInterface'>, 'imagenet': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.ImageNetDatasetInterface'>, 'library_dataset': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.LibraryDatasetInterface'>, 'test_dataset': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.TestDatasetInterface'>, 'tiny_imagenet': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.TinyImageNetDatasetInterface'>}
OBJECT_DETECTION = {'coco': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.CoCoDetectionDatasetInterface'>}
SEMANTIC_SEGMENTATION = {'coco': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.CoCoSegmentationDatasetInterface'>, 'pascal_aug': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.PascalAUG2012SegmentationDataSetInterface'>, 'pascal_voc': <class 'super_gradients.training.datasets.dataset_interfaces.dataset_interface.PascalVOC2012SegmentationDataSetInterface'>}
static get_all_available_datasets() → Dict[str, List[str]][source]

Gets all the available datasets.

static get_dataset(dl_task: str, dataset_name: str) → Type[super_gradients.training.datasets.dataset_interfaces.dataset_interface.DatasetInterface][source]

Gets a dataset with a given name for a given deep learning task.

Example:
>>> SgLibraryDatasets.get_dataset(dl_task='classification', dataset_name='cifar_100')
>>> <Cifar100DatasetInterface instance>
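A minimal usage sketch: the lookup call follows the signature above (which returns the interface class); the constructor argument shown is an assumption for illustration.

# Sketch: look up a dataset interface class by task and name, then instantiate it.
from super_gradients.training.datasets.all_datasets import SgLibraryDatasets

dataset_cls = SgLibraryDatasets.get_dataset(dl_task="classification", dataset_name="cifar_100")
dataset_interface = dataset_cls(dataset_params={"batch_size": 64})  # dataset_params keys are assumed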

super_gradients.training.datasets.auto_augment module

RandAugment is a variant of AutoAugment which randomly selects transformations from AutoAugment to be applied to an image.

Implementation adapted from:

https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/auto_augment.py

Papers:

RandAugment: Practical automated data augmentation… - https://arxiv.org/abs/1909.13719

super_gradients.training.datasets.auto_augment.shear_x(img, factor, **kwargs)[source]
super_gradients.training.datasets.auto_augment.shear_y(img, factor, **kwargs)[source]
super_gradients.training.datasets.auto_augment.translate_x_rel(img, pct, **kwargs)[source]
super_gradients.training.datasets.auto_augment.translate_y_rel(img, pct, **kwargs)[source]
super_gradients.training.datasets.auto_augment.translate_x_abs(img, pixels, **kwargs)[source]
super_gradients.training.datasets.auto_augment.translate_y_abs(img, pixels, **kwargs)[source]
super_gradients.training.datasets.auto_augment.rotate(img, degrees, **kwargs)[source]
super_gradients.training.datasets.auto_augment.auto_contrast(img, **__)[source]
super_gradients.training.datasets.auto_augment.invert(img, **__)[source]
super_gradients.training.datasets.auto_augment.equalize(img, **__)[source]
super_gradients.training.datasets.auto_augment.solarize(img, thresh, **__)[source]
super_gradients.training.datasets.auto_augment.solarize_add(img, add, thresh=128, **__)[source]
super_gradients.training.datasets.auto_augment.posterize(img, bits_to_keep, **__)[source]
super_gradients.training.datasets.auto_augment.contrast(img, factor, **__)[source]
super_gradients.training.datasets.auto_augment.color(img, factor, **__)[source]
super_gradients.training.datasets.auto_augment.brightness(img, factor, **__)[source]
super_gradients.training.datasets.auto_augment.sharpness(img, factor, **__)[source]
class super_gradients.training.datasets.auto_augment.AugmentOp(name, prob=0.5, magnitude=10, hparams=None)[source]

Bases: object

A single AutoAugment operation.

super_gradients.training.datasets.auto_augment.rand_augment_ops(magnitude=10, hparams=None, transforms=None)[source]
class super_gradients.training.datasets.auto_augment.RandAugment(ops, num_layers=2, choice_weights=None)[source]

Bases: object

Random AutoAugment class; selects AutoAugment transforms according to per-op probability weights.

super_gradients.training.datasets.auto_augment.rand_augment_transform(config_str, hparams)[source]

Create a RandAugment transform

Parameters
  • config_str – String defining the configuration of random augmentation. Consists of multiple sections separated by dashes (‘-‘). The first section defines the specific variant of RandAugment (currently only ‘rand’). The remaining sections, in no particular order, determine:

    ‘m’ - integer magnitude of RandAugment
    ‘n’ - integer num layers (number of transform ops selected per image)
    ‘w’ - integer probability weight index (index of a set of weights to influence choice of op)
    ‘mstd’ - float standard deviation of the magnitude noise applied
    ‘inc’ - integer (bool), use augmentations that increase in severity with magnitude (default: 0)

    Examples: ‘rand-m9-n3-mstd0.5’ results in RandAugment with magnitude 9, num_layers 3, magnitude_std 0.5; ‘rand-mstd1-w0’ results in magnitude_std 1.0, weights 0, default magnitude of 10 and num_layers 2.

  • hparams – Other hparams (kwargs) for the RandAugmentation scheme

Returns

A PyTorch compatible Transform
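A usage sketch, assuming hparams keys similar to the referenced timm implementation (translate_const, img_mean); the transform operates on a PIL image and can be composed with torchvision transforms.

# Sketch: build a RandAugment transform from a config string.
from torchvision import transforms
from super_gradients.training.datasets.auto_augment import rand_augment_transform

rand_aug = rand_augment_transform("rand-m9-n3-mstd0.5", {"translate_const": 100, "img_mean": (124, 116, 104)})
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    rand_aug,                 # applied on the PIL image
    transforms.ToTensor(),
])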

super_gradients.training.datasets.data_augmentation module

class super_gradients.training.datasets.data_augmentation.DataAugmentation[source]

Bases: object

static to_tensor()[source]
static normalize(mean, std)[source]
static cutout(mask_size, p=1, cutout_inside=False, mask_color=(0, 0, 0))[source]
class super_gradients.training.datasets.data_augmentation.Lighting(alphastd, eigval=tensor([0.2175, 0.0188, 0.0045]), eigvec=tensor([[-0.5675, 0.7192, 0.4009], [-0.5808, -0.0045, -0.8140], [-0.5836, -0.6948, 0.4203]]))[source]

Bases: object

Lighting noise (AlexNet-style PCA-based noise). Taken from the fastai ImageNet training code - https://github.com/fastai/imagenet-fast/blob/faa0f9dfc9e8e058ffd07a248724bf384f526fae/imagenet_nv/fastai_imagenet.py#L103 To use:

  • training_params = {“imagenet_pca_aug”: 0.1}

  • Default training_params arg is 0.0 (“don’t use”)

  • 0.1 is the default in the original paper
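A short sketch of both usage paths; enabling the augmentation through training_params follows the note above, while calling the transform directly on a CHW tensor is an assumption based on the referenced fastai implementation.

import torch
from super_gradients.training.datasets.data_augmentation import Lighting

training_params = {"imagenet_pca_aug": 0.1}     # 0.1 matches the original paper's default

lighting = Lighting(alphastd=0.1)
augmented = lighting(torch.rand(3, 224, 224))   # assumed: the transform is callable on a CHW tensor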

class super_gradients.training.datasets.data_augmentation.RandomErase(probability: float, value: str)[source]

Bases: torchvision.transforms.transforms.RandomErasing

A simple wrapper around torchvision's RandomErasing that translates the parameters supported in SuperGradients' code base.

training: bool

super_gradients.training.datasets.datasets_conf module

super_gradients.training.datasets.datasets_utils module

super_gradients.training.datasets.datasets_utils.get_mean_and_std_torch(data_dir=None, dataloader=None, num_workers=4, RandomResizeSize=224)[source]

A function for computing the mean and std of large datasets using a PyTorch DataLoader and GPU functionality.

Parameters
  • data_dir – String, path to a non-library dataset folder. For example “/data/Imagenette” or “/data/TinyImagenet”

  • dataloader – a torch DataLoader, as it would feed the data into the trainer (including transforms etc).

  • RandomResizeSize – Int, the size of the RandomResizeCrop as it appears in the DataInterface (for example, for ImageNet this value should be 224).

Returns

Two lists, mean and std, each of length 3 (one per channel)
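A usage sketch; the dataset folder path is a placeholder.

from super_gradients.training.datasets.datasets_utils import get_mean_and_std_torch

mean, std = get_mean_and_std_torch(data_dir="/data/Imagenette", num_workers=4, RandomResizeSize=224)
print(mean, std)  # two lists of length 3, one entry per RGB channel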

super_gradients.training.datasets.datasets_utils.get_mean_and_std(dataset)[source]

Compute the mean and std values of a dataset.

class super_gradients.training.datasets.datasets_utils.AbstractCollateFunction[source]

Bases: abc.ABC

A collate function (for torch DataLoader)

class super_gradients.training.datasets.datasets_utils.ComposedCollateFunction(functions: list)[source]

Bases: super_gradients.training.datasets.datasets_utils.AbstractCollateFunction

A function (for torch DataLoader) which executes a sequence of sub collate functions

class super_gradients.training.datasets.datasets_utils.AtomicInteger(value: int = 0)[source]

Bases: object

class super_gradients.training.datasets.datasets_utils.MultiScaleCollateFunction(target_size: Optional[int] = None, min_image_size: Optional[int] = None, max_image_size: Optional[int] = None, image_size_steps: int = 32, change_frequency: int = 10)[source]

Bases: super_gradients.training.datasets.datasets_utils.AbstractCollateFunction

A collate function that implements multi-scale data augmentation according to https://arxiv.org/pdf/1612.08242.pdf

class super_gradients.training.datasets.datasets_utils.AbstractPrePredictionCallback[source]

Bases: abc.ABC

Abstract class for a forward-pass preprocessing function, to be used by passing one of its subclasses through the training_params pre_prediction_callback keyword argument.

Subclasses should implement __call__ and return images, targets after applying the desired preprocessing.

class super_gradients.training.datasets.datasets_utils.MultiscalePrePredictionCallback(multiscale_range: int = 5, image_size_steps: int = 32, change_frequency: int = 10)[source]

Bases: super_gradients.training.datasets.datasets_utils.AbstractPrePredictionCallback

Multiscale pre-prediction callback.

When passed through train_params, the transform below is applied to images and targets to support multi-scaling on the fly.

After every self.frequency forward passes, the input size is changed randomly, chosen from:

(input_size - self.multiscale_range*self.image_size_steps, input_size - (self.multiscale_range-1)*self.image_size_steps, … , input_size + self.multiscale_range*self.image_size_steps)

multiscale_range

(int) Range of values for resize sizes as discussed above (default=5)

image_size_steps

(int) Image step sizes as discussed above (default=32)

change_frequency

(int) The frequency to apply change in input size.

class super_gradients.training.datasets.datasets_utils.DetectionMultiscalePrePredictionCallback(multiscale_range: int = 5, image_size_steps: int = 32, change_frequency: int = 10)[source]

Bases: super_gradients.training.datasets.datasets_utils.MultiscalePrePredictionCallback

Multiscale pre-prediction callback for object detection.

When passed through train_params, the transform below is applied to images and targets to support multi-scaling on the fly.

After every self.frequency forward passes, the input size is changed randomly, chosen from:

(input_size - self.multiscale_range*self.image_size_steps, input_size - (self.multiscale_range-1)*self.image_size_steps, … , input_size + self.multiscale_range*self.image_size_steps), and the same rescaling is applied to the box coordinates.

multiscale_range

(int) Range of values for resize sizes as discussed above (default=5)

image_size_steps

(int) Image step sizes as discussed above (default=32)

change_frequency

(int) The frequency to apply change in input size.
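A sketch of how the callback is typically wired in, using the training_params pre_prediction_callback key described above; the remaining training params are omitted.

from super_gradients.training.datasets.datasets_utils import DetectionMultiscalePrePredictionCallback

training_params = {
    "pre_prediction_callback": DetectionMultiscalePrePredictionCallback(
        multiscale_range=5,
        image_size_steps=32,
        change_frequency=10,
    ),
}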

class super_gradients.training.datasets.datasets_utils.RandomResizedCropAndInterpolation(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation='default')[source]

Bases: torchvision.transforms.transforms.RandomResizedCrop

Crop the given PIL Image to random size and aspect ratio with explicitly chosen or random interpolation.

A crop of random size (default: of 0.08 to 1.0) of the original size and a random aspect ratio (default: of 3/4 to 4/3) of the original aspect ratio is made. This crop is finally resized to given size. This is popularly used to train the Inception networks.

Parameters
  • size – expected output size of each edge

  • scale – range of size of the origin size cropped

  • ratio – range of aspect ratio of the origin aspect ratio cropped

  • interpolation – Default: PIL.Image.BILINEAR

forward(img)[source]
Parameters

img (PIL Image) – Image to be cropped and resized.

Returns

Randomly cropped and resized image.

Return type

PIL Image

training: bool
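A usage sketch; since the class subclasses torchvision's RandomResizedCrop, it is called on a PIL image, and the interpolation value below keeps the documented default.

from PIL import Image
from super_gradients.training.datasets.datasets_utils import RandomResizedCropAndInterpolation

transform = RandomResizedCropAndInterpolation(size=224, interpolation="default")
img = Image.new("RGB", (640, 480))   # placeholder input image
cropped = transform(img)             # PIL Image, 224x224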
class super_gradients.training.datasets.datasets_utils.DatasetStatisticsTensorboardLogger(sg_logger: super_gradients.common.sg_loggers.abstract_sg_logger.AbstractSGLogger, summary_params: dict = {'max_batches': 30, 'plot_anchors_coverage': True, 'plot_box_size_distribution': True, 'plot_class_distribution': True, 'sample_images': 32})[source]

Bases: object

logger = <Logger super_gradients.training.datasets.datasets_utils (INFO)>
DEFAULT_SUMMARY_PARAMS = {'max_batches': 30, 'plot_anchors_coverage': True, 'plot_box_size_distribution': True, 'plot_class_distribution': True, 'sample_images': 32}
analyze(data_loader: torch.utils.data.dataloader.DataLoader, title: str, all_classes: List[str], anchors: Optional[list] = None)[source]
Parameters
  • data_loader – the dataset data loader

  • dataset_params – the dataset parameters

  • title – the title for this dataset (i.e. Coco 2017 test set)

  • anchors – the list of anchors used by the model. applicable only for detection datasets

  • all_classes – the list of all classes names

super_gradients.training.datasets.datasets_utils.get_color_augmentation(rand_augment_config_string: str, color_jitter: tuple, crop_size=224, img_mean=[0.485, 0.456, 0.406])[source]

Returns a color augmentation transform. Since these augmentations cannot be applied on top of one another, only one is returned, according to rand_augment_config_string.

Parameters
  • rand_augment_config_string – string which defines the auto augment configurations. If None, color jitter will be returned. For possible values see auto_augment.py

  • color_jitter – tuple for color jitter value.

  • crop_size – relevant only for auto augment

  • img_mean – relevant only for auto augment

Returns

RandAugment transform or ColorJitter
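A sketch of both branches; only one augmentation is returned per call, and the argument values below are placeholders.

from super_gradients.training.datasets.datasets_utils import get_color_augmentation

rand_aug = get_color_augmentation("rand-m9-mstd0.5", color_jitter=None, crop_size=224)   # RandAugment branch
jitter = get_color_augmentation(None, color_jitter=(0.4, 0.4, 0.4), crop_size=224)       # ColorJitter branch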

super_gradients.training.datasets.datasets_utils.worker_init_reset_seed(worker_id)[source]

Make sure each process has a different random seed, especially for the ‘fork’ start method. Check https://github.com/pytorch/pytorch/issues/63311 for more details.

Parameters

worker_id – placeholder (needs to be passed to DataLoader init).
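A usage sketch: pass the helper as worker_init_fn so each DataLoader worker gets its own random seed; the dataset below is a stand-in.

import torch
from torch.utils.data import DataLoader, TensorDataset
from super_gradients.training.datasets.datasets_utils import worker_init_reset_seed

dataset = TensorDataset(torch.randn(128, 3, 32, 32), torch.zeros(128, dtype=torch.long))
loader = DataLoader(dataset, batch_size=32, num_workers=4, worker_init_fn=worker_init_reset_seed)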

super_gradients.training.datasets.mixup module

Mixup and Cutmix

Papers: mixup: Beyond Empirical Risk Minimization (https://arxiv.org/abs/1710.09412)

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features (https://arxiv.org/abs/1905.04899)

Code Reference: CutMix: https://github.com/clovaai/CutMix-PyTorch CutMix by timm: https://github.com/rwightman/pytorch-image-models/timm

super_gradients.training.datasets.mixup.one_hot(x, num_classes, on_value=1.0, off_value=0.0, device='cuda')[source]
super_gradients.training.datasets.mixup.mixup_target(target: torch.Tensor, num_classes: int, lam: float = 1.0, smoothing: float = 0.0, device: str = 'cuda')[source]

Generate a smooth target (label) two-hot tensor to support mixed images with different labels.

Parameters
  • target – the targets tensor

  • num_classes – number of classes (to set the final tensor size)

  • lam – percentage of label a, in range [0, 1], in the mixing

  • smoothing – the smoothing multiplier

  • device – usable device [‘cuda’, ‘cpu’]
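A small sketch; device="cpu" overrides the documented default of "cuda".

import torch
from super_gradients.training.datasets.mixup import mixup_target

targets = torch.tensor([1, 4])
mixed = mixup_target(targets, num_classes=10, lam=0.7, smoothing=0.1, device="cpu")
# mixed has shape (2, 10): the labels of the mixed images weighted by lam and 1 - lam (plus smoothing)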

super_gradients.training.datasets.mixup.rand_bbox(img_shape: tuple, lam: float, margin: float = 0.0, count: Optional[int] = None)[source]

Standard CutMix bounding box. Generates a random square bbox based on a lambda value. This implementation includes support for enforcing a border margin as a percentage of bbox dimensions.

Parameters
  • img_shape – Image shape as tuple

  • lam – Cutmix lambda value

  • margin – Percentage of bbox dimension to enforce as margin (reduce amount of box outside image)

  • count – Number of bbox to generate

super_gradients.training.datasets.mixup.rand_bbox_minmax(img_shape: tuple, minmax: Union[tuple, list], count: Optional[int] = None)[source]

Min-Max CutMix bounding box. Inspired by the Darknet cutmix implementation, generates a random rectangular bbox based on min/max percent values applied to each dimension of the input image.

Typical defaults for minmax are in the 0.2-0.3 range for min and 0.8-0.9 for max.

Parameters
  • img_shape – Image shape as tuple

  • minmax – Min and max bbox ratios (as percent of image size)

  • count – Number of bbox to generate

super_gradients.training.datasets.mixup.cutmix_bbox_and_lam(img_shape: tuple, lam: float, ratio_minmax: Optional[Union[tuple, list]] = None, correct_lam: bool = True, count: Optional[int] = None)[source]

Generate bbox and apply lambda correction.

class super_gradients.training.datasets.mixup.CollateMixup(mixup_alpha: float = 1.0, cutmix_alpha: float = 0.0, cutmix_minmax: Optional[List[float]] = None, prob: float = 1.0, switch_prob: float = 0.5, mode: str = 'batch', correct_lam: bool = True, label_smoothing: float = 0.1, num_classes: int = 1000)[source]

Bases: object

Collate function with Mixup/CutMix that applies different params to each element or to the whole batch. A Mixup implementation that is performed while collating the batches.
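A sketch of using CollateMixup as a DataLoader collate_fn; the stand-in dataset and the exact input format the collate expects (tensor vs. array images) are assumptions.

import torch
from torch.utils.data import DataLoader
from super_gradients.training.datasets.mixup import CollateMixup

# Stand-in dataset: a list of (image, class_index) pairs.
train_dataset = [(torch.randn(3, 224, 224), i % 1000) for i in range(8)]

collate_fn = CollateMixup(mixup_alpha=0.2, cutmix_alpha=1.0, label_smoothing=0.1, num_classes=1000)
loader = DataLoader(train_dataset, batch_size=4, collate_fn=collate_fn)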

super_gradients.training.datasets.sg_dataset module

class super_gradients.training.datasets.sg_dataset.BaseSgVisionDataset(root: str, sample_loader: Callable = <function default_loader>, target_loader: Optional[Callable] = None, collate_fn: Optional[Callable] = None, valid_sample_extensions: tuple = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp'), sample_transform: Optional[Callable] = None, target_transform: Optional[Callable] = None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

static numpy_loader_func(path)[source]
numpy_loader_func - Uses numpy's load function
param path

return

static text_file_loader_func(text_file_path: str, inline_splitter: str = ' ') → list[source]
text_file_loader_func - Reads a text-based file line by line to get vectorized data
param text_file_path

Input text file

param inline_splitter

The character used to separate different VALUES of the SAME vector; note that DIFFERENT VECTORS must be placed on SEPARATE LINES (separated by ‘\n’)
return

a list of tuples, where each tuple is a vector of target values

class super_gradients.training.datasets.sg_dataset.DirectoryDataSet(root: str, samples_sub_directory: str, targets_sub_directory: str, target_extension: str, sample_loader: Callable = <function default_loader>, target_loader: Optional[Callable] = None, collate_fn: Optional[Callable] = None, sample_extensions: tuple = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp'), sample_transform: Optional[Callable] = None, target_transform: Optional[Callable] = None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

DirectoryDataSet - A PyTorch Vision Data Set extension that receives a root Dir and two separate sub directories:
  • Sub-Directory for Samples

  • Sub-Directory for Targets
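A construction sketch; the paths, sub-directory names and target extension are placeholders, and the assumption that indexing returns a (sample, target) pair follows the class description.

from super_gradients.training.datasets.sg_dataset import DirectoryDataSet

dataset = DirectoryDataSet(
    root="/data/my_dataset",
    samples_sub_directory="images",
    targets_sub_directory="labels",
    target_extension=".npy",
)
sample, target = dataset[0]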

class super_gradients.training.datasets.sg_dataset.ListDataset(root, file, sample_loader: Callable = <function default_loader>, target_loader: Optional[Callable] = None, collate_fn: Optional[Callable] = None, sample_extensions: tuple = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp'), sample_transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, target_extension='.npy')[source]

Bases: Generic[torch.utils.data.dataset.T_co]

ListDataset - A PyTorch Vision Data Set extension that receives a file with FULL PATH to each of the samples.

Then, the assumption is that for every sample, there is a *matching target* in the same path but with a different extension, i.e.:

for the sample paths (that appear in the list file):

/root/dataset/class_x/sample1.png /root/dataset/class_y/sample123.png

the matching label paths (that DO NOT appear in the list file):

/root/dataset/class_x/sample1.ext /root/dataset/class_y/sample123.ext
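A construction sketch; the root and list-file names are placeholders, and the assumption that indexing returns a (sample, target) pair follows the class description.

from super_gradients.training.datasets.sg_dataset import ListDataset

dataset = ListDataset(root="/root/dataset", file="train_list.txt", target_extension=".npy")
sample, target = dataset[0]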

Module contents

class super_gradients.training.datasets.DataAugmentation[source]

Bases: object

static to_tensor()[source]
static normalize(mean, std)[source]
static cutout(mask_size, p=1, cutout_inside=False, mask_color=(0, 0, 0))[source]
class super_gradients.training.datasets.ListDataset(root, file, sample_loader: Callable = <function default_loader>, target_loader: Optional[Callable] = None, collate_fn: Optional[Callable] = None, sample_extensions: tuple = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp'), sample_transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, target_extension='.npy')[source]

Bases: Generic[torch.utils.data.dataset.T_co]

ListDataset - A PyTorch Vision Data Set extension that receives a file with FULL PATH to each of the samples.

Then, the assumption is that for every sample, there is a *matching target* in the same path but with a different extension, i.e.:

for the sample paths (that appear in the list file):

/root/dataset/class_x/sample1.png /root/dataset/class_y/sample123.png

the matching label paths (that DO NOT appear in the list file):

/root/dataset/class_x/sample1.ext /root/dataset/class_y/sample123.ext

class super_gradients.training.datasets.DirectoryDataSet(root: str, samples_sub_directory: str, targets_sub_directory: str, target_extension: str, sample_loader: Callable = <function default_loader>, target_loader: Optional[Callable] = None, collate_fn: Optional[Callable] = None, sample_extensions: tuple = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp'), sample_transform: Optional[Callable] = None, target_transform: Optional[Callable] = None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

DirectoryDataSet - A PyTorch Vision Data Set extension that receives a root Dir and two separate sub directories:
  • Sub-Directory for Samples

  • Sub-Directory for Targets

class super_gradients.training.datasets.SegmentationDataSet(root: str, list_file: str = None, samples_sub_directory: str = None, targets_sub_directory: str = None, img_size: int = 608, crop_size: int = 512, batch_size: int = 16, augment: bool = False, dataset_hyper_params: dict = None, cache_labels: bool = False, cache_images: bool = False, sample_loader: Callable = None, target_loader: Callable = None, collate_fn: Callable = None, target_extension: str = '.png', image_mask_transforms: torchvision.transforms.transforms.Compose = None, image_mask_transforms_aug: torchvision.transforms.transforms.Compose = None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

static sample_loader(sample_path: str) → PIL.Image[source]
sample_loader - Loads a dataset image from path using PIL
param sample_path

The path to the sample image

return

The loaded Image

static sample_transform(image)[source]

sample_transform - Transforms the sample image

param image

The input image to transform

return

The transformed image

static target_loader(target_path: str) → PIL.Image[source]
Parameters

target_path – The path to the sample image

Returns

The loaded Image

static target_transform(target)[source]

target_transform - Transforms the target mask

param target

The target mask to transform

return

The transformed target mask

class super_gradients.training.datasets.PascalVOC2012SegmentationDataSet(sample_suffix=None, target_suffix=None, *args, **kwargs)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

PascalVOC2012SegmentationDataSet - Segmentation Data Set Class for Pascal VOC 2012 Data Set

decode_segmentation_mask(label_mask: numpy.ndarray)[source]
decode_segmentation_mask - Decodes the colors for the segmentation mask

param label_mask

An (M, N) array of integer values denoting the class label at each spatial location
Returns

class super_gradients.training.datasets.PascalAUG2012SegmentationDataSet(*args, **kwargs)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

PascalAUG2012SegmentationDataSet - Segmentation Data Set Class for Pascal AUG 2012 Data Set

static target_loader(target_path: str) → PIL.Image[source]
Parameters

target_path – The path to the target data

Returns

The loaded target

class super_gradients.training.datasets.CoCoSegmentationDataSet(dataset_classes_inclusion_tuples_list: Optional[list] = None, *args, **kwargs)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

CoCoSegmentationDataSet - Segmentation Data Set Class for COCO 2017 Segmentation Data Set

target_loader(mask_metadata_tuple) → PIL.Image[source]
Parameters

mask_metadata_tuple – A tuple of (coco_image_id, original_image_height, original_image_width)

Returns

The mask image created from the array

class super_gradients.training.datasets.TestDatasetInterface(trainset, dataset_params={}, classes=None)[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.DatasetInterface

get_data_loaders(batch_size_factor=1, num_workers=8, train_batch_size=None, val_batch_size=None, distributed_sampler=False)[source]

Get self.train_loader, self.val_loader, self.test_loader, self.classes.

If the data loaders haven’t been initialized yet, build them first.

Parameters

kwargs – kwargs are passed to build_data_loaders.

class super_gradients.training.datasets.DatasetInterface(dataset_params={}, train_loader=None, val_loader=None, test_loader=None, classes=None)[source]

Bases: object

DatasetInterface - This class manages all of the “communication” the model has with the datasets

download_from_cloud()[source]
build_data_loaders(batch_size_factor=1, num_workers=8, train_batch_size=None, val_batch_size=None, test_batch_size=None, distributed_sampler: bool = False)[source]

Define train, val (and optionally test) loaders. The method deals separately with distributed training and standard (non-distributed, or parallel) training. In the case of distributed training we need to rely on distributed samplers.

Parameters
  • batch_size_factor – int - factor to multiply the batch size (usually for multi gpu)

  • num_workers – int - number of workers (parallel processes) for dataloaders

  • train_batch_size – int - batch size for train loader, if None will be taken from dataset_params

  • val_batch_size – int - batch size for val loader, if None will be taken from dataset_params

  • distributed_sampler – boolean flag for distributed training mode

Returns

train_loader, val_loader, classes: list of classes

get_data_loaders(**kwargs)[source]

Get self.train_loader, self.val_loader, self.test_loader, self.classes.

If the data loaders haven’t been initialized yet, build them first.

Parameters

kwargs – kwargs are passed to build_data_loaders.
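A usage sketch with a concrete subclass; Cifar10DatasetInterface is used here and its dataset_params keys are assumptions, while the four returned values follow the description above.

from super_gradients.training.datasets import Cifar10DatasetInterface

dataset_interface = Cifar10DatasetInterface(dataset_params={"batch_size": 64})
train_loader, val_loader, test_loader, classes = dataset_interface.get_data_loaders()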

get_val_sample(num_samples=1)[source]
get_dataset_params()[source]
print_dataset_details()[source]
class super_gradients.training.datasets.Cifar10DatasetInterface(dataset_params={})[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.LibraryDatasetInterface

class super_gradients.training.datasets.CoCoSegmentationDatasetInterface(dataset_params=None, cache_labels: bool = False, cache_images: bool = False, dataset_classes_inclusion_tuples_list: Optional[list] = None)[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.CoCoDataSetInterfaceBase

class super_gradients.training.datasets.PascalVOC2012SegmentationDataSetInterface(dataset_params=None, cache_labels=False, cache_images=False)[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.DatasetInterface

class super_gradients.training.datasets.PascalAUG2012SegmentationDataSetInterface(dataset_params=None, cache_labels=False, cache_images=False)[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.DatasetInterface

class super_gradients.training.datasets.TestYoloDetectionDatasetInterface(dataset_params={}, input_dims=(3, 32, 32), batch_size=5)[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.DatasetInterface

note: the output size is (batch_size, 6) in the test while in real training the size of axis 0 can vary (the number of bounding boxes)

class super_gradients.training.datasets.DetectionTestDatasetInterface(dataset_params={}, image_size=320, batch_size=4, classes=None)[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.TestDatasetInterface

class super_gradients.training.datasets.ClassificationTestDatasetInterface(dataset_params={}, image_size=32, batch_size=5, classes=None)[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.TestDatasetInterface

class super_gradients.training.datasets.SegmentationTestDatasetInterface(dataset_params={}, image_size=512, batch_size=4)[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.TestDatasetInterface

class super_gradients.training.datasets.ImageNetDatasetInterface(dataset_params={}, data_dir='/data/Imagenet')[source]

Bases: super_gradients.training.datasets.dataset_interfaces.dataset_interface.DatasetInterface

class super_gradients.training.datasets.DetectionDataset(data_dir: str, input_dim: tuple, original_target_format: super_gradients.training.utils.detection_utils.DetectionTargetsFormat, max_num_samples: Optional[int] = None, cache: bool = False, cache_path: Optional[str] = None, transforms: List[super_gradients.training.transforms.transforms.DetectionTransform] = [], all_classes_list: Optional[List[str]] = None, class_inclusion_list: Optional[List[str]] = None, ignore_empty_annotations: bool = True, target_fields: Optional[List[str]] = None, output_fields: Optional[List[str]] = None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Detection dataset.

This is a boilerplate class to facilitate the implementation of datasets.

HOW TO CREATE A DATASET THAT INHERITS FROM DetectionDataSet?
  • Inherit from DetectionDataSet

  • Implement the method self._load_annotation to return at least the fields “target” and “img_path”

  • Call super().__init__ with the required params.
    //!super().__init__ will call self._load_annotation, so make sure that every required
    attribute is set up before calling super().__init__ (ideally just call it last); see the sketch below.
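A minimal sketch of such a subclass; the annotation source (self._entries) and the exact _load_annotation signature are assumptions, while the required “img_path”/“target” fields and the call order come from the notes above.

import numpy as np
from super_gradients.training.datasets import DetectionDataset


class MyDetectionDataset(DetectionDataset):
    def __init__(self, entries, **kwargs):
        self._entries = entries       # set required attributes first...
        super().__init__(**kwargs)    # ...then call super(), which calls _load_annotation

    def _load_annotation(self, sample_id: int) -> dict:
        # Return at least "img_path" and "target", as required by DetectionDataset.
        entry = self._entries[sample_id]
        return {
            "img_path": entry["img_path"],
            "target": np.asarray(entry["boxes"], dtype=np.float32),
        }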

WORKFLOW:
  • On instantiation:
    • All annotations are cached. If class_inclusion_list was specified, there is also subclassing at this step.

    • If cache is True, the images are also cached

  • On call (__getitem__) for a specific image index:
    • The image and annotations are grouped together in a dict called SAMPLE

    • the sample is processed according to the transform

    • Only the specified fields are returned by __getitem__

TERMINOLOGY
  • TARGET: Groundtruth, made of bboxes. The format can vary from one dataset to another

  • ANNOTATION: Combination of targets (groundtruth) and metadata of the image, but without the image itself.

    > Has to include the fields “target” and “img_path”
    > Can include other fields like “crowd_target”, “image_info”, “segmentation”, …

  • SAMPLE: Output of the dataset:

    > Has to include the fields “target” and “image”
    > Can include other fields like “crowd_target”, “image_info”, “segmentation”, …

  • INDEX: Refers to the index in the dataset.

  • SAMPLE ID: Refers to the id of a sample before dropping any annotation.

    Let’s imagine a situation where the downloaded data is made of 120 images, but 20 were dropped because they had no annotation. In that case:

    > We have 120 samples, so sample_id will be between 0 and 119
    > But only 100 will be indexed, so index will be between 0 and 99
    > Therefore, we also have len(self) = 100

get_random_item()[source]
get_sample(index: int) → Dict[str, Union[numpy.ndarray, Any]][source]

Get raw sample, before any transform (besides subclassing).

Parameters

index – Image index

Returns

Sample, i.e. a dictionary including at least “image” and “target”

get_resized_image(index: int) → numpy.ndarray[source]

Get the resized image at a specific sample_id, either from cache or by loading from disk, based on self.cached_imgs.

Parameters

index – Image index

Returns

Resized image

apply_transforms(sample: Dict[str, Union[numpy.ndarray, Any]]) → Dict[str, Union[numpy.ndarray, Any]][source]

Applies self.transforms sequentially to sample

If a transform has the attribute ‘additional_samples_count’, additional samples will be loaded and stored in sample[“additional_samples”] prior to applying it. Combining this with the attribute “non_empty_annotations” will load only additional samples with objects in them.

Parameters

sample – Sample to apply the transforms on to (loaded with self.get_sample)

Returns

Transformed sample

get_random_samples(count: int, non_empty_annotations_only: bool = False) → List[Dict[str, Union[numpy.ndarray, Any]]][source]

Load random samples.

Parameters
  • count – The number of samples wanted

  • non_empty_annotations_only – If true, only return samples with at least 1 annotation

Returns

A list of samples satisfying input params

get_random_sample(non_empty_annotations_only: bool = False)[source]
property output_target_format
plot(max_samples_per_plot: int = 16, n_plots: int = 1, plot_transformed_data: bool = True)[source]

Combine samples of images with bbox into plots and display the result.

Parameters
  • max_samples_per_plot – Maximum number of images to be displayed per plot

  • n_plots – Number of plots to display (each plot being a combination of img with bbox)

  • plot_transformed_data – If True, the plot will be over samples after applying transforms (i.e. on __getitem__). If False, the plot will be over the raw samples (i.e. on get_sample)

Returns

class super_gradients.training.datasets.COCODetectionDataset(img_size: tuple, data_dir: Optional[str] = None, json_file: str = 'instances_train2017.json', name: str = 'images/train2017', cache: bool = False, cache_dir_path: Optional[str] = None, tight_box_rotation: bool = False, transforms: list = [], with_crowd: bool = True)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Detection dataset COCO implementation

load_resized_img(index)[source]

Loads image at index, and resizes it to self.input_dim

Parameters

index – index to load the image from

Returns

resized_img

load_sample(index)[source]
Loads sample at self.ids[index] as a dictionary that holds:

“image”: Image resized to self.input_dim
“target”: Detection ground truth, np.array shaped (num_targets, 5), format is [class, x1, y1, x2, y2] with image coordinates.
“target_seg”: Segmentation map convex hull derived detection target.
“info”: Original shape (height, width).
“id”: COCO image id

Parameters

index – Sample index

Returns

sample as described above

load_image(index)[source]

Loads image at index with its original resolution.

Parameters

index – index in self.annotations

Returns

image (np.array)

apply_transforms(sample: dict)[source]

Applies self.transforms sequentially to sample

If a transform has the attribute ‘additional_samples_count’, additional samples will be loaded and stored in sample[“additional_samples”] prior to applying it. Combining this with the attribute “non_empty_targets” will load only additional samples with objects in them.

Parameters

sample – Sample to apply the transforms on to (loaded with self.load_sample)

Returns

Transformed sample

class super_gradients.training.datasets.PascalVOCDetectionDataset(images_sub_directory: str, *args, **kwargs)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Dataset for Pascal VOC object detection

static download(data_dir: str)[source]

Download Pascal dataset in XYXY_LABEL format.

Data extracted from http://host.robots.ox.ac.uk/pascal/VOC/
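A usage sketch; the target directory is a placeholder.

from super_gradients.training.datasets import PascalVOCDetectionDataset

PascalVOCDetectionDataset.download(data_dir="/data/pascal_voc")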