super_gradients.training.datasets.detection_datasets package

Submodules

super_gradients.training.datasets.detection_datasets.coco_detection module

class super_gradients.training.datasets.detection_datasets.coco_detection.COCODetectionDataset(img_size: tuple, data_dir: Optional[str] = None, json_file: str = 'instances_train2017.json', name: str = 'images/train2017', cache: bool = False, cache_dir_path: Optional[str] = None, tight_box_rotation: bool = False, transforms: list = [], with_crowd: bool = True)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Detection dataset COCO implementation

load_resized_img(index)[source]

Loads image at index, and resizes it to self.input_dim

Parameters

index – index to load the image from

Returns

resized_img

load_sample(index)[source]
Loads sample at self.ids[index] as dictionary that holds:

  • “image”: Image resized to self.input_dim
  • “target”: Detection ground truth, np.array shaped (num_targets, 5); format is [class, x1, y1, x2, y2] in image coordinates
  • “target_seg”: Detection target derived from the convex hull of the segmentation map
  • “info”: Original shape (height, width)
  • “id”: COCO image id

Parameters

index – Sample index

Returns

sample as described above
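For reference, a sample in the format described above could look like the following sketch. The values are hypothetical and the dict is built here by hand purely to illustrate the layout that load_sample returns:

```python
import numpy as np

# Hypothetical sample in the layout returned by load_sample:
# each "target" row is [class, x1, y1, x2, y2] in image coordinates.
sample = {
    "image": np.zeros((640, 640, 3), dtype=np.uint8),  # resized to self.input_dim
    "target": np.array([
        [0.0, 10.0, 20.0, 110.0, 220.0],    # class 0, box (10, 20) -> (110, 220)
        [17.0, 300.0, 50.0, 420.0, 180.0],  # class 17
    ]),
    "info": (480, 640),  # original (height, width)
    "id": 139,           # COCO image id (made up for the example)
}

assert sample["target"].shape == (2, 5)
```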

load_image(index)[source]

Loads image at index with its original resolution.

Parameters

index – index in self.annotations

Returns

image (np.array)

apply_transforms(sample: dict)[source]

Applies self.transforms sequentially to sample.

If a transform has the attribute ‘additional_samples_count’, additional samples will be loaded and stored in sample[“additional_samples”] prior to applying it. Combined with the attribute “non_empty_targets”, only additional samples that contain objects will be loaded.

Parameters

sample – Sample to apply the transforms on to (loaded with self.load_sample)

Returns

Transformed sample
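The mechanism described above can be sketched with a simplified stand-in. This is not the library code; the class and transform names below are illustrative, and the stand-in only mimics the documented flow (load extra samples when a transform declares additional_samples_count, then apply the transform):

```python
import random

class MockDataset:
    """Stand-in illustrating the apply_transforms flow (illustrative names only)."""

    def __init__(self, samples, transforms):
        self.samples = samples
        self.transforms = transforms

    def load_sample(self, index):
        return dict(self.samples[index])

    def apply_transforms(self, sample):
        for transform in self.transforms:
            # Transforms that need extra images (e.g. mosaic-style mixing)
            # declare how many via 'additional_samples_count'.
            count = getattr(transform, "additional_samples_count", 0)
            if count > 0:
                indices = random.sample(range(len(self.samples)), count)
                sample["additional_samples"] = [self.load_sample(i) for i in indices]
            sample = transform(sample)
            sample.pop("additional_samples", None)
        return sample

class MixingTransform:
    """Toy transform that records how many extra samples it received."""
    additional_samples_count = 1

    def __call__(self, sample):
        sample["mixed"] = len(sample.get("additional_samples", []))
        return sample

ds = MockDataset([{"image": i} for i in range(4)], [MixingTransform()])
out = ds.apply_transforms(ds.load_sample(0))
assert out["mixed"] == 1
```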

super_gradients.training.datasets.detection_datasets.coco_detection.remove_useless_info(coco, use_seg_info=False)[source]

Remove unnecessary info from a COCO dataset; the COCO object is modified in place. This function is mainly used to save memory (roughly 30%).

super_gradients.training.datasets.detection_datasets.detection_dataset module

class super_gradients.training.datasets.detection_datasets.detection_dataset.DetectionDataset(data_dir: str, input_dim: tuple, original_target_format: super_gradients.training.utils.detection_utils.DetectionTargetsFormat, max_num_samples: Optional[int] = None, cache: bool = False, cache_path: Optional[str] = None, transforms: List[super_gradients.training.transforms.transforms.DetectionTransform] = [], all_classes_list: Optional[List[str]] = None, class_inclusion_list: Optional[List[str]] = None, ignore_empty_annotations: bool = True, target_fields: Optional[List[str]] = None, output_fields: Optional[List[str]] = None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Detection dataset.

This is a boilerplate class to facilitate the implementation of datasets.

HOW TO CREATE A DATASET THAT INHERITS FROM DetectionDataset?
  • Inherit from DetectionDataset

  • Implement the method self._load_annotation to return at least the fields “target” and “img_path”

  • Call super().__init__ with the required params.
    Note: super().__init__ will call self._load_annotation, so make sure that all required attributes are set up before calling super().__init__ (ideally just call it last)
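The recipe above can be sketched as follows. Since exercising the real class requires super_gradients and data on disk, this uses a minimal stand-in base class that mimics the one relevant behavior (its __init__ calls self._load_annotation for every sample); everything except the _load_annotation contract is illustrative:

```python
class _DetectionDatasetStandIn:
    """Minimal stand-in for DetectionDataset: __init__ caches annotations
    by calling self._load_annotation for every sample id."""

    def __init__(self, n_samples):
        self.annotations = [self._load_annotation(i) for i in range(n_samples)]

    def _load_annotation(self, sample_id):
        raise NotImplementedError

class MyDataset(_DetectionDatasetStandIn):
    def __init__(self, image_paths, targets):
        # Set up every attribute that _load_annotation needs BEFORE calling
        # super().__init__, because super().__init__ calls _load_annotation.
        self.image_paths = image_paths
        self.targets = targets
        super().__init__(n_samples=len(image_paths))  # ideally call it last

    def _load_annotation(self, sample_id):
        # Must return at least the fields "target" and "img_path".
        return {"target": self.targets[sample_id],
                "img_path": self.image_paths[sample_id]}

ds = MyDataset(["a.jpg", "b.jpg"], [[0, 1, 2, 3, 4], [1, 5, 6, 7, 8]])
assert ds.annotations[0]["img_path"] == "a.jpg"
```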

WORKFLOW:
  • On instantiation:
    • All annotations are cached. If class_inclusion_list was specified, there is also subclassing at this step.

    • If cache is True, the images are also cached

  • On call (__getitem__) for a specific image index:
    • The image and annotations are grouped together in a dict called SAMPLE

    • The sample is processed according to the transforms

    • Only the specified fields are returned by __getitem__

TERMINOLOGY
  • TARGET: Groundtruth, made of bboxes. The format can vary from one dataset to another

  • ANNOTATION: Combination of targets (groundtruth) and metadata of the image, but without the image itself.

    > Has to include the fields “target” and “img_path”
    > Can include other fields like “crowd_target”, “image_info”, “segmentation”, …

  • SAMPLE: Output of the dataset:

    > Has to include the fields “target” and “image”
    > Can include other fields like “crowd_target”, “image_info”, “segmentation”, …

  • INDEX: Refers to the index in the dataset.

  • SAMPLE ID: Refers to the id of a sample before dropping any annotation.

    Let’s imagine a situation where the downloaded data is made of 120 images but 20 were dropped because they had no annotation. In that case:

    > We have 120 samples, so sample_id will be between 0 and 119
    > But only 100 will be indexed, so index will be between 0 and 99
    > Therefore, we also have len(self) = 100
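The index / sample_id distinction in the 120-image example can be made concrete with a small sketch (pure Python; which samples are empty is made up for the illustration):

```python
# 120 downloaded samples; pretend every 6th one (20 total) has no annotation.
all_annotations = [
    {"sample_id": i, "target": [] if i % 6 == 5 else [[0, 1, 2, 3, 4]]}
    for i in range(120)
]

# With ignore_empty_annotations=True, only samples with at least
# one target are kept; index then runs over the kept list.
kept = [ann for ann in all_annotations if ann["target"]]

assert len(all_annotations) == 120   # sample_id in [0, 119]
assert len(kept) == 100              # index in [0, 99], len(self) == 100
assert kept[99]["sample_id"] == 118  # the last index maps back to sample_id 118
```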

get_random_item()[source]
get_sample(index: int) → Dict[str, Union[numpy.ndarray, Any]][source]

Get raw sample, before any transform (besides subclassing).

Parameters

index – Image index

Returns

Sample, i.e. a dictionary including at least “image” and “target”

get_resized_image(index: int) → numpy.ndarray[source]

Get the resized image at a specific sample_id, either from cache or by loading from disk, based on self.cached_imgs.

Parameters

index – Image index

Returns

Resized image

apply_transforms(sample: Dict[str, Union[numpy.ndarray, Any]]) → Dict[str, Union[numpy.ndarray, Any]][source]

Applies self.transforms sequentially to sample.

If a transform has the attribute ‘additional_samples_count’, additional samples will be loaded and stored in sample[“additional_samples”] prior to applying it. Combined with the attribute “non_empty_annotations”, only additional samples that contain objects will be loaded.

Parameters

sample – Sample to apply the transforms on to (loaded with self.get_sample)

Returns

Transformed sample

get_random_samples(count: int, non_empty_annotations_only: bool = False) → List[Dict[str, Union[numpy.ndarray, Any]]][source]

Load random samples.

Parameters
  • count – The number of samples wanted

  • non_empty_annotations_only – If True, only return samples with at least 1 annotation

Returns

A list of samples satisfying input params

get_random_sample(non_empty_annotations_only: bool = False)[source]
property output_target_format
plot(max_samples_per_plot: int = 16, n_plots: int = 1, plot_transformed_data: bool = True)[source]

Combine image samples with their bounding boxes into plots and display the result.

Parameters
  • max_samples_per_plot – Maximum number of images to be displayed per plot

  • n_plots – Number of plots to display (each plot being a combination of img with bbox)

  • plot_transformed_data – If True, the plot will be over samples after applying transforms (i.e. on __getitem__). If False, the plot will be over the raw samples (i.e. on get_sample)

Returns

super_gradients.training.datasets.detection_datasets.pascal_voc_detection module

class super_gradients.training.datasets.detection_datasets.pascal_voc_detection.PascalVOCDetectionDataset(images_sub_directory: str, *args, **kwargs)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Dataset for Pascal VOC object detection

static download(data_dir: str)[source]

Download Pascal dataset in XYXY_LABEL format.

Data extracted from http://host.robots.ox.ac.uk/pascal/VOC/
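The XYXY_LABEL format mentioned above means each target row is [x1, y1, x2, y2, class_id]. A hypothetical example of targets in that layout (box coordinates and class ids below are made up):

```python
import numpy as np

# Hypothetical targets in XYXY_LABEL format: [x1, y1, x2, y2, class_id].
targets = np.array([
    [ 48.0,  24.0, 195.0, 371.0, 14.0],  # some object, class id 14
    [210.0,  88.0, 340.0, 300.0, 11.0],  # some object, class id 11
])

# In this layout, x2 > x1 and y2 > y1 for every valid box.
widths = targets[:, 2] - targets[:, 0]
heights = targets[:, 3] - targets[:, 1]
assert (widths > 0).all() and (heights > 0).all()
```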

Module contents

The package re-exports the classes documented above at package level:

  • super_gradients.training.datasets.detection_datasets.COCODetectionDataset

  • super_gradients.training.datasets.detection_datasets.DetectionDataset

  • super_gradients.training.datasets.detection_datasets.PascalVOCDetectionDataset

Their documentation is identical to the submodule entries above.