super_gradients.training.utils package

Subpackages

Submodules

super_gradients.training.utils.callbacks module

class super_gradients.training.utils.callbacks.Phase(value)[source]

Bases: enum.Enum

Enumeration of the training phases at which phase callbacks can be fired.

PRE_TRAINING = 'PRE_TRAINING'
TRAIN_BATCH_END = 'TRAIN_BATCH_END'
TRAIN_BATCH_STEP = 'TRAIN_BATCH_STEP'
TRAIN_EPOCH_START = 'TRAIN_EPOCH_START'
TRAIN_EPOCH_END = 'TRAIN_EPOCH_END'
VALIDATION_BATCH_END = 'VALIDATION_BATCH_END'
VALIDATION_EPOCH_END = 'VALIDATION_EPOCH_END'
VALIDATION_END_BEST_EPOCH = 'VALIDATION_END_BEST_EPOCH'
TEST_BATCH_END = 'TEST_BATCH_END'
TEST_END = 'TEST_END'
POST_TRAINING = 'POST_TRAINING'
class super_gradients.training.utils.callbacks.ContextSgMethods(**methods)[source]

Bases: object

Class for delegating SgModel’s methods, so that only the relevant ones ("phase-wise") are accessible.

class super_gradients.training.utils.callbacks.PhaseContext(epoch=None, batch_idx=None, optimizer=None, metrics_dict=None, inputs=None, preds=None, target=None, metrics_compute_fn=None, loss_avg_meter=None, loss_log_items=None, criterion=None, device=None, experiment_name=None, ckpt_dir=None, net=None, lr_warmup_epochs=None, sg_logger=None, train_loader=None, valid_loader=None, training_params=None, ddp_silent_mode=None, checkpoint_params=None, architecture=None, arch_params=None, metric_idx_in_results_tuple=None, metric_to_watch=None, valid_metrics=None, context_methods=None)[source]

Bases: object

Represents the input for phase callbacks, and is constantly updated after callback calls.

update_context(**kwargs)[source]
class super_gradients.training.utils.callbacks.PhaseCallback(phase: super_gradients.training.utils.callbacks.Phase)[source]

Bases: object
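A custom callback subclasses PhaseCallback, fixes its phase, and implements __call__ over the PhaseContext it receives. A minimal sketch (the class name and the printed fields are illustrative, not part of the library):

    from super_gradients.training.utils.callbacks import Phase, PhaseCallback, PhaseContext

    class EpochLossPrinter(PhaseCallback):  # hypothetical example class
        def __init__(self):
            super().__init__(phase=Phase.TRAIN_EPOCH_END)

        def __call__(self, context: PhaseContext):
            # PhaseContext carries the fields listed below (epoch, metrics_dict, ...)
            print(f"epoch {context.epoch}: {context.metrics_dict}")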

class super_gradients.training.utils.callbacks.ModelConversionCheckCallback(model_meta_data: deci_lab_client.models.model_metadata.ModelMetadata, **kwargs)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

Pre-training callback that verifies model conversion to onnx given specified conversion parameters.

The model is converted, then inference is applied with onnx runtime.

Use this callback with the same args as DeciPlatformCallback to prevent conversion failures at the end of training.

model_meta_data

(ModelMetadata) model’s meta-data object.

The following parameters may be passed as kwargs in order to control the conversion to ONNX.
class super_gradients.training.utils.callbacks.DeciLabUploadCallback(model_meta_data, optimization_request_form, auth_token: Optional[str] = None, ckpt_name='ckpt_best.pth', **kwargs)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

Post-training callback for uploading and optimizing a model.

email

(str) username for Deci platform.

model_meta_data

(ModelMetadata) model’s meta-data object.

optimization_request_form

(dict) optimization request form object.

password

(str) default=None, should only be used for testing.

ckpt_name

(str) default=”ckpt_best” refers to the filename of the checkpoint, inside the checkpoint directory.

The following parameters may be passed as kwargs in order to control the conversion to ONNX.
static log_optimization_failed()[source]
upload_model(model)[source]

This function will upload the trained model to the Deci Lab

Parameters

model – The resulting model from the training process

get_optimization_status(optimized_model_name: str)[source]

Fetches the optimized version of the trained model and checks its benchmark status. The status is checked against the server every 30 seconds; the process either times out after 30 minutes or logs the successful optimization, whichever happens first.

Parameters

optimized_model_name – (str) Optimized model name

Returns

whether or not the optimized model has been benchmarked

Return type

bool

class super_gradients.training.utils.callbacks.LRCallbackBase(phase, initial_lr, update_param_groups, train_loader_len, net, training_params, **kwargs)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

Base class for hard coded learning rate scheduling regimes, implemented as callbacks.

is_lr_scheduling_enabled(context: super_gradients.training.utils.callbacks.PhaseContext)[source]

Predicate that controls whether to perform lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

Returns

bool, whether to apply lr scheduling or not.

perform_scheduling(context: super_gradients.training.utils.callbacks.PhaseContext)[source]

Performs lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

update_lr(optimizer, epoch, batch_idx=None)[source]
class super_gradients.training.utils.callbacks.WarmupLRCallback(**kwargs)[source]

Bases: super_gradients.training.utils.callbacks.LRCallbackBase

LR scheduling callback for linear step warmup: LR climbs from warmup_initial_lr to the initial LR in even steps. When warmup_initial_lr is None, the climb starts from initial_lr/(1+warmup_epochs).
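For illustration, a sketch of the resulting schedule under the rule above (plain arithmetic, not the library’s code), assuming initial_lr=0.1 and warmup_epochs=4:

    initial_lr, warmup_epochs = 0.1, 4
    warmup_initial_lr = initial_lr / (1 + warmup_epochs)   # 0.02, the None default above
    step = (initial_lr - warmup_initial_lr) / warmup_epochs
    lrs = [warmup_initial_lr + i * step for i in range(warmup_epochs + 1)]
    print([round(lr, 4) for lr in lrs])  # [0.02, 0.04, 0.06, 0.08, 0.1] -- even steps up to initial_lr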

perform_scheduling(context)[source]

Performs lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

is_lr_scheduling_enabled(context)[source]

Predicate that controls whether to perform lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

Returns

bool, whether to apply lr scheduling or not.

class super_gradients.training.utils.callbacks.StepLRCallback(lr_updates, lr_decay_factor, step_lr_update_freq=None, **kwargs)[source]

Bases: super_gradients.training.utils.callbacks.LRCallbackBase

Hard coded step learning rate scheduling (i.e. at specific milestones).

perform_scheduling(context)[source]

Performs lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

is_lr_scheduling_enabled(context)[source]

Predicate that controls whether to perform lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

Returns

bool, whether to apply lr scheduling or not.

class super_gradients.training.utils.callbacks.ExponentialLRCallback(lr_decay_factor: float, **kwargs)[source]

Bases: super_gradients.training.utils.callbacks.LRCallbackBase

Exponential decay learning rate scheduling. Decays the learning rate by lr_decay_factor every epoch.
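A quick sketch of the decay rule described above (assumed form lr_t = initial_lr * lr_decay_factor ** t; not the library’s code):

    initial_lr, lr_decay_factor = 0.1, 0.9
    lrs = [initial_lr * lr_decay_factor ** epoch for epoch in range(5)]
    print([round(lr, 5) for lr in lrs])  # [0.1, 0.09, 0.081, 0.0729, 0.06561]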

perform_scheduling(context)[source]

Performs lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

is_lr_scheduling_enabled(context)[source]

Predicate that controls whether to perform lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

Returns

bool, whether to apply lr scheduling or not.

class super_gradients.training.utils.callbacks.PolyLRCallback(max_epochs, **kwargs)[source]

Bases: super_gradients.training.utils.callbacks.LRCallbackBase

Hard coded polynomial decay learning rate scheduling.

perform_scheduling(context)[source]

Performs lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

is_lr_scheduling_enabled(context)[source]

Predicate that controls whether to perform lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

Returns

bool, whether to apply lr scheduling or not.

class super_gradients.training.utils.callbacks.CosineLRCallback(max_epochs, cosine_final_lr_ratio, **kwargs)[source]

Bases: super_gradients.training.utils.callbacks.LRCallbackBase

Hard coded cosine annealing learning rate scheduling.

perform_scheduling(context)[source]

Performs lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

is_lr_scheduling_enabled(context)[source]

Predicate that controls whether to perform lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

Returns

bool, whether to apply lr scheduling or not.
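The schedule anneals from initial_lr down to cosine_final_lr_ratio * initial_lr over max_epochs. A sketch of the assumed curve (the library’s exact formula may differ in details such as warmup interaction):

    import math

    def cosine_lr(epoch, initial_lr=0.1, max_epochs=100, cosine_final_lr_ratio=0.01):
        final_lr = initial_lr * cosine_final_lr_ratio
        cosine = 0.5 * (1 + math.cos(math.pi * epoch / max_epochs))
        return final_lr + (initial_lr - final_lr) * cosine

    print(round(cosine_lr(0), 4), round(cosine_lr(50), 4), round(cosine_lr(100), 4))
    # 0.1 0.0505 0.001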

class super_gradients.training.utils.callbacks.FunctionLRCallback(max_epochs, lr_schedule_function, **kwargs)[source]

Bases: super_gradients.training.utils.callbacks.LRCallbackBase

Hard coded LR scheduling based on a user-defined scheduling function.

is_lr_scheduling_enabled(context)[source]

Predicate that controls whether to perform lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

Returns

bool, whether to apply lr scheduling or not.

perform_scheduling(context)[source]

Performs lr scheduling based on values in context.

Parameters

context – PhaseContext, current phase’s context.

exception super_gradients.training.utils.callbacks.IllegalLRSchedulerMetric(metric_name, metrics_dict)[source]

Bases: Exception

Exception raised for an illegal combination of training parameters.

message -- explanation of the error
class super_gradients.training.utils.callbacks.LRSchedulerCallback(scheduler, phase, metric_name=None)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

Learning rate scheduler callback.

scheduler

torch.optim._LRScheduler, the learning rate scheduler whose step() will be called.

metric_name

str, (default=None) the metric name for ReduceLROnPlateau learning rate scheduler.

When __call__ is passed a metrics_dict containing the key self.metric_name, the value of that metric will be monitored for ReduceLROnPlateau (i.e. step(metrics_dict[self.metric_name])).
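A sketch of wiring a torch scheduler through this callback; the model, optimizer and the metric key "Accuracy" are illustrative placeholders:

    import torch
    from torch.optim.lr_scheduler import ReduceLROnPlateau
    from super_gradients.training.utils.callbacks import LRSchedulerCallback, Phase

    model = torch.nn.Linear(10, 2)                        # placeholder network
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scheduler = ReduceLROnPlateau(optimizer, mode="max")  # monitors a "higher is better" metric
    callback = LRSchedulerCallback(
        scheduler=scheduler,
        phase=Phase.VALIDATION_EPOCH_END,
        metric_name="Accuracy",  # key looked up in metrics_dict, as described above
    )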

class super_gradients.training.utils.callbacks.MetricsUpdateCallback(phase: super_gradients.training.utils.callbacks.Phase)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

class super_gradients.training.utils.callbacks.KDModelMetricsUpdateCallback(phase: super_gradients.training.utils.callbacks.Phase)[source]

Bases: super_gradients.training.utils.callbacks.MetricsUpdateCallback

class super_gradients.training.utils.callbacks.PhaseContextTestCallback(phase: super_gradients.training.utils.callbacks.Phase)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

A callback that saves the phase context for testing.

class super_gradients.training.utils.callbacks.DetectionVisualizationCallback(phase: super_gradients.training.utils.callbacks.Phase, freq: int, post_prediction_callback: super_gradients.training.utils.detection_utils.DetectionPostPredictionCallback, classes: list, batch_idx: int = 0, last_img_idx_in_batch: int = - 1)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

A callback that adds a visualization of a batch of detection predictions to context.sg_logger.

freq

frequency (in epochs) to perform this callback.

batch_idx

batch index to perform visualization for.

classes

class list of the dataset.

last_img_idx_in_batch

Last image index to add to log. (default=-1, will take entire batch).
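A construction sketch using the documented signature; the phase, frequency, post-prediction callback and class list below are illustrative choices, not defaults:

    from super_gradients.training.utils.callbacks import DetectionVisualizationCallback, Phase
    from super_gradients.training.utils.ssd_utils import SSDPostPredictCallback

    vis_callback = DetectionVisualizationCallback(
        phase=Phase.VALIDATION_BATCH_END,
        freq=5,                                # visualize every 5 epochs
        post_prediction_callback=SSDPostPredictCallback(conf=0.25, iou=0.6),
        classes=["person", "car"],             # the dataset's class list
        batch_idx=0,                           # visualize the first batch
    )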

class super_gradients.training.utils.callbacks.BinarySegmentationVisualizationCallback(phase: super_gradients.training.utils.callbacks.Phase, freq: int, batch_idx: int = 0, last_img_idx_in_batch: int = - 1)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

A callback that adds a visualization of a batch of segmentation predictions to context.sg_logger.

freq

frequency (in epochs) to perform this callback.

batch_idx

batch index to perform visualization for.

last_img_idx_in_batch

Last image index to add to log. (default=-1, will take entire batch).

class super_gradients.training.utils.callbacks.TrainingStageSwitchCallbackBase(next_stage_start_epoch: int)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

TrainingStageSwitchCallback

A phase callback that is called at a specific epoch (epoch start) to support multi-stage training. It does so by manipulating the objects inside the context.

next_stage_start_epoch

int, the epoch idx to apply the stage change.

apply_stage_change(context: super_gradients.training.utils.callbacks.PhaseContext)[source]
This method is called when the callback fires at next_stage_start_epoch, and holds the stage change logic that should be applied to the context’s objects.

Parameters

context – PhaseContext, context of current phase

class super_gradients.training.utils.callbacks.YoloXTrainingStageSwitchCallback(next_stage_start_epoch: int = 285)[source]

Bases: super_gradients.training.utils.callbacks.TrainingStageSwitchCallbackBase

Training stage switch for YoloX training. Disables mosaic, and manipulates YoloX loss to use L1.

apply_stage_change(context: super_gradients.training.utils.callbacks.PhaseContext)[source]
This method is called when the callback fires at next_stage_start_epoch, and holds the stage change logic that should be applied to the context’s objects.

Parameters

context – PhaseContext, context of current phase

class super_gradients.training.utils.callbacks.CallbackHandler(callbacks)[source]

Bases: object

Runs all callbacks whose phase attribute equals the given phase.

callbacks

List[PhaseCallback]. Callbacks to be run.

class super_gradients.training.utils.callbacks.TestLRCallback(lr_placeholder)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

Phase callback that collects the learning rates in lr_placeholder at the end of each epoch (used for testing). In the case of multiple parameter groups (i.e. multiple learning rates), the learning rate is collected from the first one. The phase is VALIDATION_EPOCH_END to ensure all lr updates have been performed before calling this callback.

super_gradients.training.utils.checkpoint_utils module

super_gradients.training.utils.checkpoint_utils.get_ckpt_local_path(source_ckpt_folder_name: str, experiment_name: str, ckpt_name: str, model_checkpoints_location: str, external_checkpoint_path: str, overwrite_local_checkpoint: bool, load_weights_only: bool)[source]
Gets the local path to the checkpoint file, which will be:
  • By default: YOUR_REPO_ROOT/super_gradients/checkpoints/experiment_name.

  • if the checkpoint file is remotely located:

    when overwrite_local_checkpoint=True then it will be saved in a temporary path which will be returned, otherwise it will be downloaded to YOUR_REPO_ROOT/super_gradients/checkpoints/experiment_name and overwrite YOUR_REPO_ROOT/super_gradients/checkpoints/experiment_name/ckpt_name if such file exists.

  • external_checkpoint_path when external_checkpoint_path != None

Parameters
  • source_ckpt_folder_name – The folder where the checkpoint is saved. When set to None, uses the experiment_name.

  • experiment_name – experiment name attr in sg_model

  • ckpt_name – checkpoint filename

  • model_checkpoints_location – S3, local or URL

  • external_checkpoint_path – full path to the checkpoint file (which might be located outside of the super_gradients/checkpoints directory)

  • overwrite_local_checkpoint – whether to overwrite the checkpoint file with the same name when downloading from S3.

  • load_weights_only – whether to load the network’s state dict only.

super_gradients.training.utils.checkpoint_utils.adaptive_load_state_dict(net: torch.nn.modules.module.Module, state_dict: dict, strict: str)[source]

Adaptively loads state_dict to net by first adapting the state_dict to net’s layer names.

Parameters
  • net – (nn.Module) the network to load the state_dict into

  • state_dict – (dict) checkpoint state_dict

  • strict – (str) key matching strictness

super_gradients.training.utils.checkpoint_utils.read_ckpt_state_dict(ckpt_path: str, device='cpu')[source]
super_gradients.training.utils.checkpoint_utils.adapt_state_dict_to_fit_model_layer_names(model_state_dict: dict, source_ckpt: dict, exclude: list = [], solver: Optional[callable] = None)[source]

Given a model state dict and source checkpoints, the method tries to correct the keys in the model_state_dict to fit the ckpt in order to properly load the weights into the model. If unsuccessful, returns None.

param model_state_dict

the model state_dict

param source_ckpt

checkpoint dict

param exclude

(optional) list of layers to exclude

param solver

callable with signature (ckpt_key, ckpt_val, model_key, model_val) that returns a desired weight for ckpt_val.

return

renamed checkpoint dict (if possible)
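A sketch of a custom solver matching the documented signature (ckpt_key, ckpt_val, model_key, model_val) -> weight; the channel-averaging rule is just an illustration:

    import torch

    def rgb_to_gray_solver(ckpt_key, ckpt_val, model_key, model_val):
        # e.g. average the 3 RGB input channels of a first conv when the
        # model expects single-channel input; otherwise keep the ckpt weight
        if ckpt_val.dim() == 4 and ckpt_val.shape[1] == 3 and model_val.shape[1] == 1:
            return ckpt_val.mean(dim=1, keepdim=True)
        return ckpt_val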

super_gradients.training.utils.checkpoint_utils.raise_informative_runtime_error(state_dict, checkpoint, exception_msg)[source]

Given a model state dict and source checkpoints, the method calls “adapt_state_dict_to_fit_model_layer_names” and enhances the exception_msg if loading the checkpoint_dict via the conversion method is possible

super_gradients.training.utils.checkpoint_utils.load_checkpoint_to_model(ckpt_local_path: str, load_backbone: bool, net: torch.nn.modules.module.Module, strict: str, load_weights_only: bool, load_ema_as_net: bool = False)[source]

Loads the state dict in ckpt_local_path to net and returns the checkpoint’s state dict.

Parameters
  • ckpt_local_path – local path to the checkpoint file

  • load_backbone – whether to load the checkpoint as a backbone

  • net – network to load the checkpoint to

  • strict – key matching strictness

  • load_weights_only – whether to load the network’s state dict only

  • load_ema_as_net – when set, loads the EMA weights inside the checkpoint file into the network

exception super_gradients.training.utils.checkpoint_utils.MissingPretrainedWeightsException(desc)[source]

Bases: Exception

Exception raised by an unsupported pretrained model.

message -- explanation of the error
super_gradients.training.utils.checkpoint_utils.load_pretrained_weights(model: torch.nn.modules.module.Module, architecture: str, pretrained_weights: str)[source]

Loads pretrained weights from the MODEL_URLS dictionary to model.

Parameters
  • architecture – name of the model’s architecture

  • model – model to load pretrained weights for

  • pretrained_weights – name of the pretrained weights (e.g. imagenet)

Returns

None

super_gradients.training.utils.detection_utils module

class super_gradients.training.utils.detection_utils.DetectionTargetsFormat(value)[source]

Bases: enum.Enum

Enum class for the different detection output formats

When NORMALIZED is not specified, the format refers to unnormalized image coordinates (of the bboxes).

For example: LABEL_NORMALIZED_XYXY means [class_idx,x1,y1,x2,y2]

LABEL_XYXY = 'LABEL_XYXY'
XYXY_LABEL = 'XYXY_LABEL'
LABEL_NORMALIZED_XYXY = 'LABEL_NORMALIZED_XYXY'
NORMALIZED_XYXY_LABEL = 'NORMALIZED_XYXY_LABEL'
LABEL_CXCYWH = 'LABEL_CXCYWH'
CXCYWH_LABEL = 'CXCYWH_LABEL'
LABEL_NORMALIZED_CXCYWH = 'LABEL_NORMALIZED_CXCYWH'
NORMALIZED_CXCYWH_LABEL = 'NORMALIZED_CXCYWH_LABEL'
super_gradients.training.utils.detection_utils.get_cls_posx_in_target(target_format: super_gradients.training.utils.detection_utils.DetectionTargetsFormat) → int[source]

Get the position of the class id in a target of the given format.

Parameters

target_format – Representation of the target (ex: LABEL_XYXY)

Returns

Position of the class id in a bbox, e.g. 0 if the bbox is of format label_xyxy, -1 if of format xyxy_label

super_gradients.training.utils.detection_utils.convert_xywh_bbox_to_xyxy(input_bbox: torch.Tensor)[source]
Converts bounding box format from [x, y, w, h] to [x1, y1, x2, y2]
param input_bbox

input bbox either 2-dimensional (for all boxes of a single image) or 3-dimensional (for boxes of a batch of images)

return

Converted bbox in same dimensions as the original

super_gradients.training.utils.detection_utils.calculate_bbox_iou_matrix(box1, box2, x1y1x2y2=True, GIoU: bool = False, DIoU=False, CIoU=False, eps=1e-09)[source]
calculate an iou matrix containing the iou of every pair iou(i, j) where i is in box1 and j is in box2
param box1

a 2D tensor of boxes (shape N x 4)

param box2

a 2D tensor of boxes (shape M x 4)

param x1y1x2y2

boxes format is x1y1x2y2 (True) or xywh where xy is the center (False)

return

a 2D iou matrix (shape NxM)

super_gradients.training.utils.detection_utils.calc_bbox_iou_matrix(pred: torch.Tensor)[source]

calculate iou for every pair of boxes in the boxes vector

param pred

a 3-dimensional tensor containing all boxes for a batch of images [N, num_boxes, 4], where each box format is [x1, y1, x2, y2]

Returns

a 3-dimensional matrix where M_i_j_k is the iou of box j and box k of the i’th image in the batch

super_gradients.training.utils.detection_utils.change_bbox_bounds_for_image_size(boxes, img_shape)[source]
class super_gradients.training.utils.detection_utils.DetectionPostPredictionCallback[source]

Bases: abc.ABC, torch.nn.modules.module.Module

abstract forward(x, device: str)[source]
Parameters
  • x – the output of your model

  • device – the device to move all output tensors into

Returns

a list of length batch_size; each item in the list is a detections tensor of shape nx6 (x1, y1, x2, y2, confidence, class) where x and y are in range [0,1]

training: bool
class super_gradients.training.utils.detection_utils.IouThreshold(value)[source]

Bases: tuple, enum.Enum

An enumeration.

MAP_05 = (0.5, 0.5)
MAP_05_TO_095 = (0.5, 0.95)
is_range()[source]
to_tensor()[source]
super_gradients.training.utils.detection_utils.box_iou(box1, box2)[source]

Return intersection-over-union (Jaccard index) of boxes. Both sets of boxes are expected to be in (x1, y1, x2, y2) format.

Parameters
  • box1 – Tensor[N, 4]

  • box2 – Tensor[M, 4]

Returns

the NxM matrix containing the pairwise IoU values for every element in box1 and box2

Return type

iou (Tensor[N, M])
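A minimal pairwise-IoU sketch matching the contract above (boxes in (x1, y1, x2, y2) format); this is an illustration, not the library’s implementation:

    import torch

    def pairwise_iou(box1: torch.Tensor, box2: torch.Tensor) -> torch.Tensor:
        area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])  # (N,)
        area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])  # (M,)
        lt = torch.max(box1[:, None, :2], box2[None, :, :2])  # (N, M, 2) top-left of intersection
        rb = torch.min(box1[:, None, 2:], box2[None, :, 2:])  # (N, M, 2) bottom-right of intersection
        wh = (rb - lt).clamp(min=0)
        inter = wh[..., 0] * wh[..., 1]
        return inter / (area1[:, None] + area2[None, :] - inter)

    a = torch.tensor([[0.0, 0.0, 2.0, 2.0]])
    b = torch.tensor([[1.0, 1.0, 3.0, 3.0]])
    print(pairwise_iou(a, b))  # tensor([[0.1429]]) = 1 / (4 + 4 - 1)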

super_gradients.training.utils.detection_utils.non_max_suppression(prediction, conf_thres=0.1, iou_thres=0.6, multi_label_per_box: bool = True, with_confidence: bool = False)[source]
Performs Non-Maximum Suppression (NMS) on inference results
param prediction

raw model prediction

param conf_thres

predictions below the confidence threshold are discarded

param iou_thres

IoU threshold for the nms algorithm

param multi_label_per_box

whether to re-use each box with all possible labels (instead of only the maximum-confidence label; all confidences above the threshold will be sent to NMS); set to True by default

param with_confidence

whether to multiply the objectness score by the class score; usually relevant only for Yolo models.

return

detections with shape nx6: (x1, y1, x2, y2, conf, cls)

super_gradients.training.utils.detection_utils.matrix_non_max_suppression(pred, conf_thres: float = 0.1, kernel: str = 'gaussian', sigma: float = 3.0, max_num_of_detections: int = 500)[source]
Performs Matrix Non-Maximum Suppression (NMS) on inference results, as described in https://arxiv.org/pdf/1912.04488.pdf

Parameters
  • pred – raw model prediction (in test mode) - a Tensor of shape [batch, num_predictions, 85] where each item format is (x, y, w, h, object_conf, class_conf, … 80 classes score …)

  • conf_thres – predictions below the confidence threshold are discarded

  • kernel – type of kernel to use [‘gaussian’, ‘linear’]

  • sigma – sigma for the gaussian kernel

  • max_num_of_detections – maximum number of boxes to output

Returns

list of detections with shape nx6: (x1, y1, x2, y2, conf, cls)

class super_gradients.training.utils.detection_utils.NMS_Type(value)[source]

Bases: str, enum.Enum

Type of non max suppression algorithm that can be used for post processing detection

ITERATIVE = 'iterative'
MATRIX = 'matrix'
super_gradients.training.utils.detection_utils.undo_image_preprocessing(im_tensor: torch.Tensor) → numpy.ndarray[source]
Parameters

im_tensor – images in a batch after preprocessing for inference, RGB, (B, C, H, W)

Returns

images in a batch in cv2 format, BGR, (B, H, W, C)

class super_gradients.training.utils.detection_utils.DetectionVisualization[source]

Bases: object

static visualize_batch(image_tensor: torch.Tensor, pred_boxes: List[torch.Tensor], target_boxes: torch.Tensor, batch_name: Union[int, str], class_names: List[str], checkpoint_dir: Optional[str] = None, undo_preprocessing_func: Callable[[torch.Tensor], numpy.ndarray] = <function undo_image_preprocessing>, box_thickness: int = 2, image_scale: float = 1.0, gt_alpha: float = 0.4)[source]

A helper function to visualize detections predicted by a network: saves images into a given path with a name that is {batch_name}_{image_idx_in_the_batch}.jpg, one batch per call. Colors are generated on the fly: uniformly sampled from the color wheel to support all given classes.

Adjustable:
  • Ground truth box transparency;

  • Box width;

  • Image size (larger or smaller than what’s provided)

Parameters
  • image_tensor – rgb images, (B, H, W, 3)

  • pred_boxes – boxes after NMS for each image in a batch, each (Num_boxes, 6), values on dim 1 are: x1, y1, x2, y2, confidence, class

  • target_boxes – (Num_targets, 6), values on dim 1 are: image id in a batch, class, x y w h (coordinates scaled to [0, 1])

  • batch_name – id of the current batch to use for image naming

  • class_names – names of all classes, each on its own index

  • checkpoint_dir – a path where images with boxes will be saved. If None, the result images will be returned as a list of numpy image arrays

  • undo_preprocessing_func – a function to convert preprocessed images tensor into a batch of cv2-like images

  • box_thickness – box line thickness in px

  • image_scale – scale of an image w.r.t. given image size, e.g. incoming images are (320x320), use scale = 2. to preview in (640x640)

  • gt_alpha – a value in [0., 1.] transparency on ground truth boxes, 0 for invisible, 1 for fully opaque

class super_gradients.training.utils.detection_utils.Anchors(anchors_list: List[List], strides: List[int])[source]

Bases: torch.nn.modules.module.Module

A wrapper module that holds the anchors used by detection models such as Yolo

property stride: torch.nn.parameter.Parameter
property anchors: torch.nn.parameter.Parameter
property anchor_grid: torch.nn.parameter.Parameter
property detection_layers_num: int
property num_anchors: int
training: bool
super_gradients.training.utils.detection_utils.xyxy2cxcywh(bboxes)[source]

Transforms bboxes from xyxy format to centered xy-wh (cxcywh) format.

Parameters

bboxes – array, shaped (nboxes, 4)

Returns

modified bboxes

super_gradients.training.utils.detection_utils.cxcywh2xyxy(bboxes)[source]

Transforms bboxes from centered xy-wh (cxcywh) format to xyxy format.

Parameters

bboxes – array, shaped (nboxes, 4)

Returns

modified bboxes
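A sketch of the xyxy -> cxcywh transform above, assuming float arrays shaped (nboxes, 4); not the library’s code:

    import numpy as np

    def xyxy2cxcywh_sketch(bboxes: np.ndarray) -> np.ndarray:
        out = bboxes.astype(float)
        out[:, 2] = bboxes[:, 2] - bboxes[:, 0]   # w = x2 - x1
        out[:, 3] = bboxes[:, 3] - bboxes[:, 1]   # h = y2 - y1
        out[:, 0] = bboxes[:, 0] + out[:, 2] / 2  # cx
        out[:, 1] = bboxes[:, 1] + out[:, 3] / 2  # cy
        return out

    print(xyxy2cxcywh_sketch(np.array([[10, 20, 30, 60]])))  # [[20. 40. 20. 40.]]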

super_gradients.training.utils.detection_utils.get_mosaic_coordinate(mosaic_index, xc, yc, w, h, input_h, input_w)[source]

Returns the mosaic coordinates of the final mosaic image according to the mosaic image index.

Parameters
  • mosaic_index – (int) mosaic image index

  • xc – (int) center x coordinate of the entire mosaic grid.

  • yc – (int) center y coordinate of the entire mosaic grid.

  • w – (int) width of bbox

  • h – (int) height of bbox

  • input_h – (int) image input height (should be 1/2 of the final mosaic output image height).

  • input_w – (int) image input width (should be 1/2 of the final mosaic output image width).

Returns

(x1, y1, x2, y2), (x1s, y1s, x2s, y2s) where (x1, y1, x2, y2) are the coordinates in the final mosaic output image, and (x1s, y1s, x2s, y2s) are the coordinates in the placed image.

super_gradients.training.utils.detection_utils.adjust_box_anns(bbox, scale_ratio, padw, padh, w_max, h_max)[source]

Adjusts the bbox annotations of a rescaled, padded image.

Parameters
  • bbox – (np.array) bbox to modify.

  • scale_ratio – (float) scale ratio between rescale output image and original one.

  • padw – (int) width padding size.

  • padh – (int) height padding size.

  • w_max – (int) width border.

  • h_max – (int) height border

Returns

modified bbox (np.array)

class super_gradients.training.utils.detection_utils.DetectionCollateFN[source]

Bases: object

Collate function for Yolox training

class super_gradients.training.utils.detection_utils.CrowdDetectionCollateFN[source]

Bases: super_gradients.training.utils.detection_utils.DetectionCollateFN

Collate function for Yolox training with additional_batch_items that includes crowd targets

super_gradients.training.utils.detection_utils.compute_box_area(box: torch.Tensor) → torch.Tensor[source]
Compute the area of one or many boxes.
param box

One or many boxes, shape = (4, ?), each box in format (x1, y1, x2, y2)

Returns

Area of every box, shape = (1, ?)

super_gradients.training.utils.detection_utils.crowd_ioa(det_box: torch.Tensor, crowd_box: torch.Tensor) → torch.Tensor[source]

Return intersection-over-detection_area of boxes, used for crowd ground truths. Both sets of boxes are expected to be in (x1, y1, x2, y2) format.

Parameters
  • det_box – Tensor[N, 4]

  • crowd_box – Tensor[M, 4]

Returns

the NxM matrix containing the pairwise IoA values for every element in det_box and crowd_box

Return type

crowd_ioa (Tensor[N, M])

super_gradients.training.utils.detection_utils.compute_detection_matching(output: torch.Tensor, targets: torch.Tensor, height: int, width: int, iou_thresholds: torch.Tensor, denormalize_targets: bool, device: str, crowd_targets: Optional[torch.Tensor] = None, top_k: int = 100, return_on_cpu: bool = True) → List[Tuple][source]

Match predictions (NMS output) and the targets (ground truth) with respect to IoU and confidence score.

Parameters
  • output – list (of length batch_size) of Tensors of shape (num_predictions, 6), format: (x1, y1, x2, y2, confidence, class_label) where x1, y1, x2, y2 are according to image size

  • targets – targets for all images of shape (total_num_targets, 6) format: (index, x, y, w, h, label) where x,y,w,h are in range [0,1]

  • height – dimensions of the image

  • width – dimensions of the image

  • iou_thresholds – Threshold to compute the mAP

  • device – Device

  • crowd_targets – crowd targets for all images of shape (total_num_crowd_targets, 6) format: (index, x, y, w, h, label) where x,y,w,h are in range [0,1]

  • top_k – Number of predictions to keep per class, ordered by confidence score

  • denormalize_targets – If True, denormalize the targets and crowd_targets

  • return_on_cpu – If True, the output will be returned on “CPU”, otherwise it will be returned on “device”

Returns

list of the following tensors, for every image:

preds_matched

Tensor of shape (num_img_predictions, n_iou_thresholds) True when prediction (i) is matched with a target with respect to the (j)th IoU threshold

preds_to_ignore

Tensor of shape (num_img_predictions, n_iou_thresholds) True when prediction (i) is matched with a crowd target with respect to the (j)th IoU threshold

preds_scores

Tensor of shape (num_img_predictions), confidence score for every prediction

preds_cls

Tensor of shape (num_img_predictions), predicted class for every prediction

targets_cls

Tensor of shape (num_img_targets), ground truth class for every target

super_gradients.training.utils.detection_utils.compute_img_detection_matching(preds: torch.Tensor, targets: torch.Tensor, crowd_targets: torch.Tensor, height: int, width: int, iou_thresholds: torch.Tensor, device: str, denormalize_targets: bool, top_k: int = 100, return_on_cpu: bool = True) → Tuple[source]

Match predictions (NMS output) and the targets (ground truth) with respect to IoU and confidence score for a given image.

Parameters
  • preds – Tensor of shape (num_img_predictions, 6), format: (x1, y1, x2, y2, confidence, class_label) where x1, y1, x2, y2 are according to image size

  • targets – targets for this image of shape (num_img_targets, 6) format: (index, x, y, w, h, label) where x,y,w,h are in range [0,1]

  • height – dimensions of the image

  • width – dimensions of the image

  • iou_thresholds – Threshold to compute the mAP

  • crowd_targets – crowd targets for all images of shape (total_num_crowd_targets, 6) format: (index, x, y, w, h, label) where x,y,w,h are in range [0,1]

  • top_k – Number of predictions to keep per class, ordered by confidence score

  • device – Device

  • denormalize_targets – If True, denormalize the targets and crowd_targets

  • return_on_cpu – If True, the output will be returned on “CPU”, otherwise it will be returned on “device”

Returns

preds_matched

Tensor of shape (num_img_predictions, n_iou_thresholds) True when prediction (i) is matched with a target with respect to the (j)th IoU threshold

preds_to_ignore

Tensor of shape (num_img_predictions, n_iou_thresholds) True when prediction (i) is matched with a crowd target with respect to the (j)th IoU threshold

preds_scores

Tensor of shape (num_img_predictions), confidence score for every prediction

preds_cls

Tensor of shape (num_img_predictions), predicted class for every prediction

targets_cls

Tensor of shape (num_img_targets), ground truth class for every target

super_gradients.training.utils.detection_utils.get_top_k_idx_per_cls(preds_scores: torch.Tensor, preds_cls: torch.Tensor, top_k: int)[source]

Get the indexes of all the top k predictions for every class

Parameters
  • preds_scores – The confidence scores, vector of shape (n_pred)

  • preds_cls – The predicted class, vector of shape (n_pred)

  • top_k – Number of predictions to keep per class, ordered by confidence score

Return top_k_idx

Indexes of the top k predictions. length <= (k * n_unique_class)

super_gradients.training.utils.detection_utils.compute_detection_metrics(preds_matched: torch.Tensor, preds_to_ignore: torch.Tensor, preds_scores: torch.Tensor, preds_cls: torch.Tensor, targets_cls: torch.Tensor, device: str, recall_thresholds: Optional[torch.Tensor] = None, score_threshold: Optional[float] = 0.1) → Tuple[source]

Compute the list of precision, recall, mAP and f1 for every IoU threshold and for every class.

Parameters
  • preds_matched – Tensor of shape (num_predictions, n_iou_thresholds), True when prediction (i) is matched with a target with respect to the (j)th IoU threshold

  • preds_to_ignore – Tensor of shape (num_predictions, n_iou_thresholds), True when prediction (i) is matched with a crowd target with respect to the (j)th IoU threshold

  • preds_scores – Tensor of shape (num_predictions), confidence score for every prediction

  • preds_cls – Tensor of shape (num_predictions), predicted class for every prediction

  • targets_cls – Tensor of shape (num_targets), ground truth class for every target box to be detected

  • recall_thresholds – Recall thresholds used to compute mAP.

  • score_threshold – Minimum confidence score to consider a prediction for the computation of precision, recall and f1 (not mAP)

  • device – Device

Returns

ap, precision, recall, f1

Tensors of shape (n_class, nb_iou_thrs)

unique_classes

Vector with all unique target classes

super_gradients.training.utils.detection_utils.compute_detection_metrics_per_cls(preds_matched: torch.Tensor, preds_to_ignore: torch.Tensor, preds_scores: torch.Tensor, n_targets: int, recall_thresholds: torch.Tensor, score_threshold: float, device: str)[source]

Compute the list of precision, recall and mAP of a given class for every IoU threshold.

param preds_matched

Tensor of shape (num_predictions, n_iou_thresholds), True when prediction (i) is matched with a target with respect to the (j)th IoU threshold

param preds_to_ignore

Tensor of shape (num_predictions, n_iou_thresholds), True when prediction (i) is matched with a crowd target with respect to the (j)th IoU threshold

param preds_scores

Tensor of shape (num_predictions), confidence score for every prediction

param n_targets

Number of target boxes of this class

param recall_thresholds

Tensor of shape (max_n_rec_thresh), list of recall thresholds used to compute mAP

param score_threshold

Minimum confidence score to consider a prediction for the computation of precision and recall (not mAP)

param device

Device

return ap, precision, recall

Tensors of shape (nb_iou_thrs)

super_gradients.training.utils.distributed_training_utils module

super_gradients.training.utils.distributed_training_utils.distributed_all_reduce_tensor_average(tensor, n)[source]

This method performs a reduce operation on multiple nodes running distributed training. It first sums all of the results and then divides the summation by n.

Parameters
  • tensor – The tensor to perform the reduce operation on

  • n – Number of nodes

Returns

Averaged tensor from all of the nodes
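A sketch of the reduce-then-average behavior using torch.distributed (assumes an initialized process group; not the library’s exact code):

    import torch
    import torch.distributed as dist

    def all_reduce_average(tensor: torch.Tensor, n: int) -> torch.Tensor:
        reduced = tensor.clone()
        dist.all_reduce(reduced, op=dist.ReduceOp.SUM)  # sum across all processes
        return reduced / n                              # then divide by the node count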

super_gradients.training.utils.distributed_training_utils.reduce_results_tuple_for_ddp(validation_results_tuple, device)[source]

Gather all validation tuples from the various devices and average them

class super_gradients.training.utils.distributed_training_utils.MultiGPUModeAutocastWrapper(func)[source]

Bases: object

super_gradients.training.utils.distributed_training_utils.scaled_all_reduce(tensors: torch.Tensor, num_gpus: int)[source]

Performs the scaled all_reduce operation on the provided tensors. The input tensors are modified in-place. Currently supports only the sum reduction operator. The reduced values are scaled by the inverse size of the process group (equivalent to num_gpus).

super_gradients.training.utils.distributed_training_utils.compute_precise_bn_stats(model: torch.nn.modules.module.Module, loader: torch.utils.data.dataloader.DataLoader, precise_bn_batch_size: int, num_gpus: int)[source]
Parameters
  • model – The model being trained (i.e. SgModel.net)

  • loader – Training dataloader (i.e. SgModel.train_loader)

  • precise_bn_batch_size – The effective batch size we want to calculate the batchnorm on. For example, if we are training a model on 8 gpus, with a batch of 128 on each gpu, a good rule of thumb would be to give it 8192 (i.e. effective_batch_size * num_gpus = batch_per_gpu * num_gpus * num_gpus). If precise_bn_batch_size is not provided in the training_params, the latter heuristic will be used.

  • num_gpus – The number of gpus we are training on

super_gradients.training.utils.distributed_training_utils.get_local_rank()[source]

Returns the local rank if running in DDP, and 0 otherwise.

super_gradients.training.utils.distributed_training_utils.get_world_size() → int[source]

Returns the world size if running in DDP, and 1 otherwise.

super_gradients.training.utils.distributed_training_utils.wait_for_the_master(local_rank: int)[source]

Makes all processes wait for the master to finish its task.

super_gradients.training.utils.early_stopping module

class super_gradients.training.utils.early_stopping.EarlyStop(phase: super_gradients.training.utils.callbacks.Phase, monitor: str, mode: str = 'min', min_delta: float = 0.0, patience: int = 3, check_finite: bool = True, threshold: Optional[float] = None, verbose: bool = False, strict: bool = True)[source]

Bases: super_gradients.training.utils.callbacks.PhaseCallback

Callback to monitor a metric and stop training when it stops improving. Inspired by pytorch_lightning.callbacks.early_stopping and tf.keras.callbacks.EarlyStopping

mode_dict = {'max': <built-in method gt of type object>, 'min': <built-in method lt of type object>}
supported_phases = (<Phase.VALIDATION_EPOCH_END: 'VALIDATION_EPOCH_END'>, <Phase.TRAIN_EPOCH_END: 'TRAIN_EPOCH_END'>)
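A usage sketch with the documented signature; the monitor key "Accuracy" is illustrative and must match a metric name that actually appears in metrics_dict:

    from super_gradients.training.utils.callbacks import Phase
    from super_gradients.training.utils.early_stopping import EarlyStop

    early_stop = EarlyStop(
        phase=Phase.VALIDATION_EPOCH_END,  # one of the supported_phases above
        monitor="Accuracy",
        mode="max",          # stop when the metric stops increasing
        min_delta=0.001,     # smallest change that counts as an improvement
        patience=5,          # epochs without improvement before stopping
        verbose=True,
    )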
exception super_gradients.training.utils.early_stopping.MissingMonitorKeyException[source]

Bases: Exception

Exception raised for missing monitor key in metrics_dict.

super_gradients.training.utils.ema module

super_gradients.training.utils.ema.copy_attr(a: torch.nn.modules.module.Module, b: torch.nn.modules.module.Module, include: Union[list, tuple] = (), exclude: Union[list, tuple] = ())[source]
class super_gradients.training.utils.ema.ModelEMA(model, decay: float = 0.9999, beta: float = 15, exp_activation: bool = True)[source]

Bases: object

Model Exponential Moving Average, from https://github.com/rwightman/pytorch-image-models. Keeps a moving average of everything in the model state_dict (parameters and buffers). This is intended to allow functionality like https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage. A smoothed version of the weights is necessary for some training schemes to perform well. This class is sensitive to where it is initialized in the sequence of model init, GPU assignment and distributed-training wrapping.

update(model, training_percent: float)[source]

Update the state of the EMA model.

Parameters
  • model – current training model

  • training_percent – the percentage of the training process [0,1], i.e. 0.4 means 40% of the training has passed
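The update is built around the standard EMA rule; a sketch (SuperGradients additionally ramps the effective decay as a function of training_percent when exp_activation=True, which this sketch omits):

    def ema_step(ema_value, model_value, decay=0.9999):
        # standard exponential moving average of one parameter/buffer
        return decay * ema_value + (1.0 - decay) * model_value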

update_attr(model)[source]

This function updates model attributes (not weights and biases) from the original model to the EMA model. Attributes of the original model, such as anchors and grids (of detection models), may be crucial to the model operation and need to be updated. If include_attributes and exclude_attributes lists were not defined, all non-private (not starting with ‘_’) attributes will be updated (and only them).

Parameters

model – the source model

class super_gradients.training.utils.ema.KDModelEMA(kd_model: super_gradients.training.models.kd_modules.kd_module.KDModule, decay: float = 0.9999, beta: float = 15, exp_activation: bool = True)[source]

Bases: super_gradients.training.utils.ema.ModelEMA

Model Exponential Moving Average, from https://github.com/rwightman/pytorch-image-models. Keeps a moving average of everything in the model state_dict (parameters and buffers). This is intended to allow functionality like https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage. A smoothed version of the weights is necessary for some training schemes to perform well. This class is sensitive to where it is initialized in the sequence of model init, GPU assignment and distributed-training wrapping.

super_gradients.training.utils.export_utils module

super_gradients.training.utils.export_utils.fuse_conv_bn(model: torch.nn.modules.module.Module, replace_bn_with_identity: bool = False)[source]

Fuses consecutive nn.Conv2d and nn.BatchNorm2d layers recursively, in place, throughout the model.

Parameters
  • model – the target model

  • replace_bn_with_identity – if set to True, bn will be replaced with identity; otherwise, bn will be removed

Returns

the number of fuses executed
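To illustrate the folding this utility performs model-wide, PyTorch’s own helper fuses a single Conv2d+BatchNorm2d pair in eval mode:

    import torch
    from torch.nn.utils.fusion import fuse_conv_bn_eval

    conv = torch.nn.Conv2d(3, 8, kernel_size=3, bias=False).eval()
    bn = torch.nn.BatchNorm2d(8).eval()
    fused = fuse_conv_bn_eval(conv, bn)  # a single Conv2d with the BN statistics folded in

    x = torch.randn(1, 3, 16, 16)
    print(torch.allclose(fused(x), bn(conv(x)), atol=1e-6))  # True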

super_gradients.training.utils.get_model_stats module

super_gradients.training.utils.module_utils module

class super_gradients.training.utils.module_utils.MultiOutputModule(module: torch.nn.modules.module.Module, output_paths: list, prune: bool = True)[source]

Bases: torch.nn.modules.module.Module

This module wraps around a container nn.Module (such as Module, Sequential and ModuleList) and allows extracting multiple outputs from its inner modules on each forward() call (as a list of output tensors). Note: the default output of the wrapped module will not be added to the output list by default. To get the default output in the outputs list, explicitly include its path in the output_paths parameter.

i.e. for module:

    Sequential(
      (0): Sequential(
        (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU6(inplace=True)
      ) ===================================>>
      (1): InvertedResidual(
        (conv): Sequential(
          (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
          (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU6(inplace=True) ===================================>>
          (3): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (4): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
    )

and paths:

    [0, [1, ‘conv’, 2]]

the outputs are marked with arrows

save_output_hook(_, input, output)[source]
forward(x) → list[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
super_gradients.training.utils.module_utils.replace_activations(module: torch.nn.modules.module.Module, new_activation: torch.nn.modules.module.Module, activations_to_replace: List[type])[source]

Recursively goes through module and replaces each activation in activations_to_replace with a copy of new_activation.

Parameters
  • module – a module that will be changed inplace

  • new_activation – a sample of a new activation (will be copied)

  • activations_to_replace – types of activations to replace, each must be a subclass of nn.Module

super_gradients.training.utils.module_utils.fuse_repvgg_blocks_residual_branches(model: torch.nn.modules.module.Module)[source]

Calls fuse_block_residual_branches for all repvgg blocks in the model.

Parameters

model – (torch.nn.Module) model with repvgg blocks; doesn’t have to consist entirely of repvgg blocks.

class super_gradients.training.utils.module_utils.ConvBNReLU(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int]] = 1, padding: Union[int, Tuple[int, int]] = 0, dilation: Union[int, Tuple[int, int]] = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', use_normalization: bool = True, eps: float = 1e-05, momentum: float = 0.1, affine: bool = True, track_running_stats: bool = True, device=None, dtype=None, use_activation: bool = True, inplace: bool = False)[source]

Bases: torch.nn.modules.module.Module

Class for a Convolution2d-Batchnorm2d-Relu layer. Default behaviour is Conv-BN-Relu. To exclude the Batchnorm module use use_normalization=False; to exclude the Relu activation use use_activation=False.

For convolution arguments documentation see nn.Conv2d. For batchnorm arguments documentation see nn.BatchNorm2d. For relu arguments documentation see nn.ReLU.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class super_gradients.training.utils.module_utils.NormalizationAdapter(mean_original, std_original, mean_required, std_required)[source]

Bases: torch.nn.modules.module.Module

Denormalizes input by mean_original, std_original, then normalizes by mean_required, std_required.

Used in KD training where teacher expects data normalized by mean_required, std_required.

mean_original, std_original, mean_required, std_required are all list-like objects with length equal to the number of input channels.
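A sketch of the per-channel transform described above, for NCHW tensors (not the module’s exact implementation):

    import torch

    def renormalize(x, mean_orig, std_orig, mean_req, std_req):
        # x: [N, C, H, W]; the four stats are 1-D tensors of length C
        x = x * std_orig[None, :, None, None] + mean_orig[None, :, None, None]   # undo original normalization
        return (x - mean_req[None, :, None, None]) / std_req[None, :, None, None]  # apply required one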

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool

super_gradients.training.utils.optimizer_utils module

super_gradients.training.utils.optimizer_utils.separate_zero_wd_params_groups_for_optimizer(module: torch.nn.modules.module.Module, net_named_params, weight_decay: float)[source]
Separates param groups for batchnorm and biases from the others, and returns a list of param groups in the format required by torch Optimizer classes: bias + BN parameters with weight decay=0, and the rest with the given weight decay.
param module

train net module.

param net_named_params

list of params groups, output of SgModule.initialize_param_groups

param weight_decay

value to set for the non BN and bias parameters

super_gradients.training.utils.optimizer_utils.build_optimizer(net, lr, training_params)[source]
Wrapper function for initializing the optimizer
param net

the nn_module to build the optimizer for

param lr

initial learning rate

param training_params

training_parameters

super_gradients.training.utils.regularization_utils module

class super_gradients.training.utils.regularization_utils.DropPath(drop_prob=None)[source]

Bases: torch.nn.modules.module.Module

Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).

Code taken from TIMM (https://github.com/rwightman/pytorch-image-models) Apache License 2.0

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool

super_gradients.training.utils.segmentation_utils module

super_gradients.training.utils.segmentation_utils.coco_sub_classes_inclusion_tuples_list()[source]
super_gradients.training.utils.segmentation_utils.to_one_hot(target: torch.Tensor, num_classes: int, ignore_index: Optional[int] = None)[source]

Target label to one_hot tensor. Labels and ignore_index must be consecutive numbers.

Parameters
  • target – Class labels long tensor, with shape [N, H, W]

  • num_classes – num of classes in datasets excluding ignore label; this is the output channels of the one hot result.

Returns

one hot tensor with shape [N, num_classes, H, W]
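The documented contract, illustrated with torch.nn.functional.one_hot (a sketch; the library function additionally handles ignore_index):

    import torch

    target = torch.tensor([[[0, 1], [2, 1]]])                      # [N=1, H=2, W=2]
    one_hot = torch.nn.functional.one_hot(target, num_classes=3)   # [N, H, W, C]
    one_hot = one_hot.permute(0, 3, 1, 2).contiguous()             # [N, C, H, W]
    print(one_hot.shape)  # torch.Size([1, 3, 2, 2])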

super_gradients.training.utils.segmentation_utils.reverse_imagenet_preprocessing(im_tensor: torch.Tensor)numpy.ndarray[source]
Parameters

im_tensor – images in a batch after preprocessing for inference, RGB, (B, C, H, W)

Returns

images in a batch in cv2 format, BGR, (B, H, W, C)

class super_gradients.training.utils.segmentation_utils.BinarySegmentationVisualization[source]

Bases: object

static visualize_batch(image_tensor: torch.Tensor, pred_mask: torch.Tensor, target_mask: torch.Tensor, batch_name: Union[int, str], checkpoint_dir: Optional[str] = None, undo_preprocessing_func: Callable[[torch.Tensor], numpy.ndarray] = <function reverse_imagenet_preprocessing>, image_scale: float = 1.0)[source]

A helper function to visualize segmentation masks predicted by a network: saves images into a given path with a name that is {batch_name}_{image_idx_in_the_batch}.jpg, one batch per call. Colors are generated on the fly: uniformly sampled from the color wheel to support all given classes.

Parameters
  • image_tensor – rgb images, (B, H, W, 3)

  • pred_mask – predicted segmentation masks for each image in the batch

  • target_mask – ground truth segmentation masks

  • batch_name – id of the current batch to use for image naming

  • checkpoint_dir – a path where the visualization images will be saved. If None, the result images will be returned as a list of numpy image arrays

  • undo_preprocessing_func – a function to convert preprocessed images tensor into a batch of cv2-like images

  • image_scale – scale factor for output image

super_gradients.training.utils.segmentation_utils.visualize_batches(dataloader, module, visualization_path, num_batches=1, undo_preprocessing_func=None)[source]
super_gradients.training.utils.segmentation_utils.one_hot_to_binary_edge(x: torch.Tensor, kernel_size: int, flatten_channels: bool = True) → torch.Tensor[source]

Utils function to create edge feature maps.

Parameters
  • x – input tensor, must be a one_hot tensor with shape [B, C, H, W]

  • kernel_size – kernel size of the dilation-erosion convolutions. The resulting edge width depends on this argument as follows: edge_width = kernel - 1

  • flatten_channels – Whether to apply logical_or across the channels dimension: if at least one pixel class is considered an edge pixel, the flattened value is 1. If set to False the output tensor shape is [B, C, H, W], else [B, 1, H, W]. Default is True.

Returns

one_hot edge torch.Tensor.

super_gradients.training.utils.segmentation_utils.target_to_binary_edge(target: torch.Tensor, num_classes: int, kernel_size: int, ignore_index: Optional[int] = None, flatten_channels: bool = True) → torch.Tensor[source]

Utils function to create edge feature maps from target.

Parameters
  • target – Class labels long tensor, with shape [N, H, W]

  • num_classes – num of classes in datasets excluding ignore label; this is the output channels of the one hot result.

  • kernel_size – kernel size of the dilation-erosion convolutions. The resulting edge width depends on this argument as follows: edge_width = kernel - 1

  • flatten_channels – Whether to apply logical_or across the channels dimension: if at least one pixel class is considered an edge pixel, the flattened value is 1. If set to False the output tensor shape is [B, C, H, W], else [B, 1, H, W]. Default is True.

Returns

one_hot edge torch.Tensor.

super_gradients.training.utils.sg_model_utils module

class super_gradients.training.utils.sg_model_utils.MonitoredValue(name: str, greater_is_better: bool, current: Optional[float] = None, previous: Optional[float] = None, best: Optional[float] = None, change_from_previous: Optional[float] = None, change_from_best: Optional[float] = None)[source]

Bases: object

Store a value and some indicators relative to its past iterations.

The value can be a metric/loss, and the iteration can be epochs/batches.

name: str
greater_is_better: bool
current: float = None
previous: float = None
best: float = None
change_from_previous: float = None
change_from_best: float = None
property is_better_than_previous
property is_best_value
super_gradients.training.utils.sg_model_utils.update_monitored_value(previous_monitored_value: super_gradients.training.utils.sg_model_utils.MonitoredValue, new_value: float) → super_gradients.training.utils.sg_model_utils.MonitoredValue[source]

Update the given MonitoredValue object (could be a loss or a metric) with the new value

Parameters
  • previous_monitored_value – The stats about the value that is monitored throughout epochs.

  • new_value – The value of the current epoch that will be used to update previous_monitored_value

Returns

super_gradients.training.utils.sg_model_utils.update_monitored_values_dict(monitored_values_dict: Dict[str, super_gradients.training.utils.sg_model_utils.MonitoredValue], new_values_dict: Dict[str, float]) → Dict[str, super_gradients.training.utils.sg_model_utils.MonitoredValue][source]

Update the given dict of MonitoredValue objects (could be losses or metrics) with the new values

Parameters
  • monitored_values_dict – Dict mapping value names to their stats throughout epochs.

  • new_values_dict – Dict mapping value names to their new (i.e. current epoch) value.

Returns

Updated monitored_values_dict

super_gradients.training.utils.sg_model_utils.display_epoch_summary(epoch: int, n_digits: int, train_monitored_values: Dict[str, super_gradients.training.utils.sg_model_utils.MonitoredValue], valid_monitored_values: Dict[str, super_gradients.training.utils.sg_model_utils.MonitoredValue]) → None[source]

Display a summary of loss/metric of interest, for a given epoch.

Parameters
  • epoch – the number of epoch.

  • n_digits – number of digits to display on screen for float values

  • train_monitored_values – mapping of loss/metric with their stats that will be displayed

  • valid_monitored_values – mapping of loss/metric with their stats that will be displayed

Returns

super_gradients.training.utils.sg_model_utils.try_port(port)[source]

try_port - Helper method for tensorboard port binding.

super_gradients.training.utils.sg_model_utils.launch_tensorboard_process(checkpoints_dir_path: str, sleep_postpone: bool = True, port: Optional[int] = None) → Tuple[multiprocessing.context.Process, int][source]
launch_tensorboard_process - Default behavior is to scan all free ports from 6006-6016 and try using them, unless a port is defined by the user.

param checkpoints_dir_path

param sleep_postpone

param port

return

tuple of tb process, port

super_gradients.training.utils.sg_model_utils.init_summary_writer(tb_dir, checkpoint_loaded, user_prompt=False)[source]

Remove previous tensorboard files from the directory and launch a tensorboard process

super_gradients.training.utils.sg_model_utils.add_log_to_file(filename, results_titles_list, results_values_list, epoch, max_epochs)[source]

Add a message to the log file

super_gradients.training.utils.sg_model_utils.write_training_results(writer, results_titles_list, results_values_list, epoch)[source]

Stores the training and validation loss and accuracy for current epoch in a tensorboard file

super_gradients.training.utils.sg_model_utils.write_hpms(writer, hpmstructs=[], special_conf={})[source]

Stores the training and dataset hyper params in the tensorboard file

super_gradients.training.utils.sg_model_utils.unpack_batch_items(batch_items: Union[tuple, torch.Tensor])[source]

Adds support for unpacking batch items in train/validation loop.

@param batch_items: (Union[tuple, torch.Tensor]) returned by the data loader, which is expected to be in one of
the following formats:
  1. torch.Tensor or tuple, such that inputs = batch_items[0], targets = batch_items[1] and len(batch_items) = 2

  2. tuple: (inputs, targets, additional_batch_items)

where inputs are fed to the network, targets are their corresponding labels and additional_batch_items is a dictionary (format {additional_batch_item_i_name: additional_batch_item_i …}) which can be accessed through the phase context under the attribute additional_batch_item_i_name, using a phase callback.

@return: inputs, target, additional_batch_items
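
A minimal sketch of the two supported formats, assuming the function always returns the (inputs, target, additional_batch_items) triple described above:

import torch
from super_gradients.training.utils.sg_model_utils import unpack_batch_items

# Format 1: (inputs, targets)
inputs, targets, extras = unpack_batch_items((torch.rand(4, 3, 32, 32), torch.zeros(4)))

# Format 2: (inputs, targets, additional_batch_items)
batch = (torch.rand(4, 3, 32, 32), torch.zeros(4), {"sample_ids": torch.arange(4)})
inputs, targets, extras = unpack_batch_items(batch)
print(extras["sample_ids"])  # also reachable from the phase context in a phase callback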

super_gradients.training.utils.sg_model_utils.log_uncaught_exceptions(logger)[source]

Makes the given logger log uncaught exceptions. @param logger: logging.Logger

@return: None

super_gradients.training.utils.ssd_utils module

class super_gradients.training.utils.ssd_utils.DefaultBoxes(fig_size: int, feat_size: List[int], scales: List[int], aspect_ratios: List[List[int]], scale_xy=0.1, scale_wh=0.2)[source]

Bases: object

Default boxes (a.k.a. anchor boxes or prior boxes) used by the SSD model

property scale_xy
property scale_wh
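
For illustration, a sketch that constructs default boxes with the classic SSD300 configuration; these particular numbers are the common SSD300 settings, not values prescribed by this module:

from super_gradients.training.utils.ssd_utils import DefaultBoxes

# Classic SSD300 anchor configuration (illustrative values)
dboxes = DefaultBoxes(
    fig_size=300,
    feat_size=[38, 19, 10, 5, 3, 1],
    scales=[21, 45, 99, 153, 207, 261, 315],
    aspect_ratios=[[2], [2, 3], [2, 3], [2, 3], [2], [2]],
)
print(dboxes.scale_xy, dboxes.scale_wh)  # 0.1 0.2 by default
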
class super_gradients.training.utils.ssd_utils.SSDPostPredictCallback(conf: float = 0.001, iou: float = 0.6, classes: Optional[list] = None, max_predictions: int = 300, nms_type: super_gradients.training.utils.detection_utils.NMS_Type = <NMS_Type.ITERATIVE: 'iterative'>, multi_label_per_box=True)[source]

Bases: super_gradients.training.utils.detection_utils.DetectionPostPredictionCallback

Post-prediction callback module that converts and filters predictions coming from the SSD net into the format used by all other detection models

forward(predictions, device=None)[source]
Parameters
  • predictions – the output of your model

  • device – the device to move all output tensors into

Returns

a list of length batch_size; each item in the list is a detections tensor of shape nx6 (x1, y1, x2, y2, confidence, class), where x and y are in the range [0, 1]

training: bool

super_gradients.training.utils.utils module

super_gradients.training.utils.utils.convert_to_tensor(array)[source]

Converts numpy arrays and lists to torch tensors before calculating losses. :param array: torch.Tensor / numpy array / list

class super_gradients.training.utils.utils.HpmStruct(**entries)[source]

Bases: object

set_schema(schema: dict)[source]
override(**entries)[source]
to_dict()[source]
validate()[source]

Validate the current dict values according to the provided schema.

Raises
  • AttributeError – if the schema was not set

  • jsonschema.exceptions.ValidationError – if the instance is invalid

  • jsonschema.exceptions.SchemaError – if the schema itself is invalid
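
A minimal sketch of HpmStruct usage with an optional jsonschema schema; the schema shown is illustrative:

from super_gradients.training.utils.utils import HpmStruct

hparams = HpmStruct(lr=0.1, momentum=0.9)
hparams.override(lr=0.01)  # entries can be overridden after construction
print(hparams.lr)  # 0.01

# Optional validation against a jsonschema schema (illustrative schema)
hparams.set_schema({"type": "object", "properties": {"lr": {"type": "number"}}})
hparams.validate()  # raises jsonschema.exceptions.ValidationError if invalid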

class super_gradients.training.utils.utils.WrappedModel(module)[source]

Bases: torch.nn.modules.module.Module

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class super_gradients.training.utils.utils.Timer(device: str)[source]

Bases: object

A class to measure time, handling both GPU and CPU processes. Returns time in milliseconds.

start()[source]
stop()[source]
class super_gradients.training.utils.utils.AverageMeter[source]

Bases: object

A class to calculate the average of a metric over batches during training/testing

update(value: Union[float, tuple, list, torch.Tensor], batch_size: int)[source]
property average
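
A minimal sketch, assuming update weights each value by batch_size so that average is the batch-size-weighted mean:

from super_gradients.training.utils.utils import AverageMeter

loss_meter = AverageMeter()
loss_meter.update(0.8, batch_size=32)
loss_meter.update(0.6, batch_size=32)
print(loss_meter.average)  # 0.7 here, since both batches have equal size
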
super_gradients.training.utils.utils.tensor_container_to_device(obj: Union[torch.Tensor, tuple, list, dict], device: str, non_blocking=True)[source]
Recursively sends compound objects to a device (sending all tensors to the device while maintaining the structure).

:param obj: the object to send to device (list / tuple / tensor / dict)
:param device: device to send the tensors to
:param non_blocking: used for DistributedDataParallel
:returns: an object with the same structure (tensors, lists, tuples) with the device pointers (like the return value of Tensor.to(device))
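
A minimal sketch; it assumes a CUDA device is available:

import torch
from super_gradients.training.utils.utils import tensor_container_to_device

batch = {"inputs": torch.rand(2, 3), "targets": [torch.zeros(2), torch.ones(2)]}
# The structure is preserved; every tensor inside is moved to the device
batch_on_gpu = tensor_container_to_device(batch, device="cuda", non_blocking=True)
print(batch_on_gpu["inputs"].device)  # cuda:0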

super_gradients.training.utils.utils.get_param(params, name, default_val=None)[source]

Retrieves a param from a parameter object/dict. If the parameter does not exist, returns default_val. If default_val is a dictionary and a value is found in the params, the function returns the default value dictionary with its internal values overridden by the found value.

i.e. default_opt_params = {'lr': 0.1, 'momentum': 0.99, 'alpha': 0.001} training_params = {'optimizer_params': {'lr': 0.0001}, 'batch': 32 …. } get_param(training_params, name='optimizer_params', default_val=default_opt_params) will return {'lr': 0.0001, 'momentum': 0.99, 'alpha': 0.001}

Parameters
  • params – an object (typically HpmStruct) or a dict holding the params

  • name – name of the searched parameter

  • default_val – assumed to be the same type as the value searched in the params

Returns

the found value, or default if not found
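
A runnable version of the example above:

from super_gradients.training.utils.utils import get_param

default_opt_params = {"lr": 0.1, "momentum": 0.99, "alpha": 0.001}
training_params = {"optimizer_params": {"lr": 0.0001}, "batch": 32}
merged = get_param(training_params, name="optimizer_params", default_val=default_opt_params)
print(merged)  # {'lr': 0.0001, 'momentum': 0.99, 'alpha': 0.001}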

super_gradients.training.utils.utils.static_vars(**kwargs)[source]
super_gradients.training.utils.utils.print_once(s: str)[source]
super_gradients.training.utils.utils.move_state_dict_to_device(model_sd, device)[source]

Moves model state dict tensors to the target device (cuda or cpu). :param model_sd: model state dict :param device: either cuda or cpu

super_gradients.training.utils.utils.random_seed(is_ddp, device, seed)[source]

Sets the random seed for numpy, torch and random.

When using DDP, a seed will be set for each process according to its local rank, derived from the device number. :param is_ddp: bool, will set a different random seed for each process when using DDP. :param device: 'cuda', 'cpu', 'cuda:<device_number>' :param seed: int, random seed to be set

super_gradients.training.utils.utils.load_func(dotpath: str)[source]

Loads a function from a module; the function name is the right-most segment of the dot path.

Used for passing functions (without calling them) in yaml files.

@param dotpath: dot path to the function, including the module. @return: a python function
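
A minimal sketch, using a torch function as the target:

import torch
from super_gradients.training.utils.utils import load_func

activation = load_func("torch.nn.functional.relu")  # the right-most segment is the function
print(activation(torch.tensor([-1.0, 2.0])))  # tensor([0., 2.])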

super_gradients.training.utils.utils.get_filename_suffix_by_framework(framework: str)[source]

Returns the file extension for the given framework.

@param framework: (str) @return: (str) the suffix for the specific framework

super_gradients.training.utils.utils.check_models_have_same_weights(model_1: torch.nn.modules.module.Module, model_2: torch.nn.modules.module.Module)[source]

Checks whether two networks have the same weights

@param model_1: Net to be checked @param model_2: Net to be checked @return: True iff the two networks have the same weights

super_gradients.training.utils.utils.recursive_override(base: dict, extension: dict)[source]
super_gradients.training.utils.utils.download_and_unzip_from_url(url, dir='.', unzip=True, delete=True)[source]

Downloads a zip file from url to dir, and unzips it.

Parameters
  • url – Url to download the file from.

  • dir – Destination directory.

  • unzip – Whether to unzip the downloaded file.

  • delete – Whether to delete the zip file.

Used to download VOC.

Source: https://github.com/ultralytics/yolov5/blob/master/data/VOC.yaml

super_gradients.training.utils.utils.download_and_untar_from_url(urls: List[str], dir: Union[str, pathlib.Path] = '.')[source]

Download a file from url and untar.

Parameters
  • urls – List of urls to download the files from.

  • dir – Destination directory.

super_gradients.training.utils.utils.make_divisible(x: int, divisor: int, ceil: bool = True) → int[source]

Returns x rounded to be evenly divisible by divisor. If ceil=True it will return the closest larger number to the original x; if ceil=False, the closest smaller number.
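
For example, consistent with the behavior described above:

from super_gradients.training.utils.utils import make_divisible

print(make_divisible(321, 32))  # 352: the closest larger multiple (ceil=True, the default)
print(make_divisible(321, 32, ceil=False))  # 320: the closest smaller multiple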

super_gradients.training.utils.utils.check_img_size_divisibility(img_size: int, stride: int = 32) → Tuple[bool, Optional[Tuple[int, int]]][source]
Parameters
  • img_size – Int, the size of the image (H or W).

  • stride – Int, the number to check if img_size is divisible by.

Returns

(True, None) if img_size is divisible by stride, (False, suggestions) if it is not. Note: suggestions are the two closest numbers to img_size that are divisible by stride. For example, if img_size=321 and stride=32, it will return (False, (352, 320)).

super_gradients.training.utils.utils.get_orientation_key() → int[source]

Gets the orientation key according to PIL, which is useful, for instance, for getting the image size. :return: orientation key according to PIL

super_gradients.training.utils.utils.exif_size(image: PIL.Image.Image) → Tuple[int, int][source]

Gets the size of an image. :param image: the image to get the size from :return: (width, height)

super_gradients.training.utils.utils.get_image_size_from_path(img_path: str) → Tuple[int, int][source]

Gets the size of an image at a specific path.

super_gradients.training.utils.weight_averaging_utils module

class super_gradients.training.utils.weight_averaging_utils.ModelWeightAveraging(ckpt_dir, greater_is_better, source_ckpt_folder_name=None, metric_to_watch='acc', metric_idx=1, load_checkpoint=False, number_of_models_to_average=10, model_checkpoints_location='local')[source]

Bases: object

Utils class for managing the averaging of the several best snapshots into a single model. A snapshot dictionary file and the average model will be saved/updated at every epoch and evaluated only when training is completed. The snapshot file will only be deleted upon completing the training. The snapshot dict will be managed on cpu.

update_snapshots_dict(model, validation_results_tuple)[source]

Updates the snapshot dict and returns the updated average model for saving. :param model: the latest model :param validation_results_tuple: performance of the latest model

get_average_model(model, validation_results_tuple=None)[source]

Returns the averaged model. :param model: will be used to determine the architecture :param validation_results_tuple: if provided, the average model will be updated before returning

cleanup()[source]

Deletes the snapshot file when reaching the last epoch.
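
A minimal per-epoch sketch; the checkpoint directory, model and validation_results_tuple are hypothetical placeholders:

from super_gradients.training.utils.weight_averaging_utils import ModelWeightAveraging

averaging = ModelWeightAveraging(ckpt_dir="/path/to/ckpts", greater_is_better=True,
                                 metric_to_watch="acc", number_of_models_to_average=10)
# At the end of each epoch (model and validation_results_tuple come from the training loop):
average_model_sd = averaging.update_snapshots_dict(model, validation_results_tuple)
# Once training is completed:
averaging.cleanup()  # deletes the snapshot file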

Module contents

class super_gradients.training.utils.Timer(device: str)[source]

Bases: object

A class to measure time, handling both GPU and CPU processes. Returns time in milliseconds.

start()[source]
stop()[source]
class super_gradients.training.utils.HpmStruct(**entries)[source]

Bases: object

set_schema(schema: dict)[source]
override(**entries)[source]
to_dict()[source]
validate()[source]

Validate the current dict values according to the provided schema.

Raises
  • AttributeError – if the schema was not set

  • jsonschema.exceptions.ValidationError – if the instance is invalid

  • jsonschema.exceptions.SchemaError – if the schema itself is invalid

class super_gradients.training.utils.WrappedModel(module)[source]

Bases: torch.nn.modules.module.Module

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
super_gradients.training.utils.convert_to_tensor(array)[source]

Converts numpy arrays and lists to torch tensors before calculating losses. :param array: torch.Tensor / numpy array / list

super_gradients.training.utils.get_param(params, name, default_val=None)[source]

Retrieves a param from a parameter object/dict. If the parameter does not exist, returns default_val. If default_val is a dictionary and a value is found in the params, the function returns the default value dictionary with its internal values overridden by the found value.

i.e. default_opt_params = {'lr': 0.1, 'momentum': 0.99, 'alpha': 0.001} training_params = {'optimizer_params': {'lr': 0.0001}, 'batch': 32 …. } get_param(training_params, name='optimizer_params', default_val=default_opt_params) will return {'lr': 0.0001, 'momentum': 0.99, 'alpha': 0.001}

Parameters
  • params – an object (typically HpmStruct) or a dict holding the params

  • name – name of the searched parameter

  • default_val – assumed to be the same type as the value searched in the params

Returns

the found value, or default if not found

super_gradients.training.utils.tensor_container_to_device(obj: Union[torch.Tensor, tuple, list, dict], device: str, non_blocking=True)[source]
Recursively sends compound objects to a device (sending all tensors to the device while maintaining the structure).

:param obj: the object to send to device (list / tuple / tensor / dict)
:param device: device to send the tensors to
:param non_blocking: used for DistributedDataParallel
:returns: an object with the same structure (tensors, lists, tuples) with the device pointers (like the return value of Tensor.to(device))

super_gradients.training.utils.adapt_state_dict_to_fit_model_layer_names(model_state_dict: dict, source_ckpt: dict, exclude: list = [], solver: Optional[callable] = None)[source]

Given a model state dict and a source checkpoint, the method tries to correct the keys in the checkpoint dict to fit the model state dict, in order to properly load the weights into the model. If unsuccessful, returns None.

:param model_state_dict: the model state_dict
:param source_ckpt: checkpoint dict
:param exclude: optional list of excluded layers
:param solver: callable with signature (ckpt_key, ckpt_val, model_key, model_val) that returns a desired weight for ckpt_val.
:return: renamed checkpoint dict (if possible)
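
A minimal usage sketch; the stand-in model and checkpoint path are hypothetical:

import torch
from super_gradients.training.utils import adapt_state_dict_to_fit_model_layer_names

model = torch.nn.Linear(10, 2)  # stand-in model (hypothetical)
ckpt = torch.load("ckpt_best.pth", map_location="cpu")  # hypothetical checkpoint path
adapted_ckpt = adapt_state_dict_to_fit_model_layer_names(model.state_dict(), ckpt)
if adapted_ckpt is not None:
    model.load_state_dict(adapted_ckpt)  # checkpoint keys renamed to fit the model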

super_gradients.training.utils.raise_informative_runtime_error(state_dict, checkpoint, exception_msg)[source]

Given a model state dict and a source checkpoint, the method calls adapt_state_dict_to_fit_model_layer_names and enhances the exception_msg if loading the checkpoint dict via the conversion method is possible.

super_gradients.training.utils.random_seed(is_ddp, device, seed)[source]

Sets the random seed for numpy, torch and random.

When using DDP, a seed will be set for each process according to its local rank, derived from the device number. :param is_ddp: bool, will set a different random seed for each process when using DDP. :param device: 'cuda', 'cpu', 'cuda:<device_number>' :param seed: int, random seed to be set