super_gradients.training.utils package
Subpackages
Submodules
super_gradients.training.utils.callbacks module
- class super_gradients.training.utils.callbacks.Phase(value)[source]
Bases:
enum.Enum
An enumeration.
- PRE_TRAINING = 'PRE_TRAINING'
- TRAIN_BATCH_END = 'TRAIN_BATCH_END'
- TRAIN_BATCH_STEP = 'TRAIN_BATCH_STEP'
- TRAIN_EPOCH_START = 'TRAIN_EPOCH_START'
- TRAIN_EPOCH_END = 'TRAIN_EPOCH_END'
- VALIDATION_BATCH_END = 'VALIDATION_BATCH_END'
- VALIDATION_EPOCH_END = 'VALIDATION_EPOCH_END'
- VALIDATION_END_BEST_EPOCH = 'VALIDATION_END_BEST_EPOCH'
- TEST_BATCH_END = 'TEST_BATCH_END'
- TEST_END = 'TEST_END'
- POST_TRAINING = 'POST_TRAINING'
- class super_gradients.training.utils.callbacks.ContextSgMethods(**methods)[source]
Bases:
object
Class for delegating SgModel’s methods, so that only the phase-relevant ones are accessible.
- class super_gradients.training.utils.callbacks.PhaseContext(epoch=None, batch_idx=None, optimizer=None, metrics_dict=None, inputs=None, preds=None, target=None, metrics_compute_fn=None, loss_avg_meter=None, loss_log_items=None, criterion=None, device=None, experiment_name=None, ckpt_dir=None, net=None, lr_warmup_epochs=None, sg_logger=None, train_loader=None, valid_loader=None, training_params=None, ddp_silent_mode=None, checkpoint_params=None, architecture=None, arch_params=None, metric_idx_in_results_tuple=None, metric_to_watch=None, valid_metrics=None, context_methods=None)[source]
Bases:
object
Represents the input for phase callbacks, and is constantly updated after callback calls.
- class super_gradients.training.utils.callbacks.PhaseCallback(phase: super_gradients.training.utils.callbacks.Phase)[source]
Bases:
object
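Concrete callbacks subclass PhaseCallback and implement __call__(context). A minimal sketch of a custom callback — EpochLoggerCallback is a hypothetical name, and the __call__(context) convention is assumed from the callbacks documented below:

```python
from super_gradients.training.utils.callbacks import Phase, PhaseCallback, PhaseContext

class EpochLoggerCallback(PhaseCallback):
    # Hypothetical callback that prints the epoch index when a training epoch ends.
    def __init__(self):
        super().__init__(phase=Phase.TRAIN_EPOCH_END)

    def __call__(self, context: PhaseContext):
        # context is re-populated by the trainer before each callback fires
        print(f"finished epoch {context.epoch}")
```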
- class super_gradients.training.utils.callbacks.ModelConversionCheckCallback(model_meta_data: deci_lab_client.models.model_metadata.ModelMetadata, **kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
Pre-training callback that verifies model conversion to onnx given specified conversion parameters.
The model is converted, then inference is applied with onnx runtime.
Use this callback with the same args as DeciPlatformCallback to prevent conversion failures at the end of training.
- model_meta_data
(ModelMetadata) model’s meta-data object.
- The following parameters may be passed as kwargs in order to control the conversion to onnx
- class super_gradients.training.utils.callbacks.DeciLabUploadCallback(model_meta_data, optimization_request_form, auth_token: Optional[str] = None, ckpt_name='ckpt_best.pth', **kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
Post-training callback for uploading and optimizing a model.
- email
(str) username for Deci platform.
- model_meta_data
(ModelMetadata) model’s meta-data object.
- optimization_request_form
(dict) optimization request form object.
- password
(str) default=None, should only be used for testing.
- ckpt_name
(str) default="ckpt_best.pth", refers to the filename of the checkpoint, inside the checkpoint directory.
- The following parameters may be passed as kwargs in order to control the conversion to onnx
- upload_model(model)[source]
This function will upload the trained model to the Deci Lab
- Parameters
model – The resulting model from the training process
- get_optimization_status(optimized_model_name: str)[source]
This function fetches the optimized version of the trained model and checks its benchmark status. The status is checked against the server every 30 seconds; the process either times out after 30 minutes or logs the successful optimization, whichever happens first. :param optimized_model_name: Optimized model name :type optimized_model_name: str
- Returns
whether or not the optimized model has been benchmarked
- Return type
bool
- class super_gradients.training.utils.callbacks.LRCallbackBase(phase, initial_lr, update_param_groups, train_loader_len, net, training_params, **kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
Base class for hard coded learning rate scheduling regimes, implemented as callbacks.
- is_lr_scheduling_enabled(context: super_gradients.training.utils.callbacks.PhaseContext)[source]
Predicate that controls whether to perform lr scheduling based on values in context.
@param context: PhaseContext: current phase’s context. @return: bool, whether to apply lr scheduling or not.
- perform_scheduling(context: super_gradients.training.utils.callbacks.PhaseContext)[source]
Performs lr scheduling based on values in context.
@param context: PhaseContext: current phase’s context.
- class super_gradients.training.utils.callbacks.WarmupLRCallback(**kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.LRCallbackBase
LR scheduling callback for linear step warmup. LR climbs from warmup_initial_lr in even steps to initial_lr. When warmup_initial_lr is None, the climb starts from initial_lr/(1+warmup_epochs), as in the sketch below.
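Written out directly, the warmup rule above looks like the following sketch (illustrative only, not the library’s internal code):

```python
# Linear step warmup: LR climbs in even steps from start_lr to initial_lr.
def warmup_lr(epoch: int, initial_lr: float, warmup_epochs: int,
              warmup_initial_lr: float = None) -> float:
    # When warmup_initial_lr is None, start from initial_lr / (1 + warmup_epochs)
    start_lr = warmup_initial_lr if warmup_initial_lr is not None \
        else initial_lr / (1 + warmup_epochs)
    if epoch >= warmup_epochs:
        return initial_lr
    return start_lr + (initial_lr - start_lr) * epoch / warmup_epochs

# e.g. initial_lr=0.1, warmup_epochs=4 -> 0.02, 0.04, 0.06, 0.08, then 0.1
```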
- class super_gradients.training.utils.callbacks.StepLRCallback(lr_updates, lr_decay_factor, step_lr_update_freq=None, **kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.LRCallbackBase
Hard coded step learning rate scheduling (i.e. at specific milestones).
- class super_gradients.training.utils.callbacks.ExponentialLRCallback(lr_decay_factor: float, **kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.LRCallbackBase
Exponential decay learning rate scheduling. Decays the learning rate by lr_decay_factor every epoch.
- class super_gradients.training.utils.callbacks.PolyLRCallback(max_epochs, **kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.LRCallbackBase
Hard coded polynomial decay learning rate scheduling.
- class super_gradients.training.utils.callbacks.CosineLRCallback(max_epochs, cosine_final_lr_ratio, **kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.LRCallbackBase
Hard coded cosine annealing learning rate scheduling.
- class super_gradients.training.utils.callbacks.FunctionLRCallback(max_epochs, lr_schedule_function, **kwargs)[source]
Bases:
super_gradients.training.utils.callbacks.LRCallbackBase
Hard coded learning rate scheduling for a user-defined lr scheduling function.
- exception super_gradients.training.utils.callbacks.IllegalLRSchedulerMetric(metric_name, metrics_dict)[source]
Bases:
Exception
Exception raised for an illegal combination of training parameters.
- message -- explanation of the error
- class super_gradients.training.utils.callbacks.LRSchedulerCallback(scheduler, phase, metric_name=None)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
Learning rate scheduler callback.
- scheduler
torch.optim._LRScheduler, the learning rate scheduler whose step() will be called.
- metric_name
str, (default=None) the metric name for ReduceLROnPlateau learning rate scheduler.
- When __call__ is passed a metrics_dict with a key equal to self.metric_name, the value of that metric will be monitored for ReduceLROnPlateau (i.e. step(metrics_dict[self.metric_name])).
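A hedged usage sketch wrapping a standard torch scheduler; the trainer wiring (passing the callback via the training script’s phase_callbacks) is assumed and not shown:

```python
import torch
from super_gradients.training.utils.callbacks import LRSchedulerCallback, Phase

params = [torch.nn.Parameter(torch.zeros(1))]  # placeholder parameters
optimizer = torch.optim.SGD(params, lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

# scheduler.step() will be called whenever the chosen phase fires
callback = LRSchedulerCallback(scheduler=scheduler, phase=Phase.TRAIN_EPOCH_END)
```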
- class super_gradients.training.utils.callbacks.MetricsUpdateCallback(phase: super_gradients.training.utils.callbacks.Phase)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
- class super_gradients.training.utils.callbacks.KDModelMetricsUpdateCallback(phase: super_gradients.training.utils.callbacks.Phase)[source]
Bases:
super_gradients.training.utils.callbacks.MetricsUpdateCallback
- class super_gradients.training.utils.callbacks.PhaseContextTestCallback(phase: super_gradients.training.utils.callbacks.Phase)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
A callback that saves the phase context for testing.
- class super_gradients.training.utils.callbacks.DetectionVisualizationCallback(phase: super_gradients.training.utils.callbacks.Phase, freq: int, post_prediction_callback: super_gradients.training.utils.detection_utils.DetectionPostPredictionCallback, classes: list, batch_idx: int = 0, last_img_idx_in_batch: int = - 1)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
A callback that adds a visualization of a batch of detection predictions to context.sg_logger.
- freq
frequency (in epochs) to perform this callback.
- batch_idx
batch index to perform visualization for.
- classes
class list of the dataset.
- last_img_idx_in_batch
Last image index to add to log. (default=-1, will take entire batch).
- class super_gradients.training.utils.callbacks.BinarySegmentationVisualizationCallback(phase: super_gradients.training.utils.callbacks.Phase, freq: int, batch_idx: int = 0, last_img_idx_in_batch: int = - 1)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
A callback that adds a visualization of a batch of segmentation predictions to context.sg_logger.
- freq
frequency (in epochs) to perform this callback.
- batch_idx
batch index to perform visualization for.
- last_img_idx_in_batch
Last image index to add to log. (default=-1, will take entire batch).
- class super_gradients.training.utils.callbacks.TrainingStageSwitchCallbackBase(next_stage_start_epoch: int)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
TrainingStageSwitchCallback
A phase callback that is called at a specific epoch (epoch start) to support multi-stage training. It does so by manipulating the objects inside the context.
- next_stage_start_epoch
int, the epoch idx to apply the stage change.
- apply_stage_change(context: super_gradients.training.utils.callbacks.PhaseContext)[source]
- This method is called when the callback is fired on the next_stage_start_epoch,
and holds the stage change logic that should be applied to the context’s objects.
- Parameters
context – PhaseContext, context of current phase
- class super_gradients.training.utils.callbacks.YoloXTrainingStageSwitchCallback(next_stage_start_epoch: int = 285)[source]
Bases:
super_gradients.training.utils.callbacks.TrainingStageSwitchCallbackBase
Training stage switch for YoloX training. Disables mosaic, and manipulates YoloX loss to use L1.
- apply_stage_change(context: super_gradients.training.utils.callbacks.PhaseContext)[source]
- This method is called when the callback is fired on the next_stage_start_epoch,
and holds the stage change logic that should be applied to the context’s objects.
- Parameters
context – PhaseContext, context of current phase
- class super_gradients.training.utils.callbacks.CallbackHandler(callbacks)[source]
Bases:
object
Runs all callbacks whose phase attribute equals the given phase.
- callbacks
List[PhaseCallback]. Callbacks to be run.
- class super_gradients.training.utils.callbacks.TestLRCallback(lr_placeholder)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
- Phase callback that collects the learning rates in lr_placeholder at the end of each epoch (used for testing). In the case of multiple parameter groups (i.e. multiple learning rates) the learning rate is collected from the first one. The phase is VALIDATION_EPOCH_END to ensure all lr updates have been performed before calling this callback.
super_gradients.training.utils.checkpoint_utils module
- super_gradients.training.utils.checkpoint_utils.get_ckpt_local_path(source_ckpt_folder_name: str, experiment_name: str, ckpt_name: str, model_checkpoints_location: str, external_checkpoint_path: str, overwrite_local_checkpoint: bool, load_weights_only: bool)[source]
- Gets the local path to the checkpoint file, which will be:
By default: YOUR_REPO_ROOT/super_gradients/checkpoints/experiment_name.
- if the checkpoint file is remotely located:
when overwrite_local_checkpoint=True, it will be saved in a temporary path which will be returned; otherwise, it will be downloaded to YOUR_REPO_ROOT/super_gradients/checkpoints/experiment_name and overwrite YOUR_REPO_ROOT/super_gradients/checkpoints/experiment_name/ckpt_name if such a file exists.
external_checkpoint_path when external_checkpoint_path != None
@param source_ckpt_folder_name: The folder where the checkpoint is saved. When set to None, uses the experiment_name. @param experiment_name: experiment name attr in sg_model @param ckpt_name: checkpoint filename @param model_checkpoints_location: S3, local or URL @param external_checkpoint_path: full path to checkpoint file (that might be located outside of super_gradients/checkpoints directory) @param overwrite_local_checkpoint: whether to overwrite the checkpoint file with the same name when downloading from S3. @param load_weights_only: whether to load the network’s state dict only. @return:
- super_gradients.training.utils.checkpoint_utils.adaptive_load_state_dict(net: torch.nn.modules.module.Module, state_dict: dict, strict: str)[source]
Adaptively loads state_dict to net, by adapting the state_dict to net’s layer names first.
@param net: (nn.Module) to load state_dict to @param state_dict: (dict) Checkpoint state_dict @param strict: (str) key matching strictness @return:
- super_gradients.training.utils.checkpoint_utils.read_ckpt_state_dict(ckpt_path: str, device='cpu')[source]
- super_gradients.training.utils.checkpoint_utils.adapt_state_dict_to_fit_model_layer_names(model_state_dict: dict, source_ckpt: dict, exclude: list = [], solver: Optional[callable] = None)[source]
Given a model state dict and source checkpoints, the method tries to correct the keys in the model_state_dict to fit the ckpt in order to properly load the weights into the model. If unsuccessful - returns None
- param model_state_dict
the model state_dict
- param source_ckpt
checkpoint dict
- param exclude
optional list of excluded layers
- param solver
callable with signature (ckpt_key, ckpt_val, model_key, model_val) that returns a desired weight for ckpt_val
- return
renamed checkpoint dict (if possible)
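A hedged sketch of adapting a checkpoint before loading it; the model and checkpoint path below are placeholders:

```python
import torch
from super_gradients.training.utils.checkpoint_utils import (
    adapt_state_dict_to_fit_model_layer_names,
)

my_model = torch.nn.Sequential(torch.nn.Linear(8, 4))   # placeholder model
ckpt = torch.load("ckpt_best.pth", map_location="cpu")  # placeholder path

adapted_ckpt = adapt_state_dict_to_fit_model_layer_names(
    model_state_dict=my_model.state_dict(),
    source_ckpt=ckpt,
)
# per the docstring above, a renamed checkpoint dict is returned on success,
# None otherwise
```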
- super_gradients.training.utils.checkpoint_utils.raise_informative_runtime_error(state_dict, checkpoint, exception_msg)[source]
Given a model state dict and source checkpoints, the method calls “adapt_state_dict_to_fit_model_layer_names” and enhances the exception_msg if loading the checkpoint_dict via the conversion method is possible
- super_gradients.training.utils.checkpoint_utils.load_checkpoint_to_model(ckpt_local_path: str, load_backbone: bool, net: torch.nn.modules.module.Module, strict: str, load_weights_only: bool, load_ema_as_net: bool = False)[source]
Loads the state dict in ckpt_local_path to net and returns the checkpoint’s state dict.
@param load_ema_as_net: Will load the EMA inside the checkpoint file to the network when set @param ckpt_local_path: local path to the checkpoint file @param load_backbone: whether to load the checkpoint as a backbone @param net: network to load the checkpoint to @param strict: @param load_weights_only: @return:
- exception super_gradients.training.utils.checkpoint_utils.MissingPretrainedWeightsException(desc)[source]
Bases:
Exception
Exception raised for an unsupported pretrained model.
- message -- explanation of the error
- super_gradients.training.utils.checkpoint_utils.load_pretrained_weights(model: torch.nn.modules.module.Module, architecture: str, pretrained_weights: str)[source]
Loads pretrained weights from the MODEL_URLS dictionary to model @param architecture: name of the model’s architecture @param model: model to load pretrained weights for @param pretrained_weights: name of the pretrained weights (e.g. imagenet) @return: None
super_gradients.training.utils.detection_utils module
- class super_gradients.training.utils.detection_utils.DetectionTargetsFormat(value)[source]
Bases:
enum.Enum
Enum class for the different detection output formats
When NORMALIZED is not specified, the format refers to unnormalized image coordinates (of the bboxes).
For example: LABEL_NORMALIZED_XYXY means [class_idx,x1,y1,x2,y2]
- LABEL_XYXY = 'LABEL_XYXY'
- XYXY_LABEL = 'XYXY_LABEL'
- LABEL_NORMALIZED_XYXY = 'LABEL_NORMALIZED_XYXY'
- NORMALIZED_XYXY_LABEL = 'NORMALIZED_XYXY_LABEL'
- LABEL_CXCYWH = 'LABEL_CXCYWH'
- CXCYWH_LABEL = 'CXCYWH_LABEL'
- LABEL_NORMALIZED_CXCYWH = 'LABEL_NORMALIZED_CXCYWH'
- NORMALIZED_CXCYWH_LABEL = 'NORMALIZED_CXCYWH_LABEL'
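To make the formats concrete, here is an illustrative sketch of single target rows (all values made up):

```python
import numpy as np

# LABEL_NORMALIZED_XYXY: [class_idx, x1, y1, x2, y2], coordinates in [0, 1]
label_norm_xyxy = np.array([3, 0.10, 0.20, 0.55, 0.80])

# XYXY_LABEL: [x1, y1, x2, y2, class_idx], unnormalized pixel coordinates
xyxy_label = np.array([64.0, 128.0, 352.0, 512.0, 3])
```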
- super_gradients.training.utils.detection_utils.get_cls_posx_in_target(target_format: super_gradients.training.utils.detection_utils.DetectionTargetsFormat) → int[source]
Get the position of the class id in a given target format :param target_format: Representation of the target (ex: LABEL_XYXY) :return: Position of the class id in a bbox
ex: 0 if bbox of format label_xyxy | -1 if bbox of format xyxy_label
- super_gradients.training.utils.detection_utils.convert_xywh_bbox_to_xyxy(input_bbox: torch.Tensor)[source]
- Converts bounding box format from [x, y, w, h] to [x1, y1, x2, y2]
- param input_bbox
input bbox either 2-dimensional (for all boxes of a single image) or 3-dimensional (for boxes of a batch of images)
- return
Converted bbox in same dimensions as the original
- super_gradients.training.utils.detection_utils.calculate_bbox_iou_matrix(box1, box2, x1y1x2y2=True, GIoU: bool = False, DIoU=False, CIoU=False, eps=1e-09)[source]
- calculate an iou matrix containing the iou of every pair iou(i,j), where i is in box1 and j is in box2
- param box1
a 2D tensor of boxes (shape N x 4)
- param box2
a 2D tensor of boxes (shape M x 4)
- param x1y1x2y2
boxes format is x1y1x2y2 (True) or xywh where xy is the center (False)
- return
a 2D iou matrix (shape NxM)
- super_gradients.training.utils.detection_utils.calc_bbox_iou_matrix(pred: torch.Tensor)[source]
calculate iou for every pair of boxes in the boxes vector :param pred: a 3-dimensional tensor containing all boxes for a batch of images [N, num_boxes, 4], where
each box format is [x1,y1,x2,y2]
- Returns
a 3-dimensional matrix where M_i_j_k is the iou of box j and box k of the i’th image in the batch
- super_gradients.training.utils.detection_utils.change_bbox_bounds_for_image_size(boxes, img_shape)[source]
- class super_gradients.training.utils.detection_utils.DetectionPostPredictionCallback[source]
Bases:
abc.ABC
,torch.nn.modules.module.Module
- abstract forward(x, device: str)[source]
- Parameters
x – the output of your model
device – the device to move all output tensors into
- Returns
a list with length batch_size, each item in the list is a detections with shape: nx6 (x1, y1, x2, y2, confidence, class) where x and y are in range [0,1]
- training: bool
- class super_gradients.training.utils.detection_utils.IouThreshold(value)[source]
Bases:
tuple
,enum.Enum
An enumeration.
- MAP_05 = (0.5, 0.5)
- MAP_05_TO_095 = (0.5, 0.95)
- super_gradients.training.utils.detection_utils.box_iou(box1, box2)[source]
Return intersection-over-union (Jaccard index) of boxes. Both sets of boxes are expected to be in (x1, y1, x2, y2) format. :param box1: :type box1: Tensor[N, 4] :param box2: :type box2: Tensor[M, 4]
- Returns
- the NxM matrix containing the pairwise
IoU values for every element in boxes1 and boxes2
- Return type
iou (Tensor[N, M])
- super_gradients.training.utils.detection_utils.non_max_suppression(prediction, conf_thres=0.1, iou_thres=0.6, multi_label_per_box: bool = True, with_confidence: bool = False)[source]
- Performs Non-Maximum Suppression (NMS) on inference results
- param prediction
raw model prediction
- param conf_thres
predictions below the confidence threshold are discarded
- param iou_thres
IoU threshold for the nms algorithm
- param multi_label_per_box
whether to re-use each box with all possible labels (instead of keeping only the maximum-confidence one, all confidences above the threshold will be sent to NMS); set to True by default
- param with_confidence
whether to multiply objectness score with class score. usually valid for Yolo models only.
- return
(x1, y1, x2, y2, object_conf, class_conf, class)
- Returns
detections with shape nx6 (x1, y1, x2, y2, conf, cls)
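A hedged usage sketch with a fake YOLO-style prediction tensor; the [batch, num_predictions, 5 + num_classes] layout with (x, y, w, h, object_conf, class scores…) rows is an assumption borrowed from the matrix NMS docstring below:

```python
import torch
from super_gradients.training.utils.detection_utils import non_max_suppression

num_classes = 80
fake_pred = torch.rand(1, 1000, 5 + num_classes)  # stand-in for real model output

detections = non_max_suppression(fake_pred, conf_thres=0.25, iou_thres=0.6)
# detections is a per-image list; each entry is an nx6 tensor
# (x1, y1, x2, y2, conf, cls), per the docstring above
```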
- super_gradients.training.utils.detection_utils.matrix_non_max_suppression(pred, conf_thres: float = 0.1, kernel: str = 'gaussian', sigma: float = 3.0, max_num_of_detections: int = 500)[source]
- Performs Matrix Non-Maximum Suppression (NMS) on inference results
https://arxiv.org/pdf/1912.04488.pdf :param pred: raw model prediction (in test mode) - a Tensor of shape [batch, num_predictions, 85] where each item format is (x, y, w, h, object_conf, class_conf, … 80 classes score …) :param conf_thres: predictions below the confidence threshold are discarded :param kernel: type of kernel to use [‘gaussian’, ‘linear’] :param sigma: sigma for the gaussian kernel :param max_num_of_detections: maximum number of boxes to output :return: list of (x1, y1, x2, y2, object_conf, class_conf, class)
- Returns
a list of detections, in format (x1, y1, x2, y2, conf, cls)
- class super_gradients.training.utils.detection_utils.NMS_Type(value)[source]
Bases:
str
,enum.Enum
Type of non max suppression algorithm that can be used for post-processing detections
- ITERATIVE = 'iterative'
- MATRIX = 'matrix'
- super_gradients.training.utils.detection_utils.undo_image_preprocessing(im_tensor: torch.Tensor) → numpy.ndarray[source]
- Parameters
im_tensor – images in a batch after preprocessing for inference, RGB, (B, C, H, W)
- Returns
images in a batch in cv2 format, BGR, (B, H, W, C)
- class super_gradients.training.utils.detection_utils.DetectionVisualization[source]
Bases:
object
- static visualize_batch(image_tensor: torch.Tensor, pred_boxes: List[torch.Tensor], target_boxes: torch.Tensor, batch_name: Union[int, str], class_names: List[str], checkpoint_dir: Optional[str] = None, undo_preprocessing_func: Callable[[torch.Tensor], numpy.ndarray] = <function undo_image_preprocessing>, box_thickness: int = 2, image_scale: float = 1.0, gt_alpha: float = 0.4)[source]
A helper function to visualize detections predicted by a network: saves images into a given path with a name that is {batch_name}_{image_idx_in_the_batch}.jpg, one batch per call. Colors are generated on the fly: uniformly sampled from the color wheel to support all given classes.
- Adjustable:
Ground truth box transparency;
Box width;
Image size (larger or smaller than what’s provided)
- Parameters
image_tensor – rgb images, (B, H, W, 3)
pred_boxes – boxes after NMS for each image in a batch, each (Num_boxes, 6), values on dim 1 are: x1, y1, x2, y2, confidence, class
target_boxes – (Num_targets, 6), values on dim 1 are: image id in a batch, class, x y w h (coordinates scaled to [0, 1])
batch_name – id of the current batch to use for image naming
class_names – names of all classes, each on its own index
checkpoint_dir – a path where images with boxes will be saved. If None, the result images will be returned as a list of numpy image arrays
undo_preprocessing_func – a function to convert preprocessed images tensor into a batch of cv2-like images
box_thickness – box line thickness in px
image_scale – scale of an image w.r.t. given image size, e.g. incoming images are (320x320), use scale = 2. to preview in (640x640)
gt_alpha – a value in [0., 1.] transparency on ground truth boxes, 0 for invisible, 1 for fully opaque
- class super_gradients.training.utils.detection_utils.Anchors(anchors_list: List[List], strides: List[int])[source]
Bases:
torch.nn.modules.module.Module
A wrapper module to hold the anchors used by detection models such as Yolo
- property stride: torch.nn.parameter.Parameter
- property anchors: torch.nn.parameter.Parameter
- property anchor_grid: torch.nn.parameter.Parameter
- property detection_layers_num: int
- property num_anchors: int
- training: bool
- super_gradients.training.utils.detection_utils.xyxy2cxcywh(bboxes)[source]
Transforms bboxes from xyxy format to centered xy-wh (cxcywh) format :param bboxes: array, shaped (nboxes, 4) :return: modified bboxes
- super_gradients.training.utils.detection_utils.cxcywh2xyxy(bboxes)[source]
Transforms bboxes from centered xy-wh (cxcywh) format to xyxy format :param bboxes: array, shaped (nboxes, 4) :return: modified bboxes
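The two conversions are simple coordinate arithmetic; a minimal numpy sketch (the library functions operate on (nboxes, 4) arrays/tensors):

```python
import numpy as np

def xyxy_to_cxcywh(b):
    x1, y1, x2, y2 = b.T
    return np.stack([(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1], axis=1)

def cxcywh_to_xyxy(b):
    cx, cy, w, h = b.T
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)

boxes = np.array([[10.0, 20.0, 50.0, 80.0]])  # xyxy
assert np.allclose(cxcywh_to_xyxy(xyxy_to_cxcywh(boxes)), boxes)
```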
- super_gradients.training.utils.detection_utils.get_mosaic_coordinate(mosaic_index, xc, yc, w, h, input_h, input_w)[source]
Returns the mosaic coordinates of final mosaic image according to mosaic image index.
- Parameters
mosaic_index – (int) mosaic image index
xc – (int) center x coordinate of the entire mosaic grid.
yc – (int) center y coordinate of the entire mosaic grid.
w – (int) width of bbox
h – (int) height of bbox
input_h – (int) image input height (should be 1/2 of the final mosaic output image height).
input_w – (int) image input width (should be 1/2 of the final mosaic output image width).
- Returns
(x1, y1, x2, y2), (x1s, y1s, x2s, y2s) where (x1, y1, x2, y2) are the coordinates in the final mosaic output image, and (x1s, y1s, x2s, y2s) are the coordinates in the placed image.
- super_gradients.training.utils.detection_utils.adjust_box_anns(bbox, scale_ratio, padw, padh, w_max, h_max)[source]
Adjusts the bbox annotations of a rescaled, padded image.
- Parameters
bbox – (np.array) bbox to modify.
scale_ratio – (float) scale ratio between rescale output image and original one.
padw – (int) width padding size.
padh – (int) height padding size.
w_max – (int) width border.
h_max – (int) height border
- Returns
modified bbox (np.array)
- class super_gradients.training.utils.detection_utils.DetectionCollateFN[source]
Bases:
object
Collate function for Yolox training
- class super_gradients.training.utils.detection_utils.CrowdDetectionCollateFN[source]
Bases:
super_gradients.training.utils.detection_utils.DetectionCollateFN
Collate function for Yolox training with additional_batch_items that includes crowd targets
- super_gradients.training.utils.detection_utils.compute_box_area(box: torch.Tensor) → torch.Tensor[source]
- Compute the area of one or many boxes.
- param box
One or many boxes, shape = (4, ?), each box in format (x1, y1, x2, y2)
- Returns
Area of every box, shape = (1, ?)
- super_gradients.training.utils.detection_utils.crowd_ioa(det_box: torch.Tensor, crowd_box: torch.Tensor) → torch.Tensor[source]
Return intersection-over-detection_area of boxes, used for crowd ground truths. Both sets of boxes are expected to be in (x1, y1, x2, y2) format. :param det_box: :type det_box: Tensor[N, 4] :param crowd_box: :type crowd_box: Tensor[M, 4]
- Returns
- the NxM matrix containing the pairwise
IoA values for every element in det_box and crowd_box
- Return type
crowd_ioa (Tensor[N, M])
- super_gradients.training.utils.detection_utils.compute_detection_matching(output: torch.Tensor, targets: torch.Tensor, height: int, width: int, iou_thresholds: torch.Tensor, denormalize_targets: bool, device: str, crowd_targets: Optional[torch.Tensor] = None, top_k: int = 100, return_on_cpu: bool = True) → List[Tuple][source]
Match predictions (NMS output) and the targets (ground truth) with respect to IoU and confidence score. :param output: list (of length batch_size) of Tensors of shape (num_predictions, 6)
format: (x1, y1, x2, y2, confidence, class_label) where x1,y1,x2,y2 are according to image size
- Parameters
targets – targets for all images of shape (total_num_targets, 6) format: (index, x, y, w, h, label) where x,y,w,h are in range [0,1]
height – dimensions of the image
width – dimensions of the image
iou_thresholds – Threshold to compute the mAP
device – Device
crowd_targets – crowd targets for all images of shape (total_num_crowd_targets, 6) format: (index, x, y, w, h, label) where x,y,w,h are in range [0,1]
top_k – Number of predictions to keep per class, ordered by confidence score
denormalize_targets – If True, denormalize the targets and crowd_targets
return_on_cpu – If True, the output will be returned on “CPU”, otherwise it will be returned on “device”
- Returns
list of the following tensors, for every image:
- preds_matched
Tensor of shape (num_img_predictions, n_iou_thresholds) True when prediction (i) is matched with a target with respect to the (j)th IoU threshold
- preds_to_ignore
Tensor of shape (num_img_predictions, n_iou_thresholds) True when prediction (i) is matched with a crowd target with respect to the (j)th IoU threshold
- preds_scores
Tensor of shape (num_img_predictions), confidence score for every prediction
- preds_cls
Tensor of shape (num_img_predictions), predicted class for every prediction
- targets_cls
Tensor of shape (num_img_targets), ground truth class for every target
- super_gradients.training.utils.detection_utils.compute_img_detection_matching(preds: torch.Tensor, targets: torch.Tensor, crowd_targets: torch.Tensor, height: int, width: int, iou_thresholds: torch.Tensor, device: str, denormalize_targets: bool, top_k: int = 100, return_on_cpu: bool = True) → Tuple[source]
Match predictions (NMS output) and the targets (ground truth) with respect to IoU and confidence score for a given image. :param preds: Tensor of shape (num_img_predictions, 6)
format: (x1, y1, x2, y2, confidence, class_label) where x1,y1,x2,y2 are according to image size
- Parameters
targets – targets for this image of shape (num_img_targets, 6) format: (index, x, y, w, h, label) where x,y,w,h are in range [0,1]
height – dimensions of the image
width – dimensions of the image
iou_thresholds – Threshold to compute the mAP
crowd_targets – crowd targets for all images of shape (total_num_crowd_targets, 6) format: (index, x, y, w, h, label) where x,y,w,h are in range [0,1]
top_k – Number of predictions to keep per class, ordered by confidence score
device – Device
denormalize_targets – If True, denormalize the targets and crowd_targets
return_on_cpu – If True, the output will be returned on “CPU”, otherwise it will be returned on “device”
- Returns
- preds_matched
Tensor of shape (num_img_predictions, n_iou_thresholds) True when prediction (i) is matched with a target with respect to the (j)th IoU threshold
- preds_to_ignore
Tensor of shape (num_img_predictions, n_iou_thresholds) True when prediction (i) is matched with a crowd target with respect to the (j)th IoU threshold
- preds_scores
Tensor of shape (num_img_predictions), confidence score for every prediction
- preds_cls
Tensor of shape (num_img_predictions), predicted class for every prediction
- targets_cls
Tensor of shape (num_img_targets), ground truth class for every target
- super_gradients.training.utils.detection_utils.get_top_k_idx_per_cls(preds_scores: torch.Tensor, preds_cls: torch.Tensor, top_k: int)[source]
Get the indexes of all the top k predictions for every class
- Parameters
preds_scores – The confidence scores, vector of shape (n_pred)
preds_cls – The predicted class, vector of shape (n_pred)
top_k – Number of predictions to keep per class, ordered by confidence score
- Return top_k_idx
Indexes of the top k predictions. length <= (k * n_unique_class)
- super_gradients.training.utils.detection_utils.compute_detection_metrics(preds_matched: torch.Tensor, preds_to_ignore: torch.Tensor, preds_scores: torch.Tensor, preds_cls: torch.Tensor, targets_cls: torch.Tensor, device: str, recall_thresholds: Optional[torch.Tensor] = None, score_threshold: Optional[float] = 0.1) → Tuple[source]
Compute the list of precision, recall, mAP and f1 for every recall IoU threshold and for every class.
- Parameters
preds_matched – Tensor of shape (num_predictions, n_iou_thresholds), True when prediction (i) is matched with a target with respect to the (j)th IoU threshold
preds_to_ignore – Tensor of shape (num_predictions, n_iou_thresholds), True when prediction (i) is matched with a crowd target with respect to the (j)th IoU threshold
preds_scores – Tensor of shape (num_predictions), confidence score for every prediction
preds_cls – Tensor of shape (num_predictions), predicted class for every prediction
targets_cls – Tensor of shape (num_targets), ground truth class for every target box to be detected
recall_thresholds – Recall thresholds used to compute mAP.
score_threshold – Minimum confidence score to consider a prediction for the computation of precision, recall and f1 (not mAP)
device – Device
- Returns
- ap, precision, recall, f1
Tensors of shape (n_class, nb_iou_thrs)
- unique_classes
Vector with all unique target classes
- super_gradients.training.utils.detection_utils.compute_detection_metrics_per_cls(preds_matched: torch.Tensor, preds_to_ignore: torch.Tensor, preds_scores: torch.Tensor, n_targets: int, recall_thresholds: torch.Tensor, score_threshold: float, device: str)[source]
Compute the list of precision, recall and mAP of a given class for every recall IoU threshold.
- param preds_matched
Tensor of shape (num_predictions, n_iou_thresholds), True when prediction (i) is matched with a target with respect to the (j)th IoU threshold
- param preds_to_ignore
Tensor of shape (num_predictions, n_iou_thresholds), True when prediction (i) is matched with a crowd target with respect to the (j)th IoU threshold
- param preds_scores
Tensor of shape (num_predictions), confidence score for every prediction
- param n_targets
Number of target boxes of this class
- param recall_thresholds
Tensor of shape (max_n_rec_thresh), list of recall thresholds used to compute mAP
- param score_threshold
Minimum confidence score to consider a prediction for the computation of precision and recall (not mAP)
- param device
Device
- return ap, precision, recall
Tensors of shape (nb_iou_thrs)
super_gradients.training.utils.distributed_training_utils module
- super_gradients.training.utils.distributed_training_utils.distributed_all_reduce_tensor_average(tensor, n)[source]
This method performs a reduce operation on multiple nodes running distributed training. It first sums all of the results and then divides the summation by n :param tensor: The tensor to perform the reduce operation for :param n: Number of nodes :return: Averaged tensor from all of the nodes
- super_gradients.training.utils.distributed_training_utils.reduce_results_tuple_for_ddp(validation_results_tuple, device)[source]
Gather all validation tuples from the various devices and average them
- class super_gradients.training.utils.distributed_training_utils.MultiGPUModeAutocastWrapper(func)[source]
Bases:
object
- super_gradients.training.utils.distributed_training_utils.scaled_all_reduce(tensors: torch.Tensor, num_gpus: int)[source]
Performs the scaled all_reduce operation on the provided tensors. The input tensors are modified in-place. Currently supports only the sum reduction operator. The reduced values are scaled by the inverse size of the process group (equivalent to num_gpus).
- super_gradients.training.utils.distributed_training_utils.compute_precise_bn_stats(model: torch.nn.modules.module.Module, loader: torch.utils.data.dataloader.DataLoader, precise_bn_batch_size: int, num_gpus: int)[source]
- Parameters
model – The model being trained (ie: SgModel.net)
loader – Training dataloader (ie: SgModel.train_loader)
precise_bn_batch_size – The effective batch size we want to calculate the batchnorm on. For example, if we are training a model on 8 gpus, with a batch of 128 on each gpu, a good rule of thumb would be to give it 8192 (ie: effective_batch_size * num_gpus = batch_per_gpu * num_gpus * num_gpus). If precise_bn_batch_size is not provided in the training_params, the latter heuristic will be taken.
num_gpus – The number of gpus we are training on
- super_gradients.training.utils.distributed_training_utils.get_local_rank()[source]
Returns the local rank if running in DDP, and 0 otherwise :return: local rank
super_gradients.training.utils.early_stopping module
- class super_gradients.training.utils.early_stopping.EarlyStop(phase: super_gradients.training.utils.callbacks.Phase, monitor: str, mode: str = 'min', min_delta: float = 0.0, patience: int = 3, check_finite: bool = True, threshold: Optional[float] = None, verbose: bool = False, strict: bool = True)[source]
Bases:
super_gradients.training.utils.callbacks.PhaseCallback
Callback to monitor a metric and stop training when it stops improving. Inspired by pytorch_lightning.callbacks.early_stopping and tf.keras.callbacks.EarlyStopping
- mode_dict = {'max': <built-in method gt of type object>, 'min': <built-in method lt of type object>}
- supported_phases = (<Phase.VALIDATION_EPOCH_END: 'VALIDATION_EPOCH_END'>, <Phase.TRAIN_EPOCH_END: 'TRAIN_EPOCH_END'>)
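A hedged usage sketch; the metric name is an assumption (it must match a key in the phase context’s metrics_dict), and passing the callback through the training script’s phase_callbacks is assumed and not shown:

```python
from super_gradients.training.utils.callbacks import Phase
from super_gradients.training.utils.early_stopping import EarlyStop

early_stop = EarlyStop(
    phase=Phase.VALIDATION_EPOCH_END,  # one of the supported_phases above
    monitor="loss",                    # assumed metric/loss name
    mode="min",
    patience=5,
    verbose=True,
)
```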
super_gradients.training.utils.ema module
- super_gradients.training.utils.ema.copy_attr(a: torch.nn.modules.module.Module, b: torch.nn.modules.module.Module, include: Union[list, tuple] = (), exclude: Union[list, tuple] = ())[source]
- class super_gradients.training.utils.ema.ModelEMA(model, decay: float = 0.9999, beta: float = 15, exp_activation: bool = True)[source]
Bases:
object
Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models Keep a moving average of everything in the model state_dict (parameters and buffers). This is intended to allow functionality like https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage A smoothed version of the weights is necessary for some training schemes to perform well. This class is sensitive to where it is initialized in the sequence of model init, GPU assignment and distributed training wrappers.
- update(model, training_percent: float)[source]
Update the state of the EMA model. :param model: current training model :param training_percent: the percentage of the training process [0,1], i.e. 0.4 means 40% of the training has passed
- update_attr(model)[source]
This function updates model attributes (not weights and biases) from the original model to the ema model. Attributes of the original model, such as anchors and grids (of detection models), may be crucial to the model operation and need to be updated. If include_attributes and exclude_attributes lists were not defined, all non-private (not starting with ‘_’) attributes will be updated (and only them). :param model: the source model
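A hedged sketch of maintaining an EMA copy inside a manual training loop (SgModel normally manages this internally; the plain Linear model is a placeholder):

```python
import torch
from super_gradients.training.utils.ema import ModelEMA

model = torch.nn.Linear(10, 2)  # placeholder model
ema = ModelEMA(model, decay=0.9999, beta=15, exp_activation=True)

total_steps = 100
for step in range(total_steps):
    # ... forward/backward/optimizer step on `model` would go here ...
    ema.update(model, training_percent=step / total_steps)

# ema.ema is expected to hold the smoothed copy of the model
```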
- class super_gradients.training.utils.ema.KDModelEMA(kd_model: super_gradients.training.models.kd_modules.kd_module.KDModule, decay: float = 0.9999, beta: float = 15, exp_activation: bool = True)[source]
Bases:
super_gradients.training.utils.ema.ModelEMA
Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models Keep a moving average of everything in the model state_dict (parameters and buffers). This is intended to allow functionality like https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage A smoothed version of the weights is necessary for some training schemes to perform well. This class is sensitive to where it is initialized in the sequence of model init, GPU assignment and distributed training wrappers.
super_gradients.training.utils.export_utils module
- super_gradients.training.utils.export_utils.fuse_conv_bn(model: torch.nn.modules.module.Module, replace_bn_with_identity: bool = False)[source]
Fuses consecutive nn.Conv2d and nn.BatchNorm2d layers recursively, in-place, throughout the model :param replace_bn_with_identity: if set to True, bn will be replaced with identity; otherwise, bn will be removed :param model: the target model :return: the number of fuses executed
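A hedged usage sketch: in eval mode, folding BN into the preceding conv should leave the outputs unchanged up to float error:

```python
import torch
from super_gradients.training.utils.export_utils import fuse_conv_bn

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, padding=1, bias=False),
    torch.nn.BatchNorm2d(8),
).eval()

x = torch.rand(1, 3, 32, 32)
y_before = model(x)
fuse_conv_bn(model, replace_bn_with_identity=True)  # keeps module indices valid
y_after = model(x)
print(torch.allclose(y_before, y_after, atol=1e-5))  # expected: True
```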
super_gradients.training.utils.get_model_stats module
super_gradients.training.utils.module_utils module
- class super_gradients.training.utils.module_utils.MultiOutputModule(module: torch.nn.modules.module.Module, output_paths: list, prune: bool = True)[source]
Bases:
torch.nn.modules.module.Module
This module wraps around a container nn.Module (such as Module, Sequential and ModuleList) and allows extracting multiple outputs from its inner modules on each forward() call (as a list of output tensors). Note: the default output of the wrapped module is not included in the outputs list; to get it, explicitly include its path in the @output_paths parameter
i.e. for module:
Sequential(
  (0): Sequential(
    (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU6(inplace=True)
  ) ===================================>>
  (1): InvertedResidual(
    (conv): Sequential(
      (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU6(inplace=True) ===================================>>
      (3): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (4): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
)
and paths: [0, [1, 'conv', 2]]
the outputs are marked with arrows
- forward(x) → list[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- super_gradients.training.utils.module_utils.replace_activations(module: torch.nn.modules.module.Module, new_activation: torch.nn.modules.module.Module, activations_to_replace: List[type])[source]
Recursively goes through the module and replaces each activation in activations_to_replace with a copy of new_activation :param module: a module that will be changed inplace :param new_activation: a sample of a new activation (will be copied) :param activations_to_replace: types of activations to replace, each must be a subclass of nn.Module
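A hedged usage sketch replacing every ReLU with a LeakyReLU copy:

```python
import torch.nn as nn
from super_gradients.training.utils.module_utils import replace_activations

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 8, 3), nn.ReLU())
replace_activations(model, new_activation=nn.LeakyReLU(0.1),
                    activations_to_replace=[nn.ReLU])
print(model)  # both ReLU modules are now LeakyReLU(0.1) copies
```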
- super_gradients.training.utils.module_utils.fuse_repvgg_blocks_residual_branches(model: torch.nn.modules.module.Module)[source]
Call fuse_block_residual_branches for all repvgg blocks in the model :param model: torch.nn.Module with repvgg blocks. Doesn’t have to consist entirely of repvgg blocks. :type model: torch.nn.Module
- class super_gradients.training.utils.module_utils.ConvBNReLU(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int]] = 1, padding: Union[int, Tuple[int, int]] = 0, dilation: Union[int, Tuple[int, int]] = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', use_normalization: bool = True, eps: float = 1e-05, momentum: float = 0.1, affine: bool = True, track_running_stats: bool = True, device=None, dtype=None, use_activation: bool = True, inplace: bool = False)[source]
Bases:
torch.nn.modules.module.Module
- Class for Convolution2d-Batchnorm2d-Relu layer. Default behaviour is Conv-BN-Relu. To exclude Batchnorm module use
use_normalization=False, to exclude Relu activation use use_activation=False.
For convolution arguments documentation see nn.Conv2d. For batchnorm arguments documentation see nn.BatchNorm2d. For relu arguments documentation see nn.ReLU.
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class super_gradients.training.utils.module_utils.NormalizationAdapter(mean_original, std_original, mean_required, std_required)[source]
Bases:
torch.nn.modules.module.Module
Denormalizes input by mean_original, std_original, then normalizes by mean_required, std_required.
Used in KD training where teacher expects data normalized by mean_required, std_required.
- mean_original, std_original, mean_required, std_required are all list-like objects of length equal to the number of input channels.
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
super_gradients.training.utils.optimizer_utils module
- super_gradients.training.utils.optimizer_utils.separate_zero_wd_params_groups_for_optimizer(module: torch.nn.modules.module.Module, net_named_params, weight_decay: float)[source]
- Separates param groups for batchnorm and biases from the others, returning a list of param groups in the format required by torch Optimizer classes.
- bias + BN with weight decay=0, and the rest with the given weight decay
- param module
train net module.
- param net_named_params
list of params groups, output of SgModule.initialize_param_groups
- param weight_decay
value to set for the non BN and bias parameters
super_gradients.training.utils.regularization_utils module
- class super_gradients.training.utils.regularization_utils.DropPath(drop_prob=None)[source]
Bases:
torch.nn.modules.module.Module
Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
Code taken from TIMM (https://github.com/rwightman/pytorch-image-models) Apache License 2.0
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
super_gradients.training.utils.segmentation_utils module
- super_gradients.training.utils.segmentation_utils.to_one_hot(target: torch.Tensor, num_classes: int, ignore_index: Optional[int] = None)[source]
Target label to one_hot tensor. labels and ignore_index must be consecutive numbers. :param target: Class labels long tensor, with shape [N, H, W] :param num_classes: num of classes in datasets excluding ignore label, this is the output channels of the one hot
result.
- Returns
one hot tensor with shape [N, num_classes, H, W]
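A hedged usage sketch of the shape contract described above:

```python
import torch
from super_gradients.training.utils.segmentation_utils import to_one_hot

target = torch.randint(0, 3, (2, 4, 4))      # [N, H, W] labels, 3 classes
one_hot = to_one_hot(target, num_classes=3)  # [N, num_classes, H, W]
print(one_hot.shape)                         # torch.Size([2, 3, 4, 4])
```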
- super_gradients.training.utils.segmentation_utils.reverse_imagenet_preprocessing(im_tensor: torch.Tensor) → numpy.ndarray[source]
- Parameters
im_tensor – images in a batch after preprocessing for inference, RGB, (B, C, H, W)
- Returns
images in a batch in cv2 format, BGR, (B, H, W, C)
- class super_gradients.training.utils.segmentation_utils.BinarySegmentationVisualization[source]
Bases:
object
- static visualize_batch(image_tensor: torch.Tensor, pred_mask: torch.Tensor, target_mask: torch.Tensor, batch_name: Union[int, str], checkpoint_dir: Optional[str] = None, undo_preprocessing_func: Callable[[torch.Tensor], numpy.ndarray] = <function reverse_imagenet_preprocessing>, image_scale: float = 1.0)[source]
A helper function to visualize segmentation masks predicted by a network: saves images into a given path with a name that is {batch_name}_{image_idx_in_the_batch}.jpg, one batch per call. Colors are generated on the fly: uniformly sampled from the color wheel to support all given classes.
- Parameters
image_tensor – rgb images, (B, H, W, 3)
pred_mask – predicted segmentation masks for each image in the batch
target_mask – ground truth segmentation masks
batch_name – id of the current batch to use for image naming
checkpoint_dir – a path where images with boxes will be saved. If None, the result images will be returned as a list of numpy image arrays
undo_preprocessing_func – a function to convert preprocessed images tensor into a batch of cv2-like images
image_scale – scale factor for output image
- super_gradients.training.utils.segmentation_utils.visualize_batches(dataloader, module, visualization_path, num_batches=1, undo_preprocessing_func=None)[source]
- super_gradients.training.utils.segmentation_utils.one_hot_to_binary_edge(x: torch.Tensor, kernel_size: int, flatten_channels: bool = True) → torch.Tensor[source]
Utils function to create edge feature maps. :param x: input tensor, must be a one_hot tensor with shape [B, C, H, W] :param kernel_size: kernel size of dilation erosion convolutions. The resulting edge width depends on this argument as
follows: edge_width = kernel - 1
- Parameters
flatten_channels – Whether to apply logical_or across the channels dimension: if at least one pixel class is considered an edge pixel, the flattened value is 1. If set to False the output tensor shape is [B, C, H, W], else [B, 1, H, W]. Default is True.
- Returns
one_hot edge torch.Tensor.
- super_gradients.training.utils.segmentation_utils.target_to_binary_edge(target: torch.Tensor, num_classes: int, kernel_size: int, ignore_index: Optional[int] = None, flatten_channels: bool = True) → torch.Tensor[source]
Utils function to create edge feature maps from target. :param target: Class labels long tensor, with shape [N, H, W] :param num_classes: num of classes in datasets excluding ignore label, this is the output channels of the one hot
result.
- Parameters
kernel_size – kernel size of dilation erosion convolutions. The resulting edge width depends on this argument as follows: edge_width = kernel - 1
flatten_channels – Whether to apply logical_or across the channels dimension: if at least one pixel class is considered an edge pixel, the flattened value is 1. If set to False the output tensor shape is [B, C, H, W], else [B, 1, H, W]. Default is True.
- Returns
one_hot edge torch.Tensor.
super_gradients.training.utils.sg_model_utils module
- class super_gradients.training.utils.sg_model_utils.MonitoredValue(name: str, greater_is_better: bool, current: Optional[float] = None, previous: Optional[float] = None, best: Optional[float] = None, change_from_previous: Optional[float] = None, change_from_best: Optional[float] = None)[source]
Bases:
object
Store a value and some indicators relative to its past iterations.
The value can be a metric/loss, and the iterations can be epochs/batches.
- name: str
- greater_is_better: bool
- current: float = None
- previous: float = None
- best: float = None
- change_from_previous: float = None
- change_from_best: float = None
- property is_better_than_previous
- property is_best_value
- super_gradients.training.utils.sg_model_utils.update_monitored_value(previous_monitored_value: super_gradients.training.utils.sg_model_utils.MonitoredValue, new_value: float) → super_gradients.training.utils.sg_model_utils.MonitoredValue[source]
Update the given MonitoredValue object (could be a loss or a metric) with the new value
- Parameters
previous_monitored_value – The stats about the value that is monitored throughout epochs.
new_value – The value of the current epoch that will be used to update previous_monitored_value
- Returns
- super_gradients.training.utils.sg_model_utils.update_monitored_values_dict(monitored_values_dict: Dict[str, super_gradients.training.utils.sg_model_utils.MonitoredValue], new_values_dict: Dict[str, float]) → Dict[str, super_gradients.training.utils.sg_model_utils.MonitoredValue][source]
Update the given dict of MonitoredValue objects (could be losses or metrics) with the new values
- Parameters
monitored_values_dict – Dict mapping value names to their stats throughout epochs.
new_values_dict – Dict mapping value names to their new (i.e. current epoch) value.
- Returns
Updated monitored_values_dict
- super_gradients.training.utils.sg_model_utils.display_epoch_summary(epoch: int, n_digits: int, train_monitored_values: Dict[str, super_gradients.training.utils.sg_model_utils.MonitoredValue], valid_monitored_values: Dict[str, super_gradients.training.utils.sg_model_utils.MonitoredValue]) → None[source]
Display a summary of loss/metric of interest, for a given epoch.
- Parameters
epoch – the number of epoch.
n_digits – number of digits to display on screen for float values
train_monitored_values – mapping of loss/metric with their stats that will be displayed
valid_monitored_values – mapping of loss/metric with their stats that will be displayed
- Returns
- super_gradients.training.utils.sg_model_utils.try_port(port)[source]
try_port - Helper method for tensorboard port binding :param port: :return:
- super_gradients.training.utils.sg_model_utils.launch_tensorboard_process(checkpoints_dir_path: str, sleep_postpone: bool = True, port: Optional[int] = None) → Tuple[multiprocessing.context.Process, int][source]
- launch_tensorboard_process - Default behavior is to scan all free ports from 6006-6016 and try using them, unless a port is defined by the user
- param checkpoints_dir_path
- param sleep_postpone
- param port
- return
tuple of tb process, port
- super_gradients.training.utils.sg_model_utils.init_summary_writer(tb_dir, checkpoint_loaded, user_prompt=False)[source]
Remove previous tensorboard files from directory and launch a tensorboard process
- super_gradients.training.utils.sg_model_utils.add_log_to_file(filename, results_titles_list, results_values_list, epoch, max_epochs)[source]
Add a message to the log file
- super_gradients.training.utils.sg_model_utils.write_training_results(writer, results_titles_list, results_values_list, epoch)[source]
Stores the training and validation loss and accuracy for current epoch in a tensorboard file
- super_gradients.training.utils.sg_model_utils.write_hpms(writer, hpmstructs=[], special_conf={})[source]
Stores the training and dataset hyper params in the tensorboard file
- super_gradients.training.utils.sg_model_utils.unpack_batch_items(batch_items: Union[tuple, torch.Tensor])[source]
Adds support for unpacking batch items in train/validation loop.
- @param batch_items: (Union[tuple, torch.Tensor]) returned by the data loader, which is expected to be in one of
- the following formats:
torch.Tensor or tuple, s.t. inputs = batch_items[0], targets = batch_items[1] and len(batch_items) = 2
tuple: (inputs, targets, additional_batch_items)
where inputs are fed to the network, targets are their corresponding labels and additional_batch_items is a dictionary (format {additional_batch_item_i_name: additional_batch_item_i …}) which can be accessed through the phase context under the attribute additional_batch_item_i_name, using a phase callback.
@return: inputs, target, additional_batch_items
super_gradients.training.utils.ssd_utils module
- class super_gradients.training.utils.ssd_utils.DefaultBoxes(fig_size: int, feat_size: List[int], scales: List[int], aspect_ratios: List[List[int]], scale_xy=0.1, scale_wh=0.2)[source]
Bases:
object
Default Boxes, (aka: anchor boxes or priors boxes) used by SSD model
- property scale_xy
- property scale_wh
- class super_gradients.training.utils.ssd_utils.SSDPostPredictCallback(conf: float = 0.001, iou: float = 0.6, classes: Optional[list] = None, max_predictions: int = 300, nms_type: super_gradients.training.utils.detection_utils.NMS_Type = <NMS_Type.ITERATIVE: 'iterative'>, multi_label_per_box=True)[source]
Bases:
super_gradients.training.utils.detection_utils.DetectionPostPredictionCallback
Post-prediction callback module to convert and filter predictions coming from the SSD net to a format used by all other detection models
- forward(predictions, device=None)[source]
- Parameters
x – the output of your model
device – the device to move all output tensors into
- Returns
a list with length batch_size, each item in the list is a detections with shape: nx6 (x1, y1, x2, y2, confidence, class) where x and y are in range [0,1]
- training: bool
super_gradients.training.utils.utils module
- super_gradients.training.utils.utils.convert_to_tensor(array)[source]
Converts numpy arrays and lists to Torch tensors before calculating losses :param array: torch.tensor / Numpy array / List
- class super_gradients.training.utils.utils.WrappedModel(module)[source]
Bases:
torch.nn.modules.module.Module
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class super_gradients.training.utils.utils.Timer(device: str)[source]
Bases:
object
A class to measure time, handling both GPU & CPU processes. Returns time in milliseconds
- class super_gradients.training.utils.utils.AverageMeter[source]
Bases:
object
A class to calculate the average of a metric, for each batch during training/testing
- property average
- super_gradients.training.utils.utils.tensor_container_to_device(obj: Union[torch.Tensor, tuple, list, dict], device: str, non_blocking=True)[source]
- recursively send compounded objects to device (sending all tensors to device and maintaining structure)
:param obj: the object to send to device (list / tuple / tensor / dict) :param device: device to send the tensors to :param non_blocking: used for DistributedDataParallel :returns: an object with the same structure (tensors, lists, tuples) with the device pointers (like the return value of Tensor.to(device))
- super_gradients.training.utils.utils.get_param(params, name, default_val=None)[source]
Retrieves a param from a parameter object/dict. If the parameter does not exist, returns default_val. If default_val is a dictionary and a value is found in the params, the function returns the default dictionary with its internal values overridden by the found value.
For example, given default_opt_params = {'lr': 0.1, 'momentum': 0.99, 'alpha': 0.001} and training_params = {'optimizer_params': {'lr': 0.0001}, 'batch': 32, ...}, get_param(training_params, name='optimizer_params', default_val=default_opt_params) will return {'lr': 0.0001, 'momentum': 0.99, 'alpha': 0.001}.
- Parameters
params – an object (typically HpmStruct) or a dict holding the params
name – name of the searched parameter
default_val – assumed to be the same type as the value searched in the params
- Returns
the found value, or default if not found
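The documented merge behavior, shown as a runnable snippet:

    from super_gradients.training.utils import get_param

    default_opt_params = {"lr": 0.1, "momentum": 0.99, "alpha": 0.001}
    training_params = {"optimizer_params": {"lr": 0.0001}, "batch": 32}

    # The found value overrides matching keys of the dict default.
    merged = get_param(training_params, name="optimizer_params",
                       default_val=default_opt_params)
    # merged == {"lr": 0.0001, "momentum": 0.99, "alpha": 0.001}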
- super_gradients.training.utils.utils.move_state_dict_to_device(model_sd, device)[source]
Moves model state dict tensors to the target device (cuda or cpu).
- Parameters
model_sd – model state dict
device – either cuda or cpu
- super_gradients.training.utils.utils.random_seed(is_ddp, device, seed)[source]
Sets the random seed of numpy, torch and random.
When using DDP, a seed will be set for each process according to its local rank, derived from the device number.
- Parameters
is_ddp – bool, will set a different random seed for each process when using ddp
device – 'cuda', 'cpu', 'cuda:<device_number>'
seed – int, random seed to be set
- super_gradients.training.utils.utils.load_func(dotpath: str)[source]
Loads a function from a module; the function is the right-most segment of the dot-path.
Used for passing functions (without calling them) in yaml files.
- Parameters
dotpath – dot-path to the module and function
- Returns
a python function
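For illustration, using a standard-library function as the target:

    from super_gradients.training.utils.utils import load_func

    # "os.path.join": module "os.path", function "join" (the right-most segment)
    join = load_func("os.path.join")
    print(join("a", "b"))  # a/b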
- super_gradients.training.utils.utils.get_filename_suffix_by_framework(framework: str)[source]
Returns the file extension for the given framework.
- Parameters
framework – (str)
- Returns
(str) the suffix for the specific framework
- super_gradients.training.utils.utils.check_models_have_same_weights(model_1: torch.nn.modules.module.Module, model_2: torch.nn.modules.module.Module)[source]
Checks whether two networks have the same weights.
- Parameters
model_1 – Net to be checked
model_2 – Net to be checked
- Returns
True iff the two networks have the same weights
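A quick illustrative check with toy modules:

    import copy
    import torch.nn as nn
    from super_gradients.training.utils.utils import check_models_have_same_weights

    model_a = nn.Linear(4, 2)
    model_b = copy.deepcopy(model_a)  # identical weights
    model_c = nn.Linear(4, 2)         # fresh random init

    print(check_models_have_same_weights(model_a, model_b))  # True
    print(check_models_have_same_weights(model_a, model_c))  # False (almost surely)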
- super_gradients.training.utils.utils.download_and_unzip_from_url(url, dir='.', unzip=True, delete=True)[source]
Downloads a zip file from url to dir, and unzips it.
- Parameters
url – Url to download the file from.
dir – Destination directory.
unzip – Whether to unzip the downloaded file.
delete – Whether to delete the zip file.
Used, for example, to download VOC.
Source: https://github.com/ultralytics/yolov5/blob/master/data/VOC.yaml
- super_gradients.training.utils.utils.download_and_untar_from_url(urls: List[str], dir: Union[str, pathlib.Path] = '.')[source]
Downloads files from the given urls and untars them.
- Parameters
urls – List of urls to download the files from.
dir – Destination directory.
- super_gradients.training.utils.utils.make_divisible(x: int, divisor: int, ceil: bool = True) → int[source]
Returns x rounded to be evenly divisible by divisor. If ceil=True, it returns the closest larger number to the original x; if ceil=False, the closest smaller number.
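A worked example of both rounding modes:

    from super_gradients.training.utils.utils import make_divisible

    print(make_divisible(27, 8))              # 32: next multiple of 8 above 27
    print(make_divisible(27, 8, ceil=False))  # 24: nearest multiple of 8 below 27
    print(make_divisible(32, 8))              # 32: already divisible, unchanged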
- super_gradients.training.utils.utils.check_img_size_divisibility(img_size: int, stride: int = 32) → Tuple[bool, Optional[Tuple[int, int]]][source]
- Parameters
img_size – Int, the size of the image (H or W).
stride – Int, the number to check if img_size is divisible by.
- Returns
(True, None) if img_size is divisible by stride, (False, suggestions) if it is not. Note: suggestions are the two closest numbers to img_size that are divisible by stride. For example, if img_size=321 and stride=32, it will return (False, (352, 320)).
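The documented example, shown as a runnable snippet:

    from super_gradients.training.utils.utils import check_img_size_divisibility

    print(check_img_size_divisibility(320))  # (True, None): divisible by the default stride of 32
    print(check_img_size_divisibility(321))  # (False, (352, 320)): the two closest valid sizes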
- super_gradients.training.utils.utils.get_orientation_key() → int[source]
Gets the orientation key according to PIL, which is useful, for instance, for getting the image size.
- Returns
Orientation key according to PIL
super_gradients.training.utils.weight_averaging_utils module
- class super_gradients.training.utils.weight_averaging_utils.ModelWeightAveraging(ckpt_dir, greater_is_better, source_ckpt_folder_name=None, metric_to_watch='acc', metric_idx=1, load_checkpoint=False, number_of_models_to_average=10, model_checkpoints_location='local')[source]
Bases:
object
Utility class for managing the averaging of the best several snapshots into a single model. A snapshot dictionary file and the average model are saved/updated at every epoch and evaluated only when training is completed. The snapshot file is deleted only upon completion of training. The snapshot dict is managed on cpu.
- update_snapshots_dict(model, validation_results_tuple)[source]
Updates the snapshot dict and returns the updated average model for saving.
- Parameters
model – the latest model
validation_results_tuple – performance of the latest model
Module contents
- class super_gradients.training.utils.Timer(device: str)[source]
Bases:
object
A class to measure time, handling both GPU and CPU processes. Returns time in milliseconds.
- class super_gradients.training.utils.WrappedModel(module)[source]
Bases:
torch.nn.modules.module.Module
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- super_gradients.training.utils.convert_to_tensor(array)[source]
Converts numpy arrays and lists to Torch tensors before calculating losses.
- Parameters
array – torch.tensor / Numpy array / List
- super_gradients.training.utils.get_param(params, name, default_val=None)[source]
Retrieves a param from a parameter object/dict. If the parameter does not exist, returns default_val. If default_val is a dictionary and a value is found in the params, the function returns the default dictionary with its internal values overridden by the found value.
For example, given default_opt_params = {'lr': 0.1, 'momentum': 0.99, 'alpha': 0.001} and training_params = {'optimizer_params': {'lr': 0.0001}, 'batch': 32, ...}, get_param(training_params, name='optimizer_params', default_val=default_opt_params) will return {'lr': 0.0001, 'momentum': 0.99, 'alpha': 0.001}.
- Parameters
params – an object (typically HpmStruct) or a dict holding the params
name – name of the searched parameter
default_val – assumed to be the same type as the value searched in the params
- Returns
the found value, or default if not found
- super_gradients.training.utils.tensor_container_to_device(obj: Union[torch.Tensor, tuple, list, dict], device: str, non_blocking=True)[source]
- Recursively sends compound objects to a device (sending all tensors to the device while maintaining structure)
- Parameters
obj – the object to send to device (list / tuple / tensor / dict)
device – device to send the tensors to
non_blocking – used for DistributedDataParallel
- Returns
an object with the same structure (tensors, lists, tuples) with the device pointers (like the return value of Tensor.to(device))
- super_gradients.training.utils.adapt_state_dict_to_fit_model_layer_names(model_state_dict: dict, source_ckpt: dict, exclude: list = [], solver: Optional[callable] = None)[source]
Given a model state dict and a source checkpoint, tries to correct the keys in model_state_dict to fit the checkpoint so that the weights can be properly loaded into the model. If unsuccessful, returns None.
- Parameters
model_state_dict – the model state_dict
source_ckpt – checkpoint dict
exclude – optional list of excluded layers
solver – callable with signature (ckpt_key, ckpt_val, model_key, model_val) that returns a desired weight for ckpt_val
- Returns
renamed checkpoint dict (if possible)
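A sketch of the typical mismatch this helps with: a checkpoint whose keys carry a "module." prefix, as produced by (Distributed)DataParallel, loaded into a plain model. The scenario is an illustrative assumption; the return value is the renamed checkpoint dict, as documented above:

    import torch.nn as nn
    from super_gradients.training.utils import adapt_state_dict_to_fit_model_layer_names

    model = nn.Sequential(nn.Linear(4, 4))
    # Simulate a DataParallel-style checkpoint: same tensors, prefixed keys.
    ckpt = {"module." + k: v for k, v in model.state_dict().items()}

    adapted = adapt_state_dict_to_fit_model_layer_names(model.state_dict(), ckpt)
    # 'adapted' holds the renamed checkpoint weights, ready for loading.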
- super_gradients.training.utils.raise_informative_runtime_error(state_dict, checkpoint, exception_msg)[source]
Given a model state dict and a source checkpoint, calls "adapt_state_dict_to_fit_model_layer_names" and enhances the exception_msg if loading the checkpoint dict via the conversion method is possible
- super_gradients.training.utils.random_seed(is_ddp, device, seed)[source]
Sets the random seed of numpy, torch and random.
When using DDP, a seed will be set for each process according to its local rank, derived from the device number.
- Parameters
is_ddp – bool, will set a different random seed for each process when using ddp
device – 'cuda', 'cpu', 'cuda:<device_number>'
seed – int, random seed to be set