Active Learning Strategies¶
BADGE¶
class distil.active_learning_strategies.badge.BADGE(X, Y, unlabeled_x, net, handler, nclasses, args)
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds (BADGE) 1 Strategy. This class extends active_learning_strategies.strategy.Strategy.
This method is based on the paper Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds. According to the paper, Batch Active learning by Diverse Gradient Embeddings (BADGE) samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, a strategy designed to incorporate both predictive uncertainty and sample diversity into every selected batch. Crucially, BADGE trades off between uncertainty and diversity without requiring any hand-tuned hyperparameters. At each round of selection, loss gradients are computed using the hypothesised labels; the points to be labeled are then selected by applying k-means++ seeding to these loss gradients (a sketch follows this class entry).
- Parameters
X (Numpy array) – Features of the labeled set of points
Y (Numpy array) – Labels of the labeled set of points
unlabeled_x (Numpy array) – Features of the unlabeled set of points
net (class object) – Model architecture used for training. Could be an instance of the models defined in distil.utils.models or something similar.
handler (class object) – It should be a subclass of torch.utils.data.Dataset, i.e. have __getitem__ and __len__ methods implemented, so that it can be passed to a pytorch DataLoader. Could be an instance of the handlers defined in distil.utils.DataHandler or something similar.
nclasses (int) – Number of classes in the dataset
args (dictionary) – This dictionary should have 'batch_size' as a key.
select(budget)
Select the next set of points.
- Parameters
budget (int) – Number of indices to be returned for the next set
- Returns
chosen – List of selected data point indices with respect to unlabeled_x
- Return type
list
select_per_batch(budget, batch_size)
Select points to label using the per-batch BADGE strategy.
- Parameters
budget (int) – Number of indices to be selected from the unlabeled set
batch_size (int) – Size of the batches to form
- Returns
chosen – List of selected data point indices with respect to unlabeled_x
- Return type
list
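To make the selection rule concrete, here is a minimal numpy sketch of the BADGE idea. It is an illustration, not the library's implementation: badge_gradient_embeddings and kmeans_pp_select are hypothetical helpers, and the probabilities and features below are random stand-ins for real model outputs.

```python
import numpy as np

def badge_gradient_embeddings(probs, features):
    # Gradient of the cross-entropy loss w.r.t. the last linear layer,
    # computed with the hypothesised (argmax) labels:
    # g_i = (p_i - one_hot(y_hat_i)) outer h_i
    n, c = probs.shape
    one_hot = np.eye(c)[probs.argmax(axis=1)]
    scale = probs - one_hot                              # (n, c)
    return (scale[:, :, None] * features[:, None, :]).reshape(n, -1)

def kmeans_pp_select(emb, budget, rng=None):
    # k-means++ seeding: start from the largest-norm embedding, then sample
    # each next point proportionally to its squared distance from the
    # closest already-chosen point (uncertainty + diversity in one rule).
    rng = rng or np.random.default_rng(0)
    chosen = [int(np.argmax((emb ** 2).sum(axis=1)))]
    d2 = ((emb - emb[chosen[0]]) ** 2).sum(axis=1)
    while len(chosen) < budget:
        idx = int(rng.choice(len(emb), p=d2 / d2.sum()))
        chosen.append(idx)
        d2 = np.minimum(d2, ((emb - emb[idx]) ** 2).sum(axis=1))
    return chosen

rng = np.random.default_rng(1)
probs = rng.dirichlet(np.ones(3), size=100)   # stand-in softmax outputs
feats = rng.normal(size=(100, 16))            # stand-in penultimate features
print(kmeans_pp_select(badge_gradient_embeddings(probs, feats), budget=5))
```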
Core-Set Approach¶
class distil.active_learning_strategies.core_set.CoreSet(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the CoreSet 2 Strategy. This class extends active_learning_strategies.strategy.Strategy to include the core-set sampling technique to select data points for active learning.
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
furthest_first(X, X_set, n)
Selects the points with maximum distance from the labeled set.
- Parameters
X (numpy array) – Embeddings of the unlabeled set
X_set (numpy array) – Embeddings of the labeled set
n (int) – Number of points to return
- Returns
idxs – List of selected data point indices with respect to unlabeled_x
- Return type
list
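A small numpy sketch of this furthest-first rule, under assumed embedding inputs (an illustration, not the library code):

```python
import numpy as np

def furthest_first(X, X_set, n):
    # Distance from every unlabeled embedding to its nearest labeled one
    min_dist = np.linalg.norm(X[:, None, :] - X_set[None, :, :],
                              axis=2).min(axis=1)
    idxs = []
    for _ in range(n):
        idx = int(np.argmax(min_dist))       # least-covered point
        idxs.append(idx)
        # The chosen point now also covers its own neighbourhood
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[idx], axis=1))
    return idxs

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))       # stand-in unlabeled embeddings
X_set = rng.normal(size=(20, 8))    # stand-in labeled embeddings
print(furthest_first(X, X_set, n=5))
```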
Entropy Sampling¶
class distil.active_learning_strategies.entropy_sampling.EntropySampling(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Entropy Sampling Strategy. This class extends active_learning_strategies.strategy.Strategy to include the entropy sampling technique to select data points for active learning.
Least Confidence and Margin Sampling do not make use of all the label probabilities, whereas entropy sampling computes the entropy of the hypothesised confidence scores over all labels and queries the true label of the data instance with the highest entropy.
Example¶

Data Instance   Entropy
p1              0.2
p2              0.5
p3              0.7
From the above table, entropy sampling will query for the true label of data instance p3, since it has the highest entropy.
Let \(p_i\) denote the probability of the ith label of data instance p, and let n denote the total number of possible labels. The entropy of p is then calculated as:
\[E = -\sum_{i=1}^{n} p_i \log(p_i)\]
Entropy selection can thus be written mathematically as:
\[\max{(E)}\]
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
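A short numpy sketch of this acquisition rule (illustration only; entropy_select is a hypothetical helper, not the library API):

```python
import numpy as np

def entropy_select(probs, budget):
    # H(p) per point; 1e-12 guards against log(0)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:budget]     # highest entropy first

probs = np.array([[0.8, 0.1, 0.1],
                  [0.4, 0.35, 0.25],
                  [0.34, 0.33, 0.33]])
print(entropy_select(probs, budget=1))       # -> [2], the most uniform row
```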
Entropy Sampling with Dropout¶
class distil.active_learning_strategies.entropy_sampling_dropout.EntropySamplingDropout(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Entropy Sampling with Dropout Strategy. This class extends active_learning_strategies.strategy.Strategy to include the entropy sampling with dropout technique to select data points for active learning.
Least Confidence and Margin Sampling do not make use of all the label probabilities, whereas entropy sampling computes the entropy of the hypothesised confidence scores over all labels and queries the true label of the data instance with the highest entropy.
Example¶

Data Instance   Entropy
p1              0.2
p2              0.5
p3              0.7
From the above table, entropy sampling will query for the true label of data instance p3, since it has the highest entropy.
Let \(p_i\) denote the probability of the ith label of data instance p, and let n denote the total number of possible labels. The entropy of p is then calculated as:
\[E = -\sum_{i=1}^{n} p_i \log(p_i)\]
Entropy selection can thus be written mathematically as:
\[\max{(E)}\]
The dropout version uses the predict-probability-dropout function from the base strategy class to find the hypothesised labels. The user can pass the n_drop argument, which denotes the number of times the probabilities will be calculated; the final probability is the average of the probabilities obtained over all iterations. A sketch of this averaging follows the parameter list below.
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
n_drop – Number of dropout iterations over which the predicted probabilities are averaged (int, optional)
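A hedged PyTorch sketch of the averaging described above: dropout is kept active at inference time (model.train()), the softmax outputs of n_drop stochastic forward passes are averaged, and entropy is computed on the averaged probabilities. The two-layer model and random batch are stand-ins, not the library's wrapped network.

```python
import torch
import torch.nn as nn

def mc_dropout_probs(model, x, n_drop=10):
    model.train()                      # keep dropout layers stochastic
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(n_drop)])
    return probs.mean(dim=0)           # average over the n_drop passes

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(),
                      nn.Dropout(p=0.5), nn.Linear(32, 3))
x = torch.randn(16, 8)                 # stand-in unlabeled batch
p = mc_dropout_probs(model, x, n_drop=10)
entropy = -(p * torch.log(p + 1e-12)).sum(dim=1)
print(entropy.topk(4).indices)         # 4 most uncertain points
```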
FASS¶
class distil.active_learning_strategies.fass.FASS(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the FASS strategy (Wei et al., 2015) to select data points for active learning. This class extends active_learning_strategies.strategy.Strategy.
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
submod (str) – Choice of submodular function - 'facility_location' | 'graph_cut' | 'saturated_coverage' | 'sum_redundancy' | 'feature_based'
selection_type (str) – Choice of selection strategy - 'PerClass' | 'Supervised'
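FASS first filters the unlabeled pool down to its most uncertain candidates and then runs submodular maximisation over that filtered set. The rough sketch below illustrates this with a greedy facility-location objective over cosine similarities; it is an assumption-laden illustration with hypothetical inputs, not the library's implementation.

```python
import numpy as np

def fass_select(probs, features, budget, candidate_factor=5):
    # 1) Uncertainty filter: keep the points with the lowest max-probability
    n_cand = min(len(probs), budget * candidate_factor)
    cand = np.argsort(probs.max(axis=1))[:n_cand]
    # 2) Greedy facility-location maximisation over the candidates
    f = features[cand]
    f = f / np.linalg.norm(f, axis=1, keepdims=True)
    sim = f @ f.T                             # cosine similarity
    covered = np.zeros(n_cand)                # best similarity to chosen set
    chosen = []
    for _ in range(budget):
        gains = np.maximum(sim, covered[None, :]).sum(axis=1) - covered.sum()
        gains[chosen] = -np.inf               # don't pick a point twice
        best = int(np.argmax(gains))
        chosen.append(best)
        covered = np.maximum(covered, sim[best])
    return cand[chosen]

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=300)   # stand-in softmax outputs
feats = rng.normal(size=(300, 16))            # stand-in features
print(fass_select(probs, feats, budget=10))
```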
GLISTER¶
class distil.active_learning_strategies.glister.GLISTER(X, Y, unlabeled_x, net, handler, nclasses, args, valid, X_val=None, Y_val=None, loss_criterion=CrossEntropyLoss(), typeOf='none', lam=None, kernel_batch_size=200)
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the GLISTER-ACTIVE Strategy 3. This class extends active_learning_strategies.strategy.Strategy.
- Parameters
X (Numpy array) – Features of the labeled set of points
Y (Numpy array) – Labels of the labeled set of points
unlabeled_x (Numpy array) – Features of the unlabeled set of points
net (class object) – Model architecture used for training. Could be an instance of the models defined in distil.utils.models or something similar.
handler (class object) – It should be a subclass of torch.utils.data.Dataset, i.e. have __getitem__ and __len__ methods implemented, so that it can be passed to a pytorch DataLoader. Could be an instance of the handlers defined in distil.utils.DataHandler or something similar.
nclasses (int) – Number of classes in the dataset
args (dictionary) – This dictionary should have the keys 'batch_size' and 'lr'. 'lr' is the learning rate used for training. 'batch_size' should be chosen so that one can exploit the benefits of tensorization while honouring the resource constraints.
valid (boolean) – Whether a validation set is passed or not
X_val (Numpy array, optional) – Features of the points in the validation set. Mandatory if valid=True.
Y_val (Numpy array, optional) – Labels of the points in the validation set. Mandatory if valid=True.
loss_criterion (class object, optional) – The type of loss criterion. Default is torch.nn.CrossEntropyLoss().
typeOf (str, optional) – Determines the type of regulariser to be used. Default is 'none'. For the random regulariser use 'Rand'; to use the facility-location set function as a regulariser use 'FacLoc'; to use the diversity set function as a regulariser use 'Diversity'.
lam (float, optional) – Determines the amount of regularisation to be applied. Mandatory if typeOf is not 'none'; set to None by default. For the random regulariser the value should be between 0 and 1, as it determines the fraction of points replaced by random points. For both 'Diversity' and 'FacLoc', lam determines the weight given to the regulariser while computing the gain.
kernel_batch_size (int, optional) – For the 'Diversity' and 'FacLoc' regulariser versions, a similarity kernel is computed, which entails creating a 3d torch tensor of dimensions kernel_batch_size × kernel_batch_size × feature dimension. Again, kernel_batch_size should be chosen so that one can exploit the benefits of tensorization while honouring the resource constraints.
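A hedged usage sketch of the constructor documented above. X_labeled, Y_labeled, X_unlabeled, X_val, Y_val, MyModel, and MyHandler are hypothetical stand-ins for real data, a model as in distil.utils.models, and a handler as in distil.utils.DataHandler; select(budget) is assumed to behave as documented for the other strategies.

```python
from torch.nn import CrossEntropyLoss
from distil.active_learning_strategies.glister import GLISTER

args = {'batch_size': 64, 'lr': 0.01}           # both keys are required here
strategy = GLISTER(X_labeled, Y_labeled, X_unlabeled,
                   net=MyModel(), handler=MyHandler, nclasses=10,
                   args=args, valid=True, X_val=X_val, Y_val=Y_val,
                   loss_criterion=CrossEntropyLoss(),
                   typeOf='FacLoc',              # facility-location regulariser
                   lam=0.5,                      # regularisation weight
                   kernel_batch_size=200)
chosen = strategy.select(budget=100)             # indices into X_unlabeled
```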
Least Confidence¶
class distil.active_learning_strategies.least_confidence.LeastConfidence(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Least Confidence Sampling Strategy. This class extends active_learning_strategies.strategy.Strategy to include the least confidence technique to select data points for active learning.
In this active learning strategy, the algorithm selects the data points for which the model has the lowest confidence while predicting its hypothesised label.
Example¶

Data Instance   Label 1   Label 2   Label 3
p1              0.1       0.55      0.45
p2              0.2       0.3       0.5
p3              0.1       0.1       0.8
From the above table, the hypothesised label for instance p1 is 2 with a confidence of 0.55; for instance p2 it is label 3 with a confidence of 0.5; and for p3 it is label 3 with a confidence of 0.8. Thus, according to the least confidence strategy, the point whose true label will be queried is instance p2.
Let \(p_i\) represent the probability of the ith label, and let there be n possible labels for data instance p. Mathematically, the selection rule can be written as:
\[\min{(\max{(P)})}\]
where \(P = [p_1, p_2, \dots, p_n]\).
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
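A minimal numpy sketch of this rule (illustration only; least_confidence_select is a hypothetical helper):

```python
import numpy as np

def least_confidence_select(probs, budget):
    confidence = probs.max(axis=1)      # confidence in the hypothesised label
    return np.argsort(confidence)[:budget]

probs = np.array([[0.1, 0.55, 0.35],
                  [0.2, 0.3, 0.5],
                  [0.1, 0.1, 0.8]])
print(least_confidence_select(probs, budget=1))   # -> [1], i.e. p2
```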
Least Confidence with Dropout¶
class distil.active_learning_strategies.least_confidence_dropout.LeastConfidenceDropout(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Least Confidence with Dropout Strategy. This class extends active_learning_strategies.strategy.Strategy to include the least confidence with dropout technique to select data points for active learning.
In this active learning strategy, the algorithm selects the data points for which the model has the lowest confidence while predicting its hypothesised label.
Example¶

Data Instance   Label 1   Label 2   Label 3
p1              0.1       0.55      0.45
p2              0.2       0.3       0.5
p3              0.1       0.1       0.8
From the above table, the hypothesised label for instance p1 is 2 with a confidence of 0.55; for instance p2 it is label 3 with a confidence of 0.5; and for p3 it is label 3 with a confidence of 0.8. Thus, according to the least confidence strategy, the point whose true label will be queried is instance p2.
Let \(p_i\) represent the probability of the ith label, and let there be n possible labels for data instance p. Mathematically, the selection rule can be written as:
\[\min{(\max{(P)})}\]
where \(P = [p_1, p_2, \dots, p_n]\).
The dropout version uses the predict-probability-dropout function from the base strategy class to find the hypothesised labels. The user can pass the n_drop argument, which denotes the number of times the probabilities will be calculated; the final probability is the average of the probabilities obtained over all iterations (the dropout-averaging sketch under Entropy Sampling with Dropout illustrates this).
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
n_drop – Number of dropout iterations over which the predicted probabilities are averaged (int, optional)
Margin Sampling¶
class distil.active_learning_strategies.margin_sampling.MarginSampling(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Margin Sampling Strategy. This class extends active_learning_strategies.strategy.Strategy to include the margin sampling technique to select data points for active learning.
While least confidence only takes into consideration the maximum probability, margin sampling considers the difference between the confidences of the first and the second most probable labels.
Example¶

Data Instance   Label 1   Label 2   Label 3
p1              0.1       0.55      0.45
p2              0.2       0.3       0.5
p3              0.1       0.1       0.8
From the above table, the differences between the probabilities of the two most probable labels for p1, p2, and p3 are 0.1, 0.2, and 0.7 respectively. Margin sampling will query the true label of data instance p1, since it has the smallest difference among all the data instances.
Let \(p_i\) represent the probability of the ith label, and let there be n possible labels for data instance p. Let \(\max(t)\) represent the maximum value of t and \(\max_2(t)\) the second-largest value of t. Mathematically, the selection rule can be written as:
\[\min{(\max(P) - \max_2(P))}\]
where \(P = [p_1, p_2, \dots, p_n]\).
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
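A minimal numpy sketch of margin sampling (illustration only; margin_select is a hypothetical helper). Using the table above, p1 has the smallest top-2 gap and is selected:

```python
import numpy as np

def margin_select(probs, budget):
    part = np.sort(probs, axis=1)            # ascending per row
    margin = part[:, -1] - part[:, -2]       # top-1 minus top-2 probability
    return np.argsort(margin)[:budget]       # smallest margins first

probs = np.array([[0.1, 0.55, 0.45],         # p1: margin 0.1
                  [0.2, 0.3, 0.5],           # p2: margin 0.2
                  [0.1, 0.1, 0.8]])          # p3: margin 0.7
print(margin_select(probs, budget=1))        # -> [0], i.e. p1
```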
Margin Sampling with Dropout¶
class distil.active_learning_strategies.margin_sampling_dropout.MarginSamplingDropout(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Margin Sampling with Dropout Strategy. This class extends active_learning_strategies.strategy.Strategy to include the margin sampling with dropout technique to select data points for active learning.
While least confidence only takes into consideration the maximum probability, margin sampling considers the difference between the confidences of the first and the second most probable labels.
Example¶

Data Instance   Label 1   Label 2   Label 3
p1              0.1       0.55      0.45
p2              0.2       0.3       0.5
p3              0.1       0.1       0.8
From the above table, the differences between the probabilities of the two most probable labels for p1, p2, and p3 are 0.1, 0.2, and 0.7 respectively. Margin sampling will query the true label of data instance p1, since it has the smallest difference among all the data instances.
Let \(p_i\) represent the probability of the ith label, and let there be n possible labels for data instance p. Let \(\max(t)\) represent the maximum value of t and \(\max_2(t)\) the second-largest value of t. Mathematically, the selection rule can be written as:
\[\min{(\max(P) - \max_2(P))}\]
where \(P = [p_1, p_2, \dots, p_n]\).
The dropout version uses the predict-probability-dropout function from the base strategy class to find the hypothesised labels. The user can pass the n_drop argument, which denotes the number of times the probabilities will be calculated; the final probability is the average of the probabilities obtained over all iterations (the dropout-averaging sketch under Entropy Sampling with Dropout illustrates this).
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
n_drop – Number of dropout iterations over which the predicted probabilities are averaged (int, optional)
Random Sampling¶
class distil.active_learning_strategies.random_sampling.RandomSampling(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Random Sampling Strategy. This class extends active_learning_strategies.strategy.Strategy to include the random sampling technique to select data points for active learning.
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
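The baseline itself is essentially a one-liner; a sketch (illustration only):

```python
import numpy as np

def random_select(n_unlabeled, budget, seed=0):
    # Uniform sampling without replacement from the unlabeled pool
    return np.random.default_rng(seed).choice(n_unlabeled, size=budget,
                                              replace=False)

print(random_select(n_unlabeled=1000, budget=10))
```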
Submodular Sampling¶
class distil.active_learning_strategies.submod_sampling.SubmodSampling(X, Y, unlabeled_x, net, handler, nclasses, typeOf, selection_type, if_grad=False, args={}, kernel_batch_size=200)
Bases: distil.active_learning_strategies.strategy.Strategy
- Parameters
X (Numpy array) – Features of the labeled set of points
Y (Numpy array) – Labels of the labeled set of points
unlabeled_x (Numpy array) – Features of the unlabeled set of points
net (class object) – Model architecture used for training. Could be an instance of the models defined in distil.utils.models or something similar.
handler (class object) – It should be a subclass of torch.utils.data.Dataset, i.e. have __getitem__ and __len__ methods implemented, so that it can be passed to a pytorch DataLoader. Could be an instance of the handlers defined in distil.utils.DataHandler or something similar.
nclasses (int) – Number of classes in the dataset
typeOf (str) – Choice of submodular function - 'facility_location' | 'graph_cut' | 'saturated_coverage' | 'sum_redundancy' | 'feature_based' | 'Disparity-min' | 'Disparity-sum' | 'DPP'
selection_type (str) – Selection strategy - 'Full' | 'PerClass' | 'Supervised'
if_grad (boolean, optional) – Determines whether gradients are to be used for subset selection. Default is False.
args (dictionary) – This dictionary should have the keys 'batch_size' and 'lr'. 'lr' is the learning rate used for training. 'batch_size' should be chosen so that one can exploit the benefits of tensorization while honouring the resource constraints.
kernel_batch_size (int, optional) – For the similarity-kernel-based submodular functions, a similarity kernel is computed, which entails creating a 3d torch tensor of dimensions kernel_batch_size × kernel_batch_size × feature dimension. Again, kernel_batch_size should be chosen so that one can exploit the benefits of tensorization while honouring the resource constraints.
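A hedged usage sketch of the constructor documented above; X_labeled, Y_labeled, X_unlabeled, MyModel, and MyHandler are hypothetical stand-ins, and select(budget) is assumed to behave as documented for the other strategies.

```python
from distil.active_learning_strategies.submod_sampling import SubmodSampling

args = {'batch_size': 64, 'lr': 0.01}
strategy = SubmodSampling(X_labeled, Y_labeled, X_unlabeled,
                          net=MyModel(), handler=MyHandler, nclasses=10,
                          typeOf='facility_location',   # submodular function
                          selection_type='Full',
                          if_grad=False, args=args,
                          kernel_batch_size=200)
chosen = strategy.select(budget=100)          # indices into X_unlabeled
```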
Adversarial BIM¶
Adversarial DeepFool¶
class distil.active_learning_strategies.adversarial_deepfool.AdversarialDeepFool(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Adversarial DeepFool Strategy. This class extends active_learning_strategies.strategy.Strategy to include the adversarial DeepFool technique to select data points for active learning.
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
max_iter – Maximum number of iterations (int, optional)
Bayesian Active Learning Disagreement Dropout¶
class distil.active_learning_strategies.bayesian_active_learning_disagreement_dropout.BALDDropout(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the BALDDropout Strategy. This class extends active_learning_strategies.strategy.Strategy to include the Bayesian Active Learning by Disagreement (BALD) with dropout technique to select data points for active learning.
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
n_drop – Number of dropout iterations over which the predicted probabilities are averaged (int, optional)
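A hedged PyTorch sketch of the BALD acquisition with MC dropout (an illustration, not the library code): the score is the mutual information between the prediction and the model parameters, estimated as the entropy of the mean predictive distribution minus the mean of the per-pass entropies over n_drop stochastic forward passes. The model and batch are stand-ins.

```python
import torch
import torch.nn as nn

def bald_scores(model, x, n_drop=10):
    model.train()                        # keep dropout active at inference
    with torch.no_grad():
        p = torch.stack([torch.softmax(model(x), dim=1)
                         for _ in range(n_drop)])     # (n_drop, N, C)
    mean_p = p.mean(dim=0)
    entropy_of_mean = -(mean_p * torch.log(mean_p + 1e-12)).sum(dim=1)
    mean_of_entropy = -(p * torch.log(p + 1e-12)).sum(dim=2).mean(dim=0)
    return entropy_of_mean - mean_of_entropy          # disagreement score

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(),
                      nn.Dropout(0.5), nn.Linear(32, 3))
x = torch.randn(16, 8)                 # stand-in unlabeled batch
print(bald_scores(model, x).topk(4).indices)          # 4 points to query
```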
KMeans Sampling¶
class distil.active_learning_strategies.kmeans_sampling.KMeansSampling(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the KMeans Sampling Strategy. This class extends active_learning_strategies.strategy.Strategy to include the k-means sampling technique to select data points for active learning.
In the KMeans Sampling selection strategy, the last-layer embeddings are computed for all unlabeled data points. The k-means clustering algorithm is then run over these embeddings, with the number of clusters equal to the budget. For each point, the distance to its cluster centre is computed, and from each cluster the point closest to the centre is selected for labeling in the next iteration. Since the number of centres equals the budget, selecting one point per cluster yields exactly the number of data points to be selected in one iteration.
Let \(d_i\) represent the distance of a point from the centre of its cluster, and let D represent the set of distances for all points belonging to that cluster. Mathematically, the selected point in each cluster satisfies:
\[\min{(D)}\]
where \(D = [d_1, d_2, \dots, d_n]\) and n is the number of points in the cluster.
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
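A short sketch of this selection rule using scikit-learn's KMeans on hypothetical embeddings (illustration only, not the library's implementation):

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_select(embeddings, budget, seed=0):
    km = KMeans(n_clusters=budget, n_init=10, random_state=seed)
    labels = km.fit_predict(embeddings)
    chosen = []
    for c in range(budget):
        members = np.where(labels == c)[0]
        d = np.linalg.norm(embeddings[members] - km.cluster_centers_[c],
                           axis=1)
        chosen.append(int(members[np.argmin(d)]))   # closest to the centre
    return chosen

emb = np.random.default_rng(0).normal(size=(500, 32))  # stand-in embeddings
print(kmeans_select(emb, budget=10))
```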
Baseline Sampling¶
class distil.active_learning_strategies.baseline_sampling.BaselineSampling(X, Y, unlabeled_x, net, handler, nclasses, args={})
Bases: distil.active_learning_strategies.strategy.Strategy
Implementation of the Baseline Sampling Strategy. This class extends active_learning_strategies.strategy.Strategy to include a baseline sampling technique to select data points for active learning.
- Parameters
X (numpy array) – Present training/labeled data
Y (numpy array) – Labels of present training data
unlabeled_x (numpy array) – Data without labels
net (class) – Pytorch Model class
handler (class) – Data Handler, which can load data even without labels.
nclasses (int) – Number of unique target classes
args (dict) – Specify optional parameters:
batch_size – Batch size to be used inside the strategy class (int, optional)
REFERENCES¶
- 1
Jordan T. Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, and Alekh Agarwal. Deep batch active learning by diverse, uncertain gradient lower bounds. CoRR, 2019. URL: http://arxiv.org/abs/1906.03671, arXiv:1906.03671.
- 2
Ozan Sener and Silvio Savarese. Active learning for convolutional neural networks: a core-set approach. 2018. arXiv:1708.00489.
- 3
Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, and Rishabh Iyer. Glister: generalization based data subset selection for efficient and robust learning. 2020. arXiv:2012.10630.