Openspeech Model¶
-
class
openspeech.models.openspeech_model.
OpenspeechModel
(configs: omegaconf.dictconfig.DictConfig, vocab: openspeech.vocabs.vocab.Vocabulary)[source]¶ Super class of openspeech models.
Note
Do not use this class directly; use one of its subclasses.
- Parameters
configs (DictConfig) – configuration set.
vocab (Vocabulary) – the class of vocabulary
- Inputs:
- inputs (torch.FloatTensor): An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor): The length of each input sequence, of size (batch).
- Returns
Result of model predictions.
- Return type
y_hats (torch.FloatTensor)
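As an illustration of the input convention above, the following minimal sketch pads variable-length feature sequences into a (batch, seq_length, dimension) layout and records the original lengths. Plain Python lists stand in for torch.FloatTensor and torch.LongTensor so the sketch has no dependencies. The helper name pad_batch and the padding value 0.0 are assumptions for illustration, not part of OpenSpeech.

```python
from typing import List, Tuple

Frame = List[float]


def pad_batch(
    sequences: List[List[Frame]], pad_value: float = 0.0
) -> Tuple[List[List[Frame]], List[int]]:
    """Pad sequences of feature frames to a common length.

    Returns (inputs, input_lengths), where inputs has the nested shape
    (batch, seq_length, dimension) and input_lengths has shape (batch).
    Assumes every sequence is non-empty and all frames share one dimension.
    """
    if not sequences:
        return [], []
    input_lengths = [len(seq) for seq in sequences]
    max_length = max(input_lengths)
    dimension = len(sequences[0][0])
    inputs = [
        [list(frame) for frame in seq]
        + [[pad_value] * dimension for _ in range(max_length - len(seq))]
        for seq in sequences
    ]
    return inputs, input_lengths


# Two utterances with 3 and 1 frames of 2-dimensional features.
batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.7, 0.8]],
]
inputs, input_lengths = pad_batch(batch)
print(len(inputs), len(inputs[0]), len(inputs[0][0]))  # 2 3 2
print(input_lengths)  # [3, 1]
```

Keeping input_lengths alongside the padded batch lets downstream code ignore the padded frames.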
-
configure_criterion
(criterion_name: str) → torch.nn.modules.module.Module[source]¶ Configure criterion for training.
- Parameters
criterion_name (str) – name of criterion
- Returns
criterion for training
- Return type
criterion (nn.Module)
-
configure_optimizers
()[source]¶ Choose what optimizers and learning-rate schedulers to use in your optimization.
- Returns
- Two lists. The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_dict).
-
forward
(inputs: torch.FloatTensor, input_lengths: torch.LongTensor) → Dict[str, torch.Tensor][source]¶ Forward propagate inputs for inference.
- Inputs:
- inputs (torch.FloatTensor): An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor): The length of each input sequence, of size (batch).
- Returns
Result of model predictions.
- Return type
outputs (dict)
-
log_steps
(stage: str, wer: float, cer: float, loss: Optional[float] = None, cross_entropy_loss: Optional[float] = None, ctc_loss: Optional[float] = None) → None[source]¶ Provides log dictionary.
- Parameters
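The wer and cer arguments are word and character error rates. As a minimal sketch of how such values could be computed and gathered into a per-stage dictionary, the code below uses a standard Levenshtein edit distance. The function names, the choice to count spaces as characters, and the "{stage}_wer" key format are assumptions for illustration; the source does not state how OpenSpeech computes or names these values.

```python
from typing import Sequence


def edit_distance(reference: Sequence, hypothesis: Sequence) -> int:
    """Levenshtein distance between two sequences."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Character-level edit distance divided by the reference length.

    Spaces are counted as characters here (an assumption).
    """
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


stage = "valid"  # hypothetical stage name
wer = word_error_rate("the cat sat", "the cat sit")
cer = character_error_rate("the cat sat", "the cat sit")

# Hypothetical key format for a per-stage log dictionary.
log_dict = {f"{stage}_wer": wer, f"{stage}_cer": cer}
print(log_dict)
```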
-
test_step
(batch: tuple, batch_idx: int)[source]¶ Forward propagate a pair of inputs and targets for testing.
- Inputs:
batch (tuple): A test batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for testing
- Return type
loss (torch.Tensor)
-
training_step
(batch: tuple, batch_idx: int)[source]¶ Forward propagate a pair of inputs and targets for training.
- Inputs:
batch (tuple): A training batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for training
- Return type
loss (torch.Tensor)
-
validation_epoch_end
(outputs: dict) → dict[source]¶ Called at the end of the validation epoch with the outputs of all validation steps.
    # the pseudocode for these calls
    val_outs = []
    for val_batch in val_data:
        out = validation_step(val_batch)
        val_outs.append(out)
    validation_epoch_end(val_outs)
- Parameters
outputs – List of outputs you defined in validation_step(), or if there are multiple dataloaders, a list containing a list of outputs for each dataloader.
- Returns
None
Note
If you didn’t define a validation_step(), this won’t be called.
Examples
With a single dataloader:
    def validation_epoch_end(self, val_step_outputs):
        for out in val_step_outputs:
            ...  # do something
With multiple dataloaders, outputs will be a list of lists. The outer list contains one entry per dataloader, while the inner list contains the individual outputs of each validation step for that dataloader.
    def validation_epoch_end(self, outputs):
        for dataloader_output_result in outputs:
            dataloader_outs = dataloader_output_result.dataloader_i_outputs

        self.log('final_metric', final_value)
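To make the two output layouts concrete, here is a minimal standalone sketch that averages a per-step value over a flat list (single dataloader) and over a list of lists (multiple dataloaders). The "loss" key, the sample values, and the helper average_metric are hypothetical; actual step outputs depend on what validation_step() returns.

```python
from statistics import mean
from typing import Dict, List


def average_metric(step_outputs: List[Dict[str, float]], key: str) -> float:
    """Average one value over the outputs of all validation steps."""
    return mean(out[key] for out in step_outputs)


# Single dataloader: a flat list with one entry per validation step.
single = [{"loss": 0.9}, {"loss": 0.7}, {"loss": 0.5}]
single_average = average_metric(single, "loss")

# Multiple dataloaders: the outer list has one entry per dataloader,
# and each inner list has one entry per validation step.
multiple = [
    [{"loss": 1.0}, {"loss": 0.5}],
    [{"loss": 0.25}],
]
per_loader_average = [average_metric(outs, "loss") for outs in multiple]

print(single_average)
print(per_loader_average)
```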
-
validation_step
(batch: tuple, batch_idx: int)[source]¶ Forward propagate a pair of inputs and targets for validation.
- Inputs:
batch (tuple): A validation batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for validation
- Return type
loss (torch.Tensor)
Openspeech Encoder Decoder Model¶
-
class
openspeech.models.openspeech_encoder_decoder_model.
OpenspeechEncoderDecoderModel
(configs: omegaconf.dictconfig.DictConfig, vocab: openspeech.vocabs.vocab.Vocabulary)[source]¶ Base class for OpenSpeech’s encoder-decoder models.
- Parameters
configs (DictConfig) – configuration set.
vocab (Vocabulary) – the class of vocabulary
- Inputs:
- inputs (torch.FloatTensor): An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor): The length of each input sequence, of size (batch).
- Returns
Result of model predictions.
- Return type
y_hats (torch.FloatTensor)
-
forward
(inputs: torch.Tensor, input_lengths: torch.Tensor) → Dict[str, torch.Tensor][source]¶ Forward propagate inputs for inference.
- Inputs:
- inputs (torch.FloatTensor): An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor): The length of each input sequence, of size (batch).
- Returns
- Result of model predictions, containing predictions, logits, encoder_outputs, encoder_logits, and encoder_output_lengths.
- Return type
dict (dict)
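The returned dictionary holds both logits and predictions. One common way to relate the two is to take the highest-scoring class at every step; the sketch below does this on nested lists standing in for a (batch, seq_length, num_classes) tensor. This relationship is an assumption for illustration, since the source does not say how predictions are derived, and greedy_predictions is a hypothetical helper.

```python
from typing import List


def greedy_predictions(logits: List[List[List[float]]]) -> List[List[int]]:
    """Pick the highest-scoring class index at every step.

    logits has the nested shape (batch, seq_length, num_classes);
    the result has the nested shape (batch, seq_length).
    """
    return [
        [max(range(len(step)), key=step.__getitem__) for step in sequence]
        for sequence in logits
    ]


logits = [
    [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3]],
    [[-0.5, 0.0, 0.7], [0.9, 0.8, 0.1]],
]
predictions = greedy_predictions(logits)
print(predictions)  # [[1, 0], [2, 0]]
```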
-
test_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for testing.
- Inputs:
batch (tuple): A test batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for testing
- Return type
loss (torch.Tensor)
-
training_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for training.
- Inputs:
batch (tuple): A training batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for training
- Return type
loss (torch.Tensor)
-
validation_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for validation.
- Inputs:
batch (tuple): A validation batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for validation
- Return type
loss (torch.Tensor)
Openspeech CTC Model¶
-
class
openspeech.models.openspeech_ctc_model.
OpenspeechCTCModel
(configs: omegaconf.dictconfig.DictConfig, vocab: openspeech.vocabs.vocab.Vocabulary)[source]¶ Base class for OpenSpeech’s encoder-only models (ctc-model).
- Parameters
configs (DictConfig) – configuration set.
vocab (Vocabulary) – the class of vocabulary
- Inputs:
- inputs (torch.FloatTensor): An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor): The length of each input sequence, of size (batch).
- Returns
Result of model predictions.
- Return type
y_hats (torch.FloatTensor)
-
forward
(inputs: torch.FloatTensor, input_lengths: torch.IntTensor) → Dict[str, torch.Tensor][source]¶ Forward propagate inputs for inference.
- Parameters
inputs (torch.FloatTensor) – An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.IntTensor) – The length of each input sequence, of size (batch).
- Returns
Result of model predictions that contains y_hats, logits, output_lengths
- Return type
dict (dict)
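A CTC model's per-frame outputs are usually turned into a label sequence by greedy CTC decoding: collapse consecutive repeats, then drop blanks. The sketch below applies this to lists of per-frame label indices, using output_lengths to skip padded frames. It assumes that y_hats holds per-frame label indices and that the blank index is 0; neither is stated in the source, and ctc_greedy_decode is a hypothetical helper.

```python
from itertools import groupby
from typing import List

BLANK_ID = 0  # assumption: index of the CTC blank token


def ctc_greedy_decode(
    frame_ids: List[int], length: int, blank_id: int = BLANK_ID
) -> List[int]:
    """Collapse repeated frame labels, then remove blanks."""
    valid = frame_ids[:length]  # ignore padded frames
    collapsed = [label for label, _ in groupby(valid)]
    return [label for label in collapsed if label != blank_id]


# Per-frame label indices for two utterances, padded to 8 frames.
y_hats = [
    [0, 3, 3, 0, 3, 5, 5, 0],
    [4, 4, 0, 0, 2, 0, 0, 0],
]
output_lengths = [8, 5]
decoded = [ctc_greedy_decode(ids, n) for ids, n in zip(y_hats, output_lengths)]
print(decoded)  # [[3, 3, 5], [4, 2]]
```

The blank between the two runs of 3 in the first utterance is what allows the same label to appear twice in a row in the decoded output.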
-
test_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for testing.
- Inputs:
batch (tuple): A test batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for testing
- Return type
loss (torch.Tensor)
-
training_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for training.
- Inputs:
batch (tuple): A training batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for training
- Return type
loss (torch.Tensor)
-
validation_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for validation.
- Inputs:
batch (tuple): A validation batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for validation
- Return type
loss (torch.Tensor)
Openspeech Transducer Model¶
-
class
openspeech.models.openspeech_transducer_model.
OpenspeechTransducerModel
(configs: omegaconf.dictconfig.DictConfig, vocab: openspeech.vocabs.vocab.Vocabulary)[source]¶ Base class for OpenSpeech’s transducer models.
- Parameters
configs (DictConfig) – configuration set.
vocab (Vocabulary) – the class of vocabulary
- Inputs:
- inputs (torch.FloatTensor): An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor): The length of each input sequence, of size (batch).
- Returns
Result of model predictions.
- Return type
y_hats (torch.FloatTensor)
-
decode
(encoder_output: torch.Tensor, max_length: int) → torch.Tensor[source]¶ Decode encoder_outputs.
- Parameters
encoder_output (torch.FloatTensor) – An output sequence of the encoder. FloatTensor of size (seq_length, dimension).
max_length (int) – The maximum number of decoding time steps.
- Returns
Log probability of model predictions.
- Return type
logits (torch.FloatTensor)
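As a rough sketch of step-wise decoding bounded by max_length, the loop below walks over encoder frames, feeds the previously emitted token to a decoder step, combines both outputs with a joint function, and takes the best-scoring token. It is deliberately simplified: it emits exactly one token per frame and has no blank handling, which real transducer decoders typically need. The callables decoder_step and joint, the toy stand-ins, and sos_id are hypothetical and do not reflect OpenSpeech's actual implementation.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]
State = Optional[object]


def greedy_decode(
    encoder_output: Sequence[Vector],
    max_length: int,
    decoder_step: Callable[[int, State], Tuple[Vector, State]],
    joint: Callable[[Vector, Vector], Vector],
    sos_id: int,
) -> List[int]:
    """Emit one token per encoder frame, feeding each token back to the decoder."""
    tokens: List[int] = []
    prev_token, state = sos_id, None
    for t in range(min(max_length, len(encoder_output))):
        decoder_output, state = decoder_step(prev_token, state)
        scores = joint(encoder_output[t], decoder_output)
        prev_token = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(prev_token)
    return tokens


VOCAB_SIZE = 3


def toy_decoder_step(prev_token: int, state: State) -> Tuple[Vector, State]:
    """Toy decoder: slightly favors repeating the previous token."""
    return [0.1 if i == prev_token else 0.0 for i in range(VOCAB_SIZE)], state


def toy_joint(enc: Vector, dec: Vector) -> Vector:
    """Toy joint: element-wise sum of the two score vectors."""
    return [e + d for e, d in zip(enc, dec)]


encoder_output = [[0.1, 0.9, 0.0], [0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]
tokens = greedy_decode(
    encoder_output, max_length=2,
    decoder_step=toy_decoder_step, joint=toy_joint, sos_id=0,
)
print(tokens)  # [1, 0]
```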
-
forward
(inputs: torch.Tensor, input_lengths: torch.Tensor) → Dict[str, torch.Tensor][source]¶ Forward propagate inputs for inference.
- Parameters
inputs (torch.FloatTensor) – An input sequence passed to the encoder. Typically a padded FloatTensor of size (batch, seq_length, dimension).
input_lengths (torch.LongTensor) – The length of each input sequence, of size (batch).
- Returns
- Result of model predictions, containing predictions, logits, encoder_outputs, and encoder_output_lengths.
- Return type
dict (dict)
-
joint
(encoder_outputs: torch.Tensor, decoder_outputs: torch.Tensor) → torch.Tensor[source]¶ Join encoder_outputs and decoder_outputs.
- Parameters
encoder_outputs (torch.FloatTensor) – An output sequence of the encoder. FloatTensor of size (batch, seq_length, dimension).
decoder_outputs (torch.FloatTensor) – An output sequence of the decoder. FloatTensor of size (batch, seq_length, dimension).
- Returns
The joined encoder_outputs and decoder_outputs.
- Return type
outputs (torch.FloatTensor)
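In transducer models, joining commonly means pairing every encoder frame with every decoder step and concatenating their feature vectors, which gives a four-dimensional grid. The sketch below shows only that shape logic on nested lists; a projection and log-softmax usually follow and are omitted here. Whether OpenSpeech's joint works exactly this way is an assumption, and joint_grid is a hypothetical helper.

```python
from typing import List

Vector = List[float]


def joint_grid(
    encoder_outputs: List[List[Vector]], decoder_outputs: List[List[Vector]]
) -> List[List[List[Vector]]]:
    """Concatenate every encoder frame with every decoder step.

    encoder_outputs: (batch, input_length, enc_dim)
    decoder_outputs: (batch, target_length, dec_dim)
    returns:         (batch, input_length, target_length, enc_dim + dec_dim)
    """
    return [
        [[enc_frame + dec_step for dec_step in dec_seq] for enc_frame in enc_seq]
        for enc_seq, dec_seq in zip(encoder_outputs, decoder_outputs)
    ]


encoder_outputs = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]  # (1, 3, 2)
decoder_outputs = [[[0.1], [0.2]]]                        # (1, 2, 1)
grid = joint_grid(encoder_outputs, decoder_outputs)
print(len(grid), len(grid[0]), len(grid[0][0]), len(grid[0][0][0]))  # 1 3 2 3
print(grid[0][1][0])  # [3.0, 4.0, 0.1]
```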
-
test_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for testing.
- Inputs:
batch (tuple): A test batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for testing
- Return type
loss (torch.Tensor)
-
training_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for training.
- Inputs:
batch (tuple): A training batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for training
- Return type
loss (torch.Tensor)
-
validation_step
(batch: tuple, batch_idx: int) → collections.OrderedDict[source]¶ Forward propagate a pair of inputs and targets for validation.
- Inputs:
batch (tuple): A validation batch containing inputs, targets, input_lengths, and target_lengths.
batch_idx (int): The index of the batch.
- Returns
loss for validation
- Return type
loss (torch.Tensor)