Search
Base
Beam Search CTC
- class openspeech.search.beam_search_ctc.BeamSearchCTC(labels: list, lm_path: str = None, alpha: int = 0, beta: int = 0, cutoff_top_n: int = 40, cutoff_prob: float = 1.0, beam_size: int = 3, num_processes: int = 4, blank_id: int = 0)
Decodes probability output using the ctcdecode package.
- Parameters
labels (list) – the tokens used to train the model.
lm_path (str) – path to an external KenLM language model (LM).
alpha (int) – weight applied to the LM's probabilities.
beta (int) – weight applied to the number of words in a beam hypothesis.
cutoff_top_n (int) – cutoff number in pruning. Only the top cutoff_top_n characters with the highest probability in the vocabulary are used in beam search.
cutoff_prob (float) – cutoff probability in pruning. 1.0 means no pruning.
beam_size (int) – number of hypotheses kept at each step; larger values search more broadly.
num_processes (int) – number of workers used to parallelize decoding over the batch.
blank_id (int) – index of the CTC blank token.
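The two pruning parameters can be illustrated with a small, self-contained sketch. This is not the ctcdecode implementation; `prune_vocab` is a hypothetical helper, and it assumes that pruning stops as soon as either the cumulative probability reaches cutoff_prob or cutoff_top_n tokens have been kept.

```python
def prune_vocab(probs, cutoff_top_n=40, cutoff_prob=1.0):
    """Return the (index, prob) pairs kept for one time step, most probable first.

    Hypothetical helper for illustration only; the real pruning happens
    inside the ctcdecode package.
    """
    # Rank every vocabulary entry by probability, highest first.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    if cutoff_prob >= 1.0:
        # 1.0 means no probability-based pruning, only the top-n cap applies.
        return ranked[:cutoff_top_n]
    kept, cumulative = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        cumulative += prob
        # Stop once enough probability mass or enough tokens are kept.
        if cumulative >= cutoff_prob or len(kept) >= cutoff_top_n:
            break
    return kept


step_probs = [0.1, 0.6, 0.25, 0.05]
kept_by_count = prune_vocab(step_probs, cutoff_top_n=2)
kept_by_mass = prune_vocab(step_probs, cutoff_prob=0.8)
```

With the defaults (cutoff_top_n=40, cutoff_prob=1.0) and a vocabulary of at most 40 tokens, nothing is pruned in this sketch.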
- Inputs:
- predicted_probs: Tensor of character probabilities, where probs[c,t] is the probability of character c at time t
sizes: Size of each sequence in the mini-batch
- Returns
sequences of the model’s best prediction
- Return type
outputs
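The alpha and beta parameters are typically combined with the acoustic (CTC) score in a shallow-fusion style. The sketch below shows that common form; `combined_score` is a hypothetical helper, and whether ctcdecode applies exactly this formula internally is an assumption.

```python
def combined_score(ctc_log_prob, lm_log_prob, word_count, alpha=0.0, beta=0.0):
    """Score a hypothesis from its CTC score, LM score, and word count.

    Hypothetical helper: alpha scales the LM log probability and beta
    rewards each word in the hypothesis (assumed shallow-fusion form).
    """
    return ctc_log_prob + alpha * lm_log_prob + beta * word_count


# With the default alpha=0 and beta=0, only the CTC score matters.
acoustic_only = combined_score(-2.0, -3.0, word_count=2)
with_lm = combined_score(-2.0, -3.0, word_count=2, alpha=0.5, beta=1.0)
```

Under this form, the defaults alpha=0 and beta=0 leave the ranking of hypotheses to the acoustic model alone.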
- forward(logits, sizes=None)
Decodes probability output using the ctcdecode package.
- Inputs:
- logits: Tensor of character probabilities, where probs[c,t] is the probability of character c at time t
sizes: Size of each sequence in the mini-batch
- Returns
sequences of the model’s best prediction
- Return type
outputs
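To make the roles of beam_size and blank_id concrete, here is a minimal pure-Python sketch of CTC prefix beam search without a language model. It is not the ctcdecode code path that BeamSearchCTC calls; the function name and input layout (a list of per-time-step log-probability lists for a single utterance) are assumptions made for illustration, and the sizes argument is not modelled.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf inputs."""
    top = max(values)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))


def ctc_prefix_beam_search(log_probs, labels, beam_size=3, blank_id=0):
    """Return (text, log probability) of the best prefix for one utterance."""
    # prefix -> (log prob of paths ending in blank, ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for step in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_blank, p_non_blank) in beams.items():
            for token, lp in enumerate(step):
                if token == blank_id:
                    # A blank keeps the prefix unchanged.
                    nb, nnb = next_beams[prefix]
                    next_beams[prefix] = (
                        logsumexp(nb, p_blank + lp, p_non_blank + lp), nnb)
                    continue
                extended = prefix + (token,)
                nb, nnb = next_beams[extended]
                if prefix and prefix[-1] == token:
                    # A repeat extends the prefix only after a blank ...
                    next_beams[extended] = (nb, logsumexp(nnb, p_blank + lp))
                    # ... otherwise it collapses into the same prefix.
                    sb, snb = next_beams[prefix]
                    next_beams[prefix] = (sb, logsumexp(snb, p_non_blank + lp))
                else:
                    next_beams[extended] = (
                        nb, logsumexp(nnb, p_blank + lp, p_non_blank + lp))
        # Keep only the beam_size most probable prefixes.
        ranked = sorted(next_beams.items(),
                        key=lambda item: logsumexp(*item[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    prefix, scores = max(beams.items(), key=lambda item: logsumexp(*item[1]))
    return "".join(labels[i] for i in prefix), logsumexp(*scores)


# Two time steps, vocabulary of blank ("_") and "a".
toy_labels = ["_", "a"]
toy_log_probs = [[math.log(0.6), math.log(0.4)]] * 2
wide_text, wide_score = ctc_prefix_beam_search(toy_log_probs, toy_labels, beam_size=3)
narrow_text, _ = ctc_prefix_beam_search(toy_log_probs, toy_labels, beam_size=1)
```

In this toy input the most likely single path is blank-blank (probability 0.36), but the paths that collapse to "a" sum to 0.64. A beam of 3 recovers "a", while a beam of 1 discards that prefix after the first step and returns the empty string.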
Beam Search LSTM
- class openspeech.search.beam_search_lstm.BeamSearchLSTM(decoder: openspeech.decoders.lstm_decoder.LSTMDecoder, beam_size: int, batch_size: int)
LSTM Beam Search Decoder
- Args: decoder, beam_size, batch_size
decoder (LSTMDecoder): base decoder of the LSTM model.
beam_size (int): size of the beam.
batch_size (int): size of the batch.
- Inputs: encoder_outputs, targets, encoder_output_lengths, teacher_forcing_ratio
- encoder_outputs (torch.FloatTensor): An output sequence of the encoder. FloatTensor of size (batch, seq_length, dimension)
- targets (torch.LongTensor): A target sequence passed to the decoder. LongTensor of size (batch, seq_length)
- encoder_output_lengths (torch.LongTensor): The lengths of the encoder outputs. LongTensor of size (batch)
teacher_forcing_ratio (float): Ratio of teacher forcing.
- Returns
Log probability of model predictions.
- Return type
logits (torch.FloatTensor)
- forward(encoder_outputs: torch.Tensor, encoder_output_lengths: torch.Tensor) → torch.Tensor
Beam search decoding.
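- Inputs: encoder_outputs, encoder_output_lengths
- encoder_outputs (torch.FloatTensor): An output sequence of the encoder. FloatTensor of size (batch, seq_length, dimension)
- encoder_output_lengths (torch.LongTensor): The lengths of the encoder outputs. LongTensor of size (batch)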
- Returns
Log probability of model predictions.
- Return type
logits (torch.FloatTensor)
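The core step of beam search over an autoregressive decoder can be sketched in a few lines. The example below is a generic illustration, not the BeamSearchLSTM implementation; `expand_beams` and its plain-list inputs are hypothetical stand-ins for the tensors the class works with.

```python
import math


def expand_beams(beam_scores, step_log_probs, beam_size):
    """Select the best continuations for one decoding step.

    beam_scores: cumulative log probability of each live beam.
    step_log_probs: for each beam, the log probabilities of every next token.
    Returns up to beam_size tuples of (new score, parent beam, token).
    """
    candidates = []
    for parent, (score, token_log_probs) in enumerate(
            zip(beam_scores, step_log_probs)):
        for token, log_prob in enumerate(token_log_probs):
            # Scores add in log space: log P(prefix) + log P(token | prefix).
            candidates.append((score + log_prob, parent, token))
    candidates.sort(key=lambda item: item[0], reverse=True)
    return candidates[:beam_size]


# Two live beams, vocabulary of two tokens.
scores = [math.log(0.6), math.log(0.4)]
next_token_log_probs = [
    [math.log(0.7), math.log(0.3)],  # distribution after beam 0
    [math.log(0.9), math.log(0.1)],  # distribution after beam 1
]
top = expand_beams(scores, next_token_log_probs, beam_size=2)
```

Here the two surviving hypotheses come from different parents (beam 0 with token 0 and beam 1 with token 0), which is why a beam search has to track the parent index of every kept candidate, not just the token.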