nlp_architect.models.gnmt.scripts package¶
Submodules¶
nlp_architect.models.gnmt.scripts.bleu module¶
Python implementation of BLEU and smooth-BLEU.
This module provides a Python implementation of BLEU and smooth-BLEU. Smooth BLEU is computed following the method outlined in the paper: Chin-Yew Lin, Franz Josef Och. ORANGE: a method for evaluating automatic evaluation metrics for machine translation. COLING 2004.
nlp_architect.models.gnmt.scripts.bleu.compute_bleu(reference_corpus, translation_corpus, max_order=4, smooth=False)[source]¶
Computes BLEU score of translated segments against one or more references.
Parameters: - reference_corpus – list of lists of references for each translation. Each reference should be tokenized into a list of tokens.
- translation_corpus – list of translations to score. Each translation should be tokenized into a list of tokens.
- max_order – Maximum n-gram order to use when computing BLEU score.
- smooth – Whether or not to apply Lin et al. 2004 smoothing.
Returns: Tuple containing the BLEU score, the n-gram precisions, the geometric mean of n-gram precisions, and the brevity penalty.
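A minimal usage sketch (the sentences are illustrative only), taking the BLEU score from the first element of the returned tuple per the Returns entry above:

    from nlp_architect.models.gnmt.scripts.bleu import compute_bleu

    # One translation with one reference; every sentence is pre-tokenized
    # into a list of tokens, as the parameters above require.
    reference_corpus = [[["the", "cat", "sat", "on", "the", "mat"]]]
    translation_corpus = [["the", "cat", "sat", "on", "mat"]]

    result = compute_bleu(reference_corpus, translation_corpus,
                          max_order=4, smooth=True)
    bleu_score = result[0]  # BLEU score is the first element of the tuple
    print("BLEU: %.4f" % bleu_score)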
nlp_architect.models.gnmt.scripts.rouge module¶
ROUGE metric implementation.
Copied from tf_seq2seq/seq2seq/metrics/rouge.py. This is a modified and slightly extended version of https://github.com/miso-belica/sumy/blob/dev/sumy/evaluation/rouge.py.
nlp_architect.models.gnmt.scripts.rouge.rouge(hypotheses, references)[source]¶
Calculates average ROUGE scores for a list of hypotheses and references.
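A minimal usage sketch, assuming plain-text sentence strings and that the function returns a mapping of metric names to averaged scores, as in the tf_seq2seq implementation this module was copied from:

    from nlp_architect.models.gnmt.scripts.rouge import rouge

    hypotheses = ["the cat was found under the bed"]
    references = ["the cat was under the bed"]

    # Assumed: the result behaves like a dict of metric name -> score.
    scores = rouge(hypotheses, references)
    for name, value in sorted(scores.items()):
        print("%s: %.4f" % (name, value))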
nlp_architect.models.gnmt.scripts.rouge.rouge_l_sentence_level(evaluated_sentences, reference_sentences)[source]¶
Computes ROUGE-L (sentence level) of two text collections of sentences. http://research.microsoft.com/en-us/um/people/cyl/download/papers/rouge-working-note-v1.3.1.pdf
Calculated according to:

R_lcs = LCS(X, Y) / m
P_lcs = LCS(X, Y) / n
F_lcs = ((1 + beta^2) * R_lcs * P_lcs) / (R_lcs + beta^2 * P_lcs)

where:
X = reference summary
Y = candidate summary
m = length of reference summary
n = length of candidate summary
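The formula can be made concrete with a short standalone sketch; the lcs_length helper and the beta value below are illustrative assumptions, not this module's internals:

    def lcs_length(x, y):
        """Length of the longest common subsequence of token lists x and y."""
        table = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
        for i, xi in enumerate(x, 1):
            for j, yj in enumerate(y, 1):
                table[i][j] = (table[i - 1][j - 1] + 1 if xi == yj
                               else max(table[i - 1][j], table[i][j - 1]))
        return table[-1][-1]

    def f_lcs(reference, candidate, beta=1.0):
        """Sentence-level ROUGE-L per the formula above (beta is assumed)."""
        lcs = lcs_length(reference, candidate)
        if lcs == 0:
            return 0.0
        r_lcs = lcs / len(reference)   # recall:    LCS(X, Y) / m
        p_lcs = lcs / len(candidate)   # precision: LCS(X, Y) / n
        return ((1 + beta ** 2) * r_lcs * p_lcs) / (r_lcs + beta ** 2 * p_lcs)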
Parameters: - evaluated_sentences – The sentences that have been picked by the summarizer
- reference_sentences – The sentences from the reference set
Returns: F_lcs
Return type: A float
Raises: ValueError – raised if a parameter has len <= 0
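A minimal usage sketch, assuming both arguments are lists of plain-text sentence strings; per the docstring above, the return value is the F_lcs float:

    from nlp_architect.models.gnmt.scripts.rouge import rouge_l_sentence_level

    evaluated = ["the cat was found under the bed"]
    reference = ["the cat was under the bed"]

    f_lcs = rouge_l_sentence_level(evaluated, reference)
    print("ROUGE-L (sentence level) F_lcs: %.4f" % f_lcs)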
nlp_architect.models.gnmt.scripts.rouge.rouge_l_summary_level(evaluated_sentences, reference_sentences)[source]¶
Computes ROUGE-L (summary level) of two text collections of sentences. http://research.microsoft.com/en-us/um/people/cyl/download/papers/rouge-working-note-v1.3.1.pdf
Calculated according to:

R_lcs = SUM(i=1, u)[LCS∪(r_i, C)] / m
P_lcs = SUM(i=1, u)[LCS∪(r_i, C)] / n
F_lcs = ((1 + beta^2) * R_lcs * P_lcs) / (R_lcs + beta^2 * P_lcs)

where:
SUM(i=1, u) = sum from i = 1 through u
u = number of sentences in the reference summary
C = candidate summary made up of v sentences
m = number of words in the reference summary
n = number of words in the candidate summary
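The LCS∪ term is the union of the LCS matches between a reference sentence r_i and every sentence of the candidate C. A standalone sketch of that term (the helpers are illustrative assumptions, not this module's internals):

    def lcs_positions(x, y):
        """Positions in token list x covered by one LCS of x and y."""
        table = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
        for i, xi in enumerate(x, 1):
            for j, yj in enumerate(y, 1):
                table[i][j] = (table[i - 1][j - 1] + 1 if xi == yj
                               else max(table[i - 1][j], table[i][j - 1]))
        positions, i, j = set(), len(x), len(y)
        while i and j:  # backtrack through the DP table
            if x[i - 1] == y[j - 1]:
                positions.add(i - 1)
                i, j = i - 1, j - 1
            elif table[i - 1][j] >= table[i][j - 1]:
                i -= 1
            else:
                j -= 1
        return positions

    def union_lcs_length(reference_sentence, candidate_sentences):
        """|LCS∪(r_i, C)|: tokens of r_i matched by the LCS with any sentence of C."""
        hit = set()
        for candidate in candidate_sentences:
            hit |= lcs_positions(reference_sentence, candidate)
        return len(hit)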
Parameters: - evaluated_sentences – The sentences that have been picked by the summarizer
- reference_sentences – The sentences from the reference set
Returns: F_lcs
Return type: A float
Raises: ValueError – raised if a parameter has len <= 0
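A minimal usage sketch, assuming both arguments are lists of plain-text sentence strings:

    from nlp_architect.models.gnmt.scripts.rouge import rouge_l_summary_level

    evaluated = ["the cat was found under the bed",
                 "it was hiding there all day"]
    reference = ["the cat was under the bed",
                 "it hid there all day"]

    f_lcs = rouge_l_summary_level(evaluated, reference)
    print("ROUGE-L (summary level) F_lcs: %.4f" % f_lcs)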
nlp_architect.models.gnmt.scripts.rouge.rouge_n(evaluated_sentences, reference_sentences, n=2)[source]¶
Computes ROUGE-N of two text collections of sentences. Source: http://research.microsoft.com/en-us/um/people/cyl/download/papers/rouge-working-note-v1.3.1.pdf
Parameters: - evaluated_sentences – The sentences that have been picked by the summarizer
- reference_sentences – The sentences from the reference set
- n – Size of the n-gram. Defaults to 2.
Returns: A tuple (f1, precision, recall) for ROUGE-N
Raises: ValueError – raised if a parameter has len <= 0
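A minimal usage sketch, unpacking the (f1, precision, recall) tuple documented above; the sentences are assumed to be plain-text strings:

    from nlp_architect.models.gnmt.scripts.rouge import rouge_n

    evaluated = ["the cat was found under the bed"]
    reference = ["the cat was under the bed"]

    f1, precision, recall = rouge_n(evaluated, reference, n=2)
    print("ROUGE-2: F1=%.4f  P=%.4f  R=%.4f" % (f1, precision, recall))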