Module lingua.writer
Functions
def check_input_file_path(input_file_path: pathlib.Path)
def check_output_directory_path(output_directory_path: pathlib.Path)
Classes
class LanguageModelFilesWriter
-
This class creates language model files and writes them to a directory.
Static methods
def create_and_write_language_model_files(input_file_path: pathlib.Path, output_directory_path: pathlib.Path, language: Language, char_class: str)
- Create language model files and write them to a directory.
Args
input_file_path
- The path to a txt file used for language model creation. The assumed encoding of the txt file is UTF-8.
output_directory_path
- The path to an existing directory where the language model files are to be written.
language
- The language for which to create language models.
char_class
- A regex character class such as \p{L} to restrict the set of characters that the language models are built from.
Raises
Exception
- Raised if the input file path is not absolute or does not point to an existing txt file, if the input file's encoding is not UTF-8, if the output directory path is not absolute or does not point to an existing directory, or if the character class cannot be compiled to a valid regular expression.
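The path conditions listed under Raises can be checked before the writer is called, which gives a more specific message than a generic Exception. The following is a minimal sketch and not lingua's own validation logic: `preflight` and `write_english_models` are hypothetical helpers, and importing `Language` from the top-level `lingua` package is an assumption. The character class check is left to the library.

```python
from pathlib import Path
from typing import List


def preflight(input_file_path: Path, output_directory_path: Path) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if none were found."""
    problems = []

    if not input_file_path.is_absolute():
        problems.append("input file path is not absolute")
    elif not (input_file_path.is_file() and input_file_path.suffix == ".txt"):
        problems.append("input file path does not point to an existing txt file")
    else:
        try:
            # Decoding the whole file is the simplest way to confirm UTF-8.
            input_file_path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            problems.append("input file is not valid UTF-8")

    if not output_directory_path.is_absolute():
        problems.append("output directory path is not absolute")
    elif not output_directory_path.is_dir():
        problems.append("output directory path does not point to an existing directory")

    return problems


def write_english_models(input_file_path: Path, output_directory_path: Path) -> None:
    """Hypothetical wrapper: validate the paths, then call the documented writer."""
    problems = preflight(input_file_path, output_directory_path)
    if problems:
        raise ValueError("; ".join(problems))

    # Imported lazily so that preflight() can be used without lingua installed.
    # Assumption: Language is importable from the top-level package.
    from lingua import Language
    from lingua.writer import LanguageModelFilesWriter

    LanguageModelFilesWriter.create_and_write_language_model_files(
        input_file_path,        # absolute path to a UTF-8 txt file
        output_directory_path,  # absolute path to an existing directory
        Language.ENGLISH,       # language to build models for
        r"\p{L}",               # char_class: keep letters only
    )
```

The raw string keeps the backslash in \p{L} intact. Relative paths can be made absolute with `Path.resolve()` before they are passed in.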
class TestDataFilesWriter
-
This class creates test data files for accuracy report generation and writes them to a directory.
Static methods
def create_and_write_test_data_files(input_file_path: pathlib.Path, output_directory_path: pathlib.Path, char_class: str, maximum_lines: int)
-
Create test data files for accuracy report generation and write them to a directory.
Args
input_file_path
- The path to a txt file used for test data creation. The assumed encoding of the txt file is UTF-8.
output_directory_path
- The path to an existing directory where the test data files are to be written.
char_class
- A regex character class such as \p{L} to restrict the set of characters that the test data are built from.
maximum_lines
- The maximum number of lines each test data file should have.
Raises
Exception
- Raised if the input file path is not absolute or does not point to an existing txt file, if the input file's encoding is not UTF-8, if the output directory path is not absolute or does not point to an existing directory, or if the character class cannot be compiled to a valid regular expression.
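A call to this method might look like the sketch below, which first writes a small UTF-8 corpus so that the input requirements are met. `make_sample_corpus`, `write_test_data` and `demo` are hypothetical helpers introduced for illustration. The arguments are passed positionally in the documented order, and nothing is assumed here about the names or layout of the files the writer produces.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def make_sample_corpus(directory: Path, lines: Iterable[str]) -> Path:
    """Hypothetical helper: write lines to a UTF-8 txt file, return its absolute path."""
    corpus = directory.resolve() / "corpus.txt"
    corpus.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return corpus


def write_test_data(corpus: Path, output_directory: Path, maximum_lines: int) -> None:
    """Hypothetical wrapper around the documented static method."""
    # Imported lazily so that make_sample_corpus() works without lingua installed.
    from lingua.writer import TestDataFilesWriter

    TestDataFilesWriter.create_and_write_test_data_files(
        corpus,                      # absolute path to a UTF-8 txt file
        output_directory.resolve(),  # must already exist
        r"\p{L}",                    # char_class: keep letters only
        maximum_lines,               # upper bound on lines per test data file
    )


def demo() -> None:
    """Create a throwaway corpus and generate test data next to it."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        corpus = make_sample_corpus(tmp_dir, ["This is a sentence.", "Here is another one."])
        write_test_data(corpus, tmp_dir, maximum_lines=1000)
```

Calling `demo()` requires lingua to be installed. The value 1000 for `maximum_lines` is an arbitrary choice for the example.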