Builds (n+1)-grams from n-grams.

add_1grams(ngram, u, seq_length)

Arguments

ngram

a single n-gram.

u

integer, numeric or character vector of all possible unigrams.

seq_length

length of the original sequence.

Value

vector of (n+1)-grams, where n is the size of the input n-gram.

Details

n-grams are built by inserting every possible unigram at every possible free position. The total length of an n-gram (n plus the total distance between its elements) is limited by the length of the original sequence, because an n-gram cannot be longer than the sequence it comes from.
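
The length limit can be checked by hand. The sketch below is not part of the package: ngram_span is a hypothetical helper, and it assumes the string notation seen in the examples on this page, where the last underscore-separated field lists the distances between elements, the field before it lists the elements, and an optional leading field gives the position.

```r
# Hypothetical helper (not a package function): total length of an n-gram,
# i.e. the number of elements plus the sum of the distances between them.
ngram_span <- function(ngram) {
  parts <- strsplit(ngram, "_", fixed = TRUE)[[1]]
  # Assumed layout: [position_]elements_distances
  elements <- strsplit(parts[length(parts) - 1], ".", fixed = TRUE)[[1]]
  distances <- as.integer(strsplit(parts[length(parts)], ".", fixed = TRUE)[[1]])
  length(elements) + sum(distances)
}

ngram_span("1_2.3.4_3.0")      # 3 elements + distances 3 and 0 = 6
ngram_span("1_2.3.4.1_3.0.1")  # 4 elements + distances 3, 0 and 1 = 8
ngram_span("a.a_1")            # 2 elements + distance 1 = 3
```

Under these assumptions, the input "1_2.3.4_3.0" has a total length of 6, and the longest result in the first example below, "1_2.3.4.1_3.0.1", has a total length of 8, which is exactly the seq_length passed to add_1grams.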

See also

Reverse function: gap_ngrams.

Examples

add_1grams("1_2.3.4_3.0", 1L:4, 8)
#>  [1] "1_2.1.3.4_0.2.0" "1_2.2.3.4_0.2.0" "1_2.3.3.4_0.2.0" "1_2.4.3.4_0.2.0"
#>  [5] "1_2.1.3.4_1.1.0" "1_2.2.3.4_1.1.0" "1_2.3.3.4_1.1.0" "1_2.4.3.4_1.1.0"
#>  [9] "1_2.1.3.4_2.0.0" "1_2.2.3.4_2.0.0" "1_2.3.3.4_2.0.0" "1_2.4.3.4_2.0.0"
#> [13] "1_2.3.4.1_3.0.0" "1_2.3.4.2_3.0.0" "1_2.3.4.3_3.0.0" "1_2.3.4.4_3.0.0"
#> [17] "1_2.3.4.1_3.0.1" "1_2.3.4.2_3.0.1" "1_2.3.4.3_3.0.1" "1_2.3.4.4_3.0.1"
add_1grams("a.a_1", c("a", "b", "c"), 4)
#> [1] "a.a.a_0.0" "a.b.a_0.0" "a.c.a_0.0" "a.a.a_1.0" "a.a.b_1.0" "a.a.c_1.0"
#> [7] "a.a.a_0.1" "b.a.a_0.1" "c.a.a_0.1"