Transforms a vector of positioned n-grams into a list of positions, each filled with the n-grams that start at it.
position_ngrams(ngrams, df = FALSE, unigrams_output = TRUE)
ngrams | a vector of positioned n-grams (as created by |
---|---|
df | logical, if |
unigrams_output | logical, if |
If df is FALSE, returns a list whose length equals the number of unique n-gram start positions present in ngrams. Each element of the list contains the n-grams that start at that position. If df is TRUE, returns a data frame whose first column contains the n-grams and whose second column contains their start positions.
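The two return shapes carry the same information, which a short base R sketch can make concrete. The values below are copied from the data frame example on this page; the variable names (pos_df, pos_list, back_df) are hypothetical, and the code only illustrates how the shapes relate, not how the package builds them.

```r
# Data frame shaped like the documented df = TRUE output
# (values copied from the example on this page).
pos_df <- data.frame(
  ngram = factor(c("1_0", "1_0", "1_0", "2_0", "1_0", "2_0", "2_0", "2_0", "2_0")),
  position = c(2, 3, 3, 3, 4, 4, 5, 5, 5)
)

# Data frame -> list: group the n-grams by their start position.
pos_list <- split(pos_df$ngram, pos_df$position)

# List -> data frame: repeat each position once per n-gram stored under it.
back_df <- data.frame(
  ngram = unlist(pos_list, use.names = FALSE),
  position = as.numeric(rep(names(pos_list), lengths(pos_list)))
)
```

Here pos_list has one element per unique start position (named "2" to "5"), mirroring the list form, and back_df recovers the original rows.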
Transform n-gram name to human-friendly form: decode_ngrams.
Validate n-gram structure: is_ngram.
#> $`2`
#> [1] 1_0
#> Levels: 1_0 2_0
#>
#> $`3`
#> [1] 1_0 1_0 2_0
#> Levels: 1_0 2_0
#>
#> $`4`
#> [1] 1_0 2_0
#> Levels: 1_0 2_0
#>
#> $`5`
#> [1] 2_0 2_0 2_0
#> Levels: 1_0 2_0
#>
# position data in the data frame format
position_ngrams(c("2_1.1.2_0.1", "3_1.1.2_0.0", "3_2.2.2_0.0"), df = TRUE)
#>   ngram position
#> 1   1_0        2
#> 2   1_0        3
#> 3   1_0        3
#> 4   2_0        3
#> 5   1_0        4
#> 6   2_0        4
#> 7   2_0        5
#> 8   2_0        5
#> 9   2_0        5
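The example output is consistent with reading each positioned n-gram as start_elements_distances and splitting it into unigrams placed at their absolute positions. The base R sketch below follows that reading. It is an assumption inferred from the example, not the package's implementation. The helper split_positioned is hypothetical, and the "_0" suffix on unigram names is taken from the printed output.

```r
# Hypothetical helper, not part of the package: split one positioned n-gram
# such as "2_1.1.2_0.1" into unigrams with their absolute positions.
# Assumed encoding (inferred from the example): start_elements_distances.
split_positioned <- function(ngram) {
  parts <- strsplit(ngram, "_", fixed = TRUE)[[1]]
  start <- as.integer(parts[1])
  elements <- strsplit(parts[2], ".", fixed = TRUE)[[1]]
  gaps <- as.integer(strsplit(parts[3], ".", fixed = TRUE)[[1]])
  # Each element sits one place after the previous one, plus the gap between them.
  offsets <- c(0L, cumsum(gaps + 1L))[seq_along(elements)]
  data.frame(ngram = paste0(elements, "_0"),
             position = start + offsets,
             stringsAsFactors = FALSE)
}

ngrams <- c("2_1.1.2_0.1", "3_1.1.2_0.0", "3_2.2.2_0.0")

# Stack the unigrams of all n-grams, then sort by position
# (ties keep their original order).
unigrams <- do.call(rbind, lapply(ngrams, split_positioned))
unigrams <- unigrams[order(unigrams$position), ]
rownames(unigrams) <- NULL
```

For the three n-grams above, unigrams has the same ngram and position columns as the data frame printed in the example, with ngram stored as character rather than factor.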