Transforms a vector of positioned n-grams into a list of positions, each filled with the n-grams that start at it.

position_ngrams(ngrams, df = FALSE, unigrams_output = TRUE)

Arguments

ngrams

a vector of positioned n-grams (as created by count_ngrams).

df

logical, if TRUE returns a data frame, if FALSE returns a list.

unigrams_output

logical, if TRUE extracts unigrams from the data and returns information about their position.

Value

If df is FALSE, returns a list of length equal to the number of unique n-gram start positions present in ngrams. Each element of the list contains the n-grams that start at that position. If df is TRUE, returns a data frame whose first column contains the n-grams and whose second column contains their start positions.

See also

Transform n-gram name to human-friendly form: decode_ngrams.

Validate n-gram structure: is_ngram.

Examples

# position data in the list format
position_ngrams(c("2_1.1.2_0.1", "3_1.1.2_0.0", "3_2.2.2_0.0"))
#> $`2`
#> [1] 1_0
#> Levels: 1_0 2_0
#>
#> $`3`
#> [1] 1_0 1_0 2_0
#> Levels: 1_0 2_0
#>
#> $`4`
#> [1] 1_0 2_0
#> Levels: 1_0 2_0
#>
#> $`5`
#> [1] 2_0 2_0 2_0
#> Levels: 1_0 2_0
#>
# position data in the data frame format
position_ngrams(c("2_1.1.2_0.1", "3_1.1.2_0.0", "3_2.2.2_0.0"), df = TRUE)
#>   ngram position
#> 1   1_0        2
#> 2   1_0        3
#> 3   1_0        3
#> 4   2_0        3
#> 5   1_0        4
#> 6   2_0        4
#> 7   2_0        5
#> 8   2_0        5
#> 9   2_0        5
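
# how the unigram positions in the examples above arise
The sketch below recomputes those positions by hand in base R. It assumes that a positioned n-gram name has the form start_elements_distances, as the example input suggests: "2_1.1.2_0.1" would then start at position 2, contain the elements 1, 1, 2, and have gaps of 0 and 1 between consecutive elements. unigram_positions is a hypothetical helper written for illustration only. It is not part of the package and is not meant to reflect how position_ngrams is implemented.

```r
# Hypothetical helper (not part of the package): decode one positioned
# n-gram name of the ASSUMED form "<start>_<elements>_<distances>".
unigram_positions <- function(ngram) {
  parts <- strsplit(ngram, "_", fixed = TRUE)[[1]]
  start <- as.integer(parts[1])
  elements <- strsplit(parts[2], ".", fixed = TRUE)[[1]]
  gaps <- as.integer(strsplit(parts[3], ".", fixed = TRUE)[[1]])
  # Each element lies one step plus the preceding gap after the previous one.
  offsets <- c(0L, cumsum(gaps + 1L))[seq_along(elements)]
  data.frame(ngram = paste0(elements, "_0"),  # unigram name with distance 0
             position = start + offsets,
             stringsAsFactors = FALSE)
}

ngrams <- c("2_1.1.2_0.1", "3_1.1.2_0.0", "3_2.2.2_0.0")

# Data-frame-like view: one row per unigram occurrence.
decoded <- do.call(rbind, lapply(ngrams, unigram_positions))
decoded <- decoded[order(decoded$position, decoded$ngram), ]

# List-like view: unigrams grouped by the position they occupy.
by_position <- split(decoded$ngram, decoded$position)
```

Under these assumptions, by_position has the elements 2, 3, 4 and 5, holding the same unigrams as the list output above, although as character vectors rather than factors.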