Computes the total number of n-grams that can be extracted from sequences.
count_total(seq, n, d)
seq | a vector or matrix describing sequence(s). |
---|---|
n | size of the n-gram. |
d | distances between elements of the n-gram. |
An integer representing the total number of n-grams.
The maximum number of possible n-grams is limited by the length of the sequences and the distances between elements of the n-gram.
The format of the d vector is discussed in the Details of count_ngrams.
seqs <- matrix(sample(1L:4, 600, replace = TRUE), ncol = 50)
# make several sequences shorter by replacing them partially with NA
seqs[8L:11, 46L:50] <- NA
seqs[1L, 31L:50] <- NA
count_total(seqs, 3, c(1, 0))
#> [1] 524
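The total in the example can be checked by hand with base R. The sketch below is not the package's implementation. It assumes that an n-gram of size n with distances d spans n + sum(d) consecutive positions, that d has one entry per gap between elements (a single value is recycled), and that NA values appear only at the end of shortened sequences. The helper `ngrams_per_seq` is a hypothetical name introduced for illustration.

```r
# Hypothetical helper (not part of the package): number of n-grams that fit
# into sequences of the given lengths, assuming each n-gram spans
# n + sum(d) consecutive positions.
ngrams_per_seq <- function(seq_lengths, n, d) {
  d <- rep_len(d, n - 1L)           # one distance per gap between elements
  span <- n + sum(d)                # positions covered by a single n-gram
  pmax(seq_lengths - span + 1, 0)   # sequences shorter than span yield none
}

# Deterministic stand-in for the example data: 12 sequences of 50 elements.
# Only the positions of the NAs matter for the count, not the values.
seqs <- matrix(rep(1L:4, length.out = 600), ncol = 50)
seqs[8L:11, 46L:50] <- NA
seqs[1L, 31L:50] <- NA

# Effective length of each sequence (assumes NAs only pad the end)
seq_lengths <- rowSums(!is.na(seqs))

total <- sum(ngrams_per_seq(seq_lengths, n = 3, d = c(1, 0)))
total
```

Under these assumptions, the seven full sequences contribute 47 n-grams each, the four sequences of length 45 contribute 42 each, and the sequence of length 30 contributes 27, which adds up to 524, the value reported in the example.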