Clusters sequences hierarchically using regular expressions. At each step, the clustering minimizes the total number of degrees of freedom of the regular expressions needed to describe the data.
cluster_reg_exp(ngrams)
| ngrams | list of elements |
|---|---|
A list of four elements:
"regExps": regular expressions in the best clustering
"seqClustering": clustering of sequences in the best clustering
"allRegExps": all regular expressions
"allIndices": all clusterings
A regular expression is a list whose length equals the length of the input sequences. Each element of the list represents a position in the sequence and contains the amino acids that are likely to occur at that position.
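The list-based representation described above can be illustrated with a minimal base R sketch. The regular expression below is made up for illustration, and `matches_reg_exp` is a hypothetical helper that is not part of the package; the sketch only assumes that each list element is a character vector of amino acids allowed at that position.

```r
# Hypothetical regular expression for sequences of length 4:
# one character vector of allowed amino acids per position.
reg_exp <- list(
  c("A", "G"),
  c("L"),
  c("K", "R"),
  c("S", "T", "A")
)

# Hypothetical helper (not part of the package): TRUE when every residue
# of `sequence` is among the amino acids allowed at its position.
matches_reg_exp <- function(sequence, reg_exp) {
  stopifnot(length(sequence) == length(reg_exp))
  all(mapply(function(residue, allowed) residue %in% allowed,
             sequence, reg_exp))
}

matches_reg_exp(c("A", "L", "R", "T"), reg_exp)  # every position is allowed
matches_reg_exp(c("A", "L", "D", "T"), reg_exp)  # "D" is not allowed at position 3

# The same list can be collapsed into a conventional regex string,
# with one character class per position.
pattern <- paste0(
  "^",
  paste0(vapply(reg_exp,
                function(allowed) paste0("[", paste(allowed, collapse = ""), "]"),
                character(1)),
         collapse = ""),
  "$"
)
grepl(pattern, "ALRT")
```

Collapsing the list into a character-class pattern is only a convenience for inspection; whether the package uses such strings internally is not stated here.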
data(human_cleave)
# cluster_reg_exp is computationally expensive
results <- cluster_reg_exp(human_cleave[1L:10, 1L:4])
#> |======================================================================| 100%