algorithms analysis answer api collection concepts corpus design documents features framework human index infer install introduction latent dirichlet allocation model objectives paragraphs python query questions random reference representation semantic similar space sparse structure SVD text thought topic training tutorials unsupervised vector words