Download

Get gensim version 0.8.2 from the Python Package Index or install directly with:

easy_install -U gensim

Questions? Suggestions?

Join the Google discussion group

Check the source code on GitHub. Report bugs at the issue tracker.

Gensim – Topic Modelling for Humans

Quick Reference Example

>>> from gensim import corpora, models, similarities
>>>
>>> # Load corpus iterator from a Matrix Market file on disk.
>>> corpus = corpora.MmCorpus('/path/to/corpus.mm')
>>>
>>> # Initialize a transformation (Latent Semantic Indexing with 200 latent dimensions).
>>> lsi = models.LsiModel(corpus, num_topics=200)
>>>
>>> # Convert another corpus to the latent space and index it.
>>> index = similarities.MatrixSimilarity(lsi[another_corpus])
>>>
>>> # Determine the similarity of a query document (a vector in the same LSI space) against each indexed document.
>>> sims = index[query]
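
To make the last step more concrete, here is a minimal pure-Python sketch of what a similarity query returns conceptually: one cosine similarity score per indexed document. It is not gensim's implementation. The function name `cosine_similarities` and the toy vectors are hypothetical, chosen for illustration only; the vectors use the same sparse layout as gensim documents, a list of `(feature_id, weight)` pairs.

```python
import math


def cosine_similarities(index_docs, query):
    """Return the cosine similarity of `query` against each document.

    Hypothetical helper for illustration; documents and the query are
    sparse vectors given as lists of (feature_id, weight) pairs.
    """
    def norm(vec):
        # Euclidean length of a sparse vector.
        return math.sqrt(sum(w * w for _, w in vec))

    query_weights = dict(query)
    query_norm = norm(query)
    sims = []
    for doc in index_docs:
        # Dot product over the features the document actually contains.
        dot = sum(w * query_weights.get(fid, 0.0) for fid, w in doc)
        denom = norm(doc) * query_norm
        # An all-zero vector has no direction, so report 0.0 for it.
        sims.append(dot / denom if denom else 0.0)
    return sims


# Toy data (made up): three indexed documents and one query vector.
docs = [
    [(0, 1.0), (1, 1.0)],
    [(1, 2.0)],
    [(2, 3.0)],
]
query = [(1, 1.0)]
sims = cosine_similarities(docs, query)
```

Sorting the scores in descending order gives the indexed documents most similar to the query first, which is how the result of a query like `index[query]` is typically used.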

What’s new?