Hi there, I'm using gensim to do LDA on a collection of novels (just 40 for testing; I have several hundred in total). Building the corpus and dictionary seems to work fine, as does the modeling process itself. I can also inspect the resulting model (topics in documents and words in topics, for example). However, when I try to use pyLDAvis, I run into a KeyError.
I'm on Linux (Ubuntu 14.04) and using Python 3.4 and the following versions of relevant modules:
pyLDAvis 1.2.0
numpy 1.9.2
gensim 0.11.1-1
This is my code (loading corpus, dictionary and model from previous step):
from gensim import corpora, models

def gensim_output(modelfile, corpusfile, dictionaryfile):
    """Displaying gensim topic models"""
    ## Load files from "gensim_modeling"
    corpus = corpora.MmCorpus(corpusfile)
    dictionary = corpora.Dictionary.load(dictionaryfile)  # for pyLDAvis
    myldamodel = models.ldamodel.LdaModel.load(modelfile)
    ## Interactive visualisation
    import pyLDAvis.gensim
    vis = pyLDAvis.gensim.prepare(myldamodel, corpus, dictionary)
    pyLDAvis.display(vis)
This is the output I get:
Traceback (most recent call last):
  File "<ipython-input-79-940daa51d8a9>", line 1, in <module>
    runfile('/home/[PATH]/an5/mygensim.py', wdir='/home/christof/Dropbox/0-Analysen/2015/rp_Sydney/an5')
  File "/usr/lib/python3/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 586, in runfile
    execfile(filename, namespace)
  File "/usr/lib/python3/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 48, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
  File "/home/[PATH]/an5/mygensim.py", line 84, in <module>
    main("./5_lemmata/*.txt", "gensim_corpus.dict", "gensim_corpus.mm", "gensim_modelfile.gensim")
  File "/home/[PATH]/an5/mygensim.py", line 82, in main
    gensim_output(modelfile, corpusfile, dictionaryfile)
  File "/home/[PATH]/an5/mygensim.py", line 75, in gensim_output
    vis = pyLDAvis.gensim.prepare(myldamodel, corpus, dictionary)
  File "/usr/local/lib/python3.4/dist-packages/pyLDAvis/gensim.py", line 61, in prepare
    return vis_prepare(**_extract_data(topic_model, corpus, dictionary))
  File "/usr/local/lib/python3.4/dist-packages/pyLDAvis/gensim.py", line 24, in _extract_data
    term_freqs = [term_freqs_dict[id] for id in xrange(N)]
  File "/usr/local/lib/python3.4/dist-packages/pyLDAvis/gensim.py", line 24, in <listcomp>
    term_freqs = [term_freqs_dict[id] for id in xrange(N)]
KeyError: 6
Not sure whether this is a bug or bad usage of the module. Any help would be very much appreciated.
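In case it helps with diagnosis, here is a minimal sketch that reproduces the same kind of KeyError without gensim or pyLDAvis. It rests on assumptions I have not verified against the pyLDAvis source: that `term_freqs_dict` is a plain dict of counts summed over the corpus, and that `N` is derived from the number of distinct ids seen. The toy corpus and the helper names `term_frequencies` and `missing_ids` are made up for illustration.

```python
# Toy stand-in for a bag-of-words corpus: each document is a list of
# (term_id, count) pairs. The data is invented; note that id 6 never occurs.
toy_corpus = [
    [(0, 2), (1, 1), (3, 1)],
    [(1, 4), (2, 1), (5, 2)],
    [(0, 1), (4, 3), (7, 1)],
]


def term_frequencies(corpus):
    """Sum the counts of every term id over all documents (hypothetical helper)."""
    freqs = {}
    for doc in corpus:
        for term_id, count in doc:
            freqs[term_id] = freqs.get(term_id, 0) + count
    return freqs


def missing_ids(freqs, num_terms):
    """Return the ids in range(num_terms) that have no entry in freqs."""
    return [term_id for term_id in range(num_terms) if term_id not in freqs]


freqs = term_frequencies(toy_corpus)
n = len(freqs)  # number of distinct ids seen, assumed to play the role of N

# Ids below n that the corpus never uses.
print(missing_ids(freqs, n))

# The same lookup pattern as the failing line in the traceback.
try:
    term_freqs = [freqs[i] for i in range(n)]
except KeyError as err:
    print("KeyError:", err)
```

With the real data, the same two helpers could be called on the loaded `corpus` and on `len(dictionary)` to see whether the dictionary contains ids that no document uses. If it does, that would point to gaps in the id range rather than to the model itself.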