Topic assigned to a document with 0.99 probability, but the document and the topic share no words #33

Hello

I have 200k documents and have fitted 100 topics. Looking at the top terms, the topics look good.
But when I want to look at example documents for each topic, I run `probs, _ = topic_model.transform(count_matrix, details=True)`. Then I create a new column for each topic, for example `dataframe['topic=0']=pd.Series(probs[:, 0])`. When I sort the dataframe by that probability in descending order, about 1/3 of the top documents are relevant to the topic, but the rest are irrelevant. Moreover, those irrelevant documents do not share a single word with the topic, so nothing indicates any similarity between them and the topic.
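One thing worth ruling out here is index alignment in pandas. Assigning a `pd.Series` to a dataframe column matches rows by index label, not by position, so if `dataframe` does not have the default 0..n-1 index (for example after filtering or shuffling), the probabilities can land on the wrong documents. The sketch below uses made-up data (the `text` column, the index labels, and the `probs` values are all hypothetical) only to show the difference between assigning a Series and assigning the raw array. I have not confirmed that this is what happens in my case.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: a dataframe whose index is no longer 0..n-1
# (e.g. after filtering rows) and an (n_docs, n_topics) probability array.
dataframe = pd.DataFrame({"text": ["doc a", "doc b", "doc c"]}, index=[10, 11, 12])
probs = np.array([[0.99, 0.01], [0.20, 0.80], [0.05, 0.95]])

# Assigning a Series aligns on index labels, not on row position.
# The Series has labels 0, 1, 2 and the dataframe has 10, 11, 12,
# so no label matches and the whole column becomes NaN.
dataframe["topic=0 (Series)"] = pd.Series(probs[:, 0])

# Assigning the raw NumPy column is positional: row i gets probs[i, 0].
dataframe["topic=0"] = probs[:, 0]

# Sort by descending probability to inspect the top documents for topic 0.
top = dataframe.sort_values("topic=0", ascending=False)
print(top)
```

If the index only partially overlaps with 0..n-1, the Series assignment does not produce NaN everywhere; it silently attaches some probabilities to the wrong rows, which would look like irrelevant documents with high scores.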

I also noticed that the last ~10 topics have only a few words (3-8) in the `get_topics` result. These words look random, and their prob values are ~0.2-0.3, which is above average.
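To check the "no shared words" observation numerically rather than by eye, a small helper can list which topic words actually occur in each document row of the count matrix. This is only a sketch under assumptions: `topic_word_overlap`, `vocab`, `topic_words`, and the tiny dense matrix are hypothetical names and data, and the topic words are assumed to be plain strings already extracted from the model.

```python
import numpy as np

def topic_word_overlap(doc_counts, vocab, topic_words):
    """Return the topic words that occur at least once in one document row."""
    # Indices of vocabulary terms with a non-zero count in this document.
    present = {vocab[j] for j in np.flatnonzero(doc_counts)}
    return sorted(present & set(topic_words))

# Hypothetical toy data: 3 documents over a 5-word vocabulary.
vocab = ["cat", "dog", "tax", "bank", "loan"]
count_matrix = np.array([
    [2, 1, 0, 0, 0],   # only "cat" and "dog"
    [0, 0, 1, 3, 0],   # "tax" and "bank"
    [0, 0, 0, 0, 0],   # empty document
])
topic_words = ["bank", "loan", "tax"]

overlaps = [topic_word_overlap(row, vocab, topic_words) for row in count_matrix]
print(overlaps)

# For a SciPy sparse count matrix, one row can be densified first,
# e.g. count_matrix[i].toarray().ravel(), before calling the helper.
```

Running this on the top-ranked documents for a topic would show whether the overlap is really empty or whether the shared words are simply outside the few terms I looked at.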

Could you advise me on how to change the model, in particular how to recalculate the document-topic probability estimates? Thank you.
