@anubhabdaserrr
Contributor

@anubhabdaserrr anubhabdaserrr commented Jun 27, 2025

Hello there,

I just discovered this library, and it works quite well on my survey dataset of size ~10k. Amazing how fast it is!

This feature helps monitor convergence of the topic model. The following code produces the sample plot shown below:

fig = model.plot_loss_arr()
fig.show()

Output:

[Image: loss_curve]

model.loss_arr stores the loss value for each training epoch.
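
Beyond plotting, a per-epoch loss list can also be checked programmatically. The snippet below is only a rough sketch of one way to do that: `has_plateaued`, its `window` and `rel_tol` parameters, and the sample loss values are all hypothetical and not part of FASTopic. It assumes nothing more than that the losses are a plain list of floats.

```python
def has_plateaued(loss_arr, window=5, rel_tol=1e-3):
    """Hypothetical helper: compare the mean loss of the last `window`
    epochs with the mean of the `window` epochs before them."""
    if len(loss_arr) < 2 * window:
        # Not enough epochs recorded to compare two full windows.
        return False
    recent = sum(loss_arr[-window:]) / window
    previous = sum(loss_arr[-2 * window:-window]) / window
    if previous == 0:
        return recent == 0
    # Relative change between the two windows.
    return abs(previous - recent) / abs(previous) < rel_tol


# Made-up loss values, for illustration only.
still_improving = [10.0, 8.0, 6.5, 5.5, 4.8, 4.3, 3.9, 3.6, 3.4, 3.2]
flat = [10.0, 5.0, 3.0, 2.0, 1.5] + [1.0] * 10

print(has_plateaued(still_improving))  # False
print(has_plateaued(flat))             # True
```

With a fitted model, the same check could presumably be applied as `has_plateaued(model.loss_arr)`. Suitable `window` and `rel_tol` values depend on the dataset and on how noisy the loss is.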

@bobxwu
Owner

bobxwu commented Jul 3, 2025

Hi Anubhab,
Thank you for delivering this great feature! I just wanted to check: have you tested it with model saving, loading, and retraining workflows?

@anubhabdaserrr
Contributor Author

I've defined loss_arr as a class attribute that tracks the loss value during the fitting stage. I just checked:

  1. model.plot_loss_arr() does work when a saved model is loaded.
  2. It doesn't work as intended after re-training. If the initial training runs for 10 epochs and fit_transform is then applied again for 2 more epochs, we want the loss to be plotted for all 12 epochs, right?

@anubhabdaserrr
Contributor Author

I've changed the logic (see commit 71c2f6f) so that in successive rounds of re-training, the loss_arr list is extended rather than overwritten. This should fix the issue.
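
To illustrate the extend-versus-overwrite idea in isolation, here is a minimal toy sketch. It is not the code from the commit: `ToyTrainer`, `_train_one_epoch`, and the placeholder loss formula are hypothetical stand-ins, and only the `loss_arr` name is taken from this PR.

```python
class ToyTrainer:
    """Hypothetical stand-in for a model that records one loss per epoch."""

    def __init__(self):
        # Created once, so repeated fit() calls share the same history.
        self.loss_arr = []

    def _train_one_epoch(self, epoch_index):
        # Placeholder "loss" that shrinks as the total epoch count grows.
        return 1.0 / (epoch_index + 1)

    def fit(self, epochs):
        # Continue counting from the epochs already recorded.
        start = len(self.loss_arr)
        new_losses = [self._train_one_epoch(start + i) for i in range(epochs)]
        # Extend the history; `self.loss_arr = new_losses` would discard
        # the losses from earlier training rounds.
        self.loss_arr.extend(new_losses)
        return self


trainer = ToyTrainer()
trainer.fit(epochs=10)
trainer.fit(epochs=3)
print(len(trainer.loss_arr))  # 13
```

Because the list is created only in `__init__`, each later `fit` call appends to the existing history instead of starting over.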

In the following code, I've also attached the outputs in the comments to demonstrate that it works for saving-loading & re-training workflows:

from fastopic import FASTopic

# `docs` is assumed to be a list of document strings defined beforehand.
model = FASTopic(num_topics=50, verbose=True, device='cpu')
print(model.loss_arr)  # Prints -> []

topic_top_words, doc_topic_dist = model.fit_transform(docs, epochs=10)
print(len(model.loss_arr))  # Prints -> 10

topic_top_words2, doc_topic_dist2 = model.fit_transform(docs, epochs=3)
print(len(model.loss_arr))  # Prints -> 13

model.save("fastopic.zip")

loaded_model = FASTopic.from_pretrained("fastopic.zip")
print(len(loaded_model.loss_arr))  # Prints -> 13

topic_top_words3, doc_topic_dist3 = loaded_model.fit_transform(docs, epochs=2)
print(len(loaded_model.loss_arr))  # Prints -> 15

@bobxwu bobxwu merged commit 72de8e6 into bobxwu:master Jul 13, 2025