
Can't get updated topic_embeddings_ when using pre-calculated document embeddings? #2316

Open
@nhansendev


Have you searched existing issues? 🔎

  • I have searched and found no existing issues

Describe the bug

When BERTopic generates the document embeddings itself, I can call update_topics with the documents and my manually generated topic labels. This works as intended and topic_embeddings_ is updated:
topic_model.update_topics(docs, topics=topics)

However, if I calculate the document embeddings myself and pass them to BERTopic like so:
topic_model.fit_transform(docs, embeddings=embeddings)
then topic_embeddings_ is not updated after calling update_topics. This is because update_topics calls _create_topic_vectors without passing any arguments, so it falls back to self.embedding_model, which is None when I supply my own embeddings.

How should I perform the update to get new topic_embeddings_ in this case?
Could a kwarg for passing pre-calculated embeddings to update_topics be added?
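
For reference, a possible stopgap is to recompute the topic embeddings outside BERTopic as the mean of the pre-calculated document embeddings in each topic. The sketch below is only an illustration under that assumption: topic_centroids is a hypothetical helper, not part of BERTopic, and the tiny arrays are made-up demo data. It assumes embeddings is a 2D array with one row per document and topics is the list of labels passed to update_topics.

```python
import numpy as np


def topic_centroids(embeddings, topics):
    """Return (topic_ids, centroids): the mean document embedding per topic.

    Hypothetical helper, not part of BERTopic. Rows of `centroids` follow
    the sorted topic ids, so an outlier topic -1 (if present) comes first.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    topics = np.asarray(topics)
    if embeddings.shape[0] != topics.shape[0]:
        raise ValueError("need exactly one topic label per document embedding")

    topic_ids = np.unique(topics)  # np.unique returns sorted unique labels
    centroids = np.vstack(
        [embeddings[topics == t].mean(axis=0) for t in topic_ids]
    )
    return topic_ids, centroids


# Made-up demo data: four 2-dimensional "document embeddings".
demo_embeddings = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0], [1.0, 1.0]])
demo_topics = [0, 0, 1, -1]

topic_ids, centroids = topic_centroids(demo_embeddings, demo_topics)
print(topic_ids)   # sorted topic labels
print(centroids)   # one row per topic label
```

After topic_model.update_topics(docs, topics=topics), the resulting array could presumably be assigned to topic_model.topic_embeddings_ by hand. Whether that matches what the library would compute internally is an assumption that should be checked against the BERTopic source for the installed version.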

BERTopic Version

Version: 0.16.4


Labels: bug
