
Discussion: Update dataloader to skip rows that don't require training #2344

@felipemello1

Description


#2341

When a) train_on_input=False and b) a message is so long that the output is entirely truncated, a batch may end up with no trainable tokens, which raises an error in the loss due to division by zero.
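
A minimal sketch of the failure mode, assuming the standard PyTorch convention of masking non-trainable positions with the cross-entropy ignore index (-100); shapes and values below are illustrative only:

```python
import torch
import torch.nn.functional as F

# Illustrative only: with train_on_input=False the prompt tokens are masked
# with the cross-entropy ignore index (-100). If truncation removes the entire
# completion, every label in the row ends up masked.
IGNORE_IDX = -100

logits = torch.randn(1, 8, 32)                             # (batch, seq_len, vocab)
labels = torch.full((1, 8), IGNORE_IDX, dtype=torch.long)  # fully masked row

# A loss that normalizes by the number of non-ignored tokens divides by zero.
num_trainable = (labels != IGNORE_IDX).sum()
loss_sum = F.cross_entropy(
    logits.view(-1, logits.size(-1)), labels.view(-1), reduction="sum"
)
print(num_trainable.item(), (loss_sum / num_trainable).item())  # 0 nan
```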

Beyond being an inconvenient bug, this wastes compute, and patching the loss would fix a symptom rather than the root cause.

In the dataloader, should we skip rows that don't have any trainable tokens?
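
A minimal sketch of what such a filter could look like, assuming each tokenized sample is a dict with a "labels" field masked with -100 (torchtune's CROSS_ENTROPY_IGNORE_IDX); the helper names here are hypothetical, not an existing API:

```python
# Hypothetical helpers, not an existing torchtune API. Assumes each tokenized
# sample is a dict with a "labels" list where masked positions use -100.
CROSS_ENTROPY_IGNORE_IDX = -100

def has_trainable_tokens(sample: dict) -> bool:
    """True if at least one label would contribute to the loss."""
    return any(label != CROSS_ENTROPY_IGNORE_IDX for label in sample["labels"])

def filter_untrainable(samples: list[dict]) -> list[dict]:
    """Drop rows whose labels are fully masked (no gradient signal)."""
    return [s for s in samples if has_trainable_tokens(s)]
```

This check could run during tokenization/packing or as a dataset filter step, at the cost of making the number of yielded samples data-dependent.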

Metadata


Labels

best practice: Things we should be doing but aren't
discussion: Start a discussion
triage review: This issue should be discussed in weekly review
