Description
When a) `train_on_input=False` and b) the message is so long that the output is truncated, a batch may contain no trainable tokens, which causes a division-by-zero error in the loss.
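A minimal sketch of how this happens (hypothetical names, assuming the loss is averaged over non-ignored label positions; this is not the library's actual loss code):

```python
import torch
import torch.nn.functional as F

IGNORE_IDX = -100  # label value excluded from the loss (assumption)

def masked_ce_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Sum per-token losses over trainable positions only
    loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        ignore_index=IGNORE_IDX,
        reduction="sum",
    )
    num_trainable = (labels != IGNORE_IDX).sum()
    # If every label in the batch is IGNORE_IDX (e.g. the whole output was
    # truncated away), num_trainable == 0 and this division blows up.
    return loss / num_trainable
```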
Beyond being an inconvenient bug, this wastes compute, and patching the loss would fix a symptom rather than the root cause.
In the dataloader, should we skip rows that don't have any trainable tokens? A sketch of what that could look like is below.
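For illustration, a hypothetical dataset wrapper that drops such rows before batching. The names here (`IGNORE_IDX`, the `"labels"` key, `SkipUntrainableRows`) are assumptions for the sketch, not the library's actual API:

```python
from torch.utils.data import Dataset

IGNORE_IDX = -100  # label value excluded from the loss (assumption)

class SkipUntrainableRows(Dataset):
    """Wraps a tokenized map-style dataset, keeping only rows that
    contain at least one trainable (non-ignored) label."""

    def __init__(self, dataset: Dataset):
        self._dataset = dataset
        # Precompute indices of rows with >= 1 trainable token
        self._indices = [
            i for i in range(len(dataset))
            if any(label != IGNORE_IDX for label in dataset[i]["labels"])
        ]

    def __len__(self) -> int:
        return len(self._indices)

    def __getitem__(self, idx: int):
        return self._dataset[self._indices[idx]]
```

This filters at the dataset level rather than inside `collate`, so fully-ignored rows never consume a slot in a batch; the trade-off is one upfront pass over the data (or it could be done lazily during iteration).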