Regarding the use of compute_metrics() with HuggingFace Trainer #4220
Unanswered
anmolagarwal999
asked this question in
Q&A
Replies: 0 comments
I am using DeepSpeed ZeRO Stage 3 and am passing a custom `compute_metrics()` to the Trainer. I have 4 GPUs (devices). The `compute_metrics` function is invoked on every device. Moreover, the entire eval dataset (say there are N datapoints in the eval set) seems to be sent to `compute_metrics` on each device, which looks redundant and inefficient. Am I missing something here? My expectation was that either (1) `compute_metrics` would be called only once, or (2) the evaluation dataset would be split across the `compute_metrics` calls, so that each device sees only its own shard.
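For reference, here is a minimal sketch of a `compute_metrics` that can be used to check the behaviour described above: it prints which process called it and how many examples it received. It is only an illustration, not my actual function. It assumes a classification task where the predictions are logits of shape (N, num_classes), and it assumes the launcher sets the `RANK` environment variable for each process (falling back to 0 if it is not set).

```python
import os
from types import SimpleNamespace

import numpy as np


def compute_metrics(eval_pred):
    """Return accuracy and report which process ran this call and on how many examples."""
    # Assumption: eval_pred exposes .predictions (logits) and .label_ids,
    # as the object passed by the Trainer does.
    logits = np.asarray(eval_pred.predictions)
    labels = np.asarray(eval_pred.label_ids)
    preds = logits.argmax(axis=-1)

    # Assumption: the distributed launcher exports RANK for each process.
    rank = int(os.environ.get("RANK", "0"))
    num_examples = int(labels.shape[0])
    print(f"[rank {rank}] compute_metrics called with {num_examples} examples")

    return {
        "accuracy": float((preds == labels).mean()),
        "num_examples": num_examples,
    }


if __name__ == "__main__":
    # Stand-in for the object the Trainer passes, so the function can be tried locally.
    fake = SimpleNamespace(
        predictions=np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]),
        label_ids=np.array([1, 0, 0]),
    )
    print(compute_metrics(fake))
```

If every rank prints the same example count N, that matches what I am observing; if the eval set were sharded across calls, each rank would print roughly N/4 with 4 GPUs.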