Change set_learning_rate_tensor #3945
Closed
Summary:
The change in D71010630 breaks aps_models/examples/dlrm/tutorials:test_kernel_apf_dlrm_with_basic_training_demo, which potentially breaks the apf_dlrm bento kernel. See Full error log.
TBE has a method to set the learning rate, i.e., set_learning_rate(lr), where lr is the learning rate value to be set. D71010630 removes optimizer_args.learning_rate (a float) and introduces self.learning_rate_tensor (a tensor). Hence, setting the learning rate now means changing the value of learning_rate_tensor, which we did with the in-place operation tensor.fill_(lr).

However, this breaks the bento kernel built from APF code, which hits an error when the in-place operation tensor.fill_(lr) occurs. The workaround is to create a new tensor, avoiding the in-place operation; a sketch of the change is shown below. The change passes the test:
https://www.internalfb.com/intern/testinfra/testconsole/testrun/3659174972188704/
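The following is a minimal sketch of the change described above, not the actual TBE implementation. Only set_learning_rate and learning_rate_tensor come from the summary; the class name and everything else are simplified stand-ins for illustration.

```python
import torch


class TBEStub:
    """Illustrative stand-in for a TBE module that stores its learning rate as a tensor."""

    def __init__(self, lr: float = 0.01) -> None:
        # D71010630 replaced the float optimizer_args.learning_rate with a tensor.
        self.learning_rate_tensor = torch.tensor(lr, dtype=torch.float32)

    def set_learning_rate_inplace(self, lr: float) -> None:
        # Previous approach: mutate the existing tensor in place.
        # This is the in-place operation that reportedly breaks the apf_dlrm bento kernel.
        self.learning_rate_tensor.fill_(lr)

    def set_learning_rate(self, lr: float) -> None:
        # Workaround in this PR: build a fresh tensor instead of mutating in place.
        self.learning_rate_tensor = torch.tensor(
            lr,
            dtype=self.learning_rate_tensor.dtype,
            device=self.learning_rate_tensor.device,
        )


tbe = TBEStub(lr=0.01)
tbe.set_learning_rate(0.001)
print(tbe.learning_rate_tensor)  # tensor(0.0010)
```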
Reviewed By: sryap, nautsimon
Differential Revision: D72617537