Add profiler to full finetune recipes #1288
Conversation
Dr. CI bot: ✅ No failures as of commit 56f4adc with merge base 150a011. See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1288. (Note: links to docs will display an error until the docs builds have completed.)
recipes/full_finetune_distributed.py (outdated)

    self._profiler = self._setup_profiler(cfg.get(PROFILER_KEY, None))

    def _setup_profiler(
        self, cfg_profiler: Optional[DictConfig]

nit:

-    self, cfg_profiler: Optional[DictConfig]
+    self, cfg_profiler: Optional[DictConfig] = None
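The suggested `= None` default pairs with the `cfg.get(PROFILER_KEY, None)` call above: when a config has no `profiler` section, the setup method should fall back to a no-op profiler rather than fail. A minimal, dependency-free sketch of that pattern (the function body here is illustrative, not torchtune's actual implementation):

```python
import contextlib
from typing import Optional


def setup_profiler(cfg_profiler: Optional[dict] = None):
    """Return a context manager: a real profiler if configured, else a no-op."""
    if cfg_profiler is None or not cfg_profiler.get("enabled", False):
        # contextlib.nullcontext acts as a dummy profiler: entering and
        # exiting it does nothing, so the training loop needs no branching.
        return contextlib.nullcontext()
    # In the real recipe this branch would construct torch.profiler.profile(...)
    # from the config; omitted here to keep the sketch dependency-free.
    raise NotImplementedError("construct torch.profiler.profile from cfg")


# With no profiler section, the loop body runs unchanged under the no-op context.
with setup_profiler(None):
    total = sum(range(10))
print(total)  # prints 45
```

The training loop can then always write `with self._profiler:` regardless of whether profiling was requested.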
recipes/full_finetune_distributed.py (outdated)

    Parses the `profiler` section of top-level `cfg` and sets up profiler

    Args:
        cfg_profiler (DictConfig): `profiler` section of the top-level `cfg` (the main config passed to `recipe.main`)

nit:

-        cfg_profiler (DictConfig): `profiler` section of the top-level `cfg` (the main config passed to `recipe.main`)
+        cfg_profiler (Optional[DictConfig]): ``profiler`` section of the top-level ``cfg`` (the main config passed to ``recipe.main``). Default None.
I think this is the case in all the recipes. Sorry about our annoying linter : )
Minor nit to appease our linter, but LGTM. Thanks for getting this out!!
cc @ebsmothers @joecummings to double check whether a quick example profile would be good to see
Thanks @gau-nernst, this looks great! Re @SalmanMohammadi's comment: can you include a screenshot of one of the traces you got in the PR summary, just to explicitly show that it works via an example? Also, regarding the failing CI signals, can you check the linter one? (You can ignore the build signals; there is an unrelated breakage we are sorting out right now.) Let me know if you need any pointers on the linter and I'll be happy to provide them. After that I think this is good to go!
Fixed the linter issue. Sorry, I didn't install pre-commit before making the first commit, so it slipped through subsequent checks. I manually ran pylintdoc to double check and there are no issues now. Also added the full finetune low-memory trace to the PR description.
Looks great, thank you for adding this!
Context
What is the purpose of this PR?
Please link to any issues this PR addresses: #1279
Changelog
What are the changes made in this PR?
Add profiler to `full_finetune_distributed.py` and `full_finetune_single_device.py`. Basically copied over from the LoRA recipes.

Test plan
Please make sure to do each of the following if applicable to your PR. (If you're not sure about any one of these, just ask and we will happily help. We also have a contributing page for some guidance on contributing.)
- pre-commit hooks and linters (installed via `pre-commit install`)
- `pytest tests`
- `pytest tests -m integration_test`
Just like the LoRA recipe, to profile, run:
Then we can open the trace under `profile-test/iteration_12` with something like https://ui.perfetto.dev. It reveals that, in my setup, most of the time is spent transferring data back and forth between GPU and CPU for paged Adam.
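The files the profiler writes are Chrome-trace JSON, which is why Perfetto can open them. To make the "most time in GPU↔CPU transfer" observation concrete, here is a dependency-free sketch of totaling durations per event name in such a trace; the trace fragment below is fabricated for illustration, though `Memcpy HtoD`/`Memcpy DtoH` are the names real CUDA traces use for host-to-device and device-to-host copies:

```python
import json
from collections import defaultdict

# A tiny, fabricated chrome-trace fragment standing in for a real
# profile-test/iteration_12 trace; durations are in microseconds.
trace = {
    "traceEvents": [
        {"name": "Memcpy HtoD", "dur": 700, "ph": "X"},
        {"name": "aten::add_", "dur": 100, "ph": "X"},
        {"name": "Memcpy DtoH", "dur": 600, "ph": "X"},
    ]
}

totals = defaultdict(int)
for ev in trace["traceEvents"]:
    if ev.get("ph") == "X":  # "X" marks complete (duration) events
        totals[ev["name"]] += ev["dur"]

# Sort by total time to see where the iteration budget goes.
for name, dur in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(name, dur)
```

In this toy trace the memcpy events dominate, mirroring what the real trace showed for paged Adam's CPU offloading.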