Update E2E Tutorial w/ vLLM and HF Hub #2192
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2192.
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
docs/source/tutorials/e2e_flow.rst
Outdated
1|3|Loss: 1.943998098373413:   0%| | 3/1617 [00:21<3:04:47, 6.87s/it]

Congrats on training your model! Let's take a look at the artifacts produced by torchtune. A simple way of doing this is by running ``tree -a path/to/outputdir``, which should show something like the tree below.

There are 4 types of folders:
3 folders
docs/source/tutorials/e2e_flow.rst
Outdated
1) **recipe_state**: Holds recipe_state.pt with the information necessary to restart the last intermediate epoch. For more information, please check our deep-dive :ref:`Checkpointing in torchtune <understand_checkpointer>`;
2) **logs**: Defined in your config in metric_logger;
3) **epoch_{}**: Contains your newly trained model weights plus all original files of the model, except the checkpoints, making it easy for you to choose a specific epoch to run inference on or push to a model hub;
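The folder layout above can be put to work directly. The sketch below (an illustration only; the file contents are placeholders based on the folder names described in this tutorial) mocks up such an output directory and picks the most recent ``epoch_{}`` folder, e.g. to run inference on or push to a hub:

```python
import tempfile
from pathlib import Path

# Build a mock torchtune output directory matching the layout described above.
# recipe_state/, logs/, and epoch_{N}/ follow the tutorial text; the files are
# empty placeholders.
out = Path(tempfile.mkdtemp())
(out / "recipe_state").mkdir()
(out / "recipe_state" / "recipe_state.pt").touch()
(out / "logs").mkdir()
for n in range(3):
    (out / f"epoch_{n}").mkdir()

# Pick the highest-numbered epoch folder.
latest = max(
    (d for d in out.iterdir() if d.name.startswith("epoch_")),
    key=lambda d: int(d.name.split("_")[1]),
)
print(latest.name)  # → epoch_2
```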
I made some changes to it in the checkpointer deep-dive, based on Evan's comments. Probably worth copying/pasting it here.
docs/source/tutorials/e2e_flow.rst
Outdated
    model_type: LLAMA2

### OTHER PARAMETERS -- NOT RELATED TO THIS CHECKPOINT

# Make sure to update the tokenizer path to the right
# checkpoint directory as well
tokenizer:
  _component_: torchtune.models.llama2.llama2_tokenizer
  path: <checkpoint_dir>/tokenizer.model

# Environment
device: cuda
dtype: bf16
seed: 1234 # It is not recommended to change this seed, b/c it matches EleutherAI's default seed

# EleutherAI specific eval args
tasks: ["truthfulqa_mc2"]
limit: null
max_seq_length: 4096
batch_size: 8
enable_kv_cache: True

Now, let's run the recipe.

.. code-block:: bash

    tune run eleuther_eval --config ./custom_eval_config.yaml

# Quantization specific args
quantizer: null
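Before launching ``tune run eleuther_eval``, the EleutherAI-specific args from the config above can be sanity-checked. This is only a sketch: the dict mirrors the YAML snippet (torchtune itself parses the YAML via its config system), and `validate_eval_cfg` is a hypothetical helper, not part of torchtune.

```python
# Mirror the EleutherAI-specific eval args from custom_eval_config.yaml
# as a plain dict (a sketch; torchtune parses the real YAML itself).
eval_cfg = {
    "tasks": ["truthfulqa_mc2"],
    "limit": None,          # YAML `null` becomes Python None
    "max_seq_length": 4096,
    "batch_size": 8,
    "enable_kv_cache": True,
}

def validate_eval_cfg(cfg):
    # Hypothetical sanity checks before running the eval recipe.
    assert cfg["tasks"], "at least one task is required"
    assert cfg["limit"] is None or cfg["limit"] > 0
    assert cfg["max_seq_length"] > 0 and cfg["batch_size"] > 0
    return True

print(validate_eval_cfg(eval_cfg))  # → True
```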
Evan suggested removing all of that, but I like giving the extra context. The issue is that it's not clear exactly what we are changing. Maybe we should point it out: checkpointer.model_type, checkpoint_files, model.component, tokenizer. But I don't know, it may be too much.
Profiling disabled.
Profiler config after instantiation: {'enabled': False}
1|3|Loss: 1.943998098373413:   0%| | 3/1617 [00:21<3:04:47, 6.87s/it]
we should make it a new section:
"Inspecting your outputs" or "Understanding the model outputs" or something like that. Looking at the docs, the finetuning section is too long: https://docs-preview.pytorch.org/pytorch/torchtune/2192/tutorials/e2e_flow.html
This is a really great update to modernize this tutorial. I left some comments that you can pick and choose what you like.
We'll fine-tune using our
`single device LoRA recipe <https://github.com/pytorch/torchtune/blob/main/recipes/lora_finetune_single_device.py>`_
and use the standard settings from the
`default config <https://github.com/pytorch/torchtune/blob/main/recipes/configs/llama3_2/3B_lora_single_device.yaml>`_.
Similar to how you showed above using the CLI to look at recipes, would this be a good place to show copying a config (you could mention that you can copy a recipe too) and show part of the config?
.. TODO (SalmanMohammadi) ref eval recipe docs

torchtune integrates with
`EleutherAI's evaluation harness <https://github.com/EleutherAI/lm-evaluation-harness>`_.
An example of this is available through the
`eleuther_eval <https://github.com/pytorch/torchtune/blob/main/recipes/eleuther_eval.py>`_ recipe. In this tutorial, we're going to directly use this recipe by
modifying its associated config ``eleuther_evaluation.yaml``.
This part of the tutorial would be smoother if we added model specific configs like we do for the vision model
It's inflight... see #2186
Do you want to land it?
docs/source/tutorials/e2e_flow.rst
Outdated
cd gpt-fast/
from transformers import AutoModelForCausalLM, AutoTokenizer
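The quoted line loads the fine-tuned weights via the Hugging Face ``AutoModelForCausalLM`` API, which expects a directory containing a ``config.json`` plus the weight files. The sketch below (an illustration only; the epoch-folder layout is an assumption based on the ``epoch_{}`` description earlier in this tutorial, and `looks_like_hf_checkpoint` is a hypothetical helper) mocks such a folder and checks it before handing the path to transformers:

```python
import json
import tempfile
from pathlib import Path

# Mock an epoch_1 output folder holding a minimal config.json. Real epoch
# folders would also contain the weight files.
ckpt_root = Path(tempfile.mkdtemp())
epoch_dir = ckpt_root / "epoch_1"
epoch_dir.mkdir()
(epoch_dir / "config.json").write_text(json.dumps({"model_type": "llama"}))

def looks_like_hf_checkpoint(path: Path) -> bool:
    # AutoModelForCausalLM.from_pretrained needs at least a config.json here.
    return (path / "config.json").is_file()

print(looks_like_hf_checkpoint(epoch_dir))  # → True
# With transformers installed, you could then do (not executed here):
#   model = AutoModelForCausalLM.from_pretrained(str(epoch_dir))
```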
Maybe add a note that for multimodal models the Hugging Face AutoModel doesn't work, and you need to check the model page on Hugging Face instead.
Speeding up Generation using Quantization
-----------------------------------------

Introduce some quantization
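To motivate this section, here is a toy sketch of the idea behind quantization: mapping float weights to 8-bit integers with a shared scale, trading a small round-trip error for memory and bandwidth savings. This is only an illustration; torchtune's actual quantize recipe uses torchao quantizers, not this code.

```python
# Minimal symmetric int8 quantization round-trip (toy example, pure Python).
weights = [0.8, -0.5, 0.33, -0.9, 0.1]

scale = max(abs(w) for w in weights) / 127       # shared symmetric scale
q = [round(w / scale) for w in weights]          # int8 codes in [-127, 127]
deq = [code * scale for code in q]               # dequantized approximation

# Each weight is recovered to within one scale step.
max_err = max(abs(w - d) for w, d in zip(weights, deq))
print(all(-127 <= code <= 127 for code in q), max_err < scale)  # → True True
```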
Our quantize recipe only outputs one file, which is a .pt. I am tempted to say that we should remove this from here until we have better support for quantization, e.g. outputting safetensors and multiple checkpoints. Any thoughts?
Co-authored-by: salman <[email protected]>
Co-authored-by: Felipe Mello <[email protected]> Co-authored-by: salman <[email protected]>
Context
What is the purpose of this PR? Is it to
Please link to any issues this PR addresses.
Changelog
What are the changes made in this PR?
*
Test plan
Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.
pre-commit install
pytest tests
pytest tests -m integration_test
UX
If your function changed a public API, please add a dummy example of what the user experience will look like when calling it.
Here is a docstring example
and a tutorial example