
Update E2E Tutorial w/ vLLM and HF Hub #2192


Merged: 8 commits merged into pytorch:main on Dec 20, 2024

Conversation

joecummings (Contributor) commented Dec 20, 2024

Context

What is the purpose of this PR? Is it to

  • add a new feature
  • fix a bug
  • update tests and/or documentation
  • other (please add here)

Please link to any issues this PR addresses.

Changelog

What are the changes made in this PR?

Test plan

Please make sure to do each of the following if applicable to your PR. If you're unsure about any of these, just ask and we will happily help. We also have a contributing page for some guidance on contributing.

  • run pre-commit hooks and linters (make sure you've first installed via pre-commit install)
  • add unit tests for any new functionality
  • update docstrings for any new or updated methods or classes
  • run unit tests via pytest tests
  • run recipe tests via pytest tests -m integration_test
  • manually run any new or modified recipes with sufficient proof of correctness
  • include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)

UX

If your function changed a public API, please add a dummy example of what the user experience will look like when calling it.
Here is a docstring example
and a tutorial example

  • I did not change any public API
  • I have added an example to docs or docstrings

pytorch-bot (bot) commented Dec 20, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2192

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 20, 2024
1|3|Loss: 1.943998098373413: 0%| | 3/1617 [00:21<3:04:47, 6.87s/it]

Congrats on training your model! Let's take a look at the artifacts produced by torchtune. A simple way of doing this is by running ``tree -a path/to/outputdir``, which should show something like the tree below.
There are 4 types of folders:
Contributor:

3 folders

Comment on lines 132 to 134
1) **recipe_state**: Holds recipe_state.pt with the information necessary to restart the last intermediate epoch. For more information, please check our deep-dive :ref:`Checkpointing in torchtune <understand_checkpointer>`;
2) **logs**: Defined in your config in metric_logger;
3) **epoch_{}**: Contains your new trained model weights plus all original files of the model, except the checkpoints, making it easy for you to choose a specific epoch to run inference on or push to a model hub;
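Since the ``epoch_{}`` folders are numbered, picking the one to evaluate or push can be scripted. Below is a minimal sketch; the ``latest_epoch_dir`` helper and the directory layout assumptions are ours, not part of torchtune:

```python
import re
from pathlib import Path


def latest_epoch_dir(output_dir: str) -> Path:
    """Return the highest-numbered epoch_{N} folder in a torchtune output dir."""
    epoch_dirs = [
        p for p in Path(output_dir).iterdir()
        if p.is_dir() and re.fullmatch(r"epoch_\d+", p.name)
    ]
    if not epoch_dirs:
        raise FileNotFoundError(f"no epoch_* folders under {output_dir}")
    # Sort numerically on the suffix, not lexicographically
    return max(epoch_dirs, key=lambda p: int(p.name.split("_")[1]))
```

Numeric sorting matters here: a plain lexicographic sort would put ``epoch_10`` before ``epoch_2``.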
Contributor:
I made some changes to it in the checkpointer deep-dive, based on Evan's comments. Probably worth copying/pasting it here.

Contributor:

Skip one extra line after 3).

@joecummings joecummings changed the title updates Update E2E Tutorial w/ vLLM and HF Hub Dec 20, 2024
Comment on lines 213 to 245
model_type: LLAMA2
### OTHER PARAMETERS -- NOT RELATED TO THIS CHECKPOINT

# Make sure to update the tokenizer path to the right
# checkpoint directory as well
tokenizer:
  _component_: torchtune.models.llama2.llama2_tokenizer
  path: <checkpoint_dir>/tokenizer.model

# Environment
device: cuda
dtype: bf16
seed: 1234 # It is not recommended to change this seed, b/c it matches EleutherAI's default seed

# EleutherAI specific eval args
tasks: ["truthfulqa_mc2"]
limit: null
max_seq_length: 4096
batch_size: 8
enable_kv_cache: True

# Quantization specific args
quantizer: null

Now, let's run the recipe.

.. code-block:: bash

    tune run eleuther_eval --config ./custom_eval_config.yaml
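Since torchtune configs are plain YAML, a modified ``custom_eval_config.yaml`` can be sanity-checked before running the recipe. A small sketch using PyYAML; the inline config string simply mirrors the eval-specific fields above:

```python
import yaml  # PyYAML

# Inline copy of the eval-specific fields shown above, for illustration only
cfg_text = """
tasks: ["truthfulqa_mc2"]
limit: null
max_seq_length: 4096
batch_size: 8
enable_kv_cache: True
seed: 1234
"""

cfg = yaml.safe_load(cfg_text)
# YAML null parses to Python None, and the task list parses as a real list
print(cfg["tasks"], cfg["limit"], cfg["seed"])
```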
felipemello1 (Contributor) commented Dec 20, 2024

Evan suggested removing all of that, but I like giving the extra context. The issue is that it is not clear exactly what we are changing. Maybe we should point it out: checkpointer.model_type, checkpoint_files, model._component_, tokenizer. But I don't know, it may be too much.

Profiling disabled.
Profiler config after instantiation: {'enabled': False}
1|3|Loss: 1.943998098373413: 0%| | 3/1617 [00:21<3:04:47, 6.87s/it]

Contributor:

We should make this a new section, e.g. "Inspecting your outputs" or "Understanding the model outputs". Looking at the docs, the finetuning section is too long: https://docs-preview.pytorch.org/pytorch/torchtune/2192/tutorials/e2e_flow.html

pbontrager (Contributor) left a comment:

This is a really great update to modernize this tutorial. I left some comments that you can pick and choose what you like.

We'll fine-tune using our
`single device LoRA recipe <https://github.com/pytorch/torchtune/blob/main/recipes/lora_finetune_single_device.py>`_
and use the standard settings from the
`default config <https://github.com/pytorch/torchtune/blob/main/recipes/configs/llama3_2/3B_lora_single_device.yaml>`_.
Contributor:

Similar to how you showed above how to use the CLI to look at recipes, would this be a good place to show copying a config (you can mention that a recipe can be copied too) and show part of the config?


.. TODO (SalmanMohammadi) ref eval recipe docs

torchtune integrates with
`EleutherAI's evaluation harness <https://github.com/EleutherAI/lm-evaluation-harness>`_.
An example of this is available through the
`eleuther_eval <https://github.com/pytorch/torchtune/blob/main/recipes/eleuther_eval.py>`_ recipe. In this tutorial, we're going to directly use this recipe by
modifying its associated config ``eleuther_evaluation.yaml``.
Contributor:

This part of the tutorial would be smoother if we added model-specific configs like we do for the vision model.

Contributor (Author):

It's in flight... see #2186

Do you want to land it?


cd gpt-fast/
from transformers import AutoModelForCausalLM, AutoTokenizer
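For reference, loading a fine-tuned ``epoch_{}`` folder with the Hugging Face auto classes might look like the sketch below. The ``load_finetuned`` helper and the example path are hypothetical; the pattern works because each epoch folder keeps the model's original config and tokenizer files next to the new weights:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def load_finetuned(epoch_dir: str):
    """Load a torchtune epoch_{N} output folder via the Hugging Face auto classes."""
    tokenizer = AutoTokenizer.from_pretrained(epoch_dir)
    model = AutoModelForCausalLM.from_pretrained(epoch_dir)
    return model, tokenizer


# e.g. model, tokenizer = load_finetuned("path/to/outputdir/epoch_0")
```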
Contributor:

Maybe add a note that for multimodal models, the Hugging Face AutoModel classes don't work and you need to check out the model page on Hugging Face.


Speeding up Generation using Quantization
-----------------------------------------
Introduce some quantization
Contributor:

Our quantize recipe only outputs a single .pt file. I am tempted to say that we should remove this section until we have better support for quantization, e.g. outputting safetensors and multiple checkpoints. Any thoughts?

@felipemello1 felipemello1 mentioned this pull request Dec 20, 2024
4 tasks
@felipemello1 felipemello1 merged commit 0cd8bc4 into pytorch:main Dec 20, 2024
3 checks passed
felipemello1 pushed a commit that referenced this pull request Dec 20, 2024
Co-authored-by: Felipe Mello <[email protected]>
Co-authored-by: salman <[email protected]>
mori360 pushed a commit to mori360/torchtune that referenced this pull request Dec 20, 2024
rahul-sarvam pushed a commit to sarvamai/torchtune that referenced this pull request Dec 23, 2024
@RdoubleA RdoubleA mentioned this pull request Jan 21, 2025
6 participants