Add example scripts to show how to run the model #108

ashwinb · 2024-08-14T16:35:06Z

Folks really want the llama-models repository to be self-contained. That is, they want to be able to simply run the models without needing other dependencies like llama-toolchain. See #82 for a discussion.

This PR adapts the example_*_completion.py scripts from the meta-llama/llama3/ repository so it works for llama3.1 models with the updated types.

Note that in order to run these scripts, you need to install additional dependencies not specified in requirements.txt: These are:

torch
fairscale
fire
blobfile (tiktoken's dependency which it does not specify :/)

raghotham · 2024-08-14T16:38:13Z

models/llama3_1/reference_impl/generation.py

+        ):
+            max_gen_len = self.model.params.max_seq_len - 1
+
+        prompt_tokens = self.tokenizer.encode(x, bos=True, eos=False)


Suggested change

prompt_tokens = self.tokenizer.encode(x, bos=True, eos=False)

prompt_tokens = self.tokenizer.encode(prompt, bos=True, eos=False)

ashwinb · 2024-08-14T17:50:55Z

models/llama3_1/README.md

@@ -34,7 +34,36 @@ Pre-requisites: Ensure you have `wget` installed. Then run the script: `./downlo

 Remember that the links expire after 24 hours and a certain amount of downloads. You can always re-request a link if you start seeing errors such as `403: Forbidden`.

-### Access to Hugging Face
+## Running the models


I added this blurb here to the README. cc @karpathy

zewpo · 2024-08-15T03:58:30Z

The example is on the right track. Great effort. Thanks.

Still it assumes some prior experience of how these llama models are used. I'd like the example to be a bit more explicit for someone with no experience with llama models. A hello world that will just work by following the instructions. After I get it working, then I can start to break it down and see what is going on inside. I think it only needs a couple small tweaks.

Some explanation, with working example, as to what to use for <CHECKPOINT_DIR> and <TOKENIZER_PATH>

As it is now, it doesn't actually provide explicit example of how to run the downloaded files! How do we actually reference the files that were just downloaded?

models/llama3_1/Meta-Llama-3.1-8B/consolidated.00.pth
models/llama3_1/Meta-Llama-3.1-8B/params.json
models/llama3_1/Meta-Llama-3.1-8B/tokenizer.model

I'm also curious that the "tokenizer.model" file is also in the api folder, as well as with the model. And was thinking that since the example is based on the files in the api folder, it seems like we might not need to reference a tokenizer model explicitly? (I'm assuming that the <TOKENIZER_PATH> needs to refer to this tokenizer.model file.) Is this correct, and/or needed for a simple hello world example?

whatdhack · 2024-08-19T22:08:45Z

Do not see this PR in main anymore .

ashwinb · 2024-08-21T11:43:45Z

@whatdhack The content just changed location? It is in models/scripts/ instead of in the llama3_1 sub-folder. The top-level README has also been updated. See https://github.com/meta-llama/llama-models?tab=readme-ov-file#running-the-models

Add example scripts to show how to run the model

c641000

ashwinb requested review from raghotham, hardikjshah and dltn August 14, 2024 16:35

facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 14, 2024

raghotham reviewed Aug 14, 2024

View reviewed changes

ashwinb added 2 commits August 14, 2024 09:39

text_completion bugfix

f240dd3

Add a scripts/ folder, update README

13a16f0

ashwinb commented Aug 14, 2024

View reviewed changes

ashwinb linked an issue Aug 14, 2024 that may be closed by this pull request

How to run the model? #82

Closed

HamidShojanazeri mentioned this pull request Aug 14, 2024

How to run the model? #82

Closed

add model_parallel_size option

c70ddc7

raghotham approved these changes Aug 14, 2024

View reviewed changes

Update README with a pointer to llama-stack

5134866

ashwinb merged commit 82e417d into main Aug 14, 2024
3 checks passed

wukaixingxp mentioned this pull request Aug 14, 2024

Is it possible to run Llama 3.1 without HF libraries ? meta-llama/llama-cookbook#625

Closed

2 tasks

ashwinb deleted the run_scripts branch February 1, 2025 18:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add example scripts to show how to run the model #108

Add example scripts to show how to run the model #108

Uh oh!

ashwinb commented Aug 14, 2024 •

edited

Loading

Uh oh!

raghotham Aug 14, 2024

Uh oh!

ashwinb Aug 14, 2024

Uh oh!

Uh oh!

zewpo commented Aug 15, 2024

Uh oh!

whatdhack commented Aug 19, 2024

Uh oh!

ashwinb commented Aug 21, 2024

Uh oh!

Uh oh!

	prompt_tokens = self.tokenizer.encode(x, bos=True, eos=False)
	prompt_tokens = self.tokenizer.encode(prompt, bos=True, eos=False)

Add example scripts to show how to run the model #108

Add example scripts to show how to run the model #108

Uh oh!

Conversation

ashwinb commented Aug 14, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

raghotham Aug 14, 2024

Choose a reason for hiding this comment

Uh oh!

ashwinb Aug 14, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

zewpo commented Aug 15, 2024

Uh oh!

whatdhack commented Aug 19, 2024

Uh oh!

ashwinb commented Aug 21, 2024

Uh oh!

Uh oh!

ashwinb commented Aug 14, 2024 •

edited

Loading