
Add proper instructions for using Alpaca models #382

Description

ggerganov (Member)

So I am looking at https://github.com/antimatter15/alpaca.cpp and I see they are already running 30B Alpaca models, while we are struggling to run 7B due to the recent tokenizer updates.

I also see that the models are now even floating around on Hugging Face - I guess license issues are no longer a problem?

We should add detailed instructions for obtaining the Alpaca models, plus a temporary explanation of how to use the following script to make the models compatible with the latest master:

#324 (comment)

The bigger issue is that people keep producing the old version of the ggml models instead of migrating to the latest llama.cpp changes, which is why we now need this extra conversion step. It's best to figure out the steps for generating the Alpaca models and generate them in the correct format from the start.

Edit: just don't post direct links to the models!

Activity

madmads11 commented on Mar 22, 2023

Here is what I did to run Alpaca 30B on my system with llama.cpp (a condensed shell version of these steps follows the list). I would assume it would work with Alpaca 13B as well.

  1. Downloaded and built the latest llama.cpp from scratch, since the new --n_parts 1 parameter (needed to tell it the model is in a single file) is only available in recent versions
  2. Downloaded this 30b alpaca model https://huggingface.co/Pi3141/alpaca-30B-ggml/tree/main (If you check the model card, you can find links to other alpaca model sizes)
  3. Named the file ggml-alpaca-30b-q4.bin and placed it in /models/Alpaca/30b inside llama.cpp
  4. Downloaded the conversion script mentioned here: #324 (comment)
  5. Named it convert.py and placed it in the root folder of llama.cpp.
  6. Downloaded the tokenizer mentioned here: #324 (comment)
  7. Placed the tokenizer.model file in /models
  8. Ran python convert.py models/Alpaca/30b models/tokenizer.model from the base folder of llama.cpp. (Personally I got a message that I needed the sentencepiece module, so I ran pip install sentencepiece and then re-ran python convert.py models/Alpaca/30b models/tokenizer.model and it worked. You may or may not encounter this error.)
  9. In the 30b folder there are now a ggml-alpaca-30b-q4.bin and a ggml-alpaca-30b-q4.bin.tmp file. I renamed ggml-alpaca-30b-q4.bin to ggml-alpaca-30b-q4.bin.old to keep it as a backup, and ggml-alpaca-30b-q4.bin.tmp to ggml-alpaca-30b-q4.bin.
  10. Now I can run llama.cpp with ./main -m ./models/Alpaca/30b/ggml-alpaca-30b-q4.bin --color -f ./prompts/alpaca.txt -ins --n_parts 1.
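
Here is the same sequence condensed into a shell sketch. It is untested and assumes the paths and file names from the steps above; the model (step 2), convert.py (step 4), and tokenizer.model (step 6) downloads are still done manually.

# Step 1: build the latest llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Steps 2-7: place the manual downloads where the steps above expect them:
#   models/Alpaca/30b/ggml-alpaca-30b-q4.bin   (the Alpaca model)
#   convert.py                                 (repo root)
#   models/tokenizer.model                     (the tokenizer)
mkdir -p models/Alpaca/30b

# Step 8: convert the model to the new ggml format
pip install sentencepiece
python convert.py models/Alpaca/30b models/tokenizer.model

# Step 9: keep the old file as a backup, promote the converted one
mv models/Alpaca/30b/ggml-alpaca-30b-q4.bin models/Alpaca/30b/ggml-alpaca-30b-q4.bin.old
mv models/Alpaca/30b/ggml-alpaca-30b-q4.bin.tmp models/Alpaca/30b/ggml-alpaca-30b-q4.bin

# Step 10: run it
./main -m ./models/Alpaca/30b/ggml-alpaca-30b-q4.bin --color -f ./prompts/alpaca.txt -ins --n_parts 1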

Maybe this can be of temporary help to anybody else eager to set it up. Please correct me if I've made any mistakes; I wrote this retroactively from memory.

Puncia commented on Mar 22, 2023

Can confirm the above works for the 13B model too.

lolxdmainkaisemaanlu commented on Mar 22, 2023

The above instructions work for me too for the 13B model! Thank you!

Green-Sky (Collaborator) commented on Mar 22, 2023

Checksum for the converted (ggmf v1) Pi3141 alpaca-30B-ggml:

$ sha256sum ggml-model-q4_0.bin
969652d32ce186ca3c93217ece8311ebe81f15939aa66a6fe162a08dd893faf8  ggml-model-q4_0.bin
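
To verify a local copy against this hash, one option is to pipe it into sha256sum -c (assuming the same file name):

$ echo "969652d32ce186ca3c93217ece8311ebe81f15939aa66a6fe162a08dd893faf8  ggml-model-q4_0.bin" | sha256sum -c
ggml-model-q4_0.bin: OK
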
anzz1 (Contributor) commented on Mar 23, 2023

all of them (7B/13B/30B/65B*)
4-bit quantized q4_0 (RTN) and GPTQ
new tokenizer format
*no alpaca-65B though, as it would take a very long time
does not include batteries

https://btcache.me/torrent/E5322AB4676E24632A907FD9846234BB40265C4F
https://torrage.info/torrent.php?h=e5322ab4676e24632a907fd9846234bb40265c4f

single command option:

aria2c --summary-interval=0 --bt-max-peers=0 http://taco.cab/ggml/ggml-q4.torrent

as usual, the alpaca and gptq models need the --n_parts 1 option
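
for example, a run could look like this (model path is illustrative, not taken from the torrent listing):

./main -m ./models/alpaca-13B-ggml/ggml-model-q4_0.bin -ins --n_parts 1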

[screenshot: alpaca-7B]

hope that helps 👍

Green-Sky (Collaborator) commented on Mar 23, 2023

@anzz1 you did not specify which models your links are for. Also, please provide checksums :)

Green-Sky (Collaborator) commented on Mar 23, 2023

me: i should try and debug all those crashes
me: > help me write a song about llama.cpp (c++ api for facebooks llm)
llama.cpp:

A llama is an animal that's so strange,
It can do things we only imagine.
LLamaCPP is the code that gives it its brawn,
Allowing us to use it like a clown.

The api has commands we can use,
To take advantage of this llama abuse.
It's an interface that let's us be boss,
If you know the right way to make your call.

(the 30B Alpaca LoRA finetune by Pi3141)

anzz1 (Contributor) commented on Mar 23, 2023

I linked the checksums here: #374 (comment)

(24 remaining comments not shown)

