Description
So I am looking at https://github.com/antimatter15/alpaca.cpp and I see they are already running 30B Alpaca models, while we are struggling to run 7B due to the recent tokenizer updates.
I also see that the models are now even floating around on Hugging Face - I guess license issues are no longer a problem?
We should add detailed instructions for obtaining the Alpaca models, along with a temporary explanation of how to use the following script to make the models compatible with the latest master:
The bigger issue is that people keep producing the old version of the ggml models instead of migrating to the latest llama.cpp changes, and therefore we now need this extra conversion step. It's best to figure out the steps for generating the Alpaca models and generate them in the correct format.
Edit: just don't post direct links to the models!
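For reference, the old and new formats can be told apart by the file's leading magic bytes. A minimal sketch, assuming the magic values used by llama.cpp at the time (unversioned `ggml` vs. the versioned `ggmf` v1 format mentioned further down in this thread):

```sh
# Inspect the first four bytes of a model file (path assumed from the steps below)
head -c 4 models/Alpaca/30b/ggml-alpaca-30b-q4.bin | xxd
# 6c6d 6767  ("lmgg", little-endian 'ggml') -> old unversioned file, needs conversion
# 666d 6767  ("fmgg", little-endian 'ggmf') -> already in the new ggmf v1 format
```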
Activity
madmads11 commented on Mar 22, 2023
Here is what I did to run Alpaca 30b on my system with llama.cpp. I would assume it would work with Alpaca 13b as well.
1. I downloaded the 30b Alpaca model (the one that needs the `--n_parts 1` parameter), named it `ggml-alpaca-30b-q4.bin`, and placed it in `/models/Alpaca/30b` inside llama.cpp.
2. I ran `python convert.py models/Alpaca/30b models/tokenizer.model` in the command prompt from the base folder of llama.cpp. (Personally I got a message that I needed the module `sentencepiece`, so I ran `pip install sentencepiece`, re-ran `python convert.py models/Alpaca/30b models/tokenizer.model`, and it worked. You may or may not encounter this error.)
3. Once the conversion finished, there was a `ggml-alpaca-30b-q4.bin` and a `ggml-alpaca-30b-q4.bin.tmp` file. I renamed `ggml-alpaca-30b-q4.bin` to `ggml-alpaca-30b-q4.bin.old` to keep it as a backup, and `ggml-alpaca-30b-q4.bin.tmp` to `ggml-alpaca-30b-q4.bin`.
4. I ran the model with `./main -m ./models/alpaca/30b/ggml-alpaca-30b-q4.bin --color -f ./prompts/alpaca.txt -ins --n_parts 1`.

Maybe this can be of temporary help to anybody else eager to set it up. Please correct me if I've made any mistakes; I wrote it retroactively from memory.
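For convenience, here is the same sequence as a single shell session, under the assumptions of the steps above (the `convert.py` name and `models/Alpaca/30b` layout come from those steps; `~/Downloads` is just a placeholder for wherever the model was obtained, which is deliberately left out here):

```sh
# 1. Place the downloaded model inside llama.cpp (download source omitted on purpose)
mkdir -p models/Alpaca/30b
mv ~/Downloads/ggml-alpaca-30b-q4.bin models/Alpaca/30b/

# 2. Convert to the new format (install sentencepiece first if the script complains)
pip install sentencepiece
python convert.py models/Alpaca/30b models/tokenizer.model

# 3. Keep the old file as a backup and promote the converted .tmp file
mv models/Alpaca/30b/ggml-alpaca-30b-q4.bin models/Alpaca/30b/ggml-alpaca-30b-q4.bin.old
mv models/Alpaca/30b/ggml-alpaca-30b-q4.bin.tmp models/Alpaca/30b/ggml-alpaca-30b-q4.bin

# 4. Run; Alpaca models need --n_parts 1
./main -m ./models/Alpaca/30b/ggml-alpaca-30b-q4.bin --color -f ./prompts/alpaca.txt -ins --n_parts 1
```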
Puncia commented on Mar 22, 2023
Can confirm the above works for the 13B model too.
lolxdmainkaisemaanlu commented on Mar 22, 2023
The above instructions work for me too for the 13B model! Thank you!
Green-Sky commented on Mar 22, 2023
Checksum for the converted (ggmf v1) Pi3141 alpaca-30B-ggml
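To verify a downloaded or converted file against posted checksums, a standard tool such as `sha256sum` works (file path assumed from the steps above):

```sh
# Compute a SHA-256 digest of the converted model for comparison against posted checksums
sha256sum models/Alpaca/30b/ggml-alpaca-30b-q4.bin
```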
anzz1 commented on Mar 23, 2023
- all of them (7B/13B/30B/65B*)
- 4-bit quantized: q4_0 (RTN) and GPTQ
- new tokenizer format

*no alpaca-65b tho, as it would take a very long time

does not include batteries
https://btcache.me/torrent/E5322AB4676E24632A907FD9846234BB40265C4F
https://torrage.info/torrent.php?h=e5322ab4676e24632a907fd9846234bb40265c4f
single command option:
aria2c --summary-interval=0 --bt-max-peers=0 http://taco.cab/ggml/ggml-q4.torrent
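The info hash in the links above can also be used directly as a magnet URI, constructed mechanically from the hash (no tracker list included, so DHT peer discovery is assumed):

```sh
# Same torrent via a magnet link derived from the info hash above
aria2c --summary-interval=0 --bt-max-peers=0 "magnet:?xt=urn:btih:e5322ab4676e24632a907fd9846234bb40265c4f"
```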
as usual, the alpaca and gptq models (e.g. `alpaca-7B`) need the `--n_parts 1` option
hope that helps 👍
Green-Sky commented on Mar 23, 2023
@anzz1 you did not specify which models your links are for. also, please provide checksums :)
Green-Sky commented on Mar 23, 2023
me: i should try and debug all those crashes
me:
> help me write a song about llama.cpp (c++ api for facebooks llm)
llama.cpp:
(the 30B alpaca lora finetune by pi)
anzz1 commented on Mar 23, 2023
i linked the checksums here #374 (comment)