README.md (8 additions, 9 deletions)

@@ -1,6 +1,6 @@
# Text generation web UI
-A gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, GPT-Neo, and Pygmalion.
+A gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, LLaMA, and Pygmalion.
Its goal is to become the [AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) of text generation.
@@ -27,6 +27,7 @@ Its goal is to become the [AUTOMATIC1111/stable-diffusion-webui](https://github.
* Get responses via API, [with](https://github.com/oobabooga/text-generation-webui/blob/main/api-example-streaming.py) or [without](https://github.com/oobabooga/text-generation-webui/blob/main/api-example.py) streaming.
+* [Supports the LLaMA model, including 4-bit mode](https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model).
* [Supports the RWKV model](https://github.com/oobabooga/text-generation-webui/wiki/RWKV-model).
@@ -137,6 +138,8 @@ Optionally, you can use the following command-line flags:
| `--cai-chat` | Launch the web UI in chat mode with a style similar to Character.AI's. If the file `img_bot.png` or `img_bot.jpg` exists in the same folder as `server.py`, this image will be used as the bot's profile picture. Similarly, `img_me.png` or `img_me.jpg` will be used as your profile picture. |
| `--cpu` | Use the CPU to generate text. |
| `--load-in-8bit` | Load the model with 8-bit precision. |
+| `--load-in-4bit` | Load the model with 4-bit precision. Currently only works with LLaMA. |
+| `--gptq-bits GPTQ_BITS` | Load a pre-quantized model with the specified precision in bits. 2, 3, 4, and 8 are supported. Currently only works with LLaMA. |
| `--bf16` | Load the model with bfloat16 precision. Requires an NVIDIA Ampere GPU. |
| `--auto-devices` | Automatically split the model across the available GPU(s) and CPU. |
| `--disk` | If the model is too large for your GPU(s) and CPU combined, send the remaining layers to the disk. |
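As a usage sketch, the two new flags would be combined with a normal launch command roughly like this (the model directory names `llama-7b` and `llama-7b-4bit` are placeholders, not names shipped with the project):

```shell
# Load a LLaMA model in 4-bit precision (flag added in this change)
python server.py --model llama-7b --load-in-4bit

# Load a pre-quantized GPTQ checkpoint at 4-bit precision
python server.py --model llama-7b-4bit --gptq-bits 4
```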
@@ -176,14 +179,10 @@ Check the [wiki](https://github.com/oobabooga/text-generation-webui/wiki/System-
Pull requests, suggestions, and issue reports are welcome.

-Before reporting a bug, make sure that you have created a conda environment and installed the dependencies exactly as in the *Installation* section above.
+Before reporting a bug, make sure that you have:

-These issues are known:
-
-* 8-bit doesn't work properly on Windows or older GPUs.
-* DeepSpeed doesn't work properly on Windows.
-
-For these two, please try commenting on an existing issue instead of creating a new one.
+1. Created a conda environment and installed the dependencies exactly as in the *Installation* section above.
+2. [Searched](https://github.com/oobabooga/text-generation-webui/issues) to see if an issue already exists for the problem you encountered.