* Get responses via API, [with](https://github.com/oobabooga/text-generation-webui/blob/main/api-example-streaming.py) or [without](https://github.com/oobabooga/text-generation-webui/blob/main/api-example.py) streaming (see the sketch after this list).
-* [Supports the LLaMA model](https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model).
+* [Supports the LLaMA model, including 4-bit mode](https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model).
* [Supports the RWKV model](https://github.com/oobabooga/text-generation-webui/wiki/RWKV-model).
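To illustrate the API item above: a minimal non-streaming request can be sent with plain HTTP. The sketch below is not the project's official client — the `/run/textgen` endpoint path and the `data` payload shape are assumptions, so treat the linked api-example.py and api-example-streaming.py as the authoritative request formats.

```python
# Minimal sketch of a non-streaming API call to a running web UI instance.
# Assumptions (check api-example.py for the real request format):
#   - the server is the Gradio app listening on http://127.0.0.1:7860
#   - generation is exposed at the hypothetical endpoint /run/textgen
#   - the prompt travels as the first element of Gradio's "data" list
import requests

SERVER = "http://127.0.0.1:7860"

def generate(prompt: str) -> str:
    response = requests.post(f"{SERVER}/run/textgen", json={"data": [prompt]})
    response.raise_for_status()
    # Gradio wraps the outputs in a "data" list as well.
    return response.json()["data"][0]

if __name__ == "__main__":
    print(generate("Write a short poem about the sea."))
```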
See also: [Installation instructions for human beings](https://github.com/oobabooga/text-generation-webui/wiki/Installation-instructions-for-human-beings).
Just download the zip above, extract it, and double-click on "install". The web UI and all its dependencies will be installed in the same folder.
@@ -139,7 +141,7 @@ Optionally, you can use the following command-line flags:
|`--cpu`| Use the CPU to generate text.|
|`--load-in-8bit`| Load the model with 8-bit precision.|
|`--load-in-4bit`| Load the model with 4-bit precision. Currently only works with LLaMA.|
-|`--gptq-bits`| Load a pre-quantized model with specified precision. 2, 3, 4 and 8bit are supported. Currently only works with LLaMA. |
+|`--gptq-bits GPTQ_BITS`| Load a pre-quantized model with specified precision. 2, 3, 4 and 8 (bit) are supported. Currently only works with LLaMA. |
|`--bf16`| Load the model with bfloat16 precision. Requires NVIDIA Ampere GPU. |
|`--auto-devices`| Automatically split the model across the available GPU(s) and CPU.|
|`--disk`| If the model is too large for your GPU(s) and CPU combined, send the remaining layers to the disk. |
@@ -155,12 +157,13 @@ Optionally, you can use the following command-line flags:
|`--local_rank LOCAL_RANK`| DeepSpeed: Optional argument for distributed setups. |
|`--rwkv-strategy RWKV_STRATEGY`| RWKV: The strategy to use while loading the model. Examples: "cpu fp32", "cuda fp16", "cuda fp16i8". |
|`--rwkv-cuda-on`| RWKV: Compile the CUDA kernel for better performance. |
-|`--no-stream`| Don't stream the text output in real time. This improves the text generation performance.|
+|`--no-stream`| Don't stream the text output in real time. |
|`--settings SETTINGS_FILE`| Load the default interface settings from this json file. See `settings-template.json` for an example. If you create a file called `settings.json`, this file will be loaded by default without the need to use the `--settings` flag.|
|`--extensions EXTENSIONS [EXTENSIONS ...]`| The list of extensions to load. If you want to load more than one extension, write the names separated by spaces. |
|`--listen`| Make the web UI reachable from your local network.|
|`--listen-port LISTEN_PORT`| The listening port that the server will use. |
|`--share`| Create a public URL. This is useful for running the web UI on Google Colab or similar. |
+|`--auto-launch`| Open the web UI in the default browser upon launch. |
|`--verbose`| Print the prompts to the terminal. |
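For example, several of the flags above can be combined (a hypothetical invocation, assuming `server.py` is the entry point): `python server.py --auto-devices --gptq-bits 4 --listen` loads a pre-quantized 4-bit LLaMA model, splits it across the available GPU(s) and CPU, and makes the web UI reachable from your local network.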
Out of memory errors? [Check this guide](https://github.com/oobabooga/text-generation-webui/wiki/Low-VRAM-guide).
@@ -179,14 +182,10 @@ Check the [wiki](https://github.com/oobabooga/text-generation-webui/wiki/System-
Pull requests, suggestions, and issue reports are welcome.

-Before reporting a bug, make sure that you have created a conda environment and installed the dependencies exactly as in the *Installation* section above.
-
-These issues are known:
-
-* 8-bit doesn't work properly on Windows or older GPUs.
-* DeepSpeed doesn't work properly on Windows.
+Before reporting a bug, make sure that you have:

-For these two, please try commenting on an existing issue instead of creating a new one.
+1. Created a conda environment and installed the dependencies exactly as in the *Installation* section above.
+2. [Searched](https://github.com/oobabooga/text-generation-webui/issues) to see if an issue already exists for the problem you encountered.