[Bugfix] Fix GLM4 model #16618

Merged: 6 commits into vllm-project:main on Apr 17, 2025

Conversation

intervitens
Contributor

@intervitens intervitens commented Apr 14, 2025

FIX #16617
FIX #16655
FIX #16687
FIX #16740
Currently, the GLM4 model does not work: it fails to load at all.
This PR enables the model to load and makes its outputs mostly identical to those from HF transformers.
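
For reference, a rough way to sanity-check the "mostly identical to HF transformers" claim is to greedy-decode the same prompt with both backends and compare. This is only a sketch; the model id, prompt, and generation settings below are placeholders, not part of the PR.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "THUDM/GLM-4-9B-0414"  # placeholder checkpoint
prompt = "Explain rotary position embeddings in one sentence."

# Greedy decode with HF transformers.
tok = AutoTokenizer.from_pretrained(model_id)
hf = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
hf_ids = hf.generate(**tok(prompt, return_tensors="pt").to(hf.device), max_new_tokens=64, do_sample=False)
print("HF:  ", tok.decode(hf_ids[0], skip_special_tokens=True))

# Greedy decode with vLLM.
llm = LLM(model=model_id, dtype="bfloat16")
out = llm.generate([prompt], SamplingParams(temperature=0.0, max_tokens=64))
print("vLLM:", out[0].outputs[0].text)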


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@intervitens
Contributor Author

intervitens commented Apr 14, 2025

The model works with --enforce-eager; without it, the model loads but produces garbage outputs.
Edit: it also works fine without --enforce-eager when VLLM_USE_V1=0 is set.
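
For anyone reproducing this, the two workarounds look roughly like the following through the Python API; a minimal sketch under the assumption that the environment variable is set before the engine is created.

import os
os.environ["VLLM_USE_V1"] = "0"  # workaround 1: fall back to the V0 engine

from vllm import LLM
# workaround 2: disable CUDA graph capture (equivalent to --enforce-eager)
llm = LLM(model="THUDM/GLM-4-9B-0414", enforce_eager=True)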

Signed-off-by: intervitens <[email protected]>
Signed-off-by: intervitens <[email protected]>
@jeejeelee changed the title from "Fix GLM4 model" to "[Bugfix] Fix GLM4 model" on Apr 15, 2025
@DarkLight1337
Member

cc @zRzRzRzRzRzRzR can you check?

@zRzRzRzRzRzRzR
Contributor

yes, I will do this

@zRzRzRzRzRzRzR
Contributor

hidden_states = residual + hidden_states

This section should be retained. See here

@zRzRzRzRzRzRzR
Contributor

There seem to be some issues; I need to take a closer look. I found that the model cannot run normally now, although it could when I submitted my original PR. I will need to spend some time checking this.

@zRzRzRzRzRzRzR
Contributor

This PR caused the model output to be garbled. @intervitens, have you encountered this problem? I am using GLM-4-9B-0414.

@kalomaze

kalomaze commented Apr 15, 2025

This PR caused the model output to be garbled. @intervitens, have you encountered this problem? I am using GLM-4-9B-0414.

The 9B and 32B are not architecturally identical: the 9B seems to have attention biases, unlike the 32B.
Also, the KV head count for the reasoning 32B and the DeepResearch-style 32B (the Z1 and Z1-Rumination 32B models) seems, strangely enough, to be larger, if you check the configuration.
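
A quick way to check those configuration differences (attention bias and KV head count) is to read them straight from the HF configs. The repo ids are taken from this thread and the attribute names follow the usual transformers conventions, so treat both as assumptions:

from transformers import AutoConfig

for repo in ("THUDM/GLM-4-9B-0414", "THUDM/GLM-4-32B-0414", "THUDM/GLM-Z1-32B-0414"):
    cfg = AutoConfig.from_pretrained(repo)
    print(repo,
          "attention_bias =", getattr(cfg, "attention_bias", None),
          "num_key_value_heads =", getattr(cfg, "num_key_value_heads", None))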

@Chandler-Bing

This PR caused the model output to be garbled. @intervitens, have you encountered this problem? I am using GLM-4-9B-0414.

Adding --enforce-eager makes it output normally.

@zRzRzRzRzRzRzR
Contributor

This PR caused the model output to be garbled. @intervitens, have you encountered this problem? I am using GLM-4-9B-0414.

The 9B and 32B are not architecturally identical: the 9B seems to have attention biases, unlike the 32B. Also, the KV head count for the reasoning 32B and the DeepResearch-style 32B (the Z1 and Z1-Rumination 32B models) seems, strangely enough, to be larger, if you check the configuration.

I don’t think that’s the issue. The 9B and 32B models released by GLM do differ in bias (the 9B has bias, the 32B doesn’t), but this is already handled by the attention_bias setting in the configuration.

@zRzRzRzRzRzRzR
Contributor

zRzRzRzRzRzRzR commented Apr 15, 2025

vllm serve THUDM/GLM-4-9B-0414 --enforce-eager

was not successful for me; the error you mentioned does indeed exist.
The modification to

hidden_states, residual = self.post_attention_layernorm(hidden_states, residual)

is correct, but strangely, I got completely different outputs from the same model compared to the PR I submitted back then.
This really puzzles me.
Your change to the dim is also correct.
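
For concreteness, the pattern under discussion is vLLM's fused residual-add RMSNorm: when the running residual is passed into the norm, the explicit hidden_states = residual + hidden_states add is folded into the norm call, which returns both the normalized hidden states and the updated residual. A minimal sketch of that pattern (an illustration, not the GLM4 code itself):

import torch
import torch.nn as nn

class FusedAddRMSNorm(nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor, residual: torch.Tensor | None = None):
        if residual is not None:
            x = x + residual  # the residual add is folded in here
        residual = x          # updated running residual (pre-norm value)
        var = x.float().pow(2).mean(-1, keepdim=True)
        x = (x.float() * torch.rsqrt(var + self.eps)).to(x.dtype)
        return x * self.weight, residual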

@zRzRzRzRzRzRzR
Contributor

zRzRzRzRzRzRzR commented Apr 15, 2025

I tried reinstalling vLLM from source, and the issue was resolved. Under the current circumstances, your PR works correctly.

is_neox_style=False,

is not necessary.

Also, could you change THUDM/GLM-4-32B-Chat-0414 to THUDM/GLM-4-32B-0414 in docs/source/models/supported_models.md? We renamed the model, and there is no -Chat variant anymore.

@zRzRzRzRzRzRzR
Contributor

zRzRzRzRzRzRzR commented Apr 15, 2025

cc @DarkLight1337 @intervitens Thank you so much for your support again.

Also:

vllm serve THUDM/GLM-4-9B-0414

without --enforce-eager is working.

Signed-off-by: intervitens <[email protected]>
@mergify mergify bot added the documentation Improvements or additions to documentation label Apr 15, 2025
@intervitens
Contributor Author

Removing

is_neox_style=False,

causes the model output to become significantly degraded and repetitive.

VLLM_USE_V1=1 vllm serve THUDM/GLM-4-9B-0414

still doesn't work for me. @zRzRzRzRzRzRzR, did you figure out any changes to the PR that fixed it for you?

@zRzRzRzRzRzRzR
Contributor

zRzRzRzRzRzRzR commented Apr 15, 2025

In my setup, both is_neox_style=False and is_neox_style=True run normally, but I think it's safer to keep it set to False.
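
For context on why this flag matters: is_neox_style selects between two rotary-embedding layouts, and applying the wrong layout to a checkpoint trained with the other one can produce exactly the degraded, repetitive output described above. A minimal sketch of the two conventions (an illustration, not vLLM's implementation; cos and sin have shape [..., d // 2]):

import torch

def rope_neox(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    # NeoX / rotate-half style: element i is paired with element i + d/2.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((x1 * cos - x2 * sin, x2 * cos + x1 * sin), dim=-1)

def rope_gptj(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    # GPT-J / interleaved style: adjacent elements (2i, 2i + 1) form each pair.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    return torch.stack((x1 * cos - x2 * sin, x2 * cos + x1 * sin), dim=-1).flatten(-2)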

@zRzRzRzRzRzRzR
Contributor

This might be related to the CUDA version. I tested on an H100 with CUDA 12.4, but I'm not sure whether that's the cause.

@ad1192214879

Has this issue been resolved?

Signed-off-by: intervitens <[email protected]>
@intervitens
Contributor Author

I fixed the error that made the model output garbage when running without eager mode or VLLM_USE_V1=0.
This should be ready to merge now.

@DarkLight1337
Member

Can you verify again @zRzRzRzRzRzRzR ?

@icelinks

icelinks commented Apr 17, 2025

GLM-Z1-9B-0414 is OK, but GLM-Z1-32B-0414 repeats !!!!!
Run with CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 vllm serve GLM-Z1-32B-0414 --dtype half --max-model-len 65536 --tensor-parallel-size 8 --port 10033 --enforce-eager --rope-scaling '{"rope_type": "yarn","factor": 4.0,"original_max_position_embeddings": 32768}' on Tesla T4.

@zRzRzRzRzRzRzR
Contributor

This is expected behavior on a T4, which does not support BF16:

  1. FP16 cannot run inference on this model reliably; BF16 is required.
  2. This exclamation-mark error also seems to occur in some specific situations, but there is still no stable reproduction, and it seems to be unrelated to this PR (the problem also occurs with other frameworks).

You can submit the prompts that trigger the "infinite !" output to the THUDM/GLM-4 repository, and the staff will record them and try to reproduce and find the cause of the problem.

@zRzRzRzRzRzRzR
Contributor

I fixed the error that made the model output garbage when running without eager mode or VLLM_USE_V1=0. This should be ready to merge now.

It is working for me

Member

@DarkLight1337 DarkLight1337 left a comment


Thanks for fixing!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) April 17, 2025 04:56
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 17, 2025
@solrex

solrex commented Apr 17, 2025

I fixed the error that made the model output garbage when running without eager mode or VLLM_USE_V1=0. This should be ready to merge now.

Tested locally, works as expected. +1

@vllm-bot vllm-bot merged commit 5b1aca2 into vllm-project:main Apr 17, 2025
60 of 63 checks passed
lionelvillard pushed a commit to lionelvillard/vllm that referenced this pull request Apr 17, 2025
@icelinks

This is expected behavior on a T4, which does not support BF16:

  1. FP16 cannot run inference on this model reliably; BF16 is required.
  2. This exclamation-mark error also seems to occur in some specific situations, but there is still no stable reproduction, and it seems to be unrelated to this PR (the problem also occurs with other frameworks).

You can submit the prompts that trigger the "infinite !" output to the THUDM/GLM-4 repository, and the staff will record them and try to reproduce and find the cause of the problem.

OK, thanks. I'll try; at least GLM-Z1-9B-0414 is correct now.

@warlockedward

Also on V100: if dtype float16 is configured, all output from the model is !!!!!!!!!

@Curious-chen

Also on V100: if dtype float16 is configured, all output from the model is !!!!!!!!!

Yes, I also tried setting dtype float16 on an A6000 and it only outputs !!!!!!!!!

@icelinks

icelinks commented Apr 18, 2025

Also on V100: if dtype float16 is configured, all output from the model is !!!!!!!!!

Yes, I also tried setting dtype float16 on an A6000 and it only outputs !!!!!!!!!

It's weird; my colleague said it also appears with bf16, but only with GLM-Z1-32B-0414.

yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Apr 21, 2025
Signed-off-by: intervitens <[email protected]>
Signed-off-by: Yang Wang <[email protected]>
@rangehow

rangehow commented Apr 22, 2025

It seems there is still a problem. I am using multiple large models for standalone generation, including Qwen, Mistral-Large, Llama 4, Command-A, and Gemma 3 27B. All of the above models run normally except GLM4-32B.

(VllmWorker rank=2 pid=6072) ERROR 04-22 14:56:29 [multiproc_executor.py:470] WorkerProc hit an exception.
Traceback (most recent call last):
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-aipnlp/INS/ruanjunhao04/miniforge3/envs/sglang/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 465, in worker_busy_loop
    output = func(*args, **kwargs)
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-aipnlp/INS/ruanjunhao04/miniforge3/envs/sglang/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-aipnlp/INS/ruanjunhao04/miniforge3/envs/sglang/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 242, in execute_model
    output = self.model_runner.execute_model(scheduler_output)
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-aipnlp/INS/ruanjunhao04/miniforge3/envs/sglang/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-aipnlp/INS/ruanjunhao04/miniforge3/envs/sglang/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1148, in execute_model
    valid_sampled_token_ids = sampled_token_ids.tolist()
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

(The same error and traceback are also reported by worker ranks 3 and 0.)

One suspicious point is that I set a hyperparameter to allow the model to handle a longer context:

import os
from torch.cuda import device_count
from vllm import LLM

os.environ['VLLM_ALLOW_LONG_MAX_MODEL_LEN'] = '1'  # allow max_model_len above the model's default limit

llm = LLM(model=model_path, tensor_parallel_size=device_count(), enable_prefix_caching=True,
          task='generate', max_model_len=50000, dtype='bfloat16')

@darkness8i8

@yangw-dev @DarkLight1337 The !!!! output issue is happening with all models I run with dtype=torch.float16. I usually train with Llama 3.1 8B. Can you please look at this problem holistically? I don't believe it is model-specific.

@DarkLight1337
Member

If the model is originally trained on bfloat16, then there may be numerical stability issues when using float16 for inference due to the narrower range float16 supports.
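
In practice that means preferring bfloat16 on hardware that supports it, for example (a sketch; the THUDM/ prefix for the Z1 checkpoint and the tensor-parallel size are taken from this thread, not verified here):

from vllm import LLM

# bfloat16 keeps float32's exponent range; float16's narrower range is the suspected cause of the "!!!" output.
llm = LLM(model="THUDM/GLM-Z1-32B-0414", dtype="bfloat16", tensor_parallel_size=8)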

@darkness8i8

@DarkLight1337 That could be true, thanks. Please consider adding better error output than !!!; that would be super helpful.

jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Apr 29, 2025
lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025
adobrzyn pushed a commit to HabanaAI/vllm-fork that referenced this pull request Apr 30, 2025
Signed-off-by: intervitens <[email protected]>
Signed-off-by: Agata Dobrzyniewicz <[email protected]>
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: intervitens <[email protected]>
Signed-off-by: Mu Huai <[email protected]>