Open
Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- 5. Please use English, otherwise it will be closed.
Describe the bug
I tried to run on SGLang Llama-3_3-Nemotron-Super-49B-v1 recently announced by Nvidia.
It seems not to be yet supported by SGLang since DeciLMForCausalLM
is not yet accepted by SGLang. See below.
Can you add corresponding support?
Scheduler hit an exception: Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/sglang/srt/managers/scheduler.py", line 1748, in run_scheduler_process
scheduler = Scheduler(server_args, port_args, gpu_id, tp_rank, dp_rank)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/managers/scheduler.py", line 218, in __init__
self.tp_worker = TpWorkerClass(
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/managers/tp_worker_overlap_thread.py", line 63, in __init__
self.worker = TpModelWorker(server_args, gpu_id, tp_rank, dp_rank, nccl_port)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/managers/tp_worker.py", line 74, in __init__
self.model_runner = ModelRunner(
^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/model_executor/model_runner.py", line 166, in __init__
self.initialize(min_per_gpu_memory)
File "/usr/local/lib/python3.12/site-packages/sglang/srt/model_executor/model_runner.py", line 176, in initialize
self.load_model()
File "/usr/local/lib/python3.12/site-packages/sglang/srt/model_executor/model_runner.py", line 361, in load_model
self.model = get_model(
^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/model_loader/__init__.py", line 22, in get_model
return loader.load_model(
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/model_loader/loader.py", line 358, in load_model
model = _initialize_model(
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/model_loader/loader.py", line 137, in _initialize_model
model_class, _ = get_model_architecture(model_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/model_loader/utils.py", line 37, in get_model_architecture
return ModelRegistry.resolve_model_cls(architectures)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/models/registry.py", line 65, in resolve_model_cls
return self._raise_for_unsupported(architectures)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/sglang/srt/models/registry.py", line 32, in _raise_for_unsupported
raise ValueError(
ValueError: Model architectures ['DeciLMForCausalLM'] are not supported for now. Supported architectures: dict_keys(['BaichuanForCausalLM', 'ChatGLMModel', 'CohereForCausalLM', 'Cohere2ForCausalLM', 'DbrxForCausalLM', 'DeepseekForCausalLM', 'MultiModalityCausalLM', 'DeepseekV3ForCausalLMNextN', 'DeepseekV2ForCausalLM', 'DeepseekV3ForCausalLM', 'ExaoneForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'Gemma2ForSequenceClassification', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GraniteForCausalLM', 'Grok1ForCausalLM', 'Grok1ModelForCausalLM', 'InternLM2ForCausalLM', 'InternLM2ForRewardModel', 'LlamaForCausalLM', 'Phi3ForCausalLM', 'InternLM3ForCausalLM', 'LlamaForClassification', 'LlamaForCausalLMEagle', 'LlamaEmbeddingModel', 'MistralModel', 'LlamaForSequenceClassification', 'LlamaForSequenceClassificationWithNormal_Weights', 'LlavaLlamaForCausalLM', 'LlavaQwenForCausalLM', 'LlavaMistralForCausalLM', 'LlavaVidForCausalLM', 'MiniCPMForCausalLM', 'MiniCPM3ForCausalLM', 'MiniCPMV', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MllamaForConditionalGeneration', 'OlmoForCausalLM', 'Olmo2ForCausalLM', 'OlmoeForCausalLM', 'Phi3SmallForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2_5_VLForConditionalGeneration', 'Qwen2ForCausalLMEagle', 'Qwen2MoeForCausalLM', 'Qwen2ForRewardModel', 'Qwen2VLForConditionalGeneration', 'StableLmForCausalLM', 'TorchNativeLlamaForCausalLM', 'TorchNativePhi3ForCausalLM', 'XverseForCausalLM', 'XverseMoeForCausalLM', 'YiVLForCausalLM'])
Reproduction
Start SGLang and with nvidia/Llama-3_3-Nemotron-Super-49B-v1
coming from HuggingFace
The message above will appear right after this command
Environment
Amazon Linux 2023
SGLang 0.0.4.post1 = last officially published version as of this writing