Fix vLLM ValueError: Model architectures ['Qwen2ForCausalLM'] failed to be inspected

Post author By yodi
Post date March 17, 2025

When running vLLM with the command below, I got the error "ValueError: Model architectures ['Qwen2ForCausalLM'] failed to be inspected":

vllm serve unsloth/DeepSeek-R1-Distill-Qwen-32B-bnb-4bit --enable-reasoning --reasoning-parser deepseek_r1 --quantization bitsandbytes --load-format bitsandbytes --enable-chunked-prefill --max_model_len 6704

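The solution is to set the environment variable VLLM_USE_MODELSCOPE=True, which makes vLLM load the model from ModelScope instead of the Hugging Face Hub.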

For example, prefix the same command with the variable:

VLLM_USE_MODELSCOPE=True vllm serve unsloth/DeepSeek-R1-Distill-Qwen-32B-bnb-4bit --enable-reasoning --reasoning-parser deepseek_r1 --quantization bitsandbytes --load-format bitsandbytes --enable-chunked-prefill --max_model_len 6704
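
Prefixing the command sets the variable for that single run only. If you restart the server often, it may be easier to export the variable once per shell session. The snippet below is a minimal sketch, assuming a POSIX shell such as bash; the check line is only a sanity test I added and is not part of vLLM.

```shell
# Export once so every later command in this shell session inherits it.
export VLLM_USE_MODELSCOPE=True

# Sanity check (my own addition): confirm child processes will see the variable.
env | grep '^VLLM_USE_MODELSCOPE='
```

After this, the original vllm serve command can be run in the same shell without the prefix. The export is lost when the shell closes, so add it to your shell profile if you want it to persist.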

Tags Model architectures ['Qwen2ForCausalLM'] failed to be inspected
