1Cat-vLLM
Author: donitb934
Optimizes AWQ 4-bit inference on Tesla V100 GPUs, with improved speed and stability and support for modern large models such as Qwen3.5 and Mixture-of-Experts (MoE) architectures.
Source: donitb934/1Cat-vLLM
C · Review first
Author unclaimed
Clear source
Execution · High
Audit focus · unexpected code execution
Universal · 0