turboquant
Repository: turboquant
Author: 0xSero · Source status: Clear source
TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values), with Triton kernels and vLLM integration
Score basis: Clear source · Risk needs review · Universal