Smaller models with smarter architectures and low-bit quantized models are two avenues toward greater efficiency. I'm really curious how far they'll go. This article focuses on low-bit quantized models, and the prospects are interesting.