692: Lossless LLM Weight Compression: Run Huge Models on a Single GPU — with Jon Krohn


Join @JonKrohnLearns as he walks listeners through SpQR (Sparse-Quantized Representation), a near-lossless LLM weight compression technique built on quantization. In this week's episode, Jon covers the four steps behind the method.

Super Data Science: ML & AI Podcast with Jon Krohn