How to make LLMs fast: KV Caching, Speculative Decoding, and Multi-Query Attention | Cursor Team

See below for guest bio, links, and to give feedback, submit questions, contact Lex, etc.

*GUEST BIO:*
Aman Sanger, Arvid Lunnemark, Michael Truell, and Sualeh Asif are creators of Cursor, a popular code editor that specializes in AI-assisted programming.

*CONTACT LEX:*

*EPISODE LINKS:*

*SPONSORS:*
To support this podcast, check out our sponsors & get discounts:
*Encord:* AI tooling for annotation & data management.
*MasterClass:* Online classes from world-class experts.
*Shopify:* Sell stuff online.
*NetSuite:* Business management software.
*AG1:* All-in-one daily nutrition drinks.

*PODCAST LINKS:*

*SOCIAL LINKS:*
Comments
Author

Imagine if human behaviour were tied to the time zones people live in, and you could guarantee 5 Ti GPUs running in each household every morning to heat water for showers, like Santa flying around the world, doing computations while heating on demand 😂

ValidatingUsername
Author

I really like the channel, but I had to unsubscribe because my feed is absolutely flooded with short clips from this channel, making it annoying as fuck to find anything else.

saintsplenetic