Launching the fastest AI inference solution with Cerebras Systems CEO Andrew Feldman

In this episode of Gradient Dissent, Andrew Feldman, CEO of Cerebras Systems, joins host Lukas Biewald to discuss the latest advancements in AI inference technology.

They explore Cerebras Systems' new AI inference product, examining how the company's wafer-scale chips are setting new benchmarks in speed, accuracy, and cost efficiency. Andrew shares insights on the architectural innovations that make this possible and discusses the broader implications for AI workloads in production. This episode provides a comprehensive look at the cutting edge of AI hardware and its impact on the future of machine learning.

⏳Timestamps:
00:00 - Introduction
04:28 - Cerebras Systems' Latest Product Announcement
12:59 - The Challenges of AI Inference
18:34 - Architectural Innovations in Wafer-Scale Chips
22:17 - Real-World Applications of AI Inference
27:03 - Speed vs. Accuracy: Striking the Balance
32:46 - Overcoming Latency Issues
38:21 - The Future of AI in Production Environments
42:15 - Competing with Industry Giants
47:39 - Open Source vs. Closed Source in AI Development
52:58 - The Impact of AI on Chip Manufacturing
57:23 - Final Thoughts and Takeaways

Paper Andrew referenced: Paul David (economic historian)
Comments

They are lightning fast, and the voice assistant they made is amazing and free, for now at least.

andersonsystem

Very impressive inference speed, and an insightful talk with Andrew. Cheers! Groq, Samba, Cerebras (most impressive)... all going for the speed.

ashred

Cerebras inference is indeed impressive. I was getting 1800 t/s yesterday, which is incredible. It is also incredibly difficult to manage. Utilising all that output is like trying to drink from a fire hose at the moment!

Could either of you recommend an agentic setup that I can use with Cerebras as a base to build on for the Metaculus forecasting tournaments?

christopherd.winnan

Between these guys and Groq, it's hard to get excited about them when I can't use them in a production environment. Groq's API is useless with their inference use limits. Oh well, I suppose we'll get there eventually.

User.Joshua