AI Inference is ABOUT to CHANGE!!!

🔗 Links 🔗
Apple M4 Chip

MLX Community on Hugging Face

❤️ If you want to support the channel ❤️
Support here:

🧭 Follow me on 🧭
Comments
Author

Good luck to folks who are trapped in Apple's walled-garden model, but it's not for me. I will stick with PCs; I can rely on them not being deliberately obsoleted. Apple's vision of AI is closed and censored. That's up to them. It's not for me though.

rodvik
Author

I really doubt the future of on-device LLMs. It's nice to chat and such, but if you think about quality and usage, the larger models are always going to be better, unless the gap in performance between large and small models gets reduced, which seems highly unlikely.
Also, technologies like Groq's LPU are just too fast right now and will most likely get better. There are certainly going to be use cases for on-device LLMs, but you can always go on the internet and get an answer with almost minimal latency.

prashantsolanki
Author

It will certainly dominate the edge and could be a challenger for inference. This is a bigger challenge for AMD than for Nvidia.

mvasa
Author

Interesting take, but fairly speculative since we don't know yet whether the new NPU helps in any way.
MLX and llama.cpp run LLMs on the GPU, with the CPU as fallback, and in both cases performance mostly depends on the available memory bandwidth.

Typically, despite sharing the same RAM, the GPU's memory bandwidth is a lot higher than that of all the CPU cores combined.
What about the NPU memory bandwidth on the M4? It wasn't mentioned in the keynote, but it's what matters most for LLM inference.

BTW they're comparing to Intel Core Ultra's NPU, since Qualcomm's platform is not released yet.
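As a rough back-of-envelope illustration of the bandwidth point above (the figures below are hypothetical examples, not measured M4 numbers), a sketch of the usual estimate for bandwidth-bound decoding:

```python
# Back-of-envelope estimate for memory-bandwidth-bound LLM decoding:
# each generated token requires streaming roughly all the weights from
# memory once, so tokens/s ~= bandwidth / model size in bytes.
# (Ignores KV cache, activations, and compute limits.)

def estimated_tokens_per_second(bandwidth_gb_s: float,
                                params_billions: float,
                                bytes_per_param: float) -> float:
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Hypothetical numbers for illustration only:
# a 7B model at ~4-bit (0.5 bytes/param) on 100 GB/s vs 400 GB/s.
for bandwidth in (100, 400):
    tps = estimated_tokens_per_second(bandwidth, 7.0, 0.5)
    print(f"{bandwidth} GB/s -> ~{tps:.0f} tokens/s")
```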

supercurioTube
Author

You hit the nail on the head about Intel and AMD not capturing the market potential. Unfortunate but true!

mvasa
Author

Whether on the device or in the cloud, these models are capable of compressing data so intensely that they are the perfect spy tool for many applications. Just think about it: if the average size of an LLM is a few gigabytes and it contains most of the knowledge humanity has created, what is it for the model to compress one person's data? It would be a few bits in size, which could be imperceptibly attached to any file, or encoded in a file so that it is invisible, and sent over an electrical network that has long been connected to the Internet. And all this even without the IoT running 24/7. No place to hide😉

TheGalacticIndian
Author

So the base M4 chip has 38 TOPS (Neural Engine only). Nice.

psguy
Author

Just not interested in being trapped with Apple 😂.

Custodian
Author

Apple had AI chips before this whole AI thing took off, but Apple did not say anything about it; they just minded their own business. Now they are coming out and being vocal because they know they might be left behind, and they need to earn money.

adriangpuiu
Author

Could similar things be done with the latest Ryzen 8700G and sufficient system RAM?

jmirodg
Author

Wait until Apple says they invented LLMs.

I once read that somewhere in a YouTube comment.

RickySupriyadi
Author

Which LLMs can be run on the most powerful M4 tablet?
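For reference, a minimal sketch of running a 4-bit quantized model from the mlx-community Hugging Face hub with the mlx-lm Python package on a Mac; the model name is just an example, what fits depends on the device's RAM, and on an iPad the MLX Swift bindings would be needed instead of Python:

```python
# Minimal sketch: load a 4-bit quantized model from the mlx-community hub
# and generate text with mlx-lm (pip install mlx-lm, Apple silicon assumed).
from mlx_lm import load, generate

# Example repo name (assumption): any mlx-community quantized model small
# enough for the available unified memory would work the same way.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")

response = generate(model, tokenizer,
                    prompt="Explain what a Neural Engine is in one sentence.",
                    max_tokens=100)
print(response)
```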

simplemanideas
Author

1:41 Is this just added for marketing, or is it really a physical difference in the architecture?

GetzAI
Author

Hey @1littlecoder, could you please make a video on the Idefics2 model from Hugging Face, and also on fine-tuning it for a custom dataset? I would be really glad 😊

fouziaanjums
Author

Still, the Snapdragon X Elite has 45 TOPS on the base model, with higher efficiency at less than 4 watts.

vickeythegamer
Author

Am I about to live in a world where I actually have to buy Apple products to be on the bleeding edge of graphics?
I'm not convinced yet. Companies like Groq etc. can make accelerator cards, but the reason Nvidia is on top of every ML engineer's mind is CUDA, unless I'm missing something?

Spreadshotstudios
Author

I mean, they are Apple. They will do everything they can to make MLX incompatible with M1 hardware within the next month. Apple is predatory in the worst ways. It kind of puzzles me, because Apple's whole thing is "let's make something expensive, and white." Are they going to try to make things even more expensive than Nvidia? No, whiter and smoother I guess, or more of an "exclusive club."

Listen, Nvidia is NOT great,
but Apple is way worse.

EternalKernel
Author

The M4 is nothing compared to the upcoming chips on TSMC's A16 and N2 nodes, which are based on GAA tech, what Intel calls RibbonFET in its 20A and 18A processes. Chips on TSMC's A16 will head straight to Apple's devices and should be capable of running an older GPT-4-class model on device. The next Apple iMac and desktops will be insane.

paulmuriithi
Author

I will never buy an Apple product, but it could be good for the industry indeed. As for the market, I think that on a product level they are competing with, or rather preventing the rise of, AMD, which has invested a lot in its APUs, because M chips are only that: CPUs with integrated graphics. There is no chip dedicated to AI on consumer hardware yet. If AMD manages to make a chip that reaches the same performance as Apple's, it could be a better option, as the user would be able to buy and upgrade the RAM as they please.

Raphy_Afk
Author

Apple is shit; I can buy a 512 GB server with 4x 3090s for the same price.

akierum