Geekbench AI Shows Why AI Benchmarking is HARD!

AI and Machine Learning are everywhere, from Large Language Models to the neural networks that blur out the background on our phones. It is more important than ever to be able to test the speed and performance of our AI-enabled devices. To that end, Primate Labs has released Geekbench AI v1.0. However, using it we quickly discover that testing AI performance is HARD!
---

#garyexplains
Comments

My Pixel 8 Pro scored 412 Single, 4325 Half, and 5740 Quantized using NNAPI.

Way different from the Qualcomm results!

silversentinal

12:43 YES, 1000% agree with you, TOPS is USELESS! I've been thinking this ever since the AI PCs from all vendors started using TOPS as a performance measurement.
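To make that objection concrete, here is a back-of-the-envelope roofline sketch in Python. All numbers are hypothetical, for illustration only: a headline TOPS figure only bounds compute-limited layers, while memory-bound layers run far below it.

    # Roofline sketch: attainable throughput is the lesser of peak
    # compute and (bandwidth x arithmetic intensity). Numbers are
    # hypothetical, for illustration only.
    peak_tops = 45.0      # vendor headline, trillions of INT8 ops/s
    mem_bw_gbs = 60.0     # memory bandwidth, GB/s

    def attainable_tops(ops_per_byte):
        bw_limited = mem_bw_gbs * 1e9 * ops_per_byte / 1e12
        return min(peak_tops, bw_limited)

    print(attainable_tops(2.0))     # memory-bound layer: ~0.12 TOPS
    print(attainable_tops(1000.0))  # compute-bound layer: the full 45 TOPS

Two devices with identical headline TOPS can therefore post very different benchmark scores if their memory systems differ.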

LuckynbOC

9:18 NN hardware like that is specialized: often not fast, but supposedly really efficient.

autohmae

Are you sure the QNN results for the 8 Gen 3 are correct? I can't even get close to those Half and Quantized results.
It also seems pretty weird for the 8 Gen 3 to beat the A17 Pro and even my PC GPU by such a large margin. Even the Elite X only gets 2k, 10k and 20k respectively.

Ignacio.Romero

8:40 Hold on, CPUs don't support half-float. Converting weights in-flight is bound to add substantial overhead (task-dependent, not universally accountable for), and it can be a lot higher than for quantised weights, for which you can just create L1-cached value-outcome tables, or which are a good fit for even classic MMX and other SIMD instructions. Unfortunately, you can't assume you have double the RAM to expand all the weights to single precision; it depends on the type of system you're running on, how much physical RAM it has, whether it's equipped with NVMe or eMMC, etc.
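A minimal NumPy sketch of the two paths described above, assuming the CPU computes in fp32 (sizes and names are illustrative): upcasting fp16 weights in flight doubles the bytes touched, while int8 weights can be dequantised through a 256-entry value table of the kind mentioned.

    import numpy as np

    rng = np.random.default_rng(0)
    w16 = rng.standard_normal((1024, 1024)).astype(np.float16)  # weights stored as half
    x = rng.standard_normal(1024).astype(np.float32)

    # Path 1: in-flight upcast. The fp16 weights are expanded to fp32
    # before the matmul, doubling the bytes the CPU has to touch.
    y = w16.astype(np.float32) @ x

    # Path 2: int8 weights plus a 256-entry dequantisation table
    # (the "value-outcome table"): one cheap lookup per weight.
    scale = float(np.abs(w16).max()) / 127.0
    w8 = np.clip(np.round(w16.astype(np.float32) / scale), -127, 127).astype(np.int8)
    table = np.arange(-128, 128, dtype=np.float32) * scale      # all 256 possible values
    y_q = table[w8.astype(np.int16) + 128] @ x                  # lookup, then matmul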

SianaGearz

I think ULmark could come up with a cross-platform AI benchmark, because the results can be compared across RTX, AMD, Intel and SoCs.

vasudevmenon

Thanks for explaining, but I was expecting something more for us dummies, especially what those three tests really mean for us: what kind of task each of them influences, and whether there is a difference. I have no idea whether single precision affects text generation and the last test affects image manipulation, or whether there's no connection at all, or whether it's more complicated. Thanks.
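Roughly, the three scores correspond to fp32 (Single), fp16 (Half) and int8 (Quantized) arithmetic: the same models are run at each precision, trading a little accuracy for speed and memory. A minimal NumPy sketch (illustrative only, not Geekbench's actual workloads) of one computation at the three levels and the accuracy each gives up:

    import numpy as np

    rng = np.random.default_rng(1)
    w = rng.standard_normal((256, 256)).astype(np.float32)
    x = rng.standard_normal(256).astype(np.float32)

    single = w @ x                                    # Single: fp32 baseline
    half = (w.astype(np.float16) @ x.astype(np.float16)).astype(np.float32)
    s = float(np.abs(w).max()) / 127.0                # Quantized: int8 weights
    wq = np.round(w / s).astype(np.int8)
    quant = (wq.astype(np.float32) * s) @ x

    for name, y in (("Half", half), ("Quantized", quant)):
        err = float(np.max(np.abs(y - single)) / np.max(np.abs(single)))
        print(name, "relative error:", round(err, 4))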

Aladinzc

I look forward to some dedicated inference-only hardware coming out. Stuff with enough RAM to hold real models, like the ones from Groq (not to be confused with Grok).

marcusk

Any thoughts on AI performance per watt?
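A hedged sketch of how such a figure could be derived, assuming you can sample power draw while looping inference; run_inference and read_power_watts are hypothetical placeholders, not a real API.

    import time

    # Placeholder hooks: run_inference() performs one inference,
    # read_power_watts() samples whatever your platform exposes
    # (external meter, battery API, RAPL, ...).
    def benchmark_efficiency(run_inference, read_power_watts, iters=200):
        samples = []
        t0 = time.perf_counter()
        for _ in range(iters):
            run_inference()
            samples.append(read_power_watts())
        elapsed = time.perf_counter() - t0
        avg_power = sum(samples) / len(samples)   # watts = joules/second
        throughput = iters / elapsed              # inferences/second
        return throughput / avg_power             # inferences per joule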

silversentinal

I’m confused about the lack of any mention of memory bandwidth.
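It does matter: for LLM token generation every weight is read roughly once per token, so bandwidth alone caps tokens per second regardless of compute. A quick sketch with hypothetical numbers:

    # Memory bandwidth puts a hard floor on LLM latency, whatever the
    # TOPS figure says. All numbers below are illustrative.
    params = 7e9            # 7B-parameter model
    bytes_per_param = 0.5   # 4-bit quantised weights
    mem_bw_gbs = 50.0       # phone-class LPDDR bandwidth, GB/s

    bytes_per_token = params * bytes_per_param
    ceiling = mem_bw_gbs * 1e9 / bytes_per_token
    print(f"bandwidth-limited ceiling: {ceiling:.1f} tokens/s")  # ~14 tokens/s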

JaviSoto

AI looks so good, we should let AI run our military operations. What could possibly go wrong? :)

mikldude

Some surprising results in the Half-precision and Quantized table, never knew. Maybe we could use AI to test AI, with results in two buckets: smart vs dumb. Enjoyed your analysis.

test

Too bad you didn't have a Mac to look at. Using the Neural Engine, it simply blows away the competition. The best PC is 33,000, but the average Mac with a Neural Engine is 52,000.

zedpassway

The average consumer will never need to worry about this. LLMs and image classification are not a normal person's workload.

Iswimandrun

In all seriousness, I'm now even more sceptical of AI than I already was.

mikldude