The Problem with Benchmarks

Benchmarks are great, but they have limitations... and choosing to buy a computer because it has great test scores is not always the best plan.

In this video, I show how common benchmarks, like Geekbench 6, arrive at a total score, and explain some of the limitations and edge cases.
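
For context, composite scores are commonly built by normalising each sub-test against a baseline machine and combining the ratios with a weighted geometric mean, so no single sub-test can dominate the total. The sketch below is a generic illustration of that approach, not Geekbench's exact formula; the sub-test names, baseline figures and weights are made up.

```python
from math import prod

# Generic composite-score sketch: each sub-test result is normalised against a
# baseline machine, then the ratios are combined with a weighted geometric mean.
# Sub-test names, baseline values and weights here are illustrative only.

def composite_score(results, baseline, weights, scale=1000):
    total_w = sum(weights.values())
    ratios = [(results[t] / baseline[t]) ** (weights[t] / total_w) for t in results]
    return scale * prod(ratios)  # 1000 = the score of the baseline machine

baseline = {"clang": 100.0, "navigation": 100.0, "photo_filter": 100.0}
results  = {"clang": 140.0, "navigation": 105.0, "photo_filter": 180.0}
weights  = {"clang": 1.0,   "navigation": 1.0,   "photo_filter": 1.0}

print(round(composite_score(results, baseline, weights)))  # one composite number
```

A big win on one sub-test (photo_filter here) lifts the total even if the workloads you actually run look more like the other two, which is exactly the kind of edge case the video discusses.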

Join this channel to get access to perks:

PLEASE SUPPORT THE CHANNEL:
As an Amazon Influencer I earn from qualifying purchases
Apple Store on Amazon

Gaming Laptops on Amazon

#benchmark #apple #pc
Comments

RE-UPLOADED DUE TO AUDIO GLITCHES - apologies to anyone who left a comment on the broken version, and sorry for not checking my work properly!

ConstantGeekery

Great video, good to see someone breaking down the different elements.

deavo

Just wanted to say that you come across as very informed, unbiased and smart in your videos. A joy to listen to them.

sueside

Great rundown, also, the set is looking clean and cozy!

AlanW

Great video! Reminds me of the way people would rely on micro-benchmarks to determine which programming language was ‘faster’, which completely ignores the context in which the program runs. For example, modern Java JREs have ‘just in time’ optimization, which means that as your program executes, the JRE starts optimizing the hot code paths based on real-world behaviour, compared with the static, compile-time optimization typically found in C programs. Or if you were comparing Python: if the code only executes once you have to factor in the time to compile it to a .pyc, but if that same .pyc is compiled once and then executed over and over, millions of times, the compile time is irrelevant.

Similarly, you can have a fast and efficient compiled program, but if it has to call an old database over the internet, that pretty much makes your choice of language a non-factor in application performance. Etc.

doctorscoot
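
To make the micro-benchmarking point above concrete, here is a crude sketch of how a one-off cold run and an amortised warm run of the same code can tell very different stories, much like JIT warm-up or .pyc compilation. The sleep-based "one-time cost" and the timings are purely illustrative.

```python
import time
from functools import lru_cache

# Crude analogue of a one-time cost (JIT warm-up, .pyc compilation, cache fill):
# the first call pays the full price, later calls reuse the cached result.

@lru_cache(maxsize=None)
def expensive_setup(n):
    time.sleep(0.2)              # stand-in for the one-time cost
    return sum(range(n))

start = time.perf_counter()
expensive_setup(1_000_000)       # cold run: pays the one-time cost
cold = time.perf_counter() - start

start = time.perf_counter()
for _ in range(1_000):
    expensive_setup(1_000_000)   # warm runs: the one-time cost is amortised away
warm = (time.perf_counter() - start) / 1_000

print(f"cold run: {cold:.3f} s   warm run: {warm * 1e6:.1f} µs")
```

Whether the "right" number is the cold figure or the warm one depends entirely on how the code is used in practice, which is the commenter's point.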

Nice video. I agree with the points you make. However, the average user is unlikely to fully understand what a given benchmark represents. That point is self-evident from some of the comments already posted on this video.
So, given this lack of understanding, general benchmarks are still a safer bet than the results of any one specific application.

steveseidel

Fair comment. It feels like the benchmarks need to assign individual weighting to each test based on how it would generally affect the average user. Perhaps a little less weight on Clang or Navigation when a final score is calculated.

POVwithRC

It’s hard to justify yearly upgrades. An Apple Silicon machine is such a good value that, unless you really need a new feature for your workflow, a new purchase won’t be necessary. That said, the AV1 and ray tracing features are a nice addition.

ufopsi

I appreciate that your explanations are always so clear, thanks.

bryans

The only question I'm interested in is whether the LLM fits into the VRAM. 😁 - But yes, we'd be best served by use-case-related benchmarks. Are there any?

MeinDeutschkurs
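
For the "does the LLM fit in VRAM" question, a rough back-of-envelope check looks like the sketch below. The 20% overhead allowance for KV cache and runtime buffers is an assumption, and the model in the example is hypothetical.

```python
# Rough check: will an LLM's weights (plus overhead) fit in a given amount of VRAM?
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def fits_in_vram(params_billions, quant, vram_gb, overhead=1.2):
    """overhead=1.2 is a rough allowance for KV cache and runtime buffers."""
    weights_gb = params_billions * BYTES_PER_PARAM[quant]  # ~1 GB per billion params per byte
    return weights_gb * overhead <= vram_gb

# Hypothetical 13B-parameter model on a 16 GB card:
for quant in ("fp16", "int8", "int4"):
    print(quant, fits_in_vram(13, quant, 16))
```

Not a benchmark, but it answers the fit/doesn't-fit question before any performance numbers even matter.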

People were always fully focused on numbers... back in the day I remember hearing, oh, the CPU only has XXX GHz (well, I remember even MHz), or the scanner only has XXXX DPI... Maybe because I was in the hardware-selling business, I always thought a bit differently about this. In 2012 I bought my first MacBook Pro, the very first Retina model. I certainly didn't need the performance, I simply liked the computer. Got one of the higher-end CPUs, more RAM, bigger storage. For my needs it was bloody expensive and simply too much power... But I found myself still using it after 10 years; I just had to replace the battery, and it didn't get the latest OS updates, but it was still fast enough for 'normal' work, a bit slower and louder for intensive work. So when I decided I wanted a new computer, I went for the future-proof concept again and got an M1 Max MBP, and we'll see how the experience looks after 10 years. And BTW, my original MBP Retina went further: a friend of mine got it from me for web browsing, watching movies and light office work, and he is very happy with it 🙂

masterphoenixpraha

"Always do your research". Whole-heartedly agreed! I've been waiting for the Threadripper Pro 79xxWX to finally come, and I'm aiming at 7975 for now, paired with dual A2000 NVidia cards for AI and LLM dev. Another niche, i know. And man, that stuff is expensive!

My last system (dual processor Xeon, 12 cores each, 48GB mem) lasted over 10 years. After some minor upgrades (faster SSD, mem and graphics) it's time for something new. Been researching for weeks now, but hey, you only buy these beasts once a decade. My prev system is still faster than my high end 2023 laptop lol. Not in benchmarks, but in actual every day workload.

Misteribel

Yeah, lots of them are just flat-out irrelevant for the real world now, TBH. Looking at 3D, they'll use the Blender or C4D CPU renderer, which no one uses any more, because we all use a GPU renderer outside of very niche cases.

Or they'll use a benchmark that maxes out all the cores, but your software just doesn't work that way.

Also, Adobe software is grossly inefficient in many places and just doesn't take advantage of what it's offered, for a whole raft of reasons.

Looking at so-called "Blender and C4D" benchmarks would make you think you need a Threadripper, whereas actually an RTX 4090 and a CPU with a high boost speed and fewer cores would serve you better.


I've been watching reviews of Intel Arc video cards, and it turns out they are killer for video and photo editing. They have 4 or 5 different decoders built in, kind of like the M1/2/3, plus, if you have a newer Intel 12th or 13th gen CPU, it combines the APU accelerators with the Arc GPU and multiplies the performance, theoretically by 2x to 4x.

Technocrat.

Interesting stuff! I ended up going for a single-CCD Ryzen CPU after having used a multi-CCD CPU that gave me issues in some tasks.

Similarly, putting a GeForce GT 710 in a system with a Haswell i5 improved general computing speed (browsing the web and such). I think, though I cannot confirm this as I have neither the knowledge nor the tools to do so, that having dedicated VRAM offloads RAM bandwidth use, as well as being slightly faster in general. The card was only put in because the HDMI port on the board was bad, so I did not expect this side effect.

basbas

[REPOSTED] Well done. Yes, benchmarks can be highly misleading by drawing our attention towards irrelevant considerations and away from relevant ones. It's still crazy that we're obsessed with ray-tracing benchmarks when ray tracing is an edge case using dedicated silicon blocks (RT cores & SIMD) that almost no other task uses - the very definition of misleading.

BTW, Geekbench 6 differs from GB5 in that it no longer runs multiple instances of the tests concurrently on the available CPU threads; instead it adopts a multithreading architecture similar to real applications, ditching the previous artificial scalability (much to the disdain of some users). Cinebench has also been updated, as R23 tested SIMD with an Intel-controlled library (Embree) that disabled Apple Silicon's second NEON SIMD unit.

Also, Metal currently limits GPU access to 75% of total RAM, but it's way more efficient than physically capped VRAM.

daveh
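
The GB5-versus-GB6 scaling difference described above can be sketched roughly like this: running N independent copies of a workload keeps every core busy with its own job (throughput scales almost linearly with core count), while splitting one shared task across N workers, the way real applications tend to, adds coordination and serial steps. This is a toy illustration with made-up work, not Geekbench's actual implementation.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def crunch(n):
    # CPU-bound toy workload
    return sum(i * i for i in range(n))

def independent_copies(workers, n):
    """GB5-style: every worker runs its own full copy of the workload."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(crunch, [n] * workers))
    return time.perf_counter() - start

def shared_task(workers, n):
    """GB6-style: one workload is split into chunks and the results merged serially."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(crunch, [n // workers] * workers))
    merged = sum(partials)  # serial combine step on top of process start-up overhead
    return time.perf_counter() - start

if __name__ == "__main__":
    for w in (1, 4, 8):
        print(f"{w} workers: independent {independent_copies(w, 2_000_000):.2f} s, "
              f"shared {shared_task(w, 2_000_000):.2f} s")
```

The first style rewards raw core count; the second is closer to how most desktop applications behave, which is why the two benchmark generations can rank the same CPUs differently.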

Hello. Great video. For a developer, which is the better buy: the M1 Max 14" with 32 GB or the M2 Pro 14" with 16 GB? Thank you for your professional opinion.

robertomarianic

Men lie, women lie, but numbers don't.

KevinWhiting-ji