Linux v Windows: Which is FASTER? - Software Drag Racing!

Retired Microsoft operating system developer Davepl races Linux against Windows on the latest Threadripper CPUs to determine once and for all which system is actually fastest under maximum load.

Davepl develops a multicore prime sieve capable of solving the primes to a BILLION in under a second - or dozens of times per second on the "big" CPUs!

All profits from the sale of Dave's Garage mugs go to Autism Research:

Comments

Next drag race suggestion: WSL vs Linux running natively.

hellterminator

I almost feel like I'm seeing the beginnings of Dave's Garage Benchmark...

noodlefu_

Just want to drop a comment about the quality of the camera, its focus on the face, and the clean audio. Not a lot of people will appreciate it, but I notice that tight depth of field.

tehsraphm

@Dave's Garage I wrote a multithreaded prime sieve a few years back. For finding primes between 1 and 10,000,000 on my Threadripper 3960X, it takes ~0.01 seconds regardless of the number of threads used. You are correct: multithreading won't bring any gains when the sieve is heavily dependent on prior work.

Now, for finding primes between 1 and 1,000,000,000, we start to see some differences (+/- 0.10 seconds):

1 thread = 5.032 seconds
2 threads = 4.816 seconds
4 threads = 4.72 seconds
8 threads = 4.450 seconds
16 threads = 3.989 seconds
32 threads = 3.702 seconds
48 threads = 3.548 seconds

A better multithreaded test is Buddhabrot rendering.

MichaelPohoreski

Dave, did you do your testing for over 16 cores on Enterprise builds of Windows? I know back in the day they used to artificially cripple the consumer builds and limit performance at both the HAL and kernel levels to some degree, so you couldn’t take advantage of many cores and huge amounts of memory. Not sure if that is the case with later generations like Windows 8+, but I wouldn’t put it past them 😉 I’m genuinely curious myself as to why Linux was so much faster with C over 16 cores 🤔 I wonder if the Windows Intel C compiler is also doing something funny here. I’m really rusty since I haven’t coded in a good long time, but this got my cogs turning a bit, so I thought I would comment. I shot you an email. Keep on killing it with these videos, man 👍🏻

Barnacules

I appreciate your balanced view of the drag races. It is amazing how close everything is. Thank you for the great video series.

brucewilliams

Hi! About your remarks for how to thread the prime sieve:
You should be perfectly fine to hand out marking of multiples for the entire sieve range to individual threads; I don't think the case you described would cause any issues.
The worst thing that would happen is that a thread picks up a factor that should have been eliminated, and marks all multiples of it as non-prime. As long as another thread marks that factor itself later as non-prime, all that has happened was simply that redundant work was done.
Example: 4 threads start marking off multiples; they all start at the same time, marking off the multiples of 3, 5, 7 and 9 (since those are the first 4 factors in your sieve). Of course, all multiples of 9 would be eliminated by the thread marking off multiples of 3, but no ill effects result from this redundant work; you just lose a bit of efficiency at the beginning. The further the sieve progresses, the less likely it becomes that a thread is doing redundant work.
It's of course important for this that you DON'T use vector<bool> or any container that stores the bit fields by ORing values together, as you run into contention issues. But using e.g. a vector<char> should work just fine.
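
A minimal sketch of the scheme described above, under a few assumptions that are not from the video's code: the names `count_primes_parallel`, `composite` and `next_factor` are made up for illustration, factors are handed out through a shared counter, and each flag is a one-byte `std::atomic<unsigned char>` accessed with relaxed ordering rather than a plain `char`. The layout is still one byte per number, as suggested above; the atomic wrapper is swapped in only because unsynchronized access to the same byte from several threads is formally a data race in C++.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Counts the primes <= limit. Each worker thread claims the next odd candidate
// factor and strikes its multiples across the whole sieve range.
std::uint64_t count_primes_parallel(std::uint64_t limit, unsigned num_threads) {
    assert(num_threads > 0);
    if (limit < 2) return 0;

    // One byte-sized flag per number (no packed bits), zero-initialized.
    std::vector<std::atomic<unsigned char>> composite(limit + 1);

    // Shared counter that hands out 3, 5, 7, 9, ... to whichever thread asks.
    std::atomic<std::uint64_t> next_factor{3};

    auto worker = [&]() {
        for (;;) {
            const std::uint64_t f = next_factor.fetch_add(2, std::memory_order_relaxed);
            if (f > limit / f) break;  // f * f > limit: nothing left to strike

            // If f is already known to be composite, skip it. If another thread
            // has not marked it yet, we only do redundant (harmless) work.
            if (composite[f].load(std::memory_order_relaxed)) continue;

            // Strike odd multiples of f, starting at f * f.
            for (std::uint64_t m = f * f; m <= limit; m += 2 * f)
                composite[m].store(1, std::memory_order_relaxed);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < num_threads; ++i) threads.emplace_back(worker);
    for (auto& t : threads) t.join();  // join makes all marks visible below

    // 2 is the only even prime; even numbers are never stored or inspected.
    std::uint64_t count = 1;
    for (std::uint64_t i = 3; i <= limit; i += 2)
        if (!composite[i].load(std::memory_order_relaxed)) ++count;
    return count;
}
```

The result does not depend on thread timing: a prime factor is never marked, so it is always processed, and a composite factor that slips through only marks numbers that are composite anyway. For example, `count_primes_parallel(1000000, 8)` should give the same count as a single-threaded sieve. On Linux with gcc this typically needs `-pthread`.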

thebigMuh

The amount of effort and thought you put into your videos is startling! Thumbs up!

sebve

I would like to see Linux baremetal and Windows on HyperV.
That was not a fair comparison IMHO.

mercuriete

I tried your program on my 9980XE CPU. Funny thing: I get 6000 passes inside a VM (qemu/kvm) and 5200 on bare metal. My CPU has 18 hyperthreaded cores, clocked at 3.06 GHz. I used gcc (10.3.0). I only have Linux, both as host and as guest; Gentoo, to be exact. I also added mtune/march to the gcc arguments according to my CPU. That alone gives about 200-300 extra passes :)

axlslak

Unless you need some high-end, proprietary software like FCP, or are into PC gaming, there's no reason not to switch to an entry-level distro of Linux like Linux Mint or Ubuntu, even if you're a "noob".

redpillsatori

TL;DR
I scripted a primality test. It's pretty boneheaded but was fun.

After watching your previous vids on this, I wrote a very rudimentary C++ script that checks whether a given number is prime. It does this by brute-force checking the modulus of the number against all potential factors (2 and all odds) up to the square root of the number. It took ~17 seconds to verify the primality of 18,446,744,073,709,551,557 (the largest prime less than 2^64) on my Ryzen 5 2400G. I then decided to try to add threading. My implementation is pretty basic and probably full of bad practices because I have no experience, but it works. Basically, it creates a vector of futures, and each future checks a range of the possible factors. Each future then returns either 0 if it finds a factor or 1 if it doesn't. I then sum up the elements of the vector, and if the sum is the same as the number of threads, the number must be prime. Otherwise, the sum is less, at least one of the futures found a factor, and the number must be composite. Using 7 threads on the same processor to check the same large prime, it takes ~4.4 seconds. All in all, it was a fun little project to work on. Thanks for inspiring me to try working on something.
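
The approach described above could look roughly like the following. This is a sketch under stated assumptions, not the commenter's actual code: the function names `isqrt`, `check_range` and `is_prime_parallel` are hypothetical, the work is launched with `std::async`, and the square root is computed with integers to avoid floating-point rounding near 2^64.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <future>
#include <vector>

// Largest r with r * r <= n, built bit by bit (no floating point involved).
std::uint64_t isqrt(std::uint64_t n) {
    std::uint64_t r = 0;
    for (std::uint64_t bit = 1ULL << 31; bit != 0; bit >>= 1) {
        const std::uint64_t t = r | bit;
        if (t <= n / t) r = t;  // same as t * t <= n, without overflow
    }
    return r;
}

// Returns 0 if some odd d in [lo, hi] divides n, otherwise 1. lo must be odd.
int check_range(std::uint64_t n, std::uint64_t lo, std::uint64_t hi) {
    for (std::uint64_t d = lo; d <= hi; d += 2)
        if (n % d == 0) return 0;
    return 1;
}

// Splits the odd candidate factors 3..sqrt(n) into at most num_tasks ranges,
// checks each range in its own future, and sums the 0/1 results.
bool is_prime_parallel(std::uint64_t n, unsigned num_tasks) {
    assert(num_tasks > 0);
    if (n < 2) return false;
    if (n < 4) return true;        // 2 and 3
    if (n % 2 == 0) return false;  // the only even candidate factor

    const std::uint64_t limit = isqrt(n);
    const std::uint64_t count = limit >= 3 ? (limit - 1) / 2 : 0;  // odds in [3, limit]
    const std::uint64_t chunk = (count + num_tasks - 1) / num_tasks;

    std::vector<std::future<int>> futures;
    for (std::uint64_t first = 0; first < count; first += chunk) {
        const std::uint64_t last = std::min(count, first + chunk) - 1;
        futures.push_back(std::async(std::launch::async, check_range, n,
                                     3 + 2 * first, 3 + 2 * last));
    }

    // Prime exactly when every launched future reported "no factor found".
    std::uint64_t sum = 0;
    for (auto& f : futures) sum += static_cast<std::uint64_t>(f.get());
    return sum == futures.size();
}
```

A call such as `is_prime_parallel(18446744073709551557ULL, 7)` mirrors the 7-thread run above, though it will take a while since about 2^31 odd candidates are tried. As in the description, a future that finds a factor does not stop the others early; a shared stop flag would be one way to add that.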

wompastompa

For this kind of test I can imagine that the overhead of WSL compared to native Linux is very small, due to virtualization support in modern hardware. I have seen that on other virtual machines as well.

One issue worth investigating is heat generation, and if that might cause the CPU to throttle down, skewing the results.

dagbruck

Education and entertainment at the same time. I enjoy these videos Dave, keep going please

gumboe

The "How to lie with statistics" Amazon link in the description is instilling fears in me.

kquote

I feel like this is kind of the software version of the tests on Gamers Nexus (and other HW-focused channels), and we definitely need more of these.

scifibob

As someone who only recently got into computation and coding (for any purpose other than physics calculations, I am an engineer after all) I love your videos. Your video on gigabit switches already helped me. I just don't have time to learn this stuff on my own, and you really break things into awesome, digestible blocks.

TGCIIII

Any time I see a graph with the origin not at zero, I suspect I am being manipulated.

sammyfromsydney

Interesting. I've seen other multithreaded benchmarks (using rendering software, I believe) that corroborate Windows' loss of efficiency beyond 16 cores. My guess is that its scheduling and/or thread pooling algorithms haven't been updated since long before such core counts became common. Nice win for Linux, although the "hump" in its curve indicates the potential for even better performance. Great video as always!

bitcortex

Would be cool to see a test between different Windows versions (XP, ME, 2000 etc.) to see the progress throughout the years in handling multi-threading and number-crunching.

HugRunner