AVX 512 Properly Explained! – Performance and Syntax Analysis


Advanced Vector Extensions, a.k.a. AVX, is an extension to the x86 instruction set architecture that adds wide vector registers and instructions for SIMD processing within the CPU core itself!

Building a budget PC can be tough. Not only are GPUs and CPUs incredibly expensive, they can also be hard to find on a budget... But there are tips and tricks for finding your dream budget GPU and pairing it with a CPU that will give you the performance you want!

Also, if you're reading this far - I've got an RTX 2060 review coming!

Have a Great Day!
- Proceu

Timestamps:
0:00 Intro
0:46 Preface
1:22 What is AVX?
3:05 Parallel vs AVX code paths
3:33 Why SIMD ISAs?
4:07 Intel MMX
4:26 MMX vs SSE & AVX
5:17 Uses for Double-Precision
5:34 Why use the CPU?
6:31 immintrin.h
6:50 The Code
11:36 The Benchmarks
14:22 Conclusions
15:49 Guinea Pig Cam

#AVX #AVX2 #AVX512
Comments

The fact that the current high-end gaming Intel CPU (14900K) does not support AVX512 is insane.

KvapuJanjalia

This was so well explained! Thank you!! The CoD zombies clips in the background actually helped a ton with following along on the video haha. Great stuff!

MrHavk

Recently, I implemented the Serpent symmetric block cipher (an AES candidate) in portable C using 32-bit unsigned integers. I then ported it to 128-bit SSE2 for 4× the performance, and later to 256-bit AVX2 for an additional 2× speedup on top of SSE2. That thing scales like magic. I don't have a computer capable of AVX-512 integer operations. Reading Intel's intrinsics docs isn't nearly as difficult as I initially feared.

MarekKnapek
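
To illustrate why a port like this scales with register width, here is a minimal sketch of a single 32-bit rotate, the kind of word operation block ciphers are built from, applied to four independent words at once with SSE2. This is not the commenter's code: the names `rotl32_scalar` and `rotl32_x4` are made up for the example, and it assumes each lane holds the same word position from a different cipher block. Where SSE2 is unavailable it falls back to a scalar loop.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Portable rotate-left of one 32-bit word; r must be in 1..31. */
uint32_t rotl32_scalar(uint32_t x, int r) {
    assert(r > 0 && r < 32);
    return (x << r) | (x >> (32 - r));
}

/* Rotate four independent 32-bit words left by r bits.
   With SSE2 this is two shifts and an OR on one 128-bit register,
   so four words cost about the same as one word in scalar code. */
void rotl32_x4(const uint32_t in[4], uint32_t out[4], int r) {
    assert(r > 0 && r < 32);
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i rot = _mm_or_si128(_mm_slli_epi32(v, r),
                               _mm_srli_epi32(v, 32 - r));
    _mm_storeu_si128((__m128i *)out, rot);
#else
    /* Fallback for targets without SSE2: same result, one lane at a time. */
    for (int i = 0; i < 4; i++) {
        out[i] = rotl32_scalar(in[i], r);
    }
#endif
}
```

The same pattern with `__m256i` and the `_mm256_*` AVX2 intrinsics would handle eight words per instruction, which is consistent with the extra 2× the comment reports.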

You have to understand that, in its most basic form, shifting from an 8-bit to a 16-bit CPU carries an automatic 'SIMD' upgrade for all the widened registers, for rather obvious reasons. With today's 64-bit CPUs, adding separate large registers and the opcodes to go with them (opcodes which have become more complex and powerful) can, and does, have the effect of stalling a general move to CPUs wider than 64 bits.
Today, we're merely extending hybrid architectures, and today's 'large register' extensions are our means of doing that.

Rob......

6:10 This is a tad misleading, since GPUs are often, like, 30 times less efficient at processing 64-bit floats than 32-bit floats. Something something consumer-grade silicon handicap.

Dorumin

I... wh... okay, THIS is how my friends feel when I start talking about various CPUs, GPUs, and other hardware... This whole video felt like I was listening to an entirely different language. Am I just stupid or something for not being able to pick up on context clues?

DoorObserver

Somebody mentions AVX 512 😁
My 10940x/256gig system be like: 👀

CTJFVB-CAZ

The more Hardware Acceleration the better, I think; as such, Audio, Networking, Internet and Compression should have their own Instructions in their respective Processing Units 😊👍

saull

I'm no programmer, but I wonder what happens if you run an AVX2 program on a processor that doesn't support it, like an Intel i5 3470.

yumenokoyume
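
On the question above: a CPU that meets an instruction it does not implement raises an invalid-opcode fault, and the operating system normally terminates the program with an "illegal instruction" error. Software that wants to run everywhere usually checks the CPU at run time and picks a code path. The sketch below shows that idea under some assumptions: it relies on the GCC/Clang builtins `__builtin_cpu_supports` and the `target("avx2")` function attribute, and the function names (`cpu_has_avx2`, `add_i32`, and so on) are hypothetical. On other compilers or non-x86 targets it always takes the scalar path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
#include <immintrin.h>
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Scalar path: works on every CPU. */
static void add_i32_scalar(const int32_t *a, const int32_t *b,
                           int32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) out[i] = a[i] + b[i];
}

#if HAVE_X86_DISPATCH
/* AVX2 path: compiled with AVX2 enabled for this one function only,
   so the rest of the program stays safe to run on older CPUs. */
__attribute__((target("avx2")))
static void add_i32_avx2(const int32_t *a, const int32_t *b,
                         int32_t *out, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {  /* 8 x 32-bit lanes per 256-bit register */
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(out + i), _mm256_add_epi32(va, vb));
    }
    for (; i < n; i++) out[i] = a[i] + b[i];  /* leftover elements */
}
#endif

/* Returns 1 if the running CPU reports AVX2, otherwise 0. */
int cpu_has_avx2(void) {
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0;
#endif
}

/* Dispatcher: never executes an AVX2 instruction unless the CPU has it. */
void add_i32(const int32_t *a, const int32_t *b, int32_t *out, size_t n) {
    assert(a && b && out);
#if HAVE_X86_DISPATCH
    if (cpu_has_avx2()) {
        add_i32_avx2(a, b, out, n);
        return;
    }
#endif
    add_i32_scalar(a, b, out, n);
}
```

A program built with AVX2 enabled globally and no such check is the case the comment asks about: it would be expected to crash on a CPU without AVX2 the first time it reaches an AVX2 instruction.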

How is AVX-512 used for inference?
From what I understood in this video, AVX-512 lets you execute a multiply (or accumulate) instruction on eight double-precision floats at once (8 × 64 = 512, thus the name).
So could models quantized to int8 then process 64 int8 values with a single instruction, instead of decoding the same instruction 64 times?
The company Neural Magic even goes as far as saying CPU inference is the way forward.
But even with "64-wide SIMD", I thought GPUs were still much more parallel.

nate
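
The lane arithmetic in the comment above can be checked directly: a 512-bit register holds 512 / 64 = 8 doubles or 512 / 8 = 64 int8 values. The sketch below computes those counts and then models, in plain scalar C, what one 512-bit int8 dot-product step does in the style of the AVX-512 VNNI instruction `VPDPBUSD`: 64 byte pairs are multiplied and each group of four products is added into one of 16 32-bit accumulators. It is a model for illustration, not real intrinsics, so it runs on any CPU; the names `lanes_for` and `dot_u8s8_model` are made up for this example, and the model ignores any overflow handling.

```c
#include <assert.h>
#include <stdint.h>

enum { VECTOR_BITS = 512 };  /* width of one AVX-512 (ZMM) register */

/* Number of elements of the given bit width that fit in one register. */
int lanes_for(int element_bits) {
    assert(element_bits > 0 && VECTOR_BITS % element_bits == 0);
    return VECTOR_BITS / element_bits;
}

/* Scalar model of one 512-bit int8 dot-product step:
   a holds 64 unsigned bytes, b holds 64 signed bytes,
   acc holds 16 signed 32-bit running sums.
   Each accumulator receives the sum of four adjacent byte products. */
void dot_u8s8_model(const uint8_t a[64], const int8_t b[64], int32_t acc[16]) {
    for (int lane = 0; lane < 16; lane++) {
        int32_t sum = 0;
        for (int k = 0; k < 4; k++) {
            sum += (int32_t)a[4 * lane + k] * (int32_t)b[4 * lane + k];
        }
        acc[lane] += sum;
    }
}
```

On hardware with that extension, the two nested loops correspond to a single instruction, which is the sense in which quantizing to int8 lets one instruction cover 64 values.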

I remember when I jumped from a 10700K to an 11700K and got a massive FPS gain in every game :) The reason AVX-512 is not integrated into many modern CPUs is its die area and the heat it generates. CPUs have done better without AVX-512. I remember running Prime95 tests on my i7 11700K: with AVX-512 disabled, CPU temperatures dropped by around 10-15 °C and power draw fell by about 20-30 W. So the question is whether it is worth including in modern CPUs, instead of just raising stock clock speeds, which can give the same effect with probably less heat and lower required voltages. The companies are smart enough to know what they are doing here.

AP

I think that AVX-512 is overrated for floating-point ops. The code overhead required to shuffle data around in the ZMM registers absorbs a lot of the efficiency gain from the actual _mm512 intrinsics.
Against an optimized C++ program, I squeeze out a 15% speed gain at most when writing in assembly; using intrinsics, the gain is slightly less.

arnoldn

Most emulation requires AVX-512 to run stably. To me, the Intel CPUs have been trash in performance; the 5600X I have is way better.

MZRFaith

You know, it's very annoying watching you play a game while you're explaining how things work.

Matlockization