Learn how to write fast Java code with the Vector API - JEP Café #18

The Vector API can tremendously speed up computations by using the SIMD capabilities of your CPU. Learn how parallel computing works on a SIMD machine, how the Java Vector API gives you access to these capabilities, and how you can structure your code to leverage them for amazing performance.

⎯⎯⎯⎯⎯⎯ Chapters ⎯⎯⎯⎯⎯⎯
0:00 Intro
1:37 Introducing the SIMD features of your CPU
2:41 Scalar parallel computing based on concurrency
5:53 Vector parallel computing based on SIMD machine
8:51 Shape and species of a Vector
10:33 Creating vectors from arrays to sum them
12:25 Loading any array in a vector using masks
15:56 Avoiding masking when it is not supported
17:16 Parallel cross-lanes and lane-wise operations
18:58 Computing the norm of a vector in parallel
21:19 Computing the average of vector components in parallel
22:12 Filtering and compressing a vector in parallel
24:20 Reducing a vector in parallel
24:44 Wrapping up parallel computations using vectors
25:29 Examples, patterns and performances
27:22 Outro
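
The chapters on summing arrays, masking, avoiding masks, and computing a norm can be illustrated with a minimal sketch. The class name `VectorSketch` and its method names are hypothetical and are not taken from the video. The code uses only the incubating `jdk.incubator.vector` package, so it has to be compiled and run with `--add-modules jdk.incubator.vector`, and details may differ between JDK releases.

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorMask;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical illustration class; not code from the video.
public class VectorSketch {

    // The species fixes the element type and the shape (number of lanes).
    // SPECIES_PREFERRED picks the widest shape the running CPU supports.
    static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    // Element-wise sum where every chunk is loaded through a mask,
    // so the last, partial chunk never reads or writes out of bounds.
    static float[] addMasked(float[] a, float[] b) {
        float[] result = new float[a.length];
        for (int i = 0; i < a.length; i += SPECIES.length()) {
            VectorMask<Float> mask = SPECIES.indexInRange(i, a.length);
            FloatVector va = FloatVector.fromArray(SPECIES, a, i, mask);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i, mask);
            va.add(vb).intoArray(result, i, mask);
        }
        return result;
    }

    // Mixed pattern: unmasked vector loop over the full chunks,
    // then a plain scalar loop over the remaining elements.
    static float[] addMixed(float[] a, float[] b) {
        float[] result = new float[a.length];
        int upperBound = SPECIES.loopBound(a.length);
        int i = 0;
        for (; i < upperBound; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            va.add(vb).intoArray(result, i);
        }
        for (; i < a.length; i++) {
            result[i] = a[i] + b[i];
        }
        return result;
    }

    // Euclidean norm: lane-wise multiply and accumulate, then one
    // cross-lane reduction at the end.
    static float norm(float[] a) {
        FloatVector acc = FloatVector.zero(SPECIES);
        int upperBound = SPECIES.loopBound(a.length);
        int i = 0;
        for (; i < upperBound; i += SPECIES.length()) {
            FloatVector v = FloatVector.fromArray(SPECIES, a, i);
            // Vectors are immutable objects and Java has no operator
            // overloading, so accumulation is a plain reassignment.
            acc = acc.add(v.mul(v));
        }
        float sumOfSquares = acc.reduceLanes(VectorOperators.ADD);
        for (; i < a.length; i++) {
            sumOfSquares += a[i] * a[i];
        }
        return (float) Math.sqrt(sumOfSquares);
    }
}
```

Both `addMasked` and `addMixed` return the same result; which one is faster depends on whether the CPU handles masked loads and stores efficiently, so it is worth measuring on the target hardware.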

⎯⎯⎯⎯⎯⎯ Resources ⎯⎯⎯⎯⎯⎯

Tags: #Java #Java17 #Java20 #OpenJDK #JDK #JDK17 #VectorAPI #JEPCafe #insidejava #SIMD
Comments

I like how Java is slowly but wisely catching up with the C/C++/etc. programmers who have long been proud of using features like SIMD.

michimarz

Does anyone know where I can buy that coffee cup? I want one.

BryanSiegeldotcom

Could be a super boost for machine learning applications.

ClementLevallois

Very nice video! Thank you! I'm just curious: at 16:13, could the Vector API or the species object tell at runtime whether the underlying CPU supports masking? That way, the choice between the mixed pattern and masking could also be made at runtime.

garypinot

I'm surprised how low-level this API is by Java standards.
Wouldn't it be easier if you had a variable-size vector container that does the splitting, masking, loop limiting, etc. for you?
I might add this to my list for my WIP general-purpose JDK extension library.

redcrafterlppa

…C2 compiler will use it without the vector API being available. If they had shown performance comparisons based on the code examples, we would have seen how Vector API compares to C2 autovectorization.

ralf

Typo @20:35? The "+=" operator in "V3 += ..." should be plain "=", IMO.

bbobcik

Excuse me, maybe I’m missing something.
Are you sure that vectors can be subjected to the operator +=?
At the 20th minute there is such an example: V3 += V3.add(V2);

ЕленаБаршай-ух

This is great but feels "complicated" to write. Is there a reason not to change "ArrayList" to use vectors behind the scenes, or to add a "Vector<T>" object that works this way?
In addition, all this "extra" work should have been encapsulated. It's nice to keep it simple at the base, but a new Vector object could handle most of the things you mentioned, like breaking the data into batches based on the CPU.

maorhamami

I played with the Vector API a couple of years ago and found that the scalar versions of the algorithms were faster. My tests were done on a Threadripper 1950X. Has the Vector API been made more efficient since then?

AdamFJH

This API uses specific features of CPUs that may or may not be there? Is it just me wondering whatever happened to "write once, run anywhere?"

BirdBrain

Which method computes the cross product of two vectors?

dpiano

The coffee thing is incredibly cringe. It made sense when Java was backed by a cool company like Sun. You should just make your content as corporate as possible; it will be better to watch.

zqbthvk