Nvidia's MAJOR Ray Tracing Performance BREAKTHROUGH - Analysis | Intel Arc Laptop Roadmap SPOTTED


Nvidia has revealed a new patent for ray tracing, which showcases a potential MAJOR breakthrough called GPU Subwarp Interleaving. If this report is accurate, we could see a HUGE increase in ray tracing performance in future Nvidia GPUs such as RTX 40 and beyond. With ray tracing becoming more utilised, and of course being very demanding of GPU performance, streamlining graphics card architecture for RTX 40 Lovelace (or potentially RTX 50) is hugely important. With RDNA 3 from AMD rumoured to offer very competitive performance in both gaming and ray tracing, this performance breakthrough from Nvidia could be a game changer.

We also have a potential release date / release window, or at least an idea of when Intel Arc Alchemist for laptops will be available. Intel Alchemist is hugely anticipated, and there have been a lot of rumours surrounding the release dates of both the Intel Arc desktop discrete GPUs and the mobile GPUs. There are also further leaks on Intel Arc DG2's memory config specs.

LINKS

AMAZON AFFILIATE LINKS

GreenManGaming Affiliate link -

SOCIAL MEDIA

0:00 Start
0:11 Nvidia ray tracing
7:35 Intel Arc memory config
9:03 Intel Arc release date
11:58 Vulkan

Comments

More RT, more silicon. Let the price go completely orbital...

_A.d.G_

The abstract of the whitepaper claims only a 6.3% average performance increase.

lb

I want next-level NPCs. Gaming graphics always progresses, but NPCs have not.

snowgentleman

And it doesn't matter, because we'll never get a GPU capable of using this tech until 2034

Moody_Incorporated

Before watching the video, I sure hope it's a BVH build / rebuild accelerator. Depending on scene complexity and ray count, rebuilding the BVH can take as long as, if not longer than, the actual BVH traversal and ray intersection testing. It's also the culprit for why enabling just one ray tracing effect accounts for the bulk of the performance loss compared to enabling multiple ray tracing effects: the baseline cost is BVH maintenance plus tracing rays, and adding more effects just means more rays, since you are already on the hook for the BVH maintenance.

gameguy
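The cost structure described above can be sketched as a toy model. All numbers here are illustrative assumptions, not measurements: the point is only that BVH upkeep is a fixed per-frame cost paid once, while ray-tracing cost scales per effect.

```python
# Toy per-frame cost model (hypothetical numbers) for the BVH argument above.
BVH_REBUILD_MS = 3.0      # assumed per-frame BVH build/refit cost
RAYS_PER_EFFECT_MS = 1.5  # assumed tracing cost added by each RT effect

def rt_frame_cost_ms(num_effects: int) -> float:
    """Total RT cost per frame: BVH upkeep is paid once, rays scale per effect."""
    if num_effects == 0:
        return 0.0
    return BVH_REBUILD_MS + num_effects * RAYS_PER_EFFECT_MS

# The jump from 0 -> 1 effect (4.5 ms) dwarfs each later step (+1.5 ms),
# matching the observation that the first RT effect is the expensive one.
print(rt_frame_cost_ms(1))  # 4.5
print(rt_frame_cost_ms(2))  # 6.0
```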

What I want to see: spacewarp frame interpolation like that used in VR headsets.
Instead of DLSS/FSR, which renders at a lower resolution and then upscales, this renders fewer frames and interpolates the differences between them. With clever trickery (I personally use the version built into the Oculus app Virtual Desktop, rather than the native Oculus or SteamVR version) I find it near perfect at maintaining 60fps interpolated to 120fps, with smooth gameplay and minimal artefacts. Unless I'm looking really closely for it I can't even tell it's running, but it saves so much headroom for graphical processing without the need to alter resolution at all.

Even if you stacked it on top of DLSS, this is the kind of tech that could actually allow people to run games like Cyberpunk with ray tracing enabled at 120fps. It should not be limited to VR only.

MotoCat
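The core idea in the comment above can be illustrated with a deliberately naive sketch: synthesize an in-between frame by blending two rendered frames, doubling the apparent frame rate without touching resolution. Real spacewarp-style techniques also use motion vectors and reprojection; this linear blend is only the simplest possible stand-in, and all the frame data is made up.

```python
# Naive frame interpolation: blend two rendered frames to make a middle frame.
# Frames are flat lists of pixel intensities (a stand-in for real image data).

def interpolate_frame(frame_a, frame_b, t=0.5):
    """Linearly blend two frames at time t in [0, 1]."""
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]

rendered_60fps = [[0, 0, 0, 0], [8, 8, 8, 8]]  # two consecutive real frames
mid = interpolate_frame(rendered_60fps[0], rendered_60fps[1])
output_120fps = [rendered_60fps[0], mid, rendered_60fps[1]]
print(mid)  # [4.0, 4.0, 4.0, 4.0]
```

The GPU only renders the two real frames; the synthesized middle frame is the "free" headroom the comment describes.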

If I remember rightly:
An Nvidia warp takes 4 cycles to process and is 32 threads in size...
In the Vega GCN days, a wave took 4 cycles to process and was 64 threads in size.
Everyone was expecting Vega 64 to be so much faster than it was... but the CUs were bored.
Everyone was expecting Ampere to be faster than it is, too...

RDNA has solved all this because a wave is now 32 threads in size AND takes only one cycle to complete. Hence why RDNA has less FP32 (meaningless on paper) but still competes with all Nvidia cards.


Now that Nvidia is 4 times the size (in SMs) of what it was, yet still taking 4 cycles to process a warp (simplifying), I'd say they were experiencing the same issues: SMs being stalled by thread prediction misses, or simply not having enough to do and too much to schedule.

Forgive me if I've gotten anything wrong... or feel free to correct me...

Subdivision of a warp, taken from page 3 of the whitepaper:
"if all 32 threads in a warp execute a branch, but only four of those threads take the branch, the warp will splinter into two subwarps. One subwarp contains the four threads that take the branch, and the other subwarp contains the remaining 28 threads."
There's a nice diagram of the thread stalls on page 2. All very similar to GCN stalling.

pete

It's a Gigabyte board... the blue board, only Gigabyte uses that layout. You can see the VRM trace-outs from the memory chips too. It means the cards are on their way to store shelves on March 5. A lot of reviewers already have Intel A380 and A780 boards for testing and review while we're on heavy NDA watch from Intel themselves. Only four words I'm gonna say about what I think: ITS F....AMAZING CARDS

user

Paul, you were firing on all cylinders in this one. The well placed and eloquently delivered "that's what she said" was hilarious 😂.

TheCgOrion

People need to realize that RT performance is still limited by raster performance. In general, RDNA2's RT hardware would be worse than even Turing's, let alone Ampere's, but RDNA2's overall gain over Turing is so large that it performs better even under RT. That means RT performance scales with baseline raster performance: if you have 150fps in raster and drop to 75fps under RT, a GPU that starts at 100fps will be slower under RT even if its RT hardware is better. It's all about headroom and baseline raster performance. None of this matters except in titles like Minecraft RTX or Quake II RTX, which are purely ray traced.

MaggotCZ

That spaceman reminded me of "Space Quest."

briankleinschmidt

A warp is 32 threads, more a set of DATA than of instructions. All 32 threads have to run the same instructions in sync because there is only a single control-flow unit per warp. When the logic branches/diverges (has an 'if' statement) between threads within the same warp, it has to take all 32 threads through one branch, and then back again through the other. The threads that shouldn't have gone through a branch are essentially halted while the other threads execute it.
This patent seems to be about having the halted threads instead execute as some other warp so they don't wait: in a sense, interleaving part of the warp (a subwarp) with some other warp that is idle waiting on data, thereby hiding more latency.

AinurEru
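The divergence penalty described above can be sketched as a rough cycle count. The one-instruction-per-cycle assumption and the instruction counts are made up; the sketch only shows why a diverged warp pays for both sides of a branch back to back, which is the idle time subwarp interleaving would try to fill with other work.

```python
# Rough cycle model of warp divergence (1 instruction = 1 cycle, assumed).
def branch_cycles(path_a_instrs, path_b_instrs, diverged=True):
    """Cycles one warp spends on a branch with a single control-flow unit."""
    if diverged:
        # Some threads take each path, so the warp runs both paths serially;
        # during each path, the other path's threads sit halted.
        return path_a_instrs + path_b_instrs
    # All 32 threads agree, so only the chosen path executes.
    return max(path_a_instrs, path_b_instrs)

print(branch_cycles(10, 6))         # 16: both sides serialized
print(branch_cycles(10, 6, False))  # 10: no divergence, one path only
```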

It's like the jump from Shader Model 2.0 to 3.0 back in the day, thanks to a Microsoft library update... DX13 is coming soon and Crysis 4 is gonna feature all the latest tech...

user

The pop when he says "warp" is so satisfying

uzairshaikh

This just tells me Nvidia may push for PCIe Gen 5. The rumors on the 3090 Ti suggested it could be PCIe Gen 5. Also, Turing vs Ampere: Ampere supports motion blur in the acceleration structure, at least that is what was implied in the PowerPoint presentation at the 3000-series launch. GDDR7 isn't too far off either. As much as people really want a PPU sequel or physics cards, I don't think that will happen. If it does, I'd rather have it as an NVMe option; that would be more interesting and maybe more lucrative. Until I see a presentation of what the 4000 series brings, I am skeptical of the leaks. The 10-series vs the 20-series is a pretty clear example of what's going on, with millisecond differences in ray tracing response time.

furynotes

Noooo, you're kidding me, memory latency is important for strided memory access 😳😳
Exactly the opposite of what's been available since the year 2K 😩
Further, it translates into actual physical time...

goldnoob

Wake me up when there is NO rasterisation and the scene is FULLY path traced (NOT just ray traced). Nvidia currently only does 1 pass per pixel. To get a photorealistic scene you need around 50. Nvidia cheats by smoothing (denoising) the image with the tensor cores. Without smoothing, the image looks SHIT!

KangoV

Trying to put a bit more awe in awesome for real-time ray tracing: back in the 80s the Amiga was often used to ray trace videos. Back then it wasn't FPS for ray tracing, but how many hours PER FRAME of video. When real-time ray tracing was first announced, my chin hit the floor! :o

ElAnciano

Imagine the next Nintendo console having ray tracing

RitaMaSTeR

More marketing for the new card. Nvidia is just looking for ways to jack up prices with all this crap: let's make it sound more sophisticated and expensive.

sinhnguyen