GPU Lecture 40: Standard Shader Rewritten as a Surface Shader (GPU Programming for Video Games)

0:00 -- Introduction
1:55 -- Height maps (parallax)
2:57 -- MyStandardFromSurface
10:23 -- Simpler shader
13:10 -- Simplest shader
14:23 -- Compiled HLSL
17:21 -- Assembly code
18:30 -- Height map clarification

#unity #unity3d #hlsl #cg
Comments

Awesome video as always. I have not found another breakdown of the Unity shaders so far, so understanding them usually involves blundering through tens of obscure include files.

I don't have a tonne of shader knowledge, so seeing you sometimes go "yeah idk why that's there" is super helpful in realizing not all shader code is this complex (and Unity has to account for a bunch of different issues, and their code has evolved over time).

I understand this is being done as part of a course, so you're probably not taking video ideas :p But will you ever be attempting an optimization of the standard shaders? I'm going through this thing for mobile.

harshmudhar

It's insane that the generated shader files end up being so large, though that's probably common across all of the big engines.
For your converted shaders, shouldn't you have also copied over and renamed the BRDF instead of using the default Unity one? It looks like you only went down one level instead of fully traversing the tree of function calls. Or perhaps I missed the function when you were scrolling through.

For the last part about the assembly, that looks like lower-level generated code and not any form of GPU assembly. The Unity documentation describes it as "low-level generated code [which] is useful for pasting into GPU shader performance analysis tools". I take that as being a step above IR. Although, that looks like Metal to me, which would make sense given that you're compiling on a Mac. The interesting thing is that the generated low-level code seems to be optimizing for registers (by restricting the number of generated variables), but that would give a false impression of the performance, since the final compiler will break apart the dataflow graph and try to extract ILP (at least I would hope).

hjups