17 Commits (dec5680934b814370326ebd826fa7970d2411bc7)

Author SHA1 Message Date
ReinUsesLisp a63a0daa5e gl_arb_decompiler: Implement an assembly shader decompiler
Emit code compatible with NV_gpu_program5.
This should emit code compatible with Fermi, but it wasn't tested on
that architecture. Pascal has some issues not present on Turing GPUs.
ReinUsesLisp 3dcaa84ba4 shader/transform_feedback: Add host API friendly TFB builder
ReinUsesLisp e8efd5a901 video_core: Rename "const buffer locker" to "registry"
ReinUsesLisp bd8b9bbcee gl_shader_cache: Rework shader cache and remove post-specializations
Instead of pre-specializing shaders and then post-specializing them,
drop the latter and only "specialize" the shader while decoding it.
James Rowe b429095b61 Fix git version in scm_rev.cpp
Fernando Sahmkow 1a58f45d76 VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders.
Fernando Sahmkow 47e4f6a52c Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes.
Fernando Sahmkow 8be6e1c522 shader_ir: Corrections to outward movements and misc stuffs
ReinUsesLisp 4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here:
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions, I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau), this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
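
A minimal sketch of why those three stubs are the natural results for a
warp size of one. This is not the code from this commit: it models the
vote operations in Python over a list of per-lane booleans, and every
function name below is hypothetical, chosen only for illustration.

```python
from typing import Sequence


# Reference semantics over a whole warp (one bool per active lane).
# These model the documented meaning of the vote intrinsics; they are
# not the GLSL emitted by the decompiler.
def any_thread(lanes: Sequence[bool]) -> bool:
    return any(lanes)


def all_threads(lanes: Sequence[bool]) -> bool:
    return all(lanes)


def all_threads_equal(lanes: Sequence[bool]) -> bool:
    return all(v == lanes[0] for v in lanes)


# Stubs as described in the commit message: the result for a
# simulated warp that contains only the current thread.
def stub_any_thread(value: bool) -> bool:
    return value


def stub_all_threads(value: bool) -> bool:
    return value


def stub_all_threads_equal(value: bool) -> bool:
    return True


def stubs_match_single_lane_warp() -> bool:
    """Check that each stub equals the reference on a one-lane warp."""
    for value in (False, True):
        warp = [value]  # warp size of one
        if any_thread(warp) != stub_any_thread(value):
            return False
        if all_threads(warp) != stub_all_threads(value):
            return False
        if all_threads_equal(warp) != stub_all_threads_equal(value):
            return False
    return True
```

With a single lane, "any" and "all" both collapse to the lane's own
value, and a lone value is trivially equal to itself, which is where
the constant true comes from.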

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't exactly match Nvidia's code

VOTE.ALL Rd, PT, PT;

which this commit emulates with activeThreadsNV(). In theory this
shouldn't really matter, since .ANY, .ALL and .EQ affect the predicates
(set to PT in those cases) and not the registers.
Fernando Sahmkow 8af6e6a052 shader_ir: Implement a new shader scanner
ReinUsesLisp 06c4ce8645 shader: Decode SUST and implement backing image functionality
ReinUsesLisp dec1cbaf7f cmake: Add missing shader hash file entries
fearlessTobi b67be7154d GenerateSCMRev: fix Travis compilation on repo forks
ReinUsesLisp 48e6f77c03 shader/decode: Split memory and texture instructions decoding
ReinUsesLisp dfd14618f7 cmake: Fix title bar issue
Michael 4ffb487251 cmake: Fixup application string
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
ReinUsesLisp be4641c43f gl_shader_disk_cache: Invalidate shader cache changes with CMake hash