Commit Graph

782 Commits (159ae5e47ceff0a44c1021379cbca601b36f4fb0)

Author SHA1 Message Date
ameerj 06894b0711 emit_spirv_image: Fix depth image implicit lod sample in compute
Ensures all drivers behave the same way in this case.
4 years ago
Ameer J 3791c7ca82
Merge pull request #7077 from FernandoS27/face-down
A series of fixes to queries and indexed samplers.
4 years ago
Fernando Sahmkow 3f4444b552 Shader Compiler: avoid overflowed indices on indexed samplers. 4 years ago
Morph e29f3b87f1 style: Remove extra space preceding the :: operator 4 years ago
ameerj 73666fb262 general: Update style to clang-format-12 4 years ago
Fernando Sahmkow 8984abfc76 Spir-V: Rescale the frag depth to 0,1 mode when -1,1 mode is used in Vulkan. 4 years ago
Morph 9248442bb2
Merge pull request #6948 from ameerj/amd-warp-fix
shaders: Fix warp instructions on 64-thread warp devices
4 years ago
bunnei 7e9163779d
Merge pull request #6962 from vonchenplus/spirv_support_legacy_attribute
renderer_vulkan: Spirv support glsl legacy attribute
4 years ago
Feng Chen b1e655f898 Detail adjustment 4 years ago
Feng Chen bbc1800c1b Detail adjustment 4 years ago
Feng Chen e5ca733722 Re-implement get unused location 4 years ago
Feng Chen 9cdf2383e9 Move attribute related definitions to spirv anonymous namespace 4 years ago
Feng Chen 1de9e4e121 Dynamic get unused location 4 years ago
Feng Chen d994466a08 Implement input and output fixed fnc textures 4 years ago
Feng Chen a7bbaa4897 Rename parameters 4 years ago
Feng Chen cf26f375ff Fix create GraphicsPipelines crash 4 years ago
Feng Chen 1e2a89d306 Add input/output location 4 years ago
bunnei b2572a56d3
Merge pull request #6900 from ameerj/attr-reorder
structured_control_flow: Add DemoteCombinationPass
4 years ago
ameerj d956fb3c7c emit_glsl_warp: Fix shuffle ops for 64-thread warp sizes 4 years ago
ameerj 5b45dfe971 emit_glsl_warp: Fix ballot related ops for 64-thread warp sizes 4 years ago
ameerj a5d9dcf3d9 emit_spirv_warp: Fix shuffle ops for 64-thread warp sizes 4 years ago
ameerj 95213270ef emit_spirv_warp: Fix ballot related ops for 64-thread warp sizes 4 years ago
Feng Chen 73b11f390e Add colorfront and txtcoord support 4 years ago
ameerj 907dfbea71 structured_control_flow: Skip reordering nested demote branches.
Nested demote branches add complexity with combining the condition if it has not been initialized yet. Skip them for the time being.
4 years ago
ameerj 4fda7f1c82 structured_control_flow: Conditionally invoke demote reorder pass
This is only needed on select drivers when a fragment shader discards/demotes.
4 years ago
ameerj 862dc2b2b3 structured_control_flow: Add DemoteCombinationPass
Some drivers misread data when demotes are interleaved in the program. This moves demote branches to be checked at the end of the program.
Fixes "wireframe" issue in Pokemon SwSh on some drivers
4 years ago
ameerj 6e407c02d8 emit_spirv_context_get_set: Fix Get FrontFace return value
The IR expects GetAttribute to return an F32 value. This case was returning a U32 instead.
4 years ago
Valeri beb7305b73
SPIR-V: Merge two ifs in EmitGetAttribute 4 years ago
Morph db07ca6c7f
Merge pull request #6767 from ReinUsesLisp/fold-float-pack
shader: Fold UnpackFloat2x16 and PackFloat2x16
4 years ago
bunnei a98f14e9b0
Merge pull request #6722 from ReinUsesLisp/xmad-opts
shader: Fold integer FMA from Nvidia's pattern
4 years ago
ReinUsesLisp 8c9febe8f7 shader: Fold UnpackFloat2x16 and PackFloat2x16
Simplifies the code a bit when possible. These instructions should be
no-ops codegen wise.
4 years ago
ReinUsesLisp 1bb46b7d64 shader: Mark ConvertF16F32 and ConvertF32F16 as fp16 instructions
Fixes instances where fp16 types are not declared on SPIR-V but they are
used. This shouldn't happen on master, as it's been uncovered by an
additional optimization pass.
4 years ago
Lioncash c27ddb44de exception: Make constructors explicit
Ensures that exception construction is always explicit.
4 years ago
Lioncash e490ddf327 exception: Make what() member function nodiscard 4 years ago
Lioncash 90f3678ada exception: Narrow down specific header
We can use the <exception> header instead of pulling in all of the
exception-style classes.
4 years ago
Rodrigo Locatti c0f99558fb
Merge pull request #6724 from lioncash/nodisc-shader
shader_recompiler: Remove unnecessary [[nodiscard]] instances
4 years ago
Rodrigo Locatti de0b89792c
Merge pull request #6726 from lioncash/hguard
emit_spirv_instructions: Add missing header guard
4 years ago
Rodrigo Locatti 3d97f1e6cf
Merge pull request #6727 from lioncash/topology
emit_glasm: Fix LINESS_ADJACENCY typo in InputPrimitive()
4 years ago
Rodrigo Locatti b2b3fcdccd
Merge pull request #6723 from lioncash/shader
object_pool: Add missing return in Chunk move assignment operator
4 years ago
Lioncash 3e7813e49d emit_glasm: Fix LINESS_ADJACENCY typo in InputPrimitive()
This should be LINES_ADJACENCY
4 years ago
Lioncash c2915d9f2f emit_spirv_instructions: Add missing header guard 4 years ago
Lioncash 06ca911621 shader_recompiler: Remove unnecessary [[nodiscard]] instances
[[nodiscard]] doesn't do anything on functions with a void return type
and causes superfluous warnings.
4 years ago
Lioncash 0b67df1f7c control_flow: Fix duplicate switch case in OpcodeToken
This previously duplicated the case of the PBK case above it.
4 years ago
Lioncash 89ad9df0e9 object_pool: Add missing return in Chunk move assignment operator
Prevents undefined behavior from occurring.
4 years ago
ReinUsesLisp 66a0cedba3 shader: Fold integer FMA from Nvidia's pattern
Fold shaders doing "a * b + c" on integers from the pattern generated by
Nvidia's GL compiler.

On a somewhat complex compute shader it reduces the code size by 16
instructions from 2 matches on Turing GPUs.

On Intel as extracted from KHR_pipeline_executable_properties:
Before the optimization:
```
Instruction Count: 2057
Basic Block Count: 45
Scratch Memory Size: 14752
Spill Count: 232
Fill Count: 261
SEND Count: 610
Cycle Count: 11325
```

After the optimization:
```
Instruction Count: 2046
Basic Block Count: 44
Scratch Memory Size: 13728
Spill Count: 219
Fill Count: 268
SEND Count: 604
Cycle Count: 11367
```
4 years ago
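
The fold above relies on a 32-bit "a * b + c" being expressible through 16-bit partial products, which is the shape a multiply takes when it is built from 16-bit multiply-add steps. The following is a minimal sketch of that identity only, not the pass itself and not the exact instruction sequence emitted by Nvidia's compiler; the function name `MulAddFrom16BitParts` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical illustration: rebuild a 32-bit multiply-add from 16-bit halves.
// The a_hi * b_hi term is omitted because it is shifted out of the low 32 bits.
constexpr std::uint32_t MulAddFrom16BitParts(std::uint32_t a, std::uint32_t b,
                                             std::uint32_t c) {
    const std::uint32_t a_lo = a & 0xFFFFu;
    const std::uint32_t a_hi = a >> 16;
    const std::uint32_t b_lo = b & 0xFFFFu;
    const std::uint32_t b_hi = b >> 16;
    // Low partial product fits in 32 bits since both inputs are below 2^16.
    const std::uint32_t low = a_lo * b_lo;
    // Cross terms only matter modulo 2^16 once shifted left by 16.
    const std::uint32_t cross = a_lo * b_hi + a_hi * b_lo;
    return low + (cross << 16) + c;
}
```

Because the result equals `a * b + c` modulo 2^32 for every input, a pass that recognizes the partial-product shape can replace it with a single integer multiply-add.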
ReinUsesLisp 09fb41dc63 shader: Use TryInstRecursive on XMAD multiply folding
Simplify the logic a bit.
4 years ago
ReinUsesLisp f6f0383b49 shader: Add TryInstRecursive utility to values 4 years ago
ReinUsesLisp 7f13104c17 shader: Support out of bound local memory reads and immediate writes
Support ignoring immediate out of bound writes. Writing dynamically out
of bounds is not yet supported (e.g. R0+0x4).

Reading out of bounds yields zero. This is supported by checking for the
size from the IR; if the input is immediate, the optimization passes
will drop it.
4 years ago
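
The semantics described above (out-of-bounds reads yield zero, out-of-bounds writes are dropped) can be modelled with a small bounds-checked buffer. This is a sketch under stated assumptions: `LocalMemory` and `WriteThenRead` are hypothetical names, and the check is done at run time here, whereas the commit describes resolving the immediate case from the IR.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of local memory with N 32-bit words.
template <std::size_t N>
struct LocalMemory {
    std::uint32_t words[N]{};

    // Reading past the end yields zero instead of touching memory.
    constexpr std::uint32_t Read(std::size_t index) const {
        return index < N ? words[index] : 0u;
    }

    // Writing past the end is silently ignored.
    constexpr void Write(std::size_t index, std::uint32_t value) {
        if (index < N) {
            words[index] = value;
        }
    }
};

// Small helper used to exercise both paths in one expression.
constexpr std::uint32_t WriteThenRead(std::size_t write_index,
                                      std::size_t read_index,
                                      std::uint32_t value) {
    LocalMemory<4> memory{};
    memory.Write(write_index, value);
    return memory.Read(read_index);
}
```

For example, `WriteThenRead(2, 2, 42)` returns the stored value, while `WriteThenRead(9, 3, 42)` leaves every in-bounds word untouched.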
ameerj 56478bc9ac shader: Fix disabled attribute default values 4 years ago
ameerj 56c30dd9e0 glsl: Simplify FCMP emission 4 years ago
ameerj 79d2684261 glsl: Update TessellationControl gl_in
Adheres to GL_ARB_separate_shader_objects requirements
4 years ago
ameerj fc7bed21b5 shader: Implement ISETP.X 4 years ago
ReinUsesLisp bf2956d77a shader: Avoid usage of C++20 ranges to build in clang 4 years ago
ameerj 94af0a00f6 glsl: Clamp shared mem size to GL_MAX_COMPUTE_SHARED_MEMORY_SIZE 4 years ago
lat9nq 49946cf780 shader_recompiler, video_core: Resolve clang errors
Silences the following warnings-turned-errors:
-Wsign-conversion
-Wunused-private-field
-Wbraced-scalar-init
-Wunused-variable

And some other errors
4 years ago
ReinUsesLisp 2235a51b5d shader: Manually convert from array<u32> to bitset instead of using bit_cast 4 years ago
ameerj 41c6cb70f9 glsl: Fix tracking of info.uses_shadow_lod 4 years ago
ameerj 11f04f1022 shader: Ignore global memory ops on devices lacking int64 support 4 years ago
ameerj 57f222c56e dual_vertex_pass: Clang format 4 years ago
ReinUsesLisp 8722668b3c emit_spirv: Workaround VK_KHR_shader_float_controls on fp16 Nvidia
Fix regression on Fire Emblem: Three Houses when using native fp16.
4 years ago
lat9nq 2e5af95541 shader: GCC fmt 8.0.0 fixes 4 years ago
ameerj b9069c7891 shader: Account for 33-bit IADD3 scenario 4 years ago
ReinUsesLisp b21bf79bd2 shader: Only apply shift on register mode for IADD3 4 years ago
ReinUsesLisp 5643a909bc shader: Fix disabled and unwritten attributes and varyings 4 years ago
ameerj 65daec8b75 glsl: Fix shared and local memory declarations
Account for the fact that program.*memory_size is in units of bytes.
4 years ago
ameerj 8289eb108f opengl: Implement LOP.CC
Used by MH:Rise
4 years ago
ReinUsesLisp 5b2b0634a1 spirv: Fix code emission when descriptor aliasing is unsupported
Fixes OpenGL.
4 years ago
ameerj 00fa09dc45 glsl: Declare local memory in main 4 years ago
ameerj f7352411f0 glsl: Add passthrough geometry shader support 4 years ago
ReinUsesLisp 8612b5fec5 shader: Use std::bit_cast instead of Common::BitCast for passthrough 4 years ago
ReinUsesLisp 8a3427a4c8 glasm: Add passthrough geometry shader support 4 years ago
ReinUsesLisp 7dafa96ab5 shader: Rework varyings and implement passthrough geometry shaders
Put all varyings into a single std::bitset with helpers to access it.

Implement passthrough geometry shaders using the host's.
4 years ago
ReinUsesLisp ecd6b4356b shader: Only verify shader when graphics debugging is enabled 4 years ago
ReinUsesLisp 395bed3a0a shader: Unify shader stage types 4 years ago
lat9nq 257d2aab74 lower_int64_to_int32: Add missing include 4 years ago
ReinUsesLisp fb166b5ff4 shader: Emulate 64-bit integers when not supported
Useful for mobile and Intel Xe devices.
4 years ago
ReinUsesLisp d8d5501459 shader: Add int64 to int32 lowering pass 4 years ago
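
Lowering 64-bit integers to 32-bit ones generally means carrying each value as a pair of 32-bit halves. As a rough illustration of the idea (not the pass from these commits), a 64-bit addition can be written with two 32-bit additions and an explicit carry; `U32x2` and `Add64` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical representation of a 64-bit value as two 32-bit halves.
struct U32x2 {
    std::uint32_t lo;
    std::uint32_t hi;
};

// 64-bit addition using only 32-bit arithmetic.
constexpr U32x2 Add64(U32x2 a, U32x2 b) {
    const std::uint32_t lo = a.lo + b.lo;
    // Unsigned wraparound means the low sum is smaller than an operand
    // exactly when a carry out of bit 31 occurred.
    const std::uint32_t carry = lo < a.lo ? 1u : 0u;
    const std::uint32_t hi = a.hi + b.hi + carry;
    return U32x2{lo, hi};
}
```

Adding `{0xFFFFFFFF, 0}` and `{1, 0}` this way produces `{0, 1}`, the same bits a native 64-bit add would give.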
ReinUsesLisp 04ef2160f9 shader: Teach global memory base tracker to follow vectors 4 years ago
ReinUsesLisp 97e80dda55 shader: Add constant propagation to integer vectors 4 years ago
ameerj 27ca8a0e13 glsl: Better IAdd Overflow CC fix
This ensures the original operand values are not overwritten when being used in the overflow detection.
4 years ago
ReinUsesLisp 4397053d5c shader: Remove IAbs64 4 years ago
ameerj bc6e399ae3 glsl: Fix IADD CC 4 years ago
ameerj a7536825df shader_recompiler: Fix IADD3 input partitioning 4 years ago
ReinUsesLisp 808ef97a08 shader: Move loop safety tests to code emission 4 years ago
ameerj cbce9ddd4a glsl: Remove frag color initialization 4 years ago
ameerj 3a2dd1b483 glasm: Implement SetAttribute ViewportMask 4 years ago
ameerj 1c648f176c emit_glsl_special: Skip initialization of frag_color0
Fixes rendering in Devil May Cry without regressing Ori and the Blind Forest.
4 years ago
ReinUsesLisp 1d182fc0f5 shader: Calibrate loop safety threshold 4 years ago
Morph cfbc85839d glsl: Add missing ; in EmitSetSampleMask
Fixes shader compilation in Okami HD
4 years ago
ameerj 9e066dcb15 glsl: Fix output varying initialization when transform feedback is used 4 years ago
ameerj a0365217f5 texture_pass: Fix is_read image qualification
Atomic operations are considered to have both read and write access. This was not being accounted for.
4 years ago
ReinUsesLisp 0cd08b3e72 shader: Align constant buffer sizes to 16 bytes
WAR for AMD reading zeroes on uniform buffers of size 2.
4 years ago
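
Rounding a size up to a 16-byte boundary is a standard bit trick. A minimal sketch, assuming sizes are held as unsigned 32-bit byte counts; the helper name `AlignUp16` is hypothetical and does not come from the commit.

```cpp
#include <cstdint>

// Hypothetical helper: round a byte size up to the next multiple of 16.
// Adding 15 and clearing the low four bits works because 16 is a power of two.
constexpr std::uint32_t AlignUp16(std::uint32_t size_bytes) {
    return (size_bytes + 15u) & ~15u;
}
```

With this, a buffer of size 2 is declared as 16, and sizes that are already multiples of 16 are left unchanged.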
ReinUsesLisp 59fead3a47 spirv: Properly handle devices without int8 and int16 4 years ago
ReinUsesLisp b5e78607ad spirv: Handle small storage buffer loads on devices with no support 4 years ago
ameerj ccbd24fe00 glsl: Fix cbuf component indexing bug fallback 4 years ago
ReinUsesLisp 1091995f8e shader: Simplify MergeDualVertexPrograms 4 years ago
ReinUsesLisp 374eeda1a3 shader: Properly manage attributes not written from previous stages 4 years ago
ReinUsesLisp 892b8aa2ad glsl: Only declare fragment outputs on fragment shaders 4 years ago
ReinUsesLisp 0ffea97e2e shader: Split profile and runtime info headers 4 years ago
ReinUsesLisp cbbca26d18 shader: Add support for native 16-bit floats 4 years ago
ReinUsesLisp 376aa94819 shader: Rename maxwell/program.h to translate_program.h 4 years ago
ameerj 12ef06ba8b glsl: Obey need_declared_frag_colors to declare and initialize all frag_color
Fixes Ori and the blind forest title screen
4 years ago
ameerj d36f667bc0 glsl: Address rest of feedback 4 years ago
ameerj c5dfa0b630 glsl: Move gl_Position/generic attribute initialization to EmitProlgue 4 years ago
ameerj 3b339fbbf6 glsl: Conditionally use fine/coarse derivatives based on device support 4 years ago
ameerj 6eea88d614 glsl: Cleanup/Address feedback 4 years ago
ameerj ae4e452759 glsl: Add Shader_GLSL logging 4 years ago
ameerj 6c6a451d6a glsl: Add LoopSafety instructions 4 years ago
ameerj a0d0704aff glsl: Conditionally add EXT_texture_shadow_lod 4 years ago
ameerj 5e7b2b9661 glsl: Add stubs for sparse queries and variable aoffi when not supported 4 years ago
ameerj 6aa1bf7b6f glsl: Implement legacy varyings 4 years ago
ameerj 39c29664f9 glsl: Minor cleanup 4 years ago
ameerj 427a2596a1 glsl: Fix Cbuf getters for F32 type 4 years ago
ameerj 7c82f20b52 glsl: Add immediate index oob checking for Cbuf getters 4 years ago
ameerj 84c86e03cd glsl: Refactor GetCbuf functions to reduce code duplication 4 years ago
ameerj e81c73a874 glsl: Address more feedback. Implement indexed texture reads 4 years ago
ameerj 7d89a82a48 glsl: Remove Signed Integer variables 4 years ago
ameerj 4759db28d0 glsl: Address Rodrigo's feedback 4 years ago
ameerj 85399e119d glsl: Reorganize backend code, remove unneeded [[maybe_unused]] 4 years ago
ameerj e7c8f8911f glsl: Implement SampleId and SetSampleMask
plus some minor refactoring of implementations
4 years ago
ameerj d1a68f7997 glsl: Add gl_PerVertex in for GS 4 years ago
ameerj a926695234 glsl: Use existing tracking for enabling EXT_shader_image_load_formatted 4 years ago
ameerj 14bd73db36 glsl: Enable early fragment tests 4 years ago
ameerj 3f31a547e0 glsl: Implement more attribute getters and setters 4 years ago
ameerj 8bb8bbf4ae glsl: Implement fswzadd
and wip nv thread shuffle impl
4 years ago
ameerj c542204113 glsl: Implement indexed attribute loads 4 years ago
ameerj 2a504b4765 glsl: Conditionally add GL_ARB_sparse_texture2 4 years ago
ameerj fc0db612ab glsl: Conditionally use GL_EXT_shader_image_load_formatted
Fix for SULD.D
4 years ago
ameerj fb839061fb glsl: Remove output generic indexing for geometry stage 4 years ago
ameerj 258106038e glsl: Allow dynamic tracking of variable allocation 4 years ago
ameerj 465903468e glsl: Implement barriers 4 years ago
ameerj 421847cf1e glsl: Implement image atomics and set layer
along with some more cleanup/oversight fixes
4 years ago
ameerj d41aef03c7 glsl: Fix image gather logic 4 years ago
ameerj 35e78d558d glsl: Add cbuf access workaround for devices with component indexing bug 4 years ago
ameerj 747b8556a4 glsl: Use textureGrad fallback when EXT_texture_shadow_lod is unsupported 4 years ago
ameerj d12f2b8ccf emit_glsl_image: Use immediate offsets when possible 4 years ago
ameerj 0a0b0a73d8 glsl: Fix <32-bit SSBO writes
and more cleanup
4 years ago
ameerj 34fdb6471d glsl: Cleanup and address feedback 4 years ago
ameerj 5355568a2d glsl: Refactor Global memory functions 4 years ago
ameerj a68fabf6d5 glsl: Increase NUM_VARS that can be allocated
needed for HW:AoC.
4 years ago
ameerj 8d8ce24f20 glsl: Implement Load/WriteGlobal
along with some other misc changes and fixes
4 years ago
ameerj af9696059c glsl: Implement Images 4 years ago
ameerj 6577a63d36 glsl: skip gl_ViewportIndex write if device does not support it 4 years ago
ameerj f4799e8fa1 glsl: Implement transform feedback 4 years ago
ameerj 31147ffe69 glsl: Yet another gl_ViewportIndex fix attempt 4 years ago
ameerj 9f3970f837 glsl: Add gl_ViewportIndex out attribute 4 years ago
lat9nq fc29de7d5b emit_glsl_context_get_set: Remove unused function 4 years ago
ameerj 59576b82a8 glsl: Fix precise variable declaration
and add some more separation in the shader for better debuggability when dumped
4 years ago
ameerj 8c684b3e23 glsl: Implement tessellation shaders 4 years ago
ameerj c7d085b505 glsl: Implement ImageGradient and other texture function variants 4 years ago