Revert4030/Ryujinx

Author	SHA1	Message	Date
riperiperi	de3134adbe	Vulkan: Explicitly enable precise occlusion queries (#4292 ) The only guarantee of the occlusion query type in Vulkan is that it will be zero when no samples pass, and non-zero when any samples pass. Of course, most GPUs implement this by just placing the # of samples in the result and calling it a day. However, this lax restriction means that GPUs could just report a boolean (1/0) or report a value after one is recorded, but before all samples have been counted. MoltenVK falls in the first category - by default it only reports 1/0 for occlusion queries. Thankfully, there is a feature and flag that you can use to force compatible drivers to provide a "precise" query result, that being the real # of samples passed. Should fix ink collision in Splatoon 2/3 on MoltenVK.	2023-01-19 00:30:42 +00:00
gdkchan	065c4e520d	Specify image view usage flags on Vulkan (#4283 ) * Specify image view usage flags on Vulkan * PR feedback	2023-01-15 23:12:52 +01:00
Ac_K	85faa9d8fa	Revert "Relax Vulkan requirements (#4228 )" (#4279 ) This reverts commit `dca5b14493`.	2023-01-13 06:04:59 +00:00
gdkchan	dca5b14493	Relax Vulkan requirements (#4228 )	2023-01-13 06:09:48 +01:00
riperiperi	8fa248ceb4	Vulkan: Add workarounds for MoltenVK (#4202 ) * Add MVK basics. * Use appropriate output attribute types * 4kb vertex alignment, bunch of fixes * Add reduced shader precision mode for mvk. * Disable ASTC on MVK for now * Only request robustness2 when it is available. * It's just the one feature actually * Add triangle fan conversion * Allow NullDescriptor on MVK for some reason. * Force safe blit on MoltenVK * Use ASTC only when formats are all available. * Disable multilevel 3d texture views * Filter duplicate render targets (on backend) * Add Automatic MoltenVK Configuration * Do not create color attachment views with formats that are not RT compatible * Make sure that the host format matches the vertex shader input types for invalid/unknown guest formats * Fix rebase for Vertex Attrib State * Fix 4b alignment for vertex * Use asynchronous queue submits for MVK * Ensure color clear shader has correct output type * Update MoltenVK config * Always use MoltenVK workarounds on MacOS * Make MVK supersede all vendors * Fix rebase * Various fixes on rebase * Get portability flags from extension * Fix some minor rebasing issues * Style change * Use LibraryImport for MVKConfiguration * Rename MoltenVK vendor to Apple. Intel and AMD GPUs on MoltenVK report with those vendors - only Apple silicon reports with vendor 0x106B. * Fix features2 rebase conflict * Rename fragment output type * Add missing check for fragment output types. Might have caused the crash in MK8 * Only do fragment output specialization on MoltenVK * Avoid copy when passing capabilities * Self feedback * Address feedback Co-authored-by: gdk <gab.dark.100@gmail.com> Co-authored-by: nastys <nastys@users.noreply.github.com>	2023-01-13 01:31:21 +01:00
riperiperi	e20abbf9cc	Vulkan: Don't flush commands when creating most sync (#4087 ) * Vulkan: Don't flush commands when creating most sync When the WaitForIdle method is called, we create sync as some internal GPU method may read back written buffer data. Some games randomly intersperse compute dispatch into their render passes, which result in this happening an unbounded number of times depending on how many times they run compute. Creating sync in Vulkan is expensive, as we need to flush the current command buffer so that it can be waited on. We have a limited number of active command buffers due to how we track resource usage, so submitting too many command buffers will force us to wait for them to return to the pool. This PR allows less "important" sync (things which are less likely to be waited on) to wait on a command buffer's result without submitting it, instead relying on AutoFlush or another, more important sync to flush it later on. Because of the possibility of us waiting for a command buffer that hasn't submitted yet, any thread needs to be able to force the active command buffer to submit. The ability to do this has been added to the backend multithreading via an "Interrupt", though it is not supported without multithreading. OpenGL drivers should already be doing something similar so they don't blow up when creating lots of sync, which is why this hasn't been a problem for these games over there. Improves Vulkan performance on Xenoblade DE, Pokemon Scarlet/Violet, and Zelda BOTW (still another large issue here) * Add strict argument This is technically a separate concern from whether the sync is a host syncpoint. * Remove _interrupted variable * Actually wait for the invoke This is required by AMD GPUs, and also may have caused some issues on other GPUs. * Remove unused using. * I don't know why it added these ones. * Address Feedback * Fix typo	2022-12-29 15:39:04 +01:00
riperiperi	470be03c2f	GPU: Add fallback when 16-bit formats are not supported (#4108 ) * Add conversion for 16 bit RGBA formats (not supported in Rosetta) * Rebase fix Rebase fix * Forgot to remove this * Fix RGBA16 format conversion * Add RGBA4 -> RGBA8 conversion * Handle host stride alignment * Address Feedback Part 1 * Can't count * Don't zero out rgb when alpha is 0 * Separate RGBA4 and 5-bit component formats Not sure of a better way to name them... * Add A1B5G5R5 conversion * Put this in the right place. * Make format naming consistent for capabilities * Change method names	2022-12-26 15:50:27 -03:00
Hunter	c963b3c804	Added Generic Math to BitUtils (#3929 ) * Generic Math Update Updated Several functions in Ryujinx.Common/Utilities/BitUtils to use generic math * Updated BitUtil calls * Removed Whitespace * Switched decrement * Fixed changed method calls. The method calls were originally changed on accident due to me relying too much on intellisense doing stuff for me * Update Ryujinx.Common/Utilities/BitUtils.cs Co-authored-by: gdkchan <gab.dark.100@gmail.com> Co-authored-by: gdkchan <gab.dark.100@gmail.com>	2022-12-26 14:11:05 +00:00
gdkchan	f906eb06c2	Implement a software ETC2 texture decoder (#4121 ) * Implement a software ETC2 texture decoder * Fix output size calculation for non-2D textures * Address PR feedback	2022-12-21 20:39:58 -03:00
Georg Lehmann	0f50de72be	Vulkan: enable VK_EXT_custom_border_color features (#4116 ) * Vulkan: enable VK_EXT_custom_border_color features radv only creates the border color bo if this feature is enabled, so it crashed when creating samplers with custom border colors. Fixes #4072 Fixes #3993 * Address gdkchan's comment Co-authored-by: Mary <mary@mary.zone>	2022-12-14 20:53:33 -03:00
Andrey Sukharev	535fbec675	Use NuGet Central Package Management to manage package versions solution-wise (#4095 )	2022-12-12 16:03:10 +01:00
Isaac Marovitz	851d81d24a	Fix Redundant Qualifier Warnings (#4091 ) * Fix Redundant Qualifier Warnings * Remove unnecessary using	2022-12-10 21:21:13 +01:00
riperiperi	e211c3f00a	UI: Add Metal surface creation for MoltenVK (#3980 ) * Initial implementation of metal surface across UIs * Fix SDL2 on windows * Update Ryujinx/Ryujinx.csproj Co-authored-by: Mary-nyan <thog@protonmail.com> * Address Feedback Co-authored-by: Mary-nyan <thog@protonmail.com>	2022-12-06 19:00:25 -03:00
Andrey Sukharev	4da44e09cb	Make structs readonly when applicable (#4002 ) * Make all structs readonly when applicable. It should reduce amount of needless defensive copies * Make structs with trivial boilerplate equality code record structs * Remove unnecessary readonly modifiers from TextureCreateInfo * Make BitMap structs readonly too	2022-12-05 14:47:39 +01:00
Mary-nyan	ae13f0ab4d	misc: Fix obsolete warnings in Ryujinx.Graphics.Vulkan (#4020 ) Was caused by some merges after the Silk.NET update	2022-12-05 12:57:11 +00:00
gdkchan	17a1cab5d2	Allow SNorm buffer texture formats on Vulkan (#3957 ) * Allow SNorm buffer texture formats on Vulkan * Shader cache version bump	2022-12-04 15:36:03 -03:00
gdkchan	73aed239c3	Implement non-MS to MS copies with draws (#3958 ) * Implement non-MS to MS copies with draws, simplify MS to non-MS copies and supports any host sample count * Remove unused program	2022-12-04 15:07:11 -03:00
Andrey Sukharev	3868a00206	Use source generated regular expressions (#4005 )	2022-12-04 00:43:23 +00:00
Mary-nyan	ce92e8cd04	chore: Update Silk.NET to 2.16.0 (#3953 )	2022-12-01 19:11:56 +01:00
riperiperi	458452279c	GPU: Track buffer migrations and flush source on incomplete copy (#3952 ) * Track buffer migrations and flush source on incomplete copy Makes sure that the modified range list is always from the latest iteration of the buffer, and flushes earlier iterations of a buffer if the data has not been migrated yet. * Cleanup 1 * Reduce cost for redundant signal checks on Vulkan * Only inherit the range list if there are pending ranges. * Fix OpenGL * Address Feedback * Whoops	2022-12-01 16:30:13 +01:00
gdkchan	4905101df1	Remove shader dependency on SPV_KHR_shader_ballot and SPV_KHR_subgroup_vote extensions (#3943 ) * Remove shader dependency on SPV_KHR_shader_ballot and SPV_KHR_subgroup_vote extensions * Shader cache version bump	2022-11-30 18:24:15 -03:00
Mary-nyan	d41c95dcff	chore: Update OpenTK to 4.7.5 (#3944 )	2022-11-29 13:32:40 +00:00
riperiperi	1fc0f569de	GPU: Always draw polygon topology as triangle fan (#3932 ) Polygon topology wasn't really supported and would only work on OpenGL on drivers that haven't removed it. As an alternative, this PR makes all cases of polygon topology use triangle fan. The topology type and transform feedback type have not been changed, as I don't think geo shader/tfb should be used with polygons. The OpenGL spec states: Only convex polygons are guaranteed to be drawn correctly by the GL. For convex polygons, triangle fan is equivalent to polygon. I imagine this is probably how it works on device, as this get-out-of-jail-free card is too enticing to pass up. This fixes the stat display in Pokemon S/V.	2022-11-28 19:18:22 -03:00
Ac_K	a1ddaa2736	ui: Fixes disposing on GTK/Avalonia and Firmware Messages on Avalonia (#3885 ) * ui: Only wait on _exitEvent when MainLoop is active under GTK This fixes a dispose issue under Horizon/GTK: we don't check if the ApplicationClient is null, so it throws NCE. We don't check if the main loop is active and waiting on an event which is set in the main loop... So that could lead to a freeze. Everything works fine in GTK now. Related issue: https://github.com/Ryujinx/Ryujinx/issues/3873 As a side note, the same kind of issue appears in Avalonia UI too. Firmware's popup doesn't show anything and the emulator just freezes. * TSRBerry's change Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com> * Fix Avalonia crashing/freezing * Add Avalonia OpenGL fixes * Fix firmware popup on windows * Fixes everything * Add _initialized bool to VulkanRenderer and OpenGL Window Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>	2022-11-24 15:08:27 +01:00
riperiperi	ece36b274d	GAL: Send all buffer assignments at once rather than individually (#3881 ) * GAL: Send all buffer assignments at once rather than individually The `(int first, BufferRange[] ranges)` method call has very significant performance implications when the bindings are spread out, which they generally always are in Vulkan. This change makes it so that these methods are only called a maximum of one time per draw. Significantly improves GPU thread performance in Pokemon Scarlet/Violet. * Address Feedback Removed SetUniformBuffers(int first, ReadOnlySpan<BufferRange> buffers)	2022-11-24 07:50:59 +00:00
gdkchan	2e43d01d36	Move gl_Layer from vertex to geometry if GPU does not support it on vertex (#3866 ) * Move gl_Layer from vertex to geometry if GPU does not support it on vertex * Shader cache version bump * PR feedback	2022-11-18 23:27:54 -03:00
riperiperi	7373ec5792	Vulkan: Clear dummy texture to (0,0,0,0) on creation (#3867 ) This might fix an issue with AMD gpus on linux where the data could contain random garbage data. On the switch, it always samples as 0.	2022-11-18 23:11:34 -03:00
riperiperi	131baebe2a	Vulkan: Don't create preload command buffer outside a render pass (#3864 ) * Vulkan: Don't create preload buffer outside a render pass The preload command buffer is used to avoid render pass splits and barriers when updating buffer data. However, when a render pass is not active (for example, at the start of a pass, or during compute invocations) buffer uploads can be performed at any time, so the optimization isn't as useful. This PR makes it so that the preload command buffer is only used for buffer updates outside of a render pass. It's still used for textures as I don't want to shake things up right now regarding how the preload buffer is obtained before some other changes, and texture updates are a lot rarer anyways. Improves performance slightly in Pokemon Scarlet/Violet (43 -> 48), as it was switching to compute, writing a bunch of buffers inline, then dispatching, then flushing commands... It uses 1 command buffer instead of 2 every time it does this now. Maybe it would be nice to find a faster way to sync without creating so many command buffers in a short period of time. * Address feedback	2022-11-18 14:58:56 +00:00
Wunk	d536cc8ae6	Update units of memory from decimal to binary prefixes (#3716 ) `MB` and `GB` can either be interpreted as having base-10 units, or base-2. `MiB` and `GiB` removes this discrepancy so that units of memory are always interpreted using base-2 units.	2022-11-16 23:27:42 +01:00
gdkchan	f1d1670b0b	Implement HLE macro for DrawElementsIndirect (#3748 ) * Implement HLE macro for DrawElementsIndirect * Shader cache version bump * Use GL_ARB_shader_draw_parameters extension on OpenGL * Fix DrawIndexedIndirectCount on Vulkan when extension is not supported * Implement DrawIndex * Alignment * Fix some validation errors * Rename BaseIds to DrawParameters * Fix incorrect index buffer and vertex buffer size in some cases * Add HLE macros for DrawArraysInstanced and DrawElementsInstanced * Perform a regular draw when indirect data is not modified * Use non-indirect draw methods if indirect buffer was not GPU modified * Only check if draw parameters match if the shader actually uses them * Expose Macro HLE setting on GUI * Reset FirstVertex and FirstInstance after draw * Update shader cache version again since some people already tested this * PR feedback Co-authored-by: riperiperi <rhy3756547@hotmail.com>	2022-11-16 14:53:04 -03:00
gdkchan	a6a67a2b7a	Minor improvement to Vulkan pipeline state and bindings management (#3829 ) * Minor improvement to Vulkan pipeline state and bindings management * Clean up buffer textures too * Use glBindTextureUnit	2022-11-10 13:38:38 -03:00
Mary-nyan	c6d05301aa	infra: Migrate to .NET 7 (#3795 ) * Update readme to mention .NET 7 * infra: Migrate to .NET 7 .NET 7 is still in preview but this prepares for the release coming up next month. * Use Random.Shared in CreateRandom * Move UInt128Utils.cs to Ryujinx.Common project * Fix inverted parameters in System.UInt128 constructor * Fix Visual Studio complains on Ryujinx.Graphics.Vic * time: Fix missing alignment enforcement in SystemClockContext Fixes at least Smash * time: Fix missing alignment enforcement in SteadyClockContext Fix games (like recent version of Smash) using time shared memory * Switch to .NET 7.0.100 release * Enable Tiered PGO * Ensure CreateId validity requirements are met when doing random generation Also enforce correct packing layout for other Mii structures. This fixes Mario Kart 8 crashes related to the default Miis.	2022-11-09 20:22:43 +01:00
gdkchan	f82309fa2d	Vulkan: Implement multisample <-> non-multisample copies and depth-stencil resolve (#3723 ) * Vulkan: Implement multisample <-> non-multisample copies and depth-stencil resolve * FramebufferParams is no longer required there * Implement Specialization Constants and merge CopyMS Shaders (#15) * Vulkan: Initial Specialization Constants * Replace with specialized helper shader * Reimplement everything Fix nonexistant interaction with Ryu pipeline caching Decouple specialization info from data and relocate them Generalize mapping and add type enum to better match spv types Use local fixed scopes instead of global unmanaged allocs * Fix misses in initial implementation Use correct info variable in Create2DLayerView Add ShaderStorageImageMultisample to required feature set * Use texture for source image * No point in using ReadOnlyMemory * Apply formatting feedback Co-authored-by: gdkchan <gab.dark.100@gmail.com> * Apply formatting suggestions on shader source Co-authored-by: gdkchan <gab.dark.100@gmail.com> Co-authored-by: gdkchan <gab.dark.100@gmail.com> * Support conversion with samples count that does not match the requested count, other minor changes Co-authored-by: mageven <62494521+mageven@users.noreply.github.com>	2022-11-02 18:17:19 -03:00
Wunk	3fe3598d41	Vulkan: Replace `VK_EXT_debug_report` usage with `VK_EXT_debug_utils` (#3802 ) * Vulkan: Replace `VK_EXT_debug_report` usage with `VK_EXT_debug_utils` [VK_EXT_debug_report](https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_debug_report.html) has been deprecated for quite some time now in favor of the much more featureful [VK_EXT_debug_utils](https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_debug_utils.html) extension. This PR converts our debug-report-callback into the newer debug-messenger pattern. `VK_EXT_debug_utils` adds some additional diagnostic tooling for marking debug-label scopes for queue-operations, command-buffers, and assigning name-labels to vulkan objects to aid in debugging (for a later PR). * Vulkan: Fix `DebugMessenger` severity-flag classification Extension bits between the two flags, for reference: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkDebugUtilsMessageSeverityFlagBitsEXT.html https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkDebugReportFlagBitsEXT.html	2022-10-29 14:09:25 -03:00
gdkchan	28ba55598d	Vulkan: Fix indirect buffer barrier (#3798 )	2022-10-26 14:53:11 -03:00
riperiperi	9719b6a112	Vulkan: Use dynamic state for blend constants (#3793 )	2022-10-25 23:49:23 +00:00
riperiperi	6e92b7a378	Dispose Vulkan TextureStorage when views hit 0 instead of immediately (#3738 ) Due to the `using` statement being scoped to the `CreateTextureView` method, `TextureStorage` would be disposed as soon as the view was returned. This was largely fine as the TextureStorage resources were being kept alive by the views holding their own references to them, but it also meant that dispose is only called as soon as the texture is created. Aliased Storages are TextureStorages created with the same allocation as another TextureStorage, if they have to be aliased as another format. We keep track of a TextureStorage's `_aliasedStorages` as they are created, and dispose them when the TextureStorage is disposed... ...except it is disposed immediately, before any aliased storages are even created. The aliased storages added after this will never be disposed. This PR attempts to fix this by disposing TextureStorage when its view count reaches 0. The other use of texture storage - the D32S8 blit - still manually disposes the storage, but regular uses created via the GAL are now disposed by the view count. I think this makes the most sense, as otherwise in the future this behaviour might be forgotten and more things could be added to the Dispose() method that don't work due to it not actually calling at the right time. This should improve memory leaks in Super Mario Odyssey, most noticeable when resolution scaling. The memory usage of the game is still wildly unpredictable due to how it interacts with the texture cache, but now it shouldn't get considerably longer as you play... I hope. I've seen it typically recover back to the same level occasionally, though it can spike significantly. Please test a bunch of games on multiple GPUs to make sure this doesn't break anything.	2022-10-18 23:52:08 +00:00
gdkchan	a6cd044f0f	Vulkan: Fix blit levels/layers parameters being inverted (#3768 )	2022-10-18 10:13:44 +02:00
riperiperi	0dbe45ae37	Fix various issues caused by Vertex/Index buffer conversions (#3762 ) * Fix various issues caused by #3679 - The arguments for the 0th dummy vertex buffer were incorrect - it was given an offset of 16 rather than a size of 16. - The wrong size was used when doing `autoBuffer.Get` on a converted vertex buffer. - The possibility of a vertex buffer being disposed and then rebound can cause rebindings to find a different buffer where the current range is out of bounds. Avoid binding when out of range to prevent validation errors. - The above also affects generation of converted buffers, which was a bit more fatal. Conversion functions now attempt to bound input offset/size. * Fix offset for converted buffer	2022-10-16 19:38:58 -03:00
riperiperi	2b50e52e48	Fix primitive count calculation for topology conversion (#3763 ) Luigi's Mansion 3 performs a non-index quads draw with 6 vertices. It's meant to ignore the last two, but the index pattern's primitive count calculation was rounding up. No idea why the game does this but this should fix random triangles in the map.	2022-10-16 19:25:40 -03:00
gdkchan	5af1327068	Vulkan: Fix sampler custom border color (#3751 )	2022-10-10 08:35:44 +02:00
riperiperi	bf77d1cab9	GPU: Pass SpanOrArray for Texture SetData to avoid copy (#3745 ) * GPU: Pass SpanOrArray for Texture SetData to avoid copy Texture data is often converted before upload, meaning that an array was allocated to perform the conversion into. However, the backend SetData methods were being passed a Span of that data, and the Multithreaded layer does `ToArray()` on it so that it can be stored for later! This method can't extract the original array, so it creates a copy. This PR changes the type passed for textures to a new ref struct called SpanOrArray, which is backed by either a ReadOnlySpan or an array. The benefit here is that we can have a ToArray method that doesn't copy if it is originally backed by an array. This will also avoid a copy when running the ASTC decoder. On NieR this was taking 38% of texture upload time, which it does a _lot_ of when you move between areas, so there should be a 1.6x performance boost when strictly uploading textures. No doubt this will also improve texture streaming performance in UE4 games, and maybe a small reduction with video playback. From the numbers, it's probably possible to improve the upload rate by a further 1.6x by performing layout conversion on GPU. I'm not sure if we could improve it further than that - multithreading conversion on CPU would probably result in memory bottleneck. This doesn't extend to buffers, since we don't convert their data on the GPU emulator side. * Remove implicit cast to array.	2022-10-08 12:04:47 -03:00
riperiperi	1ca0517c99	Vulkan: Fix some issues with CacheByRange (#3743 ) * Fix some issues with CacheByRange - Cache now clears under more circumstances, the most important being the fast path write. - Cache supports partial clear which should help when more buffers join. - Fixed an issue with I8->I16 conversion where it wouldn't register the buffer for use on dispose. Should hopefully fix issues with https://github.com/Ryujinx/Ryujinx-Games-List/issues/4010 and maybe others. * Fix collection modified exception * Fix accidental use of parameterless constructor * Replay DynamicState when restoring from helper shader	2022-10-08 11:28:27 -03:00
gdkchan	a4fc9f8050	Support use of buffer ranges with size 0 (#3736 )	2022-10-03 20:08:38 -03:00
gdkchan	5437d6cb13	Vulkan: Fix buffer texture storage not being updated on buffer handle reuse (#3731 )	2022-10-03 19:45:33 -03:00
mageven	96bf7f8522	Avoid allocating unmanaged string per shader (#3730 ) * Avoid reallocating same unmanaged string per shader * Address PR feedback * Rename to _disposed	2022-10-02 10:59:34 +02:00
riperiperi	f502cfaf62	Vulkan: Zero blend state when disabled or write mask is 0 (#3719 ) * Zero blend state when disabled or write mask is 0 Any difference in the blend state when blend is disabled is meaningless, but Ryujinx would compare different disabled blends and compile them as separate pipelines. This change ensures that all pipelines where blend state is meaningless record it as such, which avoids compiling a bunch of pipelines that are essentially identical. The NVIDIA driver is pretty forgiving when it comes to silly pipeline misses like this, but other drivers don't offer the same level of kindness. This should reduce stuttering on those drivers, and might improve overall performance very slightly due to less pipeline variants being in the hash table. * Fix blend possibly being wrong when an attachment is unmasked	2022-09-29 12:32:49 -03:00
riperiperi	4c0eb91d7e	Convert Quads to Triangles in Vulkan (#3715 ) * Add Index Buffer conversion for quads to Vulkan Also adds a reusable repeating pattern index buffer to use for non-indexed draws, and generalizes the conversion cache for buffers. * Fix some issues * End render pass before conversion * Resume transform feedback after we ensure we're in a pass. * Always generate UInt32 type indices for topology conversion * No it's not. * Remove unused code * Rely on TopologyRemap to convert quads to tris. * Remove double newline * Ensure render pass ends before stride or I8 conversion	2022-09-20 18:38:48 -03:00
Emmanuel Hansen	6f0395538b	Avalonia - Use embedded window for avalonia (#3674 ) * wip * use embedded window * fix race condition on opengl Windows * fix glx issues on prime nvidia * fix mouse support win32 * clean up * addressed review * addressed review * fix warnings * fix software keyboard dialog * Update Ryujinx.Ava/Ui/Applet/SwkbdAppletDialog.axaml.cs Co-authored-by: gdkchan <gab.dark.100@gmail.com> * remove double semi Co-authored-by: gdkchan <gab.dark.100@gmail.com>	2022-09-19 15:05:26 -03:00
riperiperi	c3c41fa4bb	Periodically Flush Commands for Vulkan (#3689 ) * Periodically Flush Commands for Vulkan NVIDIA's OpenGL driver has a built-in mechanism to automatically flush commands to GPU when a lot have been queued. It's also pretty inconsistent, but we'll ignore that for now. Our Vulkan implementation only submits a command buffer (flush equivalent) when it needs to. This is typically when another command buffer needs to be sequenced after it, presenting a frame, or an edge case where we flush around GPU queries to get results sooner. This difference in flush behaviour causes a notable difference between Vulkan and OpenGL when we have to wait for commands. In the worst case, we will wait for a sync point that has just been created. In Vulkan, this sync point is created by flushing the command buffer, and storing a waitable fence that signals its completion. Our command buffer contains _every command that we queued since the last submit_, which could be an entire frame's worth of draws. This has a huge effect on CPU <-> GPU latency. The more commands in a command buffer, the longer we have to wait for it to complete, which results in wasted time. Because we don't know when the guest will force us to wait, we always want the smallest possible latency. By periodically flushing, we ensure that each command buffer takes a more consistent, smaller amount of time to execute, and that the back of the GPU queue isn't as far away when we need to wait for something to happen. This also might reduce time that the GPU is left inactive while commands are being built. The main affected game is Pokemon Sword, which got significantly faster in overworld areas due to reduced waiting time when it flushes a shadow map from the main GPU thread. Another affected game is BOTW, which gets faster depending on the area. This game flushes textures/buffers from its game thread, which is the bottleneck. Flush latency and throughput may be improved on other games that are inexplicably slower than OpenGL. It's possible that certain games could have their performance _decreased_ slightly due to flushes not being free, but it is unlikely. Also, flushing to get query results sooner has been tweaked to improve the number of full draw skips that can be done. (tested in SMO) * Remove unused variable * Fix possible issue with early query flush	2022-09-14 13:48:31 -03:00

63 commits