Archived
1
0
Fork 0
forked from Mirror/Ryujinx
Commit graph

10 commits

Author SHA1 Message Date
riperiperi
458452279c
GPU: Track buffer migrations and flush source on incomplete copy (#3952)
* Track buffer migrations and flush source on incomplete copy

Makes sure that the modified range list is always from the latest iteration of the buffer, and flushes earlier iterations of a buffer if the data has not been migrated yet.

* Cleanup 1

* Reduce cost for redundant signal checks on Vulkan

* Only inherit the range list if there are pending ranges.

* Fix OpenGL

* Address Feedback

* Whoops
2022-12-01 16:30:13 +01:00
riperiperi
5a39d3c4a1
GPU: Relax locking on Buffer Cache (#3883)
I did this on ncbuffer2 when we were using it for LDN 3, but I noticed that it can apply to the current buffer manager too, and it's an easy performance win.

The only buffer access that can come from another thread is the overlap search for buffers that have been unmapped. Everything else, including modifications, come from the main GPU thread. That means we only need to lock the range list when it's being modified, as that's the only time where we'll cause a race with the unmapped handler.

This has a significant performance improvements in situations where FIFO is high, like the other two PRs. Joined together they give a nice boost (73.6 master -> 79 -> 83 fps in SMO).
2022-11-24 01:41:16 +00:00
riperiperi
187372cbde
Prune ForceDirty and CheckModified caches on unmap (#3862)
* Prune ForceDirty and CheckModified caches on unmap

Since we're now using this for modified checks on the HLE indirect draw method, I'm worried that leaving these to forever gather cache entries isn't the best idea for performance in the long term, and it could keep old buffer objects alive for longer than they should be.

This PR adds the ability to prune invalid entries before checking these caches, and queues it whenever gpu memory is unmapped. It also aligns modified checks to the page size, as I figured it would be possible for a huge number of overlapping over a game's runtime.

This prevents Super Mario Odyssey from having 10s of thousands of entries in the modified cache in Metro Kingdom, and them duplicating when entering and leaving a building (should be cleared, as they were unmapped).

* Address Feedback
2022-11-18 14:58:24 +00:00
gdkchan
f1d1670b0b
Implement HLE macro for DrawElementsIndirect (#3748)
* Implement HLE macro for DrawElementsIndirect

* Shader cache version bump

* Use GL_ARB_shader_draw_parameters extension on OpenGL

* Fix DrawIndexedIndirectCount on Vulkan when extension is not supported

* Implement DrawIndex

* Alignment

* Fix some validation errors

* Rename BaseIds to DrawParameters

* Fix incorrect index buffer and vertex buffer size in some cases

* Add HLE macros for DrawArraysInstanced and DrawElementsInstanced

* Perform a regular draw when indirect data is not modified

* Use non-indirect draw methods if indirect buffer was not GPU modified

* Only check if draw parameters match if the shader actually uses them

* Expose Macro HLE setting on GUI

* Reset FirstVertex and FirstInstance after draw

* Update shader cache version again since some people already tested this

* PR feedback

Co-authored-by: riperiperi <rhy3756547@hotmail.com>
2022-11-16 14:53:04 -03:00
gdkchan
79becc4b78
Dynamically increase buffer size when resizing (#2861)
* Grow buffers by 1.5x of its size when resizing

* Further restrict the cases where the dynamic expansion is done
2022-03-15 03:33:53 +01:00
mpnico
8e1adb95cf
Add support for HLE macros and accelerate MultiDrawElementsIndirectCount #2 (#2557)
* Add support for HLE macros and accelerate MultiDrawElementsIndirectCount

* Add missing barrier

* Fix index buffer count

* Add support check for each macro hle before use

* Add missing xml doc

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2021-08-26 23:50:28 +02:00
gdkchan
40b21cc3c4
Separate GPU engines (part 2/2) (#2440)
* 3D engine now uses DeviceState too, plus new state modification tracking

* Remove old methods code

* Remove GpuState and friends

* Optimize DeviceState, force inline some functions

* This change was not supposed to go in

* Proper channel initialization

* Optimize state read/write methods even more

* Fix debug build

* Do not dirty state if the write is redundant

* The YControl register should dirty either the viewport or front face state too, to update the host origin

* Avoid redundant vertex buffer updates

* Move state and get rid of the Ryujinx.Graphics.Gpu.State namespace

* Comments and nits

* Fix rebase

* PR feedback

* Move changed = false to improve codegen

* PR feedback

* Carry RyuJIT a bit more
2021-07-11 17:20:40 -03:00
gdkchan
8b44eb1c98
Separate GPU engines and make state follow official docs (part 1/2) (#2422)
* Use DeviceState for compute and i2m

* Migrate 2D class, more comments

* Migrate DMA copy engine

* Remove now unused code

* Replace GpuState by GpuAccessorState on GpuAcessor, since compute no longer has a GpuState

* More comments

* Add logging (disabled)

* Add back i2m on 3D engine
2021-07-07 20:56:06 -03:00
gdkchan
fbb4019ed5
Initial support for separate GPU address spaces (#2394)
* Make GPU memory manager a member of GPU channel

* Move physical memory instance to the memory manager, and the caches to the physical memory

* PR feedback
2021-06-29 19:32:02 +02:00
gdkchan
a10b2c5ff2
Initial support for GPU channels (#2372)
* Ground work for separate GPU channels

* Rename TextureManager to TextureCache

* Decouple texture bindings management from the texture cache

* Rename BufferManager to BufferCache

* Decouple buffer bindings management from the buffer cache

* More comments and proper disposal

* PR feedback

* Force host state update on channel switch

* Typo

* PR feedback

* Missing using
2021-06-24 01:51:41 +02:00