Archived
1
0
Fork 0
forked from Mirror/Ryujinx
This repository has been archived on 2024-10-11. You can view files and clone it, but cannot push or open issues or pull requests.
jinx/ARMeilleure/CodeGen/X86
FICTURE7 9d7627af64
Add multi-level function table (#2228)
* Add AddressTable<T>

* Use AddressTable<T> for dispatch

* Remove JumpTable & co.

* Add fallback for out of range addresses

* Add PPTC support

* Add documentation to `AddressTable<T>`

* Make AddressTable<T> configurable

* Fix table walk

* Fix IsMapped check

* Remove CountTableCapacity

* Add PPTC support for fast path

* Rename IsMapped to IsValid

* Remove stale comment

* Change format of address in exception message

* Add TranslatorStubs

* Split DispatchStub

Avoids recompilation of stubs during tests.

* Add hint for 64bit or 32bit

* Add documentation to `Symbol`

* Add documentation to `TranslatorStubs`

Make `TranslatorStubs` disposable as well.

* Add documentation to `SymbolType`

* Add `AddressTableEventSource` to monitor function table size

Add an EventSource which measures the amount of unmanaged bytes
allocated by AddressTable<T> instances.

 dotnet-counters monitor -n Ryujinx --counters ARMeilleure

* Add `AllowLcqInFunctionTable` optimization toggle

This is to reduce the impact this change has on the test duration.
Before everytime a test was ran, the FunctionTable would be initialized
and populated so that the newly compiled test would get registered to
it.

* Implement unmanaged dispatcher

Uses the DispatchStub to dispatch into the next translation, which
allows execution to stay in unmanaged for longer and skips a
ConcurrentDictionary look up when the target translation has been
registered to the FunctionTable.

* Remove redundant null check

* Tune levels of FunctionTable

Uses 5 levels instead of 4 and change unit of AddressTableEventSource
from KB to MB.

* Use 64-bit function table

Improves codegen for direct branches:

    mov qword [rax+0x408],0x10603560
 -  mov rcx,sub_10603560_OFFSET
 -  mov ecx,[rcx]
 -  mov ecx,ecx
 -  mov rdx,JIT_CACHE_BASE
 -  add rdx,rcx
 +  mov rcx,sub_10603560
 +  mov rdx,[rcx]
    mov rcx,rax

Improves codegen for dispatch stub:

    and rax,byte +0x1f
 -  mov eax,[rcx+rax*4]
 -  mov eax,eax
 -  mov rcx,JIT_CACHE_BASE
 -  lea rax,[rcx+rax]
 +  mov rax,[rcx+rax*8]
    mov rcx,rbx

* Remove `JitCacheSymbol` & `JitCache.Offset`

* Turn `Translator.Translate` into an instance method

We do not have to add more parameter to this method and related ones as
new structures are added & needed for translation.

* Add symbol only when PTC is enabled

Address LDj3SNuD's feedback

* Change `NativeContext.Running` to a 32-bit integer

* Fix PageTable symbol for host mapped
2021-05-29 18:06:28 -03:00
..
Assembler.cs Add multi-level function table (#2228) 2021-05-29 18:06:28 -03:00
CallConvName.cs Add a new JIT compiler for CPU code (#693) 2019-08-08 21:56:22 +03:00
CallingConvention.cs Add a new JIT compiler for CPU code (#693) 2019-08-08 21:56:22 +03:00
CodeGenCommon.cs Optimize x64 loads and stores using complex addressing modes (#972) 2020-03-10 09:29:34 +11:00
CodeGenContext.cs PPTC & Pool Enhancements. (#1968) 2021-02-22 03:23:48 +01:00
CodeGenerator.cs PPTC & Pool Enhancements. (#1968) 2021-02-22 03:23:48 +01:00
HardwareCapabilities.cs CPU (A64): Add FP16/FP32 fast paths (F16C Intrinsics) for Fcvt_S, Fcvtl_V & Fcvtn_V Instructions. Now HardwareCapabilities uses CpuId. (#1650) 2020-11-18 19:35:54 +01:00
IntrinsicInfo.cs Add a new JIT compiler for CPU code (#693) 2019-08-08 21:56:22 +03:00
IntrinsicTable.cs CPU (A64): Add Fmaxnmp & Fminnmp Scalar Inst.s, Fast & Slow Paths; with Tests. (#1894) 2021-01-20 09:12:33 +11:00
IntrinsicType.cs Add support for guest Fz (Fpcr) mode through host Ftz and Daz (Mxcsr) modes (fast paths). (#1630) 2020-12-07 10:37:07 +01:00
PreAllocator.cs (CPU) Fix CRC32 instruction when constant values are used as input (#2183) 2021-04-07 23:43:08 +02:00
X86Condition.cs Improve branch operations (#1442) 2020-08-05 08:52:33 +10:00
X86Instruction.cs Fix Vnmls_S fast path (F64: losing input d value). Fix Vnmla_S & Vnmls_S slow paths (using fused inst.s). Fix Vfma_V slow path not using StandardFPSCRValue(). (#1775) 2020-12-17 20:43:41 +01:00
X86Optimizer.cs Fold constant offsets and group constant addresses (#2285) 2021-05-13 21:26:57 +02:00
X86Register.cs Add a new JIT compiler for CPU code (#693) 2019-08-08 21:56:22 +03:00