Archived
1
0
Fork 0
forked from Mirror/Ryujinx
This repository has been archived on 2024-10-11. You can view files and clone it, but cannot push or open issues or pull requests.
jinx/ARMeilleure/CodeGen
jduncanator 68e15c1a74
Implement Fast Paths for most A32 SIMD instructions (#952)
* Begin work on A32 SIMD Intrinsics

* More instructions, some cleanup.

* Intrinsics for Move instructions (zip etc)

These pass the existing tests.

* Intrinsics for some of Cvt

While doing this I noticed that the conversion for int/fp was incorrect
in the slow path. I'll fix this in the original repo.

* Intrinsics for more Arithmetic instructions.

* Intrinsics for Vext

* Fix VEXT Intrinsic for double words.

* Use InsertPs to move scalar values.

* Cleanup, fix VPADD.f32 and VMIN signed integer.

* Cleanup, add SSE2 support for scalar insert.

Works similarly to the IR scalar insert, but obviously this one works
directly on V128.

* Minor cleanup.

* Enable intrinsic for FP64 to integer conversion.

* Address feedback apart from splitting out intrinsic float abs

Also: bad VREV encodings as undefined rather than throwing in translation.

* Move float abs to helper, fix bug with cvt

* Rename opc2 & 3 to match A32 docs, use ArgumentOutOfRangeException appropriately.

* Get name of variable at compilation rather than string literal.

* Use correct double sign mask.
2020-03-05 11:41:33 +11:00
..
Optimizations Replace LinkedList by IntrusiveList to avoid allocations on JIT (#931) 2020-02-17 22:30:54 +01:00
RegisterAllocators Replace LinkedList by IntrusiveList to avoid allocations on JIT (#931) 2020-02-17 22:30:54 +01:00
Unwinding Add a new JIT compiler for CPU code (#693) 2019-08-08 21:56:22 +03:00
X86 Implement Fast Paths for most A32 SIMD instructions (#952) 2020-03-05 11:41:33 +11:00
CompiledFunction.cs Add a new JIT compiler for CPU code (#693) 2019-08-08 21:56:22 +03:00