This is a `core` patch to make wakers 1 word (the task pointer) instead of 2 (task pointer + vtable). It allows having the "waker optimization" we had a while back on `WakerRegistration/AtomicWaker`, but EVERYWHERE, without patching all crates.
Advantages:
- Less memory usage.
- Faster.
- `AtomicWaker` can actually use atomics to load/store the waker, No critical section needed.
- No `dyn` call, which means `cargo-call-stack` can now see through wakes.
Disadvantages:
- You have to patch `core`...
- Breaks all executors and other things that create wakers, unless they opt in to using the new `from_ptr` API.
How to use:
- Run this shell script to patch `core`. https://gist.github.com/Dirbaio/c67da7cf318515181539122c9d32b395
- Enable `build-std`
- Enable `build-std-features = core/turbowakers`
- Enable feature `turbowakers` in `embassy-executor`, `embassy-sync`.
- Make sure you have no other crate creating wakers other than `embassy-executor`. These will panic at runtime.
Note that the patched `core` is equivalent to the unpached one when the `turbowakers` feature is not enabled, so it should be fine to leave it there.
This makes behavior consistent with `WakerRegistration`. It allows canceling `wait`
in one task and then calling `wait` in another. If two tasks are `wait`ing
concurrently the signal will be received by only one of them, randomly.