we can only have one active waiter for any given irq at any given time.
allowing waits for irqs on state machines bypasses this limitation and
causes lost events for all but the latest waiter for a given irq.
splitting this out also allows us to signal from state machines to other
parts of the application without monopolizing state machine access for
the irq wait, as would be necessary to make irq waiting sound.
move all methods into PioStateMachine instead. the huge trait wasn't
object-safe and thus didn't have any benefits whatsoever except for
making it *slightly* easier to write bounds for passing around state
machines. that would be much better solved with generics-less instances.
once all sharing owners of pio pins have been dropped we should reset
the pin for use by other hal objects. unfortunately this needs an atomic
state per pio block because PioCommon and all of the state machines
really do share ownership of any wrapped pins. only PioCommon can create
them, but all state machines can keep them alive. since state machines
can be moved to core1 we can't do reference counting in relaxed mode,
but we *can* do relaxed pin accounting (since only common and the final
drop can modify this).
we can't prove that some instruction memory is not used as long as state
machines are alive, and we can pass instance memory handles between
instances as well. mark free_instr unsafe, with documentation for this caveat.
1425: rp pio, round 2 r=Dirbaio a=pennae
another round of bugfixes for pio, and some refactoring. in the end we'd like to make pio look like all the other modules and not expose traits that provide all the methods of a type, but put them onto the type itself. traits only make much sense, even if we added an AnyPio and merged the types for the member state machines (at the cost of at least a u8 per member of Pio).
Co-authored-by: pennae <github@quasiparticle.net>
not requiring a PioInstance for splitting lets us split from a
PeripheralRef or borrowed PIO as well, mirroring every other peripheral
in embassy_rp. pio pins still have to be constructed from owned pin
instances for now.
merge into PioInstance instead. PioPeripheral was mostly a wrapper
around PioInstance anyway, and the way the wrapping was done required
PioInstanceBase<N> types where PIO{N} could've been used instead.
1423: rp: fix gpio InputFuture and inefficiencies r=pennae a=pennae
InputFuture could not wait for edges without breaking due to a broken From impl, but even if the impl had been correct it would not have worked correctly because raw edge interrupts are sticky and must be cleared from software. also replace critical sections with atomic accesses, and do nvic setup only once.
Co-authored-by: pennae <github@quasiparticle.net>
doing this setup work repeatedly, on every wait, is unnecessary. with
nothing ever disabling the interrupt it is sufficient to enable it once
during device init and never touch it again.
pio control registers are notionally shared between state machines as
well. state machine operations that change these registers must use
atomic accesses (or critical sections, which would be overkill).
notably PioPin::set_input_sync_bypass was even wrong, enabling the
bypass on a pin requires the corresponding bit to be set (not cleared).
the PioCommon function got it right.
fixing the dma word size to 32 makes it impossible to implement any
peripheral that takes its data in smaller chunks, eg uart, spi, i2c,
ws2812, the list goes on.
compiler barriers were also not set correctly; we need a SeqCst barrier
before starting a transfer as well to avoid reordering of accesses into
a buffer after dma has started.
InputFuture did not use and check edge interrupts correctly.
InterruptTrigger should've checked for not 1,2,3,4 but 1,2,4,8 since the
inte fields are bitmasks, and not clearing INTR would have repeatedly
triggered edge interrupts early.
1414: rp: report errors from buffered and dma uart receives r=Dirbaio a=pennae
neither of these reported errors so far, which is not ideal. add error reporting to both of them that matches the blocking error reporting as closely as is feasible, even allowing partial receives from buffered uarts before errors are reported where they would have been by the blocking code. dma transfers don't do this, if an errors applies to any byte in a transfer the entire transfer is nuked (though we probably could report how many bytes have been transferred).
Co-authored-by: pennae <github@quasiparticle.net>
this reports errors at the same location the blocking uart would, which
works out to being mostly exact (except in the case of overruns, where
one extra character is dropped). this is actually easier than going
nuclear in the case of errors and nuking both the buffer contents and
the rx fifo, both of which are things we'd have to do in addition to
what's added here, and neither are needed for correctness.
sending break conditions is necessary to implement some protocols, and
the hardware supports this natively. we do have to make sure that we
don't assert a break condition while the uart is busy though, otherwise
the break may be inserted before the last character in the tx fifo.
instruction memory is a shared resource. writing it only from PioCommon
clarifies this, and perhaps makes it more obvious that multiple state
machines can share the same instructions.
this also allows *freeing* of instruction memory to reprogram the
system, although this interface is not entirely safe yet. it's safe in
the sense rusts understands things, but state machines may misbehave if
their instruction memory is freed and rewritten while they are running.
fixing this is out of scope for now since it requires some larger
changes to how state machines are handled. the interface provided
currently is already unsafe in that it lets people execute instruction
memory that has never been written, so this isn't much of a drawback for now.
pin and irq operations affect the entire pio block. with pins this is
not very problematic since pins themselves are resources, but irqs are
not treated like that and can thus interfere across state machines. the
ability to wait for an irq on a state machine is kept to make
synchronization with user code easier, and since we can't inspect loaded
programs at build time we wouldn't gain much from disallowing waits from
state machines anyway.
this mainly removes the need for explicit indexing to get the pac
object. runtime effect is zero, but arguably things are a bit easier to
read with less indexing.
this is already done during platform init. it wasn't even sound in the
original implementation because futures would meddle with the nvic in
critical sections, while another (interrupt) executor could meddle with
the nvic without critical sections here. it is only accidentally sound
now and only if irq1 of both pios isn't used by user code. luckily the
worst we can expect to happen is interrupt priorities being set wrong,
but wrong is wrong is wrong.
since we never actually *disable* these interrupts for any length of
time we can simply enable them globally. we also initialize all pio
interrupt flags to not cause system interrupts since state machine
irqa are not necessarily meant to cause a system interrupt when set. the
fifo interrupts are sticky and can likewise only be cleared inside the
handler by disabling them.
dma does this too, also with 12 bits to check. this decreases code size
significantly (increasing speed when the cache is cold), frees up an
interrupt handler, and avoids read-modify-write cycles (which makes each
processed flag cheaper). due to more iterations per handler invocation
the actual runtime of the handler body remains roughly the
same (slightly faster at O2, slightly slower at Oz).
notably wakers are now kept in one large array indexed by the irq
register bit number instead of three different arrays, this allows for
machine code-level optimizations of waker lookups.
1403: Bump versions preparing for -macros and -executor release r=lulf a=lulf
I'd like to propose a new release of embassy-macros and embassy-executor, as there is a challenge with some of the features changing since 0.1.1 when using libraries that depend on 0.1.1 with applications that patch to use git versions.
Co-authored-by: Ulf Lilleengen <lulf@redhat.com>
1406: rp: DMA behaviour during flash operations r=Dirbaio a=kalkyl
This PR changes the old behaviour during flash operations where all DMA transfers were paused during the flash operation.
The new approach is to wait for any DMA operating in flash region to finish and let RAM transfers continue.
Co-authored-by: kalkyl <henrik.alser@me.com>
storing a full function pointer initialized to a resolver trampoline
lets us avoid the runtime cost of checking whether we need to do the
initialization.
rp-hal has done this very well already, so we'll just copy their entire
impl again. only div.rs needed some massaging because our sio access
works a little differently, everything else worked as is.
1378: Add ability to invert UART pins, take 2 r=Dirbaio a=jakewins
Same PR as before, except this now works :)
There was a minor hiccup in the UartRx code where the rx pin got passed as the tx argument, so the invert settings didn't get applied. With this fix, my local setup at least is happily reading inverted uart data.
Co-authored-by: Jacob Davis-Hansson <jake@davis-hansson.com>
1372: rp: add division intrinsics r=Dirbaio a=pennae
rp2040-hal adds division intrinsics using the hardware divider unit in the SIO, as does the pico-sdk itself. using the hardware is faster than the compiler_rt implementations, and more compact too.
since embassy does not expose the hardware divider in any way (yet?) we could go even further an remove the state-saving code rp2040-hal needs, but that doesn't seem to be worth it.
Co-authored-by: pennae <github@quasiparticle.net>
rp2040-hal adds division intrinsics using the hardware divider unit in
the SIO, as does the pico-sdk itself. using the hardware is faster than
the compiler_rt implementations, and more compact too.
This is useful in some cases where the surrounding circuit
for some reason inverts the UART signal, for instance if you're talking
to a device via an optocoupler.
A while ago `OutputOpenDrain` was made to implement `InputPin`,
something that allowed drivers for various one-wire protocols to be
written, but it's been lacking a `Wait` implementation — something
that's needed to write async versions of these drivers.
This commit also adds `get_level()` to `OutputOpenDrain`, since
`is_high()` and `is_low()` were already implemented, but `get_level()`
itself was missing.
1318: rp: Allow zero len reads for buffered uart r=Dirbaio a=timokroeger
Prevents the read methods from getting stuck forever.
cc `@MathiasKoch` can you test if this fixes the problem you described in the chat?
Co-authored-by: Timo Kröger <timokroeger93@gmail.com>
When using gpio pin changes for things like peripheral interrupts these
debug! calls flood defmt, making it difficult to find what you're
actually looking for.
1142: More rp2040 BufferedUart fixes r=Dirbaio a=timokroeger
* Refactor init code
* Make it possible to drop RX without breaking TX (or vice versa)
* Correctly handle RX buffer full scenario
Co-authored-by: Timo Kröger <timokroeger93@gmail.com>
* Only clear interrupt flags that have fired (so that we do not lose any error flags)
* Enable RX interrupt when a read is requested, disable it when the RX buffer is full
* Rework TX interrupt handling: its "edge" triggered by a FIFO threshold
When data is in the RX fifo the RX timeout interrupt goes high again even after clearing it.
The result is a deadlock because execution is stuck in the interrupt handler. No other code
can run to clear the receive buffer.
Enable and disable RX interrupts based on the buffer fill level.
Use the same approach for the TX code path.
1128: Add missing SPI pins r=Dirbaio a=pferreir
The SPI definitions lack the pins which are not accessible on the pico (but are so e.g. on the stamp).
Co-authored-by: Pedro Ferreira <pedro@dete.st>
959: Generic, executor-agnostic queue implementation r=ivmarkov a=ivmarkov
Hopefully relatively well documented.
Implementation relies on a fixed-size `SortedLinkedList` from `heapless`. (By default, for up to 128 timer schedules, but we can lower this number to - say - 64.)
As discussed earlier, on queue overflow, the `WakerRegistration` approach is utilized, whereas the waker that is ordered first in the queue is awoken to make room for the incoming one (which might be the waker that would be awoken after all!). Wakers are compared with `Waker::will_wake`, so the queue should actually not fill up that easily, if at all.
I've left provisions for the user to manually instantiate the queue using a dedicated macro - `generic_queue!` so that users willing to adjust the queue size, or users (like me) who have to use the queue in a complex "on-top-of-RTOS-but-the-timer-driver-calling-back-from-ISR" scenario can customize the mutex that protects the queue.
The one thing I'm not completely happy with is the need to call `{ embassy_time::queue::initialize() }` early on before any futures using embassy-time are polled, which is currently on the shoulders of the user. I'm open to any ideas where we can get rid of this and do it on the first call to `_embassy_time_schedule_wake`, without introducing very complex combinations of critical sections, atomics and whatnot.
Co-authored-by: ivmarkov <ivan.markov@gmail.com>
Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
This ensures that the current response has finished being sent
before the subsequent set_address() happens. Otherwise connecting
a device is intermittent, can fail depending on timing.
984: rp pico async i2c implementation r=Dirbaio a=jsgf
This implements an interrupt-driven async i2c master. It is based on https://github.com/embassy-rs/embassy/pull/914, a bit of https://github.com/embassy-rs/embassy/pull/978 and `@ithinuel's` https://github.com/ithinuel/rp2040-async-i2c.git
This is still work-in-progress, and is currently untested.
1006: Removes some of the code duplication for UarteWithIdle r=Dirbaio a=huntc
This PR removes some of the code duplications for `UarteWithIdle` at the slight expense of requiring a split when using idle processing. As the nRF example illustrates though given the LoC removed, this expense seems worth the benefit in terms of maintenance, and the avoidance of copying over methods. My main motivation for this PR was actually due to the `event_endtx` method not having been copied across to the idle-related code.
Tested the uart_idle example on my nRF52840-dk, and from within my app. Both appear to work fine.
Co-authored-by: Jeremy Fitzhardinge <jeremy@goop.org>
Co-authored-by: huntc <huntchr@gmail.com>
Assert "tx_empty" interrupt a little early so there's time to wake up
and start refilling the fifo before it drains. This avoids stalling the
i2c bus if the tx fifo completely drains.
This is an interrupt-driven async i2c master implementation. It makes as
best use of the RP2040's i2c block's fifos as possible to minimize
interrupts.
It implements embedded_hal_async::i2c for easy interop.
WIP async impl
972: Restructure USB crates r=Dirbaio a=Dirbaio
- Split driver from `embassy-usb` to a separate crate. This allows making breaking changes to `embassy-usb` without having to bump all the crates with driver impls, such as HALs.
- Merge classes into `embassy-usb`. Now that breaking changes to `embassy-usb` aren't that bad, having everything in a single crate is much easier.
Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
949: (embassy-rp): Implement RealTimeClock r=lulf a=MathiasKoch
Basically a 1:1 port of the great implementation effort made by `rp-hal`
Co-authored-by: Mathias <mk@blackbird.online>
Old code used `cfg_if!` because rustc still parses code inside disabled cfg's, and Rust stable at that time couldn't parse the new GAT where-clause location. This is not the case anymore.