Delay time influenced by memory configuration #151
The documentation clearly states that `delay` will block for at least the specified number of cycles. It is certainly a poor delay implementation and is not meant to be used for accurate timing; PRs to improve it would be appreciated. NB: since this function is inlined, it will behave differently depending on the caller, so measuring it as part of initialisation is not possible. Not inlining it would make it even less dependable, at least for short delays.
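For context, the function under discussion is essentially a calibrated countdown loop. A rough, portable Rust sketch of the idea (not the actual cortex-m implementation, which is a tight inline-assembly loop) might look like this; the 4-cycles-per-iteration figure is exactly the assumption this issue shows can be wrong:

```rust
/// Rough sketch of a busy-wait delay (NOT the real cortex-m
/// implementation, which uses inline assembly). Assumes each
/// iteration costs roughly 4 CPU cycles.
pub fn delay(cycles: u32) {
    // Round up so we block for *at least* the requested cycles.
    let mut iters = cycles.div_ceil(4);
    while iters > 0 {
        // black_box keeps the optimizer from deleting the loop.
        iters = core::hint::black_box(iters) - 1;
    }
}

fn main() {
    delay(1_000); // spins for very roughly 1000 cycles
    println!("done");
}
```

As the comment above notes, because the real function is `#[inline]`, even this loop's cost depends on where the compiler places it in the caller.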
I believe the concerns in the OP here have been addressed (to the extent they can be, as discussed in other issues) now that #587 has been merged. That is a curious note about the NRF52840 and 16-byte alignment, though. I only tested #587 on a SAMD51 (Cortex-M4), on which the branch timing seems to differentiate only between 2-, 4-, and 8-byte-aligned targets.
I don’t have the bandwidth right now to confirm the fix or sort out how I reached the cycle-vs-alignment conclusions that I did, but this looks like a good step to me. Thanks for working on it. Feel free to close the issue. I do still think a doc clarification would be worthwhile: to me, the “at least” statement implies an additive error (startup overhead, interrupts) rather than the multiplicative error caused by the cycle-timing effects.
I agree about the "at least"; I'll aim to PR a change to that wording and fix a minor error I made in the comment that explains the alignment. In another thread, I suggested that we put the argument in terms of instructions/NOPs instead of cycles - maybe that would better communicate the situation. Taking a slightly wider view, I'm wondering if the right place for this method is actually the HAL layer instead. Just thinking about the ATSAM one that I'm most familiar with, we could quite easily make a delay that's more or less cycle-correct, since we know exactly what chip the code will run on.
A lot of HALs can implement delays more usefully, including using timer peripherals and such, and can implement the embedded-hal Delay traits etc too, but I think it's useful to have some very basic, very compatible "delay" in cortex-m, even if it varies a bit chip-to-chip (and at least not alignment-to-alignment any more after your PR). |
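One reason a HAL can do better is that it knows the configured core clock, so it can convert wall-clock time into loop iterations instead of asking the caller for raw cycles. A minimal sketch of that conversion, where the function name, parameters, and the 4-cycles-per-iteration figure are illustrative assumptions rather than any cortex-m or embedded-hal API:

```rust
/// Hypothetical HAL-side helper: convert a microsecond delay into a
/// loop-iteration count, given the known core clock frequency and an
/// assumed per-iteration cost. All names/figures are illustrative.
fn us_to_loop_iters(us: u64, sysclk_hz: u64, cycles_per_iter: u64) -> u64 {
    let cycles = us * sysclk_hz / 1_000_000;
    // Round up so the delay lasts *at least* `us` microseconds.
    cycles.div_ceil(cycles_per_iter)
}

fn main() {
    // e.g. a 120 MHz clock, 10 µs delay, 4-cycle loop body:
    // 10 µs * 120 cycles/µs = 1200 cycles -> 300 iterations.
    assert_eq!(us_to_loop_iters(10, 120_000_000, 4), 300);
    println!("ok");
}
```

Timer-peripheral-based `Delay` implementations sidestep the per-iteration-cost guess entirely, which is why the embedded-hal traits are usually the better tool when a timer is available.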
How many cycles `asm::delay()` actually takes can be affected by how the chip memory is configured, and even by where the function ends up in Flash. The code assumes the loop takes 4 cycles and explains that interrupts may cause the function to delay longer than requested. However, items such as Flash wait states, instruction cache, and function alignment can also cause the loop to take longer, even when interrupts are disabled. As an example, `asm::delay()` running on the NRF52840 can take 4, 5, or 6 cycles per loop depending on whether the function ends up 16-, 8-, or 2-byte aligned. This also means that the delay depends on exactly how the binary is compiled and linked. This is very surprising behavior, and I think a note to the above effect would be useful in the function's documentation. Perhaps the loop could be measured as part of initialization, but that would probably be unnecessary overhead.