STM32H7: Flash error flags are set spontaneously #60449

Closed
mjaun opened this issue Jul 17, 2023 · 6 comments · Fixed by #60765
Labels: bug, platform: STM32, priority: low

Comments

@mjaun
Contributor

mjaun commented Jul 17, 2023

Describe the bug
Summary: The RDSERR (secure error) and/or RDPERR (read protection error) flags of the flash status register are set from time to time.

We noticed this issue in our project because the bootloader API mcuboot_swap_type() returned unexpected values. Under the hood, this function uses flash_area_read(), which is provided by the STM32 flash driver. The driver in turn checks the error flags mentioned above after the operation and reports a failure when they are set.

To Reproduce
We extracted a sample from our project to reproduce the issue. After some experimentation we now have a minimal sample that does nothing except very short sleeps in the main thread. Before entering the sleep loop, the sample registers an interrupt handler that is notified when the flash error flags mentioned above are set.

It can be found here: https://github.com/mjaun/zephyr/tree/flash_error/samples/flash_error

It reproduces the issue at least on the nucleo_h743zi board, and also on the nucleo_h743zi_minimal board, which is provided on the same branch and disables almost all drivers.

Expected behavior

  1. Most importantly the flash errors should not occur
  2. Additionally it might make sense to clear the flags before read/write operations
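The second point could look roughly like the sketch below. It is not the Zephyr driver code: `struct flash_bank_regs` and `flash_clear_stale_read_errors()` are hypothetical names, and the bit positions (RDPERR at bit 23, RDSERR at bit 24, cleared by writing the same bits to the clear-control register) are assumptions that should be checked against the STM32H743 reference manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions in FLASH_SR1 / FLASH_CCR1; verify in the reference manual. */
#define FLASH_RDPERR_BIT (1u << 23)
#define FLASH_RDSERR_BIT (1u << 24)

/* Hypothetical register view; a real driver would use the vendor's register struct. */
struct flash_bank_regs {
    volatile uint32_t sr;  /* status register */
    volatile uint32_t ccr; /* clear-control register, write 1 to clear a flag */
};

/*
 * Request a clear of any pending read-protection error flags before starting
 * a flash operation. Returns the flags that were pending so the caller can
 * log them.
 */
static uint32_t flash_clear_stale_read_errors(struct flash_bank_regs *regs)
{
    assert(regs != NULL);

    uint32_t pending = regs->sr & (FLASH_RDPERR_BIT | FLASH_RDSERR_BIT);

    if (pending != 0u) {
        regs->ccr = pending;
    }
    return pending;
}
```

Calling such a helper at the start of a read or write would keep a flag raised earlier by an unrelated access from being attributed to the current operation. It hides the symptom only; it does not stop the flags from being set.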

Impact
We have some workarounds in place, because the bootloader API might not work properly.

Logs and console output
Output from the mentioned sample:

*** Booting Zephyr OS build abae9b562b84 ***
ISR triggered: 5581 ms
FLASH_SR1: 0x01000000
FLASH_SR2: 0x00000000
ISR triggered: 50320 ms
FLASH_SR1: 0x01000000
FLASH_SR2: 0x00000000
ISR triggered: 95060 ms
FLASH_SR1: 0x01000000
FLASH_SR2: 0x00000000
ISR triggered: 139799 ms
FLASH_SR1: 0x01000000
FLASH_SR2: 0x00000000
ISR triggered: 184538 ms
FLASH_SR1: 0x01000000
FLASH_SR2: 0x00000000

The timings are the same across multiple devices, and the interval between occurrences is nearly constant (about 44.7 s in the output above). The timings change when I modify SYSCLK or the application code. Some changes make the issue disappear (e.g. enabling CONFIG_LOG or CONFIG_DEBUG).

Currently I don't have a clue why these errors occur at all. Breaking in the ISR shows that the application is in the idle thread.
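For reading the FLASH_SR1 values in the output above, a small decoder can help. The sketch below covers only the two flags discussed in this issue and assumes RDPERR is bit 23 and RDSERR is bit 24, which should be verified against the reference manual; `flash_sr_describe()` is a hypothetical helper, not part of Zephyr.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed FLASH_SR1 bit positions; verify in the reference manual. */
struct flash_flag_name {
    uint32_t mask;
    const char *name;
};

static const struct flash_flag_name flash_read_error_flags[] = {
    { 1u << 23, "RDPERR" },
    { 1u << 24, "RDSERR" },
};

/*
 * Write the space-separated names of the read-error flags set in sr into out.
 * Returns the number of names written.
 */
static int flash_sr_describe(uint32_t sr, char *out, size_t out_len)
{
    size_t used = 0;
    int count = 0;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';

    for (size_t i = 0; i < sizeof(flash_read_error_flags) / sizeof(flash_read_error_flags[0]); i++) {
        if ((sr & flash_read_error_flags[i].mask) == 0u) {
            continue;
        }
        int n = snprintf(out + used, out_len - used, "%s%s",
                         count > 0 ? " " : "", flash_read_error_flags[i].name);
        if (n < 0 || (size_t)n >= out_len - used) {
            break; /* buffer too small, stop here */
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Under that bit assignment, the logged value 0x01000000 has only bit 24 set, so it would correspond to RDSERR alone.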

Environment (please complete the following information):

  • Zephyr v3.3.0 but also on main (8e40304)
  • STM32H743ZI
@mjaun mjaun added the bug The issue is a bug, or the PR is fixing a bug label Jul 17, 2023
@nordicjm nordicjm added the platform: STM32 ST Micro STM32 label Jul 17, 2023
@mjaun
Contributor Author

mjaun commented Jul 17, 2023

I just noticed the comment in this discussion that such behavior can be triggered by speculative access to the system memory.

Indeed, adding the following MPU region seems to resolve the issue, at least in our sample:

MPU_REGION_ENTRY("SYSTEM", 0x1FF00000, { (STRONGLY_ORDERED_SHAREABLE | REGION_1M | NO_ACCESS_Msk) })

Beforehand, I had already tried setting up a background region as suggested by ST in this tutorial. For unknown reasons, that example does not protect the system memory. In any case, the solution to this issue could be to define a reasonable background MPU region.
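To make the region entry above more concrete, here is a sketch of how a no-access, execute-never, strongly-ordered region could be encoded for the ARMv7-M MPU. The field positions follow my reading of the Armv7-M MPU_RASR layout; `struct mpu_region` and `mpu_no_access_region()` are hypothetical names, and the XN bit is an addition based on the ST community post quoted later in this thread, not something the Zephyr entry above sets explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ARMv7-M MPU_RASR field positions; verify in the Armv7-M ARM. */
#define RASR_ENABLE         (1u << 0)
#define RASR_SIZE(log2)     ((((uint32_t)(log2)) - 1u) << 1) /* region = 2^log2 bytes */
#define RASR_SHAREABLE      (1u << 18)
#define RASR_AP_NO_ACCESS   (0u << 24)
#define RASR_XN             (1u << 28)

/* Hypothetical pair of values to be programmed into MPU_RBAR and MPU_RASR. */
struct mpu_region {
    uint32_t base;
    uint32_t rasr;
};

/*
 * Build a region that denies all data access and instruction fetches.
 * TEX, C and B are left at 0, which selects the Strongly-ordered memory type.
 */
static struct mpu_region mpu_no_access_region(uint32_t base, unsigned size_log2)
{
    /* Sizes from 32 bytes up to 2 GiB; the base must be aligned to the size. */
    assert(size_log2 >= 5u && size_log2 <= 31u);
    assert((base & ((1u << size_log2) - 1u)) == 0u);

    struct mpu_region region = {
        .base = base,
        .rasr = RASR_XN | RASR_AP_NO_ACCESS | RASR_SHAREABLE |
                RASR_SIZE(size_log2) | RASR_ENABLE,
    };
    return region;
}
```

For the 1 MiB region at 0x1FF00000 used above, the call would be `mpu_no_access_region(0x1FF00000u, 20)`.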

@erwango
Member

erwango commented Jul 18, 2023

Thanks for raising this.
Can you check #57467 to see if it can fix your issue?

@mjaun
Contributor Author

mjaun commented Jul 18, 2023

> Thanks for raising this. Can you check #57467 to see if it can fix your issue?

The commits from #57467 are already included in the provided sample, and they don't seem to fix the issue. As mentioned, it seems to be the system memory region that causes the issue, not the external memory regions.

@erwango erwango assigned FRASTM and unassigned erwango Jul 18, 2023
@erwango erwango added the priority: low Low impact/importance bug label Jul 18, 2023
@FRASTM
Collaborator

FRASTM commented Jul 19, 2023

Setting CONFIG_DCACHE=n also makes the problem disappear.

The interrupt occurs even when the SECURITY option byte and the secure protection are OFF.

This ST community thread gives some clarification: https://community.st.com/t5/stm32-mcu-products/spurious-rdperr-and-rdserr-when-all-protection-and-security/td-p/279852
"the Cortex-M7's prefetch unit is performing speculative instruction fetches to a (partially undocumented) region of system memory (0x1FF0 0000 - 0x1FF7 FFFF) decoded by the flash controller, which sets RDSERR in protest. Mapping this region as Execute-Never with the MPU inhibits the speculative fetch, preventing the RDSERR."

@heinwessels
Contributor

heinwessels commented Aug 8, 2023

We've also been observing this behaviour on our STM32H743 when communicating with the bootloader API. Would be great to have this fixed.

@FRASTM
Collaborator

FRASTM commented Aug 8, 2023

> We've also been observing this behaviour on our STM32H743 when communicating with the bootloader API. Would be great to have this fixed.

PR #60765 should fix this for all STM32H7 devices. It is now waiting for reviewers/approvers.

labbott added a commit to oxidecomputer/hubris that referenced this issue Oct 18, 2024
The STM32H7 is haunted/contains an errata where the CPU may
speculate into a system area (`0x1ff0_0000`) and trigger a flash
error normally only seen when security features are enabled.
This is not an official errata but seems to be well(?) reported

https://community.st.com/t5/stm32-mcus-products/spurious-rdperr-and-rdserr-when-all-protection-and-security/td-p/279852
zephyrproject-rtos/zephyr#60449

One suggested workaround is to add an extra MPU region which
marks The Danger Zone as NX. For the way hubris is designed this
isn't plausible. Instead clear any errors before we attempt to
write to flash.
labbott added a commit to oxidecomputer/hubris that referenced this issue Oct 23, 2024
labbott added a commit to oxidecomputer/hubris that referenced this issue Oct 24, 2024
labbott added a commit to oxidecomputer/hubris that referenced this issue Oct 24, 2024