Skip to content

cache: stm32: add cortex-m33 peripheral driver #88585

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

lindblandro
Copy link

@lindblandro lindblandro commented Apr 14, 2025

STM32 Cortex-M33, such as the U5xxx series, have a cache peripheral for instruction and data caches, which are not present in the C-M33 architecture spec.

Closes #71268

@lindblandro
Copy link
Author

Unsure where the cache driver should be enabled and which Kconfig options should be activated by default. This also now uses the STM32 HAL symbols (DCACHE1 DCACHE2) directly in conditional compilation paths instead of checking for device support. Checking for chip type explicitly might be more readable.

I initially figured that the SOC Kconfig files would select HAS_ICACHE and HAS_DCACHE but that triggered a code path in the arch startup code that tried to use CMSIS to activate the Arm caches, which this core doesn't have.

@lindblandro lindblandro force-pushed the stm32-cache-peripheral-driver branch 2 times, most recently from 7f16427 to 0ebf7d1 Compare April 14, 2025 11:36
Copy link
Collaborator

@JarmouniA JarmouniA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following should be added to Kconfig.defconfig of M33 SoC series that have the cache.

config CACHE_STM32_CACHE
	default y if EXTERNAL_CACHE

config CACHE_MANAGEMENT
	default y

choice CACHE_TYPE
	default EXTERNAL_CACHE
endchoice

Also, all LL_ICACHE_x calls should be replaced by their corresponding cache_instr_x throughout the code.

@lindblandro
Copy link
Author

The following should be added to Kconfig.defconfig of M33 SoC series that have the cache.

config CACHE_STM32_CACHE
	default y if EXTERNAL_CACHE

config CACHE_MANAGEMENT
	default y

choice CACHE_TYPE
	default EXTERNAL_CACHE
endchoice

Also, all LL_ICACHE_x calls should be replaced by their corresponding cache_instr_x throughout the code.

Okay so I was initially in the right path as I had more or less what you suggest but I completely missed the CACHE_TYPE option, which explains why everything broke down. I'll add those in another commit. Thanks.

@lindblandro lindblandro force-pushed the stm32-cache-peripheral-driver branch from 0ebf7d1 to 915ca07 Compare April 15, 2025 08:24
Copy link
Collaborator

@mathieuchopstm mathieuchopstm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please modify the driver so that it works on SoCs without DCACHE? (e.g., STM32L5/STM32H503/STM32WBA)

(Assuming CONFIG_DCACHE is not defined on these series, I guess you could gate all cache_data_* functions behind it?)

@lindblandro lindblandro force-pushed the stm32-cache-peripheral-driver branch from 915ca07 to ad42647 Compare April 15, 2025 08:44
@lindblandro
Copy link
Author

Could you please modify the driver so that it works on SoCs without DCACHE? (e.g., STM32L5/STM32H503/STM32WBA)

(Assuming CONFIG_DCACHE is not defined on these series, I guess you could gate all cache_data_* functions behind it?)

Sure. I was already working on STM32WBA but wasn't aware of L5 / H503. The existing L5 soc init code doesn't set the 1-way cache like the others. Should that be changed in this PR or should the cache associativity be added as a configuration option?

The cache_* functions are already gated inside the sys_cache_* interface so they shouldn't end up in the final image. I guess they can be removed from the build completely.

@mathieuchopstm
Copy link
Collaborator

Sure. I was already working on STM32WBA but wasn't aware of L5 / H503. The existing L5 soc init code doesn't set the 1-way cache like the others. Should that be changed in this PR or should the cache associativity be added as a configuration option?

I guess it could be added as Kconfig option.

The cache_* functions are already gated inside the sys_cache_* interface so they shouldn't end up in the final image. I guess they can be removed from the build completely.

The issue isn't about their presence in the binary, but rather that on such series, the LL_DCache_xxx functions most likely don't exist. Even if it's dead code, trying to use them from the driver should break the build if so.
(But if you're working on WBA, I guess you will see very quickly whether it builds without guards or not 🙂)

@lindblandro
Copy link
Author

lindblandro commented Apr 15, 2025

Sure. I was already working on STM32WBA but wasn't aware of L5 / H503. The existing L5 soc init code doesn't set the 1-way cache like the others. Should that be changed in this PR or should the cache associativity be added as a configuration option?

I guess it could be added as Kconfig option.

I wonder if there's any solid reasoning behind the current default for L5. The original commit didn't mention any and all the other chips use direct mapping as their default. I'll add the Kconfig option in any case.

The cache_* functions are already gated inside the sys_cache_* interface so they shouldn't end up in the final image. I guess they can be removed from the build completely.

The issue isn't about their presence in the binary, but rather that on such series, the LL_DCache_xxx functions most likely don't exist. Even if it's dead code, trying to use them from the driver should break the build if so. (But if you're working on WBA, I guess you will see very quickly whether it builds without guards or not 🙂)

Indeed I did 😅.

@lindblandro lindblandro force-pushed the stm32-cache-peripheral-driver branch from ad42647 to 2b0ebfd Compare April 16, 2025 12:33
@github-actions github-actions bot added area: Flash area: Sensors Sensors area: HWINFO Hardware Information Driver labels Apr 16, 2025
@github-actions github-actions bot requested review from asm5878 and benothmn-st April 16, 2025 12:34
@github-actions github-actions bot requested review from HoZHel and MaureenHelm April 16, 2025 12:34
Copy link
Collaborator

@JarmouniA JarmouniA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -103,14 +103,14 @@ static int stm32_vref_channel_get(const struct device *dev, enum sensor_channel
* STM32H5X: accesses to flash RO region must be done with caching disabled.
*/
#if defined(CONFIG_SOC_SERIES_STM32H5X)
LL_ICACHE_Disable();
sys_cache_instr_disable();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be out of scope for this PR, but is this still applicable to H5 only? or should we check for CONFIG_EXTERNAL_CACHE && CONFIG_ICACHE instead?
The same qst for the check in stm32_temp.c & hwinfo_stm32.c

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think EXTERNAL_CACHE and ICACHE could be used. This explanation by @mathieuchopstm #57178 (comment) points out the issue with cache refills. The issue is with cache line width, which the application note AN5212 summarizes as 128bits for all mentioned devices. The reference manuals for U5 is a bit coy on the matter, but it does mention that OTP is through register reads only and the cache line width is naturally documented without mention to these parts of the flash.

However, the cache implementation might be better and the OTP area could in theory be treated as non-cacheable. I'm mentioning this as I've been using both vref and temp sensors in U585 and don't recall any bus errors even though ICACHE was enabled by soc init code. I'll run a quick check tomorrow and compare H5 and U5 RMs a bit more thoroughly.

The docs could definitely be a lot better on this matter.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested simply with

 #if defined(CONFIG_SOC_SERIES_STM32H5X)
 	sys_cache_instr_disable();
 #endif /* CONFIG_SOC_SERIES_STM32H5X */
-
+	LOG_INF("ICACHE is: %s", LL_ICACHE_IsEnabled() ? "enabled" : "disabled");
 	/* Calculate VREF+ using VREFINT bandgap voltage and calibration data */
 	vref = (cfg->cal_mv * ((*cfg->cal_addr) >> cfg->cal_shift)) / data->raw;

and I get this in the logs:

[00:01:34.086,000] <inf> stm32_vref: ICACHE is: enabled

while reading vref (and core temperature) in a loop. This is with board b_u585i_iot02a/stm32u585xx/ns. So it seems it's working better at least in U585.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference manuals for U5 is a bit coy on the matter, but it does mention that OTP is through register reads only and the cache line width is naturally documented without mention to these parts of the flash.

Can you point out which part of the RM? I can't seem to find such a notice 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference manuals for U5 is a bit coy on the matter, but it does mention that OTP is through register reads only and the cache line width is naturally documented without mention to these parts of the flash.

Can you point out which part of the RM? I can't seem to find such a notice 🤔

Hmm, it was getting late and I might've mixed OTP and option bytes while reading section § 7.3.1 where it says

- 512 bytes OTP (one-time programmable) bytes for user data (32 quad-words).
The OTP data cannot be erased and can be written only once.
- option bytes for user configuration. Unlike user flash memory and system memory,
it is not mapped to any memory address and can be accessed only through the
flash register interface.

I think I just blasted through those two paragraphs, ignoring the second bullet where it starts to talk about option bytes. Sorry for the confusion.

The VREF calibration register is located after the OTP block at 0x0BFA07A5 (OTP ends at 0x0BFA0200). That and the internal temperature sensor registers seem to work with ICACHE in any case.

@lindblandro lindblandro force-pushed the stm32-cache-peripheral-driver branch from 2b0ebfd to 35120b3 Compare April 17, 2025 09:41
Henrik Lindblom added 2 commits April 17, 2025 13:12
STM32 Cortex-M33, such as the L5/H5/U5 series, have a cache peripheral for
instruction and data caches, which are not present in the C-M33
architecture spec.

The driver defaults to direct mapped cache as it uses less power than the
alternative set associative mapping [1]. This has also been the default in
stm32 soc initialization code for chips that have the ICACHE peripheral,
which makes it the safest choice for backward compatibility. The exception
to the rule is STM32L5, which has the n-way cache mode selected in SOC
code.

[1]: https://en.wikipedia.org/wiki/Cache_placement_policies

Signed-off-by: Henrik Lindblom <[email protected]>
Use the Zephyr cache API in soc initialization code instead of calling the
HAL directly. The change does not modify the pre-existing cache settings,
just changes the path they are enabled.

Signed-off-by: Henrik Lindblom <[email protected]>
@lindblandro lindblandro force-pushed the stm32-cache-peripheral-driver branch 2 times, most recently from 2d83f15 to 81bde83 Compare April 17, 2025 10:45
Use cache API for disabling and enabling ICACHE. The driver handles waiting
for ongoing cache invalidation.

Signed-off-by: Henrik Lindblom <[email protected]>
@lindblandro lindblandro force-pushed the stm32-cache-peripheral-driver branch from 81bde83 to fa373e9 Compare April 17, 2025 11:15
Comment on lines +13 to +16
# "default n" for L5 is legacy - could be removed?
config CACHE_STM32_ICACHE_DIRECT_MAPPING
bool "Use 1-way associative mapping for ICACHE"
default n if SOC_SERIES_STM32L5X
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be ok to move L5 to 1-way associated mapping, but we should document the change in migration guide and mention how to revert.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's probably fine, but I'd suggest doing that in a separate PR.

@kartben kartben merged commit 1a3fb4a into zephyrproject-rtos:main Apr 25, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stm32: Implement cache driver (C-M33 based series)
7 participants