Cortex-R Floating Point Support #44753

bbolen · 2022-04-11T14:48:45Z

This PR adds support for the Cortex-R VFP unit. It can also be used to enable floating point support for Cortex-A, though that is untested. A couple of the basic ideas behind this PR:

- Kconfig options are used to select the VFP options for the Cortex-R implemented in the SoC. FPU/FPU_SHARING are still used to determine whether or not to pass the hard or soft float flags to the compiler.
- The space for the floating point registers is always allocated on the process stack during context switches, but the registers are not saved unless necessary.
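As a rough illustration of the first point, the sketch below reports which float ABI a translation unit was actually compiled for. It relies only on the ACLE predefined macros `__ARM_PCS_VFP` and `__ARM_FP` plus the compiler's `__arm__`; the function names are hypothetical and are not part of this PR.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: report the float ABI this file was built for.
 * Only compiler-predefined macros are consulted, no Zephyr symbols.
 */
static const char *demo_float_abi_name(void)
{
#if !defined(__arm__)
	/* Not an AArch32 build (for example a host-side unit test). */
	return "not-aarch32";
#elif defined(__ARM_PCS_VFP)
	/* Hard-float ABI: FP arguments are passed in VFP registers. */
	return "hard";
#elif defined(__ARM_FP)
	/* FP hardware is used, but with the soft-float calling convention. */
	return "softfp";
#else
	/* No hardware floating point is used by the compiler. */
	return "soft";
#endif
}

/* Hypothetical sanity check that the build matches an expected ABI. */
static void demo_check_float_abi(const char *expected)
{
	assert(strcmp(demo_float_abi_name(), expected) == 0);
}
```

A board port could call `demo_check_float_abi("hard")` early in a test to catch a mismatch between the Kconfig selection and the flags the compiler really received.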

I pulled (and heavily modified) a couple of patches @stephanosio pointed me to a long time ago. Let me know if you want me to change the author so that git blame won't implicate you if something goes wrong.

Closes #19979

stephanosio

Looks ok in general.

Just a few comments.

tests/kernel/fpu_sharing/generic/src/float_regs_arm_gcc.h

tests/kernel/fpu_sharing/float_disable/src/k_float_disable.c

stephanosio · 2022-04-12T14:20:17Z

tests/kernel/fpu_sharing/generic/testcase.yaml

@@ -21,21 +21,21 @@ tests:
    tags: fpu kernel
    timeout: 600
  kernel.fpu_sharing.generic.riscv32:
-    extra_args: PI_NUM_ITERATIONS=500
+    extra_args: PI_NUM_ITERATIONS=500 DISABLE_INT_TEST=1


What is the rationale behind disabling the timer testcase for these platforms?

Those particular platforms fail the test. Unfortunately, I don't know enough about these platforms to fix them myself. I updated the commit message to note this.

Can you make 48b8439a0d4123d8abf83d08e170dbcc798e68cd a separate PR? This needs a more thorough review from the maintainers of all these platforms.

p.s. ~~it also fails on ARC when testing with nSIM.~~ succeeds with PI_NUM_ITERATIONS=500.
p.s. 2. riscv32 and riscv64 succeed also with PI_NUM_ITERATIONS=500 ... anyways, let's review this in a separate PR so as to not block this.

arch/arm/core/aarch32/thread.c

arch/arm/Kconfig

povergoing · 2022-04-14T01:31:47Z

Just wonder, these are only for v7r vfp, right?

stephanosio · 2022-05-02T11:43:22Z

@julien-massot @povergoing FYI, this needs to be extended to AARCH32_ARMV8_R as well.

microbuilder · 2022-05-02T11:49:56Z

I was planning to do that while I was working on the Cortex-R stuff, but never got to do it. We should definitely do it at some point.

Perhaps during the 3.2 cycle?

tests/kernel/fpu_sharing/generic/CMakeLists.txt

tests/kernel/fpu_sharing/generic/src/float_regs_arm_gcc.h

Reuse the Cortex-M paths for testing the floating point unit. Signed-off-by: Bradley Bolen <[email protected]>

For testing, assume that the Cortex-A/R platforms are using a GIC interrupt controller. Use the last GIC SGI to trigger an interrupt for the test. Signed-off-by: Bradley Bolen <[email protected]>
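The commit message above does not show how the SGI is raised. As a minimal sketch, assuming a GICv2 distributor (GICv3 raises SGIs through a system register instead), the value written to `GICD_SGIR` can be composed as follows. The helper and macro names are hypothetical; the field positions follow the GICv2 architecture, and "last SGI" is taken to mean ID 15 because SGIs occupy interrupt IDs 0 to 15.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_GIC_NUM_SGI          16u                      /* SGI IDs 0..15 */
#define DEMO_GIC_LAST_SGI         (DEMO_GIC_NUM_SGI - 1u)

/* GICv2 GICD_SGIR fields */
#define DEMO_SGIR_FILTER_SHIFT    24   /* TargetListFilter, bits [25:24] */
#define DEMO_SGIR_CPU_LIST_SHIFT  16   /* CPUTargetList, bits [23:16] */
#define DEMO_SGIR_FILTER_LIST     0u   /* forward to CPUs in CPUTargetList */
#define DEMO_SGIR_FILTER_SELF     2u   /* forward to the requesting CPU only */

/* Hypothetical helper: build the word to write to GICD_SGIR. */
static uint32_t demo_gicd_sgir_value(uint32_t filter, uint32_t cpu_list,
				     uint32_t sgi_id)
{
	assert(filter <= 2u);               /* 3 is reserved */
	assert(cpu_list <= 0xFFu);          /* one bit per CPU interface */
	assert(sgi_id < DEMO_GIC_NUM_SGI);  /* SGIINTID is bits [3:0] */

	return (filter << DEMO_SGIR_FILTER_SHIFT) |
	       (cpu_list << DEMO_SGIR_CPU_LIST_SHIFT) |
	       sgi_id;
}
```

On hardware, the result would be stored at offset 0xF00 from the distributor base, which is SoC-specific and therefore left out here.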

This commit adds the unified floating-point configuration symbols for the ARM architectures. These configuration symbols allow specification of the floating-point coprocessors, such as VFP (also known as FP for Cortex-M) and NEON, for the ARM architectures. Signed-off-by: Stephanos Ioannidis <[email protected]>

This commit updates the Zephyr build system to support specifying advanced floating-point compilation options derived from the newly introduced unified floating-point configurations.

The following changes are introduced by this commit:

1. Specify architecture floating-point option to the `-mcpu` flag.
2. Specify floating-point unit (FPU) type using the `-mfpu` flag.

Note that the `-march` flag is not specified separately because the `-mcpu` flag provides more detailed architecture options and this makes the `-march` flag redundant.

Signed-off-by: Stephanos Ioannidis <[email protected]>

When Dual-redundant Core Lock-step (DCLS) topology is used, the VFP registers across the two redundant cores must be manually initialised and synchronised, and this requires the `-mfloat-abi=hard` option to be specified. This commit forces the use of FP "hard" ABI on the VFP-equipped cores that are configured in DCLS topology. Signed-off-by: Stephanos Ioannidis <[email protected]>

Cortex-A/R use a descending stack frame and the hardware does not help with the stacking. This led to some less than desirable workarounds in the exception code where the basic stack frame was saved twice. Rearranging the order of the exception stack frame removes that problem and provides a clearer path to saving CPU context in a fully descending manner. Signed-off-by: Bradley Bolen <[email protected]>

Grouping the FPU registers together will make adding FPU support for Cortex-A/R easier later. It provides the ability to get the sizeof and offsetof FPU registers easier. Signed-off-by: Bradley Bolen <[email protected]>
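To illustrate why grouping helps, here is a minimal sketch with a hypothetical layout. The struct and field names, the 16-register bank, and the ordering of the FPU block relative to the basic frame are all assumptions for the example, not the definitions added by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical grouped FPU frame (sized for a 16 x 64-bit register bank). */
struct demo_fpu_sf {
	uint64_t d[16];     /* D0-D15 */
	uint32_t fpscr;     /* floating-point status and control register */
	uint32_t reserved;  /* explicit padding to a multiple of 8 bytes */
};

/* Hypothetical basic frame: caller-saved core registers. */
struct demo_basic_sf {
	uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

/* Hypothetical full exception frame with the FPU block as one member. */
struct demo_esf {
	struct demo_fpu_sf fpu;
	struct demo_basic_sf basic;
};

/* With one grouped member, sizes and offsets fall out directly. */
_Static_assert(sizeof(struct demo_fpu_sf) == 136, "16*8 + 4 + 4 bytes");
_Static_assert(offsetof(struct demo_fpu_sf, fpscr) == 128, "after D0-D15");

/* Offset of the core registers, e.g. for use from assembly offsets. */
static size_t demo_basic_offset(void)
{
	return offsetof(struct demo_esf, basic);
}
```

Because the FPU registers form a single member, code that reserves or skips the FPU area needs only `sizeof(struct demo_fpu_sf)` rather than a hand-maintained byte count.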

This will enable the VFP unit on boot to handle the case where FPU_SHARING is not enabled. Signed-off-by: Bradley Bolen <[email protected]>

This commit updates the Cortex-R reset routine to initialise (synchronise) the VFP D16-D31 registers when Dual-redundant Core Lock-step (DCLS) is enabled. Signed-off-by: Stephanos Ioannidis <[email protected]>

This adds lazy floating point context switching. On svc/irq entry, the VFP is disabled and a pointer to the exception stack frame is saved away. If the esf pointer is still valid on exception exit, then no other context used the VFP so the context is still valid and nothing needs to be restored. If the esf pointer is NULL on exception exit, then some other context used the VFP and the floating point context is restored from the esf. The undefined instruction handler is responsible for saving away the floating point context if needed. If the handler is in the first irq/svc context and the current thread uses the VFP, then the float context needs to be saved. Also, if the handler is in a nested context and the previous context was using the VFP, save the float context. Signed-off-by: Bradley Bolen <[email protected]>
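The scheme described in this commit message can be modelled on a host machine. The following is a simplified simulation under stated assumptions: a single exception level (nesting is omitted), a software flag standing in for the VFP enable bit, and an explicit call standing in for the undefined instruction trap. All names are hypothetical and none of this is the code from the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NREGS 16

struct demo_esf {
	uint64_t d[DEMO_NREGS];   /* space is always reserved in the frame */
};

struct demo_cpu {
	uint64_t d[DEMO_NREGS];   /* live VFP register bank */
	bool vfp_enabled;
	struct demo_esf *fp_esf;  /* non-NULL: interrupted context still owns d[] */
	unsigned int saves;
	unsigned int restores;
};

/* Exception entry: disable the VFP and remember where a save would go. */
static void demo_exc_entry(struct demo_cpu *cpu, struct demo_esf *esf)
{
	cpu->vfp_enabled = false;
	cpu->fp_esf = esf;
}

/* Stand-in for the undefined instruction handler: save lazily, then enable. */
static void demo_undef_handler(struct demo_cpu *cpu)
{
	if (cpu->fp_esf != NULL) {
		memcpy(cpu->fp_esf->d, cpu->d, sizeof(cpu->d));
		cpu->saves++;
		cpu->fp_esf = NULL;  /* marks the frame as holding the context */
	}
	cpu->vfp_enabled = true;
}

/* Any VFP use while disabled "traps" into the handler first. */
static void demo_vfp_write(struct demo_cpu *cpu, unsigned int idx, uint64_t val)
{
	assert(idx < DEMO_NREGS);
	if (!cpu->vfp_enabled) {
		demo_undef_handler(cpu);
	}
	cpu->d[idx] = val;
}

/* Exception exit: restore only if the handler path touched the VFP. */
static void demo_exc_exit(struct demo_cpu *cpu, struct demo_esf *esf)
{
	if (cpu->fp_esf == NULL) {
		memcpy(cpu->d, esf->d, sizeof(cpu->d));
		cpu->restores++;
	}
	cpu->fp_esf = NULL;       /* not meaningful outside an exception here */
	cpu->vfp_enabled = true;
}
```

In this model an interrupt that never touches the VFP costs no register copies, while one that does pays for exactly one save and one restore.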

This SoC supports vfpv3-d16 with single and double precision and 16 64-bit registers. Signed-off-by: Bradley Bolen <[email protected]>

bbolen · 2022-05-02T17:39:17Z

@stephanosio I created #45286 to review the new interleaved FPU test separately and fixed up the additional issues you pointed out. Thanks.

microbuilder · 2022-05-03T10:36:44Z

Approved too soon, there are some compliance check issues, but should be easy to address.

stephanosio · 2022-05-03T10:55:40Z

> there are some compliance check issues

This warning can be ignored since typedef usage here is justified.

-:1022: WARNING:NEW_TYPEDEFS: do not add new typedefs
#1022: FILE: arch/arm/include/kernel_arch_data.h:48:
+typedef struct __fpu_sf _fpu_sf_t;

povergoing · 2022-05-05T01:50:31Z

> @julien-massot @povergoing FYI, this needs to be extended to AARCH32_ARMV8_R as well.

Yes, agreed. @SgrrZhf may take a look.

bbolen requested review from dcpleung, andyross, nashif, MaureenHelm, galak, ioannisg, carlescufi, tejlmand and stephanosio as code owners April 11, 2022 14:48

github-actions bot added the area: API, area: ARM, area: Build System, area: Kernel, and area: Tests labels Apr 11, 2022

stephanosio requested review from microbuilder and povergoing April 11, 2022 17:30

bbolen force-pushed the cortex_r_fpu2 branch from a58ff43 to 194fcf4 Compare April 11, 2022 18:55

zephyrbot requested review from carlocaione, ceolin and peter-mitsis April 12, 2022 12:11

zephyrbot assigned microbuilder Apr 12, 2022

peter-mitsis previously approved these changes Apr 12, 2022

stephanosio requested a review from ibirnbaum April 12, 2022 14:09

microbuilder assigned povergoing Apr 12, 2022

stephanosio requested changes Apr 12, 2022

bbolen dismissed peter-mitsis’s stale review via ff42d22 April 13, 2022 01:13

bbolen force-pushed the cortex_r_fpu2 branch 2 times, most recently from ff42d22 to 20c0bbf Compare April 13, 2022 01:21

povergoing reviewed Apr 14, 2022

stephanosio requested changes May 2, 2022

stephanosio added the Release Notes label May 2, 2022

Bradley Bolen and others added 11 commits May 2, 2022 10:35

tests: fpu_sharing: Enable support for Cortex-R

2bf7680

Reuse the Cortex-M paths for testing the floating point unit. Signed-off-by: Bradley Bolen <[email protected]>

tests: fpu_sharing: Support FPU disable test for Cortex-A/R

2ab7c39

For testing, assume that the Cortex-A/R platforms are using a GIC interrupt controller. Use the last GIC SGI to trigger an interrupt for the test. Signed-off-by: Bradley Bolen <[email protected]>

arch: arm: aarch32: Create a fpu stack frame

4271eb5

Grouping the FPU registers together will make adding FPU support for Cortex-A/R easier later. It provides the ability to get the sizeof and offsetof FPU registers easier. Signed-off-by: Bradley Bolen <[email protected]>

arch: arm: aarch32: Create z_arm_floating_point_init() for Cortex-R

eefb847

This will enable the VFP unit on boot to handle the case where FPU_SHARING is not enabled. Signed-off-by: Bradley Bolen <[email protected]>

arch: arm: cortex_r: Initialise VFP D32 registers for DCLS

42c6972

This commit updates the Cortex-R reset routine to initialise (synchronise) the VFP D16-D31 registers when Dual-redundant Core Lock-step (DCLS) is enabled. Signed-off-by: Stephanos Ioannidis <[email protected]>

soc: arm: xilinx_zynqmp: Enable the VFP

54c81c2

This SoC supports vfpv3-d16 with single and double precision and 16 64-bit registers. Signed-off-by: Bradley Bolen <[email protected]>

bbolen dismissed stale reviews from ibirnbaum and povergoing via 54c81c2 May 2, 2022 14:39

bbolen force-pushed the cortex_r_fpu2 branch from 6422080 to 54c81c2 Compare May 2, 2022 14:39

bbolen mentioned this pull request May 2, 2022

tests: fpu_sharing: Add interleaved isr/thread test #45286

Closed

stephanosio approved these changes May 2, 2022

stephanosio requested review from povergoing and ibirnbaum May 2, 2022 17:40

microbuilder approved these changes May 3, 2022

povergoing approved these changes May 5, 2022

stephanosio merged commit dfc4c3f into zephyrproject-rtos:main May 5, 2022

bbolen deleted the cortex_r_fpu2 branch May 6, 2022 12:20
