On 32-bit x86, half return ABI is incorrect when SSE is enabled but SSE2 is disabled #112890

Open
beetrees opened this issue Oct 18, 2024 · 6 comments
Labels
ABI Application Binary Interface backend:X86

Comments

@beetrees
Contributor

Consider the following IR (compiler explorer: https://godbolt.org/z/nMrcsed1a):

target triple = "i586-unknown-linux-gnu"

define half @only_sse(half) #0 {
    ret half %0
}

attributes #0 = { "target-features"="+sse,-sse2" }

define half @sse_and_sse2(half) #1 {
    ret half %0
}

attributes #1 = { "target-features"="+sse,+sse2" }

The 32-bit x86 ABI for returning half is specified as using the xmm0 register. As both only_sse and sse_and_sse2 have SSE registers available, they should both be able to use the specified ABI. However, LLVM currently only compiles sse_and_sse2 correctly, with only_sse incorrectly returning the half in eax instead.

@llvmbot
Member

llvmbot commented Oct 18, 2024

@llvm/issue-subscribers-backend-x86

@phoebewang
Contributor

We don't have instructions to load/store half to xmm registers. Similarly, we also use a GPR when x87 is not usable: https://godbolt.org/z/TE8c99aTK. The only problem is that we should emit a diagnostic for these cases; I have a proposal to verify the ABI, see #111690.

@beetrees
Contributor Author

> We don't have instructions to load/store half to xmm registers.

AFAIK, the movss instruction is available when only sse is enabled and can be used to load and store values to and from xmm0 via the stack.
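
To make the stack round trip concrete, here is a minimal Python model of the idea. It is only a sketch of the data movement, not LLVM's lowering: the function names (`half_bits`, `movss_load`, `movss_store`, `return_half_via_xmm0`) are hypothetical, the xmm register is modelled as a plain integer, and placing the half in the low 16 bits of a zeroed 4-byte little-endian slot is an assumption made for illustration.

```python
import struct


def half_bits(value: float) -> int:
    """Return the 16-bit IEEE 754 binary16 bit pattern of `value`."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits


def movss_load(slot: bytes) -> int:
    """Model `movss xmm0, [slot]`: 32 bits go to lane 0, upper lanes are zeroed.

    The 128-bit register is modelled as a Python int, so the result is
    simply the value of lane 0.
    """
    (lane0,) = struct.unpack("<I", slot)
    return lane0


def movss_store(xmm: int) -> bytes:
    """Model `movss [slot], xmm0`: write the low 32 bits of the register."""
    return struct.pack("<I", xmm & 0xFFFFFFFF)


def return_half_via_xmm0(bits: int) -> int:
    """Move a half bit pattern into and back out of a modelled xmm0."""
    # Callee side: 16-bit store into a zeroed 4-byte stack slot (assumed
    # layout), then a 32-bit movss load into xmm0.
    slot = struct.pack("<HH", bits & 0xFFFF, 0)
    xmm0 = movss_load(slot)

    # Caller side: 32-bit movss store back to the stack, then a 16-bit load.
    slot = movss_store(xmm0)
    (result,) = struct.unpack_from("<H", slot, 0)
    return result
```

For example, `return_half_via_xmm0(half_bits(1.5))` gives back the same bit pattern it was passed, since only integer bytes are copied and no floating-point conversion takes place.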

@phoebewang
Contributor

> > We don't have instructions to load/store half to xmm registers.
>
> AFAIK, the movss instruction is available when only sse is enabled and can be used to load and store values to and from xmm0 via the stack.

If you mean going through the stack for each load/store, then yes, we could do it that way, but it's not worth the complexity. SSE was designed for the float type only; even double is not supported without SSE2, let alone half. Both GCC and LLVM error out for this: https://godbolt.org/z/TE8c99aTK. Other front ends should do the same.

@RalfJung
Contributor

> Both GCC and LLVM error out for this https://godbolt.org/z/TE8c99aTK

This doesn't show any error?

@phoebewang
Contributor

> > Both GCC and LLVM error out for this https://godbolt.org/z/TE8c99aTK
>
> This doesn't show any error?

Sorry, here is the right link: https://godbolt.org/z/GYETdPaEe
