-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Support calling functions with SIMD vectors that couldn't be used in the caller #132865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I found this issue via the LKML discussion, and I wanted to add a word or two of caution. I think you've covered two bases:
But the actual environments that software runs in are tricky and nasty, and register state management is especially tricky. Some machines (fortunately not x86, or at least not any x86 that Intel or AMD will admit to) are heterogenous, and threads can context switch to and from CPUs with different capabilities. And the register state themselves is generally managed by a horrible mechanism that needs to be babysat. In the context of Linux, this means that one needs permission to use various register sets, including x87 and SSE. On older systems, this may have meant twiddling the CR0.TS bit (although this could be sort of automatic in some contexts, and I admit I don't fully recall exactly how Linux did this, but I was the one who did much of the work to delete that code...). On current systems, in general, the contents of XMM registers may belong to userspace because XSAVE and friends are expensive and are done only as needed, and writing to XMM registers at the wrong time is egregiously unsound and, worse, won't even reliably affect the kernel -- it will just leak information to user code and may crash user code! So what the system actually wants is (psuedocode, names don't match Linux, etc):
and this is awful because it's very error-prone. I don't have a great solution for Rust. Maybe the act of crossing a target feature boundary (depending on direction?) should be unsafe and also be implementable in a way that has some kind of explicit barrier so that instructions and register allocation don't leak across the boundary? I can imagine calling a (safe) function that does the transition and calls a closure:
and somehow has the right type to make it sound. Or I can imagine Rust helping out less and merely enabling code to work in approximately the same unpleasant way that C works. But please be very cautious with any sort of analysis along the lines of "I can prove that this code is going to touch XMM regs, and therefore it's okay to touch XMM regs now." FWIW, this isn't necessarily restricted to kernel code. User code that does lightweight threading can switch GPRs right away and defer switching FPU regs until it feels like it. (And, in fact, I think x86's CR0.TS mechanism dates back to the 80286/287 era and was used when switching "tasks".) |
Thanks for raising these points! You will be happy to hear that both of these cases have been raised in discussions before, but I did not include an essay of everything that was ever said about target features in the issue description. :) However, I think they are unrelated to this issue. This issue is very specifically about what it says in the OP, whereas both points you raised are much wider discussions affecting not just the specific ability to invoke
I think the general consensus is that these are cursed environments, and hardly any system actually supports that. Rust currently only supports platforms where all CPUs present the same capabilities. Maybe smaller cores can use slower emulations for large vector types or so, but a target feature is only considered available if it is available on all cores that the process may be scheduled on. If you are playing funny games with heterogeneous systems, it is your responsibility to ensure this remains true.
The issues of enabling and disabling target features, e.g. inside the Linux kernel, has been discussed elsewhere and is orthogonal to this one. In particular, by stabilizing #134090 in its current form, we have decided that Rust will by default assume that a target feature, once available, will always remain available. Using target features the way the kernel does requires extra care -- basically, |
We now lint and will eventually error on this program:
The lint is necessary because the way we codegen this function would be unsound (and indeed, if you run this on the playground you can see that the argument value gets corrupted). See #116558 for more context.
However, there's no fundamental reason that we couldn't compile this code! We "just" need to generate the call to
with_target_feature
using its proper ABI, i.e., using the AVX registers. This is sound because the function anyway requires that target feature, so the caller must have already ensured that this target feature is available.The problem is that LLVM currently simply has no way to express such a call. So we have three options:
call
operations llvm/llvm-project#70563) -- I am told this is quite hardavx
feature gate, and calls the actual callee -- not a pretty solution since the extra function call is bad for performance, and performance is the reason people manually write SIMD code to begin withLucky enough, this only affects non-Rust ABIs, so users should only rarely run into this.
Cc @rust-lang/wg-llvm @rust-lang/opsem @chorman0773 @veluca93
The text was updated successfully, but these errors were encountered: