Fix has_cpuid implementation #492
Changes from 7 commits
7f12738
a374f60
f87c949
5760ad3
b12aef6
277a49a
3605ad3
c70093b
6ee5955
9f9427e
1831237
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -78,34 +78,53 @@ pub unsafe fn __cpuid(leaf: u32) -> CpuidResult { | |
} | ||
|
||
/// Does the host support the `cpuid` instruction? | ||
#[inline] | ||
#[inline(never)] | ||
pub fn has_cpuid() -> bool { | ||
#[cfg(target_arch = "x86_64")] | ||
{ | ||
true | ||
} | ||
#[cfg(target_arch = "x86")] | ||
{ | ||
use coresimd::x86::{__readeflags, __writeeflags}; | ||
|
||
// On `x86` the `cpuid` instruction is not always available. | ||
// This follows the approach indicated in: | ||
// http://wiki.osdev.org/CPUID#Checking_CPUID_availability | ||
unsafe { | ||
// Read EFLAGS: | ||
let eflags: u32 = __readeflags(); | ||
|
||
// Invert the ID bit in EFLAGS: | ||
let eflags_mod: u32 = eflags | 0x0020_0000; | ||
|
||
// Store the modified EFLAGS (ID bit may or may not be inverted) | ||
__writeeflags(eflags_mod); | ||
|
||
// Read EFLAGS again: | ||
let eflags_after: u32 = __readeflags(); | ||
|
||
// Check if the ID bit changed: | ||
eflags_after != eflags | ||
// On `x86` the `cpuid` instruction is not always available. | ||
// This follows the approach indicated in: | ||
// http://wiki.osdev.org/CPUID#Checking_CPUID_availability | ||
// https://software.intel.com/en-us/articles/using-cpuid-to-detect-the-presence-of-sse-41-and-sse-42-instruction-sets/ | ||
// which detects whether `cpuid` is available by checking whether the 21st bit of the EFLAGS register is modifiable or not. | ||
// If it is, then `cpuid` is available. | ||
let result: u32; | ||
let _temp: u32; | ||
unsafe { | ||
asm!(r#" | ||
# Read eflags into $0 and copy into $1: | ||
pushfd | ||
pop $0 | ||
mov $1, $0 | ||
# Flip 21st bit: | ||
xor $0, 0x200000 | ||
# Set eflags: | ||
push $0 | ||
popfd | ||
# Read eflags again, if cpuid is available | ||
# the 21st bit will be flipped, otherwise it | ||
# it will have the same value as the original in $1: | ||
pushfd | ||
pop $0 | ||
# Xor'ing with the original eflags should have the | ||
# 21st bit set to true if cpuid is available and zero | ||
# otherwise. All other bits have not been modified and | ||
# are zero: | ||
xor $0, $1 | ||
# Store in $0 the value of the 21st bit | ||
shr $0, 21 | ||
I know this was in @stevecheckoway's version too, but I wonder why keep shifting if you're going to check for non-zero anyway?

Honestly, the real reason that I left the shift is because I didn't want to read this table in sufficient detail to determine if there were any important interactions between the

There's actually a good reason to do the shift (or use

Here's an example program demonstrating this:

#![feature(asm)]
fn main() {
let eflags1: u32;
let eflags2: u32;
unsafe {
asm!(r#"
pushfd
mov $0, dword ptr [esp]
popfd
pushfd
pop $1
"#
: "=r"(eflags1), "=r"(eflags2)
:
: "cc", "memory"
: "intel");
}
println!("{:x}", eflags1);
println!("{:x}", eflags2);
}

Running it outside the debugger gives
so you can see that the parity flag is changing between runs for some reason, but the two attempts reading the flags are consistent. Now let's do it with a debugger (edited to remove the unhelpful output)

So as you can see, the other flags really can change. Now I don't have access to an old-enough Intel processor that

Great point. If the other bits could change, it would surely be safer to do the bitwise and?

Actually, I'm really tempted to be minimally smart here. It seems best to just stick to the exact sequence of instructions Intel suggests in the document linked in the comment, they gotta know best, right?

I don’t think there’s any danger of Intel adding new flags to eflags and all of bits 22 through 31 (63 in rflags) are reserved and have value zero. That said, not doing a shift but instead masking result and testing for nonzero in Rust code and not in assembly should compile to a

Using Intel’s code is fine too. It’s just doing some unnecessary work

Looking at the 32-bit toolchains supported by rustup, only one potentially doesn't have the

I think

LLVM supports more targets than the ones rustc supports and these can be targeted using

I think it is a good idea to add optimizations of this feature for targets that are known to have the cpuid function. The easiest one might be to return

EDIT: I've filed #497, PRs welcome ;) |
||
"# | ||
: "=r"(result), "=r"(_temp) | ||
: | ||
: "cc", "memory" | ||
: "intel"); | ||
} | ||
result != 0 | ||
} | ||
} | ||
} | ||
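The review thread above turns on two details: the ID bit (bit 21 of EFLAGS) has to be flipped rather than set, and other flags can change between the two reads, so the comparison should look only at bit 21. A minimal sketch of that bit logic in plain Rust, with no inline assembly. The names `ID_BIT`, `flip_id_bit`, and `id_bit_changed` are hypothetical and not part of the PR:

```rust
/// Mask for the ID flag, bit 21 of EFLAGS (hypothetical name, not from the PR).
const ID_BIT: u32 = 1 << 21;

/// Flip the ID bit with XOR. OR-ing (`eflags | ID_BIT`) would leave the
/// value unchanged whenever the bit is already set.
fn flip_id_bit(eflags: u32) -> u32 {
    eflags ^ ID_BIT
}

/// Compare two EFLAGS snapshots, looking only at the ID bit so that
/// unrelated flags (for example the parity flag) cannot affect the result.
fn id_bit_changed(before: u32, after: u32) -> bool {
    (before ^ after) & ID_BIT != 0
}
```

With these helpers, `id_bit_changed(0x202, 0x206)` is `false` even though the parity flag (bit 2) differs between the two snapshots, whereas a plain `before != after` comparison would report a change.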
|
@@ -138,17 +157,8 @@ mod tests { | |
assert!(cpuid::has_cpuid()); | ||
} | ||
|
||
#[cfg(target_arch = "x86")] | ||
#[test] | ||
fn test_has_cpuid() { | ||
unsafe { | ||
let before = __readeflags(); | ||
|
||
if cpuid::has_cpuid() { | ||
assert!(before != __readeflags()); | ||
} else { | ||
assert!(before == __readeflags()); | ||
} | ||
} | ||
fn test_has_cpuid_idempotent() { | ||
assert_eq!(cpuid::has_cpuid(), cpuid::has_cpuid()); | ||
} | ||
} |
Why?
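Returning to the earlier thread on `shr $0, 21`: it suggests skipping the runtime probe on targets that are known to have `cpuid`. A hedged sketch of what such a compile-time shortcut could look like. The function name `cpuid_known_at_compile_time` is hypothetical, and treating an enabled `sse` target feature as proof of `cpuid` on 32-bit x86 is an assumption made here for illustration, not something the thread confirms:

```rust
/// Returns `true` when the compilation target alone guarantees that the
/// `cpuid` instruction exists, so no runtime EFLAGS probe would be needed.
/// Hypothetical helper, not part of the PR.
fn cpuid_known_at_compile_time() -> bool {
    cfg!(any(
        // Every x86_64 CPU has `cpuid` (mirrors the `true` branch in the diff).
        target_arch = "x86_64",
        // Assumption: a 32-bit x86 target built with SSE enabled implies a
        // CPU new enough to have `cpuid`.
        all(target_arch = "x86", target_feature = "sse")
    ))
}
```

A caller could check this first and fall back to the EFLAGS probe only when it returns `false`; because `cfg!` is evaluated at compile time, the check itself costs nothing at runtime.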