-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Bounds checking should be eliminated in obvious case #30112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@bluss provided a reduced test case: http://is.gd/0vOZs0 |
I'm pretty sure this is an LLVM/backend issue. LLVM's bounds-check removal analysis is really brittle. It's possible there's better flags we could be passing it to try to eliminate bounds checks. I seem to recall @huonw mentioning one once? |
This kind of thing (playground) elides the bounds check for pub fn split_at(input: &[u8]) -> &[u8] {
for i in 0..input.len() {
if input[i] == b' ' {
return &input[..i]
}
}
&[]
} |
LLVM removes the first bounds check, because that is the same as the loop condition, i.e. |
@dotdash That's unfortunate. Do you know if that is a reason for the original example too? |
What could be the reason for llvm to keep bound checks when using nom but not when using chomp? |
@Geal could you provide the IR and the asm code for that function? I know the asm is in those pdfs, but I have a hard time reading that. You can get them with |
@dotdash here: https://gist.github.com/Geal/e7ebe638862828420863 It might be slightly different, since I built with a more recent rustc (1.6.0-dev (4867df4 2015-11-29) ), but the performance is the same. |
IRCE: #22987 http://llvm.org/docs/Frontend/PerformanceTips.html also says:
|
@Geal the remaining checks are the ones that check that start is less than or equal to end for the range. Due to the the ranges including the end, that makes the actual bounds check go away. What I wonder is why there is a check that compares against 2 here: https://gist.github.com/Geal/e7ebe638862828420863#file-nom_http-ll-L216 I can't seem to figure out to which part of the code that belongs. Any idea? |
Triage: rustc 1.13.0-nightly (4f9812a 2016-09-21)
#![crate_type="lib"]
pub fn split_to(input: &[u8]) -> &[u8] {
match input.iter().position(|x| *x == b' ') {
Some(i) => &input[..i],
None => &[],
}
}
pub fn split_from(input: &[u8]) -> &[u8] {
match input.iter().position(|x| *x == b' ') {
Some(i) => &input[i..],
None => &[],
}
}
pub fn find(input: &[u8]) -> Option<&u8> {
match input.iter().position(|x| *x == b' ') {
Some(i) => Some(&input[i]),
None => None,
}
}
|
The optimizer likes the indices best at this point. No bounds checks here (beta & nightly) pub fn split_to(input: &[u8]) -> &[u8] {
match (0..input.len()).find(|&i| input[i] == b' ') {
Some(i) => &input[..i],
None => &[],
}
} Optimization is simple: The range ensures |
Triage: I'm not great with asm, but https://godbolt.org/z/wj6kYl It appears that |
https://godbolt.org/z/Wb46fYsPK If I understand it correctly - after I altered the default order of LLVM passes and inserted IRCE there (I also added additional LICM and unswitch if I remember correctly - I just followed Pass Ordering tips.) it managed to "clone" the loop and have a fast loop where it doesn't poll for bounds checks 😏 |
Looks to me like all the bounds checks in the original examples are gone now: https://godbolt.org/z/fcexGna7s |
Thanks for checking! I think this can be closed then? |
Hi,
Following the benchmarks comparing the nom and chomp parser combinator libraries, I investigated the performance difference and found something interesting: in code that is nearly equivalent, rustc generates a lot more stuff with nom than chomp, and keeps bounds check where the parsers make sure no error should happen.
For the comparison, here are some code examples, in nom:
in chomp:
(we could change
take_while1!(not_line_ending)
to atake_till!(is_end_of_line)
and it would achieve the same code).Once the macros are processed, it gives the following code in nom: https://gist.github.com/Geal/fa3740cf45530d123023
chomp uses the same approach, with iterators, in its version of
take_while1
and take_till`: https://github.com/m4rw3r/chomp/blob/master/src/parsers.rs#L208-L253Now, the interesting thing is the assembly generated by rustc ( 1.5.0-dev (ea2dabf 2015-10-21), but the version from yesterday has the same issues). Here is the nom version: http://dev.unhandledexpression.com/nom_http.pdf
And the chomp version: http://dev.unhandledexpression.com/chomp_http.pdf
We can see that nom's code is a lot more complex:
Err
destructor (it is expected, but I'd like to improve that as well)I would like to know if there is a way to improve code generation. If the issue is in rustc, I can provide as many test cases as you need. If it is in nom, I'm open to any ideas ;)
The text was updated successfully, but these errors were encountered: