Description
Bugzilla Link | 478 |
Resolution | FIXED |
Resolved on | Feb 22, 2010 12:51 |
Version | 1.0 |
OS | All |
Extended Description
Consider this simple function:
int f(int a, int b) {
  return a * a + 2 * a * b + b * b;
}
We currently generate this X86 code for it:
f:
        subl $4, %esp
        movl %esi, (%esp)
        movl 8(%esp), %ecx
        movl 12(%esp), %eax
        movl %ecx, %edx
        imull %edx, %edx
        movl %eax, %esi
        imull %ecx, %esi
        addl %esi, %esi
        imull %eax, %eax
        addl %edx, %eax
        addl %esi, %eax
        movl (%esp), %esi
        addl $4, %esp
        ret
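... uh, yuck. That's 15 instructions, including a save and restore of %esi and a separate addl just to double a*b.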
GCC generates this:
f:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        movl %eax, %ecx
        imull %eax, %ecx
        imull %edx, %eax
        imull %edx, %edx
        leal (%ecx,%eax,2), %eax
        addl %edx, %eax
        ret
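Which is much nicer: 9 instructions, no callee-saved register to spill, and the doubling of a*b is folded into the leal.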
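For anyone not fluent in AT&T syntax, here is a rough register-by-register trace of the GCC sequence in C. This is only a sketch: the function name gcc_sequence and the self-check are made up for illustration, and uint32_t is used to model 32-bit wraparound (the original function takes int).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical trace of the GCC output; one C statement per instruction. */
static uint32_t gcc_sequence(uint32_t a, uint32_t b) {
    uint32_t eax = a;        /* movl 4(%esp), %eax */
    uint32_t edx = b;        /* movl 8(%esp), %edx */
    uint32_t ecx = eax;      /* movl %eax, %ecx */
    ecx *= eax;              /* imull %eax, %ecx   -> a*a */
    eax *= edx;              /* imull %edx, %eax   -> a*b */
    edx *= edx;              /* imull %edx, %edx   -> b*b */
    eax = ecx + eax * 2;     /* leal (%ecx,%eax,2), %eax -> a*a + 2*a*b */
    eax += edx;              /* addl %edx, %eax    -> + b*b */
    return eax;
}

/* Hypothetical self-check, not part of the original report. */
static void check_gcc_sequence(void) {
    assert(gcc_sequence(3u, 4u) == 49u);
    assert(gcc_sequence(0xFFFFFFFFu, 1u) == 0u); /* (-1 + 1)^2 wraps to 0 */
}
```

The leal line is where the win comes from: the scale-by-2 and one of the adds happen in a single address computation.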
It would be even better if we reassociated it into something like:
int f(int a, int b) {
  return a * (a + 2 * b) + b * b;
}
... which would need two multiplies instead of three (this is an instcombine thing).
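A quick sanity check that the two forms agree, as a sketch. The names f_original, f_reassociated and check_reassociation are made up here, and the code uses uint32_t rather than int so that overflow wraps instead of being undefined behavior in C; that is an assumption about how the 32-bit machine arithmetic should be modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: three multiplies (the 2* is a shift or an add). */
static uint32_t f_original(uint32_t a, uint32_t b) {
    return a * a + 2 * a * b + b * b;
}

/* Reassociated form: two multiplies. */
static uint32_t f_reassociated(uint32_t a, uint32_t b) {
    return a * (a + 2 * b) + b * b;
}

/* Hypothetical self-check: compare both forms on a few values,
   including ones that overflow 32 bits. */
static void check_reassociation(void) {
    const uint32_t samples[] = {0u, 1u, 2u, 3u, 4u,
                                0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; ++i)
        for (unsigned j = 0; j < n; ++j)
            assert(f_original(samples[i], samples[j]) ==
                   f_reassociated(samples[i], samples[j]));
}
```

The two agree for every input because multiplication distributes over addition modulo 2^32, so the rewrite is safe for wrapping integer arithmetic.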
The PPC backend is producing nice code:
_f:
        mullw r2, r3, r3
        mullw r3, r4, r3
        rlwinm r3, r3, 1, 0, 30
        mullw r4, r4, r4
        add r2, r4, r2
        add r3, r2, r3
        blr
The X86 backend should do so too! :)
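In case the rlwinm in the PPC listing looks odd: with those operands it is just a shift left by one, so it plays the same role as the addl %esi, %esi on X86. A small C model, as a sketch (the helper names rotl32, rlwinm_1_0_30 and check_rlwinm are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits, for 0 < n < 32. */
static uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

/* Model of "rlwinm rA, rS, 1, 0, 30": rotate left by 1, then keep
   bits 0..30 in PowerPC numbering (bit 0 is the most significant),
   which clears only the least significant bit. */
static uint32_t rlwinm_1_0_30(uint32_t x) {
    return rotl32(x, 1) & 0xFFFFFFFEu;
}

/* Hypothetical self-check: the result matches a plain shift left by 1. */
static void check_rlwinm(void) {
    assert(rlwinm_1_0_30(0x80000001u) == (uint32_t)(0x80000001u << 1));
    assert(rlwinm_1_0_30(21u) == 42u);
}
```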
-Chris