Skip to content

Switch lookup gets turned into a branch to a wrong address #46

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gergoerdi opened this issue May 5, 2017 · 7 comments
Closed

Switch lookup gets turned into a branch to a wrong address #46

gergoerdi opened this issue May 5, 2017 · 7 comments

Comments

@gergoerdi
Copy link
Collaborator

I have the following Rust code:

#[inline(never)]
pub fn decode(n: u8) -> bool {
    match n {
        0x0 => Some(()),
        0x1 => Some(()),
        0x2 => Some(()),
        0x3 => Some(()),
        0x6 => Some(()),
        _ => None
    }.is_some()
}

It is turned into the following LLVM IR:

; ModuleID = 'external.cgu-0.rs'
source_filename = "external.cgu-0.rs"
target datalayout = "e-p:16:16:16-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-n8"
target triple = "avr-atmel-none"

; Function Attrs: noinline nounwind readnone uwtable
define zeroext i1 @_ZN8external6decode17hac8ff9e06698feb0E(i8) unnamed_addr #0 {
start:
  %1 = icmp ult i8 %0, 7
  br i1 %1, label %switch.lookup, label %bb7

switch.lookup:                                    ; preds = %start
  %2 = sext i8 %0 to i16
  %3 = and i16 %2, -2
  %.cmp = icmp ne i16 %3, 4
  br label %bb7

bb7:                                              ; preds = %start, %switch.lookup
  %.sink = phi i1 [ %.cmp, %switch.lookup ], [ false, %start ]
  ret i1 %.sink
}

attributes #0 = { noinline nounwind readnone uwtable }

Compiling that LLVM IR, the resulting object file is, I believe, wrong.

$ avr-objdump -dr external-4c5561d448dd0019.o 

external-4c5561d448dd0019.o:     file format elf32-avr


Disassembly of section .text:

00000000 <_ZN8external6decode17hac8ff9e06698feb0E>:
   0:	87 30       	cpi	r24, 0x07	; 7
   2:	90 e0       	ldi	r25, 0x00	; 0
   4:	00 f4       	brcc	.+0      	; 0x6 <_ZN8external6decode17hac8ff9e06698feb0E+0x6>
			4: R_AVR_7_PCREL	.text+0x1e
   6:	28 2f       	mov	r18, r24
   8:	38 2f       	mov	r19, r24
   a:	33 0f       	add	r19, r19
   c:	33 0b       	sbc	r19, r19
   e:	2e 7f       	andi	r18, 0xFE	; 254
  10:	44 e0       	ldi	r20, 0x04	; 4
  12:	50 e0       	ldi	r21, 0x00	; 0
  14:	91 e0       	ldi	r25, 0x01	; 1
  16:	24 17       	cp	r18, r20
  18:	35 07       	cpc	r19, r21
  1a:	01 f4       	brne	.+0      	; 0x1c <_ZN8external6decode17hac8ff9e06698feb0E+0x1c>
			1a: R_AVR_7_PCREL	.text+0x26
  1c:	90 e0       	ldi	r25, 0x00	; 0

0000001e <LBB0_4>:
  1e:	91 70       	andi	r25, 0x01	; 1
  20:	89 2f       	mov	r24, r25
  22:	99 27       	eor	r25, r25
  24:	08 95       	ret

Note that the relocation for 0x1a targets 0x26, which is after the end of the function. I believe the correct target would be 0x1e.

@gergoerdi
Copy link
Collaborator Author

If I compile the same LLVM IR with -O0, I don't see any out-of-bound jumps (but the code gets too compilicated for me to understand if it's correct or not).

@gergoerdi
Copy link
Collaborator Author

I've done a bit more digging. Here's the result of instruction selection on my external::decode function:

*** MachineFunction at end of ISel ***
# Machine code for function _ZN8external6decode17hac8ff9e06698feb0E: IsSSA, TracksLiveness
Function Live Ins: %R24 in %vreg2

BB#0: derived from LLVM BB %start
    Live Ins: %R24
        %vreg2<def> = COPY %R24; GPR8:%vreg2
        %vreg4<def> = LDIRdK 0; LD8:%vreg4
        %vreg3<def> = COPY %vreg4; GPR8:%vreg3 LD8:%vreg4
        CPIRdK %vreg2, 7, %SREG<imp-def>, %SREG<imp-use>; GPR8:%vreg2
        BRSHk <BB#2>, %SREG<imp-use>
        RJMPk <BB#1>
    Successors according to CFG: BB#1(0x40000000 / 0x80000000 = 50.00%) BB#2(0x40000000 / 0x80000000 = 50.00%)

BB#1: derived from LLVM BB %switch.lookup
    Predecessors according to CFG: BB#0
        %vreg5<def> = SEXT %vreg2, %SREG<imp-def,dead>; DLDREGS:%vreg5 GPR8:%vreg2
        %vreg6<def,tied1> = ANDIWRdK %vreg5<tied0>, -2, %SREG<imp-def,dead>; DLDREGS:%vreg6,%vreg5
        %vreg7<def> = LDIWRdK 4; DLDREGS:%vreg7
        %vreg8<def> = LDIRdK 0; LD8:%vreg8
        %vreg9<def> = LDIRdK 1; LD8:%vreg9
        CPWRdRr %vreg6<kill>, %vreg7<kill>, %SREG<imp-def>; DLDREGS:%vreg6,%vreg7
        %vreg0<def> = Select8 %vreg9<kill>, %vreg8<kill>, 1, %SREG<imp-use>; GPR8:%vreg0 LD8:%vreg9,%vreg8
    Successors according to CFG: BB#2(?%)

BB#2: derived from LLVM BB %bb7
    Predecessors according to CFG: BB#0 BB#1
        %vreg1<def> = PHI %vreg3, <BB#0>, %vreg0, <BB#1>; LD8:%vreg1 GPR8:%vreg3,%vreg0
        %vreg10<def,tied1> = ANDIRdK %vreg1<tied0>, 1, %SREG<imp-def,dead>; LD8:%vreg10,%vreg1
        %vreg11<def> = ZEXT %vreg10<kill>, %SREG<imp-def,dead>; DREGS:%vreg11 LD8:%vreg10
        %R25R24<def> = COPY %vreg11; DREGS:%vreg11
        RET %R25R24<imp-use>

# End machine code for function _ZN8external6decode17hac8ff9e06698feb0E.

Shouldn't BB#1 jump to BB#2 at the end, after the Select8? It claims its successor is BB#2, but there's nothing in there to actually jump there.

Accordingly, the final machine code for BB#1 becomes empty:

********** MACHINEINSTRS **********
# Machine code for function _ZN8external6decode17hac8ff9e06698feb0E: NoPHIs, TracksLiveness
Function Live Ins: %R24 in %vreg2

0B      BB#0: derived from LLVM BB %start
            Live Ins: %R24
16B             %vreg2<def> = COPY %R24; GPR8:%vreg2
64B             CPIRdK %vreg2, 7, %SREG<imp-def>, %SREG<imp-use,kill>; GPR8:%vreg2
80B             %vreg13<def> = LDIRdK 0; LD8:%vreg13
96B             BRSHk <BB#4>, %SREG<imp-use,kill>
112B            RJMPk <BB#3>
            Successors according to CFG: BB#3(0x40000000 / 0x80000000 = 50.00%) BB#4(0x40000000 / 0x80000000 = 50.00%)

128B    BB#1: derived from LLVM BB %switch.lookup
            Predecessors according to CFG: BB#3 BB#2
            Successors according to CFG: BB#4(0x80000000 / 0x80000000 = 100.00%)

176B    BB#2: derived from LLVM BB %switch.lookup
            Predecessors according to CFG: BB#3
192B            %vreg13<def> = LDIRdK 0; LD8:%vreg13
208B            RJMPk <BB#1>
            Successors according to CFG: BB#1(?%)

224B    BB#3: derived from LLVM BB %switch.lookup
            Predecessors according to CFG: BB#0
240B            %vreg6<def> = SEXT %vreg2, %SREG<imp-def,dead>; DLDREGS:%vreg6 GPR8:%vreg2
272B            %vreg6<def,tied1> = ANDIWRdK %vreg6<tied0>, -2, %SREG<imp-def,dead>; DLDREGS:%vreg6
288B            %vreg7<def> = LDIWRdK 4; DLDREGS:%vreg7
304B            %vreg13<def> = LDIRdK 1; LD8:%vreg13
320B            CPWRdRr %vreg6, %vreg7, %SREG<imp-def>; DLDREGS:%vreg6,%vreg7
352B            BRNEk <BB#1>, %SREG<imp-use,kill>
368B            RJMPk <BB#2>
            Successors according to CFG: BB#2(?%) BB#1(?%)

384B    BB#4: derived from LLVM BB %bb7
            Predecessors according to CFG: BB#0 BB#1
432B            %vreg13<def,tied1> = ANDIRdK %vreg13<tied0>, 1, %SREG<imp-def,dead>; LD8:%vreg13
448B            %vreg11<def> = ZEXT %vreg13, %SREG<imp-def,dead>; DREGS:%vreg11 LD8:%vreg13
464B            %R25R24<def> = COPY %vreg11; DREGS:%vreg11
480B            RET %R25R24<imp-use>

# End machine code for function _ZN8external6decode17hac8ff9e06698feb0E.

@gergoerdi
Copy link
Collaborator Author

So let's focus on BB#1. It starts out life looking benign enough:

switch.lookup:                                    ; preds = %start
  %2 = sext i8 %0 to i16
  %3 = and i16 %2, -2
  %.cmp = icmp ne i16 %3, 4
  br label %bb7

The initial SelectionDAG already doesn't seem to have any continuation jump!

Initial selection DAG: BB#1 '_ZN8external6decode17hac8ff9e06698feb0E:switch.lookup'
SelectionDAG has 13 nodes:
  t0: ch = EntryToken
  t3: ch = ValueType:i8
            t2: i8,ch = CopyFromReg t0, Register:i8 %vreg2
          t4: i16 = sign_extend t2
        t6: i16 = and t4, Constant:i16<-2>
      t9: i1 = setcc t6, Constant:i16<4>, setne:ch
    t10: i8 = any_extend t9
  t12: ch = CopyToReg t0, Register:i8 %vreg0, t10

If I pass -O0 to llc, I can see that the basic block that corresponds to the same switch.lookup has a final continuation jump:

# !!!!!!! with -O0 !!!!!!!!
Initial selection DAG: BB#1 '_ZN8external6decode17hac8ff9e06698feb0E:switch.lookup'
SelectionDAG has 14 nodes:
  t0: ch = EntryToken
              t2: i8,ch = CopyFromReg t0, Register:i8 %vreg3
            t3: i16 = sign_extend t2
          t5: i16 = and t3, Constant:i16<-2>
        t8: i1 = setcc t5, Constant:i16<4>, setne:ch
      t9: i8 = any_extend t8
    t11: ch = CopyToReg t0, Register:i8 %vreg0, t9
  t13: ch = br t11, BasicBlock:ch<bb7 0x434b1a8>

@gergoerdi
Copy link
Collaborator Author

gergoerdi commented May 6, 2017

Nah, that's not it -- playing around with feeding the same IR to the X86 backend clearly shows that there's some kind of implicit fallthrough to the successor BB, so not having a final jump should be OK.

But then that means that maybe the bug is in how we handle these implicit continuations?

@gergoerdi
Copy link
Collaborator Author

I think I have found a workaround by changing AVRTargetLowering::EmitInstrWithCustomInserter to add a final jump to trueMBB:

from

  // Transfer remaining instructions and all successors of the current
  // block to the block which will contain the Phi node for the
  // select.
  trueMBB->splice(trueMBB->begin(), MBB,
                  std::next(MachineBasicBlock::iterator(MI)), MBB->end());
  trueMBB->transferSuccessorsAndUpdatePHIs(MBB);

to

  // Transfer remaining instructions and all successors of the current
  // block to the block which will contain the Phi node for the
  // select.
  trueMBB->splice(trueMBB->begin(), MBB,
                  std::next(MachineBasicBlock::iterator(MI)), MBB->end());
  BuildMI(trueMBB, dl, TII.get(AVR::RJMPk)).addMBB(*MBB->succ_begin());
  trueMBB->transferSuccessorsAndUpdatePHIs(MBB);

Since I have no idea what I'm doing, I've posted a question about this to StackOverflow to find out if there's a more principled way to handle these successors-without-jumps: http://stackoverflow.com/q/43816706/477476

@gergoerdi
Copy link
Collaborator Author

With the change in #46 (comment), my stripped-down libcore no longer compiles; I assume because some MBB has no successors... hopefully it's just a matter of writing it a bit more defensively.

@gergoerdi
Copy link
Collaborator Author

This is entirely subsumed by #49 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant