Improve inline asm support #1206

Merged: 11 commits, Nov 22, 2021
nbdd0121
Contributor

Fixes part of #1204.

  • Support register classes. This will require writing a mini register allocator to solve all register constraints.
  • Optimize
    • Overlap input and output stack slots.
    • Skip saving caller saved registers.
  • Support more architectures
    • 32-bit x86
    • riscv32/64
    • ...

@bjorn3
Member

bjorn3 commented Nov 22, 2021

Thanks a lot! I will review later today. The CI failure is due to some issues with the recent std::simd addition that I haven't fixed yet.

}

impl<'tcx> InlineAssemblyGenerator<'_, 'tcx> {
    fn allocate_registers(&mut self) {
Member

I think that, in the presence of overlapping registers, this function can fail to find a valid allocation even when one exists. I don't think fixing this is a high priority, though.

Contributor Author

This will fail in some cases, e.g. when reg_abcd is used for inputs and reg is used for inouts. But for most cases this naive allocator should suffice.
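
The failure mode discussed above can be reproduced with a small greedy allocator. This is only a sketch under stated assumptions, not the code from this PR: the register file is reduced to three hypothetical entries, and `allocate_greedy`, `REG` and `REG_ABCD` are illustrative names. Each operand simply takes the first free register of its class, so an operand with a wide class can take a register that a later, narrower class needed.

```rust
// Hypothetical, reduced register file: `REG_ABCD` is a subset of `REG`.
const REG: &[&str] = &["ax", "bx", "si"];
const REG_ABCD: &[&str] = &["ax", "bx"];

/// Greedy allocation: each operand, in order, takes the first register of
/// its class that is still free. Returns `None` if an operand finds none.
fn allocate_greedy<'a>(operands: &[&[&'a str]]) -> Option<Vec<&'a str>> {
    let mut assigned: Vec<&'a str> = Vec::new();
    for class in operands {
        // `?` bails out as soon as one operand cannot be satisfied.
        let reg = class.iter().copied().find(|r| !assigned.contains(r))?;
        assigned.push(reg);
    }
    Some(assigned)
}
```

With the operand order `[REG, REG, REG_ABCD]` the two `REG` operands take `ax` and `bx`, and the `REG_ABCD` operand fails even though a valid assignment exists. Processing the most constrained class first, `[REG_ABCD, REG, REG]`, succeeds in this example.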

}

impl<'tcx> InlineAssemblyGenerator<'_, 'tcx> {
    fn allocate_registers(&mut self) {
Member

This isn't skipping clobbers for registers that are unavailable given the target features of the current function, right? I think they need to be skipped.

Contributor Author

This respects sess.target_features but not function-local #[target_feature] attributes.
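
One way to picture the distinction is a clobber filter that consults both feature sets. The sketch below is an assumption-laden illustration, not the PR's implementation: `required_feature` and `usable_clobbers` are hypothetical helpers, and the feature table only covers a few x86 vector register families.

```rust
use std::collections::HashSet;

/// Feature a register needs before it can be saved and restored
/// (illustrative table; general-purpose registers need none).
fn required_feature(reg: &str) -> Option<&'static str> {
    if reg.starts_with("zmm") {
        Some("avx512f")
    } else if reg.starts_with("ymm") {
        Some("avx")
    } else if reg.starts_with("xmm") {
        Some("sse")
    } else {
        None
    }
}

/// Keep only the clobbers whose required feature is enabled either
/// session-wide or by a function-local `#[target_feature]` attribute.
fn usable_clobbers<'a>(
    clobbers: &[&'a str],
    session_features: &HashSet<&str>,
    function_features: &HashSet<&str>,
) -> Vec<&'a str> {
    clobbers
        .iter()
        .copied()
        .filter(|reg| match required_feature(reg) {
            None => true,
            Some(f) => session_features.contains(f) || function_features.contains(f),
        })
        .collect()
}
```

Passing an empty `function_features` set models the behaviour described in the reply above, where only the session-wide features are taken into account.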

@nbdd0121 nbdd0121 force-pushed the master branch 2 times, most recently from d8bf21f to 450640f Compare November 22, 2021 13:40
@nbdd0121
Contributor Author

nbdd0121 commented Nov 22, 2021

Hmm, GitHub seems to be displaying the commits in the wrong order.

EDIT: It seems to be a glitch in my clock or in git that causes the wrong timestamp to be used.

@bjorn3
Member

bjorn3 commented Nov 22, 2021

Enabling assembly for compiler-builtins gives the following linker error:

          rust-lld: error: undefined symbol: eax
          >>> referenced by compiler_builtins.53ea2f17-cgu.6.o
          >>>               compiler_builtins-e4d6a99e201768d9.compiler_builtins.53ea2f17-cgu.6.rcgu.o:(_ZN17compiler_builtins3mem5impls12copy_forward17h56ed4e9e55fe506bE__inline_asm_0) in archive ~/blog_os/target/x86_64-blog_os/debug/deps/libcompiler_builtins-e4d6a99e201768d9.rlib
          >>> referenced by compiler_builtins.53ea2f17-cgu.6.o
          >>>               compiler_builtins-e4d6a99e201768d9.compiler_builtins.53ea2f17-cgu.6.rcgu.o:(_ZN17compiler_builtins3mem5impls13copy_backward17hcb6cfd93e8521d77E__inline_asm_0) in archive ~/blog_os/target/x86_64-blog_os/debug/deps/libcompiler_builtins-e4d6a99e201768d9.rlib
          
          rust-lld: error: undefined symbol: r15d
          >>> referenced by compiler_builtins.53ea2f17-cgu.6.o
          >>>               compiler_builtins-e4d6a99e201768d9.compiler_builtins.53ea2f17-cgu.6.rcgu.o:(_ZN17compiler_builtins3mem5impls9set_bytes17heceff2f71540c61aE__inline_asm_0) in archive ~/blog_os/target/x86_64-blog_os/debug/deps/libcompiler_builtins-e4d6a99e201768d9.rlib
          
          rust-lld: error: undefined symbol: r15
          >>> referenced by compiler_builtins.53ea2f17-cgu.7.o
          >>>               compiler_builtins-e4d6a99e201768d9.compiler_builtins.53ea2f17-cgu.7.rcgu.o:(_ZN17compiler_builtins3int19specialized_div_rem19u128_by_u64_div_rem17hbd5c83cbc8ac25c9E__inline_asm_0) in archive ~/blog_os/target/x86_64-blog_os/debug/deps/libcompiler_builtins-e4d6a99e201768d9.rlib

Generated assembly:

.intel_syntax noprefix

            .att_syntax
            .pushsection .text.__rust_probestack
            .globl __rust_probestack
            .type  __rust_probestack, @function
            .hidden __rust_probestack
        __rust_probestack:
            
    .cfi_startproc
    pushq  %rbp
    .cfi_adjust_cfa_offset 8
    .cfi_offset %rbp, -16
    movq   %rsp, %rbp
    .cfi_def_cfa_register %rbp

    mov    %rax,%r11        

    cmp    $0x1000,%r11
    jna    3f
2:
    sub    $0x1000,%rsp
    test   %rsp,8(%rsp)
    sub    $0x1000,%r11
    cmp    $0x1000,%r11
    ja     2b

3:
    
    
    sub    %r11,%rsp
    test   %rsp,8(%rsp)

    
    
    
    add    %rax,%rsp

    leave
    .cfi_def_cfa_register %rsp
    .cfi_adjust_cfa_offset -8
    ret
    .cfi_endproc
    
            .size __rust_probestack, . - __rust_probestack
            .popsection
            
.att_syntax

[src/inline_asm.rs:253] &class = X86(
    reg,
)
[src/inline_asm.rs:253] &class = X86(
    reg,
)
[src/inline_asm.rs:253] &class = X86(
    reg,
)
.globl _ZN17compiler_builtins3mem5impls12copy_forward17h56ed4e9e55fe506bE__inline_asm_0
.type _ZN17compiler_builtins3mem5impls12copy_forward17h56ed4e9e55fe506bE__inline_asm_0,@function
.section .text._ZN17compiler_builtins3mem5impls12copy_forward17h56ed4e9e55fe506bE__inline_asm_0,"ax",@progbits
_ZN17compiler_builtins3mem5impls12copy_forward17h56ed4e9e55fe506bE__inline_asm_0:
.intel_syntax noprefix
    push rbp
    mov rbp,rdi
    mov rax, [rbp+0x0]
    mov rcx, [rbp+0x8]
    mov rdi, [rbp+0x10]
    mov rsi, [rbp+0x18]
.att_syntax
repe movsq (%rsi), (%rdi)
mov eax, %ecx
repe movsb (%rsi), (%rdi)
.intel_syntax noprefix
    pop rbp
    ret
.att_syntax
.size _ZN17compiler_builtins3mem5impls12copy_forward17h56ed4e9e55fe506bE__inline_asm_0, .-_ZN17compiler_builtins3mem5impls12copy_forward17h56ed4e9e55fe506bE__inline_asm_0
.text


.globl _ZN17compiler_builtins3mem5impls13copy_backward17hcb6cfd93e8521d77E__inline_asm_0
.type _ZN17compiler_builtins3mem5impls13copy_backward17hcb6cfd93e8521d77E__inline_asm_0,@function
.section .text._ZN17compiler_builtins3mem5impls13copy_backward17hcb6cfd93e8521d77E__inline_asm_0,"ax",@progbits
_ZN17compiler_builtins3mem5impls13copy_backward17hcb6cfd93e8521d77E__inline_asm_0:
.intel_syntax noprefix
    push rbp
    mov rbp,rdi
    mov rax, [rbp+0x0]
    mov rcx, [rbp+0x8]
    mov rdi, [rbp+0x10]
    mov rsi, [rbp+0x18]
.att_syntax
std
repe movsq (%rsi), (%rdi)
movl eax, %ecx
addq $7, %rdi
addq $7, %rsi
repe movsb (%rsi), (%rdi)
cld
.intel_syntax noprefix
    pop rbp
    ret
.att_syntax
.size _ZN17compiler_builtins3mem5impls13copy_backward17hcb6cfd93e8521d77E__inline_asm_0, .-_ZN17compiler_builtins3mem5impls13copy_backward17hcb6cfd93e8521d77E__inline_asm_0
.text


.globl _ZN17compiler_builtins3mem5impls9set_bytes17heceff2f71540c61aE__inline_asm_0
.type _ZN17compiler_builtins3mem5impls9set_bytes17heceff2f71540c61aE__inline_asm_0,@function
.section .text._ZN17compiler_builtins3mem5impls9set_bytes17heceff2f71540c61aE__inline_asm_0,"ax",@progbits
_ZN17compiler_builtins3mem5impls9set_bytes17heceff2f71540c61aE__inline_asm_0:
.intel_syntax noprefix
    push rbp
    mov rbp,rdi
    mov [rbp+0x0], r15
    mov r15, [rbp+0x8]
    mov rcx, [rbp+0x10]
    mov rdi, [rbp+0x18]
    mov rax, [rbp+0x20]
.att_syntax
repe stosq %rax, (%rdi)
mov r15d, %ecx
repe stosb %al, (%rdi)
.intel_syntax noprefix
    mov r15, [rbp+0x0]
    pop rbp
    ret
.att_syntax
.size _ZN17compiler_builtins3mem5impls9set_bytes17heceff2f71540c61aE__inline_asm_0, .-_ZN17compiler_builtins3mem5impls9set_bytes17heceff2f71540c61aE__inline_asm_0
.text


[src/inline_asm.rs:253] &class = X86(
    reg,
)
.globl _ZN17compiler_builtins3int19specialized_div_rem19u128_by_u64_div_rem17hbd5c83cbc8ac25c9E__inline_asm_0
.type _ZN17compiler_builtins3int19specialized_div_rem19u128_by_u64_div_rem17hbd5c83cbc8ac25c9E__inline_asm_0,@function
.section .text._ZN17compiler_builtins3int19specialized_div_rem19u128_by_u64_div_rem17hbd5c83cbc8ac25c9E__inline_asm_0,"ax",@progbits
_ZN17compiler_builtins3int19specialized_div_rem19u128_by_u64_div_rem17hbd5c83cbc8ac25c9E__inline_asm_0:
.intel_syntax noprefix
    push rbp
    mov rbp,rdi
    mov [rbp+0x0], r15
    mov r15, [rbp+0x18]
    mov rax, [rbp+0x8]
    mov rdx, [rbp+0x10]
.att_syntax
div r15
.intel_syntax noprefix
    mov [rbp+0x8], rax
    mov [rbp+0x10], rdx
    mov r15, [rbp+0x0]
    pop rbp
    ret
.att_syntax
.size _ZN17compiler_builtins3int19specialized_div_rem19u128_by_u64_div_rem17hbd5c83cbc8ac25c9E__inline_asm_0, .-_ZN17compiler_builtins3int19specialized_div_rem19u128_by_u64_div_rem17hbd5c83cbc8ac25c9E__inline_asm_0
.text


{standard input}: Assembler messages:
{standard input}:13: Warning: no instruction mnemonic suffix given and no register operands; using default for `div'

@nbdd0121
Contributor Author

Ah, I think this is just a missing % prefix when generating register names in AT&T syntax.
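
In AT&T syntax a bare identifier such as `eax` is parsed as a symbol reference rather than a register, which is consistent with the `undefined symbol: eax` linker errors above. A minimal sketch of syntax-aware register formatting follows; `AsmSyntax` and `format_register` are hypothetical names chosen for illustration and are not taken from the PR.

```rust
/// Which assembler dialect the template string is written in.
#[derive(Clone, Copy, Debug, PartialEq)]
enum AsmSyntax {
    Intel,
    Att,
}

/// Render a register name for substitution into the asm template.
/// AT&T syntax requires a `%` prefix; Intel syntax uses the bare name.
fn format_register(name: &str, syntax: AsmSyntax) -> String {
    match syntax {
        AsmSyntax::Intel => name.to_string(),
        AsmSyntax::Att => format!("%{}", name),
    }
}
```

With this, the template line from the dump would be emitted as `mov %eax, %ecx` in AT&T mode and as `mov eax, ecx` in Intel mode (operand order aside).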

@bjorn3
Member

bjorn3 commented Nov 22, 2021

Trying to get blog os to work with this branch. Almost there, it seems. The only remaining problem is two inline asm blocks from different codegen units that end up with the same name:

  = note: rust-lld: error: duplicate symbol: _ZN6x86_6412instructions3tlb5flush17h23a57f961ec6dc00E__inline_asm_0
          >>> defined at bootloader.421d0cd4-cgu.0.o
          >>>            /home/bjorn/Documenten/blog_os/target/bootimage/bootloader/x86_64-bootloader/release/deps/bootloader-7ec32d838108b3d0.bootloader.421d0cd4-cgu.0.rcgu.o:(_ZN6x86_6412instructions3tlb5flush17h23a57f961ec6dc00E__inline_asm_0)
          >>> defined at bootloader.421d0cd4-cgu.1.o
          >>>            /home/bjorn/Documenten/blog_os/target/bootimage/bootloader/x86_64-bootloader/release/deps/bootloader-7ec32d838108b3d0.bootloader.421d0cd4-cgu.1.rcgu.o:(.text._ZN6x86_6412instructions3tlb5flush17h23a57f961ec6dc00E__inline_asm_0+0x0)

I think it will need to use the codegen unit name instead of the function name as the base name.
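
The suggestion can be sketched as follows. This is an illustration under assumptions, not the eventual fix: `asm_symbol_name` is a hypothetical helper, the `__inline_asm_` suffix is copied from the symbols in the log, and replacing non-alphanumeric characters with `_` is just one simple way to turn a codegen unit name into a plain identifier. It also assumes the index counts inline asm blocks per codegen unit.

```rust
/// Build the wrapper symbol name from the codegen unit name and a
/// per-unit counter, so that the same function instantiated in two
/// codegen units no longer yields the same symbol.
fn asm_symbol_name(cgu_name: &str, index: usize) -> String {
    // Replace characters such as `.` and `-` so the result is a plain identifier.
    let base: String = cgu_name
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '_' })
        .collect();
    format!("{}__inline_asm_{}", base, index)
}
```

For the two units in the error above, `bootloader.421d0cd4-cgu.0` and `bootloader.421d0cd4-cgu.1`, this produces distinct names even when both contain a copy of the same function.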

Member

@bjorn3 bjorn3 left a comment

Going to merge this now. The issues with blog os are unrelated to the changes in this PR.

@bjorn3 bjorn3 merged commit a49c6b8 into rust-lang:master Nov 22, 2021
@bjorn3 bjorn3 mentioned this pull request Nov 24, 2021
21 tasks