Skip to content

Only add begin/end region primitives for [Ctrywith] when stack allocation is enabled #812

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

azewierzejew
Copy link
Contributor

@azewierzejew azewierzejew commented Sep 7, 2022

For each try with block the stack for local allocations is stored at the beginning of the block and in the beginning of the handler the stack pointer is restored.
That is because the exception can be raised at a point where there are some additional allocation compared to the stored pointer.

The current "selectgen.ml" always inserts those Ibeginregion and Iendregion but they aren't necessary if the stack allocation is disabled.

With this change the operations will only be inserted if the stack allocation is enabled.

For amd64, flambda middle end and upstream register allocator.
The code

let foo x = try bar x with _ -> x

produces following assembly on the main branch with or without stack allocation and on fixed branch with stack allocation only:

0000000000000010 <camlTest__foo_14>:
  10:	48 83 ec 18          	sub    rsp,0x18
  14:	48 89 44 24 08       	mov    QWORD PTR [rsp+0x8],rax
  19:	49 8b 9e 88 00 00 00 	mov    rbx,QWORD PTR [r14+0x88]
  20:	48 89 1c 24          	mov    QWORD PTR [rsp],rbx
  24:	4c 8d 1d 1d 00 00 00 	lea    r11,[rip+0x1d]        # 48 <camlTest__foo_14+0x38>
  2b:	41 53                	push   r11
  2d:	41 ff 76 10          	push   QWORD PTR [r14+0x10]
  31:	49 89 66 10          	mov    QWORD PTR [r14+0x10],rsp
  35:	e8 00 00 00 00       	call   3a <camlTest__foo_14+0x2a>
			36: R_X86_64_PLT32	camlTest__bar_8-0x4
  3a:	41 8f 46 10          	pop    QWORD PTR [r14+0x10]
  3e:	48 83 c4 08          	add    rsp,0x8
  42:	48 83 c4 18          	add    rsp,0x18
  46:	c3                   	ret    
  47:	90                   	nop
  48:	48 8b 04 24          	mov    rax,QWORD PTR [rsp]
  4c:	49 89 86 88 00 00 00 	mov    QWORD PTR [r14+0x88],rax
  53:	48 8b 44 24 08       	mov    rax,QWORD PTR [rsp+0x8]
  58:	48 83 c4 18          	add    rsp,0x18
  5c:	c3                   	ret    
  5d:	0f 1f 00             	nop    DWORD PTR [rax]

The fix branch with stack allocation disabled produces following assembly:

0000000000000010 <camlTest__foo_14>:
  10:	48 83 ec 08          	sub    rsp,0x8
  14:	48 89 04 24          	mov    QWORD PTR [rsp],rax
  18:	4c 8d 1d 1d 00 00 00 	lea    r11,[rip+0x1d]        # 3c <camlTest__foo_14+0x2c>
  1f:	41 53                	push   r11
  21:	41 ff 76 10          	push   QWORD PTR [r14+0x10]
  25:	49 89 66 10          	mov    QWORD PTR [r14+0x10],rsp
  29:	e8 00 00 00 00       	call   2e <camlTest__foo_14+0x1e>
			2a: R_X86_64_PLT32	camlTest__bar_8-0x4
  2e:	41 8f 46 10          	pop    QWORD PTR [r14+0x10]
  32:	48 83 c4 08          	add    rsp,0x8
  36:	48 83 c4 08          	add    rsp,0x8
  3a:	c3                   	ret    
  3b:	90                   	nop
  3c:	48 8b 04 24          	mov    rax,QWORD PTR [rsp]
  40:	48 83 c4 08          	add    rsp,0x8
  44:	c3                   	ret    
  45:	90                   	nop
  46:	66 2e 0f 1f 84 00 00 	nop    WORD PTR cs:[rax+rax*1+0x0]
  4d:	00 00 00 

The new code doesn't have moves at offsets 19 and 20 which fetch the local allocation stack pointer.
It also doesn't have moves at offsets 48 and 4cwhich restores the pointer in the handler.

Apart from that the code remains the same.

@mshinwell mshinwell changed the title Backend fix regions for [Ctrywith]. Backend fix regions for [Ctrywith] Sep 7, 2022
@mshinwell mshinwell changed the title Backend fix regions for [Ctrywith] Only add begin/end region primitives for [Ctrywith] when stack allocation is enabled Sep 7, 2022
Copy link
Contributor

@xclerc xclerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that part of the backend
very well, but the change looks correct.

@gretay-js
Copy link
Contributor

@azewierzejew can you make the same change in ocaml/asmcomp directory please?

@mshinwell
Copy link
Collaborator

#809 contains the analogous logic (and more) for Flambda 2.

@gretay-js gretay-js merged commit 6b1aebc into ocaml-flambda:main Sep 7, 2022
@azewierzejew azewierzejew deleted the backend_fix_trywith_regions branch September 9, 2022 09:54
mshinwell added a commit to mshinwell/flambda-backend that referenced this pull request Oct 24, 2022
25188da flambda-backend: Missed comment from PR802 (ocaml-flambda#887)
9469765 flambda-backend: Improve the semantics of asynchronous exceptions (new simpler version) (ocaml-flambda#802)
d9e4dd0 flambda-backend: Fix `make runtest` on NixOS (ocaml-flambda#874)
4bbde7a flambda-backend: Simpler symbols (ocaml-flambda#753)
ef37262 flambda-backend: Add opaqueness to Obj.magic under Flambda 2 (ocaml-flambda#862)
a9616e9 flambda-backend: Add build system hooks for ocaml-jst (ocaml-flambda#869)
045ef67 flambda-backend: Allow the compiler to build with stock Dune (ocaml-flambda#868)
3cac5be flambda-backend: Simplify Makefile logic for natdynlinkops (ocaml-flambda#866)
c5b12bf flambda-backend: Remove unnecessary install lines (ocaml-flambda#860)
ff12bbe flambda-backend: Fix unused variable warning in st_stubs.c (ocaml-flambda#861)
c84976c flambda-backend: Static check for noalloc: attributes (ocaml-flambda#825)
ca56052 flambda-backend: Build system refactoring for ocaml-jst (ocaml-flambda#857)
39eb7f9 flambda-backend: Remove integer comparison involving nonconstant polymorphic variants (ocaml-flambda#854)
c102688 flambda-backend: Fix soundness bug by using currying information from typing (ocaml-flambda#850)
6a96b61 flambda-backend: Add a primitive to enable/disable the tick thread (ocaml-flambda#852)
f64370b flambda-backend: Make Obj.dup use a new primitive, %obj_dup (ocaml-flambda#843)
9b78eb2 flambda-backend: Add test for ocaml-flambda#820 (include functor soundness bug) (ocaml-flambda#841)
8f24346 flambda-backend: Add `-dtimings-precision` flag (ocaml-flambda#833)
65c2f22 flambda-backend: Add test for ocaml-flambda#829 (ocaml-flambda#831)
7b27a49 flambda-backend: Follow-up PR#829 (comballoc fixes for locals) (ocaml-flambda#830)
ad7ec10 flambda-backend: Use a custom condition variable implementation (ocaml-flambda#787)
3ee650c flambda-backend: Fix soundness bug in include functor (ocaml-flambda#820)
2f57378 flambda-backend: Static check noalloc (ocaml-flambda#778)
aaad625 flambda-backend: Emit begin/end region only when stack allocation is enabled (ocaml-flambda#812)
17c7173 flambda-backend: Fix .cmt for included signatures (ocaml-flambda#803)
e119669 flambda-backend: Increase delays in tests/lib-threads/beat.ml (ocaml-flambda#800)
ccc356d flambda-backend: Prevent dynamic loading of the same .cmxs twice in private mode, etc. (ocaml-flambda#784)
14eb572 flambda-backend: Make local extension point equivalent to local_ expression (ocaml-flambda#790)
487d11b flambda-backend: Fix tast_iterator and tast_mapper for include functor. (ocaml-flambda#795)
a50a818 flambda-backend: Reduce closure allocation in List (ocaml-flambda#792)
96c9c60 flambda-backend: Merge ocaml-jst
a775b88 flambda-backend: Fix ocaml/otherlibs/unix 32-bit build (ocaml-flambda#767)
f7c2679 flambda-backend: Create object files internally to avoid invoking GAS (ocaml-flambda#757)
c7a46bb flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml-flambda#756)
b337cb6 flambda-backend: Fix build_upstream for PR749 (ocaml-flambda#750)
8e7e81c flambda-backend: Differentiate is_int primitive between generic and variant-only versions (ocaml-flambda#749)

git-subtree-dir: ocaml
git-subtree-split: 25188da
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants