Skip to content

Improve the semantics of asynchronous exceptions (new simpler version) #802

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 12, 2022

Conversation

mshinwell
Copy link
Collaborator

Supercedes #765. This works in bytecode as well as native now.

New rules of the game:

  • Only Sys.Break or Stack_overflow may be raised from signal handlers, finalisers or memprof callbacks. (Stack_overflow is not supposed to be raised by user code.) This helps prevent exfiltration of data from these callbacks by means of asynchronous exceptions; and avoids decisions in the runtime about how to "join" more descriptive exceptions in the case where a second asynchronous exception arises during processing of another.
  • Raising any other exception from any of these callbacks will cause the toplevel uncaught exception handler to be invoked.

Exceptions will no longer be raised at allocation points; this is reflected in Mach.operation_can_raise. This now matches the semantics expected by Flambda 2.

Note that this is a breaking change: as noted in sys.mli, you will likely want to use Sys.with_async_exns after using Sys.catch_break, or else Ctrl+C interrupts will cause exceptions to be sent to the toplevel uncaught exception handler.

I still need to add some more test cases, although the one I've added plus the existing testsuite seem to cover the majority of cases.

For some reason it seemed that the existing bytecode interpreter use of sigsetjmp (for implementing exceptions) was incorrect; these calls weren't being told to save the signal mask, but the siglongjmp calls were instructing restoration of such mask. This had to be fixed for this PR to work correctly; the signal masks should be restored across these jumps.

@stedolan is on the hook to review this in due course. We plan to initiate some discussion upstream too.

@mshinwell
Copy link
Collaborator Author

One hitch: there isn't a 32-bit x86 implementation provided for native code, and I don't really want to write one. I'm going to see again if we can stop building the 32-bit compiler.

@Gbury
Copy link
Contributor

Gbury commented Sep 7, 2022

I'm afraid that this change would break at least a few packages, for instance:

  • those that use gc alarms (e.g. to enforce a limit on memory usage)
  • those that use expressions from signal handlers to implement timeouts

@mshinwell mshinwell force-pushed the async-exns-only-break branch from 98bb22d to 893a9c4 Compare September 7, 2022 12:58
@mshinwell
Copy link
Collaborator Author

mshinwell commented Sep 7, 2022

This will be a breaking change, but I think any fix to this problem is going to suffer from the same. We decided on a simple model where users are expected to use Break (and only Break) to exfiltrate data from signal handlers, finalisers etc. in order to avoid complications where new asynchronous exceptions are raised during processing of an existing asynchronous exception in the runtime. With this model, there is a clear and reliable way of doing this exfiltration, which does not run the risk of losing data. The user should establish a data structure which is populated by a signal handler, finaliser or whatever and then query that structure upon receiving Break from a Sys.with_async_exns point. This is a bit like the existing technique used e.g. in Async for handling signals: the only thing that runs in the signal handler itself is pushing a value onto a queue to say that there has been a signal. Those can then be dealt with in a more normal context.

This PR has now been updated with a 32-bit x86 implementation in i386.S.

@mshinwell
Copy link
Collaborator Author

@stedolan could you please review this, thanks.

@stedolan
Copy link
Contributor

An earlier version of this patch (IIRC) made caml_start_program always catch async exceptions, while the current version splits it into caml_start_program and caml_start_program_async_exn. Why? Do you ever want the former? It seems dangerous to allow async exceptions to propagate past caml_callback_exn by default.

@mshinwell
Copy link
Collaborator Author

An earlier version of this patch (IIRC) made caml_start_program always catch async exceptions, while the current version splits it into caml_start_program and caml_start_program_async_exn. Why? Do you ever want the former? It seems dangerous to allow async exceptions to propagate past caml_callback_exn by default.

This was to make sure that async exns only appear at Sys.with_async_exns points, but hmm... maybe this could be problematic for a caller that really requires their frame calling caml_callback_exn to be returned to.

@mshinwell
Copy link
Collaborator Author

@stedolan and I discussed this. I'm going to change this PR so that caml_callback*_exn do return async exns; but caml_callback* will raise async exns as async exns.

@mshinwell mshinwell force-pushed the async-exns-only-break branch from def3b17 to a80743b Compare October 3, 2022 10:35
@stedolan
Copy link
Contributor

I am basically happy with this patch, but I'm concerned with the size of the diff to the runtime. (Especially as I'm about to try to merge 4.14 into this branch!).

I've made a patch that simplifies some of the mechanisms here with a view to making a smaller change, which should also aid in upstreaming this. The patch is at stedolan/flambda-backend/tree/async-exns-only-break, let me know what you think of it. Commit message copied below:

Simplify the implementation of new asynchronous exception semantics

  - Use a flag in Caml_state to indicate whether an async exception is
    being raised, removing the need for two versions of caml_start_program

  - Use the same flag mechanism to distinguish async exceptions on both
    bytecode and native code

  - Check that only Break may be raised from async callbacks at the point
    of *raising*, rather than at the point of running the callback

  - No longer propagate Stack_overflow from async callbacks to the main
    program

  - Simplify raising logic by removing prepare_for_raise, as there is no
    longer a need to carefully avoid caml_raise recursion (Now, caml_raise
    may trigger callbacks, which may trigger caml_raise_async, which can
    trigger no further work)

  - Revert a change to setsigmask behaviour in interp.c

  - caml_raise_out_of_memory_fatal renamed to caml_fatal_out_of_memory,
    and is always a fatal error on both bytecode and native

)

* Simplify the implementation of new asynchronous exception semantics

  - Use a flag in Caml_state to indicate whether an async exception is
    being raised, removing the need for two versions of caml_start_program

  - Use the same flag mechanism to distinguish async exceptions on both
    bytecode and native code

  - Check that only Break may be raised from async callbacks at the point
    of *raising*, rather than at the point of running the callback

  - No longer propagate Stack_overflow from async callbacks to the main
    program

  - Simplify raising logic by removing prepare_for_raise, as there is no
    longer a need to carefully avoid caml_raise recursion (Now, caml_raise
    may trigger callbacks, which may trigger caml_raise_async, which can
    trigger no further work)

  - Revert a change to setsigmask behaviour in interp.c

  - caml_raise_out_of_memory_fatal renamed to caml_fatal_out_of_memory,
    and is always a fatal error on both bytecode and native

* review

* comments
Copy link
Contributor

@gretay-js gretay-js left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For files owned by me

@mshinwell mshinwell merged commit a77f0dd into ocaml-flambda:main Oct 12, 2022
mshinwell added a commit to mshinwell/flambda-backend that referenced this pull request Oct 24, 2022
25188da flambda-backend: Missed comment from PR802 (ocaml-flambda#887)
9469765 flambda-backend: Improve the semantics of asynchronous exceptions (new simpler version) (ocaml-flambda#802)
d9e4dd0 flambda-backend: Fix `make runtest` on NixOS (ocaml-flambda#874)
4bbde7a flambda-backend: Simpler symbols (ocaml-flambda#753)
ef37262 flambda-backend: Add opaqueness to Obj.magic under Flambda 2 (ocaml-flambda#862)
a9616e9 flambda-backend: Add build system hooks for ocaml-jst (ocaml-flambda#869)
045ef67 flambda-backend: Allow the compiler to build with stock Dune (ocaml-flambda#868)
3cac5be flambda-backend: Simplify Makefile logic for natdynlinkops (ocaml-flambda#866)
c5b12bf flambda-backend: Remove unnecessary install lines (ocaml-flambda#860)
ff12bbe flambda-backend: Fix unused variable warning in st_stubs.c (ocaml-flambda#861)
c84976c flambda-backend: Static check for noalloc: attributes (ocaml-flambda#825)
ca56052 flambda-backend: Build system refactoring for ocaml-jst (ocaml-flambda#857)
39eb7f9 flambda-backend: Remove integer comparison involving nonconstant polymorphic variants (ocaml-flambda#854)
c102688 flambda-backend: Fix soundness bug by using currying information from typing (ocaml-flambda#850)
6a96b61 flambda-backend: Add a primitive to enable/disable the tick thread (ocaml-flambda#852)
f64370b flambda-backend: Make Obj.dup use a new primitive, %obj_dup (ocaml-flambda#843)
9b78eb2 flambda-backend: Add test for ocaml-flambda#820 (include functor soundness bug) (ocaml-flambda#841)
8f24346 flambda-backend: Add `-dtimings-precision` flag (ocaml-flambda#833)
65c2f22 flambda-backend: Add test for ocaml-flambda#829 (ocaml-flambda#831)
7b27a49 flambda-backend: Follow-up PR#829 (comballoc fixes for locals) (ocaml-flambda#830)
ad7ec10 flambda-backend: Use a custom condition variable implementation (ocaml-flambda#787)
3ee650c flambda-backend: Fix soundness bug in include functor (ocaml-flambda#820)
2f57378 flambda-backend: Static check noalloc (ocaml-flambda#778)
aaad625 flambda-backend: Emit begin/end region only when stack allocation is enabled (ocaml-flambda#812)
17c7173 flambda-backend: Fix .cmt for included signatures (ocaml-flambda#803)
e119669 flambda-backend: Increase delays in tests/lib-threads/beat.ml (ocaml-flambda#800)
ccc356d flambda-backend: Prevent dynamic loading of the same .cmxs twice in private mode, etc. (ocaml-flambda#784)
14eb572 flambda-backend: Make local extension point equivalent to local_ expression (ocaml-flambda#790)
487d11b flambda-backend: Fix tast_iterator and tast_mapper for include functor. (ocaml-flambda#795)
a50a818 flambda-backend: Reduce closure allocation in List (ocaml-flambda#792)
96c9c60 flambda-backend: Merge ocaml-jst
a775b88 flambda-backend: Fix ocaml/otherlibs/unix 32-bit build (ocaml-flambda#767)
f7c2679 flambda-backend: Create object files internally to avoid invoking GAS (ocaml-flambda#757)
c7a46bb flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml-flambda#756)
b337cb6 flambda-backend: Fix build_upstream for PR749 (ocaml-flambda#750)
8e7e81c flambda-backend: Differentiate is_int primitive between generic and variant-only versions (ocaml-flambda#749)

git-subtree-dir: ocaml
git-subtree-split: 25188da
@mshinwell mshinwell added the flambda2 upstreaming Prerequisites for upstreaming flambda2 label Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flambda2 upstreaming Prerequisites for upstreaming flambda2 runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants