[crashing] Improve crash chaining code by alexischr · Pull Request #19973 · mono/mono

alexischr · 2020-06-15T18:25:24Z

With this change, mono_post_native_crash_handler chains the crashing signal directly, instead of that having to happen after mono_handle_native_crash returns (which created a fragile repetitive code pattern). Finally, this way mono_post_native_crash_handler can also re-raise the signal right after chaining it. Since we remove our crashing signal handlers in mono_handle_native_crash, this should allow the OS to handle the signal as the final action.

TODO:

Investigate the possibly broken existing pattern where mono_chain_signal is called before mono_handle_native_crash , and if the crash is chained, mono_handle_native_crash doesn't run (example: mini-runtime.c:mono_sigfpe_signal_handler)
Ensure that all of the crashing signal handlers are removed at the beginning of mono_handle_native_crash

Fixes #19860

alexischr · 2020-06-17T15:01:58Z

@lambdageek @vargaz What do you think this code means to do (from master)?

mono/mono/mini/mini-runtime.c

Lines 3424 to 3434 in 74f6d2c

    
           	if (!mono_do_crash_chaining && mono_chain_signal (MONO_SIG_HANDLER_PARAMS)) 
        
           		goto exit; 
        
           	mono_sigctx_to_monoctx (ctx, &mctx); 
        
           	if (mono_dump_start ()) 
        
           		mono_handle_native_crash (mono_get_signame (SIGFPE), &mctx, info); 
        
           	if (mono_do_crash_chaining) { 
        
           		mono_chain_signal (MONO_SIG_HANDLER_PARAMS); 
        
           		goto exit; 
        
           	} 
        
           }

To me it means that if crash chaining is enabled (and actually has a handler registered), we will never mono_handle_native_crash - or reach the second call to mono_chain_signal - which does nothing if there is no handler registered.

I want to remove this part:

 	if (!mono_do_crash_chaining && mono_chain_signal (MONO_SIG_HANDLER_PARAMS)) 
 		goto exit;

and always handle the chaining sequence the same way.

rolfbjarne · 2020-06-17T16:39:17Z

mono/mini/mini-posix.c

Is it possible to:

Restore the default signal action

Restore the original context (i.e. restore the registers, etc. to the values when the signal was raised), so that the process resumes execution with the default signal action (which will presumably terminate the process and cause the OS to produce a crash report).

The point is to leave the process in the same state it was when the signal was raised, except without Mono's signal handler. The reason for this is that when there's a crash in native code, it's quite often very difficult to figure out what happened, and having the original state can be helpful. In particular with crashes due to bugs in Apple's software, it would be much better to be able to provide Apple engineers with a crash report that accurately reflects the process state at the time of the crash instead of a process where Mono stomped all over the place.

@Rolf (1) is done (now more completely) in mono_handle_native_crash

(2) I will have to think about how that would be possible - restoring the context would naturally divert execution to the new (old) IP.

@Rolf On further thought, just restoring the context might work - without advancing the IP, the same instruction will execute as before, presumably inducing the same signal which then would be handled by the OS instead of Mono.

lambdageek · 2020-06-22T19:16:08Z

Windows builds are failing:

mini-windows.c:284:1: error: conflicting types for 'mono_post_native_crash_handler'
  284 | mono_post_native_crash_handler (const char *signal, MonoContext *mctx, MONO_SIG_HANDLER_INFO_TYPE *info, gboolean crash_chaining)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from mini-windows.c:49:
mini-runtime.h:564:1: note: previous declaration of 'mono_post_native_crash_handler' was here
  564 | mono_post_native_crash_handler (const char *signal, MonoContext *mctx, MONO_SIG_HANDLER_INFO_TYPE *info, gboolean crash_chaining, void *context);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With this change, `mono_post_native_crash_handler` chains the crashing signal directly, instead of that having to happen after `mono_handle_native_crash` returns (which created a fragile repetitive code pattern). Finally, this way `mono_post_native_crash_handler` can also re-raise the signal right after chaining it. Since we remove our crashing signal handlers in `mono_handle_native_crash`, this should allow the OS to handle the signal as the final action. Should fix mono#19860

alexischr · 2020-06-23T23:37:40Z

@monojenkins backport to 2020-02

monojenkins · 2020-06-23T23:38:56Z

@alexischr backporting to 2020-02 failed, the patch results in conflicts:

Applying: [crashing] Improve crash chaining
Using index info to reconstruct a base tree...
M	mono/mini/exceptions-amd64.c
M	mono/mini/exceptions-x86.c
M	mono/mini/mini-exceptions.c
M	mono/mini/mini-posix.c
M	mono/mini/mini-runtime.c
M	mono/mini/mini-runtime.h
M	mono/mini/mini-windows.c
M	mono/mini/mini.h
Falling back to patching base and 3-way merge...
Auto-merging mono/mini/mini.h
Auto-merging mono/mini/mini-windows.c
Auto-merging mono/mini/mini-runtime.h
Auto-merging mono/mini/mini-runtime.c
CONFLICT (content): Merge conflict in mono/mini/mini-runtime.c
Auto-merging mono/mini/mini-posix.c
Auto-merging mono/mini/mini-exceptions.c
Auto-merging mono/mini/exceptions-x86.c
Auto-merging mono/mini/exceptions-amd64.c
error: Failed to merge in the changes.
Patch failed at 0001 [crashing] Improve crash chaining

Please backport manually!

…ash() for s390x * Fix exception-s390x so mono_handle_native_crash() is called according to the changes made in mono/mono#19973 * Make roslyn the default compiler for s390x

…ash() for s390x (#38428) * Fix exception-s390x so mono_handle_native_crash() is called according to the changes made in mono/mono#19973 * Make roslyn the default compiler for s390x Co-authored-by: nealef <nealef@users.noreply.github.com>

This reverts commit 7a0425e. This change was causing this android issue: mono#20275

* Changed to i8 * Revert "[crashing] Improve crash chaining (#19973)" This reverts commit 7a0425e. This change was causing this android issue: #20275

This reverts commit 7a0425e. This change was causing this android issue: mono#20275

This reverts commit 7a0425e. This change was causing this android issue: #20275

monojenkins mentioned this pull request Jun 15, 2020

[crashing] Improve crash chaining code dotnet/runtime#37914

Merged

2 tasks

alexischr force-pushed the crash-chaining-refactor branch from 30560c7 to 703269e Compare June 17, 2020 14:46

rolfbjarne reviewed Jun 17, 2020

View reviewed changes

alexischr changed the title ~~[crashing] Refactor crash chaining code~~ [crashing] Improve crash chaining code Jun 22, 2020

alexischr force-pushed the crash-chaining-refactor branch from 703269e to 59c0d62 Compare June 22, 2020 17:29

alexischr marked this pull request as ready for review June 22, 2020 17:38

alexischr requested review from BrzVlad, SamMonoRT, lambdageek and vargaz as code owners June 22, 2020 17:38

lambdageek approved these changes Jun 22, 2020

View reviewed changes

alexischr force-pushed the crash-chaining-refactor branch from 59c0d62 to 870f2b7 Compare June 23, 2020 20:23

alexischr merged commit 7a0425e into mono:master Jun 24, 2020

alexischr mentioned this pull request Jun 25, 2020

[crashing] Restore the crashing thread's context #20019

Closed

nealef mentioned this pull request Jun 26, 2020

Fix #20022 - changing the signature of mono_handle_native_crash() for s390x #20028

Merged

monojenkins mentioned this pull request Jun 26, 2020

Fix #20022 - changing the signature of mono_handle_native_crash() for s390x dotnet/runtime#38428

Merged

naricc pushed a commit to naricc/mono that referenced this pull request Sep 1, 2020

Revert "[crashing] Improve crash chaining (mono#19973)"

7cb5c04

This reverts commit 7a0425e. This change was causing this android issue: mono#20275

monojenkins pushed a commit to monojenkins/mono that referenced this pull request Sep 8, 2020

Revert "[crashing] Improve crash chaining (mono#19973)"

7cc64cc

This reverts commit 7a0425e. This change was causing this android issue: mono#20275

marek-safar pushed a commit that referenced this pull request Sep 16, 2020

Revert "[crashing] Improve crash chaining (#19973)"

be2226b

This reverts commit 7a0425e. This change was causing this android issue: #20275

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[crashing] Improve crash chaining code#19973

[crashing] Improve crash chaining code#19973
alexischr merged 1 commit intomono:masterfrom
alexischr:crash-chaining-refactor

alexischr commented Jun 15, 2020 •

edited

Loading

Uh oh!

alexischr commented Jun 17, 2020

Uh oh!

rolfbjarne Jun 17, 2020

Uh oh!

alexischr Jun 22, 2020

Uh oh!

alexischr Jun 23, 2020

Uh oh!

lambdageek commented Jun 22, 2020

Uh oh!

alexischr commented Jun 23, 2020

Uh oh!

monojenkins commented Jun 23, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

alexischr commented Jun 15, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alexischr commented Jun 17, 2020

Uh oh!

rolfbjarne Jun 17, 2020

Choose a reason for hiding this comment

Uh oh!

alexischr Jun 22, 2020

Choose a reason for hiding this comment

Uh oh!

alexischr Jun 23, 2020

Choose a reason for hiding this comment

Uh oh!

lambdageek commented Jun 22, 2020

Uh oh!

alexischr commented Jun 23, 2020

Uh oh!

monojenkins commented Jun 23, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

alexischr commented Jun 15, 2020 •

edited

Loading