runtime: panicmem should expose the address it's panicking about. #37023

seebs · 2020-02-04T19:26:30Z

What version of Go are you using (`go version`)?

1.13

Does this issue reproduce with the latest release?

probably

What operating system and processor architecture are you using (`go env`)?

AMD64

What did you do?

I don't know, but it resulted in a segfault. :) But then I started experimenting with panic and recover, and discovered (somewhat to my surprise) that I can catch a SEGV's panic.

What did you expect to see?

A runtime error/exception/something that has comparable information to what I'd get if I didn't recover().

What did you see instead?

The string "invalid memory address or nil pointer dereference"

So, contrast:

unexpected fault address 0x7f84e68d6112
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0x7f84e68d6112 pc=0x5f17cb]

goroutine 992352 [running]:
runtime.throw(0xfab924, 0x5)
        /usr/local/Cellar/go/1.13/libexec/src/runtime/panic.go:774 +0x72 fp=0xd5b852e7f8 sp=0xd5b852e7c8 pc=0x42f3b2
runtime.sigpanic()
        /usr/local/Cellar/go/1.13/libexec/src/runtime/signal_unix.go:401 +0x3de fp=0xd5b852e828 sp=0xd5b852e7f8 pc=0x444a3e
[...]

If I look at [runtime/]debug.StackTrace(), I can get some of this information, but I can't necessarily get the actual address of the unexpected fault. It looks to me like runtime has this information at the time when the panic is generated (or would be, if I turned on SetPanicOnFault):

        case _SIGSEGV:
                if (g.sigcode0 == 0 || g.sigcode0 == _SEGV_MAPERR || g.sigcode0 == _SEGV_ACCERR) && g.sigcode1 < 0x1000 {
                        panicmem()
                }
                // Support runtime/debug.SetPanicOnFault.
                if g.paniconfault {
                        panicmem()
                }
                print("unexpected fault address ", hex(g.sigcode1), "\n")
                throw("fault")

But it looks like panicmem doesn't expose these values. Possibly it should. I can't think of a reasonable approach immediately -- obviously I'd just like the siginfo_t exposed, except that this is unportable even by memory fault handling standards and won't be compatible between implementations. But it seems like exposing the affected address (g.sigcode1) and possibly the kind-of-error (SEGV_MAPERR or SEGV_ACCERR?)

But as a simple baseline that I think is probably supportable on current systems, if panicmem took a uintptr, and the resulting panic were an error type that could be queried for the uintptr, or displayed it by default, that might make life easier.

Because on the one hand, I want to catch the panics and prevent them from breaking things, but on the other hand, diagnosing them sort of benefits from having the address be available.

(On the third hand, it might be that changing the exact panic result in these cases would break the Compatibility Promise, without some hypothetical new debug.SetFancyPanicOnFault() that would not have been used in the existing code.)

The text was updated successfully, but these errors were encountered:

randall77 · 2020-02-04T20:56:10Z

We could do something about this. The types of runtime panics are explicitly unspecified: https://golang.org/ref/spec#Run_time_panics
We can add a Addr() method to the value we panic with, so you can do

type addrable interface {
    Addr() uintptr
}
e := recover()
if a, ok := e.(addrable), ok {
   _ = a.Addr()
}

Should we include kind-of also? Not sure. We can rapidly reach implementation-specific areas, which I don't think we want to do.

rsc · 2020-02-05T18:00:41Z

You can't always get the address either. For example on amd64 if you fault because the address is malformed (top 17 bits not all 0s or all 1s) then you don't get the address in the fault at all. In that case it definitely can't be passed anywhere else. I am not convinced we should do this.

Why do you want it? There's not usually much use to it.

seebs · 2020-02-05T20:36:22Z

Yeah, I noticed the lack of address with high bits set recently. I remember all the mc68k programs that said "oh hey the address bus is only 24 bits, we can use that top word and just mask it out", and I am so looking forward to having that again in amd64 in a little bit here.

The reason I want it is that it turns out to let me distinguish between "i have no idea what garbage was here" and "wow what an amazing coincidence that address is in a space that got munmapped recently".

So if I get a throw(), I can only dump every mmap operation as it happens just in case there's a crash later, and then after a week of logging, I get a crash in which the relevant thing is in the last five lines of the logs. If I use PanicOnFault, I can't tell which of the things are relevant, but I can look at recent activity. If I used PanicOnFault, and got the address (when possible/applicable), I could look at the recent activity and report exactly what happened, and life would be a lot simpler. (And it really does seem to be days-to-weeks between events.)

I agree that there's not usually much use to it, but in the specific case where mmap is involved (which is pretty much the only time I'd be likely to want PanicOnFault anyway), the distinction between "an invalid address" and "an invalid address which I can trace to a specific memory-mapped region" is very significant.

rsc · 2020-02-12T18:54:40Z

Adding just the address does help with PanicOnFault, so I could see maybe doing this.
Thoughts, @aclements, @ianlancetaylor, @randall77?

ianlancetaylor · 2020-02-12T19:57:24Z

I don't see a problem as long as we document that the information is system-dependent, and is not always available or reliable.

Would this be a language spec change or just a runtime package change? The spec says that runtime errors have "perhaps other methods" so I guess we can just do it in the runtime.

There are three signals that produce a runtime panic: SIGBUS, SIGSEGV, SIGFPE. They all set the si_addr field. SIGBUS and SIGSEGV set it to the memory address that caused the fault. SIGFPE sets it to the memory address of the instruction that faulted. Should we provide an address in all cases? I think not, as on a soft floating point implementation we will get a floating point exception without a signal and without a specific associated instruction.

seebs · 2020-02-12T20:47:39Z

Intuitively, I would be really confused if a runtime panic from SIGFPE gave me a different kind of address than the other two do, and it's redundant with the already-available stack backtrace. So I think it'd probably be more confusing than not to provide that one. (If SIGFPE gave the address of the floating point object, that might almost be useful, but not very because if you know what code is executing, you're already able to narrow it down pretty well.)

I don't think it makes sense to add a spec change, because this is so specific to a particular runtime implementation; if you don't have runtime.PanicOnFault, it'll never come up.

rsc · 2020-02-26T18:21:12Z

SIGFPE is already turning into a different panic.
If the target here is PanicOnFault then that would only apply to the panic generated in that case, which is SIGBUS, SIGSEGV.
This need not be a language spec change (it doesn't mention PanicOnFault either), and we can be suitably non-committal about how good the address is.

Based on the discussion, it seems like this is a likely accept.

rsc · 2020-03-04T18:03:24Z

No change in consensus, so accepting.

gopherbot · 2020-08-20T21:28:04Z

Change https://golang.org/cl/249677 mentions this issue: runtime/debug: provide Address method for errors from SetPanicOnFault

bcmills changed the title ~~propsoal: panicmem should expose the address it's panicking about.~~ proposal: panicmem should expose the address it's panicking about. Feb 4, 2020

gopherbot added this to the Proposal milestone Feb 4, 2020

gopherbot added the Proposal label Feb 4, 2020

bcmills changed the title ~~proposal: panicmem should expose the address it's panicking about.~~ proposal: runtime: panicmem should expose the address it's panicking about. Feb 4, 2020

rsc mentioned this issue Feb 12, 2020

proposal: review meeting minutes #33502

Open

rsc added the Proposal-FinalCommentPeriod label Feb 26, 2020

rsc modified the milestones: Proposal, Backlog Mar 4, 2020

rsc changed the title ~~proposal: runtime: panicmem should expose the address it's panicking about.~~ runtime: panicmem should expose the address it's panicking about. Mar 4, 2020

rsc added Proposal-Accepted and removed Proposal-FinalCommentPeriod labels Mar 4, 2020

gopherbot closed this as completed in d91d076 Sep 18, 2020

golang locked and limited conversation to collaborators Sep 18, 2021

gopherbot added the FrozenDueToAge label Sep 18, 2021

rsc moved this to Accepted in Proposals Aug 10, 2022

rsc added this to Proposals Aug 10, 2022

rsc removed this from Proposals Oct 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

runtime: panicmem should expose the address it's panicking about. #37023

runtime: panicmem should expose the address it's panicking about. #37023

seebs commented Feb 4, 2020

randall77 commented Feb 4, 2020

rsc commented Feb 5, 2020

seebs commented Feb 5, 2020

rsc commented Feb 12, 2020

ianlancetaylor commented Feb 12, 2020

seebs commented Feb 12, 2020

rsc commented Feb 26, 2020

rsc commented Mar 4, 2020

gopherbot commented Aug 20, 2020

runtime: panicmem should expose the address it's panicking about. #37023

runtime: panicmem should expose the address it's panicking about. #37023

Comments

seebs commented Feb 4, 2020

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

randall77 commented Feb 4, 2020

rsc commented Feb 5, 2020

seebs commented Feb 5, 2020

rsc commented Feb 12, 2020

ianlancetaylor commented Feb 12, 2020

seebs commented Feb 12, 2020

rsc commented Feb 26, 2020

rsc commented Mar 4, 2020

gopherbot commented Aug 20, 2020

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?