gccgo with ld.gold: panic when running code that works with go 1.6.1 #15549

Closed
tamird opened this issue May 4, 2016 · 19 comments

tamird commented May 4, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
$ gccgo go version
go version go1.6.1 gccgo (GCC) 6.1.0 linux/amd64
  2. What operating system and processor architecture are you using (go env)?
$ gccgo go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/tamird/src/go"
GORACE=""
GOROOT="/opt/gccgo"
GOTOOLDIR="/opt/gccgo/libexec/gcc/x86_64-pc-linux-gnu/6.1.0"
GO15VENDOREXPERIMENT="1"
CC="/opt/gccgo/bin/gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="/opt/gccgo/bin/g++"
CGO_ENABLED="1"
  3. What did you do?
$ gccgo go test github.com/cockroachdb/cockroach/sql -bench=BenchmarkBank2_Cockroach -benchmem -run=NONE -gccgoflags='-Xlinker --allow-multiple-definition'

Note: I first had to delete github.com/cockroachdb/cockroach/sql/{parallel,logic}_test.go due to #15544.
Note: I had to use -gccgoflags='-Xlinker --allow-multiple-definition' due to #15547.

  4. What did you expect to see?
BenchmarkBank2_Cockroach     5357452 ns/op          434579 B/op       6732 allocs/op
PASS
ok      github.com/cockroachdb/cockroach/sql    2.082s
  5. What did you see instead?
I160504 23:34:59.238068 rand.go:76  Random seed: 2750415068627718676
PASS
BenchmarkBank2_Cockroach        W160504 23:34:59.254597 gossip/gossip.go:892  not connected to cluster; use --join to specify a connected node
I160504 23:34:59.262987 storage/engine/rocksdb.go:137  opening in memory rocksdb instance
fatal error: unexpected signal during runtime execution
[signal 0xb code=0x1 addr=0x0]

goroutine 26 [syscall]:
runtime_dopanic
        ../../../gcc-6.1.0/libgo/runtime/panic.c:131
runtime_throw
        ../../../gcc-6.1.0/libgo/runtime/panic.c:193
sig_panic_leadin
        ../../../gcc-6.1.0/libgo/runtime/go-signal.c:249
sig_panic_info_handler
        ../../../gcc-6.1.0/libgo/runtime/go-signal.c:283

        :0
_ZN6google8protobuf16RepeatedPtrFieldINS0_7MessageEED4Ev
        internal/src/google/protobuf/repeated_field.h:1793
_ZN6google8protobuf8internal21arena_destruct_objectINS0_16RepeatedPtrFieldINS0_7MessageEEEEEvPv
        internal/src/google/protobuf/arena.h:74
created by testing.run.pN9_testing.B
        ../../../gcc-6.1.0/libgo/go/testing/benchmark.go:190

goroutine 16 [chan receive]:
runtime_mcall
        ../../../gcc-6.1.0/libgo/runtime/proc.c:295
chanrecv
        ../../../gcc-6.1.0/libgo/runtime/chan.goc:269
testing.run.pN9_testing.B
        ../../../gcc-6.1.0/libgo/go/testing/benchmark.go:191
testing.RunBenchmarks
        ../../../gcc-6.1.0/libgo/go/testing/benchmark.go:343
testing.Run.pN9_testing.M
        ../../../gcc-6.1.0/libgo/go/testing/testing.go:523
github_com_cockroachdb_cockroach_sql_test.TestMain
        /home/tamird/src/go/src/github.com/cockroachdb/cockroach/sql/main_test.go:238
main.main
        /tmp/go-build987433693/github.com/cockroachdb/cockroach/sql/_test/_testmain.go:504
runtime_main
        ../../../gcc-6.1.0/libgo/runtime/proc.c:626
created by main
        ../../../gcc-6.1.0/libgo/runtime/go-main.c:54

goroutine 17 [syscall]:
        goroutine in C code; stack unavailable
created by runtime_main
        ../../../gcc-6.1.0/libgo/runtime/proc.c:598

goroutine 18 [finalizer wait]:
runtime_mcall
        ../../../gcc-6.1.0/libgo/runtime/proc.c:295
runfinq
        ../../../gcc-6.1.0/libgo/runtime/mgc0.c:2512
kickoff
        ../../../gcc-6.1.0/libgo/runtime/proc.c:235
created by
        ../../../gcc-6.1.0/libgcc/config/i386/morestack.S:544

goroutine 19 [syscall]:
        goroutine in C code; stack unavailable
created by
        ../../../gcc-6.1.0/libgcc/config/i386/morestack.S:544

goroutine 24 [chan receive]:
runtime_mcall
        ../../../gcc-6.1.0/libgo/runtime/proc.c:295
chanrecv
        ../../../gcc-6.1.0/libgo/runtime/chan.goc:299
runtime.chanrecv2
        ../../../gcc-6.1.0/libgo/runtime/chan.goc:384
github_com_cockroachdb_cockroach_util_log.flushDaemon.pN50_github_com_cockroachdb_cockroach_util_log.loggingT
        /home/tamird/src/go/src/github.com/cockroachdb/cockroach/util/log/clog.go:1011
kickoff
        ../../../gcc-6.1.0/libgo/runtime/proc.c:235
created by
        ../../../gcc-6.1.0/libgcc/config/i386/morestack.S:544

goroutine 25 [GC sweep wait]:
runtime_mcall
        ../../../gcc-6.1.0/libgo/runtime/proc.c:295
bgsweep
        ../../../gcc-6.1.0/libgo/runtime/mgc0.c:1844
kickoff
        ../../../gcc-6.1.0/libgo/runtime/proc.c:235
created by mgc
        ../../../gcc-6.1.0/libgo/runtime/mgc0.c:2215

goroutine 27 [runnable]:
kickoff
        ../../../gcc-6.1.0/libgo/runtime/proc.c:232
created by github_com_cockroachdb_cockroach_util_stop.RunWorker.pN50_github_com_cockroachdb_cockroach_util_stop.Stopper
        /home/tamird/src/go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:137

goroutine 28 [runnable]:
kickoff
        ../../../gcc-6.1.0/libgo/runtime/proc.c:232
created by github_com_cockroachdb_cockroach_util_stop.RunWorker.pN50_github_com_cockroachdb_cockroach_util_stop.Stopper
        /home/tamird/src/go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:137

goroutine 29 [runnable]:
kickoff
        ../../../gcc-6.1.0/libgo/runtime/proc.c:232
created by github_com_cockroachdb_cockroach_util_stop.RunWorker.pN50_github_com_cockroachdb_cockroach_util_stop.Stopper
        /home/tamird/src/go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:137

goroutine 30 [runnable]:
kickoff
        ../../../gcc-6.1.0/libgo/runtime/proc.c:232
created by github_com_cockroachdb_cockroach_util_stop.RunWorker.pN50_github_com_cockroachdb_cockroach_util_stop.Stopper
        /home/tamird/src/go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:137

goroutine 31 [runnable]:
kickoff
        ../../../gcc-6.1.0/libgo/runtime/proc.c:232
created by github_com_cockroachdb_cockroach_util_stop.RunWorker.pN50_github_com_cockroachdb_cockroach_util_stop.Stopper
        /home/tamird/src/go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:137
exit status 2
FAIL    github.com/cockroachdb/cockroach/sql    0.250s
ianlancetaylor added this to the Gccgo milestone May 5, 2016
ianlancetaylor self-assigned this May 5, 2016


thanm commented Jul 14, 2016

@tamird -- I tried to reproduce this bug today, but could not build the package in question:

% go get -d github.com/cockroachdb/cockroach/sql
% go build github.com/cockroachdb/cockroach/sql

github.com/cockroachdb/cockroach/roachpb

github.com/cockroachdb/cockroach/roachpb/api.pb.go:1486: undefined: grpc.SupportPackageIsVersion2

It looks as though the protobuf go sources need to be regenerated?


tamird commented Jul 14, 2016

Please use the Makefile via make build to ensure that dependencies are properly versioned.


thanm commented Jul 14, 2016

Thanks. I can now get sql to build, and the test runs.
However it does not compile with trunk gccgo -- it looks as though there are now portions of the code that are dependent on 1.7 runtime features:

go test -compiler gccgo github.com/cockroachdb/cockroach/sql -bench=BenchmarkBank2_Cockroach -benchmem -run=NONE -gccgoflags='-Xlinker --allow-multiple-definition'
# golang.org/x/net/context
../../../golang.org/x/net/context/go17.go:10:9: error: import file ‘context’ not found
  "context" // standard library's context, as of Go 1.7

gccgo runtime is currently back at 1.4.


tamird commented Jul 14, 2016

Something's funky here - we are definitely not building on 1.7.

Take a look at the file from which the error is originating: https://github.com/golang/net/blob/master/context/go17.go#L5

Why is it being compiled? Looks like you found another bug in gccgo.


thanm commented Jul 14, 2016

Hmm, I think I may have shot myself in the foot there (I modified the Makefile to get around the version check). Stay tuned, I will try again.

@mwhudson

I suspect the problem you are having is that the Go 1.7 version of the go tool sets the go1.7 build tag irrespective of the version of gccgo (#8851). You will generally have a better time if you use the version of the go tool that comes with the version of gccgo you are using -- on Debian/Ubuntu that would be installed as /usr/bin/go-6.
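
To illustrate the build-tag point, here is a minimal Go sketch. It assumes the standard go/build package, used in-process, mirrors what the go tool does; hasTag is a hypothetical helper written for this example, not part of any library. The release tags come from the Go release that built the program, so selecting gccgo as the compiler in the build context leaves them unchanged.

```go
package main

import (
	"fmt"
	"go/build"
)

// hasTag reports whether tag (for example "go1.7") appears in tags.
// Hypothetical helper for illustration only.
func hasTag(tags []string, tag string) bool {
	for _, t := range tags {
		if t == tag {
			return true
		}
	}
	return false
}

func main() {
	ctx := build.Default // copy of the default build context
	ctx.Compiler = "gccgo"

	// ReleaseTags is fixed by the Go release that compiled this program;
	// setting Compiler above does not alter it.
	fmt.Println("compiler:", ctx.Compiler)
	fmt.Println("release tags:", ctx.ReleaseTags)
	fmt.Println("go1.7 satisfied:", hasTag(ctx.ReleaseTags, "go1.7"))
}
```

Files guarded by a go1.7 build constraint, such as the go17.go file mentioned earlier in the thread, are selected whenever that tag is in the list, whatever runtime the chosen compiler actually ships.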


thanm commented Jul 18, 2016

I am able to build now with gccgo after recreating my GOPATH from scratch.

So far I have not been able to reproduce the panic, however; the test passes OK. This is with top-of-trunk gcc/gccgo; I will make another attempt using the official GCC 6.1.0 bits.

@tamird, is there anything you can recommend that will speed up the cockroachdb builds? The build process includes a lot of C++ compilations, all of which are being done serially, so I am spending much time waiting for compiles to complete. I tried passing -j20 to make and GOFLAGS="-p 20", but from what I can tell GOFLAGS isn't being passed through to 'glock's, so there is zero parallelism.


thanm commented Jul 18, 2016

OK, I am stumped. I repeated the process with a fresh copy of gcc 6.1.0 built with

../gcc-6.1.0/configure --prefix=/tmp/gcc-6.1.0/cross --enable-languages=c,c++,go --enable-libgo --disable-bootstrap

and I get the same result (test passes, no panic).


thanm commented Jul 21, 2016

@tamird -- closing this bug out for now, since I can't reproduce it. Please re-open if there is a better reproducer.

thanm closed this as completed Jul 21, 2016


tamird commented Jul 25, 2016

@thanm sorry for the delay, I was out last week. I notice that you compiled gcc using different flags than I did. It's also possible that today's version of Cockroach doesn't tickle this bug.

I'm re-testing now.


tamird commented Jul 25, 2016

OK, bad things still happen with a current version of CockroachDB (this time it's a segfault):

tamird@instance-1:~/src/go/src/github.com/cockroachdb/cockroach$ git rev-parse HEAD
29a968ba0253e7ddd924ecbc48b69a9993f3acd4
tamird@instance-1:~/src/go/src/github.com/cockroachdb/cockroach$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    deleted:    sql/logic_test.go
    deleted:    sql/parallel_test.go

no changes added to commit (use "git add" and/or "git commit -a")
tamird@instance-1:~/src/go/src/github.com/cockroachdb/cockroach$ gccgo go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/tamird/src/go"
GORACE=""
GOROOT="/opt/gccgo"
GOTOOLDIR="/opt/gccgo/libexec/gcc/x86_64-pc-linux-gnu/6.1.0"
GO15VENDOREXPERIMENT="1"
CC="/opt/gccgo/bin/gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="/opt/gccgo/bin/g++"
CGO_ENABLED="1"

Note that this gccgo was compiled using:

../gcc-6.1.0/configure --prefix=/opt/gccgo --enable-languages=c,c++,go --disable-multilib --with-ld=$(which ld.gold)

Now testing without the gold linker.


tamird commented Jul 25, 2016

Indeed, this doesn't repro without the gold linker (but continues to repro with it):

tamird@instance-1:~/src/gcc-build$ gccgo go test github.com/cockroachdb/cockroach/sql -bench=BenchmarkBank2_Cockroach -benchmem -run=NONE -v -count 10 -gccgoflags='-Xlinker --allow-multiple-definition'
signal: Segmentation fault
FAIL    github.com/cockroachdb/cockroach/sql    0.068s

For reference, I'm using:

$ ld.gold --version
GNU gold (GNU Binutils for Debian 2.25) 1.11
Copyright (C) 2014 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.

@thanm I can't reopen this myself, could you?

tamird changed the title from "gccgo: panic when running code that works with go 1.6.1" to "gccgo with ld.gold: panic when running code that works with go 1.6.1" Jul 25, 2016


thanm commented Jul 25, 2016

Thanks, that's helpful. I'll take another shot with gold enabled.

thanm reopened this Jul 25, 2016


thanm commented Jul 26, 2016

The crash is taking place early, during static initialization for the main load module. Trace below. Things look kosher as far as I can tell until we hit je_malloc_mutex_init, where it looks as though there is a linker issue. The more-stack sequence in the prolog of the function is OK, but once we get into the main body things are messed up. Here is the code:

if (pthread_mutexattr_init(&attr) != 0)
    return (true);

and here is the object code from the .o file:

/s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/mutex.c:97
  1b:   48 89 e7                mov    %rsp,%rdi
  1e:   e8 00 00 00 00          callq  23 <je_malloc_mutex_init+0x23>
            1f: R_X86_64_PLT32  pthread_mutexattr_init-0x4
  23:   85 c0                   test   %eax,%eax
  25:   74 11                   je     38 <je_malloc_mutex_init+0x38>

however here is the object code from the a.out:

/s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/mutex.c:97
 16df07b:   48 89 e7                mov    %rsp,%rdi
 16df07e:   e8 7d 0f 92 fe          callq  0
 16df083:   85 c0                   test   %eax,%eax
 16df085:   74 11                   je     16df098 <je_malloc_mutex_init+0x38>

The other calls in the function are similarly mangled (target of zero). Not sure how this happened, but it would definitely point to a linker error. I'll see about building a debuggable ld.gold to find out what is going haywire.

Full trace:

#0  0x0000000000000000 in ?? ()
#1  0x00000000016df083 in je_malloc_mutex_init (mutex=mutex@entry=0x2a8ac60 <base_mtx>) at /s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/mutex.c:97
#2  0x00000000016bb79b in je_base_boot () at /s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/base.c:147
#3  0x00000000016d162b in malloc_init_hard_a0_locked () at /s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/jemalloc.c:1235
#4  0x00000000016d17ba in malloc_init_hard () at /s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/jemalloc.c:1373
#5  0x00000000016d5c38 in malloc_init () at /s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/jemalloc.c:302
#6  imalloc_body (slow_path=true, usize=<synthetic pointer>, tsd=<synthetic pointer>, size=<optimized out>) at /s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/jemalloc.c:1451
#7  malloc (size=<optimized out>) at /s/go-issue15549/src/github.com/cockroachdb/c-jemalloc/jemalloc.c:1507
#8  0x0000000001cc6c15 in __morestack () at ../../../gcc-trunk/libgcc/config/i386/morestack.S:544
#9  0x00007ffff7ed4b00 in pool (this=0x7ffff7fcb420 <(anonymous namespace)::emergency_pool>) at ../../../../gcc-trunk/libstdc++-v3/libsupc++/eh_alloc.cc:123
#10 __static_initialization_and_destruction_0 (__priority=65535, __initialize_p=1) at ../../../../gcc-trunk/libstdc++-v3/libsupc++/eh_alloc.cc:250
#11 _GLOBAL__sub_I_eh_alloc.cc(void) () at ../../../../gcc-trunk/libstdc++-v3/libsupc++/eh_alloc.cc:326
#12 0x00007ffff7dea10a in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdaf8, env=env@entry=0x7fffffffdb08) at dl-init.c:78
#13 0x00007ffff7dea1f3 in call_init (env=<optimized out>, argv=<optimized out>, argc=<optimized out>, l=<optimized out>) at dl-init.c:36
#14 _dl_init (main_map=0x7ffff7ffe1c8, argc=1, argv=0x7fffffffdaf8, env=0x7fffffffdb08) at dl-init.c:126
#15 0x00007ffff7ddb30a in _dl_start_user () at rtld.c:871
#16 0x0000000000000001 in ?? ()
#17 0x00007fffffffdef1 in ?? ()
#18 0x0000000000000000 in ?? ()
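
The "callq 0" in the linked binary can be checked by hand from the bytes in the disassembly above. A near call (opcode e8) carries a signed 32-bit little-endian displacement that is added to the address of the next instruction. The sketch below decodes the instruction at 16df07e; callTarget is a hypothetical helper written for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// callTarget decodes a 5-byte x86-64 near call (opcode 0xE8 followed by a
// little-endian rel32) located at addr and returns the absolute address it
// transfers control to. Hypothetical helper for illustration only.
func callTarget(addr uint64, insn []byte) (uint64, error) {
	if len(insn) != 5 || insn[0] != 0xE8 {
		return 0, fmt.Errorf("not a 5-byte E8 call: % x", insn)
	}
	rel := int32(binary.LittleEndian.Uint32(insn[1:])) // signed displacement
	next := addr + 5                                   // address of the following instruction
	return uint64(int64(next) + int64(rel)), nil
}

func main() {
	// Bytes "e8 7d 0f 92 fe" at address 0x16df07e, taken from the a.out listing.
	target, err := callTarget(0x16df07e, []byte{0xe8, 0x7d, 0x0f, 0x92, 0xfe})
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("call target: %#x\n", target)
}
```

The displacement 0xfe920f7d is -0x16df083 as a signed value, and the next instruction sits at 0x16df083, so the two cancel and the call really does resolve to address zero. That matches frame #0 of the trace.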


thanm commented Jul 27, 2016

OK, it appears that the mysterious "callq 0" is due to the linker options, which include " -Wl,-unresolved-symbols=ignore-all". When I remove that, the link fails with

/tmp/go-build793160650/github.com/cockroachdb/cockroach/sql/_test/_testmain.go:2: error: undefined reference to 'internal_golang_org_x_net_http2_hpack..import'
internal/include/jemalloc/internal/tsd.h:609: error: undefined reference to 'pthread_setspecific'

Adding -lpthread eliminates the pthread issues, but doesn't eliminate the undefined reference to internal_golang_org_x_net_http2_hpack..import. Still working on tracking that down.

I would suggest removing " -Wl,-unresolved-symbols=ignore-all" from the linker options -- that seems like asking for trouble from my perspective.

thanm assigned thanm and unassigned ianlancetaylor Jul 27, 2016


tamird commented Jul 27, 2016

Couldn't agree more! However: #14985



thanm commented Jul 29, 2016

Update: the internal_golang_org_x_net_http2_hpack..import undef was a red herring (pilot error).

I can get the test to pass by adding "-lpthread" to

cli/start_jemalloc.go
server/status/runtime_jemalloc.go
storage/engine/rocksdb.go
storage/engine/rocksdb/rocksdb.go

immediately following the "-Wl,-unresolved-symbols=ignore-all". E.g.

-// #cgo linux LDFLAGS: -Wl,-unresolved-symbols=ignore-all
+// #cgo linux LDFLAGS: -Wl,-unresolved-symbols=ignore-all -lpthread

Will close out the bug again assuming you can reproduce this.


tamird commented Jul 29, 2016

How come -lpthread is required with gccgo but not gc?

tamird added a commit to cockroachdb/c-jemalloc that referenced this issue Jul 29, 2016
This is needed when using gccgo with ld.gold.

See golang/go#15549.

tamird commented Jul 29, 2016

Ah, this is of course a difference between ld and gold. Live and learn.

Confirmed that this is working. Thanks for digging!

golang locked and limited conversation to collaborators Jul 29, 2017
rsc unassigned thanm Jun 23, 2022