os/user: Lookup not returning UnknownUserError for unknown user when CGO_ENABLED=1 #67912

Closed
benhoyt opened this issue Jun 10, 2024 · 7 comments

@benhoyt
Contributor

benhoyt commented Jun 10, 2024

Go version

go version go1.22.4 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/ben/.cache/go-build'
GOENV='/home/ben/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/ben/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/ben/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/ben/sdk/go1.22.4'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/ben/sdk/go1.22.4/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.4'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/ben/w/pebble/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2146132670=/tmp/go-build -gno-record-gcc-switches'

What did you do?

When I call user.Lookup on a non-existent user (e.g. "baduser"), I expect it to return user.UnknownUserError, but it instead returns a generic non-nil error. The following test program (Go playground link) reproduces this on my system:

package main

import (
	"fmt"
	"os"
	"os/user"
)

func main() {
	usr, err := user.Lookup(os.Args[1])
	if _, ok := err.(user.UnknownUserError); ok {
		fmt.Println("unknown user")
		return
	}
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("user: %#v\n", usr)
}

What did you see happen?

When I run the test program with CGO_ENABLED=1 (the default on my system), it instead returns a generic errors.errorString (with the message "user: lookup username baduser: no such file or directory"):

# Default behaviour (not expected)
$ CGO_ENABLED=1 go run t.go baduser
error: user: lookup username baduser: no such file or directory

# With cgo disabled I get the expected behaviour
$ CGO_ENABLED=0 go run t.go baduser
unknown user

What did you expect to see?

The user.Lookup call should return an UnknownUserError error value, whether or not CGO_ENABLED is set.

When I follow it through in my debugger, it's using the lookupUser from cgo_lookup_unix.go, which is calling _C_getpwnam_r, and that's returning a syscall.Errno error with ENOENT. The getpwnam_r docs seem to indicate this function can return ENOENT under such circumstances, so maybe the code should check for that, rather than just the f != 0 test?

Why it's changed now on my system, but was working before, I'm not sure -- my colleague, also on Ubuntu 24.04 with the same kernel (Linux 6.8.0-35-generic x86_64), doesn't seem to have this problem. My first thought was a different libc version, but my colleague (who doesn't have this issue) and I are both running the same version -- this one:

$ dpkg -s libc6 | grep Version
Version: 2.39-0ubuntu8.2

Note that I recently upgraded my OS by doing a fresh install of Ubuntu 24.04. When I was on 22.04 I didn't have this problem (though as I noted, my colleague is also on 24.04 and he doesn't have this issue -- it works both with cgo enabled and disabled).

For reference, with an existing user both cgo and non-cgo work fine:

$ CGO_ENABLED=1 go run t.go ben
user: &user.User{Uid:"1000", Gid:"1000", Username:"ben", Name:"Ben", HomeDir:"/home/ben"}
$ CGO_ENABLED=0 go run t.go ben
user: &user.User{Uid:"1000", Gid:"1000", Username:"ben", Name:"Ben", HomeDir:"/home/ben"}
@benhoyt
Contributor Author

benhoyt commented Jun 10, 2024

Note that I looked at all of those "similar issues", but I think this is different from all of them. The closest is #25973, but it's a bit hard to tell, and that was closed quickly by the creator with "I was doing something wrong" and no further info.

@dimaqq

dimaqq commented Jun 10, 2024

Prior art outside golang:

crystal-lang/crystal#8069

The link to Ruby implementation from the above:

https://github.com/ruby/ruby/blob/81562f943e4f33fbfd00fdfd115890ba0b76916c/process.c#L5934-L5950

@benhoyt
Contributor Author

benhoyt commented Jun 10, 2024

@kortschak helped me work through this on Gophers Slack. getpwnam_r definitely returns ENOENT (errno 2) under some circumstances. Via debugging, I can see that's what it's returning on my machine in the above example. It's documented this way:

On success, getpwnam_r() and getpwuid_r() return zero, and set *result to pwd. If no matching password record was found, these functions return 0 and store NULL in *result. In case of error, an error number is returned, and NULL is stored in *result.

So we should be checking the return value (error number) against ENOENT, probably in lookupUser.

I still don't know what's different between my machine and my colleague's. Maybe a different nsswitch.conf? I'm out of my depth here. Suffice to say that mine is a very fresh install of Ubuntu 24.04, and I haven't mucked with those settings that I know of.

For reference, here's a little C program that Dan wrote -- it reproduces this issue on my machine.

$ cat t.c
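#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <pwd.h>
#include <errno.h>

int main(void) {
	struct passwd pwd;
	struct passwd *result;
	size_t buflen = 4096;
	char *buf = (char*)malloc(buflen);
	/* getpwnam_r returns the error number directly; errno is printed only for comparison. */
	int ret = getpwnam_r("baduser", &pwd, buf, buflen, &result);
	/* pwd is not filled in on failure, so pw_name and pw_uid are undefined bytes here. */
	printf("%d %d %s %d\n", ret, errno, pwd.pw_name, pwd.pw_uid);
	free(buf);
	return 0;
}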
$ gcc t.c
$ ./a.out
2 2 pulse 122
$ # note that "pulse" is undefined bytes (it's the last username in my /etc/passwd)

I'll try to put up a Go change to fix this.

@benhoyt
Contributor Author

benhoyt commented Jun 10, 2024

Thanks @dimaqq, that Crystal issue looks like exactly the same problem.

benhoyt added a commit to benhoyt/pebble that referenced this issue Jun 10, 2024
Per golang/go#67912, under some circumstances
(on my machine) user.Lookup isn't returning UnknownUserError when it
should.
@gopherbot
Contributor

Change https://go.dev/cl/591555 mentions this issue: os/user: make Lookup* functions properly handle ENOENT

benhoyt added a commit to canonical/pebble that referenced this issue Jun 11, 2024
#427)

Per golang/go#67912, under some circumstances
(on my machine) user.Lookup isn't returning UnknownUserError when it
should. So before this fix I get the following test failures:

```
$ go test ./internals/osutil
...
FAIL: user_test.go:44: userSuite.TestRealUser

user_test.go:71:
    c.Assert(err, check.IsNil)
... value *errors.errorString = &errors.errorString{s:"user: lookup username guy: no such file or directory"} ("user: lookup username guy: no such file or directory")
...
```

And:

```
$ go test ./internals/daemon
...
----------------------------------------------------------------------
FAIL: api_files_test.go:1128: filesSuite.TestWriteErrors

api_files_test.go:1195:
    checkFileResult(c, r.Result[4], pathUserNotFound, "generic-file-error", ".*unknown user.*")
api_files_test.go:273:
    c.Check(r.Error.Message, Matches, errorMsg)
... value string = "cannot look up user and group: user: lookup username user-not-found: no such file or directory"
... regex string = ".*unknown user.*"

api_files_test.go:1196:
    checkFileResult(c, r.Result[5], pathGroupNotFound, "generic-file-error", ".*unknown group.*")
api_files_test.go:273:
    c.Check(r.Error.Message, Matches, errorMsg)
... value string = "cannot look up user and group: user: lookup groupname group-not-found: no such file or directory"
... regex string = ".*unknown group.*"
...
```
@benhoyt
Contributor Author

benhoyt commented Jun 11, 2024

A colleague and I got to the bottom of why/when this is happening. On my machine, for whatever reason (on a relatively fresh Ubuntu 24.04 install), I have the following line in my nsswitch.conf:

$ grep passwd /etc/nsswitch.conf 
passwd:         files systemd sss

It's the sss that's doing it. I have libsss installed, though sssd is not running. To reproduce this, 1) add sss to the passwd: line as per above, and 2) install libnss-sss with sudo apt install libnss-sss if it's missing. My colleague did that and could repro the issue of ENOENT being returned, and similarly my system no longer returns ENOENT if I remove the sss (it then returns no error and a not-found result).
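For anyone wanting to check whether a machine is configured this way, a small sketch like the following could parse the passwd: line. passwdSources and hasSource are hypothetical helpers written for this comment; the sample content is the line from my nsswitch.conf above, and action items such as [NOTFOUND=return] are not handled specially.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// passwdSources returns the whitespace-separated entries on the "passwd:"
// line of nsswitch.conf-style content, or nil if there is no such line.
func passwdSources(conf string) []string {
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "passwd:") {
			continue
		}
		return strings.Fields(strings.TrimPrefix(line, "passwd:"))
	}
	return nil
}

// hasSource reports whether name appears in sources.
func hasSource(sources []string, name string) bool {
	for _, s := range sources {
		if s == name {
			return true
		}
	}
	return false
}

func main() {
	conf := "# sample content\npasswd:         files systemd sss\ngroup:          files systemd\n"
	srcs := passwdSources(conf)
	fmt.Println(srcs, hasSource(srcs, "sss")) // prints "[files systemd sss] true"
}
```

In practice the content would come from reading /etc/nsswitch.conf, for example with os.ReadFile.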

4 participants