Skip to content

bpo-33625: Release GIL for grp.getgr{nam,gid} and pwd.getpw{nam,uid} #7081

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Sep 7, 2018
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
Release GIL on `grp.getgrnam`, `grp.getgrgid`, `pwd.getpwnam` and
`pwd.getpwuid` if reentrant variants of these functions are available.
Patch by William Grzybowski.
104 changes: 98 additions & 6 deletions Modules/grpmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,9 @@ static PyObject *
grp_getgrgid_impl(PyObject *module, PyObject *id)
/*[clinic end generated code: output=30797c289504a1ba input=15fa0e2ccf5cda25]*/
{
PyObject *py_int_id;
PyObject *py_int_id, *retval = NULL;
int nomem = 0;
char *buf = NULL;
gid_t gid;
struct group *p;

Expand All @@ -119,16 +121,62 @@ grp_getgrgid_impl(PyObject *module, PyObject *id)
}
Py_DECREF(py_int_id);
}
#ifdef HAVE_GETGRGID_R
Py_BEGIN_ALLOW_THREADS
int status;
Py_ssize_t bufsize;
struct group grp;

bufsize = sysconf(_SC_GETGR_R_SIZE_MAX);
if (bufsize == -1) {
bufsize = 1024;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind to add a "#define DEFAULT_BUFFER_SIZE 1024" at top level, rather than using an hardcoded constant here?

}
buf = PyMem_RawMalloc(bufsize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, wait, you don't check if PyMem_Malloc() failed here? I'm not sure that it's safe to call get*() functions with buf=NULL.

I have a proposition. Move the memory allocation at the start of the loop, and always use PyMem_Realloc(). PyMem_Realloc(buf, bufsize) behaves as PyMem_Malloc(). Pseudo-code:

/* PyMem_Malloc code removed from here */
while (1) {
        buf2 = PyMem_RawRealloc(buf, bufsize);
        if (buf2 == NULL) {
            nomem = 1;
            break;
        }
        buf = buf2; /* buf cannot be NULL from this point */
        (...)
    /* no more PyMem_Realloc code here neither */
}

(I don't propose to add these comments, it's just to explain my idea ;-))


while (1) {
status = getgrgid_r(gid, &grp, buf, bufsize, &p);
if (p != NULL || status != ERANGE) {
break;
}
if (bufsize > (PY_SSIZE_T_MAX >> 1)) {
nomem = 1;
break;
}
bufsize <<= 1;
p = PyMem_RawRealloc(buf, bufsize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strange to reuse the "struct group *p;" variable to reallocate the "char *buf = NULL;" variable: C types are not the same!? I wold prefer to see a different variable (ex: "buf2").

if (p == NULL) {
nomem = 1;
break;
}
buf = (char *) p;
}

if ((p = getgrgid(gid)) == NULL) {
if (status != 0) {
p = NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that p != NULL if status != 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, p may be set to some arbitrary entry on some error conditions of getgrgid_r and similar methods.

}
Py_END_ALLOW_THREADS
#else
p = getgrgid(gid);
#endif
if (p == NULL) {
if (buf != NULL) {
PyMem_RawFree(buf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyMem_RawFree(NULL) is valid (does nothing), the if() is useless.

}
if (nomem == 1) {
return PyErr_NoMemory();
}
PyObject *gid_obj = _PyLong_FromGid(gid);
if (gid_obj == NULL)
return NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You leak buf memory here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am failing to see how. If p == NULL it will free buf in the if statement before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I was confused because one RawFree in done when the GIL is released, and the other one is done when we hold the GIL. I would suggest to release buf when we hold the GIL, and change the previous if as:

    if (status != 0) {
        p = NULL;
    }

PyErr_Format(PyExc_KeyError, "getgrgid(): gid not found: %S", gid_obj);
Py_DECREF(gid_obj);
return NULL;
}
return mkgrent(p);
retval = mkgrent(p);
#ifdef HAVE_GETGRGID_R
PyMem_RawFree(buf);
#endif
return retval;
}

/*[clinic input]
Expand All @@ -145,7 +193,8 @@ static PyObject *
grp_getgrnam_impl(PyObject *module, PyObject *name)
/*[clinic end generated code: output=67905086f403c21c input=08ded29affa3c863]*/
{
char *name_chars;
char *buf = NULL, *name_chars;
int nomem = 0;
struct group *p;
PyObject *bytes, *retval = NULL;

Expand All @@ -154,13 +203,56 @@ grp_getgrnam_impl(PyObject *module, PyObject *name)
/* check for embedded null bytes */
if (PyBytes_AsStringAndSize(bytes, &name_chars, NULL) == -1)
goto out;
#ifdef HAVE_GETGRNAM_R
Py_BEGIN_ALLOW_THREADS
int status;
Py_ssize_t bufsize;
struct group grp;

bufsize = sysconf(_SC_GETGR_R_SIZE_MAX);
if (bufsize == -1) {
bufsize = 1024;
}
buf = PyMem_RawMalloc(bufsize);

while(1) {
status = getgrnam_r(name_chars, &grp, buf, bufsize, &p);
if (p != NULL || status != ERANGE) {
break;
}
if (bufsize > (PY_SSIZE_T_MAX >> 1)) {
nomem = 1;
break;
}
bufsize <<= 1;
p = PyMem_RawRealloc(buf, bufsize);
if (p == NULL) {
nomem = 1;
break;
}
buf = (char *) p;
}

if ((p = getgrnam(name_chars)) == NULL) {
PyErr_Format(PyExc_KeyError, "getgrnam(): name not found: %s", name_chars);
if (status != 0) {
p = NULL;
}
Py_END_ALLOW_THREADS
#else
p = getgrnam(name_chars);
#endif
if (p == NULL) {
if (nomem == 1) {
PyErr_NoMemory();
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: the PEP 7 requires

}
else {

PyErr_Format(PyExc_KeyError, "getgrnam(): name not found: %s", name_chars);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I see a bug which is not related to your change, but while we modify these functions, it would be nice to fix it.

name_chars is encoded to the filesystem encoding, whereas %s decodes it from UTF-8. If the filesystem encoding is not UTF-8, you get mojibake.

The fix is simple: use %S format and pass name: name is always a Unicode string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If name is always a Unicode string, you can use %U.

It may be worth to use %R for the case if the name contains invisible characters or trailing whitespaces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
goto out;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You leak buf here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am failing to see how. If p == NULL it will free buf in the if statement before.

}
retval = mkgrent(p);
out:
if (buf != NULL) {
PyMem_RawFree(buf);
}
Py_DECREF(bytes);
return retval;
}
Expand Down
107 changes: 101 additions & 6 deletions Modules/pwdmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -116,16 +116,62 @@ static PyObject *
pwd_getpwuid(PyObject *module, PyObject *uidobj)
/*[clinic end generated code: output=c4ee1d4d429b86c4 input=ae64d507a1c6d3e8]*/
{
PyObject *retval = NULL;
uid_t uid;
int nomem = 0;
struct passwd *p;
char *buf = NULL;

if (!_Py_Uid_Converter(uidobj, &uid)) {
if (PyErr_ExceptionMatches(PyExc_OverflowError))
PyErr_Format(PyExc_KeyError,
"getpwuid(): uid not found");
return NULL;
}
if ((p = getpwuid(uid)) == NULL) {
#ifdef HAVE_GETPWUID_R
Py_BEGIN_ALLOW_THREADS
int status;
Py_ssize_t bufsize;
struct passwd pwd;

bufsize = sysconf(_SC_GETPW_R_SIZE_MAX);
if (bufsize == -1) {
bufsize = 1024;
}
buf = PyMem_RawMalloc(bufsize);

while(1) {
status = getpwuid_r(uid, &pwd, buf, bufsize, &p);
if (p != NULL || status != ERANGE) {
break;
}
if (bufsize > (PY_SSIZE_T_MAX >> 1)) {
nomem = 1;
break;
}
bufsize <<= 1;
p = PyMem_RawRealloc(buf, bufsize);
if (p == NULL) {
nomem = 1;
break;
}
buf = (char *) p;
}

if (status != 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to move this just after the get...() call.

p = NULL;
}
Py_END_ALLOW_THREADS
#else
p = getpwuid(uid);
#endif
if (p == NULL) {
if (buf != NULL) {
PyMem_RawFree(buf);
}
if (nomem == 1) {
return PyErr_NoMemory();
}
PyObject *uid_obj = _PyLong_FromUid(uid);
if (uid_obj == NULL)
return NULL;
Expand All @@ -134,7 +180,11 @@ pwd_getpwuid(PyObject *module, PyObject *uidobj)
Py_DECREF(uid_obj);
return NULL;
}
return mkpwent(p);
retval = mkpwent(p);
#ifdef HAVE_GETPWUID_R
PyMem_RawFree(buf);
#endif
return retval;
}

/*[clinic input]
Expand All @@ -152,7 +202,8 @@ static PyObject *
pwd_getpwnam_impl(PyObject *module, PyObject *arg)
/*[clinic end generated code: output=6abeee92430e43d2 input=d5f7e700919b02d3]*/
{
char *name;
char *buf = NULL, *name;
int nomem = 0;
struct passwd *p;
PyObject *bytes, *retval = NULL;

Expand All @@ -161,13 +212,57 @@ pwd_getpwnam_impl(PyObject *module, PyObject *arg)
/* check for embedded null bytes */
if (PyBytes_AsStringAndSize(bytes, &name, NULL) == -1)
goto out;
if ((p = getpwnam(name)) == NULL) {
PyErr_Format(PyExc_KeyError,
"getpwnam(): name not found: %s", name);
#ifdef HAVE_GETPWNAM_R
Py_BEGIN_ALLOW_THREADS
int status;
Py_ssize_t bufsize;
struct passwd pwd;

bufsize = sysconf(_SC_GETPW_R_SIZE_MAX);
if (bufsize == -1) {
bufsize = 1024;
}
buf = PyMem_RawMalloc(bufsize);

while(1) {
status = getpwnam_r(name, &pwd, buf, bufsize, &p);
if (p != NULL || status != ERANGE) {
break;
}
if (bufsize > (PY_SSIZE_T_MAX >> 1)) {
nomem = 1;
break;
}
bufsize <<= 1;
p = PyMem_RawRealloc(buf, bufsize);
if (p == NULL) {
nomem = 1;
break;
}
buf = (char *) p;
}

if (status != 0) {
p = NULL;
}
Py_END_ALLOW_THREADS
#else
p = getpwnam(name);
#endif
if (p == NULL) {
if (nomem == 1) {
PyErr_NoMemory();
} else {
PyErr_Format(PyExc_KeyError,
"getpwnam(): name not found: %s", name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment: please use the %S format here, but if you modify this function, would you mind to also fix the name of the parameter? Rename "arg" to "name", as in Doc/library/pwd.rst. Maybe rename the char* name to name_chars, as in grp_getgrnam_impl().

}
goto out;
}
retval = mkpwent(p);
out:
if (buf != NULL) {
PyMem_RawFree(buf);
}
Py_DECREF(bytes);
return retval;
}
Expand Down
18 changes: 16 additions & 2 deletions configure
Original file line number Diff line number Diff line change
Expand Up @@ -781,6 +781,7 @@ infodir
docdir
oldincludedir
includedir
runstatedir
localstatedir
sharedstatedir
sysconfdir
Expand Down Expand Up @@ -891,6 +892,7 @@ datadir='${datarootdir}'
sysconfdir='${prefix}/etc'
sharedstatedir='${prefix}/com'
localstatedir='${prefix}/var'
runstatedir='${localstatedir}/run'
includedir='${prefix}/include'
oldincludedir='/usr/include'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
Expand Down Expand Up @@ -1143,6 +1145,15 @@ do
| -silent | --silent | --silen | --sile | --sil)
silent=yes ;;

-runstatedir | --runstatedir | --runstatedi | --runstated \
| --runstate | --runstat | --runsta | --runst | --runs \
| --run | --ru | --r)
ac_prev=runstatedir ;;
-runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
| --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
| --run=* | --ru=* | --r=*)
runstatedir=$ac_optarg ;;

-sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
ac_prev=sbindir ;;
-sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
Expand Down Expand Up @@ -1280,7 +1291,7 @@ fi
for ac_var in exec_prefix prefix bindir sbindir libexecdir datarootdir \
datadir sysconfdir sharedstatedir localstatedir includedir \
oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
libdir localedir mandir
libdir localedir mandir runstatedir
do
eval ac_val=\$$ac_var
# Remove trailing slashes.
Expand Down Expand Up @@ -1433,6 +1444,7 @@ Fine tuning of the installation directories:
--sysconfdir=DIR read-only single-machine data [PREFIX/etc]
--sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com]
--localstatedir=DIR modifiable single-machine data [PREFIX/var]
--runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run]
--libdir=DIR object code libraries [EPREFIX/lib]
--includedir=DIR C header files [PREFIX/include]
--oldincludedir=DIR C header files for non-gcc [/usr/include]
Expand Down Expand Up @@ -11263,8 +11275,9 @@ for ac_func in alarm accept4 setitimer getitimer bind_textdomain_codeset chown \
clock confstr ctermid dup3 execv faccessat fchmod fchmodat fchown fchownat \
fexecve fdopendir fork fpathconf fstatat ftime ftruncate futimesat \
futimens futimes gai_strerror getentropy \
getgrgid_r getgrnam_r \
getgrouplist getgroups getlogin getloadavg getpeername getpgid getpid \
getpriority getresuid getresgid getpwent getspnam getspent getsid getwd \
getpriority getresuid getresgid getpwent getpwnam_r getpwuid_r getspnam getspent getsid getwd \
if_nameindex \
initgroups kill killpg lchmod lchown lockf linkat lstat lutimes mmap \
memrchr mbrtowc mkdirat mkfifo \
Expand Down Expand Up @@ -13696,6 +13709,7 @@ fi




# checks for system services
# (none yet)

Expand Down
3 changes: 2 additions & 1 deletion configure.ac
Original file line number Diff line number Diff line change
Expand Up @@ -3445,8 +3445,9 @@ AC_CHECK_FUNCS(alarm accept4 setitimer getitimer bind_textdomain_codeset chown \
clock confstr ctermid dup3 execv faccessat fchmod fchmodat fchown fchownat \
fexecve fdopendir fork fpathconf fstatat ftime ftruncate futimesat \
futimens futimes gai_strerror getentropy \
getgrgid_r getgrnam_r \
getgrouplist getgroups getlogin getloadavg getpeername getpgid getpid \
getpriority getresuid getresgid getpwent getspnam getspent getsid getwd \
getpriority getresuid getresgid getpwent getpwnam_r getpwuid_r getspnam getspent getsid getwd \
if_nameindex \
initgroups kill killpg lchmod lchown lockf linkat lstat lutimes mmap \
memrchr mbrtowc mkdirat mkfifo \
Expand Down
Loading