bpo-33625: Release GIL for grp.getgr{nam,gid} and pwd.getpw{nam,uid} #7081
LGTM, thanks! Please use the blurb tool to add a news entry. |
9ecf8a3
to
122536d
Compare
@tiran NEWS added, let me know if that's not to your liking, it's my first time doing this ;) Thanks!
122536d
to
ef18509
Compare
What about getpwall()? And is it worth releasing the GIL in the grp and the spwd modules?
@@ -0,0 +1,2 @@
Release GIL on `pwd.getpwnam` and `pwd.getpwuid`. Patch by William
Sphinx complains about using the default role. Write as :func:`pwd.getpwnam` or ``pwd.getpwnam()``.
@serhiy-storchaka Regarding the other modules, I would think so. Do you want me to do that in this same PR?
You are right, and switching a thread between |
Can you either fix |
@serhiy-storchaka You are absolutely right about re-entrant functions. However, I am not sure what's the best way to check if they are available on the system in the Python project. Should I add a check in configure.ac for the existence of each of them? HAVE_GETPWNAM_R, HAVE_GETPWUID_R?
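For illustration, here is one way such a configure check could be consumed on the C side. This is only a sketch: it assumes the build system defines HAVE_GETPWNAM_R when getpwnam_r() exists (the macro name is just the one floated above, not necessarily what the merged patch uses), and lookup_uid() is a hypothetical helper. A fixed 16 KiB buffer keeps it short, so it does not retry on ERANGE.

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: store the uid for `name` in *uid.
   Returns 0 when the entry is found, -1 otherwise. */
int
lookup_uid(const char *name, uid_t *uid)
{
    struct passwd *p = NULL;
#ifdef HAVE_GETPWNAM_R   /* assumed to be set by a configure check */
    struct passwd pwd;
    char buf[16384];     /* fixed size for brevity; no ERANGE retry here */

    if (getpwnam_r(name, &pwd, buf, sizeof(buf), &p) != 0) {
        p = NULL;
    }
#else
    /* Fallback: the result lives in static storage shared by all callers. */
    p = getpwnam(name);
#endif
    if (p == NULL) {
        return -1;
    }
    *uid = p->pw_uid;
    return 0;
}
```

Without the macro the function falls back to getpwnam(), whose result is overwritten by the next call, which is why the lock should only be released around the _r variant.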
ef18509
to
f8f90e2
Compare
I have changed the grp module and used the re-entrant versions of the functions. Let me know what you think. Thanks!
f8f90e2
to
4722749
Compare
I am now handling ERANGE of _r functions, will update patch soon. |
cc6368f
to
1e68811
Compare
Updated handling ERANGE, let me know how that looks and/or any suggestions/changes. Thank you again. |
Windows-PR seems to be failing but I don't know why, I can't find logs in the link. How should I proceed?
Modules/grpmodule.c
Outdated
int status, bufsize = NSS_BUFLEN_GROUP;
struct group *grpbuf = NULL;

p = malloc(sizeof(struct group));
This is a constant size buffer. It is better to use a local variable for it.
struct group grp;
...
status = getgrgid_r(gid, &grp, buf, bufsize, &p);
Modules/grpmodule.c
Outdated
struct group *grpbuf = NULL;

p = malloc(sizeof(struct group));
buf = malloc(bufsize);
Use PyMem_RawMalloc().
Modules/grpmodule.c
Outdated
do {
    status = getgrgid_r(gid, p, buf, bufsize, &grpbuf);
    if(grpbuf == NULL && status == ERANGE) {
Please add a space between if and ( to conform to PEP 7.
Modules/grpmodule.c
Outdated
buf = malloc(bufsize);

do {
    status = getgrgid_r(gid, p, buf, bufsize, &grpbuf);
Use p instead of grpbuf.
Modules/grpmodule.c
Outdated
do {
    status = getgrgid_r(gid, p, buf, bufsize, &grpbuf);
    if(grpbuf == NULL && status == ERANGE) {
This condition is duplicated in the while below. It would be better to write:
if (grpbuf != NULL || status != ERANGE) {
    break;
}
and use while (1).
Modules/grpmodule.c
Outdated
do {
    status = getgrgid_r(gid, p, buf, bufsize, &grpbuf);
    if(grpbuf == NULL && status == ERANGE) {
        free(buf);
It could be more efficient to use realloc.
Modules/grpmodule.c
Outdated
status = getgrgid_r(gid, p, buf, bufsize, &grpbuf);
if(grpbuf == NULL && status == ERANGE) {
    free(buf);
    if((bufsize << 1) > (NSS_BUFLEN_GROUP << 10)) {
What is this check for?
This is to make sure we are not allocating too big a buffer (it's a maximum cap).
@serhiy-storchaka thanks for the review. Let me know if I have addressed them to your liking. |
7702a5f
to
84031ff
Compare
Is there a way I can trigger a new macOS build? The failure does not seem related to the commit. |
Thanks for making the requested changes! @vstinner: please review the changes made to this pull request. |
New review. My comments apply to all the modified functions.
Modules/grpmodule.c
Outdated
#endif
    if (p == NULL) {
        if (buf != NULL) {
            PyMem_RawFree(buf);
PyMem_RawFree(NULL) is valid (does nothing), the if() is useless.
Modules/grpmodule.c
Outdated
        break;
    }
    bufsize <<= 1;
    p = PyMem_RawRealloc(buf, bufsize);
It's strange to reuse the "struct group *p;" variable to reallocate the "char *buf = NULL;" variable: C types are not the same!? I would prefer to see a different variable (ex: "buf2").
Modules/grpmodule.c
Outdated
bufsize = sysconf(_SC_GETGR_R_SIZE_MAX);
if (bufsize == -1) {
    bufsize = 1024;
Would you mind adding a "#define DEFAULT_BUFFER_SIZE 1024" at top level, rather than using a hardcoded constant here?
Modules/grpmodule.c
Outdated
if (p == NULL) {
    if (nomem == 1) {
        PyErr_NoMemory();
    } else {
nitpick: PEP 7 requires
}
else {
Modules/pwdmodule.c
Outdated
    buf = (char *) p;
}

if (status != 0) {
I would prefer to move this just after the get...() call.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
5c47292
to
0da8c19
Compare
I have made the requested changes; please review again Thank you |
Thanks for making the requested changes! @vstinner: please review the changes made to this pull request. |
0da8c19
to
874339e
Compare
Ah! Honestly, the new PR is way better than the first version! I was going to merge it, when I spotted another issue: you don't check if PyMem_Malloc() fails. IMHO it's not a good idea to call get*() functions with buf=NULL. Maybe the function fails, maybe you get a crash... I would prefer to avoid the risk of crash :-) I proposed to move code which allocates the memory to avoid redundancy and to make sure that get*() are never called with buf=NULL.
By the way, I also proposed to fix a mojibake (encoding) issue while we are on these functions.
Maybe the mojibake issue can be fixed in a separate PR, since I don't think that we are going to backport this one to 3.7 and older. So maybe wait until this PR is merged, and then write a second PR to fix the mojibake issue, and we can easily backport the second PR to all supported branches.
    PyErr_NoMemory();
}
else {
    PyErr_Format(PyExc_KeyError, "getgrnam(): name not found: %s", name_chars);
Hum, I see a bug which is not related to your change, but while we modify these functions, it would be nice to fix it.
name_chars is encoded to the filesystem encoding, whereas %s decodes it from UTF-8. If the filesystem encoding is not UTF-8, you get mojibake.
The fix is simple: use %S format and pass name: name is always a Unicode string.
If name is always a Unicode string, you can use %U.
It may be worth using %R in case the name contains invisible characters or trailing whitespace.
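To make the %s problem concrete: PyErr_Format() decodes %s arguments as UTF-8, so bytes produced by a different filesystem encoding may not be valid UTF-8 at all. The sketch below uses a hypothetical helper, looks_like_utf8(), which is a simplified structural check only (it does not reject overlong forms or surrogates). It shows the Latin-1 bytes for "café" failing where the UTF-8 bytes pass.

```c
#include <stddef.h>

/* Hypothetical helper: return 1 if the NUL-terminated byte string is
   structurally valid UTF-8 (each lead byte is followed by the right number
   of continuation bytes), 0 otherwise.
   Simplified: overlong forms and surrogates are not rejected. */
int
looks_like_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != 0) {
        int extra;

        if (*p < 0x80) {
            extra = 0;              /* ASCII */
        }
        else if ((*p & 0xE0) == 0xC0) {
            extra = 1;              /* 2-byte sequence */
        }
        else if ((*p & 0xF0) == 0xE0) {
            extra = 2;              /* 3-byte sequence */
        }
        else if ((*p & 0xF8) == 0xF0) {
            extra = 3;              /* 4-byte sequence */
        }
        else {
            return 0;               /* stray continuation or invalid lead byte */
        }
        p++;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80) {
                return 0;           /* truncated or malformed sequence */
            }
            p++;
        }
    }
    return 1;
}
```

In Latin-1, "é" is the single byte 0xE9, which UTF-8 reads as the start of a three-byte sequence; passing the original Unicode object instead avoids decoding the bytes again.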
}
else {
    PyErr_Format(PyExc_KeyError,
                 "getpwnam(): name not found: %s", name);
Same comment: please use the %S format here, but if you modify this function, would you mind to also fix the name of the parameter? Rename "arg" to "name", as in Doc/library/pwd.rst. Maybe rename the char* name to name_chars, as in grp_getgrnam_impl().
Modules/grpmodule.c
Outdated
if (bufsize == -1) {
    bufsize = DEFAULT_BUFFER_SIZE;
}
buf = PyMem_RawMalloc(bufsize);
Hum, wait, you don't check if PyMem_Malloc() failed here? I'm not sure that it's safe to call get*() functions with buf=NULL.
I have a proposal. Move the memory allocation to the start of the loop, and always use PyMem_Realloc(). When buf is NULL, PyMem_Realloc(buf, bufsize) behaves as PyMem_Malloc(bufsize). Pseudo-code:
/* PyMem_Malloc code removed from here */
while (1) {
    buf2 = PyMem_RawRealloc(buf, bufsize);
    if (buf2 == NULL) {
        nomem = 1;
        break;
    }
    buf = buf2; /* buf cannot be NULL from this point */
    (...)
    /* no more PyMem_Realloc code here either */
}
(I don't propose to add these comments, it's just to explain my idea ;-))
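A fuller sketch of that loop shape, using plain realloc()/free() in place of the PyMem_Raw* calls so it builds outside CPython. lookup_group_name(), the LOOKUP_* codes, and MAX_BUFFER_SIZE are hypothetical names for this example and are not taken from the merged code.

```c
#include <errno.h>
#include <grp.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

#define MAX_BUFFER_SIZE (1 << 20)   /* hypothetical cap on the scratch buffer */

enum { LOOKUP_FOUND = 0, LOOKUP_NOT_FOUND = 1, LOOKUP_NOMEM = 2 };

/* Copy the name of group `gid` into out[0..outlen).  `bufsize` is the initial
   scratch buffer size; it is doubled each time getgrgid_r() reports ERANGE. */
int
lookup_group_name(gid_t gid, size_t bufsize, char *out, size_t outlen)
{
    struct group grp;            /* fixed-size result, kept on the stack */
    struct group *p = NULL;
    char *buf = NULL;
    int nomem = 0;
    int status;
    int rc;

    if (bufsize == 0) {
        bufsize = 1;
    }
    while (1) {
        /* realloc(NULL, n) acts like malloc(n), so one call covers both the
           first allocation and every later growth step. */
        char *buf2 = realloc(buf, bufsize);
        if (buf2 == NULL) {
            nomem = 1;
            p = NULL;
            break;
        }
        buf = buf2;              /* buf cannot be NULL from this point */

        status = getgrgid_r(gid, &grp, buf, bufsize, &p);
        if (status != 0) {
            p = NULL;
        }
        if (p != NULL || status != ERANGE) {
            break;               /* found, not found, or an unrelated error */
        }
        if (bufsize > MAX_BUFFER_SIZE / 2) {
            nomem = 1;           /* refuse to grow past the cap */
            break;
        }
        bufsize <<= 1;
    }

    if (p != NULL) {
        /* gr_name points into buf, so copy it out before freeing. */
        snprintf(out, outlen, "%s", p->gr_name);
        rc = LOOKUP_FOUND;
    }
    else if (nomem) {
        rc = LOOKUP_NOMEM;
    }
    else {
        rc = LOOKUP_NOT_FOUND;   /* this sketch folds other errors in here */
    }
    free(buf);                   /* free(NULL) is a no-op */
    return rc;
}
```

Starting with a deliberately tiny buffer, for example lookup_group_name(0, 1, name, sizeof(name)), exercises the growth path and should give the same result as starting at 1024.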
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
I have made the requested changes; please review again. Thanks for pointing that out and for your patience with the reviews. I am glad you pointed out the Malloc issue, I also found another problem with the latest Realloc change I had made. I will take your advice and create another PR for the mojibake.
Thanks for making the requested changes! @vstinner: please review the changes made to this pull request. |
I merged your PR, thanks! Would you mind writing a second PR to fix the encoding issue that I spotted? I explained how to fix it (use %S format).
Especially when using more complex nss modules, the call might take an
unknown amount of time depending on the service in question.
This makes sure other threads are not blocked waiting for the call to
finish.
https://bugs.python.org/issue33625
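As a rough analogy in plain C, the sketch below lets a process-wide pthread mutex stand in for the GIL. big_lock and lookup_gid() are hypothetical names; CPython itself wraps the call in Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS rather than using a mutex like this. The lock is dropped only around the potentially slow lookup, and the re-entrant variant writes into caller-owned storage, so other threads can run in the meantime without clobbering the result.

```c
#include <grp.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for the GIL in this sketch: one lock shared by all threads. */
pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper.  Must be called with big_lock held and returns with
   big_lock held.  Returns 0 and stores the gid on success, -1 otherwise. */
int
lookup_gid(const char *name, gid_t *gid)
{
    struct group grp;
    struct group *p = NULL;
    char buf[16384];      /* fixed size for brevity; no ERANGE retry here */
    int status;

    pthread_mutex_unlock(&big_lock);   /* let other threads run */
    status = getgrnam_r(name, &grp, buf, sizeof(buf), &p);
    pthread_mutex_lock(&big_lock);     /* take the lock back before returning */

    if (status != 0 || p == NULL) {
        return -1;
    }
    *gid = p->gr_gid;
    return 0;
}
```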