Skip to content

bpo-33625: Release GIL for grp.getgr{nam,gid} and pwd.getpw{nam,uid}#7081

Merged
vstinner merged 9 commits into
python:masterfrom
william-gr:pwd-release-gil
Sep 7, 2018
Merged

bpo-33625: Release GIL for grp.getgr{nam,gid} and pwd.getpw{nam,uid}#7081
vstinner merged 9 commits into
python:masterfrom
william-gr:pwd-release-gil

Conversation

@william-gr

@william-gr william-gr commented May 23, 2018

Copy link
Copy Markdown

Especially when using more complex nss modules the call might that an
unknown amount of time depending on the service in question.
This makes sures others threads are not blocked waiting the call to
finish.

https://bugs.python.org/issue33625

@tiran

tiran commented May 23, 2018

Copy link
Copy Markdown
Member

LGTM, thanks!

Please use the blurb tool to add a news entry.

@william-gr

Copy link
Copy Markdown
Author

@tiran NEWS added, let me know if thats not to your liking, its my first time doing this ; )

Thanks!

@serhiy-storchaka serhiy-storchaka left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about getpwall()? And is it worth to release GIL in the grp and the spwd modules?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sphinx complains about using the default role. Write as :func:`pwd.getpwnam` or ``pwd.getpwnam()``.

@william-gr

Copy link
Copy Markdown
Author

@serhiy-storchaka Regarding the getpwall() I thought about it but I wondered about the impact of releasing GIL on every loop iteration. I am not familar with cpython code, would that be bad?

Regarding the other modules, I would think so. Do you want me to do that in this same PR?

@serhiy-storchaka

Copy link
Copy Markdown
Member

You are right, and switching a thread between setpwent() and getpwent() is not safe. We would need to add a special global lock for getpwall(). Yet one example when GIL makes the code safer and simpler.

@tiran

tiran commented May 24, 2018

Copy link
Copy Markdown
Member

Can you either fix getgrgid and getgrnam in this PR or open up another PR. Don't bother about spwd. The module shouldn't be used any more.

@william-gr

Copy link
Copy Markdown
Author

@serhiy-storchaka You are absolutely right about re-entrant functions. However I am not sure whats the best way to check if they are available in the system in python project. Should I add a check in configure.ac for the existence of each of them? HAVE_GETPWNAM_R, HAVE_GETPWUID_R ?

@william-gr william-gr changed the title bpo-33625: Release GIL for getpwnam and getpwuid bpo-33625: Release GIL for grp.getgr(nam,gid} and pwd.getpw{nam,uid} May 24, 2018
@william-gr

Copy link
Copy Markdown
Author

I have changed grp module and used the re-entrant versions of them. Let me know what you think. Thanks!

@william-gr william-gr changed the title bpo-33625: Release GIL for grp.getgr(nam,gid} and pwd.getpw{nam,uid} bpo-33625: Release GIL for grp.getgr{nam,gid} and pwd.getpw{nam,uid} May 24, 2018
@william-gr

Copy link
Copy Markdown
Author

I am now handling ERANGE of _r functions, will update patch soon.

@william-gr william-gr force-pushed the pwd-release-gil branch 2 times, most recently from cc6368f to 1e68811 Compare May 24, 2018 15:16
@william-gr

Copy link
Copy Markdown
Author

Updated handling ERANGE, let me know how that looks and/or any suggestions/changes.

Thank you again.

@william-gr

Copy link
Copy Markdown
Author

Windows-PR seems to be failing but I dont know why, I cant find logs in the link. How should I proceed?

Comment thread Modules/grpmodule.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a constant size buffer. It is better to use a local variable for it.

struct group grp;
...
status = getgrgid_r(gid, &grp, buf, bufsize, &p);

Comment thread Modules/grpmodule.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use PyMem_RawMalloc().

Comment thread Modules/grpmodule.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a space between if and ( for conforming PEP 7.

Comment thread Modules/grpmodule.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use p instead of grpbuf.

Comment thread Modules/grpmodule.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition is doubled in the while below. It would be better to write:

if (grpbuf != NULL || status != ERANGE) {
    break;
}

and use while (1).

Comment thread Modules/grpmodule.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be more efficient to use realloc.

Comment thread Modules/grpmodule.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this check for?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to make sure we are not allocating too big of a buffer (its a maximum cap)

@william-gr

Copy link
Copy Markdown
Author

@serhiy-storchaka thanks for the review. Let me know if I have addressed them to your liking.

@william-gr william-gr force-pushed the pwd-release-gil branch 3 times, most recently from 7702a5f to 84031ff Compare May 24, 2018 19:53
@william-gr

Copy link
Copy Markdown
Author

Is there a way I can trigger a new macOS build? The failure does not seem related to the commit.

@bedevere-bot

Copy link
Copy Markdown

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

@vstinner vstinner left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New review. My comments apply to the all modified functions.

Comment thread Modules/grpmodule.c Outdated
#endif
if (p == NULL) {
if (buf != NULL) {
PyMem_RawFree(buf);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyMem_RawFree(NULL) is valid (does nothing), the if() is useless.

Comment thread Modules/grpmodule.c Outdated
break;
}
bufsize <<= 1;
p = PyMem_RawRealloc(buf, bufsize);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strange to reuse the "struct group *p;" variable to reallocate the "char *buf = NULL;" variable: C types are not the same!? I wold prefer to see a different variable (ex: "buf2").

Comment thread Modules/grpmodule.c Outdated

bufsize = sysconf(_SC_GETGR_R_SIZE_MAX);
if (bufsize == -1) {
bufsize = 1024;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind to add a "#define DEFAULT_BUFFER_SIZE 1024" at top level, rather than using an hardcoded constant here?

Comment thread Modules/grpmodule.c Outdated
if (p == NULL) {
if (nomem == 1) {
PyErr_NoMemory();
} else {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: the PEP 7 requires

}
else {

Comment thread Modules/pwdmodule.c Outdated
buf = (char *) p;
}

if (status != 0) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to move this just after the get...() call.

@bedevere-bot

Copy link
Copy Markdown

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@william-gr

Copy link
Copy Markdown
Author

I have made the requested changes; please review again

Thank you

@bedevere-bot

Copy link
Copy Markdown

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

@vstinner vstinner left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Honestly, the new PR is way better than the first version! I was going to merge it, when I spotted another issue: you don't check if PyMem_Malloc() fails. IMHO it's not a good idea to call get*() functions with buf=NULL. Maybe the function fails, maybe you get a crash... I would prefer to avoid the risk of crash :-) I proposed to move code which allocates the memory to avoid redundancy and to make sure that get*() are never called with buf=NULL.

By the way, I also proposed to fix a mojibake (encoding) issue while we are on these functions.

Maybe the mojibake issue can be fixed in a separated PR, since I don't think that we are going to backport this one to 3.7 and older. So maybe wait until this PR is merged, and then write a second PR to fix the mojibake issue, and we can easily backport the second PR to all supported branches.

Comment thread Modules/grpmodule.c
PyErr_NoMemory();
}
else {
PyErr_Format(PyExc_KeyError, "getgrnam(): name not found: %s", name_chars);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I see a bug which is not related to your change, but while we modify these functions, it would be nice to fix it.

name_chars is encoded to the filesystem encoding, whereas %s decodes it from UTF-8. If the filesystem encoding is not UTF-8, you get mojibake.

The fix is simple: use %S format and pass name: name is always a Unicode string.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If name is always a Unicode string, you can use %U.

It may be worth to use %R for the case if the name contains invisible characters or trailing whitespaces.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread Modules/pwdmodule.c
}
else {
PyErr_Format(PyExc_KeyError,
"getpwnam(): name not found: %s", name);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment: please use the %S format here, but if you modify this function, would you mind to also fix the name of the parameter? Rename "arg" to "name", as in Doc/library/pwd.rst. Maybe rename the char* name to name_chars, as in grp_getgrnam_impl().

Comment thread Modules/grpmodule.c Outdated
if (bufsize == -1) {
bufsize = DEFAULT_BUFFER_SIZE;
}
buf = PyMem_RawMalloc(bufsize);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, wait, you don't check if PyMem_Malloc() failed here? I'm not sure that it's safe to call get*() functions with buf=NULL.

I have a proposition. Move the memory allocation at the start of the loop, and always use PyMem_Realloc(). PyMem_Realloc(buf, bufsize) behaves as PyMem_Malloc(). Pseudo-code:

/* PyMem_Malloc code removed from here */
while (1) {
        buf2 = PyMem_RawRealloc(buf, bufsize);
        if (buf2 == NULL) {
            nomem = 1;
            break;
        }
        buf = buf2; /* buf cannot be NULL from this point */
        (...)
    /* no more PyMem_Realloc code here neither */
}

(I don't propose to add these comments, it's just to explain my idea ;-))

@bedevere-bot

Copy link
Copy Markdown

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@william-gr

Copy link
Copy Markdown
Author

I have made the requested changes; please review again

Thanks for pointing that out and your patience with the reviews. I am glad you pointed the Malloc issue, I also found other problem with latest Realloc change I had made.

I will take your advise and create another PR for mojibake.

@bedevere-bot

Copy link
Copy Markdown

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

@vstinner

vstinner commented Sep 7, 2018

Copy link
Copy Markdown
Member

I merged your PR, thanks!

Would you mind to write a second PR to fix the encoding issue that I spootted? I explained how to fix it (use %S format).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants