Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clearly point out the surprising behavior of --reload, --user, --group, --config (+ unlock a few previously impossible combinations) #3360

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

pajod
Copy link
Contributor

@pajod pajod commented Mar 18, 2025

Resubmitting a subset of #3201 - less mixing topics

Retained:

  • at least document the weird status quo of --user and --group settings
  • permit linting config.py without tons of complaints about class methods not taking self arg
  • permit --reload without publishing tracebacks (idea from #3125)
  • permit --reload-extra-files without --reload (separately in #3271)
    • a test that showcases how this works (similar to #3210 - will refactor if both accepted)
  • permit --config=/dev/null (without warning, when execution of gunicorn.conf.py file is not intended)

Not submitted here:

Related; I shall fix merge conflicts once this or one of these is merged:

Copy link
Owner

@benoitc benoitc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you describe what the user change fix? I don't see any real reason to change a behavior in place since many year . can you expand? The reason in #3212 (comment) described is not really a reason rather an opinion.

We need to be able to set the user and group independently of the arbiter when needed.

@@ -456,7 +456,7 @@ def _validate_callable(val):

def validate_user(val):
if val is None:
return os.geteuid()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this change fix?

Copy link
Contributor Author

@pajod pajod Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

systemd-analyze security tells users to cull unnecessary administrative syscalls from the sandbox, if the application does not need it. If gunicorn is already started without privileges (as would be common under systemd), it should no need it - everything is already setup. When it tries anyway, it can get killed with rather unspecific error code:

gunicorn.service: Main process exited, code=killed, status=31/SYS
gunicorn.service: Failed with result 'signal'.

So Gunicorn needs to be able to distinguish between being instructed to make this call, and not being instructed to make this call, so it does not fail in environments where it is neither asked nor allowed to.

@@ -1171,14 +1183,21 @@ class Group(Setting):
cli = ["-g", "--group"]
meta = "GROUP"
validator = validate_group
default = os.getegid()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again what does this change fix ? what is the error caused by this? We need to ensure the worker process run under the specified or running user.

Copy link
Contributor Author

@pajod pajod Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not what the administrator specified.
This is the default case of the administrator not asking Gunicorn to mess with gid.

The error caused by this default would show up as "permission denied", in case the gunicorn process starting under the intended unprivileged user attempts to change its egid to the same unprivileged group, and either fails the lookup or is denied the (no-op) setgid call.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has been done on purpose. Some OS get it incorrectly wrong when setting files permisssion when unset. Also a common pattern generally to set it to the user unless set so we make sure the worker can afaik write where the application wants;

The issue with systemd is that it try to do thing we shouldn't expect from a supervisor.

I am not really against that change but it need to be carefully tested outside of linux and systemd ecosystem. Not only in what does the system but UX.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some OS get it incorrectly wrong

What scenario or OS are you thinking of here?
I am under the impression that POSIX demands that os.setgid(os.getegid()) potentially drops saved & real, but leaves effective and supplementary groups untouched. Hence I do not expect file permissions to be different.

The issue with systemd is that it try to do thing we shouldn't expect from a supervisor.

One case of expecting PID1 to do setup things we should not unset is already guarded behind a warning. I would follow that approach.. if there was a message that could be printed to the admin on how to fix the problem. "Warning: You set cfg.gid=1337 - this might fail, or might not result in the group list you configured in systemd". But that message would not help, because the admin never asked for cfg.gid==1337 in the first place.

Not only in what does the system but UX.

Would you be less concerned about this patch if I added a bunch of debug-level messages, so the admin is more likely to understand what is happening? (even without my change, is not easy to understand #2245)

@pajod
Copy link
Contributor Author

pajod commented Mar 19, 2025

We need to be able to set the user and group independently of the arbiter when needed.

Yes.

when needed.

When requested. If it is not requested, it should be clarified why it would be needed anyway.


@benoitc Rebased patchset: I dropped the UID changes in favor of just comments & documentation text that help clarify the status quo. That should be uncontroversial, right? Requesting review of the other commits.


Meanwhile, I have not been able to create a linux reproducer of anything other than username lookups or chown call failing, so the fix that is needed for supported systems might not actually need changing the default.. I'll try those changes again one I got a better understanding of which common scenarios lead to the really bad scenarios, like infinite loop:

[INFO] Booting worker with pid: 445642
[ERROR] Worker (pid:445642) was sent SIGSYS!
[INFO] Booting worker with pid: 445643
[ERROR] Worker (pid:445643) was sent SIGSYS!
[INFO] Booting worker with pid: 445644
[ERROR] Worker (pid:445644) was sent SIGSYS!
[INFO] Booting worker with pid: 445645
[ERROR] Worker (pid:445645) was sent SIGSYS!

@pajod pajod changed the title Change --user, --group --reload, --config option defaults clearly point out the surprising behavior of --reload, --user, --group, --config (+ unlock a few previously impossible combinations) Apr 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants