-
Notifications
You must be signed in to change notification settings - Fork 18.9k
fix: Normalize RLIMIT_NOFILE (LimitNOFILE) to sensible defaults
#45534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bcc90b5 to
8b4e85b
Compare
|
Any reason not to drop the custom NOFILE limit configuration entirely and simply use the system's default, at least for the systemd unit? |
|
@corhere If you're sure that all supported distros will be compatible with doing so, removing the line should be ok 👍 (CentOS 7 is cited as still supported earlier this year, which is stuck on an old systemd release prior to 240) Alternatively, older supported distros, could be advised to use a drop-in override? NOTE: Concerns - Verbose (click to expand)Concern - May be preferable to keep
|
This comment was marked as resolved.
This comment was marked as resolved.
|
#45551 👀 (let's please not waste any cycles on Upstart 😅) |
893e6d5 to
5687f27
Compare
|
Could you please rebase to drop the original upstart change instead of reverting it in a later commit? |
3c7e689 to
674afeb
Compare
Discussing in our sync call; perhaps this is the way we should go, and use systemd's defaults, which are the same currently. Does that work for you, @polarathene ? |
That would be fine for me, but you mentioned in the past you still support CentOS 7 (EOL in roughly a year). That does not have a new enough systemd version and IIRC cannot support v240+, thus you'd probably fallback to I also expanded on that concern and another earlier, click to expand the collapsed details if that's helpful. My stance remains that it is probably be wise to go with what this PR proposes, and then at a later release remove the |
|
I'm in favor of |
|
Its probably easier to remove it now and add a patch for older distros at packaging time. |
What is the advantage of keeping a separate patch around to maintain? I believe keeping the value explicit for now (as this PR proposes) is more beneficial to everyone. Explicit defaults ensures all distros have the sane values (that would be implicit for modern distros), while documenting the change as a bonus (easier to track via When the older distros are considered unsupported, you'd remove the explicit config like you would the separate patch. This approach would transition more smoothly IMO. It also aligns with the |
My thought here is we'd only be keeping it for a distro version that is rapidly headed towards EOL. |
|
Given this is both repository is both "upstream" of packagers (Docker Inc., Mirantis, Microsoft) and this is a @corhere do you have a preference? I know Mirantis maintains their own systemd units, so maybe there are no strong opinions? |
I'm fine with either outcome personally, both would be much better than the current I would like to draw attention to the benefits I raised in my previous comment of keeping the None of that matters for me, but might be worth taking into consideration. Would you like me to make any changes to this PR? Systemd aside, this PR also normalizes all the other init configs to the same hard limit |
cpuguy83
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed this with maintainers.
We'd like to get this change in for 25.0 release and agreed to just remove it completely.
In packaging we'll patch the unit file as needed (pretty much just centos7).
Are you good to make those changes?
This comment was marked as resolved.
This comment was marked as resolved.
neersighted
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't squash-merge in this repository, so if you can rewrite history into a single commit, that would be appreciated. I don't think there's a need to split changes as systemd vs non-systemd.
|
LGTM just needs a squash. |
dbf46d2 to
f3c7667
Compare
During review, it was decided to remove `LimitNOFILE` from `docker.service` to rely on the systemd v240 implicit default of `1024:524288`. On supported platforms with systemd prior to v240, packagers will patch the service with an explicit `LimitNOFILE=1024:524288`. - `1024` soft limit is an implicit default, avoiding unexpected breakage. Software that needs a higher limit should request to raise the soft limit for its process. - `524288` hard limit is an implicit default since systemd v240 and is adequate for most processes (_half of the historical limit from `fs.nr_open` of `1048576`_), while 4096 is the implicit default from the kernel (often too low). Individual containers can be started with `--ulimit` when a larger hard limit is required. - The hard limit may not exceed `fs.nr_open` (_which a value of `infinity` will resolve to_). On most systems with systemd v240 or newer, this will resolve to an excessive size of 2^30 (over 1 billion). - When set to `infinity` (usually as the soft limit) software may experience significantly increased resource usage, resulting in a performance regression or runtime failures that are difficult to troubleshoot. - OpenRC current config approach lacks support for different soft/hard limits being set as it adjusts additional limits and `ulimit` does not support mixed usage of `-H` + `-S`. A soft limit of `524288` is not ideal, but 2^19 is much less overhead than 2^30, whilst a hard limit of 4096 would be problematic for Docker. Signed-off-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
f3c7667 to
c893010
Compare
|
Friendly ping, not sure if you got notified of the commit from last month. I've slightly revised the commit message today and rebased to current upstream. I also opened an equivalent PR for |
RLIMIT_NOFILE to sensible defaultsRLIMIT_NOFILE (LimitNOFILE) to sensible defaults
What I did
Adjusted init configs to be as consistent as possible with handling the
RLIMIT_NOFILEsetting (number of open file descriptors a single process can have at one time).dockerditself (_non-issue since Go 1.19 / Aug 2022).LimitNOFILE=infinityis a regression introduced in a May 2021 PR merging a 2018 commit from another repo, whereas the value ofinfinitybecame excessive on various systems starting from 2019.524288is a standard hard limit on systems with systemd since v240, being sufficient for software and not known to cause any problems.ulimitviarc_ulimit, preventing setting a soft and hard limit, thus it will maintain a high soft limit. You cannot use-H+-Sin the sameulimitcommand to set separate soft/hard values.Reasoning for `1024:524288` (click to expand)
infinityis especially problematic as a soft limit, introducing difficult to troubleshoot / discover bugs.1048576was adjusted to524288elsewhere to align withdocker.service/ systemd.containerd.servicethan it isdocker.service.LimitNOFILEindocker.service+containerd.serviceaffects resource usage and operations:524288should be more than enough to support 65kbusyboxcontainers (which itself would encounter various other bottlenecks along the way that require configuration changes to support).--ulimit "nofile=soft:hard"(asrootno greater thanfs.nr_open).1024:524288covered here.LimitNOFILE:How I did it
Approach (click to expand):
git blamehistory on config to lookup relevant PRs on Github, review the changes between them and discussions.mobyandcontainerdregarding the issue.Related History (click to expand):
docker.service(LimitNOFILE+LimitNPROC):524288:1048576)-u/-pbased on shell)docker.servicelimits toinfinity(this future comment seems to be about this comment/issue as the motivation for the "non-zero limit" config note)-u+-pchanged tounlimitedin other init configs.LimitNOFILEreverted back to1048576and relocated above the "non-zero limit" config note.rc_ulimitfor setting limitsLimitNOFILEchanged back toinfinityand moved back under the "non-zero limit" config noteLimitNOFILE=infinityresolves to. Some distros like Debian explicitly opt-out of that, keeping the original1048576limit.LimitNOFILEwas set tounlimitedfor the same reason as the others (this was originally the intent, but later realized as not applicable).More extensive history coverage here.
containerdalso had a similarLimitNOFILEconfig dance.In addition, the ecosystem around Docker also changed. Notably changes introduced that affected what
infinityresolves to (since systemd v240 in 2018Q4), and ignoring any configured soft limit (Go 1.19/1.20 releases between Aug 2022 and May 2023).Additional references (click to expand):
containerd(review discussion)How to verify it
Requires at least Moby 23.0.6, or 24.0.0 for an important fix introduced since Go 1.19.9 / 1.20.4:
Notes (click to expand)
Without this change, both Soft and Hard values would be the same and either
1048576(2^20) or1073741816(2^30).grep 'files' /proc/$(pidof dockerd)/limitswill report the soft limit as the hard limit.This is expected because Go implicitly raises the soft limit since Go 1.19 released in Aug 2022. That is reset to the expected soft limit when the daemon runs some processes involved in
docker build.Testing at runtime with containers using
docker runinstead of via image builds requires a related change incontainerd(min version required: 1.6.21 / 1.7.1). That can be verified as detailed here.Another reproduction on limits that may be of interest.
Description for the changelog
Adjusted
RLIMIT_NOFILEto more sensible defaults.A picture of a cute animal (not mandatory but encouraged)