projectquota: protect concurrent map access (ENGCORE-920)#39644
projectquota: protect concurrent map access (ENGCORE-920)#39644cpuguy83 merged 1 commit intomoby:masterfrom
Conversation
|
@amir73il PTAL |
There was a problem hiding this comment.
Looks like it... the code was written without concurrency in mind.
There was a problem hiding this comment.
maybe we could use atomics for this one
There was a problem hiding this comment.
yeah, it's easier to reuse the same lock
|
Seems like there's some other potential for data races around project ID's. |
|
Test is failing; |
daemon/graphdriver/quota/types.go
Outdated
There was a problem hiding this comment.
can you move this above quotas ? I think the convention is to put the mutex above the fields that it is protecting
There was a problem hiding this comment.
should nextProjectID also be inside the lock?
There was a problem hiding this comment.
oh, never mind, I see brian's comment above
b6421df to
4fbe160
Compare
@cpuguy83 I think I have handled it (now in the latest version)
Are you assuming it might be unsafe? I took a look at quotactl(2) man page and can't find anything that hints it might be the case. Perhaps @amir73il can chime in? |
daemon/graphdriver/quota/types.go
Outdated
There was a problem hiding this comment.
I'm surprised this passed the gofmt check.
Can we move add a whitespace line above import?
There was a problem hiding this comment.
Sure.
I am surprised as well -- it was auto-inserted by goimports which is a superset of gofmt -- maybe it's a corner case which they don't handle, I'll take a look.
There was a problem hiding this comment.
Protect access to q.quotas map, and lock around changing nextProjectID. Techinically, the lock in findNextProjectID() is not needed as it is only called during initialization, but one can never be too careful. Fixes: 52897d1 ("projectquota: utility class for project quota controls") Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
|
I see an old test here related to moby/integration-cli/docker_cli_create_test.go Lines 60 to 72 in 6345208 But test is only run on devicemapper. Perhaps we should change it to allow it to run on overlay2 with XFS? |
|
Some followups Tried to repro this, and apparently projectquota is not working on latest CentOS 7, due to So, It's all working just fine with Debian 9, though it's not reproducible there since it's a UP machine. Reproduced on a SLES 15 system with 6x Xeon and XFS with pquota, using the following config: and a stress-testing script from #39135 (took two runs with default values). Compiled a fixed version; so far no bug. Left it to run overnight to fully confirm. |
|
@kolyshkin can you create a tracking issue for project quota not working on CentOS 7 ? |
|
Ran 140 iterations of the following script overnight (created/removed 1400000 containers total) using dockerd compiled from moby/moby master on SLES-15 system, no issues. #!/bin/bash
NUMCTS=10000
PARALLEL=100
create() {
# -v 'vol_{}:/vol'
time ( seq $NUMCTS | parallel -j$PARALLEL docker create alpine top > /dev/null )
echo "^^^ CREATE TIME ^^^"
}
remove() {
time ( docker ps -qa | shuf | tail -n $NUMCTS | parallel -j$PARALLEL docker rm -f '{}' > /dev/null )
echo "^^^ REMOVE TIME ^^^"
}
# precreate $NUMCTS containers
create
echo
# create another $NUMCTS while removing $NUMCTS containers
create &
remove &
wait
# remove the rest of it
remove |
fixes #39643
The bug was always there, hidden by the global lock (removed in #39135, which opened a can of worms fixed in a few followup PRs, such as #39209)