Skip to content

storage: clean up timer in cacheWatcher.add#23210

Merged
k8s-github-robot merged 1 commit intokubernetes:masterfrom
rsc:master
Mar 19, 2016
Merged

storage: clean up timer in cacheWatcher.add#23210
k8s-github-robot merged 1 commit intokubernetes:masterfrom
rsc:master

Conversation

@rsc
Copy link
Copy Markdown
Contributor

@rsc rsc commented Mar 18, 2016

In the e2e benchmarks, this timer is a significant source of garbage
and stale timers. Because the timer is not stopped after its use
in the select, it stays in the timer heap until it eventually fires
(5 seconds later). Under load, a lot of 5-second timers can pile up
before any start going away. The timer heap being large makes timer
operations take longer; the operations are O(log N) but N is still big.

The way to fix this in current versions of Go is to stop the underlying
timer explicitly, which this CL does for this one case.

There are many other places in the code that use the same idiom,
but those do not show up on profiles of the e2e server.
I am investigating changes for Go 1.7's runtime that would make
the old code behave like this new code transparently, so I don't
think it's worth updating any uses of the idiom that are not in
hot spots found with profiling.

Measuring 'LIST nodes' latency in milliseconds during e2e test
shows the benefit of this change.

Using Go 1.4.2:

BEFORE p50: 148±7 p90: 328±19 p99: 513±29 n: 10
AFTER p50: 151±8 p90: 339±19 p99: 479±20 n: 9

Using Go 1.6.0:

BEFORE p50: 141±9 p90: 383±32 p99: 604±44 n: 11
AFTER p50: 140±14 p90: 360±31 p99: 483±39 n: 10

In the e2e benchmarks, this timer is a significant source of garbage
and stale timers. Because the timer is not stopped after its use
in the select, it stays in the timer heap until it eventually fires
(5 seconds later). Under load, a lot of 5-second timers can pile up
before any start going away. The timer heap being large makes timer
operations take longer; the operations are O(log N) but N is still big.

The way to fix this in current versions of Go is to stop the underlying
timer explicitly, which this CL does for this one case.

There are many other places in the code that use the same idiom,
but those do not show up on profiles of the e2e server.
I am investigating changes for Go 1.7's runtime that would make
the old code behave like this new code transparently, so I don't
think it's worth updating any uses of the idiom that are not in
hot spots found with profiling.

Measuring 'LIST nodes' latency in milliseconds during e2e test
shows the benefit of this change.

Using Go 1.4.2:

BEFORE  p50: 148±7   p90: 328±19  p99: 513±29  n: 10
AFTER   p50: 151±8   p90: 339±19  p99: 479±20  n: 9

Using Go 1.6.0:

BEFORE  p50: 141±9   p90: 383±32  p99: 604±44  n: 11
AFTER   p50: 140±14  p90: 360±31  p99: 483±39  n: 10
@rsc
Copy link
Copy Markdown
Contributor Author

rsc commented Mar 18, 2016

/cc @wojtek-t @gmarek

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Mar 18, 2016

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

2 similar comments
@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Mar 18, 2016

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Mar 18, 2016

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@k8s-github-robot
Copy link
Copy Markdown

Labelling this PR as size/XS

@k8s-github-robot k8s-github-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Mar 18, 2016
@ncdc
Copy link
Copy Markdown
Member

ncdc commented Mar 18, 2016

Nice find!

@lavalamp lavalamp assigned lavalamp and unassigned bgrant0607 Mar 18, 2016
@lavalamp lavalamp added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 18, 2016
@lavalamp
Copy link
Copy Markdown
Contributor

LGTM, thanks!

@lavalamp
Copy link
Copy Markdown
Contributor

I went through fixing these at one point, but I think this usage was added more recently.

@ncdc
Copy link
Copy Markdown
Member

ncdc commented Mar 18, 2016

@k8s-github-robot
Copy link
Copy Markdown

@k8s-bot ok to test
@k8s-bot test this

pr builder appears to be missing, activating due to 'lgtm' label.

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Mar 18, 2016

GCE e2e build/test passed for commit e4b369e.

@k8s-github-robot
Copy link
Copy Markdown

The author of this PR is not in the whitelist for merge, can one of the admins add the 'ok-to-merge' label?

@k8s-github-robot
Copy link
Copy Markdown

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Mar 19, 2016

GCE e2e build/test passed for commit e4b369e.

@k8s-github-robot
Copy link
Copy Markdown

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link
Copy Markdown

k8s-bot commented Mar 19, 2016

GCE e2e build/test passed for commit e4b369e.

@k8s-github-robot
Copy link
Copy Markdown

Automatic merge from submit-queue

k8s-github-robot pushed a commit that referenced this pull request Mar 19, 2016
Auto commit by PR queue bot
@k8s-github-robot k8s-github-robot merged commit 4be9587 into kubernetes:master Mar 19, 2016
@wojtek-t wojtek-t assigned wojtek-t and lavalamp and unassigned lavalamp and wojtek-t Mar 19, 2016
@wojtek-t
Copy link
Copy Markdown
Member

Nice - thanks @rsc !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants