Skip to content

Remote execution build time regression with Bazel 5.1+ when action has many inputs #15872

@clintharrison

Description

@clintharrison

Description of the bug:

As of Bazel 5.1 (specifically, the change in #15091 -- cc @coeuvre and @Wyverald), we have seen an action take considerably more time to execute remotely. We previously saw <5 seconds spent in "upload missing inputs", and this PR has increased that to 90+ seconds in CI and several minutes on developer machines.

This action has around 600k inputs, though we see very long execution time even when only ~5 of them change. It appears that this is allocating much much more than previously, though I am not very familiar with rxjava, let alone its internals 😄

With these changes, it's also impossible to interrupt Bazel; this helpfully prints a stack trace of the thread though:

220713 16:07:12.594:W 330 [com.google.devtools.build.lib.util.ThreadUtils.warnAboutSlowInterrupt] Interrupt took too long. Dumping thread state.
220713 16:07:12.599:W 330 [com.google.devtools.build.lib.util.ThreadUtils.lambda$warnAboutSlowInterrupt$1] RUNNABLE <skyframe-evaluator 5 290>
	app//io.reactivex.rxjava3.core.Observable.subscribe(Observable.java:13176)
	app//io.reactivex.rxjava3.internal.operators.observable.ObservableRefCount.subscribeActual(ObservableRefCount.java:84)
	app//io.reactivex.rxjava3.core.Observable.subscribe(Observable.java:13176)
	app//io.reactivex.rxjava3.internal.operators.observable.ObservableSingleSingle.subscribeActual(ObservableSingleSingle.java:36)
	app//io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
	app//io.reactivex.rxjava3.internal.operators.single.SingleDoOnSubscribe.subscribeActual(SingleDoOnSubscribe.java:41)
	app//io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
	app//io.reactivex.rxjava3.internal.operators.single.SingleFlatMapCompletable.subscribeActual(SingleFlatMapCompletable.java:44)
	app//io.reactivex.rxjava3.core.Completable.subscribe(Completable.java:2859)
	app//io.reactivex.rxjava3.internal.operators.completable.CompletableDefer.subscribeActual(CompletableDefer.java:43)
	app//io.reactivex.rxjava3.core.Completable.subscribe(Completable.java:2859)
	app//io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle.subscribeActual(CompletableToSingle.java:37)
	app//io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
	app//com.google.devtools.build.lib.remote.util.AsyncTaskCache$Execution.subscribeActual(AsyncTaskCache.java:157)
	app//io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
	app//com.google.devtools.build.lib.remote.util.AsyncTaskCache.lambda$execute$1(AsyncTaskCache.java:309)
	app//com.google.devtools.build.lib.remote.util.AsyncTaskCache$$Lambda$925/0x00000008006a0040.subscribe(Unknown Source)
	app//io.reactivex.rxjava3.internal.operators.single.SingleCreate.subscribeActual(SingleCreate.java:40)
	app//io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
	app//io.reactivex.rxjava3.internal.operators.completable.CompletableFromSingle.subscribeActual(CompletableFromSingle.java:29)
	app//io.reactivex.rxjava3.core.Completable.subscribe(Completable.java:2859)
	app//io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle.subscribeActual(CompletableToSingle.java:37)
	app//io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
	app//io.reactivex.rxjava3.internal.operators.single.SingleResumeNext.subscribeActual(SingleResumeNext.java:39)
	app//io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
	app//io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber.onNext(FlowableFlatMapSingle.java:131)
	app//io.reactivex.rxjava3.internal.operators.flowable.FlowableFromIterable$IteratorSubscription.fastPath(FlowableFromIterable.java:185)
	app//io.reactivex.rxjava3.internal.operators.flowable.FlowableFromIterable$BaseRangeSubscription.request(FlowableFromIterable.java:129)
	app//io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber.onSubscribe(FlowableFlatMapSingle.java:106)
	app//io.reactivex.rxjava3.internal.operators.flowable.FlowableFromIterable.subscribe(FlowableFromIterable.java:69)
	app//io.reactivex.rxjava3.internal.operators.flowable.FlowableFromIterable.subscribeActual(FlowableFromIterable.java:47)
	app//io.reactivex.rxjava3.core.Flowable.subscribe(Flowable.java:15917)
	app//io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle.subscribeActual(FlowableFlatMapSingle.java:53)
	app//io.reactivex.rxjava3.core.Flowable.subscribe(Flowable.java:15917)
	app//io.reactivex.rxjava3.internal.operators.flowable.FlowableCollectSingle.subscribeActual(FlowableCollectSingle.java:53)
	app//io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
	app//io.reactivex.rxjava3.internal.operators.single.SingleFlatMapCompletable.subscribeActual(SingleFlatMapCompletable.java:44)
	app//io.reactivex.rxjava3.core.Completable.subscribe(Completable.java:2859)
	app//io.reactivex.rxjava3.core.Completable.blockingAwait(Completable.java:1467)
	app//com.google.devtools.build.lib.remote.RemoteExecutionCache.ensureInputsPresent(RemoteExecutionCache.java:101)
	app//com.google.devtools.build.lib.remote.RemoteExecutionService.uploadInputsIfNotPresent(RemoteExecutionService.java:1279)

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I don't have a minimal repro readily available: if necessary I suspect we could reproduce this with rules_nodejs pretty easily given that node_modules tend to have a lot of input files. Let me know if this would be helpful!

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

release 5.2.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

Metadata

Metadata

Assignees

Labels

P1I'll work on this now. (Assignee required)team-Remote-ExecIssues and PRs for the Execution (Remote) teamtype: bug

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions