
NullPointerException after remote execution build rewind #18547

@joeljeske


Description of the bug:

Sometimes, after Bazel sees a remote cache eviction and rewinds (retries) a remote execution build, it immediately crashes with a NullPointerException (NPE).

Some relevant bazelrc:

build --experimental_action_cache_store_output_metadata
build --remote_download_outputs=minimal
build --build_runfile_links
build --experimental_remote_cache_eviction_retries=2
build --incompatible_remote_use_new_exit_code_for_lost_inputs
build --experimental_merged_skyframe_analysis_execution 
build --experimental_skymeld_ui
--
  | (11:58:11) ERROR: remote spawn failed: 3 errors during bulk transfer:
  | com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 6428d557a02ebd774d0f0723dfba0fd5a94656f2d7d8d42163c8eee50b7fdcd5/83
  | com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: 34acfeb489251c6a6eec9985b857db13ae06fbfdabfbdafb394368b20977be85/999
  | com.google.devtools.build.lib.remote.common.CacheNotFoundException: Missing digest: af99fcbddd1d45ac4b1ab3d21ef6665c3a4c78003f3dcf2855f406e0d41f5545/1670
  | (11:58:11) INFO: Elapsed time: 639.140s, Critical Path: 504.66s
  | (11:58:11) INFO: 288 processes: 162 remote cache hit, 119 internal, 2 linux-sandbox, 4 local, 1 remote.
  | (11:58:11) FAILED: Build did NOT complete successfully
  | (Skipping other failed to build tests)
  |  
  | (11:58:12) FAILED: Build did NOT complete successfully
  | Found remote cache eviction error, retrying the build...
  | (11:58:13) INFO: Invocation ID: d4077ca2-04ef-4b75-a53f-29748b19f2ac
  | (11:58:13) INFO: Current date is 2023-05-31
  | (11:58:13) Loading:
  | (11:58:25) Loading:
  | (11:58:25) Loading: 0 packages loaded
  | (11:58:32) Analyzing: 134227 targets (0 packages loaded, 0 targets configured)
  | (11:58:32) Analyzing: 134227 targets (0 packages loaded, 0 targets configured)
  | [0 / 31] [Prepa] Action generation.txt
  | (11:58:34) FATAL: bazel crashed due to an internal error. Printing stack trace:
  | java.lang.NullPointerException
  | at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:889)
  | at com.google.devtools.build.lib.buildtool.ExecutionProgressReceiver.evaluated(ExecutionProgressReceiver.java:142)
  | at com.google.devtools.build.lib.skyframe.SkyframeExecutor$SkyframeProgressReceiver.evaluated(SkyframeExecutor.java:3059)
  | at com.google.devtools.build.skyframe.DirtyTrackingProgressReceiver.evaluated(DirtyTrackingProgressReceiver.java:116)
  | at com.google.devtools.build.skyframe.ParallelEvaluator.informProgressReceiverThatValueIsDone(ParallelEvaluator.java:127)
  | at com.google.devtools.build.skyframe.ParallelEvaluator.doMutatingEvaluation(ParallelEvaluator.java:154)
  | at com.google.devtools.build.skyframe.ParallelEvaluator.eval(ParallelEvaluator.java:677)
  | at com.google.devtools.build.skyframe.InMemoryMemoizingEvaluator.evaluate(InMemoryMemoizingEvaluator.java:203)
  | at com.google.devtools.build.lib.skyframe.SkyframeExecutor.evaluateBuildDriverKeys(SkyframeExecutor.java:2258)
  | at com.google.devtools.build.lib.skyframe.SkyframeBuildView.analyzeAndExecuteTargets(SkyframeBuildView.java:672)
  | at com.google.devtools.build.lib.analysis.BuildView.update(BuildView.java:416)
  | at com.google.devtools.build.lib.buildtool.AnalysisAndExecutionPhaseRunner.runAnalysisAndExecutionPhase(AnalysisAndExecutionPhaseRunner.java:210)
  | at com.google.devtools.build.lib.buildtool.AnalysisAndExecutionPhaseRunner.execute(AnalysisAndExecutionPhaseRunner.java:128)
  | at com.google.devtools.build.lib.buildtool.BuildTool.buildTargetsWithMergedAnalysisExecution(BuildTool.java:332)
  | at com.google.devtools.build.lib.buildtool.BuildTool.buildTargets(BuildTool.java:175)
  | at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:494)
  | at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:462)
  | at com.google.devtools.build.lib.runtime.commands.TestCommand.doTest(TestCommand.java:148)
  | at com.google.devtools.build.lib.runtime.commands.TestCommand.exec(TestCommand.java:113)
  | at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.execExclusively(BlazeCommandDispatcher.java:625)
  | at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.exec(BlazeCommandDispatcher.java:240)
  | at com.google.devtools.build.lib.server.GrpcServerImpl.executeCommand(GrpcServerImpl.java:550)
  | at com.google.devtools.build.lib.server.GrpcServerImpl.lambda$run$1(GrpcServerImpl.java:614)
  | at io.grpc.Context$1.run(Context.java:566)
  | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
  | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
  | at java.base/java.lang.Thread.run(Unknown Source)
  | BAZEL_TEST_EXIT_CODE: 37

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Unsure

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

release 6.2.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response
