Better fix for dataflow analysis caching by msridhar · Pull Request #1353 · uber/NullAway

msridhar · 2025-11-26T04:07:47Z

We get rid of having two related caches that can get out of sync. Instead, we create a new RunOnceForwardAnalysisImpl class that allows for running a dataflow analysis at most once.

Summary by CodeRabbit

Refactor
- Improved analysis caching to simplify and speed reuse of analysis results.
- Introduced a run-once analysis wrapper to ensure each analysis executes at most once, reducing redundant work.
- Removed redundant per-parameter run-tracking and related invalidation logic, streamlining cache management and instantiation paths.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

codecov · 2025-11-26T04:15:02Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.39%. Comparing base (119448e) to head (c320c3a).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff            @@
##             master    #1353   +/-   ##
=========================================
  Coverage     88.39%   88.39%           
- Complexity     2592     2594    +2     
=========================================
  Files            97       98    +1     
  Lines          8702     8705    +3     
  Branches       1732     1732           
=========================================
+ Hits           7692     7695    +3     
  Misses          505      505           
  Partials        505      505

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

github-actions · 2025-11-26T05:53:07Z

Main Branch:

Benchmark                          Mode  Cnt   Score   Error  Units
AutodisposeBenchmark.compile      thrpt   25   9.951 ± 0.102  ops/s
CaffeineBenchmark.compile         thrpt   25   2.065 ± 0.013  ops/s
DFlowMicroBenchmark.compile       thrpt   25  35.200 ± 0.306  ops/s
NullawayReleaseBenchmark.compile  thrpt   25   1.562 ± 0.022  ops/s

With This PR:

Benchmark                          Mode  Cnt   Score   Error  Units
AutodisposeBenchmark.compile      thrpt   25   9.865 ± 0.100  ops/s
CaffeineBenchmark.compile         thrpt   25   2.015 ± 0.032  ops/s
DFlowMicroBenchmark.compile       thrpt   25  34.706 ± 0.289  ops/s
NullawayReleaseBenchmark.compile  thrpt   25   1.536 ± 0.047  ops/s

coderabbitai · 2025-11-26T22:04:43Z

Walkthrough

The PR changes DataFlow to store RunOnceForwardAnalysisImpl instances in the analysis cache instead of generic Analysis objects, removes the separate alreadyRunAnalyses cache and its associated invalidation/lookup logic, and updates cache loader and dataflow invocation sites to use RunOnceForwardAnalysisImpl. A new RunOnceForwardAnalysisImpl class wraps ForwardAnalysisImpl and overrides performAnalysis to execute and cache results only on the first call.

Possibly related PRs

Bound size of alreadyRunAnalyses cache to be consistent #1334: Modifies DataFlow's alreadyRunAnalyses cache logic — touches the same caching/run-tracking code paths now replaced by RunOnceForwardAnalysisImpl.
Address severe performance regression in dataflow analysis #1328: Introduced the alreadyRunAnalyses guard to avoid re-running analyses; this PR removes that guard in favor of per-instance run-once behavior.

Suggested reviewers

lazaroclapp
yuxincs

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Better fix for dataflow analysis caching' accurately describes the main change: improving the caching mechanism for dataflow analysis by fixing the issue with unsynchronized caches.
Linked Issues check	✅ Passed	The PR successfully addresses #1351's root problem by replacing the dual-cache approach with RunOnceForwardAnalysisImpl, ensuring analyses execute at most once and eliminating cache inconsistency issues.
Out of Scope Changes check	✅ Passed	All changes are directly related to fixing the caching issue described in #1351; the new RunOnceForwardAnalysisImpl class and refactored DataFlow caching are within scope.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix-caching-again

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e46ebe and c320c3a.

📒 Files selected for processing (2)

nullaway/src/main/java/com/uber/nullaway/dataflow/DataFlow.java (2 hunks)
nullaway/src/main/java/com/uber/nullaway/dataflow/RunOnceForwardAnalysisImpl.java (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: Build and test on windows-latest
GitHub Check: Build spring-framework with snapshot
GitHub Check: Build and test on ubuntu-latest
GitHub Check: Build and test on macos-latest
GitHub Check: Build caffeine with snapshot

🔇 Additional comments (1)

nullaway/src/main/java/com/uber/nullaway/dataflow/DataFlow.java (1)

80-90: New single-cache + RunOnce wrapper correctly fixes the invalidation bug

Moving analysisCache to store RunOnceForwardAnalysisImpl<?, ?, ?> and invoking performAnalysis(cfg) on that wrapper (which is idempotent per instance) removes the prior split between analysisCache and alreadyRunAnalyses. After eviction, any reloaded analysis instance starts with analysisPerformed == false and will be initialized when performAnalysis is requested, so a newly created analysis can no longer be mistakenly treated as “already run” with uninitialized internal state. The generics cast and call sites still match the previous usage patterns, so behavior is preserved while the cache inconsistency is eliminated.

Also applies to: 159-163

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

Provide your own instructions using the high_level_summary_instructions setting.
Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

📝 Description — Summarize the main change in 50–60 words, explaining what was done.

📓 References — List relevant issues, discussions, documentation, or related PRs.

📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.

📊 Contributor Summary — Include a Markdown table showing contributions:
| Contributor | Lines Added | Lines Removed | Files Changed |

✔️ Additional Notes — Add any extra reviewer context.
Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 119448e and 5e46ebe.

📒 Files selected for processing (2)

nullaway/src/main/java/com/uber/nullaway/dataflow/DataFlow.java (2 hunks)
nullaway/src/main/java/com/uber/nullaway/dataflow/RunOnceForwardAnalysisImpl.java (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: benchmarking
GitHub Check: Build and test on windows-latest
GitHub Check: Build and test on macos-latest
GitHub Check: Build spring-framework with snapshot
GitHub Check: Build and test on ubuntu-latest
GitHub Check: Build caffeine with snapshot

🔇 Additional comments (3)

nullaway/src/main/java/com/uber/nullaway/dataflow/RunOnceForwardAnalysisImpl.java (1)

9-15: Run-once wrapper cleanly encapsulates the caching fix

The class-level design and Javadoc clearly move the “already run” state into the analysis instance itself, which should prevent the cache/set desynchronization that caused #1351. Generics and inheritance from ForwardAnalysisImpl look correct and consistent with the existing dataflow APIs.

nullaway/src/main/java/com/uber/nullaway/dataflow/DataFlow.java (2)

80-90: Good refactor: cache now stores run-once analyses keyed by (transfer, cfg)

Switching analysisCache to LoadingCache<AnalysisParams, RunOnceForwardAnalysisImpl<?, ?, ?>> with the loader constructing a fresh RunOnceForwardAnalysisImpl for each (transferFunction, cfg) key cleanly centralizes the “already run” state inside the analysis instance. This eliminates the possibility of the cache entry being evicted while a separate alreadyRunAnalyses structure still claims the key is initialized, which is exactly the inconsistency that led to the NPE in issue #1351. The key still includes cfg, so a single analysis instance is never shared across different CFGs.

159-163: performAnalysisIfNeeded(cfg) correctly replaces external bookkeeping

Using RunOnceForwardAnalysisImpl<A, S, T> here and calling analysis.performAnalysisIfNeeded(cfg) whenever performAnalysis is true ensures that:

Analyses are initialized exactly once per cached instance, even if dataflow is called multiple times.

After eviction, a newly created analysis will always run before getResult()/getRegularExitStore() is used, so cfg-dependent fields can’t remain null.

This aligns with the PR’s goal of fixing the out-of-sync cache state and removes the need for a separate “already run” cache.

nullaway/src/main/java/com/uber/nullaway/dataflow/RunOnceForwardAnalysisImpl.java

github-actions · 2025-11-26T23:46:08Z

Main Branch:

Benchmark                          Mode  Cnt   Score   Error  Units
AutodisposeBenchmark.compile      thrpt   25  10.016 ± 0.032  ops/s
CaffeineBenchmark.compile         thrpt   25   2.069 ± 0.014  ops/s
DFlowMicroBenchmark.compile       thrpt   25  35.227 ± 0.209  ops/s
NullawayReleaseBenchmark.compile  thrpt   25   1.573 ± 0.045  ops/s

With This PR:

Benchmark                          Mode  Cnt   Score   Error  Units
AutodisposeBenchmark.compile      thrpt   25   9.902 ± 0.062  ops/s
CaffeineBenchmark.compile         thrpt   25   2.055 ± 0.011  ops/s
DFlowMicroBenchmark.compile       thrpt   25  34.747 ± 0.416  ops/s
NullawayReleaseBenchmark.compile  thrpt   25   1.550 ± 0.041  ops/s

Better fix for dataflow analysis caching

5cebe80

msridhar added the run-benchmarks label Nov 26, 2025

msridhar mentioned this pull request Nov 26, 2025

Incorrect Cache invalidation in Nullaway 0.12.12 #1351

Closed

docs and cleanup

5e46ebe

msridhar marked this pull request as ready for review November 26, 2025 22:00

msridhar added run-benchmarks and removed run-benchmarks labels Nov 26, 2025

msridhar requested a review from yuxincs November 26, 2025 22:01

yuxincs approved these changes Nov 26, 2025

View reviewed changes

coderabbitai bot reviewed Nov 26, 2025

View reviewed changes

nullaway/src/main/java/com/uber/nullaway/dataflow/RunOnceForwardAnalysisImpl.java Outdated Show resolved Hide resolved

nullaway/src/main/java/com/uber/nullaway/dataflow/RunOnceForwardAnalysisImpl.java Show resolved Hide resolved

tweaks

c320c3a

msridhar enabled auto-merge (squash) November 26, 2025 22:54

msridhar merged commit 7cd8b1b into master Nov 26, 2025
11 checks passed

msridhar deleted the fix-caching-again branch November 26, 2025 23:04

This was referenced Dec 11, 2025

Change how recursive calls to dataflow are detected #1374

Merged

Run dataflow analysis on bodies of lambdas passed to generic methods #1375

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better fix for dataflow analysis caching#1353

Better fix for dataflow analysis caching#1353
msridhar merged 3 commits intomasterfrom
fix-caching-again

msridhar commented Nov 26, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

codecov bot commented Nov 26, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Nov 26, 2025

Uh oh!

coderabbitai bot commented Nov 26, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Nov 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

msridhar commented Nov 26, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

codecov bot commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions bot commented Nov 26, 2025

Uh oh!

coderabbitai bot commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Possibly related PRs

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Nov 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

msridhar commented Nov 26, 2025 •

edited by coderabbitai bot

Loading

codecov bot commented Nov 26, 2025 •

edited

Loading

coderabbitai bot commented Nov 26, 2025 •

edited

Loading