Use refined types from dataflow analysis in generic method inference#1309
Use refined types from dataflow analysis in generic method inference#1309
Conversation
WalkthroughThreads dataflow context (a TreePath and a calledFromDataflow boolean) through generic-method inference and constraint generation in GenericsChecks, adds dataflow-driven argument nullness refinement and helpers to update types with refined nullness, and makes inference caching conditional on whether inference was invoked from running dataflow. Adds AccessPathNullnessAnalysis.getNullnessFromRunning and changes DataFlow.expressionDataflow to accept an isRunning flag (and control whether analysis is performed). Updates call sites and tests to propagate and exercise the new dataflow-aware inference behavior, and adds TypeSubstitutionUtils.removeNullableAnnotation. Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #1309 +/- ##
============================================
+ Coverage 88.42% 88.44% +0.02%
- Complexity 2523 2541 +18
============================================
Files 94 94
Lines 8415 8466 +51
Branches 1661 1673 +12
============================================
+ Hits 7441 7488 +47
- Misses 491 494 +3
- Partials 483 484 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (1)
770-805: Argument-type refinement via dataflow is correct; mind repeated queriesLogic to refine locals’ nullability using running/non-running dataflow is sound and addresses the inference gap. If perf becomes a concern, consider memoizing per local symbol within a single inference run to avoid repeated expressionDataflow calls.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
nullaway/src/main/java/com/uber/nullaway/dataflow/AccessPathNullnessAnalysis.java(2 hunks)nullaway/src/main/java/com/uber/nullaway/dataflow/AccessPathNullnessPropagation.java(1 hunks)nullaway/src/main/java/com/uber/nullaway/dataflow/DataFlow.java(5 hunks)nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java(13 hunks)nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (1)
nullaway/src/main/java/com/uber/nullaway/NullabilityUtil.java (1)
NullabilityUtil(66-668)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build and test on windows-latest
- GitHub Check: Build and test on ubuntu-latest
- GitHub Check: Build spring-framework with snapshot
- GitHub Check: Build caffeine with snapshot
- GitHub Check: Build and test on macos-latest
🔇 Additional comments (3)
nullaway/src/main/java/com/uber/nullaway/dataflow/AccessPathNullnessAnalysis.java (1)
111-112: API additions look good; running-mode accessor is appropriateSwitching to expressionDataflow(..., false) and adding getNullnessFromRunning(...) aligns with the new DataFlow API. Callers handle null results.
Also applies to: 114-124, 136-137
nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java (1)
804-845: Test updates reflect dataflow-aware inferenceRe-enabling and adding tests to validate refined-local behavior and loop interactions look correct and increase coverage of the new flow-sensitive inference.
Also applies to: 847-892, 1252-1254, 1266-1268
nullaway/src/main/java/com/uber/nullaway/dataflow/DataFlow.java (1)
216-235: Transfer instance reuse verified
AccessPathNullnessAnalysis.instance(...) caches a single nullnessPropagation per VisitorState and getNullnessFromRunning always passes that same instance to expressionDataflow(true); no changes required.
nullaway/src/main/java/com/uber/nullaway/dataflow/AccessPathNullnessPropagation.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (1)
64-64: Remove unused import.The
MethodInvocationNodeclass is imported but never used in this file.Apply this diff:
-import org.checkerframework.nullaway.dataflow.cfg.node.MethodInvocationNode;
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java(14 hunks)nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java(3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-28T04:54:20.953Z
Learnt from: msridhar
PR: uber/NullAway#1248
File: nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java:847-857
Timestamp: 2025-08-28T04:54:20.953Z
Learning: In NullAway's GenericsChecks.java, NewClassTree support for explicit type argument substitution requires more extensive changes beyond just modifying the conditional in compareGenericTypeParameterNullabilityForCall. The maintainers prefer to handle NewClassTree support in a separate follow-up rather than expanding the scope of PRs focused on specific issues like super constructor calls.
Applied to files:
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java
🧬 Code graph analysis (1)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (2)
nullaway/src/main/java/com/uber/nullaway/NullabilityUtil.java (1)
NullabilityUtil(66-668)nullaway/src/main/java/com/uber/nullaway/generics/TypeSubstitutionUtils.java (1)
TypeSubstitutionUtils(22-287)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build caffeine with snapshot
- GitHub Check: Build spring-framework with snapshot
- GitHub Check: Build and test on windows-latest
- GitHub Check: Build and test on macos-latest
- GitHub Check: Build and test on ubuntu-latest
🔇 Additional comments (8)
nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java (4)
804-845: LGTM!The test comprehensively covers dataflow-aware inference for local variables, including positive cases (where refinement detects
@Nullable), negative cases (where refinement detects@NonNull), and field assignment scenarios.
847-892: LGTM!The test correctly verifies dataflow refinement behavior in loops, ensuring that the analysis reaches a proper fixed point where nullable values are detected even when the nullability changes within loop iterations.
894-920: LGTM!The test correctly verifies that dataflow refinement applies to fields, both for nullable fields and for fields that have been refined to non-null via null checks.
1280-1280: LGTM!The updated error messages correctly reflect that generic method inference now runs during dataflow analysis, producing more precise error diagnostics at the actual error location (dereference or return statement) rather than at the parameter passing site.
Also applies to: 1294-1294
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (4)
107-121: LGTM!The re-entrancy guard now includes both a setter and getter, which allows callers to save and restore the previous state. This addresses the concern from previous reviews about nesting safety.
Based on learnings.
765-806: Verify TreePath construction is sufficient for all cases.The comment at lines 785-787 acknowledges that constructing
new TreePath(currentPath, argument)may not be a valid tree path to the argument. While the comment explains this should be sufficient for finding the enclosing method/lambda/initializer, this assumption should be verified.Consider adding a test case that exercises nested method calls with complex tree structures to ensure this assumption holds. For example:
static <T extends @Nullable Object> T id(T t) { return t; } void test() { String x = null; // Nested call: outerMethod(id(x)) // TreePath construction should still work here }Alternatively, if there's a more robust way to obtain the correct TreePath for an argument within an invocation, consider using that approach instead.
808-836: LGTM!The method correctly handles type refinement in both directions: adding a
@Nullableannotation when refining to nullable, and removing a@Nullableannotation when refining to non-null. The use ofTYPE_METADATA_BUILDERfor type cloning and the verification at line 832 are appropriate safeguards.
642-670: LGTM!The caching behavior correctly skips caching inference results when invoked from dataflow analysis, since the dataflow facts may not yet reflect the final fixed point. This prevents incorrect cached results from being used in subsequent analysis passes.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (1)
107-121: Re‑entrancy guard: ensure save/restore with try/finally at call sitesBoolean + getter is fine if callers save previous state and restore it in finally. Please verify all setCalledFromDataflow usages follow that pattern; otherwise, nesting can flip the flag incorrectly. Depth counter could be a future‑proof improvement.
Run to inspect usages:
#!/bin/bash echo "All setCalledFromDataflow() call sites:" rg -n -C3 'setCalledFromDataflow\(' echo echo "Nearby isCalledFromDataflow() (save/restore) usage:" rg -n -C3 'isCalledFromDataflow\('
🧹 Nitpick comments (2)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (2)
740-763: Dataflow‑aware argument refinement is a good addition; verify TreePath robustnessUsing new TreePath(currentPath, argument) is pragmatic but not a true AST path. Please verify that getNullness/getNullnessFromRunning do not rely on parent links beyond locating the enclosing method/lambda. If they do, consider resolving a real path via Trees.getPath or by deriving the argument’s path from the invocation path.
808-835: Handle synthetic @nullable in removal/inspection to keep updates symmetricTypes annotated with the synthetic “nullaway.synthetic.Nullable” won’t be recognized/removed by isNullableAnnotated() or the removal loop, potentially leaving stale synthetic annotations. Recommend recognizing the synthetic marker both for detection and removal.
Apply diffs:
@@ private Type updateTypeWithNullness( VisitorState state, Type argumentType, Nullness refinedNullness) { @@ - for (Attribute.TypeCompound annot : argumentType.getAnnotationMirrors()) { - String annotationName = annot.type.toString(); - if (Nullness.isNullableAnnotation(annotationName, config)) { + for (Attribute.TypeCompound annot : argumentType.getAnnotationMirrors()) { + String annotationName = annot.type.toString(); + if (Nullness.isNullableAnnotation(annotationName, config) + || "nullaway.synthetic.Nullable".equals(annotationName)) { removedNullable = true; continue; } updatedAnnotations.append(annot); } @@ public boolean isNullableAnnotated(Type type) { - return Nullness.hasNullableAnnotation(type.getAnnotationMirrors().stream(), config); + if (Nullness.hasNullableAnnotation(type.getAnnotationMirrors().stream(), config)) { + return true; + } + for (Attribute.TypeCompound annot : type.getAnnotationMirrors()) { + if ("nullaway.synthetic.Nullable".equals(annot.type.toString())) { + return true; + } + } + return false; }Also applies to: 1768-1776
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java(14 hunks)nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (1)
nullaway/src/main/java/com/uber/nullaway/NullabilityUtil.java (1)
NullabilityUtil(66-668)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build and test on ubuntu-latest
- GitHub Check: Build and test on windows-latest
- GitHub Check: Build and test on macos-latest
- GitHub Check: Build caffeine with snapshot
- GitHub Check: Build spring-framework with snapshot
🔇 Additional comments (8)
nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java (4)
804-845: Good coverage for locals/fields with dataflow-aware inferenceScenarios (locals, field assignment, fixed-point caveat) are well captured and assertions match new diagnostics.
847-892: Loop interplay with dataflow/inference looks solidCovers forward/backward edges and post-loop deref; matches expected nullable outcomes.
894-927: Field-based and unsupported-expression cases are appropriateField refinement and the “unsupported” conditional case validate boundaries nicely.
1287-1287: Diagnostics updates align with dataflow-driven return/receiver checksSwitch to dereferenced/returning-nullable messages is correct under refined inference.
Also applies to: 1301-1301
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (4)
642-649: Skipping cache while dataflow is running is correctPrevents caching results tied to non‑fixed‑point facts. LGTM.
Also applies to: 664-671
701-737: Path/threading through constraint generation looks correctPropagating TreePath into nested calls sets up context for dataflow‑aware refinement. LGTM.
765-806: Refinement logic is sound (locals/fields only, primitives skipped)Guarding on symbols avoids recursive inference on calls; correct use of getNullnessFromRunning when re‑entered. LGTM.
1299-1313: On‑demand inference with path propagation looks goodLazy inference when no explicit type args and proper context selection via getInvocationAndContextForInference. LGTM.
lazaroclapp
left a comment
There was a problem hiding this comment.
Looks good to me! Agree that the field setup is a bit of a code smell, but seems like it works and I can see threading the "is in dataflow" context being painful (we might want to look into aggregating a lot of the context for GenericChecks and figuring out a principled way to pass it around, since that whole processing is very context dependent, but I would do that as it's own PR in any case).
Other than that, a bunch of minor suggestions/nits, the main thing I'd expect to see is the test case from #1287, unless I missed it somehow :)
nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java
Show resolved
Hide resolved
nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java
Show resolved
Hide resolved
nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java
Show resolved
Hide resolved
nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java
Show resolved
Hide resolved
| * @see | ||
| * com.uber.nullaway.dataflow.AccessPathNullnessPropagation#genericReturnIsNullable(MethodInvocationNode) | ||
| */ | ||
| private boolean calledFromDataflow = false; |
There was a problem hiding this comment.
Maybe not quite yet, but I wonder if we will end up needing a GenericsChecks.context object with more information specific to where GenericsChecks is being run on. Also, do we re-use these objects? If not, I'd consider checking if we can make this final and set it when creating the GenericsChecks object.
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java
Outdated
Show resolved
Hide resolved
| verify(!genericsChecks.isCalledFromDataflow()); | ||
| genericsChecks.setCalledFromDataflow(true); |
There was a problem hiding this comment.
Ah, never mind my comment at https://github.com/uber/NullAway/pull/1309/files#diff-82d66c6ba15b87acc24509d7a3440bbe7606944197b929733d315fbe0a025435R113 below, I guess we are indeed using GenericsChecks as a re-usable singleton, not per invocation, so we can't make two instances with a final calledFromDataflow.
Threading the parameter might be a reasonable follow up, but I do see the description saying this was involved.
There was a problem hiding this comment.
I did try threading the parameter, but it ended up adding parameters to other public API methods, which would end up polluting the call sites of those methods when really dataflow only calls getGenericReturnNullnessAtInvocation. I could do it, but it doesn't seem clean. I'll double check and make sure it's as bad as I thought.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (1)
833-860: Consider extracting nullable annotation removal toTypeSubstitutionUtilsfor symmetry.The method correctly refines types based on dataflow-inferred nullness. However, there's an asymmetry:
- Adding
@Nullableuses the clean helperTypeSubstitutionUtils.typeWithAnnot(line 840)- Removing
@Nullablemanually iterates and filters annotations (lines 846-858)As previously noted in the past review comments, consider extracting the removal logic to a method like
TypeSubstitutionUtils.removeNullableAnnotations(Type, Config)for symmetry, maintainability, and reuse. The manual annotation filtering doesn't depend on anything specific toGenericsChecksother thanconfig.The logic itself is correct:
nullnessToBoolproperly checks if nullness represents nullable (line 835)- Early returns prevent unnecessary work (lines 838, 844)
Verify.verify(removedNullable)safely asserts correctness since we only reach line 846 whenisNullableAnnotatedreturns trueAs per coding guidelines
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
nullaway/src/main/java/com/uber/nullaway/dataflow/AccessPathNullnessAnalysis.java(2 hunks)nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java(15 hunks)nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- nullaway/src/main/java/com/uber/nullaway/dataflow/AccessPathNullnessAnalysis.java
- nullaway/src/test/java/com/uber/nullaway/jspecify/GenericMethodTests.java
🧰 Additional context used
🧬 Code graph analysis (1)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (2)
nullaway/src/main/java/com/uber/nullaway/NullabilityUtil.java (1)
NullabilityUtil(66-668)nullaway/src/main/java/com/uber/nullaway/generics/TypeSubstitutionUtils.java (1)
TypeSubstitutionUtils(22-287)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build caffeine with snapshot
- GitHub Check: Build and test on ubuntu-latest
- GitHub Check: Build and test on windows-latest
- GitHub Check: Build and test on macos-latest
- GitHub Check: Build spring-framework with snapshot
🔇 Additional comments (7)
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java (7)
118-132: Re-entrancy guard implementation looks good, but manual save/restore is required by callers.The addition of the
isCalledFromDataflow()getter addresses the previous review concern by enabling callers to save and restore the flag state. However, this approach still requires manual discipline (try/finally blocks) at all call sites, rather than the more robust depth-counter pattern suggested previously. While the current implementation is workable, ensure that all call sites inAccessPathNullnessPropagationproperly save and restore the flag.
589-590: TreePath threading enables dataflow context propagation.The addition of optional
@Nullable TreePath pathparameters throughout the inference call chain is a clean way to propagate invocation context without breaking existing callers. The nullable annotation appropriately handles cases where the path is not available or differs fromstate.getPath().Also applies to: 630-630, 644-644, 741-741, 757-759, 764-764, 776-776
665-671: Conditional caching correctly prevents storing results based on interim dataflow facts.The logic to skip caching when
calledFromDataflowis true is essential for correctness. During dataflow analysis, the retrieved facts may not yet reflect the fixed point, so caching results based on these interim facts would lead to incorrect inference in subsequent analyses. The implementation correctly handles both success and failure cases.Also applies to: 687-693
790-831: TreePath construction (line 813) is fragile and relies on undocumented assumptions.The construction
new TreePath(currentPath, argument)may not create a valid tree path ifargumentis not a direct child ofcurrentPath.getLeaf(). While the comment acknowledges this is "sketchy," and it may work becausefindEnclosingMethodOrLambdaOrInitializerstarts withgetParentPath(), this is fragile and could break if tree structure assumptions change.Consider one of the following approaches to make this more robust:
- Use
TreePath.getPath(state.getCompilationUnit(), argument)if the compilation unit is available- Add assertions to validate that the constructed path's structure is as expected
- Document the specific assumptions this code relies on
Otherwise, the re-entrancy handling (lines 818-826) correctly avoids recursion by checking
calledFromDataflowand calling the appropriate dataflow analysis method.Based on learnings
785-786: Dataflow refinement correctly integrated before constraint generation.The call to
refineArgumentTypeWithDataflowat line 785 is well-placed: it refines the argument type after obtaining it from the tree (line 780) but before generating subtype constraints (line 786). This ensures that generic method inference uses flow-sensitive types from dataflow analysis, which is the core objective of this PR.
1343-1343: Path parameter correctly propagated in nested inference.The
pathparameter is appropriately passed torunInferenceForCallat line 1343, consistent with the path-threading pattern established throughout this PR. This ensures that nested generic method invocations have access to the invocation context needed for dataflow-aware refinement.
6-6: Static import ofTYPE_METADATA_BUILDERis appropriate.The static import is consistent with usage patterns in
TypeSubstitutionUtils(as seen in the relevant code snippets) and improves readability at the usage sites (lines 857-858).
nullaway/src/main/java/com/uber/nullaway/generics/GenericsChecks.java
Outdated
Show resolved
Hide resolved
In the end I bit the bullet and passed it around. For other public APIs that aren't used from dataflow I just passed false for now rather than adding an extra parameter. It's a bit of an intrusive change but in the end better than a mutable field. Thanks for pushing on this :-) |
Fixes #1327 #1309 introduced a severe performance regression in dataflow analysis, as the result of running the dataflow analysis was no longer being cached. This caused a >100X slowdown in our dataflow micro-benchmark! (Which we really should have measured before landing #1309...) This PR re-introduces the cache for analysis results (by tracking when we have already run the analysis). This fix exposed some other bugs in generic method inference where we weren't passing around the right `TreePath` to enable discovery of appropriate assignment contexts and containing methods; we fix those bugs here too. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * More precise nullness inference for generic method returns in JSpecify mode by using invocation context when available. * **Bug Fixes** * Prevents redundant dataflow analyses by tracking already-run analyses; cache invalidation resets this tracking. * Behavior changes are internal only; public API remains unchanged. * **Tests** * Updated test setup and expected diagnostics to reflect refined generic inference and nullability behavior. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Fixes #1287
In generic method inference, we now try to obtain a refined type from dataflow-based type inference for arguments when generating constraints. This introduces a recursive dependence between dataflow analysis and generic method inference, which we handle as follows. We track whether generic method inference has been invoked from dataflow by passing a
calledFromDataflowparameter. When the parameter is true, generic method inference calls a new API to get the current dataflow fact for the argument from the running dataflow analysis, avoiding a recursive run. Since this may not be the fact in the final fixed point of dataflow, we skip caching of generic method inference results when it is invoked from the dataflow analysis. This may skip caching unnecessarily in some cases, but we prefer to make correctness more obvious for now; we can optimize further in the future if needed.Summary by CodeRabbit
New Features
Bug Fixes
Tests