Skip to content

Fix XLA searchsorted side='right' for NaN values#117850

Merged
copybara-service[bot] merged 1 commit into
tensorflow:masterfrom
praneethhere:fix-xla-searchsorted-nan-upper-bound
Jun 29, 2026
Merged

Fix XLA searchsorted side='right' for NaN values#117850
copybara-service[bot] merged 1 commit into
tensorflow:masterfrom
praneethhere:fix-xla-searchsorted-nan-upper-bound

Conversation

@praneethhere

Copy link
Copy Markdown
Contributor

Fixes #117804

This fixes an inconsistency between eager TensorFlow and XLA for
tf.searchsorted(..., side="right") when the search value is NaN

The XLA lowering for UpperBound was using value >= sorted_input to count the
matching positions. For NaN, that comparison is false for every sorted input
value, so XLA returned 0

Eager TensorFlow places a NaN search value at the end of the row for
side="right". This change keeps the existing comparison logic, but treats
NaN search values as greater than all sorted input values for UpperBound

I also added a regression test covering both side="left" and side="right"

Tested with:

bazel test //tensorflow/compiler/tests:searchsorted_op_test_cpu --repo_env=HERMETIC_PYTHON_VERSION=3.12 --macos_sdk_version=15.5 --jobs=4 --local_ram_resources=4096

@google-ml-butler google-ml-butler Bot added the size:S CL Change Size: Small label May 7, 2026
@google-cla

google-cla Bot commented May 7, 2026

Copy link
Copy Markdown

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@praneethhere

Copy link
Copy Markdown
Contributor Author

@googlebot rescan

@keerthanakadiri keerthanakadiri requested a review from cantonios May 7, 2026 15:28
@google-ml-butler google-ml-butler Bot added the awaiting review Pull request awaiting review label May 7, 2026
@github-project-automation github-project-automation Bot moved this to Assigned Reviewer in PR Queue May 7, 2026
@keerthanakadiri keerthanakadiri added the prtype:bugfix PR to fix a bug label May 7, 2026
@keerthanakadiri keerthanakadiri requested a review from a team June 25, 2026 14:48
@keerthanakadiri

Copy link
Copy Markdown
Contributor

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for handling NaN search values in the searchsorted operation by adding a nan_values_compare_greater parameter to BuildLowerUpperBoundOp and implementing the corresponding logic for UpperBoundOp. It also adds a unit test testNanValue to verify this behavior. The review feedback suggests optimizing the implementation by restricting the NaN comparison logic to floating-point input types using DataTypeIsFloating, thereby avoiding redundant HLO instructions for integer types.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +68 to +74
if (nan_values_compare_greater) {
// Match eager searchsorted side='right' behavior for NaN search values.
// A NaN search value is placed after all entries in the sorted input row.
auto value_is_nan = xla::Compare(values_reshaped, values_reshaped, {},
xla::ComparisonDirection::kNe);
comparison = xla::Or(comparison, value_is_nan);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since tf.searchsorted supports both floating-point and integer types, we should avoid generating unnecessary HLO instructions (Compare and Or) for integer types, as integers cannot be NaN. We can restrict this logic to only run when the input type is a floating-point type using DataTypeIsFloating.

Suggested change
if (nan_values_compare_greater) {
// Match eager searchsorted side='right' behavior for NaN search values.
// A NaN search value is placed after all entries in the sorted input row.
auto value_is_nan = xla::Compare(values_reshaped, values_reshaped, {},
xla::ComparisonDirection::kNe);
comparison = xla::Or(comparison, value_is_nan);
}
if (nan_values_compare_greater && DataTypeIsFloating(ctx->input_type(0))) {
// Match eager searchsorted side='right' behavior for NaN search values.
// A NaN search value is placed after all entries in the sorted input row.
auto value_is_nan = xla::Compare(values_reshaped, values_reshaped, {},
xla::ComparisonDirection::kNe);
comparison = xla::Or(comparison, value_is_nan);
}
References
  1. Identify unnecessary memory usage or redundant computations. (link)

@keerthanakadiri keerthanakadiri added the stat:awaiting response Status - Awaiting response from author label Jun 25, 2026

@dmiltr3 dmiltr3 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! The changes look great and correctly align XLA's behavior with eager TensorFlow for NaN search values in both side='left' and side='right'.

I appreciate that you included the unit tests covering left and right sides across the floating-point types (float16, float32, float64, and bfloat16).

@google-ml-butler google-ml-butler Bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jun 26, 2026
@github-project-automation github-project-automation Bot moved this from Assigned Reviewer to Approved by Reviewer in PR Queue Jun 26, 2026
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jun 26, 2026
@nithyak0204 nithyak0204 removed awaiting review Pull request awaiting review stat:awaiting response Status - Awaiting response from author labels Jun 29, 2026
copybara-service Bot pushed a commit that referenced this pull request Jun 29, 2026
- Old logic: Scans physical `BufferAllocation`s and only counts those where `IsPreallocatedTempBuffer()` is true. This misses scratch buffers that the compiler overlays onto live-out (output) allocations to save memory.

- New logic: Walks the executed `Thunk` sequence and sums the sizes of all logical `BufferUse::Scratch` slices, accurately capturing scratch usage regardless of physical overlay optimizations.

Example (32MB matmul scratch overlaid on a 157MB live-out output buffer):
- Old logic: Returns 0 bytes (skips the live-out allocation).
- New logic: Returns 32MB (correctly extracts the scratch thunk use).

Also had to a const way to Walk Thunks

#### Why this is needed

The current logic relies on the `BufferAssignments` being alive, since the `buffer` variable which holds an `HloValue` is owned by the `BufferAssignments` brass.

I'm in the process of removing the `BufferAssignment` from the executable since it it cannot be re-created when loading an AOT binary, and its not really needed. So we need to get rid of this implicit dependency before doing so.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#117850 from praneethhere:fix-xla-searchsorted-nan-upper-bound 7f73af1
PiperOrigin-RevId: 937384420
copybara-service Bot pushed a commit that referenced this pull request Jun 29, 2026
- Old logic: Scans physical `BufferAllocation`s and only counts those where `IsPreallocatedTempBuffer()` is true. This misses scratch buffers that the compiler overlays onto live-out (output) allocations to save memory.

- New logic: Walks the executed `Thunk` sequence and sums the sizes of all logical `BufferUse::Scratch` slices, accurately capturing scratch usage regardless of physical overlay optimizations.

Example (32MB matmul scratch overlaid on a 157MB live-out output buffer):
- Old logic: Returns 0 bytes (skips the live-out allocation).
- New logic: Returns 32MB (correctly extracts the scratch thunk use).

Also had to a const way to Walk Thunks

#### Why this is needed

The current logic relies on the `BufferAssignments` being alive, since the `buffer` variable which holds an `HloValue` is owned by the `BufferAssignments` brass.

I'm in the process of removing the `BufferAssignment` from the executable since it it cannot be re-created when loading an AOT binary, and its not really needed. So we need to get rid of this implicit dependency before doing so.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#117850 from praneethhere:fix-xla-searchsorted-nan-upper-bound 7f73af1
PiperOrigin-RevId: 937384420
@copybara-service copybara-service Bot merged commit 6195e27 into tensorflow:master Jun 29, 2026
12 checks passed
@github-project-automation github-project-automation Bot moved this from Approved by Reviewer to Merged in PR Queue Jun 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

prtype:bugfix PR to fix a bug ready to pull PR ready for merge process size:S CL Change Size: Small

Projects

Status: Merged

Development

Successfully merging this pull request may close these issues.

tf.searchsorted(side='right') returns wrong position for NaN under XLA

6 participants