Fix NaN handling in GPU cast operations for int32/int64 conversions (#12345) by Mohataseem89 · Pull Request #106214 · tensorflow/tensorflow

Mohataseem89 · 2025-12-14T16:45:26Z

Added IsNan function templates for different types and updated CastFunctor implementations to handle NaN values during type casting.

This PR addresses Issue #12345, which reported incorrect behavior when casting floating-point tensors containing NaN values to integer types on GPU devices.

Previously, GPU cast functors directly used isnan(v) in device lambdas. This caused undefined behavior for types like Eigen::half and bfloat16 on CUDA/ROCm, and NaN inputs were not safely handled when converting to int32 or int64.

Changes

- Added a device-friendly IsNan() utility function that supports float, double, Eigen::half, and bfloat16.
- Updated the following CastFunctors to use IsNan(v):
  - CastFunctor<GPUDevice, int32, float>
  - CastFunctor<GPUDevice, int64, float>
  - CastFunctor<GPUDevice, int32, double>
  - CastFunctor<GPUDevice, int64, double>
- Ensured that NaN values are safely converted to 0 during casting.
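
The NaN-to-0 behavior described in this list can be illustrated with a small host-side sketch. This is an assumption-laden analogue, not the code in cast_op_gpu.cu.cc: NanSafeCast is a hypothetical name, the IsNan shown here simply forwards to std::isnan, and the real functors run on GPUDevice through Eigen rather than as a plain callable.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the PR's IsNan(). The PR describes its helper as
// device-friendly and as also covering Eigen::half and bfloat16; this version
// only forwards to std::isnan on the host.
template <typename T>
inline bool IsNan(T v) {
  return std::isnan(v);
}

// Hypothetical host-side analogue of a floating-point to integer cast
// functor. NaN maps to 0; every other input goes through static_cast.
template <typename Out, typename In>
struct NanSafeCast {
  Out operator()(In v) const {
    return IsNan(v) ? Out(0) : static_cast<Out>(v);
  }
};
```

For example, NanSafeCast<std::int32_t, float>{}(std::nanf("")) yields 0, while finite in-range inputs are truncated toward zero as usual. The sketch guards only against NaN: in C++, converting an infinity or an out-of-range value to an integer type is still undefined behavior and would need its own handling.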

Files Affected

tensorflow/core/kernels/cast_op_gpu.cu.cc

This PR resolves Issue #12345 by adding safe NaN handling in GPU casting.

mihaimaruseac · 2025-12-15T17:54:39Z

Please don't use "add file"/"update file"/"fix file"/etc. commit messages. These are hard to reason about when looking at the history of the file/repository. Instead, please write explanatory git commit messages.

The commit message is also the title of the PR if the PR has only one commit. It is thus doubly important to have relevant commit messages, as they make PRs easier to understand and easier to analyze in search results.

For how to write good quality git commit messages, please consult https://cbea.ms/git-commit/

Please collapse both commits into just the first one (git rebase -i)

Mohataseem89 · 2025-12-16T13:47:00Z

Thanks for the guidance. I have squashed the commits into one and updated the commit message to be more descriptive, following the recommended guidelines.

mihaimaruseac · 2025-12-18T18:12:03Z

Waiting for review from @cantonios or someone else with more GPU kernel expertise.

google-ml-butler bot added the size:M (CL Change Size: Medium) label Dec 14, 2025

google-ml-butler bot assigned gbaned Dec 14, 2025

keerthanakadiri requested a review from mihaimaruseac December 15, 2025 05:20

google-ml-butler bot added the awaiting review (Pull request awaiting review) label Dec 15, 2025

keerthanakadiri added this to PR Queue Dec 15, 2025

github-project-automation bot moved this to Assigned Reviewer in PR Queue Dec 15, 2025

keerthanakadiri added the comp:core (issues related to core part of tensorflow) label Dec 15, 2025

mihaimaruseac requested a review from cantonios December 15, 2025 16:21

mihaimaruseac suggested changes Dec 15, 2025

github-project-automation bot moved this from Assigned Reviewer to Reviewer Requested Changes in PR Queue Dec 15, 2025

Fix NaN handling in GPU cast operations for int32/int64 conversions

25ac479

Add explicit IsNan helpers and update GPU CastFunctor implementations to safely handle NaN values when casting floating-point tensors to integer types on GPU devices, aligning behavior with CPU and NumPy.

Mohataseem89 force-pushed the patch-2 branch from cff5dba to 25ac479 Compare December 16, 2025 13:43

Mohataseem89 requested a review from mihaimaruseac December 17, 2025 09:49

keerthanakadiri requested review from cantonios and removed request for cantonios January 9, 2026 08:13

Mohataseem89 marked this pull request as draft February 6, 2026 11:38

Mohataseem89 marked this pull request as ready for review February 9, 2026 13:42

keerthanakadiri removed the request for review from mihaimaruseac February 24, 2026 06:15

Mohataseem89 wants to merge 1 commit into tensorflow:master from Mohataseem89:patch-2

4 participants
