
Fix OneHotEncoder segfault due to missing input shape validation #7302

Merged
gramalingam merged 3 commits into main from copilot/fix-7298
Sep 19, 2025

Conversation

Contributor

Copilot AI commented Sep 13, 2025

The OneHotEncoder operator in ai.onnx.ml was causing segmentation faults when used with custom operators that don't provide complete shape inference information. This occurred because the shape inference function directly accessed the input tensor shape without validating that the input type information was available.

Problem

The crash occurred in the OneHotEncoder shape inference function at this line:

const TensorShapeProto& input_shape = ctx.getInputType(0)->tensor_type().shape();

When a custom operator (like CustomIdentity in the example below) precedes OneHotEncoder and doesn't provide shape information, ctx.getInputType(0) could return nullptr or the input might not have a tensor type, leading to a segmentation fault.

Reproduction

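import onnx
from onnx import TensorProto, helper

# Create model with custom operator followed by OneHotEncoder
custom_identity = helper.make_node("CustomIdentity", ["A"], ["B"], domain="com.example")
onehot = helper.make_node("OneHotEncoder", ["B"], ["C"], domain="ai.onnx.ml", cats_strings=["foo", "bar"])

# Assumed completion (not in the original snippet): input/output types and opset versions are illustrative
inputs = [helper.make_tensor_value_info("A", TensorProto.STRING, [2])]
outputs = [helper.make_tensor_value_info("C", TensorProto.FLOAT, None)]
graph = helper.make_graph([custom_identity, onehot], "repro", inputs, outputs)
opsets = [helper.make_opsetid(d, v) for d, v in [("", 17), ("ai.onnx.ml", 3), ("com.example", 1)]]
model = helper.make_model(graph, opset_imports=opsets)

# This crashed with a segfault before the fix
onnx.checker.check_model(model, full_check=True)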

Solution

Added proper input validation using hasNInputShapes(ctx, 1) before accessing the input shape, following the established pattern used by other operators in the codebase:

// Check if input shape is available before accessing it
if (!hasNInputShapes(ctx, 1)) {
  return;
}

This function safely checks that:

  1. The input exists
  2. The input type is not null
  3. The input has shape information available
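
The three checks can be pictured with a small toy model. This is only a sketch in plain Python, not ONNX's C++ implementation: `ToyTensorType`, `ToyContext`, `has_n_input_shapes`, and `infer_output_shape` are hypothetical names invented for illustration, and the "append one trailing dimension for the categories" rule is a simplified assumption about what OneHotEncoder's shape inference does once a shape is known.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyTensorType:
    # None stands for "no shape information available" (hypothetical model)
    shape: Optional[List[int]] = None


@dataclass
class ToyContext:
    # One entry per input; None stands for "no type information at all"
    input_types: List[Optional[ToyTensorType]]


def has_n_input_shapes(ctx: ToyContext, n: int) -> bool:
    """Toy analogue of the guard: True only if the first n inputs have shapes."""
    for i in range(n):
        if i >= len(ctx.input_types):   # 1. the input exists
            return False
        input_type = ctx.input_types[i]
        if input_type is None:          # 2. the input type is not null
            return False
        if input_type.shape is None:    # 3. shape information is available
            return False
    return True


def infer_output_shape(ctx: ToyContext, num_categories: int) -> Optional[List[int]]:
    """Bail out early instead of dereferencing missing type information."""
    if not has_n_input_shapes(ctx, 1):
        return None  # nothing can be inferred; leave the output shape unset
    # Assumed simplification: input dims plus one trailing category dimension
    return list(ctx.input_types[0].shape) + [num_categories]
```

In this toy model, an input produced by an operator with no type information corresponds to `ToyContext([None])`, for which `infer_output_shape` returns `None` rather than failing. That early return is the behaviour the guard adds.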

Impact

  • Prevents segfaults when OneHotEncoder receives inputs from operators without complete type information
  • Maintains backward compatibility - normal operation is unchanged when input shapes are available
  • Follows established patterns - uses the same defensive coding pattern as ArrayFeatureExtractor and other operators
  • Minimal change - only adds 3 lines of validation code without altering core logic

Fixes #7298.



Copilot AI and others added 2 commits September 13, 2025 14:48
Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com>
Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com>
Copilot AI changed the title from "[WIP] onnx.helper.check_model segfaults due to bug in OneHotEncoder's shape inference" to "Fix OneHotEncoder segfault due to missing input shape validation" Sep 13, 2025
Copilot AI requested a review from justinchuby September 13, 2025 14:55

codecov Bot commented Sep 14, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 54.35%. Comparing base (be3e379) to head (f135aef).
⚠️ Report is 88 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7302   +/-   ##
=======================================
  Coverage   54.35%   54.35%           
=======================================
  Files         511      511           
  Lines       31837    31837           
  Branches     2850     2850           
=======================================
  Hits        17304    17304           
  Misses      13755    13755           
  Partials      778      778           


justinchuby marked this pull request as ready for review September 15, 2025 19:01
justinchuby requested a review from a team as a code owner September 15, 2025 19:01
@justinchuby
Member

@cbourjau I tested locally and this seems to be the correct fix. Could you review? I also wonder how to prevent this type of error systematically.

@cbourjau
Contributor

The "Reproduction" snippet does not inspire confidence, but the solution looks like the same check that is done in other operators, too.

I don't see an easy way to prevent this error systematically in the current code base :/

github-project-automation Bot moved this from In progress to Reviewer approved in PR Tracker Sep 19, 2025
gramalingam merged commit a0e31f5 into main Sep 19, 2025
50 checks passed
gramalingam deleted the copilot/fix-7298 branch September 19, 2025 20:39
github-project-automation Bot moved this from Reviewer approved to Done in PR Tracker Sep 19, 2025
justinchuby added this to the 1.19.1 milestone Sep 24, 2025
justinchuby removed this from the 1.19.1 milestone Sep 24, 2025

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

onnx.helper.check_model segfaults due to bug in OneHotEncoder's shape inference

5 participants