Skip to content

Remove unsafe code from StringParser#125640

Open
EgorBo wants to merge 12 commits intodotnet:mainfrom
EgorBo:remove-unsafe-from-stringparser
Open

Remove unsafe code from StringParser#125640
EgorBo wants to merge 12 commits intodotnet:mainfrom
EgorBo:remove-unsafe-from-stringparser

Conversation

@EgorBo
Copy link
Copy Markdown
Member

@EgorBo EgorBo commented Mar 17, 2026

Replace pointer-based parsing in ParseNextInt32, ParseNextInt64, ParseNextUInt32, ParseNextUInt64 with safe span-based int/long/uint/ulong.Parse calls using NumberStyles and CultureInfo.InvariantCulture.

  • Eliminates all unsafe/fixed code from StringParser
  • Uses Parse on ReadOnlySpan<char> with FormatException catch → ThrowForInvalidData()
  • OverflowException propagates naturally (preserves original exception contract)
  • Rejects leading + in signed parse methods to preserve original behavior

Replace pointer-based parsing with safe indexing and for loops
in ParseNextInt32, ParseNextInt64, ParseNextUInt32, ParseNextUInt64.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 17, 2026 01:52
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes unsafe pointer-based parsing from System.IO.StringParser numeric parsing helpers by switching to safe string indexing/for loops, with a small consistency fix around empty-component handling.

Changes:

  • Replaced fixed/pointer loops with safe indexed loops in ParseNextInt32/Int64/UInt32/UInt64.
  • Added/standardized empty-component validation (notably for ParseNextUInt64).

EgorBo and others added 2 commits March 17, 2026 02:57
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…d add empty-component tests

- Use AsSpan slice in all parse methods so JIT can eliminate bounds checks
- Add Theory test covering empty component -> InvalidDataException for all
  numeric parse methods

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 17, 2026 02:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes unsafe pointer-based parsing from System.IO.StringParser numeric parsing helpers, replacing it with safe ReadOnlySpan<char> indexing and for loops while keeping the same overflow/invalid-data behavior.

Changes:

  • Rewrote ParseNextInt32, ParseNextInt64, ParseNextUInt32, and ParseNextUInt64 to avoid fixed/pointers and instead parse via spans and indexing.
  • Added a new unit test intended to validate that empty components throw InvalidDataException for numeric parsing.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/libraries/Common/src/System/IO/StringParser.cs Replaces pointer-based numeric parsing loops with span/index-based parsing and explicit empty-component checks.
src/libraries/Common/tests/Tests/System/IO/StringParserTests.cs Adds a theory to validate numeric parsers throw InvalidDataException on empty components.

…anch

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@EgorBo
Copy link
Copy Markdown
Member Author

EgorBo commented Mar 17, 2026

@MihuBot

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes unsafe pointer-based parsing from StringParser’s numeric parsing APIs by switching to ReadOnlySpan<char> indexing/loops, and adds a regression test to ensure empty components throw InvalidDataException.

Changes:

  • Replaced pointer iteration with ReadOnlySpan<char> + for loops in ParseNextInt32/Int64/UInt32/UInt64.
  • Added a new xUnit theory covering empty components for all four numeric parse methods.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/libraries/Common/src/System/IO/StringParser.cs Converts numeric parsing implementations from unsafe pointer-based logic to safe span-based loops.
src/libraries/Common/tests/Tests/System/IO/StringParserTests.cs Adds a theory to validate empty-component behavior for numeric parsers.

@EgorBo EgorBo marked this pull request as ready for review March 17, 2026 14:14
Copilot AI review requested due to automatic review settings March 17, 2026 14:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes unsafe pointer-based parsing from StringParser numeric parsing methods by switching to ReadOnlySpan<char> indexing and for loops.

Changes:

  • Replaced pointer iteration with span indexing in ParseNextInt32 and ParseNextInt64.
  • Replaced pointer iteration with span indexing in ParseNextUInt32 and ParseNextUInt64.
  • Added explicit empty-span validation before parsing.

@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

As suggested by @tannergooding and @stephentoub: now that span exists,
use the built-in Parse methods instead of manual digit-by-digit parsing.
FormatException is caught and rethrown as InvalidDataException to preserve
existing exception semantics. OverflowException propagates naturally.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 24, 2026 15:53
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes unsafe pointer-based numeric parsing from StringParser by switching the numeric parsing helpers to span-based framework parsing APIs, keeping parsing allocation-free while eliminating fixed/pointer usage.

Changes:

  • Replaced manual unsafe digit loops in ParseNextInt32/Int64/UInt32/UInt64 with *.Parse(ReadOnlySpan<char>, NumberStyles, IFormatProvider) using InvariantCulture.
  • Added System.Globalization import to support CultureInfo.InvariantCulture.
  • Converted format failures into InvalidDataException via try/catch.

EgorBo and others added 2 commits March 24, 2026 18:19
…orInvalidData

- Switch from Parse+catch to TryParse+ThrowForInvalidData() to avoid
  exception-driven control flow for invalid data
- Reject leading '+' in ParseNextInt32/Int64 to preserve original behavior
  (old manual parser only accepted '-')
- Use ThrowForInvalidData() consistently instead of throw new InvalidDataException()
- Update overflow test: TryParse returns false for overflow, so these now
  throw InvalidDataException (all failures are 'invalid data' for this parser)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 24, 2026 17:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Switch from TryParse back to Parse: catch only FormatException and call
ThrowForInvalidData(), letting OverflowException propagate naturally.
This preserves the original exception contract where overflow and format
errors surface as distinct exception types.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add tests for invalid input (leading +, non-numeric, whitespace, lone minus,
negative for unsigned), empty component handling, valid signed values, and
per-method overflow. Uses shared MemberData for common invalid inputs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 10, 2026 03:46
Copy link
Copy Markdown
Member

@danmoseley danmoseley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed new tests. LGTM

Verified the tests pass before and after the product changes, except this line failed before - you fixed it. used to throw FormatException here
https://github.com/dotnet/runtime/pull/125640/changes#diff-a36e218583d58d3fd791a3477a626b7512f8ec202dafaea506303654dfc4fcecR204

@danmoseley danmoseley enabled auto-merge (squash) April 10, 2026 03:49
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces the pointer-based numeric parsing logic in StringParser with safe, span-based int/long/uint/ulong.Parse calls using NumberStyles and CultureInfo.InvariantCulture, and adds/extends tests to validate parsing behavior and exception mapping.

Changes:

  • Reworked ParseNextInt32/Int64/UInt32/UInt64 to use span-based Parse and translate FormatException into InvalidDataException via ThrowForInvalidData().
  • Added test coverage for invalid inputs, empty components, valid signed values, and overflow behavior for the numeric parse methods.
Show a summary per file
File Description
src/libraries/Common/src/System/IO/StringParser.cs Replaces unsafe/manual numeric parsing with safe Parse(ReadOnlySpan<char>, ...) using invariant culture and explicit + rejection for signed parses.
src/libraries/Common/tests/Tests/System/IO/StringParserTests.cs Adds theory/fact coverage for invalid/empty/valid/overflow scenarios across all numeric parse methods.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment on lines +200 to +207
public ulong ParseNextUInt64()
{
MoveNextOrFail();

ulong result = 0;
fixed (char* bufferPtr = _buffer)
try
{
char* p = bufferPtr + _startIndex;
char* end = bufferPtr + _endIndex;
while (p != end)
{
int d = *p - '0';
if (d < 0 || d > 9)
{
ThrowForInvalidData();
}
result = checked((result * 10ul) + (ulong)d);

p++;
}
return ulong.Parse(_buffer.AsSpan(_startIndex, _endIndex - _startIndex), NumberStyles.None, CultureInfo.InvariantCulture);
}
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParseNextUInt64 now treats an empty component as invalid (via ulong.Parse throwing FormatException), whereas the previous implementation returned 0 for an empty span (the digit loop would be skipped). This is a behavioral change that contradicts the PR description about preserving behavior/exception contract; please confirm the intended contract and either restore the prior empty->0 behavior or update the PR description/tests to explicitly document the breaking change.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

@danmoseley danmoseley Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EgorBo your call on this, seems correct to me. this is in your original change here, I only added a test for it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants