[API Proposal]: {Last}IndexOf{AnyExcept}Range #76106

@stephentoub

Background and motivation

In looking over various code bases, it's relatively common to want to search for things in or out of a particular range.

For example, check whether the input contains anything outside of the range [0x20, 0x7e]:

for (int i = 0; i < token.Length; ++i)
{
    if ((token[i] < 0x20) || (token[i] > 0x7e))
    {
        return true;
    }
}
return false;

Or find the next index that's not an ASCII digit:

for (; i < end; i++)
{
    if (!char.IsAsciiDigit(name[i]))
    {
        return false;
    }
}

Or find the index of the first ASCII digit:

// Move to the beginning of the number
for (; (uint)pos < (uint)text.Length; pos++)
{
    if (char.IsAsciiDigit(text[pos]))
    {
        break;
    }
}

Or determine whether the input contains anything outside of the range of a byte:

for (int i = 0; i < input.Length; i++)
{
    if (input[i] > (char)255)
    {
        return input; // This couldn't have come from the wire, someone assigned it directly.
    }
    else if (input[i] > (char)127)
    {
        possibleUtf8 = true;
        break;
    }
}

Or find the index of the first upper-case ASCII letter:

int i = 0;
while (i < s.Length)
{
    if (char.IsAsciiLetterUpper(pSource[i]))
    {
        break;
    }
    i++;
}

Or find the first index of a value that's at least some integer:

for (solarMonth = 1; solarMonth < 12; solarMonth++)
{
    if (days[solarMonth] >= solarDay)
    {
        break;
    }
}

Etc. Today we don't provide an IndexOfXx method that makes it easy to search for such ranges. Even if/when we provide a more general and vectorized IndexOfAny that supports more than 5 characters (#68328), such a method will likely be both harder to use (e.g. a separate initialization method in addition to the actual search method) and less efficient (because it's more complicated) than a dedicated IndexOfRange that lets you search for a simple range of values.

API Proposal

namespace System;

public static class MemoryExtensions
{
+    public static int IndexOfRange<T>(this ReadOnlySpan<T> span, T lowInclusive, T highInclusive) where T : IComparable<T>;
+    public static int IndexOfRange<T>(this Span<T> span, T lowInclusive, T highInclusive) where T : IComparable<T>;

+    public static int IndexOfAnyExceptRange<T>(this ReadOnlySpan<T> span, T lowInclusive, T highInclusive) where T : IComparable<T>;
+    public static int IndexOfAnyExceptRange<T>(this Span<T> span, T lowInclusive, T highInclusive) where T : IComparable<T>;

+    public static int LastIndexOfRange<T>(this ReadOnlySpan<T> span, T lowInclusive, T highInclusive) where T : IComparable<T>;
+    public static int LastIndexOfRange<T>(this Span<T> span, T lowInclusive, T highInclusive) where T : IComparable<T>;

+    public static int LastIndexOfAnyExceptRange<T>(this ReadOnlySpan<T> span, T lowInclusive, T highInclusive) where T : IComparable<T>;
+    public static int LastIndexOfAnyExceptRange<T>(this Span<T> span, T lowInclusive, T highInclusive) where T : IComparable<T>;
}
  • There are 8 methods/overloads here, covering: Span vs ReadOnlySpan, in the range vs outside the range (AnyExcept), and first vs last.
  • We can vectorize these, but only for built-in types where we know the semantics of the vector comparisons match the semantics of the type's IComparable<T> implementation. That's the same set of types we already vectorize in other operations; even as we're able to open up that set further (e.g. [API Proposal]: IBitwiseEquatable<T> #75642), that expansion likely won't extend to these methods.
  • I've marked the constraint as being non-nullable; I don't know what it would mean to have a null low/high value. But if we think there's a good reason to allow it, we could make the constraint nullable (IComparable<T>?). My intent is that the implementation will validate that the range values are non-null.
  • With char.IsBetween, we decided to consider it undefined behavior if high < low, and we don't validate high >= low. My intent here is to do the same.
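
The vectorization and high < low notes above can be illustrated with a rough sketch for char only. It assumes the common trick of folding both bounds into a single unsigned comparison, (value - low) <= (high - low), and uses System.Numerics.Vector<ushort>. The class name VectorizedRangeSketch is hypothetical, and this is not necessarily how the real implementation would be written.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

// Hypothetical sketch, not the proposed MemoryExtensions implementation.
public static class VectorizedRangeSketch
{
    // Returns the index of the first char outside [lowInclusive, highInclusive], or -1.
    // As in the proposal notes, highInclusive < lowInclusive is not validated.
    public static int IndexOfAnyExceptRange(ReadOnlySpan<char> span, char lowInclusive, char highInclusive)
    {
        // Reinterpret chars as ushorts so the vector comparisons are unsigned.
        ReadOnlySpan<ushort> values = MemoryMarshal.Cast<char, ushort>(span);
        ushort width = (ushort)(highInclusive - lowInclusive);
        int i = 0;

        if (Vector.IsHardwareAccelerated && values.Length >= Vector<ushort>.Count)
        {
            var lowVec = new Vector<ushort>((ushort)lowInclusive);
            var widthVec = new Vector<ushort>(width);

            for (; i <= values.Length - Vector<ushort>.Count; i += Vector<ushort>.Count)
            {
                var block = new Vector<ushort>(values.Slice(i));

                // (value - low) wraps around for value < low, so one unsigned
                // comparison against the width checks both bounds at once.
                if (!Vector.LessThanOrEqualAll(block - lowVec, widthVec))
                {
                    break; // Some element is out of range; locate it in the scalar loop.
                }
            }
        }

        // Scalar loop: handles the tail and pinpoints the exact index after a vector miss.
        for (; i < values.Length; i++)
        {
            if ((uint)(values[i] - lowInclusive) > width)
            {
                return i;
            }
        }

        return -1;
    }
}
```

Because the width is computed without checking the bounds, passing high < low in this sketch yields a meaningless result rather than an exception, which is in line with treating that case as undefined behavior.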

API Usage

internal static bool IsOnlyDigits(string str, int offset, int count) =>
    str.AsSpan(offset, count).IndexOfAnyExceptRange('0', '9') < 0;
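
To make the intended semantics concrete, here is a minimal scalar sketch of two of the proposed methods together with the usage above. Everything lives in a hypothetical RangeSearchSketch class rather than MemoryExtensions, the code is unvectorized, and ContainsNonPrintable is only an illustrative name for the first loop from the motivation section.

```csharp
using System;

// Hypothetical stand-in for the proposed MemoryExtensions methods (scalar only).
public static class RangeSearchSketch
{
    // Index of the first element inside [lowInclusive, highInclusive], or -1.
    public static int IndexOfRange<T>(this ReadOnlySpan<T> span, T lowInclusive, T highInclusive)
        where T : IComparable<T> =>
        IndexOfCore(span, lowInclusive, highInclusive, wantInRange: true);

    // Index of the first element outside [lowInclusive, highInclusive], or -1.
    public static int IndexOfAnyExceptRange<T>(this ReadOnlySpan<T> span, T lowInclusive, T highInclusive)
        where T : IComparable<T> =>
        IndexOfCore(span, lowInclusive, highInclusive, wantInRange: false);

    private static int IndexOfCore<T>(ReadOnlySpan<T> span, T low, T high, bool wantInRange)
        where T : IComparable<T>
    {
        // Assumption taken from the proposal notes: null bounds are rejected,
        // while high < low is deliberately not validated.
        if (low is null) throw new ArgumentNullException(nameof(low));
        if (high is null) throw new ArgumentNullException(nameof(high));

        for (int i = 0; i < span.Length; i++)
        {
            bool inRange = low.CompareTo(span[i]) <= 0 && high.CompareTo(span[i]) >= 0;
            if (inRange == wantInRange)
            {
                return i;
            }
        }

        return -1;
    }

    // The usage from the proposal, bound to the sketch above.
    public static bool IsOnlyDigits(string str, int offset, int count) =>
        str.AsSpan(offset, count).IndexOfAnyExceptRange('0', '9') < 0;

    // The first motivating loop: anything outside [0x20, 0x7e]?
    public static bool ContainsNonPrintable(ReadOnlySpan<char> token) =>
        token.IndexOfAnyExceptRange((char)0x20, (char)0x7e) >= 0;
}
```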

Alternative Designs

No response

Risks

No response
