Add IPAddress Span-based APIs#23224
Conversation
| @@ -150,6 +150,30 @@ public IPAddress(byte[] address, long scopeid) | |||
| PrivateScopeId = (uint)scopeid; | |||
| } | |||
There was a problem hiding this comment.
Can the IPAddress(byte[], long) ctor be implemented in terms of this new one?
public IPAddress(byte[] address, long scopeid) :
this(new ReadOnlySpan<byte>(address ?? new ArgumentNullException(nameof(address))), scopeid)
{
}or does that negatively impact the perf of the byte[]-based ctor?
There was a problem hiding this comment.
I didn't think of throwing the exception that way. Started with sharing the code but then the null check had to be specifically in the array based ctor and I ended up with the copy pasting of code.
Will change both ctors to use this type of null check. Thanks!
| } | ||
| } | ||
|
|
||
| public IPAddress(ReadOnlySpan<byte> address) |
There was a problem hiding this comment.
Same question; is there a reason we can't implement the existing byte[]-based ctor in terms of the span-based one?
| { | ||
| address = null; | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Leftover from troubleshooting that snuck in. I'll get rid of it.
| return false; | ||
| } | ||
|
|
||
| address = IPAddressParser.Parse(ipSpan, true); |
There was a problem hiding this comment.
Nit: I know the existing code doesn't have this, but it'd be nice to name the Boolean argument at the call site; otherwise it's difficult to know while reading the code what it means. (Also the parens around the expression below are unnecessary... again, I realize you were just copying the existing style.)
| } | ||
| } | ||
|
|
||
| TryWriteBytes(new Span<byte>(bytes), out int bytesWritten); |
There was a problem hiding this comment.
Nit: for robustness, it'd be good to assert that this returns true, e.g.
bool success = TryWriteBytes(new Span<byte>(bytes), out int bytesWritten);
Debug.Assert(success);
return bytes;| bytes[2] = (byte)(address >> 16); | ||
| bytes[3] = (byte)(address >> 24); | ||
| } | ||
| } |
There was a problem hiding this comment.
This code could be cleaned up while you're at it, and simplified to just:
int length = IsIPv6 ? IPAddressParserStatics.IPv6AddressBytes : IPAddressParserStatics.IPv4AddressBytes;
var bytes = new byte[length];
| } | ||
|
|
||
| return IPAddressParser.IPv6AddressToString(_numbers, PrivateScopeId, destination, out charsWritten); | ||
| } |
There was a problem hiding this comment.
Nit: you might condense this slightly, e.g.
public bool TryFormat(Span<char> destination, out int charsWritten) =>
IsIPv4 ?
IPAddressParser.IPv4AddressToString(PrivateAddress, destination, out charsWritten) :
IPAddressParser.IPv6AddressToString(_numbers, PrivateScopeId, destination, out charsWritten);|
|
||
| } | ||
|
|
||
| internal static unsafe IPAddress Parse(char* ipStringPtr, int ipStringPtrLength, bool tryParse) |
There was a problem hiding this comment.
Could this be written in terms of ReadOnlySpan<char> rather than in terms of char*? That'd a) be safer and b) avoid pinning for unnecessary periods of time. Regardless, the null check below can be moved up into the string-based overload.
| } | ||
|
|
||
| if (ipString.IndexOf(':') != -1) | ||
| if (ContainsColon(ipStringPtr, ipStringPtrLength)) |
There was a problem hiding this comment.
If you do switch to using span instead of pointers, you shouldn't need this custom ContainsColon either, as you should be able to just use Span.IndexOf. At the moment that'll require a reference to System.Memory, but hopefully that won't be needed in the near future.
There was a problem hiding this comment.
Any idea of which way to go? Both will require changes once the reference isn't required anymore.
There was a problem hiding this comment.
Especially in networking-related code, we want to avoid unnecessary unsafe code, so it'd be great to go the route of using spans as much as possible rather than introducing additional pointer-based code. For now I'd minimize changing pointer-based code to spans, as it could have more perf impact right now than we want to accept, but when converting string-based methods to something else, let's prefer spans over pointers.
| { | ||
| charsWritten = 0; | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Seems like this method should be made a private helper, and then another internal wrapper for it should be added that does this check. It's not needed when called from the string-returning version, only for the span one.
There was a problem hiding this comment.
Is it worth the extra method call to avoid that branch instruction? I think it might be better this way but I'm open to trying to measure it.
| Debug.Assert(address != null); | ||
| Debug.Assert(address.Length == IPAddressParserStatics.IPv6AddressShorts); | ||
|
|
||
| var buffer = Ipv6AddressToStringHelper(address, scopeId); |
There was a problem hiding this comment.
Why build up into a StringBuilder only to then copy the data out? We can't just go into the destination span directly?
There was a problem hiding this comment.
I went for this design to keep as much as possible shared. I will look into allocating a fixed size buffer for the string version instead.
There was a problem hiding this comment.
I see: the existing code is already using a StringBuilder, and you're now just passing that out of the call. Ok, makes sense. After this PR, it might make sense to look at cleaning that up further; seems like we should be able to avoid the StringBuilder entirely, but for now it's fine.
Nit: please change var => StringBuilder.
There was a problem hiding this comment.
Should I avoid using var in general for corefx repo?
There was a problem hiding this comment.
No, var is ok, but only if the type is named on the right-hand side, i.e. with a new or cast expression. We want the type of the variable to be easily understood by someone reading the code.
| int rem; | ||
| number = Math.DivRem(number, 10, out rem); | ||
| addressString[--i] = (char)('0' + rem); | ||
| addressString[offset - ++i] = (char)('0' + rem); |
There was a problem hiding this comment.
This is now accessing a ref inside of a loop. That's potentially expensive.
There was a problem hiding this comment.
Didn't know that about refs. Will fix.
There was a problem hiding this comment.
It just inhibits certain optimizations, as the compiler doesn't necessarily know whether the target is changing between uses.
| addressString[--i] = (char)('0' + rem); | ||
| addressString[offset - ++i] = (char)('0' + rem); | ||
| } while (number != 0); | ||
| offset = i; |
There was a problem hiding this comment.
I'm not understanding why this function needed to change...?
There was a problem hiding this comment.
Previous version wrote the IP from the end of the buffer. Writing this directly into the span caused leading \0. Changed it to write from the start of the buffer.
There was a problem hiding this comment.
I see. Ok. In that case it'd be worth adding an assert at the beginning validating that number <= 255. Or better yet, make it a byte rather than an int.
There was a problem hiding this comment.
Went with the assert as Math.DivRem only have overloads for int and long
| } | ||
|
|
||
| public static unsafe bool Ipv4StringToAddress(string ipString, out long address) | ||
| public static unsafe bool Ipv4StringToAddress(char* ipStringPtr, int ipStringPtrLength, out long address) |
There was a problem hiding this comment.
As earlier, it'd be good to using ReadOnlySpan<char> instead of the original string, rather than substituting char*. Especially in this IPAddress code, it'd be great to try to avoid adding any unnecessary unsafe code.
|
|
||
| // Only called from the IPv6Helper, only parse the canonical format | ||
| internal static unsafe int ParseHostNumber(string str, int start, int end) | ||
| internal static unsafe int ParseHostNumber(char* str, int start, int end) |
There was a problem hiding this comment.
Ditto on using ReadOnlySpan<char> unless there's a strong reason to avoid it. Will stop commenting on this, but it applies throughout the code you're touching... let's try to only switch to char* when calling existing functions that accept it.
|
|
||
| IPAddress ipAddress; | ||
| Assert.False(IPAddress.TryParse(null, out ipAddress)); | ||
| Assert.False(IPAddress.TryParse(test, out ipAddress)); |
There was a problem hiding this comment.
What was wrong with the code in this file as it was?
There was a problem hiding this comment.
The previous behaviour failed to build with ambigous overload error. Forcing the null to be of string type eliminates the Span version from the available candidates.
There was a problem hiding this comment.
Ok. FWIW, you can do that just by adding a cast, e.g. (string)null.
|
|
||
| namespace System.Net.Primitives.Functional.Tests | ||
| { | ||
| public class IPAddressParsingSpan |
There was a problem hiding this comment.
It's great to see lots of tests added, but a lot of these look like duplication of existing tests, just for the span-based overloads instead of the array-based ones. It'd be good to find a way to avoid the duplication and reuse tests once for both the array-based and span-based APIs. There are a variety of approaches to this, but one example is here:
https://github.com/dotnet/corefx/blob/master/src/Common/tests/System/Security/Cryptography/AlgorithmImplementations/RSA/EncryptDecrypt.cs
The tests are all in an abstract base class that exposes virtuals for each of the APIs that has an array vs span variant, and then derived types just override those methods, providing functionality to the base class to be tested.
stephentoub
left a comment
There was a problem hiding this comment.
Thanks for working on this!
|
Thanks for the thorough review and especially the nitpicks, good way to learn the project coding style. Will go over the internal methods again and see about using Spans instead of the |
|
Benchmarked ctor changes with these results With duplicated code.
With the chained constructor
I haven't looked at replacing the StringBuilder yet. |
| } | ||
|
|
||
| const bool tryParse = true; | ||
| address = IPAddressParser.Parse(ipString, tryParse); |
There was a problem hiding this comment.
I should have been clearer about naming the argument. I meant doing it like this:
address = IPAddressParser.Parse(ipString, tryParse: true)| public static bool TryParse(ReadOnlySpan<char> ipSpan, out IPAddress address) | ||
| { | ||
| const bool tryParse = true; | ||
| address = IPAddressParser.Parse(ipSpan, tryParse); |
| } | ||
|
|
||
| const bool tryParse = false; | ||
| return IPAddressParser.Parse(ipString, tryParse); |
| } | ||
|
|
||
| return false; | ||
| } |
There was a problem hiding this comment.
Rather than writing this method, you should be able to add a reference to System.Memory in the .csproj:
https://github.com/dotnet/corefx/blob/master/src/System.Net.Primitives/src/System.Net.Primitives.csproj#L200
Then you can delete this method, and at the call site instead of doing:
if (ContainsColon(ipSpan))do:
if (ipSpan.IndexOf(':') >= 0)That not only saves the extra code, but this will benefit from any perf improvements over the standard loop done in IndexOf.
| throw new ArgumentNullException(nameof(ipString)); | ||
| } | ||
|
|
||
| return Parse(ipString.ToCharArray(), tryParse); |
There was a problem hiding this comment.
Ouch, we should not be allocating a char[] from the string. Once you've added the System.Memory reference, you can use the AsReadOnlySpan extension method.
|
|
||
| return false; | ||
| } | ||
| internal static unsafe IPAddress Parse(string ipString, bool tryParse) |
There was a problem hiding this comment.
This no longer needs to be unsafe.
| if (ipString.IndexOf(':') != -1) | ||
| internal static unsafe IPAddress Parse(ReadOnlySpan<char> ipSpan, bool tryParse) | ||
| { | ||
| return Parse(ipSpan, tryParse); |
There was a problem hiding this comment.
Is this a typo? Won't this call itself recursively and stack overflow?
There was a problem hiding this comment.
This seems to have worked by accident because of the parameters. Renaming it to avoid issues.
|
|
||
| internal static unsafe IPAddress Parse(ReadOnlySpan<char> ipSpan, int ipStringPtrLength, bool tryParse) | ||
| { | ||
|
|
There was a problem hiding this comment.
Nit: unnecessary blank line
| return true; | ||
| } | ||
|
|
||
| internal static unsafe int IPv4AddressToStringHelper(uint address, char* addressString) |
There was a problem hiding this comment.
This no longer needs to use a pointer. The pointer was here due to stackallocating, but now we can just write this in terms of Span<char>, making this method safer against possible overflow bugs. The span-based caller can just pass in the target span. And the string-based caller can still stackalloc the target space, but then rather than passing that in directly, it can create a span from it and pass in the span here. That then also avoids the pinning that's been added in the span-based caller above.
|
|
||
| Assert.False(ip.TryFormat(new Span<char>(result), out int charsWritten)); | ||
| Assert.Equal(0, charsWritten); | ||
| } |
There was a problem hiding this comment.
Could we add an assert that verifies nothing was written to the output array? e.g.
Assert.Equal<byte>(new char[result.Length], result);|
|
||
| [Theory] | ||
| [MemberData(nameof(ValidIpv4Addresses))] | ||
| public void TryFormatIPv4_ValidAddress_Failure(string address, string expected) |
There was a problem hiding this comment.
Why would valid addresses fail to format? Too little destination space? If so, can we include that in the title?
There was a problem hiding this comment.
Assuming this is the case, my suggestion would be to combine the success and failure tests for this, and validate three things:
- That it succeeds with exactly the right amount of output space
- That it fails with one less byte available than is needed, and none of the output is written
- That it succeeds with one more byte available than is needed, and that extra byte remains unwritten
|
|
||
| return true; | ||
| } | ||
| } |
There was a problem hiding this comment.
You could do a similar thing with:
Assert.Equal<byte>(
expected.AsSpan().Slice(0, length).ToArray(),
actual.AsSpan().Slice(0, length).ToArray());| addressString[--offset] = '.'; | ||
| FormatIPv4AddressNumber((int)((address >> 8) & 0xFF), addressString, ref offset); | ||
| addressString[--offset] = '.'; | ||
| int charsWritten = IPv4AddressToStringHelper(address, new Span<char>((void*)addressString, MaxLength)); |
There was a problem hiding this comment.
Is there a better way to do this than to cast to void*. Would having a T* overload for Span collide with the void* version?
There was a problem hiding this comment.
You shouldn't need to explicitly cast: a char* is implicitly convertible to a void*.
There was a problem hiding this comment.
ps The reason it's void* is because you can't have a T* in C#.
|
I think it's better to separate the failure cases into their own tests. It makes it clearer what's failing when looking at the test results, I did rename the failure tests for clarity. I merged the right size/size+1 tests in IPAddressParsingSpan. |
| <Reference Include="System.Runtime.Extensions" /> | ||
| <Reference Include="System.Runtime.InteropServices" /> | ||
| <Reference Include="System.Threading" /> | ||
| <Reference Include="System.Memory" /> |
There was a problem hiding this comment.
Nit: please add it in alphabetical order
| // Failed to parse the address. | ||
| address = 0; | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Does the pinning need to extend down to here? Seems like it only need to be done just for the ParseNonCanonical call, as it was in the old code.
There was a problem hiding this comment.
I thought about it, reducing the fixed is a bit uglier with variable declarations so I went with this. Can change it to reduce the fixed instead.
| { | ||
| using (iteration.StartMeasurement()) | ||
| { | ||
| ip.TryWriteBytes(bytesSpan, out bytesWritten); |
There was a problem hiding this comment.
Is this doing enough work per iteration to produce meaningful and consistent results? Sometimes in these perf tests we need to do multiple calls / loop inside of the StartMeasurement using. Same question goes for the other tests, too.
There was a problem hiding this comment.
You have some test data in this PR, the numbers are very small so possibly too little for meaningsful results. Isn't it possible to increase iteration counts per method?
There was a problem hiding this comment.
Isn't it possible to increase iteration counts per method?
Yes, but there's overhead to the:
foreach (var iteration in Benchmark.Iterations)
{
using (iteration.StartMeasurement())
{
...
}
}If the work being measured is small compared to that, the perf test isn't effective. That's why you see code like this:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime/tests/Performance/Perf.DateTime.cs#L19-L21
and this:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime/tests/Performance/Perf.String.netcoreapp.cs#L25-L26
stephentoub
left a comment
There was a problem hiding this comment.
A few more comments/questions, but overall looks great. Thank you for working on this.
|
@dotnet/dnceng, another connection failure: Seems to be happening a lot. |
Tweaked the work amount for the perf tests. |
|
Thanks for the fixes, @Paxxi. Any regressions in the perf tests for the existing methods? |
|
Here's backported tests run without these changes
Quite large difference in total, but e.g. GetAddressBytes is 60 million method calls. I leave the decision to you as I'm not very familiar with what's considered an acceptable performance regression. |
|
Thanks. I'll pull down the PR and take a look. |
See issue #22607 This adds public IPAddress(ReadOnlySpan<byte> address) public IPAddress(ReadOnlySpan<byte> address, long scopeid) public bool TryWriteBytes(Span<byte> destination, out int bytesWritten) public static IPAddress Parse(ReadOnlySpan<char> ipSpan) public static bool TryParse(ReadOnlySpan<char> ipSpan, out IPAddress address) public static bool TryFormat(Span<char> destination, out int bytesWritten)
46a49b8 to
a13c1dd
Compare
I took a look at the four methods:
I’ve pushed a commit to address these issues. |
|
@dotnet-bot test Linux x64 Release Build please |
|
Thanks for working on this, @Paxxi. |
|
Thank you for the thorough explanation about the performance aspects! This has been a great experience and already looking for new tickets 😄 |
Add IPAddress Span-based APIs Commit migrated from dotnet/corefx@1fc5e1e
resolves #22607
This adds
Assumption is that the use of
ReadOnlySpanorSpanoverloads are by the most demanding use cases so where it makes sense the existing methods call the Span related methods or the internal helper methods have been converted to use spans.@stephentoub I think this is ready for review now