-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
various fixes and tweaks for integer parsing #16033
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1d0f56c to
e8577e3
Compare
|
Force pushed new version. Only the issues pointed out addressed. But see comment above. PTAL |
|
|
||
| s += strspn(s, WHITESPACE); | ||
| if (s[0] == '-') | ||
| return -ERANGE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should allow leading whitespace. Our parsers generally strip whitespace earlier, so if we get here with some leading whitespace, something strange is going on. But if the user somehow managed to feed us leading whitespace, we should fail here. But this behaviour didn't change with this PR, so let's leave that for later.
|
|
||
| assert(s); | ||
|
|
||
| s += strspn(s, WHITESPACE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a good idea? Why should we allow leading whitespace here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, not sure if its, but strtol() does it implicitly when parsing normally, and that's documented. I wanted to make this as little invasive as possible, so I tried to add the 0b + 0o support in a fashion that behaves as similar as possible to how 0x is already handled.
(I must say, given all the weird behaviour of strtol() that keeps revealing itself I am tempted to just replace it altogether by our own code that parses only what we want it to parse, and doesn't eat whitspace and so on. it feels right now we need more code to block all the stuff it does we don't want than it would take to write our own implementation that only does what we want. but maybe let's leave that for a later PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace it altogether by our own code that parses only what we want it to parse
Indeed.
let's leave that for a later PR
Ack.
|
|
||
| assert(s); | ||
|
|
||
| s += strspn(s, WHITESPACE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
This allows disabling a few alternative ways to decode integers formatted as strings, for safety reasons. See: systemd#15991
Let's allow "-0" as alternative to "+0" and "0" when parsing integers, unless the new SAFE_ATO_REFUSE_PLUS_MINUS flag is specified. In cases where allowing the +/- syntax shall not be allowed SAFE_ATO_REFUSE_PLUS_MINUS is the right flag to use, but this also means that -0 as only negative integer that fits into an unsigned value should be acceptable if the flag is not specified.
All other safe_atoXYZ_full() functions have the parameter optional, let's make it optoinal here, too.
Parsing is hard, hence let's use our own careful wrappers wherever possible.
Let's refuse "+" and "-" prefixed UIDs. Let's refuse whitespace-prefixed UIDS, Let's refuse zero-prefixed UIDs. Let's be safe than sorry.
parse_uid() does so many safety checks we want, hence rewrite parse_uid_range() on top of parse_uid() instead of parse_range().
Let's adopt Python 3 style 0b and 0x syntaxes, because it makes a ton of sense, in particular in bitmask settings.
|
force pushed a new version. added the test case, but not much else, see comment above |
e8577e3 to
42e57a4
Compare
|
LGTM. |
A follow-up for #15991. Addressing #15985