syntax: Reject semver version individual parts in strings#14886
syntax: Reject semver version individual parts in strings#14886Xanewok wants to merge 3 commits intoargotorg:developfrom
Conversation
This seems like an oversight from the initial implementation that supported versions such as "^0.4.0" but which now has a side effect of converting individual string literals to their token values, allowing for syntax such as: - `0 "." 8 "." 0` - '"1".2.3"` - '"1"."2"."3"` and so on. This fix aims to address the unintentional acceptance of such syntax, primarily to simplify the logic of external parsers and to more clearly establish what constitutes valid syntax for a crucial element like the version pragma statement.
|
Thank you for your contribution to the Solidity compiler! A team member will follow up shortly. If you haven't read our contributing guidelines and our review checklist before, please do it now, this makes the reviewing process and accepting your contribution smoother. If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix. |
nikola-matic
left a comment
There was a problem hiding this comment.
Hi @Xanewok, even though this is technically breaking - it's such an obscure breakage, that we can treat it as a bugfix, and release it regularly. Can you please add a changelog entry, and take a look at the comments I've left. I'll spend a bit more time reviewing this later to see if and how the test coverage should be improved.
| // Validate that the parsed version parts are either a single string literal or multiple bare tokens, | ||
| // i.e. "1.2.3" or 1.2.3 but not 1."2.3", "1".2.3 or 1"."2.3. | ||
| auto const partsEndPos = m_pos; // Points *after* the last version part | ||
| for (auto i = partsStartPos; i < partsEndPos; ++i) |
There was a problem hiding this comment.
Do you even this loop and the positional tracking? I think you can have a counter/boolean in the above while loop, and check whether currentToken() == Token::StringLiteral, and then error out as soon as you have more than 1?
| if (m_tokens[i] == Token::StringLiteral && partsStartPos != partsEndPos - 1) | ||
| { | ||
| solThrow(SemVerError, "String literals are only allowed as the only component in a version pragma."); | ||
| } |
There was a problem hiding this comment.
Also a side note - our style is to not use scoping braces {} in conditional/loop blocks if they're one liners, i.e., this if block would turn into:
| if (m_tokens[i] == Token::StringLiteral && partsStartPos != partsEndPos - 1) | |
| { | |
| solThrow(SemVerError, "String literals are only allowed as the only component in a version pragma."); | |
| } | |
| if (m_tokens[i] == Token::StringLiteral && partsStartPos != partsEndPos - 1) | |
| solThrow(SemVerError, "String literals are only allowed as the only component in a version pragma."); |
This seems like an oversight from the initial implementation that supported versions such as "^0.4.0" but which now has a side effect of converting individual string literals to their token values, allowing for syntax such as:
0 "." 8 "." 0"1".2.3""1"."2"."3"and so on.
This fix aims to address the unintentional acceptance of such syntax, primarily to simplify the logic of external parsers and to more clearly establish what constitutes valid syntax for a crucial element like the version pragma statement.
I haven't written C++ in ages, so let me know if I'm doing anything dumb or if it could be written better.
Closes #14826