Background
The OutlineParser's consumeStringLiteral method currently only handles \' and \\ escape sequences to correctly identify string boundaries. It does not validate that escape sequences are valid Apex syntax.
Current Behavior
Invalid escape sequences like '\q' or truncated unicode escapes like '\u000' are silently accepted by the OutlineParser. These would fail when deployed to Salesforce.
Proposed Enhancement
Add validation during string literal parsing to detect:
- Invalid escape characters (only \b, \t, \n, \f, \r, \", \', \\ are valid)
- Truncated unicode escapes (\u must be followed by exactly 4 hex digits)
- Invalid hex characters in unicode escapes
This can be done efficiently within the existing character-by-character loop in consumeStringLiteral, adding minimal overhead.
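As a rough illustration of how the three checks could fit into a single pass, here is a minimal sketch. It is written in Python purely for brevity and is not the parser's actual code: the names `scan_string_literal` and `EscapeIssue`, the issue kinds, and the choice to keep scanning after a bad escape are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Characters the issue lists as valid after a backslash (\u is handled separately).
VALID_ESCAPES = frozenset("btnfr\"'\\")
HEX_DIGITS = frozenset("0123456789abcdefABCDEF")


@dataclass
class EscapeIssue:
    # Hypothetical diagnostic record, not an existing OutlineParser type.
    offset: int  # index of the backslash within the source text
    kind: str    # "invalid-escape", "truncated-unicode" or "invalid-hex"


def scan_string_literal(src: str, start: int) -> Tuple[int, List[EscapeIssue]]:
    """Scan a single-quoted literal whose opening quote is at src[start].

    Returns (index just past the closing quote, issues found). The index is
    len(src) when the literal is unterminated.
    """
    issues: List[EscapeIssue] = []
    n = len(src)
    i = start + 1
    while i < n:
        ch = src[i]
        if ch == "'":
            return i + 1, issues  # closing quote reached
        if ch != "\\":
            i += 1
            continue
        if i + 1 >= n:
            break  # dangling backslash at end of input
        esc = src[i + 1]
        if esc == "u":
            # Count the leading run of hex digits, up to four.
            run = 0
            while run < 4 and i + 2 + run < n and src[i + 2 + run] in HEX_DIGITS:
                run += 1
            if run < 4:
                stop = i + 2 + run
                # A quote or end of input means too few digits;
                # any other character is a non-hex digit.
                at_end = stop >= n or src[stop] == "'"
                kind = "truncated-unicode" if at_end else "invalid-hex"
                issues.append(EscapeIssue(i, kind))
            i += 2 + run
        else:
            if esc not in VALID_ESCAPES:
                issues.append(EscapeIssue(i, "invalid-escape"))
            i += 2  # skip the escaped character so \' does not end the literal
    return n, issues
```

For example, `scan_string_literal("'\\q'", 0)` reports one `invalid-escape` issue at offset 1 and still returns the correct end index, so boundary detection is unaffected. The extra work per escape is a set lookup, plus at most four character checks for `\u`.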
Salesforce Error Messages for Reference
Salesforce reports these as:
Illegal string literal: Invalid string literal '\q'. Illegal character sequence \q' in string literal.
Illegal string literal: Invalid string literal '\u000'. Illegal unicode sequence. Less than four hex digits \u000' in string literal.
Related
This came up while working on apex-ls PR #409 (string literal escape validation). Since OutlineParser will become the primary parser, this validation should be added here rather than in apex-ls.