Add escape sequence validation to string literal parsing #17

@kjonescertinia

Background

The OutlineParser's consumeStringLiteral method currently only handles \' and \\ escape sequences to correctly identify string boundaries. It does not validate that escape sequences are valid Apex syntax.

Current Behavior

Invalid escape sequences like '\q' or truncated unicode escapes like '\u000' are silently accepted by the OutlineParser. Code containing them would fail when deployed to Salesforce.

Proposed Enhancement

Add validation during string literal parsing to detect:

  • Invalid escape characters (apart from \u, only \b, \t, \n, \f, \r, \", \', \\ are valid)
  • Truncated unicode escapes (\u must be followed by exactly 4 hex digits)
  • Invalid hex characters in unicode escapes

This can be done efficiently within the existing character-by-character loop in consumeStringLiteral, adding minimal overhead.
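
As a rough illustration of how the three checks could sit inside a single character-by-character pass, here is a minimal sketch. It is written in Python purely for readability and is not the OutlineParser code: the function name `consume_string_literal`, the `(end, errors)` return shape, and the error message wording are all assumptions made for this example.

```python
# Hypothetical sketch, not the actual OutlineParser implementation.
VALID_SIMPLE_ESCAPES = set("btnfr\"'\\")   # the chars allowed after a backslash
HEX_DIGITS = set("0123456789abcdefABCDEF")


def consume_string_literal(src, start):
    """Scan a single-quoted literal that begins at src[start].

    Returns (end, errors): end is the index just past the closing quote
    (or len(src) if the literal is unterminated); errors is a list of
    (offset, message) tuples, where offset points at the backslash.
    """
    if src[start] != "'":
        raise ValueError("expected opening quote at start")
    errors = []
    n = len(src)
    i = start + 1
    while i < n:
        ch = src[i]
        if ch == "'":
            return i + 1, errors           # closing quote found
        if ch != "\\":
            i += 1                         # ordinary character
            continue
        if i + 1 >= n:
            break                          # backslash at end of input
        esc = src[i + 1]
        if esc in VALID_SIMPLE_ESCAPES:
            i += 2
        elif esc == "u":
            # Accept up to four hex digits after \u.
            j = i + 2
            count = 0
            while count < 4 and j < n and src[j] in HEX_DIGITS:
                j += 1
                count += 1
            if count < 4:
                if j < n and src[j] != "'":
                    errors.append((i, "invalid hex character in unicode escape"))
                else:
                    errors.append((i, "fewer than four hex digits in unicode escape"))
            i = j                          # resume at the first unconsumed char
        else:
            errors.append((i, "illegal character sequence \\" + esc))
            i += 2
    errors.append((start, "unterminated string literal"))
    return n, errors
```

In this sketch the boundary detection is unchanged by validation: an invalid escape is recorded and scanning continues, so the literal's end position is still found in one pass and valid literals pay only a set-membership test per backslash.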

Salesforce Error Messages for Reference

Salesforce reports these as:

  • Illegal string literal: Invalid string literal '\q'. Illegal character sequence \q' in string literal.
  • Illegal string literal: Invalid string literal '\u000'. Illegal unicode sequence. Less than four hex digits \u000' in string literal.

Related

This came up while working on apex-ls PR #409 (string literal escape validation). Since OutlineParser will become the primary parser, this validation should be added here rather than in apex-ls.
