URI malformed error when using unpaired surrogates #2265

@andrecedik

Describe the bug
I was trying to create a rule that checks JSON property names for unpaired surrogates, which a strict UTF-8 decoder rejects (see https://unicodebook.readthedocs.io/issues.html#strict-utf8-decoder). When I ran spectral against such a document, it failed with an error thrown from within the md5 package.

To Reproduce

  1. Given this OpenAPI component:
  components:
    schemas:
      DummyResponse:
        type: object
        properties:
          "\uD87E\uDC04-WORKS":
            type: string
          "\uD83D-DOESNT-WORK":
            type: string
  2. Run spectral to validate the schema
  3. See error:
    [Screenshot 2022-09-02 at 16 47 00]

Expected behavior
Not sure what the best handling/output would be.
To confirm this locally, I wrapped the statement that calls the md5 function in a try-catch block:

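const decorateResultsWithFingerprint = (
  results: IRuleResult[],
  computeFingerprint: ComputeFingerprintFunc,
): IRuleResult[] => {
  for (const r of results) {
    try {
      Object.defineProperty(r, 'fingerprint', {
        value: computeFingerprint(r, md5),
      });
    } catch (error) {
      // md5 calls encodeURIComponent, which throws URIError on unpaired surrogates.
      if (error instanceof URIError) {
        // Print the stack trace and leave this result without a fingerprint.
        console.log(error.stack);
        continue;
      }
      // Rethrow anything else so unrelated failures are not silently swallowed.
      throw error;
    }
  }

  return results;
};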

This way spectral finishes checking all the other rules and prints a stack trace so users can see where the problem lies.

At this point, I'm not sure this is the best way to handle the case. It may be better to add a rule that returns an error message to the user, which would require a new function that checks property names for unpaired surrogates. 🤷

I'm already adding a function to achieve this, so if this is the preferred way, just let me know.
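For illustration, the check described above could look roughly like the following. This is only a sketch: the names `hasUnpairedSurrogate` and `findPropertyNamesWithUnpairedSurrogates` are hypothetical, and the code is a plain helper, not wired into spectral's custom function API.

```typescript
// Hypothetical helper: true if the string contains a surrogate code unit
// that is not part of a valid high-low pair.
function hasUnpairedSurrogate(value: string): boolean {
  for (let i = 0; i < value.length; i++) {
    const unit = value.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: must be followed directly by a low surrogate.
      const next = i + 1 < value.length ? value.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the low half of the valid pair
        continue;
      }
      return true;
    }
    if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate without a preceding high surrogate.
      return true;
    }
  }
  return false;
}

// Hypothetical helper: collect the property names that would make md5 throw.
function findPropertyNamesWithUnpairedSurrogates(
  properties: Record<string, unknown>,
): string[] {
  return Object.keys(properties).filter(hasUnpairedSurrogate);
}

const offending = findPropertyNamesWithUnpairedSurrogates({
  "\uD87E\uDC04-WORKS": { type: "string" },
  "\uD83D-DOESNT-WORK": { type: "string" },
});
console.log(offending.length); // 1
```

A rule built on such a check could report the offending property names as regular results instead of letting the run crash.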

Environment:

  • Library version: 6.5.0
  • OS: Mac OS Monterey (12.5.1)

Additional context
The bug stems from the md5 library using encodeURIComponent to encode strings. The documentation for that function clearly states:

Note that a URIError will be thrown if one attempts to encode a surrogate which is not part of a high-low pair
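The behaviour can be reproduced without spectral or md5. The snippet below is a minimal sketch that calls encodeURIComponent directly on the two property names from the reproduction steps; `canEncode` is a hypothetical helper name.

```typescript
// Hypothetical helper: true when encodeURIComponent accepts the string,
// false when it throws URIError (for example on an unpaired surrogate).
function canEncode(value: string): boolean {
  try {
    encodeURIComponent(value);
    return true;
  } catch (error) {
    if (error instanceof URIError) {
      return false;
    }
    throw error; // anything else is unexpected
  }
}

console.log(canEncode("\uD87E\uDC04-WORKS")); // true: valid high-low pair
console.log(canEncode("\uD83D-DOESNT-WORK")); // false: lone high surrogate
```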
