Conversation
when the `locator` is not disabled
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #847 +/- ##
=======================================
Coverage 95.12% 95.12%
=======================================
Files 8 8
Lines 2196 2196
Branches 577 577
=======================================
Hits 2089 2089
Misses 107 107 ☔ View full report in Codecov by Sentry. |
|
@kboshold this one is an actual fix for the performance issue from your initial example, even when not normalizing the input. |
Ponynjaa
left a comment
There was a problem hiding this comment.
Just two minor things I would change: the test-descriptions on the test you copied and the new one don't describe what's actually being tested.
Other than that it looks really good! I was a little confused at first when I saw your changes to the regex and loop, but after playing around with it for a little bit and testing the outcomes it seems to work exactly as before but without matching the content inbetween line-endings for no reason. Good job dude!
test/dom-parser.test.js
Outdated
| expect(new XMLSerializer().serializeToString(doc)).toEqual(source.replace(/[\u0085\u2028\u2029]/g, '\n')); | ||
| }, 500); | ||
| }); | ||
| test('should be able to open documents with alternative whitespace without creating a bottleneck and replacing them with \\n', () => { |
There was a problem hiding this comment.
and replacing them with \n
this description doesn't make sense for this test
test/dom-parser.test.js
Outdated
| @@ -330,7 +331,20 @@ describe('DOMParser', () => { | |||
| const source = `<root>${'A'.repeat(50000)}\u2029${'A'.repeat(50000)}\u0085${'A'.repeat(50000)}\u2028${'A'.repeat(50000)}\u2029</root>`; | |||
There was a problem hiding this comment.
without creating a bottleneck
The test description here actually also doesn't make sense, maybe you can rephrase it when you are already at it?
There was a problem hiding this comment.
Yes, somehow the test I copied is also covering the one that @kboshold added, so I merged them into one and corrected the description.
test/dom-parser.test.js
Outdated
| const onError = jest.fn(); | ||
| const normalizeLineEndings = jest.fn((source) => source); | ||
| const { parser } = getTestParser({ onError, normalizeLineEndings }); | ||
| const source = `<root>${'A'.repeat(15000)}\u2029${'A'.repeat(15000)}\u0085${'A'.repeat(15000)}\u2028${'A'.repeat(15000)}\u2029</root>`; |
There was a problem hiding this comment.
I think it would make more sense to have a \n as the first character, since the test currently has \u2029 twice.
Solution:
const source = `<root>${'A'.repeat(15000)}\n${'A'.repeat(15000)}\u0085${'A'.repeat(15000)}\u2028${'A'.repeat(15000)}\u2029</root>`;There was a problem hiding this comment.
I don't think it's such big of a difference, but hey, I applied your suggestion.
|
I applied all the suggested changes, so please approve it. |
normalizeLineEndings, so it actually fails..*twice, to speed up processing larger files containing certain Unicode chars as demonstrated in Increased processing times when using \u2028 and \u2029 #838