Wrong parse for domains like <TLD><DIGIT> #430

@doochik

Example tests for test/spec/linkifyjs/scanner.test.js:

// OK
['localhost', [t.LOCALHOST], ['localhost']],
// OK
['localhosts', [t.WORD], ['localhosts']],

// BUG
// actual [ 'WORD', 'DOT', 'TLD', 'DOT', 'TLD' ]
['www.drive.com', [t.WORD, t.DOT, t.WORD, t.DOT, t.TLD], ['www', '.', 'drive', '.', 'com']],

// BUG
// actual [ 'WORD', 'DOT', 'TLD', 'NUM', 'DOT', 'TLD' ]
['www.drive1.com', [t.WORD, t.DOT, t.WORD, t.NUM, t.DOT, t.TLD], ['www', '.', 'drive', '1', '.', 'com']],

// OK
['www.driver.com', [t.WORD, t.DOT, t.WORD, t.DOT, t.TLD], ['www', '.', 'driver', '.', 'com']],
// OK
['www.driv1.com', [t.WORD, t.DOT, t.WORD, t.NUM, t.DOT, t.TLD], ['www', '.', 'driv', '1', '.', 'com']],

On the other hand, cases like localhost and localhosts work well.

The problem is that in a domain label of the form <TLD><DIGIT> (for example drive1), the leading letters should be scanned as WORD, not TLD.
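
To make the expected behaviour concrete, here is a minimal standalone sketch of the rule, not linkify's actual scanner. The `scan` function, the two-entry `TLDS` set, and the string token names are all hypothetical and exist only for illustration. It covers only the <TLD><DIGIT> case: a run of letters counts as TLD only when it is not immediately followed by a digit.

```javascript
// Hypothetical, tiny TLD list for illustration; the real list is far larger.
const TLDS = new Set(['com', 'drive']);

// Toy scanner (an assumption, not linkify's implementation).
// It splits the input into letter runs, digit runs and dots,
// and silently skips any other character.
function scan(input) {
  const tokens = [];
  const re = /[a-z]+|[0-9]+|\./gi;
  let match;
  while ((match = re.exec(input)) !== null) {
    const value = match[0];
    // Character right after the current run (undefined at end of input).
    const next = input[re.lastIndex];
    let type;
    if (value === '.') {
      type = 'DOT';
    } else if (/^[0-9]+$/.test(value)) {
      type = 'NUM';
    } else if (TLDS.has(value.toLowerCase()) && !/[0-9]/.test(next || '')) {
      // Known TLD that is not directly followed by a digit.
      type = 'TLD';
    } else {
      // Unknown word, or a TLD-like prefix followed by a digit (e.g. "drive1").
      type = 'WORD';
    }
    tokens.push({ type, value });
  }
  return tokens;
}

console.log(scan('www.drive1.com').map((t) => t.type));
```

With this lookahead, 'drive' in 'www.drive1.com' comes out as WORD followed by NUM, which matches the expected tokens in the second BUG test above. The sketch does not change how 'www.drive.com' is scanned.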

If we move this test to parser.test.js, the URL is split into two Url tokens:

[
  'This Url www.drive1.com with www and digits',
  [Text, Url, Text],
  ['This Url ', 'www.drive1.com', ' with www and digits']
],
// actual
// [Text, Url, Url, Text],
// ['This Url ', 'www.drive', '1.com', ' with www and digits']
