Skip to content

Fix parseAddress regex#140977

Merged
alexr00 merged 1 commit intomicrosoft:mainfrom
Timmmm:patch-2
Jan 24, 2022
Merged

Fix parseAddress regex#140977
alexr00 merged 1 commit intomicrosoft:mainfrom
Timmmm:patch-2

Conversation

@Timmmm
Copy link
Contributor

@Timmmm Timmmm commented Jan 19, 2022

The previous regex was not very good and meant that hostnames containing digits and full stops were not accepted. This one should be more accurate. You can test it here.

Because regex's are completely unreadable here's how it breaks down:

^    - Start
(     - Hostname group
  [a-zA-Z0-9-]+    - First domain component
  (?:                         - Optional subsequent components
     \.[a-zA-Z0-9-]+     - Each subsequent component is a . followed by the component
  )*                      - Any number of subsequent components
:     - Port separator
)?   - Hostname group is optional
([0-9]+)    - Port, which is mandator
$   - End

This PR fixes #

The previous regex was not very good and meant that hostnames containing digits and full stops were not accepted. This one should be more accurate. You can test it [here](https://regex101.com/r/ZVItcq/1).

Because regex's are completely unreadable here's how it breaks down:

```
^    - Start
(     - Hostname group
  [a-zA-Z0-9-]+    - First domain component
  (?:                         - Optional subsequent components
     \.[a-zA-Z0-9-]+     - Each subsequent component is a . followed by the component
  )*                      - Any number of subsequent components
:     - Port separator
)?   - Hostname group is optional
([0-9]+)    - Port, which is mandator
$   - End
```
Copy link
Member

@alexr00 alexr00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Timmmm
Copy link
Contributor Author

Timmmm commented Jan 21, 2022

Yeah:

Hostnames are domains, where a domain is a hierarchical, dot-separated list of subdomains

Each element of the hostname must be from 1 to 63 characters long and the entire hostname, including the dots...

Note that specification has other limits like the length of a subdomain and the overall length which I didn't validate, because I think this is basically a sanity check and it seemed like overkill. (And if you want to go for reliable validation regexes are the wrong tool for the job.)

@alexr00 alexr00 added this to the January 2022 milestone Jan 24, 2022
Copy link
Member

@alexr00 alexr00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix!

@alexr00 alexr00 merged commit f12b5b4 into microsoft:main Jan 24, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Mar 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants