Automatically add tel to phone number when linking url#64865
Automatically add tel to phone number when linking url#64865ajlende merged 3 commits intoWordPress:trunkfrom devansh016:add-tel-to-phone-numbers
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
|
👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @devansh016! In case you missed it, we'd love to have you join us in our Slack community. If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information. |
There was a problem hiding this comment.
Thanks, @devansh016! The only parts I'm not sure about are:
- How much phone number coverage we have across the world -- how much does this regex support and can we add even more coverage?
- What the name of the
isPhoneNumbercheck should be. Is that the most common name or should it be something likeisTelephoneorisTelephoneNumber?
I've pinged @ajlende for a review, as I believe he enjoys Regex :)
There was a problem hiding this comment.
I looked up national conventions for writing phone numbers from Wikipedia and there are too many to try writing a RegEx for. I also checked https://github.com/google/libphonenumber/blob/master/javascript/i18n/phonenumbers to see what Google is doing, but it has tens of thousands of lines of JS to do it properly which is overkill for us.
From what I found, there were two relevant standards: E.123 and E.164. Pretty much it just says they have a maximum of 15 digits, and each country formats as they wish using just a few characters to be used for separators.
The shortest local phone number I could find is only 4 digits long from this list, but 4 and 5 digit local numbers were rather uncommon. All of them used a limited set of ASCII characters. So something like this could work.
export function isTelephoneNumber( str ) {
if ( /[^0-9 +().-]/.test( str ) ) {
return false;
}
const digitCount = str.replace( /\D/g, '' ).length;
// TODO: Maybe use 5 or 6 as the minimum length.
return digitCount >= 4 && digitCount <= 15;
}EDIT: I had a more efficient function for checking this way, but I think the first way is much more clear to read.
This is pretty much the same thing, but more CPU efficient with the early returns if you think it reads well enough.
export function isTelephoneNumber( str ) {
// prettier-ignore
const digits = new Set( [ '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' ] );
const punctuation = new Set( [ ' ', '(', ')', '-', '.', '+' ] );
let digitCount = 0;
for ( let i = 0; i < str.length; i++ ) {
if ( digits.has( str.charAt( i ) ) ) {
digitCount++;
if ( digitCount >= 15 ) {
return false;
}
} else if ( ! punctuation.has( str.charAt( i ) ) ) {
return false;
}
}
// TODO: Maybe use 5 or 6 as the minimum length.
return digitCount >= 4;
}It has some false positives and some false negatives, but I think it's a good balance of getting enough phone number like things and keeping things simple for all the national conventions for writing phone numbers.
- False negatives
- Extensions:
555-5555 ext. 555. - Short codes or special use numbers:
911in the USA. - Arabic (or any non-ASCII) localized numbers:
١٢٣٤٥٦٧٨٩٠.
- Extensions:
- False positives
- Dates in various formats:
2024.09.03. - Random sets of things with lots of punctuation:
((1-2)).(2)34
- Dates in various formats:
Another option is only formatting phone numbers that match RFC3966. This is the specification that defines valid strings for a tel: link.
This is a subset of the grammar simple enough for RegEx that only includes the number part and doesn't handle parameters, extensions, or isdn-subaddresses.
global-number-digits = "+" *phonedigit DIGIT *phonedigit
local-number-digits = *phonedigit-hex (HEXDIG / "*" / "#") *phonedigit-hex
phonedigit = DIGIT / [ visual-separator ]
phonedigit-hex = HEXDIG / "*" / "#" / [ visual-separator ]
visual-separator = "-" / "." / "(" / ")"
DIGIT = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
I wouldn't use a single RegEx for the full spec, but this is the RegEx for this subset of the specification with an optional tel: in front.
const NUMBER_ONLY_TEL_REGEXP =
/^(?:tel:)?(?:\+[\d().-]*\d[\d().-]*|[0-9a-fA-F*#().-]*[0-9a-fA-F*#])$/;I'm not entirely sure why hex digits are allowed for local numbers, but I'm guessing the * and # are to specify those keys on the phone. It matches things like #*#*#*# that definitely aren't phone numbers.
It still needs a check for the number of digits separately from the RegEx.
This RegEx also doesn't work for strings with spaces. One option could be stripping the spaces if the string doesn't begin with tel: to handle some of the E.123 formatted strings which recommend using spaces as separators.
So I guess the main thing to decide is how specific we want the check to be. We need to include more than the existing RegEx since it doesn't handle lots of formats, but we can't go so far as to implement the whole of libphonenumber for performance reasons.
What are the user's expectations around what can be auto-converted into a tel: link?
|
@devansh016, it was late last night when I was wrapping things up. I forgot to say that this is looking pretty good, all things considered! Phone numbers just happen to be more complicated than I realized 😅 I don't want this PR to get held up by those complexities, and I'm sure there's a "good enough" solution for this case. I'm asking a few folks today what they think the threshold should be for parsing a phone number. |
|
Hi @ajlende thanks for sharing your input.
I have earlier made a wrong assumption that the phone number can be only 10 digits long excluding the country code will change it to include numbers upto 15 digits max. Also, minimum length should be 3 for phone number excluding the country code to include emergency number like 911 or 102? There can be cases where the country code may not be provided. So will be making changes to the regex.
Good catch. Will add an optional
There are low changes for user to add link to date or other text which are false positive but will check ways if I can avoid this. |
My personal opinion is that it's okay to exclude some valid phone numbers if there's a higher likelihood that the same pattern will result in a false positive. I expect it's more likely that a string of 3 or 4 digits is NOT a phone number than it is a phone number which is why I was suggesting 5 or 6 as the minimum instead. Take, for example, someone wanting to link the title of 1984 by George Orwell to a purchase page for the book or something. Turning
The RFC3966 spec does say in section 5.1.5 that local numbers are required to have a If you're up for the challenge, that's something you could look into. If they follow the spec exactly and don't infer a context, then I would suggest an even simpler approach for now where we only handle phone numbers that start with
The reason I thought about dates was for linking to iCal events or something. Again, I think it's okay for some false positives to slip through. |
|
@ajlende , updated the logic as below. Assumptions:
PHONE_REGEXP is defined as
phoneNumber.replace(/[-.() ]/g, ''):
Things needed to be taken care:
|
There was a problem hiding this comment.
Some cases which use tel: prefix like
tel:879may not be valid number should we can add a check if any phone number contains tel: as a prefix it should be a valid phone number. Should this be added?
I think it's fine as-is. The mailto: one requires a valid email, so it's consistent with that, at the very least.
I'm asking a few folks today what they think the threshold should be for parsing a phone number.
Didn't get any strong opinions, so I think this looks good to me!
Thanks for the fix!
What?
The link field is automatically populated with tel: when the selected text is a phone number.
Why?
fixes #64804
How?
Testing Instructions
Screenshots or screencast
Screen.Recording.2024-08-28.at.3.46.02.PM.mov