Skip to content

Automatically add tel to phone number when linking url#64865

Merged
ajlende merged 3 commits intoWordPress:trunkfrom
devansh016:add-tel-to-phone-numbers
Sep 5, 2024
Merged

Automatically add tel to phone number when linking url#64865
ajlende merged 3 commits intoWordPress:trunkfrom
devansh016:add-tel-to-phone-numbers

Conversation

@devansh016
Copy link
Copy Markdown
Contributor

@devansh016 devansh016 commented Aug 28, 2024

What?

The link field is automatically populated with tel: when the selected text is a phone number.

Why?

fixes #64804

How?

Testing Instructions

  1. Open a post or page.
  2. Add a phone number to a paragraph
  3. Select the phone number and click on link.

Screenshots or screencast

Screen.Recording.2024-08-28.at.3.46.02.PM.mov

@github-actions
Copy link
Copy Markdown

github-actions bot commented Aug 28, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: devansh016 <devansh2002@git.wordpress.org>
Co-authored-by: jeryj <jeryj@git.wordpress.org>
Co-authored-by: ajlende <ajlende@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@github-actions github-actions bot added the First-time Contributor Pull request opened by a first-time contributor to Gutenberg repository label Aug 28, 2024
@github-actions
Copy link
Copy Markdown

👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @devansh016! In case you missed it, we'd love to have you join us in our Slack community.

If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information.

@devansh016 devansh016 marked this pull request as draft August 28, 2024 10:25
@devansh016 devansh016 marked this pull request as ready for review August 28, 2024 11:26
@dcalhoun dcalhoun removed their request for review August 28, 2024 18:07
@up1512001 up1512001 added [Type] Enhancement A suggestion for improvement. [Feature] Link Editing Link components (LinkControl, URLInput) and integrations (RichText link formatting) labels Aug 29, 2024
@jeryj jeryj requested a review from ajlende September 3, 2024 20:36
Copy link
Copy Markdown
Contributor

@jeryj jeryj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @devansh016! The only parts I'm not sure about are:

  • How much phone number coverage we have across the world -- how much does this regex support and can we add even more coverage?
  • What the name of the isPhoneNumber check should be. Is that the most common name or should it be something like isTelephone or isTelephoneNumber?

I've pinged @ajlende for a review, as I believe he enjoys Regex :)

Copy link
Copy Markdown
Contributor

@ajlende ajlende left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked up national conventions for writing phone numbers from Wikipedia and there are too many to try writing a RegEx for. I also checked https://github.com/google/libphonenumber/blob/master/javascript/i18n/phonenumbers to see what Google is doing, but it has tens of thousands of lines of JS to do it properly which is overkill for us.

From what I found, there were two relevant standards: E.123 and E.164. Pretty much it just says they have a maximum of 15 digits, and each country formats as they wish using just a few characters to be used for separators.

The shortest local phone number I could find is only 4 digits long from this list, but 4 and 5 digit local numbers were rather uncommon. All of them used a limited set of ASCII characters. So something like this could work.

export function isTelephoneNumber( str ) {
	if ( /[^0-9 +().-]/.test( str ) ) {
		return false;
	}
	const digitCount = str.replace( /\D/g, '' ).length;
	// TODO: Maybe use 5 or 6 as the minimum length.
	return digitCount >= 4 && digitCount <= 15;
}
EDIT: I had a more efficient function for checking this way, but I think the first way is much more clear to read.

This is pretty much the same thing, but more CPU efficient with the early returns if you think it reads well enough.

export function isTelephoneNumber( str ) {
	// prettier-ignore
	const digits = new Set( [ '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' ] );
	const punctuation = new Set( [ ' ', '(', ')', '-', '.', '+' ] );
	let digitCount = 0;
	for ( let i = 0; i < str.length; i++ ) {
		if ( digits.has( str.charAt( i ) ) ) {
			digitCount++;
			if ( digitCount >= 15 ) {
				return false;
			}
		} else if ( ! punctuation.has( str.charAt( i ) ) ) {
			return false;
		}
	}
	// TODO: Maybe use 5 or 6 as the minimum length.
	return digitCount >= 4;
}

It has some false positives and some false negatives, but I think it's a good balance of getting enough phone number like things and keeping things simple for all the national conventions for writing phone numbers.

  • False negatives
    • Extensions: 555-5555 ext. 555.
    • Short codes or special use numbers: 911 in the USA.
    • Arabic (or any non-ASCII) localized numbers: ١٢٣٤٥٦٧٨٩٠.
  • False positives
    • Dates in various formats: 2024.09.03.
    • Random sets of things with lots of punctuation: ((1-2)).(2)34

Another option is only formatting phone numbers that match RFC3966. This is the specification that defines valid strings for a tel: link.

This is a subset of the grammar simple enough for RegEx that only includes the number part and doesn't handle parameters, extensions, or isdn-subaddresses.

global-number-digits = "+" *phonedigit DIGIT *phonedigit
local-number-digits  = *phonedigit-hex (HEXDIG / "*" / "#") *phonedigit-hex
phonedigit           = DIGIT / [ visual-separator ]
phonedigit-hex       = HEXDIG / "*" / "#" / [ visual-separator ]
visual-separator     = "-" / "." / "(" / ")"
DIGIT                = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
HEXDIG               = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

I wouldn't use a single RegEx for the full spec, but this is the RegEx for this subset of the specification with an optional tel: in front.

const NUMBER_ONLY_TEL_REGEXP =
	/^(?:tel:)?(?:\+[\d().-]*\d[\d().-]*|[0-9a-fA-F*#().-]*[0-9a-fA-F*#])$/;

I'm not entirely sure why hex digits are allowed for local numbers, but I'm guessing the * and # are to specify those keys on the phone. It matches things like #*#*#*# that definitely aren't phone numbers.

It still needs a check for the number of digits separately from the RegEx.

This RegEx also doesn't work for strings with spaces. One option could be stripping the spaces if the string doesn't begin with tel: to handle some of the E.123 formatted strings which recommend using spaces as separators.


So I guess the main thing to decide is how specific we want the check to be. We need to include more than the existing RegEx since it doesn't handle lots of formats, but we can't go so far as to implement the whole of libphonenumber for performance reasons.

What are the user's expectations around what can be auto-converted into a tel: link?

@ajlende
Copy link
Copy Markdown
Contributor

ajlende commented Sep 4, 2024

@devansh016, it was late last night when I was wrapping things up. I forgot to say that this is looking pretty good, all things considered! Phone numbers just happen to be more complicated than I realized 😅

I don't want this PR to get held up by those complexities, and I'm sure there's a "good enough" solution for this case. I'm asking a few folks today what they think the threshold should be for parsing a phone number.

@devansh016
Copy link
Copy Markdown
Contributor Author

devansh016 commented Sep 4, 2024

Hi @ajlende thanks for sharing your input.

@devansh016, it was late last night when I was wrapping things up. I forgot to say that this is looking pretty good, all things considered! Phone numbers just happen to be more complicated than I realized 😅

I don't want this PR to get held up by those complexities, and I'm sure there's a "good enough" solution for this case. I'm asking a few folks today what they think the threshold should be for parsing a phone number.

I have earlier made a wrong assumption that the phone number can be only 10 digits long excluding the country code will change it to include numbers upto 15 digits max. Also, minimum length should be 3 for phone number excluding the country code to include emergency number like 911 or 102?

There can be cases where the country code may not be provided. So will be making changes to the regex.

We probably want to add an optional tel: that handles the RFC3966 spec.

Good catch. Will add an optional tel:

False positives
Dates in various formats: 2024.09.03.

There are low changes for user to add link to date or other text which are false positive but will check ways if I can avoid this.

@ajlende
Copy link
Copy Markdown
Contributor

ajlende commented Sep 4, 2024

I have earlier made a wrong assumption that the phone number can be only 10 digits long excluding the country code will change it to include numbers upto 15 digits max. Also, minimum length should be 3 for phone number excluding the country code to include emergency number like 911 or 102?

My personal opinion is that it's okay to exclude some valid phone numbers if there's a higher likelihood that the same pattern will result in a false positive.

I expect it's more likely that a string of 3 or 4 digits is NOT a phone number than it is a phone number which is why I was suggesting 5 or 6 as the minimum instead.

Take, for example, someone wanting to link the title of 1984 by George Orwell to a purchase page for the book or something. Turning 1984 into a tel: link by default is unexpected in this case, and I think those situations are more common than someone from Saint Helena trying to link a local phone number.

There can be cases where the country code may not be provided. So will be making changes to the regex.

The RFC3966 spec does say in section 5.1.5 that local numbers are required to have a phone-context parameter in the tel: URI. Whether or not Android or iOS follow the specification or do their own processing to append it based on the phone's locale is something that I haven't researched.

If you're up for the challenge, that's something you could look into.

If they follow the spec exactly and don't infer a context, then I would suggest an even simpler approach for now where we only handle phone numbers that start with + and include a country code.

There are low changes for user to add link to date or other text which are false positive but will check ways if I can avoid this.

The reason I thought about dates was for linking to iCal events or something. Again, I think it's okay for some false positives to slip through.

@devansh016
Copy link
Copy Markdown
Contributor Author

@ajlende , updated the logic as below.

Assumptions:

  • Phone number can be maximum of length 15 and minimum length of 6 digits only including country code.
  • Number can start with/without country code.

PHONE_REGEXP is defined as /^(tel:)?(\+)?\d{6,15}$/

  • ^(tel:)?: Matches an optional "tel:" prefix at the start of the string.
  • (\+)?: Matches an optional + sign.
  • \d{6,15}$: Matches a sequence of 6 to 15 digits until the end of the string.

phoneNumber.replace(/[-.() ]/g, ''):

  • This verifies if the resulting string (after removing separators ) is a valid phone number according to the defined pattern.
  • Since there are many variations of separator it is better to remove separator before regex check and not include it in regex.

Things needed to be taken care:

  • Some cases which use tel: prefix liketel:879 may not be valid number should we can add a check if any phone number contains tel: as a prefix it should be a valid phone number. Should this be added?

Copy link
Copy Markdown
Contributor

@ajlende ajlende left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some cases which use tel: prefix liketel:879 may not be valid number should we can add a check if any phone number contains tel: as a prefix it should be a valid phone number. Should this be added?

I think it's fine as-is. The mailto: one requires a valid email, so it's consistent with that, at the very least.

I'm asking a few folks today what they think the threshold should be for parsing a phone number.

Didn't get any strong opinions, so I think this looks good to me!

Thanks for the fix!

@devansh016 devansh016 requested a review from jeryj September 5, 2024 16:13
@ajlende ajlende merged commit 57c59d1 into WordPress:trunk Sep 5, 2024
@github-actions github-actions bot added this to the Gutenberg 19.3 milestone Sep 5, 2024
@ajlende ajlende self-assigned this Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[Feature] Link Editing Link components (LinkControl, URLInput) and integrations (RichText link formatting) First-time Contributor Pull request opened by a first-time contributor to Gutenberg repository [Type] Enhancement A suggestion for improvement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add tel: when linking phone numbers

4 participants