core(seo): support spanish in link-text audit#7547
core(seo): support spanish in link-text audit#7547paulirish merged 2 commits intoGoogleChrome:masterfrom Markel:patch-1
Conversation
Words included: - click here - here - more - more information - this - link - this link - start
There was a little space in the last line which I had deleted.
patrickhulce
left a comment
There was a problem hiding this comment.
Thanks very much for the contribution @MarkelFe!
Core team: We sortof accepted the Japanese contribution without a lot of process, but if we're going to be accepting more languages we might want a bit more consideration.
Maybe we say just top 10 languages on the internet welcome or something?
Maybe we want at least 2 native speakers to agree the sets are reasonable and not too wide?
Just thinking out loud.
| '続く', | ||
| '全文表示', | ||
| // Spanish | ||
| 'click aquí', |
There was a problem hiding this comment.
There's a fair amount of duplication with the permutations, I wonder if we could do some replacements to keep the signal in here high?
could be as simple as an easy map for now
const CHARACTER_REPLACEMENTS = new Map([
[/í/g, 'i'],
[/á/g, 'a'],
[/ó/g, 'o'],
])
function normalizeText(text) {
let normalized = text.trim().toLowerCase();
for (const [regex, replacement] of CHARACTER_REPLACEMENTS.entries()) {
normalized = normalized.replace(regex, replacement)
}
return normalized;
}
...
return BLOCKLIST.has(normalizeText(link.text));DISCLAIMER: untested
There was a problem hiding this comment.
I wonder if we could do some replacements to keep the signal in here high?
ha, the problem is with this change I would wonder if normalizeText is quite correct or if it's missing something or adding things I'm not expecting, versus just scanning a medium sized list :)
But I'd be ok with tests for it 👍
brendankenny
left a comment
There was a problem hiding this comment.
Core team: We sortof accepted the Japanese contribution without a lot of process, but if we're going to be accepting more languages we might want a bit more consideration.
yeah, considering we barely had a process for the original list of english terms, I really don't know what the best approach is here
Maybe we want at least 2 native speakers to agree the sets are reasonable and not too wide?
seems like a good idea to me
| '続く', | ||
| '全文表示', | ||
| // Spanish | ||
| 'click aquí', |
There was a problem hiding this comment.
I wonder if we could do some replacements to keep the signal in here high?
ha, the problem is with this change I would wonder if normalizeText is quite correct or if it's missing something or adding things I'm not expecting, versus just scanning a medium sized list :)
But I'd be ok with tests for it 👍
I have added a couple of translations for the Links Do Not Have Descriptive Text audit in Spanish.
Note that in Spanish there is this symbol:
´. People forget to write it so we may take into account these errors, let me know your opinion.Related Issues/PRs
#5322 but in Spanish