People in the Middle East Disagree About a Lot of Things, But I’m Quite Sure That They All Agree That This Is the Silliest Android Bug Ever

There’s a popular podcast produced by the New York Times: Hard Fork. It talks about technology, and since a lot of people these days find it difficult to talk about technology without mentioning “AI”, the two Hard Fork hosts make two disclaimers in almost every episode: That the New York Times is suing OpenAI, and that the boyfriend of one of the hosts works for Anthropic, which makes the Claude conversation simulator.

There is, however, another thing that is common to both the New York Times and Anthropic, and it has nothing to do with “AI”. It’s that Android apps made by these two companies have the same bug, and this bug is mindbogglingly silly.

This bug makes countless Android apps partially or completely broken for anyone whose phone interface is set to Hebrew, Arabic, Persian, Urdu, or any other right-to-left (RTL) language. The most frustrating part? This widespread problem, which affects hundreds of millions of users globally, can be fixed by changing one line of code.

The Problem: When English Apps Go RTL

I use my Android phone with a Hebrew interface because, well, Hebrew is my language. I don’t expect all English-language apps to support Hebrew. The New York Times, for example, publishes almost everything in English, and that’s OK. The Claude app’s user interface is in English and hasn’t been localized to Hebrew, at least not yet, and that’s not too bad either. I know English, and when I choose an English-language app, I just want it to work well in English.

What’s not fine is when these apps break because they’re trying to adapt to right-to-left languages when they have no business doing so. Apps try to do this because they see that my phone asks for Hebrew user interface. They are trying—and failing. The results are grim.

In the NYT Cooking app, recipe reviews get hidden behind star ratings, English text awkwardly aligns to the right, and ellipsis marks appear on the wrong side of photo captions. The app becomes harder to use despite being entirely in English.

The Claude AI app suffers from similar problems—interface elements flip inappropriately, making the English-language interface confusing and sometimes unusable.

Perhaps most dramatically, in 2019, I captured a screenshot from the Delta Air Lines app that showed a flight appearing to go from Atlanta to JFK when it was actually going from JFK to Atlanta.

While Delta seems to have fixed its app since 2019, it illustrates how RTL bugs can be genuinely misleading.

The Root Cause: Android Studio’s Default Behavior

The real culprit isn’t individual app developers—it’s Google’s Android development ecosystem. When developers create new Android apps using Android Studio, the most popular development tool, the default configuration includes android:supportsRtl="true" in the app’s configuration file, AndroidManifest.xml.

This setting tells Android to automatically flip the app’s layout for RTL languages. Google wants to encourage and simplify RTL language support for developers, but it goes too far: developers don’t even think that they need to do anything, and the result is broken.

The irony is that we’re living in an age of incredibly sophisticated AI and machine learning, yet this simple localization bug—which has nothing to do with advanced computer science—causes daily frustration for hundreds of millions of people.

The Scale of the Problem

This isn’t a niche issue. Consider the numbers:

Arabic: more than 400 million speakers
Urdu: more than 200 million speakers
Persian: more than 110 million speakers
Hebrew: about 9 million speakers
And there are other RTL languages: Punjabi, Uyghur, Yiddish, and more.

We’re talking about hundreds of millions of people who experience degraded app performance through no fault of their own. They’re not asking for their native language support—they just want English apps to work properly when their phone’s system language happens to be RTL. Because of problems like this, many of those people choose to use their phones in English and get all the apps in English, even though many of them don’t actually know English very well.

So please don’t tell me to switch my phone to English—it’s not actually a solution.

There’s another bitter irony in this situation. Hebrew speakers, Arabic speakers, and Persian speakers are often divided by geopolitics and conflict. Yet we’re all united by the same stupid software bugs in the support for the languages that we speak, read, write, and love.

I’ve spoken with Palestinians, Saudis, Iranians, and Pakistanis about this issue. We all face the same broken apps, the same UI frustrations, the same feeling of being an afterthought in software design. Perhaps instead of fighting each other, we should unite in fighting these bugs—with code and constructive feedback, of course.

My Quixotic Quest for Fixes

I’ve become something of a Don Quixote in this fight, reporting this bug to dozens of companies. The responses have been telling:

Most companies, including Anthropic: Complete silence.
Some companies, including The New York Times, as well as Dave & Busters, italki, Citizens Bank, and many, many others: Promised to fix it, but didn’t.
A few apps, like Dunkin’ Donuts and Podcast Addict, actually got fixed as a result of my emails. ❤️
One company, Drive Less, a local Rhode Island biking app, not only fixed it, but also sent me a $20 gift card. ❤️❤️
One more company, the Massachusetts Bay Transportation Authority, also known as the MBTA or “the T”, published the source code of its Android app on GitHub under a Free Software license, so I sent a fix, and they quickly merged it and released an update!¹ ❤️❤️❤️

Despite the few positive examples at the end of this list, the pattern is clear: this is a fixable problem, but most companies don’t prioritize it because it doesn’t affect English-speaking decision-makers.

The Absurdly Simple Solution

Here’s the fix that would solve this problem in almost all affected apps:

In the AndroidManifest.xml file, change the line android:supportsRtl="true" to android:supportsRtl="false".

That’s it. One line.

This tells Android: “This app doesn’t support RTL layouts, so don’t try to flip anything.” The app will continue working normally in English, regardless of the user’s system language.

Apps that genuinely want to support RTL languages—which is commendable!—should keep the setting as "true", but then properly implement RTL layouts with appropriate testing and design considerations.
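For anyone who maintains several apps and wants to apply this change mechanically, here is a minimal Python sketch. It is only an illustration, not part of any official tooling: the sample manifest and the `disable_rtl` function name are made up for this example, and in a real project it may be simpler to edit the file by hand.

```python
import xml.etree.ElementTree as ET

# The XML namespace that Android uses for "android:" attributes.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
SUPPORTS_RTL = f"{{{ANDROID_NS}}}supportsRtl"

# Hypothetical minimal manifest, for illustration only.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
    <application android:label="Example" android:supportsRtl="true" />
</manifest>"""


def disable_rtl(manifest_xml: str) -> str:
    """Return the manifest text with android:supportsRtl set to "false"."""
    # Keep the "android:" prefix when the tree is written back out.
    ET.register_namespace("android", ANDROID_NS)
    root = ET.fromstring(manifest_xml)
    application = root.find("application")
    if application is None:
        raise ValueError("no <application> element found")
    application.set(SUPPORTS_RTL, "false")
    return ET.tostring(root, encoding="unicode")


fixed = disable_rtl(SAMPLE_MANIFEST)
print(fixed)
```

Note that ElementTree drops XML comments and may normalize whitespace when it rewrites a file, which is one more reason to treat this as a sketch and not a drop-in tool.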

How to Make an Even Bigger Change

While individual app developers can fix it in their products, this is not really scalable. The real, big solution needs to happen at the platform level. Most importantly, Android Studio should change its defaults: New projects shouldn’t include RTL support unless developers explicitly opt in. I’m not sure how to fix it in all the existing apps, but at least in theory, it’s possible.

So What Can You Do

If you’re an Android developer, check your app’s RTL behavior. If you’re not intentionally supporting RTL languages, please set supportsRtl="false".

If you work at Google or influence Android development tools, please consider changing the default behavior to be opt-in rather than opt-out.

If you’re a user affected by these issues, don’t suffer in silence. Report bugs to app developers. Many don’t even know these problems exist. At least some of them will fix them.

Technology should work for everyone, regardless of which language they speak or which direction their language is written. This bug represents a small but important way that our interconnected world still fails to accommodate its own diversity.

The fix is simple. The impact would be enormous. All it takes is the will to change one line of code—and one default setting—at a time.

¹ Notably, the NJ Transit and the New York MTA’s TrainTime apps still have this bug, even though I’m quite sure that I reported it to them. In the battle of the state transportation agencies for not giving broken apps to people who use their phones in RTL languages, Massachusetts’ MBTA wins big time for now!

“The fix is to complete the localization”. Not letting people do it is a bug. (Also, some non-standard observations about American health insurance.)

It sometimes happens in people’s lives that someone tells them something that sounds true and obvious at the time. It turns out that it actually is objectively true, and it is also obvious, or at least sensible, to the person who hears it, but it’s not obvious to other people. But it was obvious to them, so they think that it is obvious to everyone else, even though it isn’t.

It happens to everyone, and we are probably all bad at consistently noticing it, remembering it, and reflecting on it.

This post is an attempt to reflect on one such occurrence in my life; there were many others.

(Comment: This whole post is just my opinion. It doesn’t represent anyone else. In particular, it doesn’t represent other translatewiki.net administrators, MediaWiki developers or localizers, Wikipedia editors, or the Wikimedia Foundation.)

There’s the translatewiki.net website, where the user interface of MediaWiki, the software that powers Wikipedia, as well as of some other Free Software projects, is translated to many languages. This kind of translation is also called “localization”. I mentioned it several times on this blog, most importantly at Amir Aharoni’s Quasi-Pro Tips for Translating the Software That Powers Wikipedia, 2020 Edition.

Siebrand Mazeland used to be the community manager for that website. Now he’s less active there, and, although it’s a bit weird to say it, and it’s not really official, these days I kind of act like one of its community managers.

In 2010 or so, Siebrand heard something about a bug in the support of Wikipedia for a certain language. I don’t remember which language it was or what the bug was. Maybe I myself reported something in the display of Hebrew user interface strings, or maybe it was somebody else complaining about something in another language. But I do remember what happened next. Siebrand examined the bug and, with his typical candor, said: “The fix is to complete the localization”.

What he meant is that one of the causes of that bug, and perhaps the only cause, was that the volunteers who were translating the user interface into that language didn’t translate all the strings for that feature (strings are also known as “messages” in MediaWiki developers’ and localizers’ jargon). So instead of rushing to complain about a bug, they should have completed the localization first.

To generalize it, the functionality of all software depends, among many other things, on the completeness of user interface strings. They are essentially a part of the algorithm. They are more presentation than logic, but the end user doesn’t care about those minor distinctions—the end user wants to get their job done.

Those strings are usually written in one language—often English, but occasionally Japanese, Russian, French, or another one. In some software products, they may be translated into other languages. If the translation is incomplete, then the product may work incorrectly in some ways. On the simplest level, users who want to use that product in one language will see the user interface strings in another language that they possibly can’t read. However, it may go beyond that: writing systems for some languages require special fonts, applying which to letters from another writing system may cause weird appearance; strings that are supposed to be shown from left to right will be shown from right to left or vice versa; text size that is good for one language can be wrong for another; and so forth.

In many cases, simply completing the translation may quietly fix all those bugs. Now, there are reasons why the translation is incomplete: it may be hard to find people who know both English and this language well; the potential translator is a volunteer who is busy with other stuff; the language lacks necessary technical terminology to make the translations, and while this is not a blocker —new terms can be coined along the way—, this may slow things down; a potential translator has good will and wants to volunteer their time, but hasn’t had a chance to use the product and doesn’t understand the messages’ context well enough to make a translation; etc. But in theory, if there is a volunteer who has relevant knowledge and time, then completing the translation, by itself, fixes a lot of bugs.

Of course, it may also happen that the software actually has other bugs that completing the localization won’t fix, but that’s not the kind of bugs I’m talking about in this post. Or, going even further, software developers can go the extra mile and try to make their product work well even if the localization is incomplete. While this is usually commendable, it’s still better for the localizers to complete the localization. After all, it should be done anyway.

That’s one of the main things that motivate me to maintain the localization of MediaWiki and its extensions into Hebrew at 100%. From the perspective of the end users who speak Hebrew, they get a complete user experience in their language. And from my perspective, if there’s a bug in how something works in Wikipedia in Hebrew, then at least I can be sure that the reason for it is not that the translation is incomplete.

As one of the administrators of translatewiki, I try my best to make complete localization in all languages not just possible, but easy.¹ It directly flows out of Wikimedia’s famous vision statement:

Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.

I love this vision, and I take the words “Every single human being” and “all knowledge” seriously; they implicitly mean “all languages”, not just for the content, but also for the user interface of the software that people use to read and write this content.

If you speak Hindi, for example, and you need to search for something in the Hindi Wikipedia, but the search form works only in English, and you don’t know English, finding what you need will be somewhere between hard and impossible, even if the content is actually written in Hindi somewhere. (Comment #1: If you think that everyone who knows Hindi and uses computers also knows English, you are wrong. Comment #2: Hindi is just one example; the same applies to all languages.)

Granted, it’s not always actually easy to complete the localization. A few paragraphs above, I gave several general examples of why it can be hard in practice. In the particular case of translatewiki.net, there are several additional, specific reasons. For example, translatewiki.net was never properly adapted to mobile screens, and it’s increasingly a big problem. There are other examples, and all of them are, in essence, bugs. I can’t promise to fix them tomorrow, but I acknowledge them, and I hope that some day we’ll find the resources to fix them.

Many years have passed since I heard Siebrand Mazeland saying that the fix is to complete the localization. Soon after I heard it, I started dedicating at least a few minutes every day to living by that principle, but only today did I bother to reflect on it and write this post. The reason I did it today is surprising: I tried to do something about my American health insurance (just a check-up, I’m well, thanks). I logged in to my dental insurance company’s website, and… OMFG:

What you can see here is that some things are in Hebrew, and some aren’t. If you don’t understand the Hebrew parts, that’s OK, because you aren’t supposed to: they are for Hebrew speakers. But you should note that some parts are in English, and they are all supposed to be in Hebrew.

For example, you can see that the exclamation point is at the wrong end of “Welcome, Amir!“. The comma is placed unusually, too. That’s because they oriented the direction of the page from right to left for Hebrew, but didn’t translate the word “Welcome” in the user interface.² If they did translate it, the bug wouldn’t be there: it would correctly appear as “ברוך בואך, Amir!“, and no fixes in the code would be necessary.
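The reason the exclamation point moves while the letters stay put is that, in Unicode's bidirectional algorithm, letters have a strong direction, but most punctuation is neutral and takes its direction from its surroundings. Here is a rough way to see this, assuming Python and its standard `unicodedata` module. The `bidi_classes` helper is a hypothetical name used only for illustration; it looks up character classes and does not run the full algorithm.

```python
import unicodedata


def bidi_classes(text):
    """Pair every character with its Unicode bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# A Latin letter, a Hebrew letter, and an exclamation point.
for ch, bidi_class in bidi_classes("Aב!"):
    print(repr(ch), bidi_class)
```

The letters come back as L (left-to-right) and R (right-to-left), while the exclamation point comes back as ON, which means "other neutral". At the end of a right-to-left paragraph, a neutral character follows the paragraph direction, so it lands on the left.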

You can also see a wrong exclamation point at the end of “Thanks for being a Guardian member!”.

There are also less obvious bugs here. In the word “WIKIMEDIA” under the “Group ID” dropdown, the letter “W” is only partly visible. That’s also a typical RTL bug: the menu may be too narrow for a long string, so the string can be visually truncated, but it should happen at the end of the string and not at the beginning. Because the software here thinks that the end is on the left, the beginning gets truncated instead. This is not exactly an issue that can be fixed just by completing the localization, but if the localization were complete, it would be easier to notice.

There are even more issues that you don’t notice if you don’t know Hebrew. For example, there’s a button with a weird label at the top right. Most Hebrew speakers will understand that label as “a famous website”, which is probably not what it is supposed to say. It’s more likely that it’s supposed to say “published web page”, and the translator made a mistake. Completing the translation correctly would fix this mistake: a thorough translator would review their work, check all the usages of the relevant words, and likely come up with a correct translation. (And maybe the translation is not even made by a human but by machine translation software, in which case it’s the product manager’s mistake. Software should never, ever be released with user interface strings that were machine-translated and not checked by a human.)

Judging by the logo at the top, the dental insurance company used an off-the-shelf IBM product for managing clients’ info. If I ask IBM or the insurance company nicely, will they let me complete the localization of this product, fixing the existing translation mistakes, and filing the rest of the bugs in their bug tracking software, all without asking for anything in return? Maybe I’ll actually try to do it, but I strongly suspect that they will reject this proposal and think that I’m very weird. In case you wonder, I actually tried doing it with some companies, and that’s what happened most of the time.

And this attitude is a bug. It’s not a bug in code, but it is very much a problem in product management and attitude toward business.

If you want to tell me “Amir, why don’t you just switch to English and save yourself the hassle”, then I have two answers for you.

The first answer is described in detail in a blog post I wrote many years ago: The Software Localization Paradox. Briefly: Sure, I can save myself the hassle, but if I don’t notice it and speak about it, then who will?

The second answer is basically the same, but with more pathos. It’s a quote from Avot 1:14, one of the most famous and cited pieces of Jewish literature outside the Bible: If I am not for myself, who is for me? But if I am for my own self, what am I? And if not now, when? I’m sure that many cultures have proverbs that express similar ideas, but this particular proverb is ours.

And if you want to tell me, “Amir, what is wrong with you? Why does it even cross your mind to want to help not one, but two ultramegarich companies for free?”, then you are quite right, idealistically. But pragmatically, it’s more complicated.

Wikimedia understands the importance of localization and lets volunteers translate everything. So do many other Free Software projects. But experience and observation taught me that for-profit corporations don’t prioritize good support for languages unless regulation forces them to do it or they have exceptionally strong reasons to think that it will be good for their income or marketing.

It did happen a few times that corporations that develop non-Free software let volunteers localize it: Facebook, WhatsApp, and Waze are somewhat famous examples; Twitter used to do it (but stopped long ago); and Microsoft occasionally lets people do such things. Also, Quora reached out to me to review the localization before they launched in Hebrew and even incorporated some of my suggestions.³

Very often, however, corporations don’t want to do this at all, and when they do it, they often don’t do it very well. But people who don’t know English want—and often need!—to use their products. And I never get tired of reminding everyone that most people don’t know English.

So for the sake of most of humanity, someone has to make all software, including the non-Free products, better localized, and localizable. Of course, it’s not feasible or sustainable that I alone will do it as a volunteer, even for one language. I barely have time to do it for one language in one product (MediaWiki). But that’s why I am thinking of it: I would be not so much helping a rich corporation here as I would be helping people who don’t know English.

Something has to change in the software development world. It would, of course, be nice if all software became Freely-licensed, but if that doesn’t happen, it would be nice if non-Free software would be more open to accepting localization from volunteers. I don’t know how this change will happen, but it is necessary.

If you bothered to read until here, thank you. I wanted to finish with two things:

To thank Siebrand Mazeland again for doing so much to lay the foundations of the MediaWiki localization and the translatewiki community, and for saying that the fix is to complete the localization. It may have been an off-hand remark at the time, but it turned out that there was much to elaborate on.
To ask you, the reader: If you know any language other than English, please use all apps, websites, and devices in this language as much as you can, bother to report bugs in its localization to that language, and invest some time and effort into volunteering to complete the localization of this software to your language. Localizing the software that runs Wikipedia would be great. Localizing OpenStreetMap is a good idea, too, and it’s done on the same website. Other projects that are good for humanity and that accept volunteer localization are Mozilla, Signal, WordPress, and BeMyEyes. There are many others.⁴ It’s one of the best things that you can do for the people who speak your language and for humanity in general.

¹ And here’s another acknowledgement and reflection: This sentence is based on the first chapter of one of the most classic books about software development in general and about Free Software in particular: Programming Perl by Larry Wall (with Randal L. Schwartz, Tom Christiansen, and Jon Orwant): “Computer languages differ not so much in what they make possible, but in what they make easy”. The same is true for software localization platforms. The sentence about the end user wanting to get their job done is inspired by that book, too.

² I don’t expect them to have my name translated. While it’s quite desirable, it’s understandably difficult, and there are almost no software products that can store people’s names in multiple languages. Facebook kind of tries, but does not totally succeed. Maybe it will work well some day.

³ Unfortunately, as far as I can tell, Quora abandoned the development of the version in Hebrew and in all other non-English languages in 2022, and in 2023, they abandoned the English version, too.

⁴ But please think twice before volunteering to localize blockchain or AI projects. I heard several times about volunteers who invested their time into such things, and I was sad that they wasted their volunteering time on this pointlessness. Almost all blockchain projects are pointless. With AI projects, it’s more complicated: some of them are actually useful, but many are not. So I’m not saying “don’t do it”, but I am saying “think twice”.

Google’s Lies and the Problem That Large Language Models Won’t Solve

I switched the search settings not to use Google by default in web browsers on all my devices after reading the blog post that the Head of Google Search published in response to the many reports of problems in Google’s “AI Overviews”.

Practically every point in that blog post is either a meaningless generality written in corporatespeak or a demonstrable lie. You don’t need specialized engineering knowledge or access to internal information to see it. You just need common sense.

User feedback shows that with AI Overviews, people have higher satisfaction with their search results…

Which people? Everyone? I don’t. I sharply reduced my use of Google search because I no longer trust it.

… and they’re asking longer, more complex questions that they know Google can now help with.

The word “help” is doing a lot of work here. Google can output a piece of text in response. Is this piece of text actually helpful?

AI Overviews work very differently than chatbots and other LLM products that people may have tried out.

No, they don’t. They work exactly the same. Both technologies automatically produce some text that was not written by a human.

They’re not simply generating an output based on training data.

No. They are, in fact, simply generating an output based on training data.

When AI Overviews get it wrong, it’s usually for other reasons: misinterpreting queries, misinterpreting a nuance of language on the web, or not having a lot of great information available. (These are challenges that occur with other Search features too.)

This is one of the few true things in this blog post, but it shows why this feature is completely pointless!

I mean, it’s nice that she doesn’t blame the users here for writing bad queries, but admits that the software that her team developed is bad at interpreting them.

And here’s an even more important thing: Despite the long-standing impression that “you can find everything on Google”, the “AI” innovations of the last couple of years help us realize that there are actually many topics about which there is not a lot of info online. And large language models are not going to solve this problem.

This approach is highly effective.

What does this even mean? “We are able to show more ads and improve our bottom line for the last quarter?”

Overall, our tests show that our accuracy rate for AI Overviews is on par with another popular feature in Search — featured snippets — which also uses AI systems to identify and show key info with links to web content.

This is probably the biggest lie of all in that whole post.

There is no comprehensive test or measure for accuracy! It is logically impossible to make one!

At most, there is some internal metric that middle managers present to senior managers, and it may show that the rate is “positive” according to internal company logic. However, it has absolutely nothing to do with what millions of web users actually need.

This is comparable to metrics of quality of machine translation, such as BLEU and NIST. There are methodologies and formulas behind them, but they are only useful for discussions among researchers, developers, and product and project managers, and they have very limited usefulness at predicting the correctness of the translation of a text that hasn’t yet been tested. Developers have to use those metrics because project managers love metrics, but most of them admit that they are not very good, and such a metric can never become perfect.
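To make the limitation concrete, here is a toy sketch in Python of clipped word-overlap precision, which is one ingredient of BLEU and not the full metric. The sentences and the function name are invented for illustration.

```python
from collections import Counter


def clipped_unigram_precision(candidate: str, reference: str) -> float:
    """Share of candidate words that also appear in the reference.

    Each word is counted at most as many times as it occurs in the reference.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[token]) for token, count in cand.items())
    return overlap / total


reference = "the bug is fixed in this release"
wrong = clipped_unigram_precision("the bug is not fixed in this release", reference)
paraphrase = clipped_unigram_precision("this release resolves the issue", reference)
print(wrong, paraphrase)
```

In this toy case, the candidate that reverses the meaning scores higher than the correct paraphrase, simply because it shares more words with the reference.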

In a small number of cases, we have seen AI Overviews misinterpret language on webpages and present inaccurate information.

Yes, thanks again for admitting that computers are not supposed to interpret language in the first place. Humans are supposed to do it.

I could go on, but I have better things to do, like publishing three longish blog posts of my own. One is coming very soon, and it’s going to be fun, at least for me.

In response to accusations of monopolistic behavior, Google has been saying for years that competition is just a click away. It’s true, and it’s good. My experience with DuckDuckGo in the last few days has been perfectly fine.

That said, Google should still be tried for monopolistic behavior. And I kind of wish that there was regulation that prevents the deliberate destruction of fundamental public goods operated by commercial companies, but I guess that it would be very hard to legislate.

In the meantime, let’s try not to be silent about Google’s lies, and let’s consider using the competitors.

How Gboard Could Be Better for Hebrew

Oh (edit): Most of these suggestions are implemented as of February 7 2018. The only significant change that still does not seem to be implemented is the Oleh character. Thank you, Google, for your continued improvements of Gboard.

I mostly use the Gboard app for writing on my phone. The Samsung keyboard is generally not bad, but it doesn’t include Hebrew vowels, and I need them.

There are, however, several characters that are needed for Hebrew, and that aren’t included in Gboard, and some unnecessary characters could be removed.

These can be removed:

Long-pressing the minus (-) in the punctuation keyboard shows interpunct (·) and the em dash (—). They are unnecessary for Hebrew. The en dash (–) must not be removed, but see below.
The low line (_) appears twice in the punctuation keyboard: as its own key to the left of &, and as an option when long-pressing the minus (-). One of them can be removed. I’ll further argue that the en dash (–) is more useful for Hebrew than the low line (_), and the standalone low line can be replaced with the en dash. The low line is not used much anywhere except programming, while the en dash is useful for typing ranges correctly in Hebrew. I’ll readily admit that not a lot of Hebrew speakers know about the en dash’s correct semantics, but not many more people use the low line anyway.

And these should be added:

Maqaf (־, U+05be): It’s the Hebrew hyphen. It has different appearance and different direction semantics. It should be available when you long-press the minus in the main keyboard, and can also appear when you long-press the minus in the punctuation keyboard (for example, instead of the unnecessary em dash).
Geresh (׳, U+05f3) and Gershayim (״, U+05f4): These punctuation marks are similar in appearance to quotation marks, but they have different semantics. Apple went as far as replacing quotation marks on Hebrew keyboards on its devices with Geresh and Gershayim, which is an exaggeration. The usual quotation marks (‘, “) are used by most people, even though they are not perfect, and they must stay on Gboard where they are. The elegant Hebrew quotation marks (‚’„”) also appear on Gboard and must not be removed. Geresh and Gershayim can be added on the additional punctuation
Rafe (U+05bf): It’s a diacritic that looks like a line above a letter, and the opposite of dagesh, which is already available. It can appear when you long-press the letter resh (ר).
Oleh (U+05ab): It’s a diacritic that looks like a left-pointing arrow above a letter, and in modern Hebrew it signifies stress. It can appear when you long-press the letter ayin (ע).

The five characters that I suggest adding are already part of the standard Hebrew keyboard (SII 1452), which is implemented in Windows 8. They must also be available in Android.
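For reference, the code points listed above can be checked against the Unicode character database. Here is a small sketch, assuming Python's standard `unicodedata` module; the dictionary name is arbitrary.

```python
import unicodedata

# The five characters proposed above, keyed by the names used in this post.
PROPOSED_CHARACTERS = {
    "Maqaf": "\u05be",
    "Geresh": "\u05f3",
    "Gershayim": "\u05f4",
    "Rafe": "\u05bf",
    "Oleh": "\u05ab",
}

for label, char in PROPOSED_CHARACTERS.items():
    # Print the code point and the official Unicode character name.
    print(f"{label}: U+{ord(char):04X} {unicodedata.name(char)}")
```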

I hope that Google developers see this and make the necessary changes.

Weird GMail Habit: Removing Control Characters

GMail has a weirdish feature that probably very few people except me know about. When using it with a Hebrew user interface, invisible control characters—LRM, RLM, RLE, LRE and the like—are added to some strings to make them appear correctly in a mixed-direction interface.

Most notably, they are added to email addresses. I sometimes want to copy these email addresses as text, and my mouse pointer picks the control characters as well. Of course, these control characters are by themselves invisible to humans, but very much visible to computers, and an email address with these characters is not correct, even if it appears to be the same to human eyes.

It already became a habit for me to carefully delete and manually restore the first and the last characters of an email address to make sure that the control characters are removed.

It would be better if GMail just used the <bdi> element or CSS bidi isolation. They are fairly well supported in modern browsers and provide a better experience.
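
Until that happens, a copied address can be cleaned up programmatically instead of by hand. The following is only a sketch of the idea in Python: the strip_bidi_controls function and the example address are hypothetical, and the list of characters is my own selection of the common bidi controls, not something taken from GMail.

```python
# Invisible bidi control characters: LRM, RLM, ALM, the embeddings and
# overrides (LRE, RLE, PDF, LRO, RLO) and the isolates (LRI, RLI, FSI, PDI).
BIDI_CONTROLS = (
    "\u200e\u200f\u061c"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)

# str.translate deletes every character whose code point is mapped to None.
_DELETE_BIDI_CONTROLS = dict.fromkeys(map(ord, BIDI_CONTROLS))


def strip_bidi_controls(text):
    """Return text with all bidi control characters removed."""
    return text.translate(_DELETE_BIDI_CONTROLS)


# A hypothetical address, copied together with an LRE ... PDF pair around it.
copied = "\u202aamir@example.com\u202c"
print(repr(copied))
print(repr(strip_bidi_controls(copied)))
```

The two printed values look identical on screen when printed without repr, which is exactly the problem: only the second one is a valid address.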

Serbian Spam

I always celebrate when I receive spam in a language in which I haven’t yet received spam. I just received spam in Serbian for the first time. It was in the Cyrillic alphabet; Serbian can also be written in Latin, and it is frequently done in Serbia, possibly even more frequently than in Cyrillic, even though the government prefers Cyrillic.

This makes me wonder: Is Serbian in Cyrillic popular and important enough for spamming in it, or did the silly spammer just use Google Translate to translate to Serbian and get the result in Cyrillic, because that’s what Google Translate does?

If you know Serbian, can you please tell me whether it looks real or machine-translated? Words like “5иеарс” and the spaces before the punctuation marks give me a strong suspicion that it’s machine translation, but I might be wrong.

Молим вас за попустљивост за нежељене природи овог писма , али је рођена из очаја и тренутног развоја . Молимо носе са мном . Моје име је сер Алекс Бењамин Хубертревизор Африке развојне банке открио постојећи налог за успавану 5иеарс .

Када сам открио да није било ни наставак ни исплате са овог рачуна на овог дугог периода и наши банкарских закона предвиђа да ће било неупотребљивим чине више од 5иеарс иду на банковни прихода као неостварен фонда .

Ја сам се распитивала за личне депонента и његове најближе , али нажалост ,депонент и његове најближе преминуо на путу до Сенегала за тајкун , а он је оставио иза себе нема тело за ову тврдњу само сам направио ову истрагу само да буде двоструко сигурни у ту чињеницу , а пошто сам био неуспешан у лоцирању родбину .

So, how does it look? And do you receive Serbian spam? Thanks.

Broken right-to-left writing in the new GMail compose interface

This is a very old post. The information is irrelevant and kept only for historical interest. I’ve added this note in 2026 because I noticed that a lot of people still read it. It probably comes up high in search results.

Shalom.

Dear Google, this is a cry for help.

It seems that the new GMail compose interface overrides Firefox’s Ctrl-Shift-X shortcut, which switches the writing direction. It also overrides the right-click->Switch writing direction function; it simply doesn’t do anything.

I cannot do this in Google Chrome either, because of bug 91178 – There seems to be no way to set an input’s direction on Linux nor Chrome OS.

I can probably switch the direction by using rich text, but using rich text has its own issues, and I usually want to send my email in plain text.

Dear Google, please fix this. I tried the new compose interface several times and I complained about this problem in emails to my googler friends. Unfortunately this is still not fixed, and starting from today I can’t go back to the old compose interface.

I understand, of course, that GMail is a free service that doesn’t come with a warranty. Dear Google, I am asking you a favor. You did, in fact, contribute quite a lot to the development of support for right-to-left languages on the Web. I am only asking you to keep this support good.

Thank you.

P.S. Dear Google, please ask Google employees who speak right-to-left languages to use Google products in these languages, and to write email in these languages. Dog-fooding is the best testing. Thank you, again.

Look! I am Making All Things New

For the last couple of years I’ve been helping my parents to learn to use computers. Mostly very common and well-known things: GMail, Picasa, searching Google, reading news websites, talking on Skype, the Russian social network Odnoklassniki, and not much more than that.

One of the most curious things that I found in my experiences with them is that emails and popups about new features are completely unhelpful to them. They always call me when they get them and ask me what to do now. It is awkward, because basically the emails tell them what to do, but instead of reading them and learning, they are reading them aloud to me:

— “It says: ‘Now you can find your friends more easily by typing their names in the search box’—so what do I do now?”

— “I don’t know… When you want to find somebody, type their names in the search box maybe?”

I am not saying that my parents are stupid; they aren’t. I am saying that these emails are not helpful. They appear to arrive from the helpful people in Google or Odnoklassniki, but the fact is that every time it happens, my parents are confused.

This makes me wonder: Is the effectiveness of these emails and popups and callouts researched? What are they good for? I don’t find them useful, because I actually like to find out things by myself; that’s my idea of user-friendliness: if it’s not self-explanatory, it is not user-friendly. My parents don’t find them useful, because they ask me what they have to do. So is it useful for anybody?

PS 1: I know that Odnoklassniki is awful. They insisted.

PS 2: I know that Skype is not Free Software and that it doesn’t respect people’s privacy. Give me something properly Free that actually works. For what it’s worth, I did teach both of my parents to use Firefox and they hate other browsers, and on my mother’s laptop I installed Fedora, so except Skype, her online experience is almost completely Free.

A Relevant Tower of Babel

The Tower of Babel is frequently used as a symbol of foreign languages. For example, several language software packages are named after it, such as the Babylon electronic dictionary, MediaWiki’s Babel extension and the Babelfish translation service (itself named after the Babel fish from The Hitchhiker’s Guide).

In this post I shall use the Tower of Babel in a somewhat more relevant and specific way: The post will speak about multilingualism and about Babel itself.

This is how most people saw the Wikipedia article about the Tower of Babel until today:

The Tower of Babel article. Notice the pointless squares in the Akkadian name. They are called “tofu” in the jargon of internationalization programmers.

And this is how most people will see it from today:

And we have the name written in real Akkadian cuneiform!

Notice how the Akkadian name now appears as actual Akkadian cuneiform, and not as meaningless squares. Even if you, like most people, cannot actually read cuneiform, you probably understand that showing it this way is more correct, useful and educational.

This is possible thanks to the webfonts technology, which was enabled on the English Wikipedia today. It was already enabled in Wikipedias in some languages for many months, mostly in languages of India, which have severe problems with font support in the common operating systems, but now it’s available in the English Wikipedia, where it mostly serves to show parts of text that are written in exotic fonts.

The current iteration of the webfonts support in Wikipedia is part of a larger project: the Universal Language Selector (ULS). I am very proud to be one of its developers. My team in Wikimedia developed it over the last year or so, during which it underwent a rigorous process of design, testing with dozens of users from different countries, development, bug fixing and deployment. In addition to webfonts it provides an easy way to pick the user interface language, and to type in non-English languages (the latter feature is disabled by default in the English Wikipedia; to enable it, click the cog icon near “Languages” in the sidebar, then click “Input” and “Enable input tools”). In the future it will provide even more abilities, so stay tuned.

If you edit Wikipedia, or want to try editing it, one way in which you could help with the deployment of webfonts would be to make sure that all foreign strings in Wikipedia are marked with the appropriate HTML lang attribute; for example, that every Vietnamese string is marked as <span lang="vi" dir="ltr">. This will help the software apply the webfonts correctly, and in the future it will also help spelling and hyphenation software, etc.
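
To show what such markup looks like in full, here is a small Python sketch that wraps a string in a span with lang and dir attributes. The mark_language function is a hypothetical helper of mine, not part of MediaWiki or ULS; on Wikipedia itself you would write the markup in the page source, and this only illustrates the resulting HTML.

```python
from html import escape


def mark_language(text, lang, direction="ltr"):
    """Wrap text in a span that declares its language and direction."""
    return (
        f'<span lang="{escape(lang, quote=True)}" '
        f'dir="{escape(direction, quote=True)}">{escape(text)}</span>'
    )


# A Vietnamese string, marked the way the paragraph above suggests.
print(mark_language("Tiếng Việt", "vi"))
```

Note that the attribute values use plain straight quotes. Curly quotes, which word processors and blog engines love to insert, are not valid attribute delimiters in HTML.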

This wouldn’t be possible without the help of many, many people. The developers of Mozilla Firefox, Google Chrome, Safari, Microsoft Internet Explorer and Opera, who developed the support for webfonts in these browsers; The people in Wikimedia who designed and developed the ULS: Alolita Sharma, Arun Ganesh, Brandon Harris, Niklas Laxström, Pau Giner, Santhosh Thottingal and Siebrand Mazeland; The many volunteers who tested ULS and reported useful bugs; The people in Unicode, such as Michael Everson, who work hard to give a number to every letter in every imaginable alphabet and make massive online multilingualism possible; And last but not least, the talented and generous people who developed all those fonts for the different scripts and released them under Free licenses. I send you all my deep appreciation, as a developer and as a reader of Wikipedia.

The Case for Localizing Names

I often help my friends and family members open email accounts. Sometimes they are starting to use the Internet and sometimes they move from old email services (Yahoo, Walla!, ISP) to something modern (like it or not, Gmail).

At some point they have to fill in their name, which will appear in the “from” field. And then I have to suggest that they write it in Latin characters, even though most of them speak languages that aren’t written in Latin characters – mostly Hebrew and Russian. Chances are that some day they will send an email to somebody who cannot read Russian or Hebrew, and Latin is relatively better known.

Only relatively, though. It may seem obvious to you that everybody knows the Latin script, but in fact, a lot of people are not comfortable with it at all. There are also other complications: lossy and inconsistent transliteration rules (is Amir אמיר or עמיר?), potential right-to-left rendering problems, and more. And of course, all people are happy to see their name in their language.

And people are also happy to see their friends’ names in their own language and not in a foreign or a neutral language. I have, for example, a lot of friends in India. Most of them write their names in English, but some write it in Marathi or in Malayalam. It’s certainly good for them, but in practice it’s much harder for me to find them this way, so English would be better – but Hebrew or Russian would be better yet.

Finally, there are a lot of people in the world who have more than one linguistic background. Mine are Russian, Hebrew and English, and I am really not such a special case. There are many millions of immigrants who have mixed backgrounds: Punjabi-Hindi-Urdu-English, Kurdish-Turkish-German, Kazakh-Russian-Norwegian, and others, and others and others. From each of these backgrounds they have friends, co-workers and family members, with whom they would love to communicate in the respective language. In each of these backgrounds they have friends who would want to find them using the name under which they know them there and using the appropriate language and writing system.

And sometimes people change their names, too. I did it (twice!), and so have many other people.

All this means that people’s names should be translatable, just like books, articles and software interfaces. Facebook and Google+ allow me to add a very limited number of names in foreign languages. Why wouldn’t they let me write my name in four, five, ten languages? This would make it easier for people who speak these languages to find me and to communicate with me. I would go even further and allow people who speak languages that I don’t know well to write my name as they hear it in their language and to add it to my details. Yet again, this would make me easier to find for even more people.
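
One way to picture translatable names is a simple mapping from language codes to name variants, with a lookup that follows the viewer’s language preferences. This is only a sketch in Python and not how any real social network stores names; the display_name function, the fallback rule and the Russian spelling in the example are my own assumptions.

```python
def display_name(names, preferred_languages, fallback="en"):
    """Pick the first name variant that matches the viewer's language preferences."""
    for lang in preferred_languages:
        if lang in names:
            return names[lang]
    # Fall back to a default language, or to any variant if that is missing too.
    return names.get(fallback) or next(iter(names.values()))


# Hypothetical profile: one person, three spellings of the same name.
names = {"en": "Amir", "he": "אמיר", "ru": "Амир"}

print(display_name(names, ["ru", "en"]))  # a Russian-speaking viewer
print(display_name(names, ["mr"]))        # no Marathi variant, falls back to English
```

Adding an eleventh language is then just one more entry in the mapping, which is the whole point: there is no technical reason to limit the number of variants.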

Some degree of automation is possible. A lot of names are, after all, repetitive, so social networks would be able to suggest to people with common names how their name would be written in other languages.

Wikipedia is actually quite good in this regard: Usually people have the same username across projects, and this username is not necessarily written in Latin letters, but people can customize the appearance of their signature in each project. I did it in a few languages, and people who speak those languages appreciate it.

I can only hope that social networks and email systems will allow as much flexibility as possible with this.