by Lars Yencken
Two years ago, 99designs had localized sites for a handful of English-speaking countries, and our dev team had little experience in multilingual web development. But we felt that translating our site was an important step, removing yet another barrier for designers and customers all over the world to work together. Today we serve localized content to customers in 18 countries, across six languages. Here’s how we got there, and some of the roadblocks we ran into.
The most difficult aspect of internationalization is language, so we started with localization: everything but language. In particular, this means region-appropriate content and currency. A six-month development effort saw us refactor our core PHP codebase to support local domains for a large number of countries (e.g. 99designs.de), where customers could see local content and users could pay and receive payments in local currencies. At the end of this process, each time we launched a regional domain we began redirecting users to that domain from our Varnish layer, based on GeoIP lookups. The process has changed little since then, and continued to serve us well in our recent launch in Singapore.
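The redirect decision itself is simple enough to sketch. The following is a hypothetical illustration in Python (in practice this lives in Varnish VCL at our edge layer); the country-to-domain mapping and the function name are invented for the example.

```python
from typing import Optional

# Illustrative mapping only: not the actual list of supported regions.
REGIONAL_DOMAINS = {
    "DE": "99designs.de",
    "SG": "99designs.com.sg",
}
DEFAULT_DOMAIN = "99designs.com"

def redirect_target(country_code: str, requested_host: str) -> Optional[str]:
    """Return the host a visitor should be redirected to, or None.

    country_code comes from a GeoIP lookup on the client address.
    """
    target = REGIONAL_DOMAINS.get(country_code, DEFAULT_DOMAIN)
    if target == requested_host:
        return None  # already on the right regional domain
    return target
```

A German visitor landing on the .com domain gets sent to 99designs.de; a visitor from an unlaunched region stays on the default domain.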
Languages and translation
With localization working, it was time to make hard decisions about how we would go about removing the language barrier for non-English speakers (i.e. the majority of the world). There were a lot of questions for us to answer.
- What languages will we offer users in a given region?
- How will users choose their language?
- How will we present translated strings to users?
- How will strings be queued for translation?
- Who will do the translation?
What languages to offer?
Rather than making region, language and currency all user-selectable, we chose to restrict language and currency availability by region. This was a trade-off which made working with local content easier: if our German region doesn’t support Spanish, we avoid having to write Spanish marketing copy for it. Our one caveat was that every region support English as a valid language. As an international language of trade, English lessens any negative impact of region pinning.
There were two main approaches we considered for translation: take the traditional GNU gettext approach and begin wrapping strings for translation, or else try a translation proxy such as Smartling. gettext had several advantages: it has a long history and is well supported by web frameworks; it’s easily embedded; and translations become additional artifacts which can be easily version controlled. However, it would have required a substantial refactoring of our existing PHP codebase, and left open the question of how we would source translations.
In Smartling’s approach, a user’s request is proxied through Smartling’s servers, which in turn request the English version of our site and apply translations to the response before the user receives it. When a translation is missing, the English version is served and the string is added to a queue to be translated. Pulling this off would mean substantially reducing the amount of code to be changed, a great win. However, it risked making us reliant on a third party for our uptime and performance.
In the end, we went with Smartling for several reasons. They provided a source of translators, and expertise in internationalization which we were lacking. Uptime and performance risks were mitigated somewhat by two factors. Firstly, Smartling’s proxy would be served out of the US-East AWS region, the same region our entire stack is served from, increasing the likelihood that their stack and ours would sink or swim together. Secondly, since our English language domains would continue to be served normally, the bulk of our traffic would still bypass the proxy and be under our direct control.
Preparing our site
We set our course and got to work. There was substantially more to do than we first realized, mostly spread over three areas.
Escaping user-generated content
Strings on our site which contained user content quickly filled our translation queue (think “Logo design for Greg” vs “Logo design for Sarah”). Contest titles, descriptions, usernames, comments, you name it: anything sourced from a user had to be found and wrapped in a <span class="sl_notranslate"> tag. This amounted to a significant ongoing audit of the pages on our site, fixing them as we went.
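A small helper makes the wrapping consistent. This is a hypothetical sketch rather than our actual code; only the class name follows Smartling’s convention mentioned above.

```python
import html

def notranslate(user_content: str) -> str:
    """Wrap user-generated content so the translation proxy leaves it alone.

    HTML-escapes the content first, since it comes straight from users.
    """
    return '<span class="sl_notranslate">%s</span>' % html.escape(user_content)
```

For example, notranslate("Logo design for Greg") produces a span the proxy will skip, so the surrounding template text can still be translated.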
Making our design more flexible
In our design and layout, we could no longer be pixel-perfect, since translated strings for common navigation elements were often much longer in the target language than in English. Instead, we developed a more robust design which could accommodate the variation in string width. We also stopped using CSS text-transform rules to vary the case of text stylistically, since other languages are more sensitive to case changes than English.
The wins snowball
After nine months of hard work, we were proud to launch a German-language version of our site, a huge milestone for us. With the hardest work now done, the following nine months saw us launch French, Italian, Spanish and Dutch-language sites. Over time, the amount of new engineering work shrank with each launch, so that the non-technical aspects of marketing to, supporting and translating a new region now dominate the time to launch a new language.
We also encountered several unexpected challenges.
Translating client-side templates

The richer the client-side JavaScript, the more work is required to ensure smooth translation. The biggest barrier for us was our use of Mustache templates, which initially could not be translated on the fly. To their credit, Smartling vastly improved their support for Mustache during our development, allowing us to clear this hurdle.
Translating non-web artifacts
It should come as no surprise: translation by proxy is a strong strategy for web pages, but not for other artifacts. In particular, for a long time translating emails was a pain, and in the worst case consisted of engineers and country managers emailing templates back and forth for translation. After some time, we worked around this issue by using Smartling’s API in combination with gettext for email translation.
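The gettext side of that workaround can be sketched as follows. The template text, catalog name and function are invented for illustration, and the Smartling API calls that push strings out and pull finished catalogs back are omitted.

```python
import gettext
from string import Template

def render_welcome_email(locale: str, username: str) -> str:
    """Render an email body in the recipient's language.

    With fallback=True, recipients in locales without a compiled catalog
    simply receive the English source text.
    """
    t = gettext.translation(
        "emails", localedir="locale", languages=[locale], fallback=True
    )
    body = Template(t.gettext("Welcome to 99designs, $name!"))
    return body.substitute(name=username)
```

Keeping the placeholder ($name) outside the substitution step matters: translators see and reposition it within the sentence, rather than translating around a pre-filled value.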
Exponential growth of translation strings
Over time, we repeatedly found our translation queue clogged with huge numbers of strings awaiting translation. Many of these cases were bugs where we hadn’t appropriately marked up user-generated content, but the most stubborn were due to our long-tail marketing efforts. Having a page for each combination of industry, product category and city led to an explosion of strings to translate. Tackling these properly would require a natural language generation engine with some understanding of each language’s grammar. For now we’ve simply excluded these pages from our translation efforts.
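Some back-of-the-envelope arithmetic shows why these pages swamped the queue. The counts below are made up; the multiplication is the point.

```python
# Hypothetical counts for long-tail landing pages.
industries = 40
product_categories = 15
cities = 200
strings_per_page = 10  # headline, intro copy, calls to action, ...

pages = industries * product_categories * cities  # one page per combination
candidate_strings = pages * strings_per_page      # per language!
```

Even with modest counts, every new combination dimension multiplies the total, and every new language multiplies it again.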
This has been an overview of the engineering work involved in localizing and translating a site like ours into other languages. Ultimately, we feel that the translation proxy approach we took cut down our time to market significantly; we’d recommend it to other companies who are similarly expanding. Now that several sites are up and running, we’ll continue to use a mix of the proxy and gettext approaches, where each is most appropriate.
We’re proud to be able to ship our site in multiple languages, and keen to keep breaking down barriers between businesses and designers wherever they may be, enabling them to work together in the languages in which they’re most comfortable.