A few tips that will prepare you to operate in other languages and countries, and save you hundreds of thousands of dollars (or more) in engineering effort
Having led localization (internationalization and language support) at several startups, including Lyft, Medium and Insightly, I have noticed a common pattern among companies of all sizes, and want to share some tips on how to avoid costly mistakes. Most companies only consider internationalization and language support relatively late in their lifecycle, and because of this have to deal with a lot of technical debt that is expensive and time consuming to fix. The simple explanation is that they typically don’t plan to expand internationally early on, so the engineering work gets kicked down the road at great future expense.
In this article, I share a few tips that cost little or nothing to implement, and that will future proof your systems so that when the time comes to support users in other languages or countries, you don’t have to refactor large amounts of your code to do so. Also included is a tip that will enable non-technical product managers and copywriters to edit and fine tune user facing content without getting engineers involved (more on this later).
Localization Is Not Just About International Expansion
One of the reasons companies neglect localization is that they assume they may not expand internationally, or at least not anytime soon. That is generally a wise attitude, as there are a lot of legal and operational risks associated with international operation (think GDPR), and it is important to make sure your business is viable before expanding into foreign territories. The problem is that in any country, there are significant populations who speak a language other than the country’s dominant or official language.
If you operate primarily in the United States, what is your plan to address the Spanish speaking population? Roughly 30% of the people in the top US metropolitan markets speak Spanish at home. Many of them are bilingual, but many prefer to converse and do business in Spanish. Even if you know you will not expand to other countries, you are leaving money on the table by not making your products and services accessible in secondary languages in your home market (as well as tourist languages, if your product or service caters to visitors). So even if you do not plan to expand internationally, you need to have a plan for welcoming users for whom English is not their first language.
Create A Dummy Library To Display Messages
Mobile apps and dynamic websites display a lot of text to their users. How you manage this content affects how easy or difficult it will be to serve content in other languages, or regional variations of a single language (e.g. British vs American English).
Mobile development frameworks generally force developers to externalize the messages displayed to users (move them into separate files that can easily be swapped out). Android and iOS both use this approach, which makes it easy to translate these messages. Typically there is a file for each supported language or language variant. So static content in your mobile apps is generally pretty straightforward to localize.
The problem is that mobile apps don’t work in isolation, and typically display content that is coming from backend services and other sources. These typically don’t enforce good localization practices, so developers will tend to hard code messages in English. To make matters worse, the trend toward micro-services means that you probably also have different services running on different development platforms.
The most important thing you can do to prevent tech debt from accumulating in these services is to create a handful of dummy functions that all developers are required to use when displaying user facing messages. In Python, a function to display a message might look like this:
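def t(text, comment, quantity=None, serviceName='main'):
    # Require a comment that gives translators and copyeditors some context.
    if len(comment) < 5:
        raise ValueError('No translator / copyeditor comment provided')
    # Dummy for now: hand the message back unchanged.
    return text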
This function doesn’t do anything useful just yet, but forces developers to get in the habit of externalizing these messages. Note that it also requires a comment to explain the context in which the message is used, which helps translators and copywriters understand what’s meant, and therefore what the best translation should be.
You should also create similar dummy functions for use when formatting numbers, dates, currency amounts, etc, as the formatting rules for these vary by language and region. You’ll need this for internationalization. For example, if you decide to go to Canada, you might get by with only supporting English, but you’ll need to deal with differences in date/time notation, currencies, etc.
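As a rough sketch, formatting stubs in the same spirit as t() might look like the following. The function names and the locale parameter are assumptions for illustration, not part of any particular library. For now they always produce a single US-style format, and the bodies can be swapped for a real locale-aware library later without changing any calling code.

```python
from datetime import date
from decimal import Decimal


def format_number(value, locale='en-US'):
    # Placeholder: always uses US-style thousands separators for now.
    return f'{value:,}'


def format_date(value, locale='en-US'):
    # Placeholder: always emits US-style month/day/year for now.
    return value.strftime('%m/%d/%Y')


def format_currency(amount, currency='USD', locale='en-US'):
    # Placeholder: prefixes the ISO currency code and shows two decimals.
    return f'{currency} {Decimal(amount):,.2f}'
```

The point is not the formatting itself, but that every number, date and amount shown to a user passes through one place that can later be made locale aware.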
Doing this will put you in a much better position when you are ready to add international regions and/or language support. How much time and money will this save you? If your product is at all complex, it will save you hundreds of thousands of dollars and months of time. At most of the companies I have worked at, it took a minimum of six months to refactor the code, engineering time that could have been spent elsewhere. This can easily add up to multiple person-years, meaning a cost of well over a million dollars, not to mention time lost entering the market. This often dwarfs the cost of translation and copyediting.
There are benefits to this beyond future proofing. One of the most useful things about this is you can allow non-technical product managers and copyeditors to edit these messages without touching any code. How does this work? You “translate” your service from engineer English to finished English.
Localizing Your Service Within Your Primary Language
One thing you can do before you roll out support for other languages is to localize your service into better English (assuming English is your primary language). Think of the messages your developers write as functional English, sufficient for testing and validation, but not finely edited for public use.
What you do then is to localize your content from functional English to production quality English. In the case of a company operating in the US, they would localize from generic English (locale code: en) to US English (locale code: en-US). The generic English created by your developers would serve as the source copy, while copyeditors revise this as needed to maximize quality.
To do this, you will need to update your dummy functions to read from and write to a database of strings, which, in turn, is used to generate text files similar to those used by Android and iOS for localization.
At runtime, the function will read from a cached database, and if no match is found, will return the original string, and optionally queue the new text for translation (this pattern allows for both statically defined texts as well as dynamic texts that are coming from external sources).
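The runtime behavior described above could be sketched roughly as follows. This is a minimal illustration, assuming an in-memory dictionary stands in for the cached database and a plain list stands in for the translation queue. The names translations, pending and lookup, and the sample message, are hypothetical.

```python
# Stand-in for the cached database: (locale, source text) -> finished text.
translations = {
    ('en-US', 'Ride canceled'): 'Your ride was canceled',
}

# Stand-in for the queue of new texts awaiting translation or copyediting.
pending = []


def lookup(text, comment, locale='en-US', queue_missing=True):
    """Return the finished text for locale, or the original if none exists."""
    finished = translations.get((locale, text))
    if finished is not None:
        return finished
    # No match: optionally queue the text (with its comment), once per locale.
    entry = (locale, text, comment)
    if queue_missing and entry not in pending:
        pending.append(entry)
    return text
```

Because unknown strings fall through unchanged, a missing translation never breaks the user experience. The user simply sees the developer's original wording until a finished version arrives.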
At regular intervals, a cron job will kick off that generates a localization file that contains the generic English source messages, uploads this to a translation management system, and then downloads the finished English “translation” file(s), and imports those into the cached database for runtime use. NOTE: a translation management system is a platform for managing translations and translation workflows.
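One run of such a job might be sketched like this. It assumes a JSON payload as the localization file format and a TMS client object exposing upload and download methods. Both are hypothetical stand-ins, since each real TMS has its own file formats and API, so treat this as an outline of the steps rather than a working integration.

```python
import json


def export_source(strings):
    """Serialize source messages and their comments to a JSON payload."""
    # strings maps each source text to its translator comment.
    return json.dumps(
        [{'source': text, 'comment': comment}
         for text, comment in sorted(strings.items())],
        indent=2,
    )


def import_finished(payload, locale, cache):
    """Load finished texts from a JSON payload into the runtime cache."""
    for row in json.loads(payload):
        cache[(locale, row['source'])] = row['target']


def sync(strings, tms, locale, cache):
    """One run of the job: export, upload, download, import."""
    tms.upload(export_source(strings))   # hypothetical TMS client method
    finished = tms.download(locale)      # hypothetical TMS client method
    import_finished(finished, locale, cache)
```

Keeping the export and import steps as separate functions makes it easy to test them without a TMS, and to replace the hypothetical client with whichever platform you choose.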
Copyeditors will log into the TMS to “translate” the generic English content into finished US English (or whichever variant you need). In most cases, they’ll just copy the original text over, but if there is a problem or they want to phrase things differently, they can revise the text without touching any code, and without taking up an engineer’s time to make the edit. This has the added benefit of parallelizing software development and copywriting, as developers can write functional messages knowing that copyeditors will come along and fine tune these as needed.
Once you have this process working, you have done most of the work needed to support multilingual operation. You’ll probably need to do some work on number and date formatting, but that is all pretty easy and does not involve interacting with an external platform like a TMS. The process to translate from finished English to whichever foreign languages you want to support is exactly the same as the process above.
What About Websites, Help Center, Etc?
You probably have a number of contact points besides your mobile app or web service. This typically includes a corporate website, blogs, help center and similar content. Fortunately, these are easier to deal with.
The main thing you need to do is to select content hosting platforms that support localization, and that have been integrated with popular translation management systems. This should be a hard requirement for any content management system, help center or other content hosting platform.
Zendesk has supported multiple languages for many years, and has robust integrations with popular TMS platforms like Smartling. That said, Zendesk’s capabilities as a content management system leave a lot to be desired. It is not my first choice; my preference would be to build a help/FAQ site using a CMS like Contentful, but many support departments refuse to use anything but Zendesk. If that’s the case, it will support localization out of the box.
If you are looking for a general purpose content management system, many of them support multilingual operation. Contentful, a headless CMS, is one of my current favorites, but there are many CMSs out there to suit different needs based on complexity, customer budget, etc. Just make sure you pick one that has robust integrations with one or more translation management systems (Smartling has invested a lot in integrating with a broad range of platforms). Whatever you do, do NOT build a homegrown CMS or host your site as a collection of static files (yes, most of the companies I worked for did this initially). Besides making translation a nightmare, it doesn’t scale, and forces developers to waste their time checking in copy edits. Don’t do this.
Selecting A Translation Management System
This is an article in its own right, but let’s go with the assumption that you are not planning to launch multiple languages right now, but do want to be prepared, and want to use the copyediting trick I mentioned. Transifex is a good option at this stage, as they provide a low cost, self service solution for file based localization as described in this article. As your needs expand, you may need to look at other options such as Smartling or Memsource, but Transifex is a good place to start and offers a straightforward REST API for managing translation files.
Want To Learn More?
There are a lot of subtleties to localization. This is intended as a set of tips you can use to get started and avoid common traps that lead to future expense. A few resources I find useful include:
http://www.i18nguy.com/ — a great compilation of tech tips by Tex Texin
https://multilingual.com/ — from the publishers of Multilingual Magazine, also the organizers of the Localization World conferences, less technical, but a good place to look for information about translation service providers.
Also watch for a series of articles on related topics such as selecting a translation management system, localization edge cases (plurals, gender, etc) and more. In the meantime, I am happy to provide advice, and you can find me at https://www.linkedin.com/in/briansmcconnell/