As a software developer and localization professional, I am often asked about the main things to be aware of when localizing an app or service. So I put together this primer / cheat sheet for people who are new to the area.
For Engineering Managers
Use A Translation/Localization Management System — Use a TMS (translation repository) that follows conventions familiar to developers. Transifex, which was built by and for software developers, is a personal favorite, and has a minimal entry cost.
Use Mature Frameworks — While it’s fun to experiment with new tools, mature languages and frameworks have had more time to figure localization out.
Every Language Or Framework Has Its Warts — Every framework has weak points in this area. If you’re using Python, “UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)” will be your new best friend.
Continuous Integration Is Your Friend — DO automate uploads of source language files to your translation management system, so you know what’s new that needs to be translated. DON’T automatically merge translations back into your system without a manual review process. Translations are done by humans, humans who can mistype markup and escape codes that can cause runtime errors.
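One check that is easy to automate before the manual review is comparing placeholders between the source string and its translation. Here is a minimal Python sketch, assuming your catalogs are plain dictionaries and your placeholders look like {name} or %(name)s. The function name and the sample catalogs are made up for illustration.

```python
import re

# Matches named placeholders such as {count} or %(count)s (an assumed convention).
PLACEHOLDER_RE = re.compile(
    r"\{[A-Za-z_][A-Za-z0-9_]*\}|%\([A-Za-z_][A-Za-z0-9_]*\)[sd]"
)

def placeholder_mismatches(source, translations):
    """Return {key: (expected, found)} for entries whose placeholders differ."""
    problems = {}
    for key, source_text in source.items():
        translated = translations.get(key)
        if translated is None:
            continue  # skip untranslated keys; this check only compares placeholders
        expected = sorted(PLACEHOLDER_RE.findall(source_text))
        found = sorted(PLACEHOLDER_RE.findall(translated))
        if expected != found:
            problems[key] = (expected, found)
    return problems

# Hypothetical catalogs for illustration.
source = {
    "greeting": "Hello, {name}!",
    "widgets": "You have {count} widgets.",
}
spanish = {
    "greeting": "¡Hola, {name}!",
    "widgets": "Tienes {cuenta} widgets.",  # placeholder was translated by mistake
}

mismatches = placeholder_mismatches(source, spanish)
print(mismatches)
```

A CI job could fail the build when the result is non-empty, which leaves the human reviewer free to focus on wording rather than markup.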
Have One Person Own Localization — As a role, it is not strictly technical. While it often involves coding, it also involves project management, external vendors, and providing support to other departments like marketing.
Extract And Centralize Human Readable Messages — Move all of these into a central catalog of strings. Seems obvious, but most companies I have worked with put this off, and ended up having to do a lot of unnecessary work as a result.
Avoid Concatenating Strings Where Possible — It’s often tempting to generate prompts on the fly, like "You have " + str(count) + " widgets in your " + account_type + " account." This causes all kinds of problems in other languages due to different word order, rules about pairing adjectives and nouns according to gender, rules for plurals, and more. Patterns that work in English completely break in other languages.
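As a rough illustration of the alternative, here is a Python sketch that keeps whole-sentence templates with named placeholders in a catalog, so translators can reorder words freely. The catalog layout, the keys, and the plural rule are assumptions for this example. The plural rule only fits English and Spanish; languages like Russian or Arabic have more plural categories, so lean on your framework's plural support in real code.

```python
# Hypothetical message catalog: one full-sentence template per locale and plural form.
CATALOG = {
    "en": {
        "widget_count": {
            "one": "You have {count} widget in your {account_type} account.",
            "other": "You have {count} widgets in your {account_type} account.",
        },
    },
    "es": {
        "widget_count": {
            # Note the adjective follows the noun here ("cuenta premium").
            "one": "Tienes {count} widget en tu cuenta {account_type}.",
            "other": "Tienes {count} widgets en tu cuenta {account_type}.",
        },
    },
}

def plural_form(count):
    # Simplified rule that fits English and Spanish only.
    return "one" if count == 1 else "other"

def render(locale, key, count, **params):
    """Look up the template for this locale and plural form, then fill it in."""
    template = CATALOG[locale][key][plural_form(count)]
    return template.format(count=count, **params)

english = render("en", "widget_count", 3, account_type="premium")
spanish = render("es", "widget_count", 1, account_type="premium")
print(english)
print(spanish)
```

Because each template is a complete sentence, the Spanish version can put the account type after the noun without any change to the calling code.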
Have developers run the application in alternate language(s) they speak — They’ll see issues early on. For example, a Spanish speaker will immediately notice the issues caused by concatenating strings.
Use browser-based machine translation to test your web interface — The translations aren’t great, but you can get a sense of how word length varies by language and how it affects layout. Just right-click, select Translate to English, and then pick a new target language. Try a variety of languages such as Spanish, German and Japanese.
Use pseudolocalization — A feature in localization management systems, pseudolocalization generates prompt catalogs in a simulated locale. This is a good way to test layout, and also make sure text encoding issues don’t bite you.
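If your TMS doesn't offer this, a crude version is easy to build. The Python sketch below is my own simplification, not how any particular product does it: it swaps some ASCII letters for accented look-alikes, pads the string by about 30% to simulate longer translations, wraps it in brackets so truncation is visible, and leaves {placeholders} untouched.

```python
import re

# Map some ASCII letters to accented look-alikes (same length on both sides).
ACCENTED = str.maketrans("aeiouAEIOUcnyCNY", "àéîöûÀÉÎÖÛçñýÇÑÝ")

# Placeholders such as {count} must pass through unchanged.
PLACEHOLDER_RE = re.compile(r"(\{[^{}]*\})")

def pseudolocalize(text, expansion=0.3):
    """Return an accented, padded, bracketed version of text."""
    parts = PLACEHOLDER_RE.split(text)
    converted = "".join(
        part if PLACEHOLDER_RE.fullmatch(part) else part.translate(ACCENTED)
        for part in parts
    )
    padding = "~" * int(len(text) * expansion)
    return "[" + converted + padding + "]"

result = pseudolocalize("You have {count} widgets.")
print(result)
```

If the closing bracket is cut off on screen, the layout is too tight. If an unaccented English string shows up, it was never extracted into the catalog.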
Use localized string formatting tools in your framework — When you turn on new languages, dates, times, numbers, etc. will be localized according to the rules for the new locale. Mature frameworks like .NET have extensive support for this. In newer frameworks, you’ll need to find third-party tools or build them.
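To make concrete what these tools do for you, here is a toy Python sketch with a hand-written table for just two locales. The table and function names are hypothetical. In practice you would rely on your framework or a library backed by CLDR data (Babel is one such library for Python) rather than maintaining rules like these yourself.

```python
from datetime import date

# Hypothetical, hand-written conventions for two locales only.
FORMATS = {
    "en_US": {"group": ",", "decimal": ".", "date": "{m}/{d}/{y}"},
    "de_DE": {"group": ".", "decimal": ",", "date": "{d:02d}.{m:02d}.{y}"},
}

def format_number(value, locale):
    """Format a number with two decimals using the locale's separators."""
    rules = FORMATS[locale]
    whole, fraction = f"{value:,.2f}".split(".")
    return whole.replace(",", rules["group"]) + rules["decimal"] + fraction

def format_date(value, locale):
    """Format a date in the locale's short numeric style."""
    return FORMATS[locale]["date"].format(d=value.day, m=value.month, y=value.year)

amount = 1234567.891
when = date(2016, 3, 5)
print(format_number(amount, "en_US"), format_date(when, "en_US"))
print(format_number(amount, "de_DE"), format_date(when, "de_DE"))
```

The same number and date come out as 1,234,567.89 and 3/5/2016 for US English, and 1.234.567,89 and 05.03.2016 for German.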
Use localization tools for UI/UX — Have developers catalog prompts in a “Basic English” locale. This is used for development and testing. Then have UI/UX staff “translate” basic English to finished English for use in the customer facing system. Developers can worry about functionality, while UI/UX focuses on getting the wording just right.
Auto-detect the user’s locale — You can auto-detect the user’s preferred locale in most environments (e.g. Accept-Language header on web). I generally recommend you auto-set the locale, but show the user where to go to override this setting if they want to.
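For the web case, a simplified Python sketch of picking a locale from the Accept-Language header might look like this. The function names, the supported list, and the user_choice parameter are assumptions for illustration. Many web frameworks ship their own negotiation helper, so check for that before rolling your own.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat malformed weights as "not acceptable"
        weighted.append((quality, tag))
    # sorted() is stable, so tags with equal weight keep their header order.
    ranked = sorted(weighted, key=lambda pair: -pair[0])
    return [tag for quality, tag in ranked if quality > 0]

def pick_locale(header, supported, default="en", user_choice=None):
    """Prefer the user's explicit setting, then the header, then the default."""
    if user_choice in supported:
        return user_choice
    by_lower = {code.lower(): code for code in supported}
    for tag in parse_accept_language(header):
        tag = tag.lower()
        if tag in by_lower:
            return by_lower[tag]
        base = tag.split("-")[0]  # fall back from "de-CH" to "de"
        if base in by_lower:
            return by_lower[base]
    return default

supported = ["en", "es", "de"]
print(pick_locale("de-CH,de;q=0.9,en;q=0.8", supported))
print(pick_locale("de-CH,de;q=0.9,en;q=0.8", supported, user_choice="es"))
```

The explicit user_choice always wins, which matches the advice above: auto-set the locale, but let the user override it.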
Use international formats for client/server data exchange — Any interaction between client and server should be in a standards-based format. For example, date/times should always be sent in YYYY-MM-DD HH:MM:SS format. While the user will see it displayed in local form, such as 5 Mar 2016, the server will see 2016-03-05.
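Here is a small Python sketch of that split between wire format and display format. The function names are made up, and normalizing to UTC before sending is my own assumption rather than a requirement of the format. It just keeps the example unambiguous.

```python
from datetime import datetime, timezone

WIRE_FORMAT = "%Y-%m-%d %H:%M:%S"
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def to_wire(moment):
    """Serialize an aware datetime as UTC in YYYY-MM-DD HH:MM:SS form."""
    return moment.astimezone(timezone.utc).strftime(WIRE_FORMAT)

def from_wire(text):
    """Parse the wire format back into an aware UTC datetime."""
    return datetime.strptime(text, WIRE_FORMAT).replace(tzinfo=timezone.utc)

def display_short(moment):
    # Hypothetical English display form, e.g. "5 Mar 2016". A real client
    # would use locale-aware formatting here instead of a fixed month list.
    return f"{moment.day} {MONTHS[moment.month - 1]} {moment.year}"

original = datetime(2016, 3, 5, 14, 30, 0, tzinfo=timezone.utc)
sent = to_wire(original)
received = from_wire(sent)
print(sent)
print(display_short(received))
```

Only display_short would change per locale. The string that crosses the network stays the same for every user.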
Consider highly automated translation services for bulk or frequently changing content — Several companies, such as Gengo, offer professional translation via a web API. You can treat them like an automated resource, while translations are done behind the scenes. They are good for translating continuously updated content, like hotel listings, with reasonable quality. That said, use a conventional translation vendor for most material.
Label Prompts And Error Messages For QA Automation — Label prompts and error messages in a way that enables multilingual testing. In a web app, one way to do this is to embed a hidden comment in output. The QA test suite looks for the hidden output, which is the same in every interface language, while the visible message may change by locale. Then you can run every test in every locale you support, without having to hand craft localized test cases.
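A bare-bones Python sketch of the hidden-comment approach follows. The msgid: comment convention, the sample catalog, and the function names are all assumptions for illustration. Your templating system and test framework will dictate the real shape.

```python
import html
import re

# Hypothetical catalog with one message in two locales.
MESSAGES = {
    "en": {"login_failed": "Incorrect username or password."},
    "es": {"login_failed": "Nombre de usuario o contraseña incorrectos."},
}

def render_message(locale, message_id):
    """Return the localized text plus a hidden, locale-independent label."""
    text = html.escape(MESSAGES[locale][message_id])
    return f"<p>{text}<!-- msgid:{message_id} --></p>"

MSGID_RE = re.compile(r"<!-- msgid:([a-z0-9_]+) -->")

def message_ids(page):
    """Extract the hidden labels a test suite can assert on."""
    return MSGID_RE.findall(page)

# The same assertion works in every locale because it ignores visible text.
for locale in ("en", "es"):
    page = render_message(locale, "login_failed")
    assert message_ids(page) == ["login_failed"]
```

The loop at the end stands in for a test suite: one test body runs unchanged against every supported locale.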
Do A Survey Of Language Skills In Your Company — Chances are at least half of your engineers are bilingual. Send out a Google spreadsheet and find out what languages they speak. Ask them to use and test the product in their alternate language versus English. This is a way to get free localization QA.
Hire A Localization QA Firm To Complement Automation — Automated tests do a good job at detecting regressions and functional bugs, but they don’t detect other issues, such as layout problems caused by longer words, out of context translations, and things that “don’t look right”. Acclaro, Vistatec and WeLocalize do this type of work, and charge around $80/hour. Testers will often find non-localization issues too, so it’s a 2-for-1 deal.
Display a localization feedback icon to non-English users — Users will often spot subtle localization issues, and are a great source of feedback if you make it easy for them. Display a feedback icon/form to non-English users that feeds into your ticketing system.
For Managers In General
Localization Is A Revenue Center — Track revenue, conversions and churn by locale. Localization can easily generate 10x returns, and has been an important growth lever for many leading companies.
Focus On International Languages First — By supporting a handful of languages, you can reach a large percentage of the Internet population. Focus first on languages, like Spanish or French, that are spoken in many countries, then add more based on your user demographics.
Localization Improves Conversions — Even in countries where most people speak English, such as Holland, customers prefer products that are localized in their market.
Translation Quality Is A Step Function — Translation quality varies by price. Below a certain threshold quality drops off sharply. Most translators are freelance and work for multiple agencies. The good ones avoid the agencies that pay poorly. I tell people if they are paying 15–20 cents per word, or $60–80 per hour, they are getting an ok deal (some languages like Japanese can cost more). Focus more on hiring agencies that know your subject area and the tools you use.
Localization Is Also A Domestic Opportunity — People often equate localization with international expansion. Most markets have important second and third languages, for example Spanish in the United States, and French in Canada. Even if you plan to remain domestic, localization can help you expand reach locally.
Explore Discriminatory Pricing — This sounds like a bad word, but what it means is different prices for different markets. US users might be happy to pay $25/seat-month, while users in Latin America might balk at paying more than $15, so look at bundling and pricing options that account for regional price sensitivity (local promotions are a good way to do this without having different rate sheets for different countries).
Well, thanks for giving this a read. This isn’t intended as an encyclopedia on the topic, but should help you and your team get started. Feel free to drop a line if you have comments or questions (firstname.lastname@example.org)