Internationalization and Localization
The Internet has no boundaries and neither should your web application. People all over the world access the Net to browse web pages that are written in different languages. For example, you can check your web-based email from virtually anywhere. A user in Japan can access the web and check her Yahoo! Email in Japanese. How does Yahoo do it? Is it because the user’s machine has a Japanese operating system or do web-based applications automatically adjust according to the users’ region?
Welcome to the twin worlds of Internationalization and Localization. What are they and why do we need them? This chapter attempts to answer these questions and shows you how to internationalize and localize your J2EE web applications. Each of these terms is explained in detail in the subsequent sections of this chapter. At the end of this chapter you will be able to see how to make your web application user friendly in different countries using different languages. You will also see the various ways to do this as well as some common dos and don’ts when internationalizing your applications.
In section 19.1, we explain the basics of internationalization and localization, why they are needed and what advantages they bring. In section 19.2 you will learn about character encodings: what they are and how to use them. Section 19.3 deals with web tier internationalization and the various ways you can achieve it, including Struts, JSTL tags, JavaServer Faces and Tiles localization. Finally, in section 19.4 you will see some internationalization best practices.
19.1 Internationalization and Localization
When you read about internationalizing applications, you will come across terms such as localization, character encoding, locales, resource bundles and so on. This section covers some of the commonly used terms in this chapter.
Internationalization or I18n is the process of enabling your application to cater to users from different countries and supporting different languages. With I18n, software is made portable between languages or regions. For example, the Yahoo! Web site supports users from English, Japanese and Korean speaking countries, to name a few.
Localization or L10n, on the other hand, is the process of customizing your application to support a specific location. When you customize your web application for a specific country, say Germany, you are localizing your application. Localization involves providing the text and other on-line information needed to support a specific language or region. Though I18n and L10n appear to be at odds with each other, in reality they are closely linked. The later sections in this chapter will show you how.
A Locale is a term that is used to describe a certain region and possibly a language for that region. In software terms, we generally refer to applications as supporting certain locales. For example, a web application that supports a locale of “fr_FR” is enabling French-speaking users in France to navigate it. Similarly a locale of “en_US” indicates an application supporting English-speaking users in the US.
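As a minimal sketch of how such locale identifiers map onto Java, the following builds the two locales mentioned above with java.util.Locale. The class name LocaleSketch and the helper localeId are hypothetical names used only for illustration.

```java
import java.util.Locale;

class LocaleSketch {
    // Builds the "language_COUNTRY" identifier used in the text, e.g. "fr_FR".
    static String localeId(Locale locale) {
        return locale.getLanguage() + "_" + locale.getCountry();
    }

    public static void main(String[] args) {
        Locale france = new Locale("fr", "FR"); // French as spoken in France
        Locale us = Locale.US;                  // predefined constant for en_US

        System.out.println(localeId(france)); // fr_FR
        System.out.println(localeId(us));     // en_US

        // Human-readable names, rendered here in English.
        System.out.println(france.getDisplayLanguage(Locale.ENGLISH)); // French
        System.out.println(france.getDisplayCountry(Locale.ENGLISH));  // France
    }
}
```

The language code is a lowercase ISO 639 code and the country code an uppercase ISO 3166 code, which is why the same language can appear in several locales.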
A ResourceBundle is a class that is used to hold locale specific information. In Java applications, the developer creates an instance of a ResourceBundle and populates it with information specific to each locale such as text messages, labels, and also objects. There will be one ResourceBundle object per locale.
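The idea of one bundle per locale can be sketched as follows. To keep the example self-contained, each bundle is built in memory with PropertyResourceBundle; a real application would more typically keep the texts in properties files and load them with ResourceBundle.getBundle(baseName, locale). The class name BundleSketch, the key "welcome" and the message texts are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class BundleSketch {
    // Parses "key=value" text into a bundle (stands in for a properties file).
    static ResourceBundle bundleOf(String properties) {
        try {
            return new PropertyResourceBundle(new StringReader(properties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One bundle per supported locale; the texts are hypothetical.
    static Map<Locale, ResourceBundle> loadBundles() {
        Map<Locale, ResourceBundle> bundles = new HashMap<>();
        bundles.put(Locale.US, bundleOf("welcome=Welcome to ABC Bank\n"));
        bundles.put(Locale.FRANCE, bundleOf("welcome=Bienvenue chez ABC Bank\n"));
        return bundles;
    }

    // Looks up the message for a locale, falling back to en_US if unsupported.
    static String welcome(Locale locale) {
        Map<Locale, ResourceBundle> bundles = loadBundles();
        ResourceBundle bundle = bundles.getOrDefault(locale, bundles.get(Locale.US));
        return bundle.getString("welcome");
    }

    public static void main(String[] args) {
        System.out.println(welcome(Locale.FRANCE));  // Bienvenue chez ABC Bank
        System.out.println(welcome(Locale.GERMANY)); // Welcome to ABC Bank (fallback)
    }
}
```

The calling code asks for a message by key only; which text comes back depends entirely on the locale, so adding a language means adding a bundle rather than changing the code.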
A Character Encoding is a mapping between characters and the numeric values used to store them. In order for your web application to support multiple languages, you will first have to ensure that the appropriate encoding is supported by your application code as well as the back-end database. Typical encodings include the Unicode encodings, such as UTF-16, which uses 16-bit units and covers most of the world’s major languages, and Double Byte Character Set (DBCS) encodings, which use up to two bytes of data per character. Chinese text, for example, has traditionally been stored in DBCS encodings.
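To make the mapping between characters and values concrete, here is a small sketch that encodes the same text under two Unicode encodings and counts the resulting bytes. The class name EncodingSketch and the helper byteCount are hypothetical; the sample Chinese character is written as the escape \u4e2d so that the source file itself stays plain ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSketch {
    // Number of bytes needed to store the text in the given encoding.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String latin = "A";
        String chinese = "\u4e2d"; // a single Chinese character

        // UTF-8 is variable width: 1 byte for "A", 3 bytes for the Chinese character.
        System.out.println(byteCount(latin, StandardCharsets.UTF_8));   // 1
        System.out.println(byteCount(chinese, StandardCharsets.UTF_8)); // 3

        // UTF-16 (big-endian, no byte order mark) uses 2 bytes for each of these.
        System.out.println(byteCount(latin, StandardCharsets.UTF_16BE));   // 2
        System.out.println(byteCount(chinese, StandardCharsets.UTF_16BE)); // 2

        // Decoding with the same encoding restores the original text.
        byte[] stored = chinese.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(stored, StandardCharsets.UTF_8).equals(chinese)); // true
    }
}
```

The round trip only works when the bytes are decoded with the encoding that produced them, which is why the application code and the database have to agree on the encoding.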
The thought process behind developing an internationalized web application is quite complex. It is a matter of setting priorities in order to develop an application whose complexity can be considerably increased based on the number of countries and languages it has to support. This means that whether you are developing a web application from scratch or modifying an existing one, you will have to identify the various factors that will drive the move towards internationalization. Section 19.1.1 shows the various factors to be considered.
19.1.1 Internationalization factors
Normally, when a web application is being developed, internationalization is not given very much importance, as the focus is more on “getting the application working” and then maybe later on supporting different locales. Though this is not recommended, it might still work as long as the application is small and can take on the changes that follow after deployment for a particular (default) locale.
But, what if the application is very complex, with a large number of JSP pages and the requirement is to support multiple locales? What factors should you consider before taking the actual step of changing your code to do this? Here are a few:
1. Identify the areas in your application that will have to change in order to support users from different countries. There are two main areas that will need change:
a. The visible part of your application – the User Interface. The user interface specific changes could mean changes to text, date formats, currency formats etc. More information about this is provided in section 19.1.3.
b. The invisible parts of your application – database support for different character encoding formats and your back-end logic that processes this data.
2. Identify the prospective clientele of your application. A country that may not seem to be at the top of your list right now might actually be a good bet based on the size of its population. For example, the new generation of young adults in India is turning out to be a growing consumer of foreign products. Companies like Nokia and Coke are cashing in on this trend. Similarly, web applications have to be flexible enough to support different locales. It never helps to have to go back and change your design in order to accommodate new locales or any other internationalization specific changes.
3. Once the end user has been identified, see if your software supports their language-based needs. For example, do you have the software to develop a web based search engine that accepts input in Arabic and returns search results?
4. Identify a timeline within which you will have to deliver the web application with support for a given region and language. Keep in mind that two locales sharing a language can still need separate work. For example, the support for the French language in France (fr_FR) is different from the support for French in Canada (fr_CA) in terms of the presentation details. The following code segment formats the same amount as a currency value for these two French-speaking locales.
import java.text.NumberFormat;
import java.util.Locale;

// for France: currency formatter for the fr_FR locale
Locale loc1 = new Locale("fr", "FR");
NumberFormat frenchFmt = NumberFormat.getCurrencyInstance(loc1);
double amtFR = 5010.78;
System.out.println(frenchFmt.format(amtFR));
// for Canada: currency formatter for the fr_CA locale
Locale loc2 = new Locale("fr", "CA");
NumberFormat canFmt = NumberFormat.getCurrencyInstance(loc2);
double amtCA = 5010.78;
System.out.println(canFmt.format(amtCA));
And the output looks like:
5 010,78 €
5 010,78 $
This example shows the first output in euros whereas the second output shows the same amount in Canadian dollars. This is just one of the many differences you will come across. If you dig deeper into the intricacies of each locale specific format, you will find that even though two regions might have a common language, they can differ in many other respects when it comes to displaying data.
The Locale and NumberFormat classes are explained in more detail in section 19.1.4.
5. The last but also the most important factor is to identify the actual “type” of internationalization approach to use. That is, do you want to create a new copy of each JSP page per locale or do you want to have a single JSP page that works for different locales? The former approach (Figure 19.1) is only suitable for smaller applications with support for a few languages. If you are developing a new application or internationalizing a large application with support for multiple locales, then the latter approach (Figure 19.2) is recommended.
Figure 19.1 The “One JSP page per locale” approach

Figure 19.2 Same JSP supports multiple locales
19.1.2 Advantages of Internationalization and Localization
Having to internationalize a web application is not an easy task, as you will need to know exactly what areas of your application will have to change and update them. This means additional effort, time and cost. Many companies that decide to go global have some specific goals in mind that justify the cost. Here are a few main ones:
1. An increase in the number of users as more and more people will be able to use the application in their native languages, all this without having to create a single office in each of the regions supported. The mantra of any profit making company is quite simple actually and works well for internationalizing applications too: More users = More sales = More profit.
2. The cost of localization is high, as each individual page will have to be translated into several languages, but the benefits from sales almost always outweigh the costs. A lot of big companies in the US, Sun Microsystems included, obtain around half of their revenue from outside of the US.
3. By localizing a web application, the user level of comfort with the application increases. The primary aim of localization is to “get through” to the user. And what better way to do this than in a language that the user is comfortable with?
In this section you saw what you can gain by internationalizing your application. The next section explains the various parts of your application that can be localized.
19.1.3 What can be localized?
So far you have seen what internationalization and localization are, and what they can do for us. In this section you will see what parts of your web application will have to be customized in order to support different locales.
Before we move on to the actual details we will see a simple example that might aid in understanding the need for localization better.
Consider a simple “Hello World” example just for a moment. Most users write their first web applications (read servlets, JSPs) to display the time honored Hello World message. Then they move on to more challenging tasks that, say, take the user’s name from an HTML form and display a more personalized message like “Welcome to ABC Bank, John Doe”. Consider the application from a user’s point of view. When your application runs anywhere in the US, almost everyone speaks English and hence they won’t have any trouble figuring out what your application is trying to say.
Now, consider the same application being accessed by a user in a non-English speaking country, say Japan. There is a good chance that the very same message might not make much sense to a Japanese user. Why not? Simple: English is not spoken in many countries, and such countries use their own languages to communicate in their talks, in their books and, surprise, in their websites too.
The point is very simple: present your web application to foreign users in a way that lets them comprehend it and navigate it freely without facing any language barriers.
Great, now you know where this is leading, right? That’s right, localization! In order to localize your web application, you have to identify the key areas that will have to change. You saw in section 19.1.1 that there are two main areas: the user interface and the back end logic (and possibly database encoding support). In this section we focus on the UI specific changes required to localize your web application.
Here is a list of the commonly localized fields in a web application:
1. Messages – These could be status messages displayed to the user or error messages
2. Labels on GUI components – labels, button names
3. Dates
4. Times
5. Numbers
6. Currencies
7. Phone numbers
8. Addresses
9. Graphics – images have to be very specific for every locale and cater to each region’s cultural tastes.
10. Icons
11. Colors – Colors carry different meanings in different countries. For example, white is associated with death and mourning in China.
12. Sounds
13. Personal titles
14. Page layouts – that’s right. Just like colors, page layouts can vary from locale to locale based on the country’s cultural preferences.
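As a rough illustration of how two of the fields in the list above (numbers and dates) change with the locale, the following sketch formats the same values for the US and Germany using the standard java.text formatters. The class name FieldFormatSketch and its helper methods are hypothetical. The exact date text depends on the JDK's locale data and the current date, so no date output is shown.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Date;
import java.util.Locale;

class FieldFormatSketch {
    // Formats a plain number using the locale's grouping and decimal separators.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Formats a date in the locale's long style (month written out in words).
    static String formatDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.LONG, locale).format(date);
    }

    public static void main(String[] args) {
        Locale[] locales = { Locale.US, Locale.GERMANY };
        Date today = new Date();
        for (Locale locale : locales) {
            // The same values are rendered differently for each locale.
            System.out.println(locale + ": " + formatNumber(1234567.5, locale)
                    + " | " + formatDate(today, locale));
        }
        // Number part: 1,234,567.5 for en_US and 1.234.567,5 for de_DE.
    }
}
```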
There are other properties that you might require to be localized, but the ones mentioned are the commonly used ones. You will see in the next section a brief overview of the Java Internationalization API and some examples on how to update some of these fields dynamically based on locale informat