
Formatting and Parsing

Overview

Formatters translate between values in their internal binary form and human-readable text. For example, you cannot display the computer representation of the number 103. You can only display the numeral 103 as a textual representation (using three text characters). The result from a formatter is a string that contains text that the user will recognize as representing the internal value. A formatter can also parse a string, converting a textual representation of a value back into its internal representation. For example, it reads the characters 1, 0 and 3 followed by something other than a digit, and produces the value 103 as an internal binary representation.

The formatting classes encapsulate information about the display of localized times, days, numbers, currencies, and messages. They do both formatting and parsing, and they allow the data that the end user sees to be separated from the code. Separating the program code from the data makes a program easier to localize. Formatting converts a date, time, number, message or other object from its internal representation into a string. Parsing is the reverse operation: it converts a string into the internal representation of the date, time, number, message or other object.

Using the formatting classes is an important step in internationalizing your software because the format() and parse() methods in each of the classes make your software language neutral, by replacing implicit conversions with explicit formatting calls.
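The format and parse round trip described in this overview can be sketched as follows. This is a minimal illustration, not ICU's own sample code: it assumes the JDK's java.text.NumberFormat as a stand-in for the ICU class of the same name, and the class name ParseRoundTrip and its helper methods are hypothetical.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical demo class; java.text.NumberFormat is assumed here as a
// stand-in for the ICU NumberFormat class discussed in the text.
final class ParseRoundTrip {

    // Reads a number from the start of the text. Parsing stops at the first
    // character that cannot be part of the number.
    static long parseLeadingNumber(String text) {
        NumberFormat parser = NumberFormat.getInstance(Locale.US);
        ParsePosition position = new ParsePosition(0);
        Number parsed = parser.parse(text, position);
        if (parsed == null) {
            throw new IllegalArgumentException("No number at start of: " + text);
        }
        return parsed.longValue();
    }

    // Converts the internal value back into text for display.
    static String formatNumber(long value) {
        return NumberFormat.getInstance(Locale.US).format(value);
    }

    public static void main(String[] args) {
        long value = parseLeadingNumber("103 apples");
        System.out.println(value);               // prints 103
        System.out.println(formatNumber(value)); // prints 103
    }
}
```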

Internationalization Formatting Tips

This section discusses some of the ways you can format and parse numbers, currencies, dates, times and text messages in your program so that the data is separate from the code and can be easily localized. This is the information your users see on their computer screens, so it needs to be in a language and format that conforms to their local conventions.

Some things you need to keep in mind while you are creating your code are the following:

Numbers and Currencies

Programs store and operate on numbers using a locale-independent binary representation. When a program displays or prints a number, it converts the number to a locale-specific string. For example, the number 12345.67 is "12,345.67" in the US, "12 345,67" in France and "12.345,67" in Germany.

By invoking the methods provided by the NumberFormat class, you can format numbers, currencies, and percentages according to the specified or default locale. NumberFormat is locale-sensitive, so you need to create a new NumberFormat for each locale. NumberFormat methods format primitive-type numbers, such as double, and output the number as a locale-specific string.
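As a rough sketch of the one-formatter-per-locale approach, the following formats 12345.67 for three locales. It assumes the JDK's java.text.NumberFormat in place of the ICU class; the class name LocaleNumbers and the helper format are hypothetical. The grouping separator used for France depends on the locale data shipped with the runtime, so no exact French output is shown.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class; uses the JDK NumberFormat as an assumed stand-in
// for the ICU class of the same name.
final class LocaleNumbers {

    // Creates a formatter for the given locale and formats the value with it.
    static String format(double value, Locale locale) {
        NumberFormat formatter = NumberFormat.getInstance(locale);
        return formatter.format(value);
    }

    public static void main(String[] args) {
        double value = 12345.67;
        System.out.println(format(value, Locale.US));      // prints 12,345.67
        System.out.println(format(value, Locale.GERMANY)); // prints 12.345,67
        // The French grouping separator is a kind of space that varies with
        // the runtime's locale data.
        System.out.println(format(value, Locale.FRANCE));
    }
}
```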

For currencies, you call getCurrencyInstance to create a formatter that returns a string with the formatted number and the appropriate currency sign. The NumberFormat class is unaware of exchange rates, so the numeric value it outputs is not converted from one currency to another. This means that the same number represents different monetary values depending on the currency locale. A number such as 9988776.65 is formatted with each locale's own currency sign and separators, with no exchange-rate conversion applied.
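A minimal sketch of currency formatting for the value 9988776.65 might look like this. It assumes the JDK's java.text.NumberFormat rather than the ICU class, and the class name CurrencyDemo and the helper asCurrency are hypothetical. Only the US output is given in a comment because the exact symbol and spacing for other locales depend on the runtime's locale data.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class; the JDK NumberFormat is an assumed stand-in for
// the ICU class of the same name.
final class CurrencyDemo {

    // Formats the amount as currency for the locale. No exchange-rate
    // conversion takes place; only the presentation changes.
    static String asCurrency(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    public static void main(String[] args) {
        double amount = 9988776.65;
        System.out.println(asCurrency(amount, Locale.US)); // prints $9,988,776.65
        System.out.println(asCurrency(amount, Locale.GERMANY));
        System.out.println(asCurrency(amount, Locale.FRANCE));
    }
}
```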

To format percentages, call the getPercentInstance method to create a locale-specific formatter. With this formatter, a decimal fraction such as 0.75 is displayed as 75%.
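The percentage case can be sketched in a few lines. As with the other examples, this assumes the JDK's java.text.NumberFormat instead of the ICU class, and the names PercentDemo and asPercent are hypothetical.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class; the JDK NumberFormat is an assumed stand-in for
// the ICU class of the same name.
final class PercentDemo {

    // Formats a decimal fraction (0.75) as a percentage for the locale.
    static String asPercent(double fraction, Locale locale) {
        return NumberFormat.getPercentInstance(locale).format(fraction);
    }

    public static void main(String[] args) {
        System.out.println(asPercent(0.75, Locale.US)); // prints 75%
    }
}
```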

Customizing Number Formats

If you need to customize a number format, you can use the DecimalFormat and DecimalFormatSymbols classes. This is not usually necessary and it makes your code much more complex, but it is available for those rare instances where you need it. In general, you would do this by explicitly specifying the number format pattern.
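An explicit pattern with custom symbols could look roughly like the following. This sketch assumes the JDK's java.text.DecimalFormat and DecimalFormatSymbols, which share their names with the ICU classes; the pattern, the chosen separators, and the class name CustomPattern are illustrative assumptions.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class; the JDK DecimalFormat classes are assumed
// stand-ins for the ICU classes of the same names.
final class CustomPattern {

    // Builds a formatter from an explicit pattern and explicit separators
    // rather than relying on a locale's defaults.
    static String formatWithPattern(double value) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.US);
        symbols.setGroupingSeparator('.');
        symbols.setDecimalSeparator(',');
        // Grouping every three digits, exactly three fraction digits.
        DecimalFormat formatter = new DecimalFormat("#,##0.000", symbols);
        return formatter.format(value);
    }

    public static void main(String[] args) {
        System.out.println(formatWithPattern(12345.67)); // prints 12.345,670
    }
}
```

Hard-coding separators like this bypasses locale data, which is why the predefined locale formatters are usually the better choice.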

If you need to format or parse spelled-out numbers, you can use the RuleBasedNumberFormat class. You can instantiate a default formatter for a locale, or by using the RuleBasedNumberFormat rule syntax, specify your own.

Using NumberFormat class methods with a predefined locale is the easiest and most reliable way to format numbers and currencies.

Dates and Times

You display or print a Date by first converting it to a locale-specific string that conforms to the conventions of the end user's Locale. For example, Germans recognize 20.4.98 as a valid date, and Americans recognize 4/20/98.
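The conversion of a date to a locale-specific string can be sketched as follows. This assumes the JDK's java.text.DateFormat as a stand-in for the ICU class of the same name; the class name ShortDates and its helpers are hypothetical. The exact digits and padding of the short style depend on the runtime's locale data, so the output is not reproduced here.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical demo class; the JDK DateFormat is an assumed stand-in for
// the ICU class of the same name.
final class ShortDates {

    // Formats the date in the locale's short date style.
    static String shortDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.SHORT, locale).format(date);
    }

    // Noon on 20 April 1998 in the default time zone.
    static Date sampleDate() {
        return new GregorianCalendar(1998, Calendar.APRIL, 20, 12, 0).getTime();
    }

    public static void main(String[] args) {
        Date date = sampleDate();
        // German style puts the day first; US style puts the month first.
        System.out.println(shortDate(date, Locale.GERMANY));
        System.out.println(shortDate(date, Locale.US));
    }
}
```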

Note: The appropriate Calendar support is required for different locales. For example, the Buddhist calendar is the official calendar in Thailand, so you should not assume that the Gregorian calendar is in use. ICU will pick the appropriate Calendar based on the locale you supply when opening a Calendar or DateFormat.

Messages

Message format helps make the order of display elements localizable. It helps address problems of grammatical differences between languages. For example, consider the sentence, "I go to work by car every day." In Japanese, the grammatical equivalent can be "Every day, I to work by car go." Another example is plurals in text, such as "no space for rent, one room for rent and many rooms for rent," where "for rent" is the only constant text among the three.
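Both points, reordering arguments and choosing text by quantity, can be sketched with message patterns. This example assumes the JDK's java.text.MessageFormat and its choice syntax as a stand-in for the ICU class of the same name; the patterns, the class name MessageDemo, and the helper fill are hypothetical and are not taken from any real translation.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical demo class; the JDK MessageFormat is an assumed stand-in for
// the ICU class of the same name.
final class MessageDemo {

    // Two illustrative patterns that place the same arguments in a
    // different order, as a translator might need to.
    static final String PLACE_FIRST = "I go to {0} by {1}.";
    static final String MEANS_FIRST = "By {1}, to {0} I go.";

    // Selects the wording by quantity; "for rent" stays constant.
    static final String ROOMS =
        "{0,choice,0#no rooms|1#one room|1<{0,number,integer} rooms} for rent";

    // Fills the pattern's numbered placeholders with the arguments.
    static String fill(String pattern, Object... args) {
        return new MessageFormat(pattern, Locale.US).format(args);
    }

    public static void main(String[] args) {
        System.out.println(fill(PLACE_FIRST, "work", "car")); // prints I go to work by car.
        System.out.println(fill(MEANS_FIRST, "work", "car")); // prints By car, to work I go.
        System.out.println(fill(ROOMS, 0)); // prints no rooms for rent
        System.out.println(fill(ROOMS, 1)); // prints one room for rent
        System.out.println(fill(ROOMS, 5)); // prints 5 rooms for rent
    }
}
```

Because the word order lives in the pattern string, a translator can change it without touching the calling code.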

Formatting and Parsing Classes

ICU provides four major areas and twelve classes for formatting numbers, dates and messages:

General Formatting

Formatting Numbers

Formatting Dates and Times

Formatting Messages



Copyright (c) 2000 - 2007 IBM and Others - Feedback: http://icu-project.org/contacts.html

User Guide for ICU v3.8 Generated 2007-09-14.