I18N/L10N History / I18N Guidelines / libi18n Module Description

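Note: machine-translated, not proofread.
Although this article traces I18N/L10N history through the Netscape client, the evolution of I18N/L10N has long since ceased to apply only to the Netscape client.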


Netscape Client I18N/L10N History

Contact: Bob Jung <bobj@netscape.com>

Discussion: netscape.public.mozilla.i18n or mozilla-i18n@mozilla.org

Last Update: March 31, 1998

Client Product / I18N Features & Capabilities

Navigator 1.0
  • Latin1 Support. (I18N Team was not started until end of 1994.)

Navigator 1.1
  • Japanese web browsing
  • Japanese charset auto detection
  • Posting RFC1468 compliant Japanese News Articles and Email
  • Entering Japanese Text in HTML Forms
  • Selection, Insertion, Copying & Pasting of Japanese Text
  • Line Wrapping for Japanese Text (No Kinsoku Shori)
  • HTTP charset handling

Navigator 1.1i
  • Resourced UI and defaults for localizability
  • Localized for 3 languages: Japanese, German and French (L10N team did not exist yet)

Navigator 2.x
  • Chinese (Traditional & Simplified) & Korean Support
  • Central/Eastern European Support
  • Per window (non-global) encoding support
  • Added HTML charset tagging via HTML tag
  • Mail and news I18N (e.g., MIME, conversions)
  • Non-English line wrapping, insertion, copy & paste
  • Limited XP locale support (e.g., sorting, time & date)
  • HTTP Accept-Language header support (No Preference UI on X)
  • User Defined encoding and charset mapping
  • Localized for Japanese, German, French, Swedish, Korean, Italian, Dutch, Spanish, Brazilian Portuguese and Danish

Navigator 3.x (Non-Gold & Gold)
  • Additional language support (Cyrillic, Greek, Turkish)
  • 2-byte support in Gold Editor
  • Java non-Latin character display
  • Add Unicode Conversion
  • Stealth Unicode support for Win32
  • Improve Japanese Line Wrapping by adding Kinsoku Shori

Communicator 4.x
  • Chinese, Japanese and Korean (CJK) display on vanilla US Win32
  • Unicode 2.0 support
  • Korean charset auto detection
  • Sends HTTP Accept-Charset header
  • Java JDK 1.1.1 Internationalization Classes
  • Input Method support in Java AWT text widgets
  • Composer
      • Automatic HTML tag insertion for charset parameter
      • UTF-8 editing in Windows
      • Save as text (4.05 and later) into all the languages supported
  • Mail/News
      • Independent encoding per Mail/News folder
      • Encoding menu for Mail composer
      • HTML messages in all the languages supported
      • UTF-7 mail send; UTF-8/UTF-7 display
  • Dynamic Fonts for non-Latin1 encodings

Copyright © 1998 Netscape Communications Corporation

I18N Guidelines

Contact: Erik van der Poel <erik@netscape.com>

Discussion: netscape.public.mozilla.i18n or mozilla-i18n@mozilla.org

Last Update: April 22, 1998

Introduction

This document provides some I18N (internationalization) guidelines for Mozilla. These guidelines should be followed by all Mozilla programmers, regardless of country of residence.

There is a related document, called the Localizability Guidelines.

General I18N Guidelines

One code base for the world. The localization process is simplified if recompilation from source code is not necessary. Only the (external) resource files need to be altered. This means that there cannot be any conditional compilation for specific languages. For example, #ifdef JAPANESE is not allowed. This model is different from that used in the past in the PC world. It is possible, for example, to browse Japanese Web pages even if you are using the English version of the client. It is also possible to browse Chinese pages even if your OS is not Chinese.

8-bit clean. Do not assume that the 8th bit of a byte is unused, and can therefore be employed for your own purposes. Many character encodings use the 8th bit for non-ASCII characters.
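
A minimal sketch of the failure this rule guards against, written in Python rather than the C of the Mozilla source. The helper `strip_high_bit` is hypothetical and is an example of what not to do: it clears the 8th bit of every byte, which quietly damages ISO-8859-1 text.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Hypothetical anti-example: clear the 8th bit of every byte."""
    return bytes(b & 0x7F for b in data)


# "é" is the single byte 0xE9 in ISO-8859-1; its 8th bit is set.
original = "André".encode("iso-8859-1")
damaged = strip_high_bit(original)

print(original.decode("iso-8859-1"))
print(damaged.decode("iso-8859-1"))
```

After masking, 0xE9 becomes 0x69, so the name no longer ends in "é" but in "i". Code that borrows the 8th bit as a flag has the same effect on any non-ASCII text.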

Character != byte. A character is not necessarily one byte. In Asian "multibyte" character encodings, some characters take up 2 bytes or more, while others are one byte each. Do not jump directly into the middle of a byte array. Do not increment a char * pointer by one to move to the next character. Use the libi18n functions to find character boundaries and to walk strings (see also ns/include/libi18n.h):

  • INTL_NextChar
  • INTL_CharLen
  • INTL_NextCharIdxInText
  • INTL_PrevCharIdxInText
  • etc

Also, take care when reading text into fixed-size buffers. For example, if you read some text into a 512-byte buffer, the last byte might be a partial character. You cannot pass this buffer to another module that expects whole characters.
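
The two points above (walking a string by character, and not handing a partial character to another module) can be sketched as follows. This is an illustration in Python, not the libi18n implementation: the helper names `sjis_char_len`, `split_chars` and `whole_char_prefix` are hypothetical, and the sketch assumes Shift_JIS input, where lead bytes 0x81-0x9F and 0xE0-0xFC start a two-byte character.

```python
def sjis_char_len(lead: int) -> int:
    """Return the byte length of a Shift_JIS character, given its first byte."""
    # Lead bytes 0x81-0x9F and 0xE0-0xFC start a two-byte character;
    # ASCII and half-width katakana (0xA1-0xDF) are one byte each.
    if 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC:
        return 2
    return 1


def split_chars(data: bytes) -> list:
    """Walk the byte string one character (not one byte) at a time."""
    chars = []
    i = 0
    while i < len(data):
        n = sjis_char_len(data[i])
        chars.append(data[i:i + n])
        i += n
    return chars


def whole_char_prefix(data: bytes, limit: int) -> bytes:
    """Return the longest prefix of at most `limit` bytes that ends on a
    character boundary, so no partial character is passed downstream."""
    i = 0
    while i < len(data):
        n = sjis_char_len(data[i])
        if i + n > limit:
            break
        i += n
    return data[:i]


text = "日本語abc".encode("shift_jis")   # three 2-byte and three 1-byte characters
print(len(text), len(split_chars(text)))
print(whole_char_prefix(text, 3).decode("shift_jis"))
```

With a 3-byte limit, the prefix stops after the first character instead of cutting the second one in half; the leftover byte would be carried over and prepended to the next read.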

Locale-sensitive operations. Converting a date/time integer into a string is a locale-sensitive operation. There are various date/time formatting conventions used around the world. Use XP_StrfTime() to produce a string in the appropriate format. Similarly, textual sorting rules vary depending on the country. Use the appropriate collation function: XP_StrColl().
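
As a rough illustration of the idea, the sketch below uses Python's standard `time.strftime` and `locale.strxfrm` as stand-ins; it does not show the signatures of XP_StrfTime() or XP_StrColl(), and the helper names are hypothetical. The "C" locale is selected only because it is available on every system.

```python
import locale
import time


def format_date(timestamp: float) -> str:
    """Format a UTC timestamp using the current locale's date convention."""
    return time.strftime("%x", time.gmtime(timestamp))


def sort_for_display(words):
    """Sort strings using the current locale's collation rules."""
    return sorted(words, key=locale.strxfrm)


# A real application would select the user's locale here instead of "C".
locale.setlocale(locale.LC_ALL, "C")
print(format_date(0))
print(sort_for_display(["b", "a", "C"]))
```

The calling code never hard-codes a date layout or a comparison rule; switching the locale changes both results without touching the functions.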

English protocol elements. Some protocols use strings that are in English. For example, email headers use strings like "Subject:". These should not be presented directly to the user. Instead, a localized version of the string should be retrieved from the resources. The protocol itself must still be honored, though. The string "Subject:" should still be used on the wire, while the translated version is presented to the user in the UI.
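
One way to keep the two roles apart is sketched below in Python. The `UI_LABELS` table and both function names are hypothetical; a real client would load the labels from its external resource files.

```python
# Hypothetical resource table; real labels would live in per-language
# resource files, not in the source code.
UI_LABELS = {
    "en": {"Subject": "Subject"},
    "de": {"Subject": "Betreff"},
    "fr": {"Subject": "Objet"},
}


def wire_header(name: str, value: str) -> str:
    """Build the line sent over the network: always the protocol's own name."""
    return f"{name}: {value}"


def display_label(name: str, language: str) -> str:
    """Look up the label shown in the UI, falling back to the protocol name."""
    return UI_LABELS.get(language, {}).get(name, name)


print(wire_header("Subject", "Hallo"))
print(display_label("Subject", "de"))
```

The protocol name is the lookup key and the only thing that reaches the network; the translated string exists only on the display side.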

Special encodings of non-ASCII text. Some protocols apply a special encoding to non-ASCII text in order to protect it while it is in transit over the Net. For example, RFC 2047 specifies the standard to use for transmitting non-ASCII text in email headers. These encoded strings look like this:

=?ISO-8859-1?Q?Andr=E9?=

These strings should not be directly presented to the user. They should first be decoded. Conversely, strings must be encoded before sending them out onto the Net.
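
For illustration, the round trip can be sketched with Python's standard `email.header` module. This is not the libi18n code path; the wrapper names `decode_mime_header` and `encode_mime_header` are hypothetical.

```python
from email.header import Header, decode_header


def decode_mime_header(raw: str) -> str:
    """Decode RFC 2047 encoded-words into text suitable for display."""
    parts = []
    for chunk, charset in decode_header(raw):
        # Encoded words come back as bytes plus a charset name;
        # plain ASCII headers may come back as str already.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)


def encode_mime_header(text: str, charset: str = "iso-8859-1") -> str:
    """Encode text as an RFC 2047 header value before sending it."""
    return Header(text, charset).encode()


print(decode_mime_header("=?ISO-8859-1?Q?Andr=E9?="))
print(encode_mime_header("André"))
```

Decoding happens just before display and encoding just before transmission, so the rest of the program only ever handles ordinary text.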

Use libi18n. Use libi18n wherever possible.


Standards Compliance

Mozilla should adhere to all relevant standards. There are a number of RFCs from the IETF, Recommendations from W3C, and other specifications. Here is a list of some of the relevant specifications.

Overview of Internationalization & Localization

Once upon a time, in the dim, primordial past of software development, a lot of software could only "speak" one human language at a time. Each country or region needed its own version. In some cases, internationally relevant features were retrofitted onto the shipping English language product. In other cases, the English and international versions might be completely separate products, perhaps sharing some basic code but often sharing little more than the product's name. This was particularly the case with Asian versions of a product developed in North America or Europe. In either case, whether making such changes retroactively to a shipping product or developing parallel versions, the result was usually a lengthy, expensive product cycle.

Even if a piece of software was functionally suitable for use in more than one country, producing an appropriate language version of that product was difficult. This is because the menus, dialog boxes and messages which make up that product's user interface (UI) were often written directly in the program source code (using printf type constructs). Successfully translating source files can be quite difficult, expensive and time-consuming for a number of reasons. It requires a combination of linguistic and engineering knowledge that is not commonly available; mistakes are easy to make; and the translated code often does not function as expected, because its authors made assumptions about things like the language of the UI, the length or position of English words (which change when translated to other languages), etc.

This is bad.

Luckily, more and more software developers have realized that if they design their products from the beginning to understand the requirements of multi-locale computing, these products will reach global markets sooner, for less expense, and probably be much more successful than products designed the "bad old fashioned way." Such multi-locale products can be called "globally enabled". Globally enabled software is software that supports a wide range of languages, human cultural conventions, fonts, encodings and other features that make it useful, not just in one country or region, but around the world. Additionally, the user interface for globally enabled products is separate from the core instruction code, allowing the software to be translated without requiring recompilation. Since globally enabled software doesn't make assumptions about language of the user interface, the translated programs are more robust, requiring less "fixing" or special enhancements to support individual languages. This speeds the release of translated programs.

This is good.

The Mozilla family (Navigator and Communicator) is (and must remain) globally enabled. The core Mozilla binary executable for each platform supports computing in North American English, Western European, Central European, Chinese, Japanese and Korean locales. The user interface is contained in resource files and is, for the most part, completely separate from the core binary. Having the UI disconnected from the core code means you do not need a Japanese language version of Mozilla to browse Japanese web pages. You can use an English, a French or whatever version of Mozilla for Japanese browsing and vice versa (with the appropriate fonts and set-up, of course).

This is very good.

The remainder of this document is intended to convey enough information so that you can continue to make Mozilla a globally enabled software project. Future Mozilla products must at the very least continue to support the level of internationalization we have today. As we move forward, we want to extend Mozilla to support more and more languages, encodings and other globally relevant feature sets.

This will be really, totally good.

Definitions

Before proceeding, let's establish some definitions of the major terms and concepts used herein:

Internationalization (a.k.a. Globalization, a.k.a. Enabling): Designing and developing a software product to function in multiple locales. This process involves identifying the locales that must be supported, designing features which support those locales, and writing code that functions equally well in any of the supported locales.

Localization: Modifying or adapting a software product to fit the requirements of a particular locale. This process includes (but may not be limited to) translating the user interface, documentation and packaging, changing dialog box geometries, customizing features (if necessary), and testing the translated product to ensure that it still works (at least as well as the original).

Localizability: The degree to which a software product can be localized. Localizable products separate data from code, correctly display the target language and function properly after being localized.

i18n: Acronym for "internationalization" ("i" + 18 letters + "n"; lower case i is used to distinguish it from the numeral 1 (one)).

L10n: Acronym for "localization" ("L" + 10 letters + "n"; upper case L is used to distinguish it from the numeral 1 (one)).

L12y: Acronym for "localizability" ("L" + 12 letters + "y"; upper case L is used to distinguish it from the numeral 1 (one)).

Locale: A set of conventions affected or determined by human language and customs, as defined within a particular geo-political region. These conventions include (but are not necessarily limited to) the written language, formats for dates, numbers and currency, sorting orders, etc.

Resource: 1. Any part of a program which can appear to the user or be changed or configured by the user. 2. Any piece of the program's data, as opposed to its code.

Core product: The language independent portion of a software product (as distinct from any particular localized version of that product, including the English language version). Sometimes, however, this term is used to refer to the English product as opposed to other localizations.

Justifications (or "Why Should I Care?")

The Internet is arguably the single biggest revolution in human communications since some forgotten caveman learned to signal the rest of the tribe by beating on a hollow log. This is a global medium with the power to connect the world's disparate peoples, but only if the delivery mechanism can adequately handle the confusing Babel that is the hodgepodge of languages, encodings and local expectations that make up our different cultures. The problems associated with trying to re-engineer mono-lingual products to support different locales were discussed in the opening section of this document. Properly enabled software is the answer here, not a barrage of unique, mono-lingual applications.

If your application is truly only relevant to a limited audience, defined by language or locale, then you probably don't need to care about internationalization or localization issues. However, if your application could be useful regardless of where in the world it is used, or your target users span multiple countries, regions or languages, why make it more difficult to reach them? By following these guidelines it will be much easier to release a successful product worldwide.

If shortcuts are taken and/or mistakes made during the core product's development, it can be time-consuming and expensive to correct these during the localization process. Such delays and expenses eat into the profits from localized releases (both monetarily and in terms of lost opportunities). Properly enabled products can help turn localization from a chaotic and expensive game of "catch up" into a smooth, well-oiled machine.

The justification for presenting these guidelines, then, is to create an environment where internationally enabled, fully localizable products can be released as smoothly and quickly as possible. As you work on your various projects, please keep the following two catchy phrases in mind:

  • One code base for the world
  • English is just another language

libi18n Module Description

Discussion: netscape.public.mozilla.i18n or mozilla-i18n@mozilla.org
Last Update: March 31, 1998

Contact: Bob Jung <bobj@netscape.com>

Introduction

The Mozilla family (Navigator and Communicator) is globally enabled. Globally enabled software shares common source code from which we build a single binary executable (per platform) that supports a wide variety of languages. The initial Mozilla source release supports Western, Central European, Chinese, Japanese, Korean, Greek, Turkish and Cyrillic languages. (For an overview of Mozilla Internationalization (I18N) and Localization (L10N), check out the Mozilla Internationalization & Localization Guidelines.)

Libi18n provides the underlying internationalization utility functions used in Mozilla to support international Web browsing and Internet Mail/News functionality. The emphasis is on underlying because there is a lot of other code that must be written in order to internationalize features.

Mozilla programmers should call the libi18n APIs wherever possible, but should also expect to write module and feature specific I18N aware code. Check out the other Mozilla modules to see how this has been done. In addition to calling libi18n, a significant amount of programming has been required to internationalize the HTML layout engine, the front end (UI and text rendering) code, and mail/news.

This document only provides an overview of the libi18n module. For information on general I18N issues and the I18N of other Mozilla modules see I18N Guidelines.

The functions that libi18n provides to other Mozilla modules include:

  • Character Code Conversion
  • Finding Character Boundaries
  • Handling I18N related HTTP Headers
  • Line/Word Breaking (for text layout support)
  • Locale Sensitive Operations (collation, date/time formatting)
  • Mail/News Header Processing
  • Platform Independent String Resources
  • String Comparison
  • Unicode String Functions

The corresponding libi18n public API specifications are documented in the International Library Reference.

History (can be read as commentary on the history section above)

With a very small I18N team and tight product release schedules, our strategy over the past 3 years has been to incrementally add features -- prioritized by Netscape's international market needs.

Our initial work for Netscape Navigator (NN) 1.1 focused on adding Japanese Web browsing capability. We invented the notion of a document character set and a window (or font encoding) character set and provided a stream module to convert incoming text documents from the document charset to the window charset. This streams module and various Japanese charset converters were the first libi18n functions. After the first Beta, we added the ability in libi18n to auto-detect between the 3 common Japanese charset encodings: Shift_JIS, JIS and EUC-JP.

NN1.1 was a significant advancement for Japanese Web browsing and was well received. However, all of its UI was still in English. In order to localize NN, we created a special "i-build" (NN1.1i) because NN1.1 was full of hard-coded strings and other localization unfriendly coding practices. We added libi18n APIs to make it easier to resource user visible strings. NN1.1i was then localized into Japanese, German and French -- Netscape's first localized releases! The localizability infrastructure created for NN1.1i was then merged back into the mainstream source code for NN2.x and later releases.

NN2.x extended our charset support beyond Western and Japanese. Our NN1.1 stream module and charset converter architecture were designed to be extensible (not Japanese centric) which made it straightforward to add Chinese, Korean and Central European charset encodings support in the NN2.0 libi18n.

Other NN2.x libi18n additions included:

  • Enhancing the charset concept to be on a per window/context basis instead of globally affecting all windows/contexts
  • RFC1522 support to handle MIME headers. (Really these functions should migrate from libi18n to the libmime library.)
  • XP locale support (e.g., sorting, time & date)
  • HTTP Accept-Language header support

NN3.x libi18n added:

  • Additional charset converters for Cyrillic, Greek and Turkish
  • Enhanced line wrapping for Asian languages (kinsoku shori)

NN4.x libi18n added:

  • Unicode 2.0 converters
  • Korean charset auto-detection
  • HTTP Accept-Charset header support

The overall (not just libi18n) evolution of the Netscape client I18N and L10N support is highlighted by a table of the Netscape I18N/L10N Client History.

How It Works

Libi18n is a collection of fundamental internationalization functions. So it is difficult to write How It Works because there really are several "it"s. In this document, we mention a few of the bigger "it"s and include links to others.

Document Charset Conversion

One of the most important functions provided by libi18n is character set conversion of the incoming text data. As each block of text data is received from the net (or cache), the libi18n stream module heuristically determines (to the best of its ability) the character set encoding of the incoming document, then it converts the data block from the "document" character encoding to the "window" character encoding (usually equivalent to the font encoding) before passing the data downstream to the HTML parser and layout engine.

Currently the HTML parser and layout engine assume HTML special characters (e.g., '<', '>') in text data passed downstream to them are encoded as ASCII values. Therefore ISO-2022-xx and other 7-bit encodings such as UTF-7 and HZ are converted to an ASCII "superset" encoding, and UCS-2 is converted to UTF-8 by the libi18n conversion module before being sent downstream to the HTML parser.
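
The UCS-2 case can be illustrated with a short Python sketch. It is not the libi18n conversion module: the function name `to_parser_encoding` is hypothetical, and Python's "utf-16-be" codec is used as a stand-in for UCS-2 (the two agree for the characters used here).

```python
def to_parser_encoding(data: bytes, document_charset: str) -> bytes:
    """Re-encode document bytes as UTF-8, where '<' and '>' keep their
    single-byte ASCII values."""
    return data.decode(document_charset).encode("utf-8")


ucs2 = "<p>日本語</p>".encode("utf-16-be")
utf8 = to_parser_encoding(ucs2, "utf-16-be")

# In UCS-2 every ASCII character is padded with a zero byte, so a parser
# scanning for the ASCII byte sequence "<p>" cannot find the tag.
print(b"<p>" in ucs2)
print(b"<p>" in utf8)
```

After conversion the markup bytes are plain ASCII again, while the Japanese text is carried as multibyte UTF-8 sequences whose bytes all have the 8th bit set and so cannot be mistaken for markup.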

The character set converters called by the libi18n stream module must maintain state because (1) the text data may be stateful or contain multibyte characters and (2) state is needed in some cases in which libi18n auto-selects from a few character encodings (e.g., between the 3 common Japanese encodings).
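
Reason (1) can be sketched with Python's incremental codecs, used here as a stand-in for the libi18n converters. The function name `convert_stream` is hypothetical, and the sketch covers only multibyte state, not auto-detection.

```python
import codecs


def convert_stream(blocks, document_charset: str, window_charset: str):
    """Convert an iterable of byte blocks from the document charset to the
    window charset, yielding one converted block per input block."""
    decoder = codecs.getincrementaldecoder(document_charset)()
    encoder = codecs.getincrementalencoder(window_charset)()
    for block in blocks:
        # The decoder holds back a trailing partial character and completes
        # it when the next block arrives; this is the state it must keep.
        text = decoder.decode(block)
        yield encoder.encode(text)
    # Flush both ends; this raises if the stream stopped mid-character.
    yield encoder.encode(decoder.decode(b"", final=True), final=True)


data = "日本語".encode("shift_jis")   # 6 bytes, 2 per character
blocks = [data[:3], data[3:]]         # the split falls inside the 2nd character
converted = b"".join(convert_stream(blocks, "shift_jis", "euc_jp"))
print(converted.decode("euc_jp"))
```

A stateless converter called once per block would fail or emit garbage on the first block here, because its last byte is only half of a character.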

The actual character set conversion functions can be categorized in three types:

  1. Algorithmic conversions for Chinese, Japanese and Korean (e.g., Shift_JIS <-> EUC-JP)
  2. Table driven for 1-byte to 1-byte encodings (e.g., CP1250 <-> ISO8859-2)
  3. Table driven for Unicode conversions
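
The second type, a table-driven 1-byte to 1-byte conversion, can be sketched as below. The table here is derived from Python's own codecs purely for illustration; the names `build_table` and `cp1250_to_latin2` and the "?" fallback for unmappable bytes are assumptions, not how libi18n builds or stores its tables.

```python
def build_table(src: str, dst: str, fallback: bytes = b"?") -> bytes:
    """Build a 256-entry table mapping each byte of charset `src` to the
    corresponding byte of charset `dst`."""
    table = bytearray(256)
    for value in range(256):
        try:
            converted = bytes([value]).decode(src).encode(dst)
        except UnicodeError:
            # The byte is undefined in src or has no counterpart in dst.
            converted = fallback
        table[value] = converted[0]
    return bytes(table)


CP1250_TO_LATIN2 = build_table("cp1250", "iso8859_2")


def cp1250_to_latin2(data: bytes) -> bytes:
    """Convert by one table lookup per byte; no state is needed."""
    return data.translate(CP1250_TO_LATIN2)


print(cp1250_to_latin2("Škoda".encode("cp1250")).decode("iso8859_2"))
```

Because every character is exactly one byte on both sides, the whole conversion is a single 256-byte lookup table per direction, which is why this category is cheap compared with the multibyte and Unicode cases.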

The document character set encodings currently supported by Communicator are listed in the Netscape More Tips and Technical Information for International Users.

See the documentation on the Mozilla network library in the mozilla.org list of technical papers for more information on the Mozilla streams architecture.

Managing Charset Encodings

In addition to doing the initial charset conversion of the text document data, Mozilla needs to track and manage the charset information, so that any text input, display or manipulation is performed correctly. The charset has a significant effect on layout and editing, including the behavior of line wrapping, selection, copy and paste. The behavior of the front ends (MacFE, WinFE and XFE) is also greatly affected by the charset information (e.g., how they measure and draw).

There are several types of Mozilla contexts (e.g., Web browsing, HTML composing, mail reading, mail composing) that need to track and use the charset information. Libi18n provides the APIs to manage the getting and setting of information in the charset object.

XP Locale Functions

The Cross Platform (XP) locale functions provide platform independent APIs for string collation and date/time formatting. Because these are wrappers around the existing locale functions provided by the operating system, the behavior may not be totally consistent across platforms.

Other libi18n Functions

There's more functionality provided by libi18n, but this document is intended to provide a brief overview. For more info on how to write code using the libi18n functions see the description of the libi18n public APIs, International Library Reference.


via:

Netscape Client I18N/L10N History
https://www-archive.mozilla.org/docs/reflist/i18n/i18n-history

Mozilla i18n & L10n Guidelines
https://www-archive.mozilla.org/docs/reflist/i18n/

libi18n Module Description
https://www-archive.mozilla.org/docs/reflist/i18n/libi18n-desc
