
15. How can I represent these characters in E-mail or on Usenet?

Accented characters are not included in standard, 7-bit ASCII. Since only 7-bit ASCII can be reliably transmitted over the net, this leads to problems when trying to use Esperanto in E-mail and Usenet news. These problems are not unique to Esperanto; all languages with accents have them.

Two approaches are possible: using ASCII to represent the accented characters, or using 8-bit codes and sending them somehow over the net.


Using Standard ASCII

There are two major work-arounds to represent Esperanto's accented letters using standard 7-bit ASCII: using the letter "h" to represent the circumflex, and using the letter "x" to represent all accents.

                    ^    ^    ^    ^    ^    -
Esperanto letter:   c    g    h    j    s    u

"h" method:         ch   gh   hh   jh   sh   u

"x" method:         cx   gx   hx   jx   sx   ux

The "h" method is canonical in Esperanto since the Fundamento de Esperanto, which forms the basis of the language, expressly provides for it. Note that "u with breve" is represented by "u" alone, not "uh".

The "x" method is a recent coinage and first appeared among computer users; it is used only on the Net.
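The two substitutions in the table above are mechanical enough to sketch in a few lines of Python. This is only an illustration: the names BASE and to_ascii are hypothetical, and capital letters are handled in the simplest possible way (the added "h" or "x" is always lower case).

```python
# Illustrative mapping from each accented letter to its base letter.
BASE = {"ĉ": "c", "ĝ": "g", "ĥ": "h", "ĵ": "j", "ŝ": "s", "ŭ": "u"}


def to_ascii(text: str, method: str = "x") -> str:
    """Rewrite accented Esperanto letters using the "h" or "x" method."""
    if method not in ("h", "x"):
        raise ValueError("method must be 'h' or 'x'")
    out = []
    for ch in text:
        base = BASE.get(ch.lower())
        if base is None:
            out.append(ch)  # not an accented Esperanto letter: keep as-is
            continue
        # In the "h" method, u with breve becomes plain "u" with no suffix.
        suffix = "" if (method == "h" and base == "u") else method
        if ch.isupper():
            base = base.upper()
        out.append(base + suffix)
    return "".join(out)


print(to_ascii("ĉiuĵaŭde", "x"))  # cxiujxauxde
print(to_ascii("ĉiuĵaŭde", "h"))  # chiujhaude
```

Note that the "h" output has lost the breve entirely, which is exactly what the Fundamento rule prescribes.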

The "x" method was very popular in the early years of the net, but the "h" method has clearly been gaining ground recently, as more "ordinary" Esperantists (as opposed to professional computer users, etc.) have started using the net. Either method may be used with confidence.

One argument made in favour of the "x" method is that it removes all ambiguity, since "x" is not a letter of the Esperanto alphabet. This perhaps makes it more suitable for beginners: with the "h" method, a beginner might misread "flughaveno" (flug-haveno, "airport") as "flug^aveno" and try to look that up in the dictionary.
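The ambiguity point can be made concrete with a small Python sketch. The helper names (from_x, from_h_naive) are hypothetical, and the "h" decoder is deliberately naive: it is shown only to expose the problem, not as a recommended tool.

```python
import re

ACCENTED = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}


def _accent(match: re.Match) -> str:
    """Return the accented form of the matched base letter, keeping its case."""
    base = match.group(1)
    accented = ACCENTED[base.lower()]
    return accented.upper() if base.isupper() else accented


def from_x(text: str) -> str:
    """Reverse the "x" method: every base letter + x is an accented letter."""
    return re.sub(r"([cghjsuCGHJSU])[xX]", _accent, text)


def from_h_naive(text: str) -> str:
    """Naively reverse the "h" method by treating every ch/gh/hh/jh/sh as accented."""
    return re.sub(r"([cghjsCGHJS])h", _accent, text)


print(from_x("cxiujxauxde"))       # ĉiuĵaŭde
print(from_h_naive("flughaveno"))  # fluĝaveno (wrong: the word is flug-haveno)
```

Because "x" never occurs in native Esperanto words, from_x needs no dictionary; foreign names containing "x" would still need special handling. A correct "h" decoder, by contrast, has to know where the word boundaries inside compounds fall, and it cannot restore the breve on "u" at all.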

Other methods are also used, such as typing a circumflex accent (^) before or after the accented letter, but these are rarer.

These work-arounds should only be used when one is restricted to 7-bit ASCII. It is wrong to use them when the real characters are available. All word processing programs can handle the accented letters correctly; most typewriters (especially electronic typewriters) can also do so. It is also wrong to use these work-arounds when hand-writing.


Using 8-bit Codes

Esperanto is covered by the 8-bit encoding known as Latin-3 (ISO 8859-3:1988). Since 8-bit codes usually cannot be reliably transmitted over the net, some "data massaging" is necessary.
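As a rough illustration of why Latin-3 text counts as "8-bit", the following Python sketch encodes each accented letter with Python's built-in iso8859_3 codec and checks that the result is a single byte with the high bit set. The names LETTERS and latin3_bytes are made up for this example.

```python
LETTERS = "ĈĉĜĝĤĥĴĵŜŝŬŭ"


def latin3_bytes(text: str) -> bytes:
    """Encode text as Latin-3 (ISO 8859-3)."""
    return text.encode("iso8859_3")


for ch in LETTERS:
    raw = latin3_bytes(ch)
    # Each letter is one byte, and that byte lies outside the 7-bit range.
    print(ch, len(raw), raw[0] > 0x7F)

# The same letters cannot be expressed in 7-bit ASCII at all.
try:
    "ĉ".encode("ascii")
except UnicodeEncodeError:
    print("not representable in ASCII")
```

Any channel that clears or mangles the eighth bit will therefore corrupt exactly these letters while leaving the rest of the text intact.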

For E-mail, a standard known as MIME (Multipurpose Internet Mail Extensions) converts 8-bit characters to 7-bit ASCII for transmission, and converts the message back to 8 bits upon reception. Many E-mail programs can do this conversion automatically; however, users with shell accounts (especially students) often cannot see MIME messages properly. For this reason, one should ensure that the recipient's system supports MIME before sending messages in this format.
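The round trip described above can be sketched with Python's standard email package. This is a minimal example under stated assumptions: the function names wrap_as_mime and unwrap_mime are hypothetical, quoted-printable is chosen as the 7-bit transfer encoding (base64 would also work), and no addressing headers are set.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def wrap_as_mime(text: str) -> str:
    """Build a MIME message whose Latin-3 body is protected for 7-bit transport."""
    msg = EmailMessage()
    msg.set_content(text, charset="iso-8859-3", cte="quoted-printable")
    return msg.as_string()


def unwrap_mime(wire: str) -> str:
    """Parse the 7-bit wire form and recover the original characters."""
    msg = message_from_string(wire, policy=policy.default)
    return msg.get_content()


wire = wrap_as_mime("Ĉu vi parolas Esperanton?")
print(wire)               # headers plus an ASCII-only body
print(unwrap_mime(wire))  # the accented text again
```

A reader without MIME support sees the wire form directly, with the accented letters shown as "=XX" escapes, which is why it matters whether the recipient's system can decode it.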

The use of MIME in Usenet is neither specifically permitted nor expressly prohibited. Most newsreaders can't handle postings in MIME, so it is best not to use it in Usenet.

Some users post messages in soc.culture.esperanto and other Usenet groups using "raw" Latin-3 codes, without attempting to "protect" them with a 7-bit encoding. This has led to some heated discussions between those who say that they can receive the original 8-bit Latin-3 codes, and those who say that they often (or always) receive gibberish.

Even if the codes are transmitted properly, they can only be viewed as Esperanto characters if a Latin-3 font is used; users whose language requires the use of an incompatible 8-bit font (e.g. Russian and Japanese) will have problems viewing these characters in any event.
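The font problem can be simulated in Python by encoding text as Latin-3 and then decoding the same bytes with a different 8-bit character set. The function name misread is hypothetical, and Latin-1 and KOI8-R are used here only as examples of incompatible 8-bit sets.

```python
def misread(text: str, wrong_charset: str) -> str:
    """Encode as Latin-3, then interpret the bytes in another 8-bit charset."""
    raw = text.encode("iso8859_3")
    return raw.decode(wrong_charset, errors="replace")


sample = "ĉiuĵaŭde"
print(misread(sample, "iso8859_3"))  # correct charset: text is unchanged
print(misread(sample, "latin_1"))    # Western European reader
print(misread(sample, "koi8_r"))     # Russian reader
```

The bytes arrive intact in all three cases; only the interpretation differs. The ASCII letters survive, while each accented letter turns into whatever character the other set assigns to that byte value.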

Esperanto's accented characters are covered by the incipient "wide character" standard Unicode (ISO/IEC 10646-1:1993), so these problems will be solved if and when Unicode is widely adopted and implemented. Unicode is a widely endorsed 16-bit character code intended to cover all languages, including those with non-alphabetic scripts such as Chinese and Japanese.
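For reference, the Unicode code points of the twelve accented letters can be listed with Python's unicodedata module. The helper name describe is made up for this sketch.

```python
import unicodedata

LETTERS = "ĈĉĜĝĤĥĴĵŜŝŬŭ"


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in LETTERS:
    print(ch, describe(ch))
```

Every one of these code points is below U+10000, so each letter fits in a single 16-bit unit, and each has its own value that does not depend on which 8-bit font the reader has installed.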


Recommendations

For everyday use, it is probably best to use either the "h" method or the "x" method, both for E-mail and for Usenet news. These methods are widely used and recognized, and both work well in practice.

If one is sure that the recipient can handle MIME messages, then this format can be used for E-mail.

No satisfactory 8-bit solution exists today for Usenet. Either the "h" method or the "x" method should be used for Usenet news.

