Languages & Internationalisation, part I ( was Intern. II )
manuel companys
mcompanys at mac.com
Tue Jan 7 19:04:01 EST 2003
The following is especially true for speakers of 'Western European
languages' using Mac OS, but it may be useful for other R-R users as well.
EASIEST TO LOCALIZE LANGUAGES:
******************************************
No special fonts are needed, either to create or to use the software.
This has been easy since the 1984 Mac 128K. To type the translation more
easily, you may just need to select the keyboard layout, which you can do
on the fly from the menu bar since System 7. The fonts are called
'Western European Languages' (roughly Latin-1; on the Mac, the Mac OS
Roman character set).
This group includes:
- Romance languages: French, Occitan (Provençal, Languedocien,
North Occitan, Gascon), Catalan, Italian, Corsican, Sardinian,
Spanish, Portuguese [but NOT Romanian]
- Germanic languages: English, Dutch, Plattdeutsch, German, Yiddish,
Danish, Swedish, Norwegian [but NOT Icelandic; I don't know about
Frisian and Faroese]
- Finno-Ugric languages: Finnish [but NOT Hungarian; I don't know
about Estonian and other northern languages]
- Euskarian (Basque)
- many so called 'third world' languages not needing extra diacritics
(accents, cedillas, bars, etc.)
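
As a rough illustration of the grouping above, here is a minimal Python
sketch that checks whether a sample word can be stored in a single-byte
Western European character set. The helper name `fits` and the sample
words are hypothetical and chosen only for this example; `mac_roman` and
`latin-1` are the names Python's standard library uses for the Mac OS
Roman and ISO 8859-1 codecs.

```python
def fits(text: str, codec: str) -> bool:
    """Return True if every character of `text` exists in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Illustrative sample words (an assumption of this sketch, not a survey).
samples = {
    "French": "français",
    "Spanish": "español",
    "German": "Österreich",
    "Finnish": "hyvää päivää",
    "Hungarian": "főnök",  # ő is outside both character sets
    "Icelandic": "þú",     # þ is in Latin-1 but not in Mac OS Roman
}

for language, word in samples.items():
    print(language, fits(word, "mac_roman"), fits(word, "latin-1"))
```

Note that the two character sets do not agree on every language: the
Icelandic sample fits Latin-1 but not Mac OS Roman, so the 'NOT
Icelandic' remark applies to the Mac set specifically.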
EASIER TO LOCALIZE LANGUAGES
****************************************
Central European languages (Latin-2, I guess) use a font set that is
very similar but differs in some of the accented letters. Of course
both the programmer and the user need Central European fonts; but as
far as the language is concerned, a 1984 Mac could be used.
This group includes:
- Germanic languages: English, Dutch, Plattdeutsch, German, Yiddish
- Finno-Ugric languages: Finnish, Hungarian
- Slavic languages: Polish, Czech, Slovak [but NOT Slovene nor
Serbo-Croatian; I don't know about other Slavic languages using the
Latin script]
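
To see which single-byte character sets can hold a given word, a small
Python sketch along these lines may help. The function name
`covering_codecs` and the candidate list are assumptions made for this
example; `mac_latin2` is Python's name for the Mac Central European
codec and `iso8859_2` is ISO Latin-2.

```python
# Candidate single-byte codecs to try, from smallest to largest coverage.
CANDIDATES = ["ascii", "mac_roman", "mac_latin2", "iso8859_2"]


def covering_codecs(text, candidates=CANDIDATES):
    """Return the candidate codecs able to encode every character of `text`."""
    covered = []
    for codec in candidates:
        try:
            text.encode(codec)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this codec
        covered.append(codec)
    return covered


print(covering_codecs("Deutsch"))   # plain letters: every candidate works
print(covering_codecs("Łódź"))      # Polish: only the Central European sets
print(covering_codecs("Čeština"))   # Czech: only the Central European sets
```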
EASY TO LOCALIZE LANGUAGES
**************************************
A. LATIN EXTENDED LANGUAGES
You can
~~EITHER: 1) make a compatible font that includes all the desired
characters (with a unique character code for the most frequent
characters, or with a zero-width diacritic that is drawn over the
preceding letter). This font must of course be given to the user;
2) make the appropriate keyboard map configuration so that input is
easy for the translator. This solution may frighten some people, but it
is easy to do, since all the needed diacritics are already there in
Western European, and creating and testing the keyboard KCHR resource
with ResEdit is a matter of hours. AND this solution DOES NOT require a
brand-new powerful computer, either to create or to use the program:
any Mac since 1984 can do that.
~~OR: you can simply use the Extended Latin subset of Unicode. All the
fonts in Mac OS X have 360 character codes, including all the characters
supposedly used by every language written in the Latin alphabet. If you
use an English keyboard, you are lucky: you have the 'Extended English'
keyboard mapping in the input language menu (not quite ergonomic, but
reasonably easy to use); otherwise, get an ergonomically designed
keyboard mapping from Apple (according to your wishes) before the
typing mistakes drive you crazy.
With the two-byte Unicode system, many characters now happen to have
two codes, since the first 256 one-byte addressable characters are all
still there. Besides, this Unicode stuff is not yet perfectly
finalized: most fonts are still incomplete and/or have blurry or
style-mismatched characters. AND MOST IMPORTANT: you need a fast,
powerful computer with lots of RAM and a disk with hundreds of
megabytes.
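
One possible reading of the 'two codes' remark (this is an
interpretation, not something stated above) is that an accented letter
can be stored in Unicode either as one precomposed code point or as a
base letter followed by a combining diacritic. A minimal Python sketch
of that situation, using the standard `unicodedata` module:

```python
import unicodedata

precomposed = "\u00e9"   # é as one code point (0xE9, same value as in Latin-1)
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# The two spellings look the same on screen but compare as different strings.
print(precomposed == decomposed)

# Normalization converts one spelling into the other.
nfc = unicodedata.normalize("NFC", decomposed)   # compose
nfd = unicodedata.normalize("NFD", precomposed)  # decompose

print(nfc == precomposed, nfd == decomposed)
print([hex(ord(ch)) for ch in nfd])
```

Normalizing text to one form before comparing or searching is the usual
way around the duplicate spellings.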
Using Unicode in 2003 to write such languages as Lithuanian, Esperanto,
Slovene, Croatian, Albanian, Romanian, Maltese or Turkish amounts to
using a whole battery of bazookas to kill a mosquito.** You could even
miss the mosquito and get some unexpected 'dommages collatéraux'
(collateral damage), as we say in French.
B. OTHER ALPHABETIC SCRIPT 'SIMPLE' LANGUAGES
I mean a) really alphabetic (not syllabic like Japanese katakana); b)
only one shape for each letter, whatever the surrounding letters
(this excludes Arabic); c) no need to change our standard
horizontal left-to-right system (this also excludes Hebrew).
Technically the case is the same as for the Latin extended languages.
You only need the appropriate font, Cyrillic for instance. Of course, if
the translator is used to a Latin alphabet, he may want a keyboard
layout ergonomically defined according to his habits.
This group includes, among other languages, Greek and the Cyrillic
alphabet group, which is in a situation pretty similar to the Latin one:
a central nucleus ('easier': Russian, Ukrainian, Bulgarian) and the
'extended', more or less 'easy' ones: Cyrillic Serbo-Croatian (Serbia,
Montenegro) and most non-Indo-European languages of the former Soviet
Union.
C. SIMPLE SYLLABIC SCRIPT LANGUAGES
I mean a) close to a one-to-one correspondence between characters and
phonemic syllables; b) no (or very few) context-sensitive shape changes;
and c) no need to change our horizontal left-to-right system. This
is exactly the case* of Japanese katakana, or hiragana for that matter.
Technically speaking, the problem is very similar to that of the
alphabetic languages: there is enough room in the 256 single-byte codes
of an Apple font to fit the whole katakana AND the Latin alphanumerics
plus frequently used punctuation and symbols. You can even find such
fonts. You may just need to make (or have made for you) an ergonomic
keyboard layout fitting your habits.
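
The claim that katakana fits next to the Latin letters in a single byte
can be checked against one existing layout. The sketch below uses
Python's `shift_jis` codec, whose single-byte range carries the JIS
X 0201 half-width katakana; this is an assumption chosen for
illustration and not necessarily the Apple font layout meant above. The
helper name `single_byte_count` is hypothetical.

```python
# Half-width katakana block, U+FF61..U+FF9F (letters plus a few marks).
HALFWIDTH_KATAKANA = [chr(cp) for cp in range(0xFF61, 0xFFA0)]


def single_byte_count(chars, codec="shift_jis"):
    """Count the characters that `codec` encodes as exactly one byte."""
    return sum(1 for ch in chars if len(ch.encode(codec)) == 1)


# Half-width "katakana" followed by Latin letters and digits.
mixed = "\uff76\uff80\uff76\uff85 ABC 123"
encoded = mixed.encode("shift_jis")

print(len(HALFWIDTH_KATAKANA), single_byte_count(HALFWIDTH_KATAKANA))
print(len(mixed), len(encoded))            # one byte per character
print(len("\u30ab".encode("shift_jis")))   # full-width katakana KA needs two
```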
.................................
[For the not so easy languages, "La suite au prochain numéro"*** as we
say in French. Well, if I am not kicked off the list first for boring
all the nice people out there with my linguistic junk.] ;-)
______________
* OK, they write HA for 'wa', but this does not deserve another
Hiroshima or Nagasaki trick, does it?
** Bill Gates's Entourage mailing program does still better: it compels
Europeans to switch to Unicode if they want to use their currency
sign, which has had an accessible character code since 1984 (the euro
sign took the place of the so-called 'currency' sign, which nobody has
used for decades).
*** 'To be continued in the next issue.'
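
Regarding the euro footnote (**): the reuse of the old 'currency' sign
slot can be observed in Python's standard codecs. This is only a sketch
of how the byte values decode today; it assumes Python's `mac_roman`
table reflects the post-euro Mac OS Roman mapping, and it says nothing
about how Entourage itself behaves.

```python
# Byte 0xDB in Mac OS Roman: formerly the generic currency sign, now the euro.
print(bytes([0xDB]).decode("mac_roman"))

# ISO 8859-1 (Latin-1) still has the generic currency sign at 0xA4...
print(bytes([0xA4]).decode("latin-1"))

# ...while ISO 8859-15 (Latin-9) replaced it with the euro sign.
print(bytes([0xA4]).decode("iso8859_15"))

# So the euro sign needs only one byte in either updated character set.
euro_mac = "\u20ac".encode("mac_roman")
euro_latin9 = "\u20ac".encode("iso8859_15")
print(euro_mac, euro_latin9)
```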