Diacritical Marks, Cyrillic Encodings, and related "poo"

Richmond Mathewson geradamas at yahoo.com
Wed Dec 5 10:17:46 EST 2007

Malte Brill wrote:

"that's what I fear; however, it is not what the docs say:

"Uppercase letters, including special characters with
marks, are converted to the lowercase equivalents. All
characters, including lowercase letters, numbers,
punctuation, and special characters with no upper or
lower case, are left unchanged by the toLower function."

What gives?"

These 'give':

1. Anglocentric computer world.

2. The standard codepage listings do not have ø marked
as the lower case of Ø. I am sorry if that offends
the Scandinavians who use an O + umlaut, but to make
an Ö I have to hit 2 different keys: one for the
umlaut and another for the O. And That Is The Rub!

To produce an ö and an Ö the first keyDown (i.e. to
produce the umlaut) is the same, followed by a
'standard' O.

Or, as Mark Wieder put it more succinctly the other
day: "Welcome to Scotland".
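
To make the difference concrete, here is a rough sketch in Python (not Revolution, and not a claim about how toLower is implemented). It compares a case conversion that only knows A-Z with one that uses the full Unicode case tables, and checks which characters Unicode treats as "base letter + mark". The function names are made up for this illustration.

```python
import unicodedata


def ascii_only_lower(text: str) -> str:
    """Lowercase A-Z only; every other character is passed through unchanged."""
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c
        for c in text
    )


def unicode_lower(text: str) -> str:
    """Lowercase using Python's built-in Unicode case mapping."""
    return text.lower()


def is_base_plus_mark(ch: str) -> bool:
    """True if the character canonically decomposes into more than one code point."""
    return len(unicodedata.normalize("NFD", ch)) > 1


sample = "BØRGE ÖL"
ascii_result = ascii_only_lower(sample)    # Ø and Ö are left as they were
unicode_result = unicode_lower(sample)     # Ø and Ö are lowercased too
```

In this sketch Ö decomposes into O plus a combining diaeresis, which matches the two-key way of typing it, whereas Ø is a single letter with no such decomposition. An Anglocentric, A-Z-only conversion leaves both of them alone.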


Had a very 'sexy' week, as I had to extract 150 Microsoft
Word for DOS documents off an old DOS disk (enslaved
it to a Pentium 3 running Ubuntu 5 point something)
and convert them into Open Office format, and just to
spice things up they were written in Bulgarian using a
non-standard Cyrillic encoder written by a man who is
now dead.

Open Office 2.3.0 on my Macintosh actually managed to
convert about 75% of the text using a Word for
DOS/Russian converter - so obviously the dead
programmer based his Bulgarian widget on that.
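
A partial conversion like this often means the unknown encoding shares most of its table with a known one. As a rough sketch in Python (not what Open Office does internally), one could decode with a known DOS Cyrillic code page and override only the bytes that come out wrong. The choice of cp866 as the base and the contents of the override table are assumptions for illustration only. The real byte values used by the Bulgarian encoder are not known here and would have to be worked out by comparing garbled output against a passage whose text is known.

```python
def decode_with_overrides(data: bytes, base: str = "cp866", overrides=None) -> str:
    """Decode single-byte text with a base code page, replacing selected bytes.

    `overrides` maps a byte value (0-255) to the character it should become.
    Bytes not listed fall back to the base code page.
    """
    overrides = overrides or {}
    out = []
    for b in data:
        if b in overrides:
            out.append(overrides[b])
        else:
            out.append(bytes([b]).decode(base, errors="replace"))
    return "".join(out)


# Plain cp866 text: the bytes for the Russian word "Привет".
plain = bytes([0x8F, 0xE0, 0xA8, 0xA2, 0xA5, 0xE2])
plain_text = decode_with_overrides(plain)

# HYPOTHETICAL override: pretend the unknown encoder stores "р" at 0xB0,
# where cp866 has a box-drawing character instead.
hypothetical_overrides = {0xB0: "р"}
patched = bytes([0x8F, 0xB0, 0xA8, 0xA2, 0xA5, 0xE2])
patched_text = decode_with_overrides(patched, overrides=hypothetical_overrides)
```

The override table can be grown one byte at a time as wrong characters are spotted in the converted documents, which is easier to check than writing a whole new code page in one go.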

A friend of mine is trying to "chew" all the documents
through a VBA macro on Office 97 under Windows
Millennium, but it is going to be a very long night.

Anybody with bright ideas would feature in my prayers
of thanks for many weeks!

sincerely, Richmond Mathewson


A Thorn in the flesh is better than a failed Systems Development Life Cycle.
