character sets- missing feature?

Alex Rice alex at mindlube.com
Wed Oct 1 09:31:00 CDT 2003


On Wednesday, October 1, 2003, at 03:14  AM, Tuviah Snyder wrote:

> Yeah support for polish as a language will be in 2.1.1. So you'll be
> able to use
>
> set the unicodetext of fld 1 to uniencode(sometext,"polish")

Tuviah, this will be superb- I beg you to implement it for 2.1.1. This 
is the only capability that Rev doesn't offer for my current project.

But I believe the text encoding name will have to be passed as a 
parameter, as Robert Brenstein notes, or at least as an optional 
parameter, unless you have an extremely smart encoding guesser. Even 
TextEdit.app does not correctly guess the encoding of this ISO-8859-2 
file; I have to pick the encoding manually. Cocoa has a pretty good 
text encoding framework, so if it can't guess the encoding, it must be 
a hard problem.
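
To illustrate why guessing is hard, here is a minimal sketch in Python 
(not Rev code, just the quickest way I can show it). The helper name 
decode_text and its default encoding are my own assumptions for the 
example. The point is that every byte of an ISO-8859-2 file is also a 
legal ISO-8859-1 byte, so decoding with the wrong encoding never fails, 
it just quietly produces the wrong characters.

```python
# The Polish word "żółć" as it would be stored in an ISO-8859-2 file.
raw = "żółć".encode("iso-8859-2")   # bytes BF F3 B3 E6

# Decoding with the right encoding recovers the original text.
as_latin2 = raw.decode("iso-8859-2")

# Decoding with ISO-8859-1 also "succeeds", because each byte maps to
# some Latin-1 character, but the result is different text.
as_latin1 = raw.decode("iso-8859-1")


def decode_text(data: bytes, encoding: str = "iso-8859-1") -> str:
    """Hypothetical helper: the caller names the encoding explicitly.

    The default is only an assumption for this sketch; the caller can
    override it for files known to use another ISO-8859-x variant.
    """
    return data.decode(encoding)


if __name__ == "__main__":
    print(as_latin2)
    print(as_latin1)
    print(decode_text(raw, "iso-8859-2"))
```

Since neither decode raises an error, a guesser would have to look at 
the text itself (letter frequencies, dictionaries) to tell the two 
apart, which is why an explicit encoding parameter seems safer to me.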

For reference, I'm using Project Gutenberg, which has approximately 
8,000 files. The books are text files, mostly 7-bit ASCII, some with 
8-bit characters. I think the 8-bit ones are where the ISO encodings 
come into play.

There are probably hundreds of files that have encodings other than the 
ISO-8859 variant that Rev supports. Of those, maybe a few are Unicode 
already, but most are like this Polish file, using one of the various 
ISO-8859-x variants.
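
A rough way to sort such a collection into those groups might look like 
the sketch below (Python again; the function name classify and its 
labels are made up for the example). It only separates pure 7-bit 
ASCII, valid UTF-8, and "something 8-bit". It does not detect UTF-16, 
and it cannot tell which ISO-8859-x variant an 8-bit file uses; that 
part still has to come from the user.

```python
def classify(data: bytes) -> str:
    """Hypothetical helper: coarse encoding class of a file's bytes."""
    # Pure 7-bit ASCII: no byte has the high bit set.
    if all(b < 0x80 for b in data):
        return "ascii"
    # High-bit bytes that form valid UTF-8 sequences are very likely UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some 8-bit legacy encoding (for example an ISO-8859-x variant);
        # which one cannot be determined from validity alone.
        return "8-bit legacy"


if __name__ == "__main__":
    print(classify(b"plain text"))
    print(classify("żółć".encode("utf-8")))
    print(classify("żółć".encode("iso-8859-2")))
```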

Alex Rice <alex at mindlube.com> | Mindlube Software | http://mindlube.com

what a waste of thumbs that are opposable
to make machines that are disposable  -Ani DiFranco



