Dos Ascii to Windows Ansi or Mac Roman

Dar Scott dsc at swcp.com
Mon Aug 6 16:15:37 EDT 2012


It just dawned on me what I wrote in the last paragraph.  Those are the only characters in common in all three character sets, so conversion is free if you don't need weird control characters.
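
A minimal sketch of checking whether a file stays inside that shared subset, written in Python rather than LiveCode purely for illustration.  The names COMMON_CODES, is_common_subset and first_offender are hypothetical, and the set of safe codes (Tab, LF, CR, space, and 33 to 126) is taken from the earlier message quoted below.

```python
# Codes assumed to mean the same thing in ASCII, Windows "Ansi" and Mac Roman:
# Tab (9), LF (10), CR (13), space (32) and the printables 33..126.
COMMON_CODES = frozenset({9, 10, 13}) | frozenset(range(32, 127))


def is_common_subset(data: bytes) -> bool:
    """Return True if every byte of data is in COMMON_CODES."""
    return all(b in COMMON_CODES for b in data)


def first_offender(data: bytes):
    """Return (offset, byte value) of the first byte outside the subset, or None."""
    for offset, b in enumerate(data):
        if b not in COMMON_CODES:
            return offset, b
    return None
```

If is_common_subset returns True, the bytes can be passed through unchanged; otherwise first_offender points at the first byte that needs attention.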

Unless you don't really mean ASCII, but mean some DOS code page.
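
If the text really is in a DOS code page, one way to convert it is to go through Unicode: decode the bytes with the source code page, then encode them with the target one.  The sketch below is Python, not LiveCode, and assumes the source is Code Page 437 (the one named in Devin's message below); the function name convert_dos_text is hypothetical.  It uses Python's standard cp437, cp1252 and mac_roman codecs.  Characters with no equivalent in the target set, such as the DOS box-drawing characters, come out as "?".

```python
def convert_dos_text(data: bytes, target: str = "cp1252", source: str = "cp437") -> bytes:
    """Re-encode bytes from a DOS code page into another single-byte character set.

    source and target are Python codec names; cp437 is only an assumed default.
    """
    # Every byte value is defined in cp437, so this decode cannot fail.
    text = data.decode(source)
    # errors="replace" substitutes "?" for characters the target set lacks.
    return text.encode(target, errors="replace")


if __name__ == "__main__":
    dos_bytes = b"Gr\x81\xe1e"  # "Gruesse" with u-umlaut and sharp s, as stored in CP 437
    print(convert_dos_text(dos_bytes))                      # Windows CP 1252
    print(convert_dos_text(dos_bytes, target="mac_roman"))  # Mac Roman
```

Bytes in the shared subset pass through unchanged, so the same routine is safe to run on plain ASCII files as well.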

(In Europe ASCII is called USASCII, which I consider to be disrespecting the work some Americans did in computers, communications and early character-set standardization, but it is in standards that way, so I'm fighting a losing battle if I complain.  To us old-timers, ASCII is fundamental.  I prefer to honor that history and put qualifiers on all enhancements and variations.  Even 35 years ago I got onto Bill about his careless speech about character sets.  Character sets have been a mess.  I think Unicode is wonderful.  Well, even Microsoft can mess that up.)

Dar


On Aug 6, 2012, at 1:57 PM, Dar Scott wrote:

> Quite right.  Characters coded 128 and above are not going to match.  And one would question some of the control characters; they have images in Windows "Ansi" code pages.
> 
> ASCII is a special case of UTF-8, so the uniEncode and uniDecode functions might be worth exploring.  You can treat ASCII as UTF-8 because ASCII is 7-bit, and every byte that UTF-8 uses for a character outside ASCII has the msb set.  However, the only applicable character set in the list is "Ansi", and who knows what that really means.  It might mean the host character set, either Mac or some Windows code page.
> 
> Long ago I created a rash of Unicode bug entries and I think I had converters then, so they must not be too hard to create.  
> 
> If you are looking at LF, CR, Tab, space, and printables (characters coded 33 to 126) then they are the same in almost all applicable character sets.  But as soon as you type a dingbat on your Mac, you have more work.  (By CR I mean the ASCII CR, not the newline used in LiveCode, which is LF.)
> 
> Dar
> 
> 
> 
> On Aug 6, 2012, at 1:13 PM, Devin Asay wrote:
> 
>> Actually, only the first 128 characters are reliably consistent. The upper 128 characters vary between DOS (Code Page 437) and the "standard" encoding ISO-8859-1 (which, in turn, is slightly different from Windows CP 1252). The only built-in functions in LC are the MacToISO and ISOtoMac functions, so they won't help you here. You'll have to write your own function to replace upper-ASCII characters. There is a handy comparison of the various common ASCII character sets at <http://www.alanwood.net/demos/charsetdiffs.html>. Wikipedia entries for the various encodings can also be helpful, as they have complete character charts.
>> 
>> HTH
>> 
>> Devin
>> 
>> 
>> On Aug 6, 2012, at 12:10 PM, Bob Sneidar wrote:
>> 
>>> The first 256 characters should be the same in both. Do you have characters that exceed the normal ASCII characters? 
>>> 
>>> Bob
>>> 
>>> 
>>> On Aug 6, 2012, at 11:04 AM, Matthias Rebbe wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I have an ASCII text, created under DOS, which I have to convert at least to Windows ANSI. Because I will have to do this very often in the future, I wanted to create a routine in LiveCode that does the conversion for me. Can this be done with a LiveCode command or function, or do I have to write a script to replace the wrongly displayed characters?
>> 
>> 
>> Devin Asay
>> Humanities Technology and Research Support Center
>> Brigham Young University
>> 
>> 
>> 
>> 
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
> 
> 