curlyquotes, character sets, livecode, and english

Dar Scott dsc at swcp.com
Sun May 26 18:03:58 EDT 2013


I think I may have misunderstood the problem.

If your db wants UTF8 and your text is in the Mac character set (MacRoman), then maybe you can convert it from Mac to UTF8.
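
A minimal sketch of that conversion, written in Python rather than LiveCode just to show the byte-level effect. It assumes the Mac text is MacRoman (Python's "mac_roman" codec); the function name mac_to_utf8 is made up for illustration.

```python
def mac_to_utf8(mac_bytes: bytes) -> bytes:
    """Re-encode MacRoman bytes as UTF-8 (assumes the input really is MacRoman)."""
    # Decode the native Mac bytes to Unicode text, then encode that text as UTF-8.
    return mac_bytes.decode("mac_roman").encode("utf-8")


# Byte 213 (0xD5) is the right curly quote in MacRoman.
sample = b"It\xd5s"
converted = mac_to_utf8(sample)
print(converted)
```

The single Mac byte 213 turns into a three-byte UTF8 sequence, which is what a UTF8 database expects to see.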

If you say a Mac string is UTF8 (when it is not) and something checks it, then the checker will interpret numToChar(213) as the first byte of a two-byte sequence.  If the next byte is not in the range for continuation bytes in a UTF8 multibyte sequence (128 to 191), then the UTF8 check will fail.
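
The failure described above can be reproduced with a rough sketch like this one, again in Python and not LiveCode. The helper names is_continuation and is_valid_utf8 are hypothetical; the check itself leans on Python's built-in strict UTF-8 decoder.

```python
def is_continuation(byte: int) -> bool:
    """True if the byte is in the UTF-8 continuation range, 128 to 191."""
    return 128 <= byte <= 191


def is_valid_utf8(data: bytes) -> bool:
    """True if a strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 213 followed by "s" (115): 115 is not a continuation byte, so the check fails.
mac_text = bytes([73, 116, 213, 115])   # "It", byte 213, "s"
print(is_continuation(115), is_valid_utf8(mac_text))

# 213 followed by 128 happens to form a well-formed two-byte sequence.
print(is_continuation(128), is_valid_utf8(bytes([213, 128])))
```

Note that the second case passes the check even though it would decode to an unrelated character, so passing validation does not prove the text was really UTF8.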

ASCII is valid UTF8, so if you limit yourself to 7-bit characters, then that should be OK.
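
A small sketch of the 7-bit test, in Python for illustration only; is_seven_bit is a made-up name. Any string that passes it reads the same whether it is treated as ASCII, Mac, or UTF8.

```python
def is_seven_bit(data: bytes) -> bool:
    """True if every byte is below 128, i.e. plain 7-bit ASCII."""
    return all(byte < 128 for byte in data)


straight = b"It's plain ASCII"      # straight apostrophe, byte 39
curly = bytes([73, 116, 213, 115])  # Mac curly quote, byte 213

print(is_seven_bit(straight), is_seven_bit(curly))

# A 7-bit string decodes to the same text under ASCII, MacRoman, and UTF-8.
same = straight.decode("ascii") == straight.decode("mac_roman") == straight.decode("utf-8")
print(same)
```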

Dar



On May 26, 2013, at 3:41 PM, Dr. Hawkins wrote:

> On Sun, May 26, 2013 at 10:54 AM, Dar Scott <dsc at swcp.com> wrote:
> 
>> UTF8 is one of the "languages" of uniEncode and uniDecode functions.
>> Maybe you can convert to and from UTF8 as you need.  Or pull unicode out
>> of the field and convert that.
>> 
>> Character 213 is the first of a two byte sequence in UTF8, so a bubble-gum
>> and tinfoil solution for that lone character would be to force a valid byte
>> behind it and then remove it when you need.  This is a bit ugly (I am hesitant
>> to mention it) but it might have some advantages in your application.
>> 
> 
> I'm trying to figure out where they're even coming from!
> 
> The 213 wasn't utf, but either a mac curlyquote or section symbol (§).
> 
> I don't *think* that two byte codes should be coming out of a normal mac
> set to english . . .
> 
> -- 
> Dr. Richard E. Hawkins, Esq.
> (702) 508-8462




