curlyquotes, character sets, livecode, and english
Dar Scott
dsc at swcp.com
Sun May 26 18:03:58 EDT 2013
I think I may have misunderstood the problem.
If your db wants UTF8 and your text is in the Mac character set, then maybe you can convert it from Mac to UTF8.
If you say a Mac string is UTF8 (when it is not) and something checks, then it will interpret numToChar(213) as the first byte of a two-byte sequence. If the next byte is not in the right range for continuation bytes in a UTF8 multibyte sequence (128 to 191) then UTF8 checking will fail.
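As a rough sketch of that failure, here it is in Python rather than LiveCode. The sample bytes and the helper name is_valid_utf8 are made up for illustration; "mac_roman" is Python's codec name for the classic Mac character set. A lone byte 213 followed by an ASCII letter is rejected by a strict UTF8 decoder, while converting from the Mac character set first gives valid UTF8:

```python
# Hypothetical sample: the word "don't" with a Mac curly apostrophe (byte 213).
mac_bytes = bytes([100, 111, 110, 213, 116])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Byte 213 (0xD5) announces a two-byte sequence, but the next byte
# (116, the letter "t") is outside the continuation range 128..191.
claimed_utf8_ok = is_valid_utf8(mac_bytes)

# Converting instead of relabeling: decode as Mac, then re-encode as UTF-8.
converted = mac_bytes.decode("mac_roman").encode("utf-8")
converted_ok = is_valid_utf8(converted)

print(claimed_utf8_ok, converted_ok, converted)
```

The point of the sketch is only that relabeling Mac bytes as UTF8 and converting them to UTF8 are different operations; the second one changes the bytes.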
ASCII is valid UTF8, so if you limit yourself to 7-bit characters (0 to 127), then that should be OK.
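A small Python sketch of that check, assuming you can get at the raw bytes before they go to the db (is_seven_bit and the sample string are hypothetical names for illustration):

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit range 0..127."""
    return all(b < 128 for b in data)


sample = b"plain 'straight' quotes"

# For 7-bit data, decoding as ASCII and as UTF-8 gives the same text.
same_text = sample.decode("ascii") == sample.decode("utf-8")

# A Mac curly quote (byte 213) fails the 7-bit test.
has_curly = not is_seven_bit(bytes([213]))

print(is_seven_bit(sample), same_text, has_curly)
```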
Dar
On May 26, 2013, at 3:41 PM, Dr. Hawkins wrote:
> On Sun, May 26, 2013 at 10:54 AM, Dar Scott <dsc at swcp.com> wrote:
>
>> UTF8 is one of the "languages" of uniEncode and uniDecode functions.
>> Maybe you can convert to and from UTF8 as you need. Or pull unicode out
>> of the field and convert that.
>>
>> Character 213 is the first of a two byte sequence in UTF8, so a bubble-gum
>> and tinfoil solution for that lone character would be to force a valid byte
>> behind it and then remove it when you need. This is a bit ugly (I am hesitant
>> to mention it) but it might have some advantages in your application.
>>
>
> I'm trying to figure out where they're even coming from!
>
> The 213 wasn't utf, but either a mac curlyquote or section symbol (§).
>
> I don't *think* that two byte codes should be coming out of a normal mac
> set to english . . .
>
> --
> Dr. Richard E. Hawkins, Esq.
> (702) 508-8462