curlyquotes, character sets, livecode, and english
Dar Scott
dsc at swcp.com
Sun May 26 13:54:51 EDT 2013
UTF8 is one of the "languages" accepted by the uniEncode and uniDecode functions. Maybe you can convert to and from UTF8 as you need, or pull the unicode out of the field and convert that.
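To make the convert-to-UTF8 idea concrete, here is a minimal sketch in Python rather than LiveCode. The helper name `to_utf8` is made up for illustration, and it assumes you already know which legacy character set the pasted bytes came from (Mac Roman or Windows-1252 here), which is the hard part of the original question.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8.

    `source_encoding` is an assumption supplied by the caller; this sketch
    does not try to guess where the text was pasted from.
    """
    return raw.decode(source_encoding).encode("utf-8")


# The right curly quote is byte 213 in Mac Roman and byte 146 in Windows-1252.
mac_quote = bytes([213])
win_quote = bytes([146])

# Both legacy bytes end up as the same three-byte UTF-8 sequence.
mac_as_utf8 = to_utf8(mac_quote, "mac_roman")
win_as_utf8 = to_utf8(win_quote, "cp1252")
```

Once both platforms are normalized this way, the database only ever sees one representation of the curly quote, whichever machine the text was pasted on.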
Character 213 is the first byte of a two-byte sequence in UTF8, so a bubble-gum and tinfoil solution for that lone character would be to force a valid byte behind it and then remove it when you need. This is a bit ugly (I am hesitant to mention it) but it might have some advantages in your application.
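The following Python sketch illustrates why a lone byte 213 is rejected as UTF8, and why putting a continuation byte behind it makes the pair pass validation. The function names are hypothetical, and this is only a demonstration of the byte-level rule, not a recommendation of the workaround.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return how many bytes a UTF-8 sequence starting with first_byte needs.

    Returns 0 for bytes that cannot start a sequence (continuation bytes
    and values that never appear in valid UTF-8).
    """
    if first_byte < 0x80:
        return 1
    if 0xC2 <= first_byte <= 0xDF:
        return 2
    if 0xE0 <= first_byte <= 0xEF:
        return 3
    if 0xF0 <= first_byte <= 0xF4:
        return 4
    return 0


def is_valid_utf8(data: bytes) -> bool:
    """Report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


lone = bytes([213])          # lead byte with nothing after it
padded = bytes([213, 0x80])  # lead byte followed by a continuation byte

needs = utf8_sequence_length(213)
lone_ok = is_valid_utf8(lone)
padded_ok = is_valid_utf8(padded)
```

Note that the padded pair is valid UTF8 but decodes to an unrelated character, not a curly quote, so the padding byte would have to be stripped again before the text is displayed.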
Dar
On May 26, 2013, at 11:30 AM, Dr. Hawkins wrote:
> I can't believe that this one flummoxed me as long as it did.
>
> SQLite, like the honeybadger, just don't care . . .
>
> I got hung up on a curlyquote.
>
> I'll definitely have people pasting in from whatever sources, possibly the
> wrong one for the platform (ever read from mac or unix a web page written
> by someone who thinks that MS word is a standard?).
>
> There is no possibility of my application ever being used in a language
> other than English.
>
> For that matter, there is no possibility of it being used in a non-US
> country.
>
> postgres chokes on character 213, as an illegal utf8.
>
> The simple solution seems to be to filter any input based on the host OS,
> turning it into utf8.
>
> But what about the mac user who pastes from ms data, or the ms user who
> pastes from a webpage?
>
> Is there a "sane" way to turn english text into a single usable format?
>
> I'd be tempted to go pure 7 bit ascii, but there's enough names with extra
> characters it doesn't support to make this a non-starter.
>
> --
> Dr. Richard E. Hawkins, Esq.
> (702) 508-8462