Quoted-Printable & Base64 Unicode Text in LC7
Igor de Oliveira Couto
igor at semperuna.com
Sun Jun 1 13:40:29 CEST 2014
Dear LC Gurus,
Using LC7-dp6, I’m trying to parse some raw email messages whose headers contain international characters. Headers containing non-ASCII characters should always be encoded in either base64 or ‘quoted-printable’ format. The header format then becomes:
* =?charEncoding?B?encodedString?= (for base64)
* =?charEncoding?Q?encodedString?= (for quoted-printable)
So, a “from” header might look like this: from: "=?UTF-8?B?UXVhbGljb3JwIFNhw7pkZQ==?=" <somone at example.info>
A sample using quoted-printable would be: from: =?utf-8?Q?=E2=98=85?= Brittni Seger =?utf-8?Q?=E2=98=85?= <example at example.us>
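To make the header format above concrete, here is a minimal sketch in Python (not LiveCode) that pulls the charset, the encoding flag, and the encoded string out of each encoded word. The pattern and the name find_encoded_words are illustrative assumptions, not part of any LiveCode API, and the pattern only covers the simple cases shown in the two sample headers.

```python
import re

# Assumed pattern for one encoded word: =?charEncoding?B-or-Q?encodedString?=
# It is a simplification that covers the two sample headers above.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=")

def find_encoded_words(header_value):
    """Return a (charset, encoding, payload) tuple for each encoded word found."""
    return [
        (m.group(1), m.group(2).upper(), m.group(3))
        for m in ENCODED_WORD.finditer(header_value)
    ]

sample = ("from: =?utf-8?Q?=E2=98=85?= Brittni Seger "
          "=?utf-8?Q?=E2=98=85?= <example at example.us>")
for charset, encoding, payload in find_encoded_words(sample):
    print(charset, encoding, payload)
```

The same three capture groups could be used with matchText() in LiveCode, assuming its regular-expression syntax accepts this pattern unchanged.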
Using matchText() it’s easy to extract the encoded string, but I’m having a couple of issues:
1) How to decode from “quoted-printable” to normal text? Is there a ready-made function somewhere?
2) LiveCode’s base64decode() function seems to assume that we are always dealing with ASCII text (this is with version 7.0-dp6). If I get the base64decode of "UXVhbGljb3JwIFNhw7pkZQ==" (the sender of the first example above), I get “Qualicorp Sa√∫de”, when I should be getting “Qualicorp Saúde”.
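Both issues come down to the same two steps: turn the encoded string back into raw bytes, then interpret those bytes in the charset named in the header. The following is a rough sketch of those steps in Python rather than LiveCode; decode_payload is a hypothetical helper name. The last line reproduces the garbled result by reading the UTF-8 bytes as MacRoman, which is only an assumption about what produced “Sa√∫de”.

```python
import base64
import binascii

def decode_payload(charset, encoding, payload):
    """Decode one encoded string to text (hypothetical helper, a sketch only)."""
    if encoding.upper() == "B":
        # Base64 yields raw bytes, not text.
        raw = base64.b64decode(payload)
    else:
        # Quoted-printable in headers: "=XX" is a hex byte, "_" is a space.
        raw = binascii.a2b_qp(payload, header=True)
    # Interpret the bytes in the charset the header declares.
    return raw.decode(charset)

print(decode_payload("UTF-8", "B", "UXVhbGljb3JwIFNhw7pkZQ=="))
print(decode_payload("utf-8", "Q", "=E2=98=85"))

# Assumption: reading the same UTF-8 bytes as MacRoman gives the garbled form.
print(base64.b64decode("UXVhbGljb3JwIFNhw7pkZQ==").decode("mac_roman"))
```

The point of the sketch is that the base64 step is not wrong in itself: it returns the bytes C3 BA for “ú”, and the garbling appears only when those two bytes are displayed as two single-byte characters instead of being decoded as UTF-8.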
I guess that somehow I should be telling LiveCode that these characters are UTF-8. What function(s) do we use for converting between encodings in LiveCode 7? The functions that we would have used in previous versions (uniEncode, uniDecode) are now deprecated, so what should we use in a situation like this? And shouldn’t LC7 assume by default that everything is Unicode?
Any guidance would be much appreciated.