chars changed in a fld
Monte Goulding
monte at sweattechnologies.com
Sat Nov 30 17:52:42 EST 2013
On 01/12/2013, at 9:02 AM, Yves COPPE <yvescoppe at skynet.be> wrote:
> I mean with a LiveCode script!
> The link you post is another computer language … I don’t understand this language
> I hope LC 7 will help me if you can’t …
What the answer on SO says is that it's quite reliable (particularly if your string is long) to just check whether the data is valid UTF8. That's why I posted the Wikipedia article, which discusses invalid byte sequences in UTF8. So, without reading too deeply into it and while coding in an email client:
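function ValidUTF8 pString
   -- compare bytes exactly; a caseless match could treat some high bytes as equal
   set the caseSensitive to true
   -- bytes 192 and 193 never appear in valid UTF8
   repeat with tCharNum = 192 to 193
      if numToChar(tCharNum) is in pString then return false
   end repeat
   -- neither do bytes 245 to 255
   repeat with tCharNum = 245 to 255
      if numToChar(tCharNum) is in pString then return false
   end repeat
   return true
end ValidUTF8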
Now, just because it's valid UTF8 doesn't mean it's definitely UTF8. However, some editors write the Unicode byte order mark at the start of the file, and given that UTF8 is relatively likely these days, you might do something like this:
function IsUTF8 pString
   -- compare bytes exactly rather than caselessly
   set the caseSensitive to true
   -- check for byte order mark
   if char 1 to 3 of pString is numToChar(239)&numToChar(187)&numToChar(191) then
      return true
   else
      -- here we are having an educated guess that it's UTF8
      -- (a pure ASCII string has no bytes above 127, so it passes this check too)
      return ValidUTF8(pString)
   end if
end IsUTF8
It all gets more complicated, though, if your file could be UTF16 or UTF32, or even modified UTF8 or something else. There are lots of options, and I'm not sure how smart the engine will be about this just yet. There are libraries for this stuff, so maybe they will incorporate an appropriately licensed one: https://code.google.com/p/uchardet/
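For the UTF16 and UTF32 cases, the byte order mark check could be extended along the following lines. This is only a rough sketch: GuessEncodingFromBOM is a hypothetical name, it assumes pData holds the raw bytes of the file (for example read via a binfile: URL), and it assumes the byte chunk and numToByte are available in your engine version. It says nothing about files that have no byte order mark.

```livecode
-- Hypothetical helper: guess the encoding from a leading byte order mark.
-- Returns empty when no known mark is found.
function GuessEncodingFromBOM pData
   local tFF, tFE, tNul
   -- compare bytes exactly rather than caselessly
   set the caseSensitive to true
   put numToByte(255) into tFF
   put numToByte(254) into tFE
   put numToByte(0) into tNul
   -- test the 4-byte UTF32 marks first, because the UTF32 little-endian
   -- mark begins with the same two bytes as the UTF16 little-endian mark
   if byte 1 to 4 of pData is (tFF & tFE & tNul & tNul) then return "UTF-32LE"
   if byte 1 to 4 of pData is (tNul & tNul & tFE & tFF) then return "UTF-32BE"
   if byte 1 to 3 of pData is (numToByte(239) & numToByte(187) & numToByte(191)) then return "UTF-8"
   if byte 1 to 2 of pData is (tFF & tFE) then return "UTF-16LE"
   if byte 1 to 2 of pData is (tFE & tFF) then return "UTF-16BE"
   return empty
end GuessEncodingFromBOM
```

One known ambiguity: a UTF16 little-endian file whose first character after the mark is a null would be reported as UTF32 here, so treat the result as a guess and fall back to something like IsUTF8 when it returns empty.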
Cheers
--
Monte Goulding
M E R Goulding - software development services
mergExt - There's an external for that!