Why do I still need MacToISO, when working with UTF-8?

Matthias Rebbe matthias_livecode_150811 at m-r-d.de
Mon Jan 16 12:49:19 EST 2017


Mark,

thanks again for your explanations.

That explains some strange things I have seen here in the past… ;)

Matthias

Matthias Rebbe
Bramkampsieke 13
32312 Lübbecke
Tel	+49 5741 310000
    	+49 160 5504462
Fax: +49 5741 310002
eMail: matthias at m-r-d.de

BR5 Konverter - BR5 -> MP3 <http://matthiasrebbe.eu/portfolio/produkte/brx/>

> On 16.01.2017 at 18:45, Mark Waddingham via use-livecode <use-livecode at lists.runrev.com> wrote:
> 
>>> On 16.01.2017 at 18:30, Mark Waddingham via use-livecode <use-livecode at lists.runrev.com> wrote:
>>> Sure - here is how I'd slightly adjust Tiemo's code:
>>> put fld "name" into myName
>>> -- ...
>>> open file myFile for binary write
>>> write textEncode(myName, "utf8") to file myFile
>>> close file myFile
>>> -- ...
>>> open file myFile for binary read
>>> read from file myFile until EOF
>>> close file myFile
>>> put textDecode(it, "utf8") into myName
>> I always thought that reading a text file in binary mode would give me a
>> string with the same encoding and line endings.
> 
> When you read a file in binary mode, what you get is binary data, *not* text - i.e. it is just a sequence of bytes. The engine cannot tell what the bytes represent just by looking at them, so you have to convert them explicitly - in this case we convert the sequence of bytes to text by interpreting them as UTF-8.
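> 
> A minimal sketch of that difference, assuming LiveCode 7 or later. The variable names are made up for illustration, and numToCodepoint(252) is only used to build a "ü" without depending on how the script itself is stored:
> 
>    put "L" & numToCodepoint(252) & "bbecke" into tName   -- "Lübbecke" as text
>    put textEncode(tName, "utf8") into tBytes              -- binary data, no longer text
>    put the number of chars in tName into tCharCount       -- 8 characters
>    put the number of bytes in tBytes into tByteCount      -- 9 bytes, as "ü" takes two bytes in UTF-8
>    put textDecode(tBytes, "utf8") into tRoundTrip         -- text again, equal to tName
> 
> The counts differ because tName is a sequence of characters while tBytes is only the UTF-8 byte sequence that represents them.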
> 
> One of the biggest changes from 6 to 7 is that binary strings and text strings are no longer the same thing.
> 
> Prior to 7, the engine didn't really 'know' anything about Unicode - the field did to a certain degree, but nothing else - and it assumed that binary strings and text strings were the same thing. Indeed, on Mac the engine would assume that a binary string could be treated as a MacRoman encoded string (as MacRoman is one byte, one char); and on Windows/Linux it would assume that a binary string could be treated as a Latin-1 encoded string (also a one byte, one char encoding).
> 
> This equivalence has been carried over from 6 to 7 - which is why stacks written in 6 work in 7 exactly as they did in 6. Specifically, there is an implicit automatic conversion between binary strings and text strings using the platform encoding:
> 
>    put <binary data> into tVar
>    put "foobar" after tVar
> 
> In the second line here, the engine will first convert tVar to a text string (assuming MacRoman encoding on Mac) then append "foobar".
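> 
> To illustrate, this is roughly what that implicit conversion would do to UTF-8 bytes that were never decoded. The variable names are hypothetical, and the garbled results in the comments are what the MacRoman and Latin-1 code tables predict, not something checked on every platform:
> 
>    put textEncode("L" & numToCodepoint(252) & "bbecke", "utf8") into tBytes
>    put tBytes into tWrong
>    put " (copy)" after tWrong   -- bytes C3 BC are treated as two platform-encoded chars
>    -- expected on Mac (MacRoman):          L√ºbbecke (copy)
>    -- expected on Windows/Linux (Latin-1): LÃ¼bbecke (copy)
>    put textDecode(tBytes, "utf8") into tRight
>    put " (copy)" after tRight   -- Lübbecke (copy)
> 
> Decoding explicitly before treating the data as text avoids relying on the platform encoding at all.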
> 
>> So when I binary read UTF-8 files, I still have to textDecode them from UTF-8?
> 
> Yes - because if you read something as binary, then it is just that - binary - it has no structure and is just a sequence of bytes.
> 
> A perhaps more obvious example is that you have to explicitly decompress data which has been compress'd and explicitly arrayDecode data which has been arrayEncode'd. When it is just data, the engine doesn't know what it could be, so the code processing it has to explicitly specify a conversion.
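> 
> As a sketch of the same pattern (the variable names and sample values are assumptions for illustration only), each encoding step is undone by naming its explicit counterpart, in reverse order:
> 
>    put "some text" into tText
>    put compress(textEncode(tText, "utf8")) into tPacked      -- text -> UTF-8 bytes -> compressed bytes
>    put textDecode(decompress(tPacked), "utf8") into tText2   -- compressed bytes -> UTF-8 bytes -> text
> 
>    put "Matthias" into tArray["name"]
>    put arrayEncode(tArray) into tData      -- array -> binary data
>    put arrayDecode(tData) into tArray2     -- binary data -> array
> 
> In both cases nothing in the data itself says which conversion to apply, so the script has to state it.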
> 
> Warmest Regards,
> 
> Mark.
> 
> -- 
> Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
> LiveCode: Everyone can create apps
> 
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode



