Importing unicode UTF8 text files - followup

Peter TB Brett peter.brett at livecode.com
Wed Aug 19 15:34:49 EDT 2015


On 2015-08-19 21:09, J. Landman Gay wrote:
> Just to follow up: When importing foreign language files, "file" isn't
> working. Non-ascii characters show up as question marks and textDecode
> does nothing.
> 
> Using "binfile" and then textDecoding does import correctly but then
> text chunking by lines fails. Line endings are imported as 2 bytes
> using byteToNum 13,10 (in that order) which isn't the line ending
> standard for any OS. I have to replace those specifically.
> 
> These are text files saved as UTF-8, created on Mac OS X and imported
> on Mac OS X. I'm using LC 7.0.6.

Hmm, this all sounds quite problematic.


Obvious thing to check: are you sure your files are valid UTF-8?  The 
following command will print "0" if the file is valid UTF-8 and "1" 
otherwise:

     iconv -f UTF-8 -t UTF-8 your_file > /dev/null ; echo $?
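
If it helps, the same check can be wrapped in a small shell helper, along
with a count of carriage-return bytes to see which line endings the file
really contains.  This is only a rough sketch: the function names
check_utf8 and count_cr are made up for illustration, and it assumes a
POSIX-style shell with iconv, tr and wc available.

```shell
# Hypothetical helper: print "valid" if the file decodes as UTF-8,
# "invalid" otherwise (iconv exits non-zero on a bad byte sequence).
check_utf8() {
    if iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1; then
        echo "valid"
    else
        echo "invalid"
    fi
}

# Hypothetical helper: count the CR (byte 13) characters in the file.
# LC_ALL=C makes tr work on raw bytes; the final tr strips the padding
# that some wc implementations put in front of the number.
count_cr() {
    LC_ALL=C tr -cd '\r' < "$1" | wc -c | tr -d ' '
}
```

Running "check_utf8 your_file" prints "valid" or "invalid", and
"count_cr your_file" prints the number of CR bytes.  A non-zero count
would suggest the file was saved with CRLF (or bare CR) line endings
rather than LF.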


If the files are valid and you have a test case that you don't mind 
sharing, could you please file a bug report and add me (e-mail address 
below) to the Cc list?  Alternatively, please e-mail me directly.  If you 
could test with LiveCode 7.0.1-rc-1, that would also be quite helpful.

I *definitely can't* guarantee a quick fix, but it's possible something 
obvious is going wrong or we can find a quick workaround.

                                      Peter

-- 
Dr Peter Brett <peter.brett at livecode.com>
LiveCode Engine Development Team




