Getting Kanji from a .csv file

Dar Scott dsc at swcp.com
Fri Jun 7 00:45:15 EDT 2013


I don't know what characters the field might throw away.  So, putting the file into the field and then modifying the field seems scary to me.  Maybe all the data is there, but maybe not.  

I'd set the field kinda like this:

set the unicodeText of field "Processed File" to the uniEncode( URL ("file:"&kFile1), "UTF8")

Or something like that; just not putting it into the field first.  

And then save the field much like this:

put uniDecode( the unicodeText of field "Processed File", "UTF8" ) into URL ("file:"&kFile2)

You can use "file:" with UTF-8.  No ghost ASCII CR or LF will show up in the representation of any characters other than CR and LF.  
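The reason is a property of UTF-8 itself: every byte of a multi-byte character is 0x80 or above, so the byte values for CR (0x0D) and LF (0x0A) can only come from real CR and LF characters. Here is a small sketch to check that. It is in Python, not LiveCode, purely because Python makes the raw bytes easy to look at; the sample text and the function name are made up for illustration.

```python
# Illustration only: Python, not LiveCode. Sample text and names are assumptions.

def multibyte_bytes(text):
    """Return the UTF-8 bytes that belong to non-ASCII characters in text."""
    out = bytearray()
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1:  # multi-byte sequence, i.e. a non-ASCII character
            out.extend(encoded)
    return bytes(out)

# CSV-like text with Kanji, kana and real CR/LF line endings
sample = "漢字,kanji\r\nかな,kana\r\n"
kanji_bytes = multibyte_bytes(sample)

# Every byte of a multi-byte sequence is 0x80 or above,
# so none of them can be mistaken for CR (0x0D) or LF (0x0A).
no_ghosts = all(b >= 0x80 for b in kanji_bytes)
print(no_ghosts)
```

So any line-ending translation that "file:" does should only ever touch the genuine line endings.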

There are probably typos or gross errors there, so fix as you need.

However, even some of what you set might be lost if the field does not need it.  I don't know if the field will throw away the byte-order character (U+FEFF).  This is often included at the start of files to help programs identify the contents.  Well, it's designed to show byte order, but people do use it as a Unicode signature.  If that gets lost you might need to put one in at the start of the file.  

So, if you still have problems, off the top of my head and full of typos and gross errors, you can do something like this to add the byte-order character:

set the useUnicode to true  -- change the behavior of numToChar()
put uniDecode( numToChar(0xFEFF) & the unicodeText of field "Processed File", "UTF8") into URL("file:"&kFile2)
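
For reference, U+FEFF encoded as UTF-8 is the three bytes EF BB BF, and that is the signature other programs look for at the start of the file. A rough sketch of the idea, again in Python rather than LiveCode just to show the bytes; the function name and the stand-in text are hypothetical:

```python
# Illustration only: Python, not LiveCode. Function name and sample text are hypothetical.
import codecs

def add_bom(utf8_data):
    """Prefix UTF-8 bytes with the BOM signature unless it is already there."""
    if utf8_data.startswith(codecs.BOM_UTF8):
        return utf8_data
    return codecs.BOM_UTF8 + utf8_data

field_text = "漢字,kanji\n"                 # stand-in for the field contents
file_bytes = add_bom(field_text.encode("utf-8"))

bom_hex = file_bytes[:3].hex()              # hex digits of the first three bytes
print(bom_hex)
```

Checking for an existing signature first avoids writing it twice if the field turns out to have kept it.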

Dar


On Jun 6, 2013, at 9:58 PM, Howard Bornstein wrote:

> Hi Phil (and others),
> 
> Thanks for the response and thanks especially to Devin for the excellent
> article. I was able to get the Kanji to appear properly after processing
> the whole file and then issuing this command:
> 
> set the unicodetext of fld "ProcessedFile" to the uniencode(fld
> "ProcessedFile", "UTF8")
> 
> Very easy when you know what to do :-)
> 
> One puzzle still remaining though. After I got the Kanji to show up
> properly, I tried saving the file to a text file and when I opened it up in
> TextEdit, I had lost all the Kanji again. This surprised me because I
> thought that TextEdit was a UTF8-compliant app.
> 
> I solved the problem by simply cutting and pasting from LiveCode to a
> TextEdit document (Kanji came intact) but I would still like to know how I
> could have saved the file as a text file from within LC and still have it
> work. Any ideas?
> 
> 
> 
> On Wed, Jun 5, 2013 at 8:42 PM, Phil Davis <revdev at pdslabs.net> wrote:
> 
>> Hi Howard,
>> 
>> From one unicode-ignorant soul to another -
>> 
>> Devin's explanation about LC & Unicode got me started:
>>    http://livecode.byu.edu/unicode/unicodeInRev.php -- the good part is about a third of the way down
>> 
>> Using this info + LC's various unicode functions + the styledText of a
>> field, I was recently able to paste multi-line Arabic text correctly. If I
>> can do that, you can do Kanji. Really! It reads left-to-right doesn't it?
>> 
>> Best -
>> Phil
>> 
>> 
>> 
>> 
>> On 6/5/13 7:23 PM, Howard Bornstein wrote:
>> 
>>> I have a client who wants me to do some processing on a spreadsheet file
>>> that has been saved in .csv format. One of the fields contains either
>>> English or Japanese. When I look at the fields with the Japanese, it looks
>>> like gibberish. It does not display as Kanji.
>>> 
>>> I believe the full data is still there because if I open it up in Numbers,
>>> the Kanji is displayed correctly. However, I need to use the .csv file to
>>> process and I can't, for the life of me, make the Kanji appear.
>>> 
>>> I am *way* in over my head here with regards to different languages. I
>>> assume this is a unicode issue but I am completely ignorant in this area.
>>> 
>>> My question is: how can I take a .csv file, which contains some Kanji text
>>> but doesn't display as Kanji, and convert it so that, as a text file, it
>>> displays as Kanji again. I don't care where this conversion takes place--I
>>> am doing a bunch of other processing of the file in LC so it can be
>>> anywhere in the process. I'm not doing anything with the Kanji itself
>>> except displaying it.
>>> 
>>> I'd appreciate any help but if it involves unicode, please assume you are
>>> talking to an imbecile.
>>> 
>>> TIA
>>> 
>>> 
>> --
>> Phil Davis
>> 
>> 
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
>> 
> 
> 
> 
> -- 
> Regards,
> 
> Howard Bornstein
> -----------------------
> www.designeq.com




