Getting Kanji from a .csv file

Dar Scott dsc at swcp.com
Wed Jun 5 23:46:51 EDT 2013


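There are several standard ways to encode Japanese: Shift-JIS, EUC-JP, ISO-2022-JP and Unicode are the common ones.

If we assume the encoding is Unicode, there are still several encoding forms: UTF-8, UTF-16 and UTF-32, and the last two come in both byte orders.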

Some applications put a byte-order mark (BOM) at the beginning of a Unicode file.  This can be used to determine not only the byte order but also the encoding form.  If the first three bytes are EF BB BF, then you have UTF-8 and you can throw those 3 bytes away.  (The field might throw away byte-order marks; I don't remember.)
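
Here is a rough sketch of that check, written in Python rather than LiveCode just to show the logic.  The function name sniff_bom is made up for this example; the BOM constants come from Python's standard codecs module.

```python
import codecs

# Known byte-order marks. The UTF-32 entries come first because the
# UTF-32 LE mark (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]

def sniff_bom(data: bytes):
    """Return (encoding, data without the BOM), or (None, data) if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data
```

A file with no mark at all can still be Unicode, so a result of None only means this test did not settle it.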

You can also use trial and error.

Use uniEncode() to convert to Unicode from UTF8 or, if that does not work, from Japanese (Shift-JIS).  Then set the unicodeText of the field to the value you get back.
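
The same try-one-then-the-other idea, sketched in Python as an illustration only (this is not the LiveCode call).  The name decode_guess and the default candidate list are assumptions for the example; "utf-8" and "shift_jis" are standard Python codec names.

```python
def decode_guess(data: bytes, candidates=("utf-8", "shift_jis")):
    """Try each encoding in order; return (encoding, text) for the first
    one that decodes without error, or (None, None) if none of them do."""
    for name in candidates:
        try:
            return name, data.decode(name)
        except UnicodeDecodeError:
            continue
    return None, None
```

UTF-8 goes first because it is strict: Shift-JIS text with Kanji in it will almost never pass as valid UTF-8.  A decode that succeeds is still only a guess, so look at the result before trusting it.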

If that does not work, try setting the unicodeText to the file contents as is.  Then try swapping the odd and even bytes, in case the file is 16-bit Unicode in the opposite byte order.
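
Swapping the bytes looks roughly like this, again in Python as a sketch.  The name swap_byte_pairs is hypothetical, and it assumes the data really is 16-bit text with an even number of bytes.

```python
def swap_byte_pairs(data: bytes) -> bytes:
    """Exchange the two bytes of every 16-bit unit (AB CD -> BA DC)."""
    if len(data) % 2:
        raise ValueError("odd number of bytes; probably not 16-bit text")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]  # second byte of each pair moves to the front
    swapped[1::2] = data[0::2]  # first byte of each pair moves to the back
    return bytes(swapped)
```

If the file starts with a byte-order mark, swapping turns FF FE into FE FF (or the reverse), so the mark still describes the swapped data correctly.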

Or you can let the list know the first dozen bytes of the file.  Somebody might recognize it.  Hex is better but decimal is OK if you are more comfortable with that.  If you need some help with that, ask.
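
If it helps, here is one way to get those bytes as hex, sketched in Python.  Both function names are made up for the example, and the path you pass in is whatever your file is called.

```python
def hex_head(data: bytes, count: int = 12) -> str:
    """Format the first `count` bytes as space-separated two-digit hex."""
    return " ".join(f"{b:02X}" for b in data[:count])

def hex_head_of_file(path: str, count: int = 12) -> str:
    """Read the first `count` bytes of a file in binary mode and format them."""
    with open(path, "rb") as f:
        return hex_head(f.read(count), count)
```

Reading in binary mode matters here; a text-mode read may already have converted or dropped the bytes you want to show.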

Dar




On Jun 5, 2013, at 8:23 PM, Howard Bornstein wrote:

> I have a client who wants me to do some processing on a spreadsheet file
> that has been saved in .csv format. One of the fields contains either
> English or Japanese. When I look at the fields with the Japanese, it looks
> like gibberish. It does not display as Kanji.
> 
> I believe the full data is still there because if I open it up in Numbers,
> the Kanji is displayed correctly. However, I need to use the .csv file to
> process and I can't, for the life of me, make the Kanji appear.
> 
> I am *way* in over my head here with regards to different languages. I
> assume this is a unicode issue but I am completely ignorant in this area.
> 
> My question is: how can I take a .csv file, which contains some Kanji text
> but doesn't display as Kanji, and convert it so that, as a text file, it
> displays as Kanji again. I don't care where this conversion takes place--I
> am doing a bunch of other processing of the file in LC so it can be
> anywhere in the process. I'm not doing anything with the Kanji itself
> except displaying it.
> 
> I'd appreciate any help but if it involves unicode, please assume you are
> talking to an imbecile.
> 
> TIA
> 
> -- 
> Regards,
> 
> Howard Bornstein
> -----------------------
> www.designeq.com




