Getting Kanji from a .csv file

Dar Scott dsc at swcp.com
Thu Jun 6 15:01:16 EDT 2013


If by ASCII you mean classic 7-bit ASCII, then ASCII is already UTF-8, and you can process it as UTF-8.  In other words, ASCII is a subset of UTF-8: the ASCII codes are valid Unicode code points, and the UTF-8 encoding of text containing only that range is byte-for-byte identical to the ASCII bytes.
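
To make the subset claim concrete, here is a small language-neutral check written in Python rather than LiveCode. The sample strings and variable names are made up for illustration; it only sketches the idea that pure 7-bit text produces the same bytes either way, while Kanji needs multi-byte UTF-8 sequences.

```python
# Illustration only: sample data and names are hypothetical.
text = "name,price\r\nwidget,42"          # pure 7-bit ASCII content

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# For 7-bit text the two encodings give identical bytes.
same_bytes = ascii_bytes == utf8_bytes

# Kanji falls outside ASCII; each of these two characters takes 3 bytes in UTF-8.
kanji = "漢字"
kanji_utf8 = kanji.encode("utf-8")

# Encoding Kanji as ASCII is not possible and raises an error.
try:
    kanji.encode("ascii")
    kanji_fits_ascii = True
except UnicodeEncodeError:
    kanji_fits_ascii = False

print(same_bytes, len(kanji_utf8), kanji_fits_ascii)
```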

If by ASCII you mean some 8-bit single-byte character encoding, it gets harder, because bytes above 127 are generally not valid UTF-8 as they stand, but maybe that can be bypassed.

There have been some recent improvements (as someone mentioned) and more cool stuff is on the way.  Here is my first approach to this.

Take the value of the unicodeText property of the field.  Convert that to UTF-8 with uniDecode().  Store that in the db, which is set up for UTF-8.  (Alternatively, you can set up the db for 16-bit Unicode with a fixed endianness, and then do endian conversion based on the platform.)  Coming back, do the reverse with uniEncode().  When outputting to text files, include a byte order mark (BOM) at the beginning as a hint to applications.
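
The same round trip can be sketched outside LiveCode. The Python below is only an illustration under assumptions: the function names are hypothetical, the "field value" is simulated as raw UTF-16 bytes in a chosen byte order, and no real database is involved.

```python
import codecs
import sys

def utf16_to_utf8(utf16_bytes, byteorder=sys.byteorder):
    """Field -> db direction: 16-bit Unicode bytes to UTF-8 bytes."""
    codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
    return utf16_bytes.decode(codec).encode("utf-8")

def utf8_to_utf16(utf8_bytes, byteorder=sys.byteorder):
    """Db -> field direction: UTF-8 bytes back to 16-bit Unicode bytes."""
    codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
    return utf8_bytes.decode("utf-8").encode(codec)

def with_bom(utf8_bytes):
    """Prefix the UTF-8 byte order mark as a hint for text-file readers."""
    return codecs.BOM_UTF8 + utf8_bytes

# Simulated field content: little-endian UTF-16, as on an Intel machine.
field_value = "漢字".encode("utf-16-le")

stored = utf16_to_utf8(field_value, "little")      # what would go into the db
restored = utf8_to_utf16(stored, "little")         # what would go back to the field
file_bytes = with_bom(stored)                      # what would go into a text file
```

Passing the byte order explicitly, instead of relying on the platform default, is what the endian conversion mentioned above amounts to when data moves between machines.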

Dar




On Jun 6, 2013, at 12:33 PM, Peter Haworth wrote:

> Let's say I have an sqlite database with data stored in utf-8 format, which
> I think is the default for sqlite dbs. The application using the database
> may be taking input in ASCII format or unicode depending on the user's
> language settings.  So sometimes, the application will get ASCII data and
> other times unicode.
> 
> 
> What do I need to do to ensure data ends up in the database in utf-8 format
> whether it's ASCII or unicode?  Similarly, when I'm displaying data from
> the database, do I need to do any conversion before displaying it to the
> user?  Or does this "just work" as long as the data is written and read
> on the same computer?
> 
> 
> 
> Pete
> lcSQL Software <http://www.lcsql.com>
> 
> 
> On Wed, Jun 5, 2013 at 8:42 PM, Phil Davis <revdev at pdslabs.net> wrote:
> 
>> Hi Howard,
>> 
>> From one unicode-ignorant soul to another -
>> 
>> Devin's explanation about LC & Unicode got me started:
>>    http://livecode.byu.edu/unicode/unicodeInRev.php -- the good part is about a third of the way down
>> 
>> Using this info + LC's various unicode functions + the styledText of a
>> field, I was recently able to paste multi-line Arabic text correctly. If I
>> can do that, you can do Kanji. Really! It reads left-to-right doesn't it?
>> 
>> Best -
>> Phil
>> 
>> 
>> 
>> On 6/5/13 7:23 PM, Howard Bornstein wrote:
>> 
>>> I have a client who wants me to do some processing on a spreadsheet file
>>> that has been saved in .csv format. One of the fields contains either
>>> English or Japanese. When I look at the fields with the Japanese, it looks
>>> like gibberish. It does not display as Kanji.
>>> 
>>> I believe the full data is still there because if I open it up in Numbers,
>>> the Kanji is displayed correctly. However, I need to use the .csv file to
>>> process and I can't, for the life of me, make the Kanji appear.
>>> 
>>> I am *way* in over my head here with regards to different languages. I
>>> assume this is a unicode issue but I am completely ignorant in this area.
>>> 
>>> My question is: how can I take a .csv file, which contains some Kanji text
>>> but doesn't display as Kanji, and convert it so that, as a text file, it
>>> displays as Kanji again? I don't care where this conversion takes place--I
>>> am doing a bunch of other processing of the file in LC so it can be
>>> anywhere in the process. I'm not doing anything with the Kanji itself
>>> except displaying it.
>>> 
>>> I'd appreciate any help but if it involves unicode, please assume you are
>>> talking to an imbecile.
>>> 
>>> TIA
>>> 
>>> 
>> --
>> Phil Davis
>> 
>> 
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
>> 