Unicode (was Getting Kanji from a .csv file)

Peter Haworth pete at lcsql.com
Sat Jun 8 13:20:53 EDT 2013


I apologize up front for being particularly clueless on this whole
character encoding concept.  I'm still trying to adjust to speaking
American English as opposed to the Queen's English, so it's not too surprising I'm
not grasping unicode too well!

I understand the concepts and the use of uniencode and unidecode but I
don't understand when I need to care.

I'll use my SQLiteAdmin program as an example.  It provides schema
maintenance and data browsing/update features for SQLite databases and uses
most of the standard LC controls, including datagrids.  Users can enter
data into it and have it used to INSERT, UPDATE, or DELETE rows.  They can
also type in SELECT criteria and have the qualifying data displayed in
field and datagrid controls. Currently, there is no attempt to do any
encoding or decoding of data.

On my computers here in the USA, I've never had any issues using it on any
of my databases, but I've never tried to access one whose contents weren't
in American English.

Now let's say someone in a country whose language requires the use of
unicode encoding purchases the program.  Will it work OK for that person in
terms of entering data into the controls and displaying data in the
controls from their database, assuming that the database contains UTF8
encoded data?  Or do I have to uniencode/decode to ensure things work right?

Now let's say the database is using UTF16 encoding, or anything other than
UTF8.  I can detect that situation in the database and I think I would need
to use uniencode/decode to deal with it?
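
To make the "detect that situation" step concrete, here is a minimal
sketch of how the database's text encoding can be read. It is in Python
rather than LiveCode only because it is easy to run on its own; the
function name database_encoding is made up for illustration. The one
thing it relies on is SQLite's "PRAGMA encoding" statement, which reports
UTF-8, UTF-16le or UTF-16be. Whether a given driver then hands the text
back already converted is a separate question that this sketch does not
answer.

```python
import sqlite3


def database_encoding(conn):
    """Return the text encoding SQLite reports for the main database.

    Hypothetical helper name; "PRAGMA encoding" itself is standard SQLite.
    """
    return conn.execute("PRAGMA encoding").fetchone()[0]


if __name__ == "__main__":
    # A brand-new database opened this way defaults to UTF-8.
    conn = sqlite3.connect(":memory:")
    print(database_encoding(conn))

    # The encoding can only be chosen before the database has any content.
    conn16 = sqlite3.connect(":memory:")
    conn16.execute('PRAGMA encoding = "UTF-16le"')
    conn16.execute("CREATE TABLE t (name TEXT)")
    print(database_encoding(conn16))
```

The same PRAGMA can be sent through any SQLite interface, so the check
itself should carry over to other environments.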

Now the user takes his UTF8 database and puts it on a colleague's computer
here in the USA with the computer's language settings set to American
English.  I would then need to decode/encode... I think.

From the original thread, it seems clear that when I import data into the
database via SQLiteAdmin, I do need to be aware of the encoding in the
imported file and that there may be a way to detect that within the file
depending on how it was produced. Conversely, when I export data, I should
try to create the same marker in the file.
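
As I understand it, the marker in question is a byte order mark (BOM) at
the start of the file. A rough sketch of checking for one on import and
writing one on export might look like the following. It is Python rather
than LiveCode, the helper names (detect_bom, decode_import, encode_export)
are invented for illustration, and falling back to UTF-8 when no BOM is
present is an assumption, not something a file guarantees.

```python
import codecs

# Longer BOMs first: the UTF-32 LE mark starts with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return (encoding name, BOM length), or (None, 0) if there is no BOM."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_import(data, fallback="utf-8"):
    """Decode imported bytes, using the BOM if present (fallback is a guess)."""
    name, skip = detect_bom(data)
    return data[skip:].decode(name or fallback)


def encode_export(text, encoding="utf-8"):
    """Encode text for export and prepend the matching BOM."""
    bom = {name: mark for mark, name in _BOMS}[encoding]
    return bom + text.encode(encoding)


if __name__ == "__main__":
    exported = encode_export("漢字,kanji", "utf-16-le")
    print(detect_bom(exported))
    print(decode_import(exported))
```

A file with no BOM tells you nothing either way, so in that case the
encoding still has to come from the user or from a guess.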

And finally, is the simplest way to take care of this to simply
uniencode/decode everything using the database's encoding without regard as
to whether that's necessary or not?

Pete
lcSQL Software <http://www.lcsql.com>


