SQLite, Unicode & LC

Peter Haworth pete at lcsql.com
Thu Apr 10 12:46:25 EDT 2014


A few comments below.

On Wed, Apr 9, 2014 at 8:28 PM, Kay C Lan <lan.kc.macmail at gmail.com> wrote:

> Pete said:
>
> 1) Exports from iTunes and gets a word like eÜjûzëiÇoò [hope it displays
> with all the accents] with all the accented chars as garbage.
>

That was the case when I opened the file in TextEdit with its default
character encoding ("Automatic").  I just tried it with TextEdit's character
encoding set to UTF8 and the accented characters now show correctly.
Apparently TextEdit is unable to detect UTF8 automatically.
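
For anyone who wants to see the effect outside TextEdit and LC, here is a
minimal Python sketch of what I think is going on. It assumes the "garbage"
is simply the UTF8 bytes being read as MacRoman, which is my guess and not
something I have confirmed; the variable names are just for illustration.

```python
# Sketch: the same UTF-8 bytes read with the right and the wrong encoding.
# Assumption (not confirmed): the garbage comes from reading UTF-8 as MacRoman.
word = "eÜjûzëiÇoò"
raw = word.encode("utf-8")              # bytes as they would sit in the export file

as_utf8 = raw.decode("utf-8")           # correct interpretation
as_macroman = raw.decode("mac_roman")   # wrong interpretation: one char per byte

# ascii() escapes non-ASCII characters so the output is safe on any terminal.
print(len(word), "characters,", len(raw), "bytes")
print(ascii(as_utf8))
print(ascii(as_macroman))

# Nothing is lost by the wrong reading: re-encoding as MacRoman gives the
# original bytes back, which can then be decoded properly.
repaired = as_macroman.encode("mac_roman").decode("utf-8")
```

Each accented character takes two bytes in UTF8, so the wrong reading shows
two odd characters in place of every accented one.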


> 2) I don't know how those displayed for him in a LC variable or field.
>

Looking at it in the variable viewer, it displayed with the corrupted
characters.


> 3) He imports that data into SQLite and gets those same garbage chars.
> 4) He used unidecode(uniencode()) to convert the garbage and display
> correctly in SQLite Management software
>

Slight correction to that - once the data was in the SQLite database
correctly formatted with uniencode/decode, it displayed correctly in an LC
application after doing a SELECT on it.  It also displayed correctly in my
SQLite admin tool but since that tool is my SQLiteAdmin utility which is
written with LC, that's probably not a good benchmark :-)


> In my case, when I originally wrote my script (6.1.x) I never used any
> uniencode or unidecode:
>
> 1) Exported a file and a word like eÜjûzëiÇoò appeared exactly like that in
> a BBEdit text file that reported it as UTF8 and Unix CRs.
>

I don't have BBEdit but it sounds like it does a better job than TextEdit
on detecting character encodings.


> 2) Put it in a LC variable and field and it looked exactly the same.
>

That's where I get a different result than you - I get the corrupted
characters, even when I coerce TextEdit into displaying them correctly. Did
you save the file with BBEdit before loading it into LC? If so, maybe that
removed the need for the LC uniencode/decode.


> 3) Imported into SQLite UTF8 db and the word looked exactly the same.
> 4) When I SELECTED the record and displayed it in an LC field it looked
> exactly the same.
>

I'd expect 3) and 4) to be the case if it looked OK in the LC variable.

>
> NOW, since updating to LC 6.6.1GM (which has updated SQLite)
>
> 1) In SQLite original records with accented words look correct.
> 2) When I SELECT I have to use the mentioned unidecode(uniencode()) to
> display correctly.
>

I don't see that in 6.6.1.  The existing records in my database display
correctly in LC after a SELECT with no uniencode/decode.


>
> BUT NOW in 6.6.1GM if I
>
> 1) Take a BBEdit UTF8 Unix CRs text file with the word eÜjûzëiÇoò
> 2) Put it in an LC variable or field it still looks correct
> 3) Import it into SQLite without any uniencode and/or unidecode it looks
> like this e j z i o  --blank where accented chars should be
>

Do you know what version of the SQLite library your admin tool is using?
I'm wondering if there's some incompatibility in how UTF8 is handled in
different versions of the SQLite library.
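
One way to take the display question out of the picture is to look at the
bytes SQLite actually stored, and at the library version of whichever client
is doing the looking. The Python sketch below is only an illustration under
my own assumptions: the table name "words" and the function inspect_text are
made up, and it uses Python's bundled SQLite library, not the one LC or your
admin tool links against.

```python
import sqlite3

def inspect_text(value):
    """Store value in a throwaway in-memory database and report what was stored.

    Returns (text read back, hex of the stored bytes, database text encoding).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE words (word TEXT)")   # hypothetical table name
    con.execute("INSERT INTO words (word) VALUES (?)", (value,))
    # hex() shows the raw stored bytes, independent of how any tool renders them.
    text, stored_hex = con.execute("SELECT word, hex(word) FROM words").fetchone()
    encoding = con.execute("PRAGMA encoding").fetchone()[0]
    con.close()
    return text, stored_hex, encoding

word = "eÜjûzëiÇoò"
text, stored_hex, encoding = inspect_text(word)

print("SQLite library version:", sqlite3.sqlite_version)
print("database encoding:", encoding)
print("stored bytes:", stored_hex)
```

Running the same SELECT hex(...) and PRAGMA encoding against the real
database from LC and from the admin tool would show whether the stored bytes
are valid UTF8. If they are, the blanks are a display problem in the tool
and not a problem with the data.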


> 4) When I SELECT the record and display it in an LC variable or field
> without using uniencode and/or decode it displays correctly.
> 5) So the only problem here is it doesn't display correctly in SQLite
>
> 6) On the other hand if I employ unidecode(uniencode()) I get this in the
> db: ejzio
> 7) When I SELECT the record and display it in LC I get ejzio with or
> without using unidecode(uniencode()) or worse if I use any combination of
> uniencode or unidecode.
>
> So Pete reported accents incorrectly displaying in his text file, and he
> can correct those by employing unidecode(uniencode()) to look fine in
> SQLite.
>
> I on the other hand have correctly displayed accents in text files, but
> can't get those to appear in SQLite correctly using your suggested
> solution.
>
> In the long term, unless LC 7.x stuffs things up further, for me the
> simplest solution seems to be to ignore unicode altogether, just import
> it into SQLite, and not look at it using an SQLite Manager software, if I
> need to look at it I'll simply extract the data using LC or I notice that
> if I Export the data to a UTF8 Text file all the accents appear correctly.
>
> The problem to me seems to revolve around what happened when LC 6.6.x
> upgraded SQLite, which now seems to prevent my SQLite Management software
> (tried 3) from correctly displaying accents when it obviously still can.
>
> I'm on OS X 10.9.2, LC 6.6.1GM
>



Pete
lcSQL Software <http://www.lcsql.com>
Home of lcStackBrowser <http://www.lcsql.com/lcstackbrowser.html> and
SQLiteAdmin <http://www.lcsql.com/sqliteadmin.html>


