Unicode

Mark Waddingham mark at livecode.com
Wed May 6 14:59:52 EDT 2015


Normalisation, folding and collation all require a fair amount of data (proportional, in fact, to the size of the current version of the Unicode character tables), which is presumably why SQLite externalises such functions as hooks.

It might be that we can hook this up to the routines in the engine at some point, though (I say "at some point" because the db drivers are currently two steps away from the engine in terms of binding).
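
To give a feel for what such a hook looks like, here is a minimal sketch in Python rather than LiveCode, since Python's standard sqlite3 module already exposes SQLite's collation hook as create_collation(). The collation name UNICODE_NOCASE and the helper names are made up for illustration, and the key used is canonical caseless matching (NFD, case-fold, NFD again), which is not locale-aware collation.

```python
import sqlite3
import unicodedata


def caseless_key(text):
    """Canonical caseless key: NFD, then case-fold, then NFD again."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFD", decomposed.casefold())


def unicode_nocase(a, b):
    """Collation callback: negative, zero or positive, like strcmp."""
    ka, kb = caseless_key(a), caseless_key(b)
    return (ka > kb) - (ka < kb)


def open_db(path=":memory:"):
    """Open a connection with the hypothetical UNICODE_NOCASE collation registered."""
    conn = sqlite3.connect(path)
    # The collation is registered per connection, not stored in the database file.
    conn.create_collation("UNICODE_NOCASE", unicode_nocase)
    return conn
```

A column declared as TEXT COLLATE UNICODE_NOCASE UNIQUE would then reject values that differ only in case or normalisation form, but every connection that touches the table has to register the same collation first.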

In the meantime, if you have columns in a db where you don't need to preserve the form and case, you could just normalise and case-fold the strings in LiveCode before inserting.
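
The same idea, sketched in Python only because it is easy to run standalone; the equivalent would be done in LiveCode script before calling the database driver. The table name, column name and helper names are hypothetical, and the choice of NFC as the stored form is an assumption (any single normalisation form works as long as it is applied consistently).

```python
import sqlite3
import unicodedata


def fold_for_storage(text):
    """Case-fold and normalise a string so that equivalent spellings compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.casefold())


def make_db():
    """Create an in-memory database whose key column holds only folded strings."""
    conn = sqlite3.connect(":memory:")
    # A plain UNIQUE constraint is enough, because the stored value is already folded.
    conn.execute("CREATE TABLE people (name_key TEXT UNIQUE)")
    return conn


def insert_name(conn, name):
    """Insert the folded form; raises sqlite3.IntegrityError on a caseless duplicate."""
    conn.execute("INSERT INTO people (name_key) VALUES (?)", (fold_for_storage(name),))
```

The original spelling is lost with this approach, so it only suits columns where, as noted above, form and case need not be preserved; otherwise the folded key could be kept in a second column alongside the original.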

Mark

> On 6 May 2015, at 19:45, Peter Haworth <pete at lcsql.com> wrote:
> 
> Interesting, thanks Mark.
> 
> In case any database developers are interested, it seems that the SQLite
> upper() and lower() functions only work with ASCII characters, as does
> COLLATE NOCASE, so at least LiveCode is ahead of SQLite.  It is possible to
> write your own collation handlers for SQLite, but I wouldn't have any idea
> where to start with something like that.
> 
> I often use COLLATE NOCASE along with a UNIQUE constraint in my databases
> to guarantee the uniqueness of a column's values no matter what case; I
> guess I'll have to be sure no non-ASCII language users are involved in the
> future!
> 
> Pete
> lcSQL Software <http://www.lcsql.com>
> Home of lcStackBrowser <http://www.lcsql.com/lcstackbrowser.html> and
> SQLiteAdmin <http://www.lcsql.com/sqliteadmin.html>
> 
>> On Wed, May 6, 2015 at 12:42 AM, Mark Waddingham <mark at livecode.com> wrote:
>> 
>>> On 2015-05-06 01:53, Peter Haworth wrote:
>>> 
>>> Right, this is where I get confused on the issue of whether there are
>>> uppercase equivalents of all lowercase glyphs in all languages.  The link
>>> you provided sheds light on this
>> 
>> The Greek alphabet does have upper and lower case variants. However, in
>> the case of typing 'qwerty' and 'QWERTY' using a Greek keyboard layout then
>> you get the strings:
>>  qwerty = ;ςερτυ
>> and
>>  QWERTY = :΅ΕΡΤΥ
>> 
>> Which (by virtue of the punctuation on q and the terminal sigma on w) are
>> definitely not the same when compared caselessly ;)
>> 
>> Mark.
>> 
>> i.e. Don't assume that shift-<letter> gives you an uppercase version of
>> <letter> in any keyboard layout.
>> 
>> --
>> Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
>> LiveCode: Everyone can create apps
>> 
>> 
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode



