Re: Unicode: LC 7.0 - PHP - MySQL?

Martin Baxter mblivecode at harbourhosting.co.uk
Fri Oct 31 08:45:45 EDT 2014


Tiemo,

I'm not sure what your database is collated as now, as I thought you
said it was ascii_general_ci in your original post. But the collation of
the database columns affects sorting, indexing and so on. I don't think
it actually stops you storing unexpected characters.
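
As a loose analogy only (this is plain Python, not MySQL's collation
code), the sketch below sorts the same stored strings under two
different comparison rules. The word list is made up for illustration.
The point is that the rule changes the order, not the stored values.

```python
# Hypothetical sample data; any mixed-case words with an umlaut would do.
words = ["Zebra", "apple", "Über", "uber"]

# "Binary" style ordering: compare raw code points, so all upper-case
# ASCII letters sort before lower-case ones, and Ü (220) sorts last.
binary_order = sorted(words)

# Case-insensitive style ordering: compare case-folded copies instead.
folded_order = sorted(words, key=str.casefold)

print(binary_order)
print(folded_order)

# The stored strings themselves are untouched by either rule.
print(sorted(binary_order) == sorted(folded_order))
```

Real collations such as utf8_general_ci apply their own weight tables,
so the exact order they produce may differ from either list here.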

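Anyway, I believe the first 256 Unicode code points are the same as
ISO-8859-1, so your Umlaute have the same character numbers in both.
Bear in mind, though, that UTF-8 only stores plain ASCII (0-127) as a
single byte; anything above that takes two or more bytes. So if your
single-byte Umlaute are encoded using ISO-8859-1 or a compatible variant
of it, and every stage passes those bytes through untouched, then there
may be no visible problem whichever character set LC is expecting.

Ü (Uuml)

for example is at position 220 in Unicode and also in ISO-8859-1 (and
therefore in typical Windows charsets such as Windows-1252 that have
that character at the same position), although UTF-8 writes it as the
two bytes C3 9C. So you might not notice a conflict in that case, but
you could still get some issues with things like curly quotes, which
Windows-1252 has and ISO-8859-1 does not, and with some of the less
common characters.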
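
The byte values involved can be checked with a few lines of Python,
used here purely as a calculator. The helper names hex_bytes and
can_encode are made up for this sketch. It shows that Ü has the same
number (220) in Unicode, ISO-8859-1 and Windows-1252, but that UTF-8
stores it as two bytes, and that a curly quote exists in Windows-1252
but not in ISO-8859-1.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of text as space-separated hex pairs."""
    return text.encode(encoding).hex(" ")


def can_encode(text: str, encoding: str) -> bool:
    """Report whether text is representable in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


u_umlaut = "\u00dc"      # Ü
curly_quote = "\u201c"   # left double quotation mark

print(ord(u_umlaut))                        # Unicode code point of Ü
print(hex_bytes(u_umlaut, "iso-8859-1"))    # single byte
print(hex_bytes(u_umlaut, "cp1252"))        # single byte, same value
print(hex_bytes(u_umlaut, "utf-8"))         # two bytes

print(can_encode(curly_quote, "iso-8859-1"))  # not in ISO-8859-1
print(hex_bytes(curly_quote, "cp1252"))       # present in Windows-1252

# What happens when UTF-8 bytes are read back as Windows-1252:
misread = u_umlaut.encode("utf-8").decode("cp1252")
print(misread)
```

If one stage writes UTF-8 and another reads it as a single-byte charset,
the last line shows the kind of two-character garbage that tends to
appear in place of each Umlaut.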

That would be my assumption about what's going on, anyway.

Martin

On 31/10/14 10:49, Tiemo Hollmann TB wrote:
> Hello Martin,
> thank you for your helpful information, though I am a little puzzled,
> probably because I am missing something.
> What I don't understand is that I currently get all of my German Umlaute
> properly from LC via PHP into my UTF-8 MySQL db, though the columns in my
> table are set to ascii_general_ci and the German Umlaute don't belong to the
> plain ASCII subset. And I don't see where this collation is changed by PHP.
> Do you have any explanation for that?
> Thanks for your coaching
> Tiemo
> 
> 
>> -----Original Message-----
>> From: use-livecode [mailto:use-livecode-bounces at lists.runrev.com] On
>> Behalf Of Martin Baxter
>> Sent: Friday, 31 October 2014 11:19
>> To: How to use LiveCode
>> Subject: Re: Unicode: LC 7.0 - PHP - MySQL?
>>
>> A little additional info from me.
>>
>> If your database will only ever contain ascii characters, then
>> ascii_general, utf-8 and latin1_swedish will all work fine because the
>> ascii characters are the same in all of them.
>>
>> I would expect problems though if mixing these up and subsequently
>> attempting to introduce non-ascii characters to the data.
>>
>> latin1_swedish was the default in MySQL, since it was originally
>> developed by Swedes.
>>
>> You should set the desired collation for the database when you create
>> it, but it is also possible to change it later.
>>
>> In my experience it is important to get the character set defined
>> consistently throughout the workflow.
>>
>> This involves:
>>
>> 1) The collation of the database (and individual columns). Normally set
>>    when the database is created.
>> 2) The database connection should specify the character set to be
>>    transmitted (done in PHP when making the connection).
>> 3) Character manipulations in PHP may need to specify the character set.
>> 4) LC scripts of course need to take the character set into account.
>> 5) Any HTML involved should specify and be written using the same
>>    character set, especially if forms are acquiring user input to be
>>    stored in the database.
>>
>> For web-based work, utf-8 is very popular and utf8_general_ci is often
>> nowadays the default collation in web database front ends.
>>
>> HTH
>>
>> Martin Baxter
>>
>> On 29/10/14 16:11, Tiemo Hollmann TB wrote:
>>> Hello,
>>>
>>> I have a LC 6 program communicating through PHP with a MySQL db.
>>> Because my background about Unicode, PHP and MySQL is limited I wonder
>>> what I have to care about, when migrating to LC 7.
>>>
>>> I have read the release notes of LC 7. My limited thinking was that
>>> Unicode really has a unique code for each character on the planet. But
>>> why are there UTF-8 and UTF-16? Which one is LC using? Which one is my
>>> MySQL db using? I don't find any information about UTF-8/16 in my db
>>> description. How is the collation of the db related to UTF-x and to
>>> LC? My tables are collated in ascii_general_ci. In some of my PHPs a
>>> "COLLATE latin1_swedish_ci" is used. I have no idea why this Swedish
>>> collation is in my German PHP and how it can be compatible with my
>>> ascii_general_ci DB. (The PHPs were made by a third party.)
>>>
>>> What do I have to change in my LC program when migrating to 7? Where
>>> to start?
>>>
>>> Is LC's Unicode really the magic thing, where I don't have to care
>>> about any charset-related thing and all my thinking is just a waste? Or
>>> do I have to migrate, test and dig into one crash after the other? Or
>>> do you have any helpful hints on how to start such a migration and what
>>> to look for?
>>>
>>> Thanks
>>>
>>> Tiemo
>>>
>>
>>
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode