Sorting strangeness

Mark Waddingham mark at
Wed Sep 18 02:57:12 EDT 2019

On 2019-09-16 21:54, Paul Dupuis via use-livecode wrote:
> IF sort lines of <var> ascending text was working correctly for lines
> of mixed ASCII and Unicode, for someone sorting lines of text that can
> be Native text, Unicode text (both RTL and LTR), or mixtures of both,
> is it better to use SORT ... TEXT or SORT ... INTERNATIONAL? I don't
> know enough about what the "international" (using the system locale
> settings)  and Unicode may mean in relation to one another?

You should use 'sort international' when you are displaying a sorted
list to a user who is looking through it manually.

The ordering provided by 'sort text' is purely by Unicode codepoint, which
has no direct relation to 'expected' order when read by a human and
is determined by technical considerations (structuring a large 21-bit
code space, frequency of use and, most importantly, round-tripping to legacy
encodings and
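
As a rough illustration of what pure codepoint ordering looks like, here is a minimal Python sketch. It is not LiveCode and not the engine's implementation; it only relies on the fact that Python's default string comparison also compares Unicode code points. The word list is made up for the example.

```python
# Sketch: ordering strings purely by Unicode code point.
# Python's default str comparison is by code point, which is similar in
# spirit to what is described above for 'sort text' (an analogy only).

# Hypothetical sample data; escapes are used so the code points are explicit.
words = ["zebra", "\u00f6l", "apple", "\u00e9clair"]  # ö = U+00F6, é = U+00E9

by_codepoint = sorted(words)

# Pair each word with the code point of its first character so the
# reason for the ordering is visible.
first_codepoints = [(w, hex(ord(w[0]))) for w in by_codepoint]

for word, cp in first_codepoints:
    print(cp, word)
```

Every accented letter lands after 'z' here, simply because its code point is numerically larger, which is rarely what a human reader expects.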

The core of the sort order provided by 'international' sorting is the
Unicode Collation Algorithm - which provides (at its core) a default
order for all the languages/scripts present in Unicode. e.g. Latin
languages are generally expected to come before Greek which is expected
to come before Cyrillic.

This core order is then tailored by locale to take account of the
individual expectations of the user of the sorted list. For instance,
different languages have different sort orders for what you might
consider the 'same letters' due to using the same glyphs. For example, a
Swedish user would expect 'z' to sort before 'ö'; whereas a German user would expect
'ö' to sort before 'z'.
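
To make the idea of locale tailoring concrete, here is a toy Python sketch. The alphabets and helper names (SWEDISH_ALPHABET, GERMAN_FOLD, swedish_key, german_key) are hand-written assumptions for illustration only; real collation as done by ICU is far richer (multiple comparison levels, contractions, and so on) and does not work this way internally.

```python
# Toy sketch of locale-tailored ordering. Everything here is hypothetical
# and hand-rolled; it is NOT how ICU or the LiveCode engine collates.

# Swedish: å, ä, ö are separate letters placed after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"

# German (dictionary style): umlauts sort with their base letter.
GERMAN_FOLD = {"\u00e4": "a", "\u00f6": "o", "\u00fc": "u"}


def swedish_key(word):
    # Position of each letter in the Swedish alphabet.
    # Raises ValueError for characters outside this toy alphabet.
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]


def german_key(word):
    # Compare on the folded form first, then on the original as a tie-break.
    folded = "".join(GERMAN_FOLD.get(ch, ch) for ch in word.lower())
    return (folded, word)


letters = ["z", "\u00f6", "o"]  # \u00f6 is 'ö'

swedish_order = sorted(letters, key=swedish_key)
german_order = sorted(letters, key=german_key)

print("Swedish:", swedish_order)
print("German: ", german_order)
```

The same three letters come out in a different order depending on which key is used, which is the effect the locale tailoring has on the core order.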

The engine uses ICU's implementation of Unicode collation, and supports
a wide range of locales - the locale used is read on engine startup from
the system settings.

Hope this helps,


Mark Waddingham ~ mark at ~
LiveCode: Everyone can create apps
