Sorting strangeness

Mark Waddingham mark at livecode.com
Mon Sep 16 14:45:30 EDT 2019


On 2019-09-16 19:01, Paul Dupuis via use-livecode wrote:
> Thanks Bob for being one of the folks on the list who always tries to
> offer solutions for people.
> 
> That said, I have solutions aplenty. My real question is for
> LIVECODE, LTD or perhaps someone like Mark Waddingham who could
> actually tell whether this is the expected behavior (not a BUG, but
> probably should be documented) or an aberrant behavior (a BUG and
> should be reported)

It's definitely a bug - sorting a field with that content works 
correctly, but sorting a variable doesn't.

After staring at the string for a while it occurred to me that the line 
which is sorting incorrectly is all ASCII - "Norwegian Norsk" - indeed 
the following causes the string to sort correctly again:

   sort <original text> ascending text by (each & numToCodepoint(0xFFEF))

When sort runs, it first splits the input string into separate 
strings - one for each line. In this case the "Norwegian Norsk" line 
becomes a native string, whereas all the others are unicode. Appending 
a non-ASCII codepoint to each sort key, as above, forces every key to 
be unicode, so the bug doesn't manifest.
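
As a rough sketch, that workaround could be wrapped in a function. The 
names sortLinesAsUnicode and pText are made up for illustration, and 
this assumes the lines to sort are already in a variable:

```livecode
-- Hypothetical helper: returns the lines of pText sorted with every
-- sort key forced to unicode.
function sortLinesAsUnicode pText
   -- The suffix is added to the sort key only, so the lines
   -- themselves are returned unchanged.
   sort lines of pText ascending text by (each & numToCodepoint(0xFFEF))
   return pText
end sortLinesAsUnicode
```

It would be called as "put sortLinesAsUnicode(tList) into tList", where 
tList is any variable holding the text. Note that the extra codepoint 
takes part in the comparison, so it may change the relative order of 
two lines where one is a prefix of the other.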

Poking around some more, this also seems to work correctly:

   set the caseSensitive to true
   sort <original text> ascending text

So there appears to be a difference between the sort keys being 
generated for unicode and native strings - at least when caseSensitive 
is false.

The field case works because the field coerces all content to unicode 
(as the text APIs on all platforms take UTF-16 these days), and I 
believe there is an optimization in place if you sort a field by lines - 
it doesn't have to cut anything up, it just uses the backing string from 
each paragraph.

I have a feeling I know precisely where the problem lies, so if you file 
a bug (for once) we should be able to fix it quite rapidly.

Warmest Regards,

Mark.

P.S. Another way to get the correct result is to do this (which is 
essentially what the engine does internally if caseSensitive is false):

   set the caseSensitive to true
   sort <original text> ascending text by toLower(each)
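
If you want to check whether a particular piece of text is affected, 
something along these lines might help. It is only a sketch: the 
function name and variable names are invented for illustration, and it 
assumes that folding case with toLower and then comparing exactly gives 
the ordering the caseless sort is supposed to produce.

```livecode
-- Hypothetical helper: returns true if the default caseless sort of
-- pText's lines gives a different order from the explicit-toLower sort.
function sortLooksAffected pText
   local tPlain, tFolded
   put pText into tPlain
   put pText into tFolded
   -- Plain sort; the caseSensitive is false by default in a handler.
   sort lines of tPlain ascending text
   -- Reference sort: fold case explicitly, then compare exactly.
   set the caseSensitive to true
   sort lines of tFolded ascending text by toLower(each)
   -- The caseSensitive is still true, so this comparison is exact.
   return tPlain is not tFolded
end sortLooksAffected
```

A true result is a hint rather than proof, since toLower and the 
engine's caseless comparison may not agree for every character.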


-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



