NULL characters and sorting
Richard Gaskin
ambassador at fourthworld.com
Fri Mar 20 11:26:30 EDT 2009
Paul Looney wrote:
> One of my customers had a problem displaying all of their archived
> orders.
> There should have been 16,020 archived records but only 3,879 were
> showing up in the list.
...
> I started checking the contents of variables in the appropriate
> handler and discovered there were the proper number of records just
> before a sort by column. After the sort, records were missing.
> Phil, the Great, Davis - Wizard of West Linn - suggested checking for
> and removing NULLs (because they terminate a line in C). There were
> 131,023 NULLs in the pre-sort variable.
> When I removed them before the sort, the number of listed records
> jumped from 3,879 to 16,020.
> This leaves some questions:
> How can 131,023 NULL "characters" reduce the displayed "lines" by
> 12,141?
The sort command has a limit which I don't believe is currently
documented: it can only be used reliably on data sets in which no line
is longer than 65,535 characters.
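
One way to check whether a data set is at risk is to look for any line longer than that limit before sorting. The sketch below is in Python rather than Rev script, purely as an illustration. The function name and the constant are my own inventions, not anything in the engine, and it counts characters where the engine may be counting bytes.

```python
# Limit described above: sort is only reliable when no line exceeds this length.
SORT_LINE_LIMIT = 65535

def find_long_lines(data, limit=SORT_LINE_LIMIT):
    """Return (line_number, length) for every line longer than limit."""
    long_lines = []
    for number, line in enumerate(data.split("\n"), start=1):
        if len(line) > limit:
            long_lines.append((number, len(line)))
    return long_lines
```

An empty result means no line crosses the limit. Anything else gives the line numbers worth inspecting before the sort runs.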
I've submitted a request to have this noted in the docs:
<http://quality.runrev.com/qacenter/show_bug.cgi?id=7823>
I've also submitted a request to have this limit raised:
<http://quality.runrev.com/qacenter/show_bug.cgi?id=7824>
I learned about this when I encountered a similarly mystifying bug with
one of my WebMerge customers, back when Dr. Raney managed the engine.
His response was that if one had an item in a line which was pushing the
line beyond that limit, that's a good argument for putting that data
somewhere else. ;)
At the time I argued with him, but working around it led me to a useful
feature in my product: the ability to reference external files from
within a record. In my case one of the editors at a major Mac magazine
was using WebMerge on some data from FileMaker, in which he stored full
articles. With the addition of the new feature he was able to write
those articles in any tool and store them anywhere, merely referencing
the file from his database; my product would then find it and include it
as though it were part of the data.
While it worked out well for me and my customers, I still see the
occasional data set in which some lines are longer than 65,535 chars,
and still believe it would be a useful enhancement to the engine.
In your case it may not be a problem at all now that the NULLs are
removed, presumably reducing the number of chars well below that limit.
64K characters is roughly 28 pages of text, so it's pretty rare that
such a limit would be exceeded on a single line of actual data.
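
To test whether stripping the NULLs is really what brought the lines back under the limit, one could compare the NULL count and the longest line before and after removing them. This is a minimal Python sketch for illustration only, not Rev script. The helper names are hypothetical, and it assumes lines are separated by a plain linefeed.

```python
NUL = "\x00"

def strip_nulls(data):
    """Return (cleaned_data, number_of_nulls_removed)."""
    return data.replace(NUL, ""), data.count(NUL)

def longest_line(data):
    """Length of the longest line in data (0 for empty data)."""
    return max(len(line) for line in data.split("\n"))
```

If longest_line reports more than 65,535 for the raw data and less for the cleaned data, that would fit the explanation above. If it is under the limit in both cases, something else is eating the records.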
Now that the NULLs have been removed, the next trick is to figure out how
those got into the data in the first place. I'm tempted to place a bet
that it's related to pasting data that had been copied from MS Word.
I'd be interested to learn how that happened either way.
--
Richard Gaskin
Fourth World
Revolution training and consulting: http://www.fourthworld.com
Webzine for Rev developers: http://www.revjournal.com