Maximum field size

Richard Gaskin ambassador at fourthworld.com
Fri Jan 20 14:41:29 EST 2023


David Epstein wrote:

 > Richard Gaskin asks “Why?”
 >
 > I have developed a set of routines to analyze tabular data.  For KB
 > or MB-sized files, it is convenient to display them in a field.  It
 > would be simplest if I could also load GB-sized files and use my
 > routines unchanged, but I accept that this is impractical.  But in
 > order to design workarounds I’d like to get as much clarity as
 > possible on what limits I am working around.

Do you read the text when it's measured in megabytes?

R and other data processing tools encourage the habit of displaying 
results, but rarely the data set as a whole.  Of course I haven't seen 
what you're working on, and I've had my own moments now and then when 
just randomly scanning large data sets has yielded "a ha!" insights, so 
I can appreciate that desire in your work with Cornell.

One option to consider, if practical for your needs, is a one-time 
change to work with the data in a variable for all data sets regardless 
of size; that would at least obviate the need for special-casing data 
sets by size.
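
Something along those lines might look like this.  It's only a rough 
sketch: the handler name, the field name "Preview", and the 200-line 
excerpt are placeholders of mine, not anything from your routines.

on analyzeFile pFilePath
   local tData
   -- read the whole file into a variable rather than a field
   put URL ("binfile:" & pFilePath) into tData
   put textDecode(tData, "UTF-8") into tData
   -- run the existing routines against the variable,
   -- e.g. repeat for each line tLine in tData ...
   -- then show only a small excerpt for eyeballing
   put line 1 to 200 of tData into field "Preview"
end analyzeFile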


As for field limits, I believe Jacque summarized them well:

- Per line: 64k chars when rendered without text wrap
   (a rendering limit only; the field text is still addressable, and
    everything works swimmingly in a var)

- Total - Logical: 4GB (32-bit ints used for allocation)

- Total - Practical: a mix of factors: the available addressable space 
on the current system in its current state, which at times can require 
much more memory than the size of the data on disk given the iterative 
allocation calls needed to move the I/O buffer into variable space, 
further constrained by any limits the host OS's allocation routines 
impose on contiguous blocks (Mark Waddingham has noted many times in 
this context that the Win32 APIs cap contiguous allocation far below 
the logical 4GB threshold).

- Total - Anecdotal: I use the Gutenberg KJV Bible file frequently for 
stress-testing text routines, but while we think of the Bible as a 
large text it weighs in at just 4.5 MB.  In rarer cases where I've 
needed to probe for outliers I've created test sets above 100 MB 
without issue (see the sketch after this list), but I begin to see 
major slowdowns well before that size when line-wrapping calculations 
are needed, and above ~100 MB significant slowdowns for display, 
scrolling, and save operations.
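
For anyone who wants to probe those upper ranges themselves, one quick 
way to manufacture an oversized test set is to build it in a variable 
and write it to disk.  Again just a rough sketch; the handler name, the 
column layout, and the 100 MB target in the example are arbitrary 
choices of mine:

on makeTestData pTargetSize, pFilePath
   local tData, tCount
   put 0 into tCount
   -- build tab-delimited lines until we pass the requested size
   -- (length counts chars, which matches bytes for ASCII content)
   repeat until the length of tData >= pTargetSize
      add 1 to tCount
      put tCount & tab & random(999999) & tab & "sample text" & cr after tData
   end repeat
   put tData into URL ("file:" & pFilePath)
end makeTestData

-- e.g. makeTestData 100000000, specialFolderPath("desktop") & "/bigtest.txt"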

-- 
  Richard Gaskin
  Fourth World Systems
  Software Design and Development for the Desktop, Mobile, and the Web
  ____________________________________________________________________
  Ambassador at FourthWorld.com                http://www.FourthWorld.com


