Buffer size (was Looking for parser for Email (MIME))
Richard Gaskin
ambassador at fourthworld.com
Tue Mar 22 10:24:27 EDT 2016
Mark Waddingham wrote:
> open file ...
> repeat forever
> read from file ... until return
> if the result is not empty then
> exit repeat
> end if
> if *it is a new message boundary* then
> ... finish processing current message ...
> ... start processing new boundary ...
> else
> ... append line to current message ...
> end if
> end repeat
What is the size of the read buffer used when reading until <char>?
I'm assuming it isn't reading a single char per disk access, probably at
least using the file system's block size, no?
I ask because some months ago I needed to parse a 6GB file and
"read...until CR" was slower than I preferred so I experimented with a
complicated routine that reads into a buffer of about 128k and then
parses the buffer.
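
For illustration only, here is a minimal sketch of that kind of buffered read. It is not the routine described above: the handler name countLinesBuffered, its parameters, and the line-counting stand-in for real parsing are all hypothetical. It assumes text-mode reads with lines ending in return, and that a non-empty result after "read" means the end of the file (or an error) was reached.

```livecode
-- Hypothetical sketch: read a file in fixed-size chunks and walk its lines.
-- pBufferSize defaults to 128k, the size mentioned in the text.
function countLinesBuffered pPath, pBufferSize
   local tBuffer, tCarry, tReadResult, tLine, tCount
   if pBufferSize is empty then put 128 * 1024 into pBufferSize
   open file pPath for read
   if the result is not empty then return "error:" && the result
   put 0 into tCount
   repeat forever
      read from file pPath for pBufferSize
      put the result into tReadResult -- "eof" once the end is reached
      put tCarry & it into tBuffer
      if (tReadResult is empty) and (the last char of tBuffer is not return) then
         -- the chunk ended mid-line: hold the partial line back for the next pass
         put the last line of tBuffer into tCarry
         delete the last line of tBuffer
      else
         put empty into tCarry
      end if
      repeat for each line tLine in tBuffer
         add 1 to tCount -- stand-in for real per-line processing
      end repeat
      if tReadResult is not empty then exit repeat
   end repeat
   close file pPath
   return tCount
end countLinesBuffered
```

The only subtle part is the carry-over: a chunk boundary rarely falls on a line boundary, so the trailing partial line has to be prepended to the next chunk before parsing.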
If I can turn up the code it may be mildly interesting, but the main
question it raised for me was:
Given that the engine is probably already doing pretty much the same
thing, would it make sense to consider a readBufferSize global property
which would govern the size of the buffer the engine uses when executing
"read...until <char>"?
In my experiments I was surprised to find that larger buffers (>10MB)
were slower than "read...until <char>", but the sweet spot seemed to be
around 128k. Presumably this has to do with the overhead of allocating
contiguous memory, and if you have any insights on that it would be
interesting to learn more.
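
A rough way to repeat that comparison is sketched below. The handler name timeBufferSizes and the three sizes tried are assumptions chosen for illustration. It times raw reads only, with no parsing, so it will not reproduce the figures above, and OS file caching can skew repeated runs over the same file.

```livecode
-- Hypothetical sketch: time a full pass over a file at several chunk sizes.
-- Returns one line per size: size in bytes, tab, elapsed milliseconds.
function timeBufferSizes pPath
   local tSize, tStart, tReport
   repeat for each item tSize in "4096,131072,10485760"
      put the milliseconds into tStart
      open file pPath for binary read
      repeat forever
         read from file pPath for tSize
         -- a non-empty result means eof (or an error), so stop reading
         if the result is not empty then exit repeat
      end repeat
      close file pPath
      put tSize & tab & (the milliseconds - tStart) & return after tReport
   end repeat
   return tReport
end timeBufferSizes
```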
I recognize this sort of thing may seem like mere performance
fetishism, but I believe this has useful application for making LC an
ever better solution for working with large amounts of data.
Pretty much any program will read big files in chunks, and if LC can do
so optimally with all the grace and ease of "read...until <char>" it
makes one more strong set of use cases where choosing LC isn't a
tradeoff but an unquestionable advantage.
--
Richard Gaskin
Fourth World Systems
Software Design and Development for the Desktop, Mobile, and the Web
____________________________________________________________________
Ambassador at FourthWorld.com http://www.FourthWorld.com