ambassador at fourthworld.com
Mon Dec 8 11:49:06 CST 2008
> This is the sort of thing I wrote, too. HC code would be similar;
> I'll take your word for the speed.
> I was wondering if there was something snazzy in Rev (that was not
> in HC) that did not require running through the whole body of text,
> in other words, directly, like the new "replaceText" function that
> is a one-line substitute for "fullReplace". Very Rinaldi-like.
While the algorithms you would use in Rev and HC may appear similar in
some respects, don't underestimate the power of "repeat for each". This
form of repeat was not available in HC, and is at least one or two
orders of magnitude faster than "repeat with i = 1 to the number of
lines".
The reason for the incredible speed difference between the two is that
"repeat for each" makes the assumption (indeed requires) that the text
being parsed will not change during the repeat, while "repeat with..."
cannot know this in advance.
So when running "repeat with i = 1 to the number of lines", each time
through the loop it needs to count the lines from 1 to i, and then you
can say "get line i" and that will once again count from 1 to i to
obtain the line. Lots of redundant overhead, but necessary in a form in
which the text being parsed might be changing in each iteration.
In contrast, "repeat for each" counts and parses chunks as it goes,
doing two operations at once (going to the line and parsing it out into
the iteration variable) and never counting its way through the text
within a given iteration, since it knows where it left off last time.
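To make the difference concrete, here is a minimal sketch of the same job written both ways. The handler names (countNonEmptySlow, countNonEmptyFast) and the variable names are made up for illustration; both simply count the non-empty lines of whatever text is passed in.

```livecode
-- Slow form: "line i of pText" has to be located by counting
-- lines from the start of the text on every pass.
function countNonEmptySlow pText
   local tCount, i
   put 0 into tCount
   repeat with i = 1 to the number of lines of pText
      if line i of pText is not empty then add 1 to tCount
   end repeat
   return tCount
end countNonEmptySlow

-- Fast form: tLine is handed each line in turn, so nothing
-- is recounted from the top of the text.
function countNonEmptyFast pText
   local tCount
   put 0 into tCount
   repeat for each line tLine in pText
      if tLine is not empty then add 1 to tCount
   end repeat
   return tCount
end countNonEmptyFast
```

Both should return the same count; on a few thousand lines or more the second form is the one to time against your existing HC-style loops.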
To make things even faster, a few years ago Scott Raney optimized the
"put <text> after <othertext>" operation, with a dramatic speed boost.
In most xTalks, modifying a chunk means a fair amount of overhead with
multiple copies of the destination text to accommodate the change. With
"put...after..." the underlying pointer manipulation has been
significantly optimized specifically for that append operation, so it's
much faster in Rev than any other xTalk I've seen (and I've worked with
most of them, including HC, SC, Plus, OMO, Gain Momentum, and ToolBook).
These two constructs, "repeat for each..." and "put...after...", combine
well for most common parsing tasks, making extraordinarily efficient
work of slicing through even large blocks of data and combining your
found results into a new variable.
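As a rough sketch of how the two constructs fit together, the function below collects every line that contains a given string. The name linesContaining and its parameters are hypothetical, chosen only for this example.

```livecode
-- Returns the lines of pText that contain pPattern.
-- (Hypothetical helper; names are illustrative only.)
function linesContaining pText, pPattern
   local tResult
   repeat for each line tLine in pText
      if pPattern is in tLine then
         -- append to the result rather than modifying a chunk in place
         put tLine & return after tResult
      end if
   end repeat
   -- drop the trailing return left by the last append
   delete last char of tResult
   return tResult
end linesContaining
```

You might call it with something like "put linesContaining(fld 1, "Rinaldi") into fld 2". Note that the source text is only read, never changed, inside the loop, which is what "repeat for each" expects.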
And then there are arrays, variables with slots for each element which
can be referenced by the element name.
In HC, multi-part data could be stored in a given variable only by
carefully minding your chunks, e.g.:
put tData into line 4000 of tMyVar
This means that you not only need to make sure that tData never contains
any character you're using to delimit your chunks, but also that
accessing it will require the overhead of counting chunks to access the
one you want:
get line 4000 of tMyVar
Arrays use an internal hashing scheme so that element names are
associated with the location of the element's data in a much more
efficient way. To store data you just use:
put tData into tMyVar[4000]
And to get it:
get tMyVar[4000]
But the real advantage of arrays is that you're not limited to indexing
them by number; you can use names instead.
For example, if you wanted to store info about a person in a chunk, you
might write:
put tName into line 1 of tMyVar
put tPhone into line 2 of tMyVar
put tAddress into line 3 of tMyVar
Works well enough - as long as you remember which line has which data.
So this requires adding comments in your code to remind you later on how
the data is structured.
With arrays this is much simpler:
put tName into tMyVar["name"]
put tPhone into tMyVar["phone"]
put tAddress into tMyVar["address"]
If you need to display all of the elements of an array, you can combine
them with the "combine" command:
combine tMyVar with return and tab
put tMyVar into fld 1
You can also convert a list into an array with the split command:
put fld 1 into tMyVar
split tMyVar with return and tab
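Here is a small round trip showing the two commands working on the same variable. The handler name and the sample values are invented for illustration; the point is that combine turns the array into delimited text in place, and split turns that text back into an array.

```livecode
on demoArrayRoundTrip
   local tPerson
   put "Jane Doe" into tPerson["name"]
   put "555-0100" into tPerson["phone"]
   -- tPerson stops being an array here and becomes text,
   -- one "key<tab>value" pair per line
   combine tPerson with return and tab
   -- rebuild the array from that text: the first item of each
   -- line becomes the key, the rest becomes the element
   split tPerson with return and tab
   put tPerson["phone"]   -- shows the element in the message box
end demoArrayRoundTrip
```

One thing to keep in mind: the lines in the combined text may not come out in the order you stored the elements, so sort the text first if the display order matters.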
In summary, there's a bit of a learning curve with picking up the most
efficient ways to do things in Rev. But it's time well spent, because
in most cases you'll be free to munge data however you want right in the
language, liberated from dependencies on externals written in C.
And given the overhead of the XCMD interface, you might even find that
some of these operations actually run faster in Transcript than in the
externals you used to use.
Revolution training and consulting: http://www.fourthworld.com
Webzine for Rev developers: http://www.revjournal.com