How can we dynamically create variable names from changing value "x" on a loop?
Ben Rubinstein
benr_mc at cogapp.com
Mon Nov 7 10:39:18 EST 2016
On 07/11/2016 15:02, Richard Gaskin wrote:
> Ben Rubinstein wrote:
>
>> (Re Mike and Mark's comments, if it's a small thing I'll use an
>> array; but for large quantities of data - I'm often dealing with
>> very large files, and after calling this function will loop over
>> tens or hundreds of thousands of rows using the variables - I feel
>> the need for speed outweighs the simplicity.)
>
> Indeed, contrary to popular belief I've seen cases where certain aggregate
> operations on an array take more time than achieving the same outcomes with
> delimited lists.
>
> But so far only a few.
>
> How did you benchmark that, and what was the measured difference?

I'll confess: no benchmarking has taken place, just intuition.

I love arrays - coming from a HyperCard world in which I was similarly doing
large amounts of processing over files, the ability to use hashed arrays when
I discovered MetaCard made an enormous difference. I was and am blown away by
the speed of access they allow.

The context in which I'm typically using this 'makeAccessVars' functionality
is where the code loads a massive TSV file and then iterates through the rows
doing various processing on the data. I don't want to hard-code the columns in
which the data will be, because very occasionally that may change, and it's
too easy to have a subtle bug here.

So the typical routine is something like

    do makeAccessVars("vi", line 1 of tTSVdata)
    delete line 1 of tTSVdata
    repeat for each line tRec in tTSVdata
        doSomething item viUserID of tRec, item viUserName of tRec
        ...
    end repeat

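For anyone following along, here is a minimal sketch of what a function like
'makeAccessVars' might look like. This is a reconstruction from the
description in this message, not the actual code: the parameter names, the
stripping of spaces from column names, and the assumption that names contain
no other characters that are illegal in a variable name are all guesses.

    -- Hypothetical sketch: build a script that, when passed to 'do',
    -- puts each column's position into a variable named prefix + column name
    function makeAccessVars pPrefix, pHeaderLine
        local tScript, tIndex, tName, tVarName
        set the itemDelimiter to tab
        put 0 into tIndex
        repeat for each item tName in pHeaderLine
            add 1 to tIndex
            -- "User ID" becomes e.g. viUserID (assumed naming rule)
            put tName into tVarName
            replace space with empty in tVarName
            put "put" && tIndex && "into" && pPrefix & tVarName & return after tScript
        end repeat
        return tScript
    end makeAccessVars

The calling handler would also need 'set the itemDelimiter to tab' before
using 'item viUserID of tRec', since the itemDelimiter is local to each
handler. Creating variables through 'do' like this relies on explicit
variable declaration being switched off.
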
If I'm doing something that _isn't_ going to repeat a vast number of times, I
often use a variation more along these lines:

    put line 1 of tTSVdata into tColumnNames
    delete line 1 of tTSVdata
    repeat for each line tRec in tTSVdata
        put explodeRow(tRec, tColumnNames) into aData
        doSomething aData["User ID"], aData["User Name"]
        ...
    end repeat

where 'explodeRow' does the obvious thing: it constructs an array containing
the data from the row, each value indexed by the name of the column in which
it appeared. I prefer that style, as it makes the "doSomething" part of the
code - which is generally the most interesting bit, and therefore the one that
most needs to be easy to understand - clearer. It must surely be slower,
though I admit I've never done the experiment to find out how significant the
difference is.
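
To make "the obvious thing" concrete, a minimal sketch of such a function
might be the following. The parameter and variable names are assumptions, and
it assumes tab-delimited rows with the column names in the same order as the
data.

    -- Hypothetical sketch: return an array of the row's values,
    -- keyed by the name of the column each value appeared in
    function explodeRow pRow, pColumnNames
        local tData, tIndex, tName
        set the itemDelimiter to tab
        put 0 into tIndex
        repeat for each item tName in pColumnNames
            add 1 to tIndex
            put item tIndex of pRow into tData[tName]
        end repeat
        return tData
    end explodeRow

Note that 'item tIndex of pRow' rescans the row from its start each time, so
for rows with very many columns this form does more work than a single pass
over the items would.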

(Actually I'd probably get better performance in the latter case, and further
enhance readability, if I combined the two approaches, i.e. modified
'makeAccessVars' so that, instead of returning a string which, when passed to
'do', declares variables named for each column and assigns indices to them,
it's called for each row to assign the actual values to the variables. I'm not
sure why I don't do this.)
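
Since none of the above has been measured, a rough timing harness along the
following lines could settle it. This is only a sketch: the handler name
'timeBothStyles' and the tSink variable are made up for illustration, it
assumes 'makeAccessVars' and 'explodeRow' behave as described in this message,
and it assumes the file has "User ID" and "User Name" columns.

    -- Hypothetical sketch: time both access styles over the same TSV data
    command timeBothStyles pTSVdata
        local tColumnNames, tRec, aData, tSink, tStart, tVarsTime, tArrayTime
        set the itemDelimiter to tab
        put line 1 of pTSVdata into tColumnNames
        delete line 1 of pTSVdata

        -- style 1: column-index variables created via 'do'
        do makeAccessVars("vi", tColumnNames)
        put the milliseconds into tStart
        repeat for each line tRec in pTSVdata
            put (item viUserID of tRec) & (item viUserName of tRec) into tSink
        end repeat
        put the milliseconds - tStart into tVarsTime

        -- style 2: one array built per row
        put the milliseconds into tStart
        repeat for each line tRec in pTSVdata
            put explodeRow(tRec, tColumnNames) into aData
            put aData["User ID"] & aData["User Name"] into tSink
        end repeat
        put the milliseconds - tStart into tArrayTime

        put "index variables:" && tVarsTime && "ms, arrays:" && tArrayTime && "ms"
    end timeBothStyles

Running it a few times over a realistically large file would give a fairer
picture than a single pass.
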
Ben