How can we dynamically create variable names from changing value "x" on a loop?

Ben Rubinstein benr_mc at cogapp.com
Mon Nov 7 10:39:18 EST 2016


On 07/11/2016 15:02, Richard Gaskin wrote:
> Ben Rubinstein wrote:
>
>> (Re Mike and Mark's comments, if it's a small thing I'll use an
>> array; but for large quantities of data - I'm often dealing with
>> very large files, and after calling this function will loop over
>> tens or hundreds of thousands of rows using the variables - I feel
>> the need for speed outweighs the simplicity.)
>
> Indeed, contrary to popular belief I've seen cases where certain aggregate
> operations on an array take more time than achieving the same outcomes with
> delimited lists.
>
> But so far only a few.
>
> How did you benchmark that, and what was the measured difference?

I'll confess: no benchmarking has taken place, just intuition.

I love arrays - coming from a HyperCard world in which I was similarly doing 
large amounts of processing over files, the ability to use hashed arrays when 
I discovered MetaCard made an enormous difference. I was and am blown away by 
the speed of access they allow.

The context in which I'm typically using this 'makeAccessVars' functionality 
is where the code loads a massive TSV file and then iterates through the rows 
doing various processing on the data. I don't want to hard-code the columns in 
which the data will be, because very occasionally that may change, and it's 
too easy to have a subtle bug here.

So the typical routine is something like

	do makeAccessVars("vi", line 1 of tTSVdata)
	delete line 1 of tTSVdata
	repeat for each line tRec in tTSVdata
		doSomething item viUserID of tRec, item viUserName of tRec
		...
	end repeat
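
For anyone who hasn't seen the function: a minimal sketch of what a
'makeAccessVars' along these lines might look like. This is a hypothetical
reconstruction, not the actual code, which isn't shown in this thread. It
assumes a tab-delimited header row and that stripping spaces is enough to
turn each column name into a legal variable name.

```livecode
-- Hypothetical reconstruction of makeAccessVars; the real function is not
-- shown in this thread.
-- Returns a script which, when passed to 'do', puts each column's index
-- into a variable named <prefix><column name without spaces>.
function makeAccessVars pPrefix, pHeaderRow
   local tScript, tIndex, tColumn, tVarName
   -- The header is assumed to be tab-delimited; this setting is local to
   -- this handler and does not affect the caller.
   set the itemDelimiter to tab
   put 0 into tIndex
   repeat for each item tColumn in pHeaderRow
      add 1 to tIndex
      -- Assumption: removing spaces yields a legal variable name
      -- ("User ID" becomes "UserID"). Other punctuation is not handled.
      put tColumn into tVarName
      replace space with empty in tVarName
      put "put" && tIndex && "into" && pPrefix & tVarName & return after tScript
   end repeat
   return tScript
end makeAccessVars
```

With the prefix "vi" and a header of "User ID" and "User Name", the returned
script would consist of the lines "put 1 into viUserID" and "put 2 into
viUserName". The calling handler would still need the itemDelimiter set to
tab before 'item viUserID of tRec' picks out the right field.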


If I'm doing something that _isn't_ going to repeat a vast number of times, I 
often use a variation more along these lines:

	put line 1 of tTSVdata into tColumnNames
	delete line 1 of tTSVdata
	repeat for each line tRec in tTSVdata
		put explodeRow(tRec, tColumnNames) into aData
		doSomething aData["User ID"], aData["User Name"]
		...
	end repeat

where 'explodeRow' does the obvious thing to construct an array containing the 
data from the row, each value indexed by the name of the column in which it 
appeared.  I prefer that style, as it makes the "doSomething" part of the code
- which is generally the most interesting bit, and therefore the one that
most needs to be easy to understand - clearer. Obviously it must be slower,
though I admit I've never done the experiment to find out how significant the
difference is.
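
For reference, a minimal sketch of the sort of thing 'explodeRow' does. Again
this is a hypothetical version written for illustration, assuming
tab-delimited rows and column names; the parameter and variable names are
made up here.

```livecode
-- Hypothetical sketch of explodeRow: builds an array holding the row's
-- values, each keyed by the name of the column in which it appeared.
function explodeRow pRow, pColumnNames
   local aData, tIndex, tColumn
   -- Both the row and the column names are assumed to be tab-delimited.
   set the itemDelimiter to tab
   put 0 into tIndex
   repeat for each item tColumn in pColumnNames
      add 1 to tIndex
      -- A row with fewer items than columns simply yields empty values.
      put item tIndex of pRow into aData[tColumn]
   end repeat
   return aData
end explodeRow
```

The extra cost over the index-variable version is building a fresh array for
every row. Reading 'the milliseconds' before and after each style of loop
over the same data would give a rough measure of how much that matters.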

(Actually I'd probably get better performance in the latter case, and further
enhance readability, if I combined the two approaches, i.e. modified
'makeAccessVars' so that, instead of returning a string which when passed to
'do' declares variables named for each column and assigns indices to them,
it's called for each row to assign the actual values to the variables. I'm not
sure why I don't do this.)

Ben



