Translate metadata to field content
Niggemann, Bernd
Bernd.Niggemann at uni-wh.de
Thu Feb 20 14:21:18 EST 2020
In reply to Mark Waddingham's comments
Thank you Mark Waddingham for the improved scripts and the hints as to why they improve speed.
I adapted Mark's version for a unique occurrence and changed how the position of the target word is determined in the target line.
It is not safe to assume that the sum of the word counts of the runs equals the number of words of the line up to the target word. Runs depend on formatting, and formatting can start a new run in the middle of a word, which inflates the word count.
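A small sketch of the effect, using made-up run texts (not taken from the field above): the word "hello" is split across two runs, as it would be if a style change started in the middle of the word. The handler name and the array contents are assumptions for illustration only.

```livecode
on mouseUp
   local tRunsA, tSum, tText
   -- hypothetical runs: a formatting change splits "hello" into "hel" + "lo"
   put "hel" into tRunsA[1]["text"]
   put "lo world" into tRunsA[2]["text"]
   repeat with m = 1 to 2
      -- per-run counting: "hel" is 1 word, "lo world" is 2 words
      add the number of words of tRunsA[m]["text"] to tSum
      -- concatenating first keeps the split word in one piece
      put tRunsA[m]["text"] after tText
   end repeat
   -- shows 3 for the summed runs and 2 for the concatenated text
   answer tSum && the number of words of tText
end mouseUp
```

Counting the words of the concatenated text gives the position that "word n of line i" expects; the per-run sum is one too high here.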
I did not opt for Mark's use of codeunits because I had the impression that it was not faster and that it made the code less obvious.
--------------------------------------
local tTextOfRuns
repeat for each key i in tDataA
   local tRunsA
   put tDataA[i]["runs"] into tRunsA
   repeat for each key j in tRunsA
      if tRunsA[j]["metadata"] is tSearchText then
         -- join the text of runs 1 to j and count the words of the joined
         -- text, so that a word split across runs is counted only once
         repeat with m = 1 to j
            put tRunsA[m]["text"] after tTextOfRuns
         end repeat
         put the number of words of tTextOfRuns into tNumWords
         put true into tFlagExit
         exit repeat
      end if
   end repeat
   -- leave the outer loop as well once the target has been found
   if tFlagExit then
      exit repeat
   end if
end repeat
--------------------------------------
select word tNumWords of line i of field "x"
The text consists of 96881 words and 23161 lines of heavily formatted text.
(It is the script of RevDataGridLibraryBehaviorsDataGridButtonBehavior copied twice into a field, as described before.)
Times in ms:

word#    old version   new version
96881    240           110
80000    220           100
60000    180           60
30000    120           125
10000    85            125
1000     50            90
1        50            60
Timing this is a bit tricky. For "repeat with I = 1 to item 2 of the extents" it is obvious that the time increases as the target word number increases.
"repeat for each key I in tDataA" is not sequential, but it is faster. However, that also makes for variations in speed depending on the internal state of the array structure.
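If a sequential walk is wanted while still using a "for each" loop, one possible variation (not what I timed above) is to sort the keys first and loop over the sorted list. This is only a sketch with a made-up three-element array standing in for tDataA; sorting has its own cost, which I have not measured.

```livecode
on mouseUp
   local tDataA, tKeys, tOrder
   -- hypothetical stand-in for tDataA, filled in non-sequential order
   put "c" into tDataA[3]
   put "a" into tDataA[1]
   put "b" into tDataA[2]
   -- the keys come back as a return-delimited list in no particular order
   put the keys of tDataA into tKeys
   sort lines of tKeys ascending numeric
   -- visit the elements in key order 1, 2, 3
   repeat for each line i in tKeys
      put tDataA[i] after tOrder
   end repeat
   answer tOrder -- shows "abc"
end mouseUp
```

For a unique occurrence, as in the script above, the visiting order does not change the result, only how soon the match is reached.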
All timings are estimated averages of 5 to 10 measurements. Variability is typically about ±5 to 10 milliseconds, with outliers.
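A rough sketch of how such an average could be taken with "the milliseconds". The handler name findWordByMetadata is hypothetical; it stands for whatever handler wraps the search script, and the count of 10 repetitions is likewise an assumption.

```livecode
on mouseUp
   local tStart, tTotal, tRuns
   put 10 into tRuns -- assumed number of measurements
   repeat tRuns times
      put the milliseconds into tStart
      findWordByMetadata -- hypothetical: the search being timed
      add (the milliseconds - tStart) to tTotal
   end repeat
   answer "average:" && round(tTotal / tRuns) && "ms"
end mouseUp

-- placeholder so the sketch compiles; replace the body with the real search
command findWordByMetadata
end findWordByMetadata
```

An average like this hides the outliers, so it is worth looking at the individual measurements too when the numbers are this close.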
However the overall speed gain is quite impressive and well worth the change.
I learned a lot about handling larger datasets using arrays, thank you.
Kind regards
Bernd