Translate metadata to field content

Thu Feb 20 02:56:59 EST 2020

On 2020-02-19 21:40, Niggemann, Bernd via use-livecode wrote:
> here is Richard's script which I changed to get the number of words of
> the line with the tagged word, the number of lines are taken from the
> array.
> 
> The tagged word is then: word tNumWords of line (current array key)
> 
> ---------------------------------------------------------
> put item 2 of the extents of tDataA into tExtents
>    repeat with i = 1 to tExtents
>       put item 2 of the extents of tDataA[i]["runs"] into tCounter
>       repeat with j = 1 to tCounter
>          if tDataA[i]["runs"][j]["metadata"] is tSearchText then
>             repeat with m = 1 to j
>                add the number of words of tDataA[i]["runs"][m]["text"] to tNumWords
>             end repeat
>             put true into tFlagExit
>             exit repeat
>          end if
>       end repeat
>       if tFlagExit then exit repeat
>    end repeat
> ---------------------------------------------------------
> 
> select word tNumWords of line i of field "x"

That approach is much better than my suggested one, and is independent 
of the soft breaks of the text as well :)

It can be made a little more efficient though...

[ DISCLAIMER: I don't have any test data to run these on - so the 
following code snippets have not been tested in any way - or 
syntax/error checked :D ]

NON-UNIQUE ANCHORS

If the anchors used are non-unique and you want the first matching 
anchor in the page from top to bottom / left to right then...

1. Using 'the number of elements of' rather than 'the extents' saves 
some time. As the arrays in question are known to be sequences, the 
number of elements of SEQUENCE == item 2 of the extents of SEQUENCE

2. Factoring out the common array lookups will save some time.
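
As a quick illustration of point 1 (untested like the rest, and the names 
tSeqA, tCount and tUpper are made up purely for this example), both forms 
give the same answer for a sequence:

local tSeqA, tCount, tUpper
put "alpha" into tSeqA[1]
put "beta" into tSeqA[2]
put "gamma" into tSeqA[3]
-- a sequence has keys 1..N, so counting elements gives N directly
put the number of elements of tSeqA into tCount
-- the extents of tSeqA is "1,3", so item 2 is the same N
put item 2 of the extents of tSeqA into tUpper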

With these two changes you'd have:

repeat with i = 1 to the number of elements in tDataA
   local tRunsA
   put tDataA[i]["runs"] into tRunsA
   repeat with j = 1 to the number of elements in tRunsA
     if tRunsA[j]["metadata"] is tSearchText then
       -- count every word up to and including the tagged run
       repeat with m = 1 to j
         add the number of words of tRunsA[m]["text"] to tNumWords
       end repeat
       put true into tFlagExit
       exit repeat
     end if
   end repeat
   if tFlagExit then
     exit repeat
   end if
end repeat
select word tNumWords of line i of field "x"

UNIQUE ANCHORS

If the anchors being searched for are unique in a document, then using 
repeat for each key in both loops will save some time. Although the 
search order in the runs and lines will be arbitrary (hash-order), as 
the thing being searched for is unique this doesn't matter.

[ The reason this should be faster is that the engine doesn't need to 
process the index vars (i / j) before looking up in the array. ]

repeat for each key i in tDataA
   local tRunsA
   put tDataA[i]["runs"] into tRunsA
   repeat for each key j in tRunsA
     if tRunsA[j]["metadata"] is tSearchText then
       -- count every word up to and including the tagged run
       repeat with m = 1 to j
         add the number of words of tRunsA[m]["text"] to tNumWords
       end repeat
       put true into tFlagExit
       exit repeat
     end if
   end repeat
   if tFlagExit then
     exit repeat
   end if
end repeat
select word tNumWords of line i of field "x"

RUN WITH METADATA DEFINES SELECTION - NON-UNIQUE SEARCH

If the area you want to select is defined by the run with metadata being 
searched for, then you can avoid words altogether and just count 
codeunits. Codeunits are the fastest chunk to count as they don't need 
any iteration of the content of the string being queried:

repeat with i = 1 to the number of elements in tDataA
   local tRunsA
   put tDataA[i]["runs"] into tRunsA
   repeat with j = 1 to the number of elements in tRunsA
     local tRunA
     put tRunsA[j] into tRunA
     if tRunA["metadata"] is tSearchText then
       -- codeunits in the runs before the match, then the match's own length
       put 0 into tNumCodeunitsBefore
       repeat with m = 1 to j - 1
         add the number of codeunits of tRunsA[m]["text"] \
               to tNumCodeunitsBefore
       end repeat
       put the number of codeunits in tRunA["text"] into tNumCodeunits
       put true into tFlagExit
       exit repeat
     end if
   end repeat
   if tFlagExit then
     exit repeat
   end if
end repeat
select codeunit (tNumCodeunitsBefore + 1) to \
      (tNumCodeunitsBefore + tNumCodeunits) of line i of field "x"

Mutatis mutandis for the unique case using repeat for each key.
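
Spelled out, the unique case might look something like the following. 
This is only a sketch, as untested as everything else here: it assumes 
tDataA and tSearchText are set up as in the earlier snippets, and it 
treats codeunit chunks as 1-based, so the run starts one past the 
codeunits that come before it.

put 0 into tNumCodeunitsBefore
put false into tFlagExit
repeat for each key i in tDataA
   local tRunsA
   put tDataA[i]["runs"] into tRunsA
   repeat for each key j in tRunsA
     if tRunsA[j]["metadata"] is tSearchText then
       -- j is a numeric key, so the runs before it are 1 to j - 1
       repeat with m = 1 to j - 1
         add the number of codeunits of tRunsA[m]["text"] \
               to tNumCodeunitsBefore
       end repeat
       put the number of codeunits in tRunsA[j]["text"] into tNumCodeunits
       put true into tFlagExit
       exit repeat
     end if
   end repeat
   if tFlagExit then
     exit repeat
   end if
end repeat
-- only select if the anchor was actually found
if tFlagExit then
   select codeunit (tNumCodeunitsBefore + 1) to \
         (tNumCodeunitsBefore + tNumCodeunits) of line i of field "x"
end if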

Again - none of these methods require formattedStyledText, just 
styledText (indeed formattedStyledText wouldn't work with this approach 
as that adds extra codeunits - the VTABs - for the soft-breaks which 
aren't actually there!).
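
For completeness, a minimal sketch of where the array would come from, 
assuming (as in the select lines above) that the field is called "x" and 
that tDataA is meant to hold its styled text:

local tDataA
-- styledText, not formattedStyledText: no VTABs are added for soft breaks
put the styledText of field "x" into tDataA
-- tDataA[i]["runs"][j]["text"] is the text of run j of line i
-- tDataA[i]["runs"][j]["metadata"] is the metadata set on that run, if any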

Hope this helps!

Warmest Regards,

Mark.

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps