Searching for a word when it's more than one word

Tom Glod tom at makeshyft.com
Sun Sep 2 20:51:56 EDT 2018


I had this same problem a few weeks ago... luckily it wasn't critical to the
feature set, so I didn't find a solution.  I will swing back around with the
help of this thread.  Thanks for entertaining the problem.

On Sun, Sep 2, 2018 at 5:09 AM Quentin Long via use-livecode <
use-livecode at lists.runrev.com> wrote:

> Have pondered the question, and come up with some code which may or may
> not solve the problem at hand, but which may at least prove helpful in
> looking for a real solution:
>
> ==========================
>
> Assumption: You’ve got a text document (not HTML, not RTF, just plain TXT)
> which contains, among other things, however-many place names.
> Assumption: You have a return-list of place names, which may or may not be
> single words
> Assumption: The text document is in the variable SourceDoc
> Assumption: The list of place names is in the variable NamesList
>
> Assumption: You want a document which contains a complete census of
> exactly which of the place-names in NamesList occur in SourceDoc
> Assumption: For each place-name which does occur within SourceDoc, you
> want a list of the word-numbers at which each such occurrence begins
>
> put "" into PlaceNamesCensus
> repeat for each line DisName in NamesList
>   put the number of words in DisName into DisNameWords
>   put 0 into SearchOffset
>   put "" into FoundLocs
>   repeat
>     -- offset() skips SearchOffset chars and returns a position relative to that point
>     put offset(DisName, SourceDoc, SearchOffset) into DisLoc
>     if DisLoc = 0 then
>       -- there is no (further) character string which matches the place name in question
>       exit repeat
>     else
>       -- there is a character string which matches the place name in question
>       -- is it the actual placename, and not finding "chester" in "colchester"?
>       -- SearchOffset becomes the absolute position of this match in SourceDoc
>       add DisLoc to SearchOffset
>       put the number of words in (char 1 to SearchOffset of SourceDoc) into StartWord
>       if DisName = (word StartWord to (StartWord + DisNameWords - 1) of SourceDoc) then
>         -- it's a match, yay!
>         put StartWord into item (1 + the number of items in FoundLocs) of FoundLocs
>       end if
>     end if
>   end repeat
>   if FoundLocs = "" then
>     -- nope, DisName wasn't in SourceDoc
>     put "[nil]" into DeseLocs
>   else
>     -- yay! DisName *was* in SourceDoc! at least once!
>     put FoundLocs into DeseLocs
>   end if
>   put DisName & comma & DeseLocs into line (1 + the number of lines in PlaceNamesCensus) of PlaceNamesCensus
> end repeat
>
> ==========================
>
> Known issue: The above code does not pretend to locate possessive
> instances of place names (i.e., California's, the United Kingdom's, etc).
> Am thinking that pre-processing of SourceDoc will be helpful-to-necessary.
> This pre-processing may need to accommodate more issues than just
> possessives.
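
One possible shape for that pre-processing step, as a rough sketch only. The
function name NormalizeForSearch and the particular punctuation marks are
assumptions made for illustration, not anything from the original post. It
returns a copy of the text with straight-apostrophe possessive endings and
some common punctuation removed, so that "California's," compares equal to
"California".

```livecode
-- Hypothetical helper (not from the original post): returns a copy of pText
-- with possessive endings and some common punctuation stripped out.
function NormalizeForSearch pText
  -- drop possessive endings such as California's -> California
  -- (this also shortens contractions like it's -> it)
  replace "'s" with empty in pText
  -- drop punctuation marks that commonly cling to a word
  repeat for each char tMark in ".,;:!?()"
    replace tMark with empty in pText
  end repeat
  return pText
end NormalizeForSearch
```

It would be called once before the search loop, e.g.
put NormalizeForSearch(SourceDoc) into SourceDoc. Characters are deleted
rather than replaced with spaces, so word numbers should stay the same as in
the original text unless some word consists of nothing but punctuation.
Abbreviations are affected too ("U.K." becomes "UK"), so NamesList may need
the same treatment.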
>
>
> "Bewitched" + "Charlie's Angels" - Charlie = "At Arm's Length"
> Read the webcomic at [ http://www.atarmslength.net ]!
> If you like "At Arm's Length", support it at [
> http://www.patreon.com/DarkwingDude ].


