What is the best/fastest way to extract strings of text?

Michael Kann mikekann at yahoo.com
Tue Aug 2 09:05:15 EDT 2011




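-- use a character that should not occur in the text as a stand-in for both tags
put numToChar(255) into m
set the itemDel to m  ---------- FORGOT THIS LINE
replace "<#B>" with m in raw_input
replace "<#E>" with m in raw_input
put zero into c
-- after the replacements, every even-numbered item is a tagged string
repeat for each item k in raw_input
   add 1 to c
   if c mod 2 is zero then
      put k & cr after h
   end if
end repeat
-- drop the trailing return
delete char -1 of h
put h into clean_output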


--- On Tue, 8/2/11, Keith Clarke <keith.clarke at clarkeandclarke.co.uk> wrote:

From: Keith Clarke <keith.clarke at clarkeandclarke.co.uk>
Subject: Re: What is the best/fastest way to extract strings of text?
To: "How to use LiveCode" <use-livecode at lists.runrev.com>
Date: Tuesday, August 2, 2011, 3:01 AM

The recipe I (learned here and) use for extracting specific HTML / XML elements is to get the target elements onto their own lines, remove the unwanted lines, and then move the target strings from the remaining lines into a separate variable. Something like...

1. Get the target elements onto their own line by prefixing the opening tag with return, using: replace "<#B>" with return & "<#B>" in theSource
2. End that line after the closing tag by adding a return suffix, using: replace "<#E>" with "<#E>" & return in theSource
3. Remove the unwanted lines (those that do not start with the opening tag), using: filter theSource with "<#B>*"
4. Delimit each line into items at the '>' character, using: set the itemDelimiter to numToChar(62)
5. Iterate through the lines to extract the string, using:
    repeat for each line l in theSource
        -- item 2 is the tagged text plus a trailing "<#E", so drop the last 3 chars
        put char 1 to -4 of item 2 of l & return after theExtract
    end repeat
6. Clean up the extract of any extra returns, using: filter theExtract without empty

If (my pre-coffee brain worked) theExtract should contain the tagged strings in theSource.
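
For reference, here is a minimal sketch of the steps above rolled into one function. The function name extractTagged and the variable names are just placeholders I made up, and it assumes the tagged text itself never contains a '>' character and that each tagged string sits on a single line. Treat it as untested illustration rather than a definitive implementation.

```livecode
-- hypothetical helper: returns the tagged strings, one per line
function extractTagged pSource
   local tExtract, tText
   -- steps 1 and 2: isolate each <#B>...<#E> run on its own line
   replace "<#B>" with return & "<#B>" in pSource
   replace "<#E>" with "<#E>" & return in pSource
   -- step 3: keep only the lines that start with the opening tag
   filter pSource with "<#B>*"
   -- step 4: split each remaining line at ">"
   set the itemDelimiter to ">"
   -- step 5: item 2 is the text followed by "<#E"; strip those 3 chars
   repeat for each line tLine in pSource
      put item 2 of tLine into tText
      put char 1 to -4 of tText & return after tExtract
   end repeat
   -- step 6: remove the trailing return
   delete char -1 of tExtract
   return tExtract
end extractTagged
```

It could then be called with something like: put extractTagged(myVar) into theExtract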

...hmmm, talking of coffee...
Best,
Keith..
  
On 2 Aug 2011, at 08:24, Keith (Gulf Breeze Ortho Lab) wrote:

> Hello,
> 
> I am still playing with LiveCode and am now exploring chunks...
> 
> My question is as follows. Suppose I have a variable with a lot of text. Throughout the text I have various strings, separated by consistent tags, that I need to extract.
> 
> For example, the following text is in the variable myVar:
> 
> The boy <#B>went to the store<#E>. He enjoyed his day out.
> 
> The woman loves <#B>shopping at the mall<#E>. So do I.
> 
> The girl loves <#B>eating at the restaurant<#E>. So does he.
> 
> I am looking for the most efficient way to extract each of the strings of text between the <#B> and <#E> tags... I presume I will have to use a loop and the matchChunk function? I have experimented but am having a problem putting starting and ending positions into variables.
> 
> Is there a better way to accomplish the above?
> 
> Thanks!
> 
> - Boo
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode




