What is the best/fastest way to extract strings of text?

Keith (Gulf Breeze Ortho Lab) keith at gulfbreezeortholab.com
Wed Aug 3 02:38:19 EDT 2011


Thanks Keith.

----- Original Message ----- 
From: "Keith Clarke" <keith.clarke at clarkeandclarke.co.uk>
To: "How to use LiveCode" <use-livecode at lists.runrev.com>
Sent: Tuesday, August 02, 2011 3:01 AM
Subject: Re: What is the best/fastest way to extract strings of text?


> The recipe I (learned here and) use for extracting specific HTML / XML 
> elements is to get the specific target elements on their own line, remove 
> the unwanted lines and then move the target string items in the remaining 
> lines out into a separate variable - something like...
>
> 1. Get the target elements into their own line by prefixing the opening 
> tag with return, using: replace "<#B>" with return & "<#B>" in theSource
> 2. End each target line at its closing tag by adding a return suffix, 
> using: replace "<#E>" with "<#E>" & return in theSource
> 3. Remove the unwanted lines (those that lack the opening tag), using: 
> filter theSource with "<#B>*"
> 4. Delimit each line into items at the '>' character, using: set the 
> itemDelimiter to ">"
> 5. Iterate through the lines to extract the string, using:
> repeat for each line l in theSource
> put item 2 of l & return after theExtract
> end repeat
> 6. Clean up the extract by removing the leftover "<#E" and any extra 
> returns, using: replace "<#E" with empty in theExtract
> filter theExtract without empty
>
> If (my pre-coffee brain worked) theExtract should contain the tagged 
> strings in theSource.
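
A minimal sketch of the steps above as one LiveCode function, in case it 
helps to see them together. The handler name extractTagged and the names 
pSource, tExtract and tLine are illustrative and not from the original 
recipe. It assumes the tagged text contains no '>' character and no line 
breaks. Because item 2 of each line still ends with "<#E", the sketch 
strips that fragment before returning.

```livecode
function extractTagged pSource
   local tExtract
   -- put each opening tag at the start of its own line
   replace "<#B>" with return & "<#B>" in pSource
   -- end that line straight after the closing tag
   replace "<#E>" with "<#E>" & return in pSource
   -- keep only the lines that begin with the opening tag
   filter pSource with "<#B>*"
   -- split each remaining line at '>' so item 2 holds the tagged text
   set the itemDelimiter to ">"
   repeat for each line tLine in pSource
      put item 2 of tLine & return after tExtract
   end repeat
   -- item 2 still carries the start of the closing tag, so drop it
   replace "<#E" with empty in tExtract
   -- remove any blank lines left behind
   filter tExtract without empty
   return tExtract
end extractTagged
```

Calling extractTagged(myVar) should then give one extracted string per 
line. pSource is passed by value, so the caller's variable is left 
unchanged.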
>
> ...hmmm, talking of coffee...
> Best,
> Keith..
>
> On 2 Aug 2011, at 08:24, Keith (Gulf Breeze Ortho Lab) wrote:
>
>> Hello,
>>
>> I am still playing with LiveCode and am now exploring chunks...
>>
>> My question is as follows. Suppose I have a variable with a lot of text. 
>> Throughout the text I have various strings, separated by consistent tags, 
>> that I need to extract.
>>
>> For example, the following text is in the variable myVar:
>>
>> The boy <#B>went to the store<#E>. He enjoyed his day out.
>>
>> The woman loves <#B>shopping at the mall<#E>. So do I.
>>
>> The girl loves <#B>eating at the restaurant<#E>. So does he.
>>
>> I am looking for the most efficient way to extract each of the strings of 
>> text between the <#B> and <#E> tags... I presume I will have to use a 
>> loop and the matchChunk function? I have experimented but am having a 
>> problem putting starting and ending positions into variables.
>>
>> Is there a better way to accomplish the above?
>>
>> Thanks!
>>
>> - Boo
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your 
>> subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
>
>