Small regex project for pay

J. Landman Gay jacque at hyperactivesw.com
Wed Mar 16 13:06:03 EDT 2016


Just for the record, you could probably meet all the requirements using
LiveCode's trueWord chunk type (added in LC 7), which handles the
whitespace and punctuation issues. But I think regex would still be faster.
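
For what it's worth, here is a rough sketch of how the regex for the four
modes Paul describes below could be put together. It is written in Python
rather than LiveCode, purely so it can be run and checked; the function
name find_occurrences and its parameters are made up for illustration.
It assumes that "white space or punctuation" means any character that is
not a letter or digit, and that the start and end of the text also count
as boundaries. The lookaround syntax used here is, as far as I know, shared
with the PCRE engine behind matchChunk, but these patterns have not been
tested in LiveCode; there the case-insensitive variant would presumably be
written with a leading (?i) instead of a flag.

```python
import re

# Assumption: "not a letter or digit" stands in for "white space or punctuation".
# Both negative lookarounds also succeed at the very start or end of the text.
NOT_AFTER_ALNUM = r"(?<![^\W_])"   # previous char, if any, is not a letter/digit
NOT_BEFORE_ALNUM = r"(?![^\W_])"   # next char, if any, is not a letter/digit


def find_occurrences(content, needle, mode="normal", case_sensitive=True):
    """Return 1-based, inclusive (start, end) pairs (hypothetical helper)."""
    literal = re.escape(needle)  # match the search string literally, not as regex
    patterns = {
        "normal": literal,
        "whole": NOT_AFTER_ALNUM + literal + NOT_BEFORE_ALNUM,
        "begins": NOT_AFTER_ALNUM + literal,
        "ends": literal + NOT_BEFORE_ALNUM,
    }
    flags = 0 if case_sensitive else re.IGNORECASE
    # finditer yields non-overlapping matches from left to right
    return [(m.start() + 1, m.end())
            for m in re.finditer(patterns[mode], content, flags)]


text = "Now is the time for all good people to be good people."
print("\n".join(f"{s},{e}" for s, e in find_occurrences(text, "people")))
# prints:
# 30,35
# 48,53
```

The four patterns map one-to-one onto the cases of the switch in Paul's
outline, with case sensitivity handled once outside the switch.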
--
Jacqueline Landman Gay         |     jacque at hyperactivesw.com
HyperActive Software           |     http://www.hyperactivesw.com



On March 16, 2016 10:02:23 AM Paul Dupuis <paul at researchware.com> wrote:

> I have a small outsource coding project.
>
> Given a variable tContent which contains a bunch of text and a variable
> tString which contains a search string, I need a function that:
>
> 1) searches tContent using matchChunk(tContent,tRegex,tStart,tEnd)
> repeatedly, building a cr-delimited list of start and end character
> positions of all the occurrences of tString in tContent. So for tContent of:
>
> 'Now is the time for all good people to be good people.'
>
> And tString of 'people', the function returns a delimited list:
>
> 30,35
> 48,53
>
> 2) The function takes optional Boolean parameters to control case
> sensitivity/insensitivity.
>
> Note: while this can be done easily with the offset function, the next
> parts can't
>
> 3) The function supports 4 search modes: (a) the normal character search
> described in (1), where tString can be a substring of any part of
> tContent; (b) (the part you can't do with offset) whole matches,
> i.e. tString should only match if the char before tString is white
> space (including cr) or punctuation and the char after tString is also
> white space or punctuation; (c) Begins With, where tString is
> preceded by white space/punctuation; and (d) Ends With, where
> tString is followed by white space/punctuation.
>
> (a), (b), (c), and (d) are mutually exclusive options, but (2) case
> sensitivity should work with any of the four modes.
>
> I have a framework for the function (currently using offset and not
> supporting all the options). To change it to using matchChunk, I really
> need the regex expressions for the options:
>
> i.e., something like:
>
> switch pMode
>   case "normal"
>     if tCaseSensitive = true then
>       put <someregex>&tString&<somemoreregex> into tRegexToUse
>     else
>       put <someregex>&tString&<somemoreregex> into tRegexToUse
>     end if
>     break
>   case "whole"
>     if tCaseSensitive = true then
>       put <someregex>&tString&<somemoreregex> into tRegexToUse
>     else
>       put <someregex>&tString&<somemoreregex> into tRegexToUse
>     end if
>     break
>   case "begins"
>     ...
>   case "ends"
>     ...
> end switch
> Or some variation of this code (perhaps the case sensitivity option is a
> single if before or after the switch).
>
> So, I am looking for the regex and a sample function in a stack that
> demonstrates the regex performs the matches correctly for the 8 test
> cases (4 modes, each with and without case sensitivity).
>
> Email me your price for this job to paul at researchware.com
>
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your 
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode





