regex url validator?

Alex Tweedly alex at tweedly.net
Tue Oct 23 12:29:04 EDT 2018


Hi Sean,

I think there are two (hopefully straightforward) suggestions for the 
docs on this:

1. The syntax diagram is (I think) wrong, or at least misleading:

> filter [{lines | items | keys | elements} of] *filterSource* {with | 
> without | [not] matching} [{wildcard pattern | regex pattern}] 
> *filterPattern* [into *targetContainer*]
>
In particular, the part that reads  [{wildcard pattern | regex 
pattern}]  implies that the whole phrase is optional and that, if it 
is present, it can be either one or the other.  Nothing there 
indicates that the real alternatives are

alternative one:   "wildcard pattern" or nothing,
alternative two:  "regex pattern".

Maybe it could be

filter [{lines | items | keys | elements} of] *filterSource* {with | 
without | [not] matching} {[wildcard pattern] | regex pattern} 
*filterPattern* [into *targetContainer*]

(i.e. moving the optional [...] inside the alternation {...}, which I 
think gives equivalent possibilities but makes it clearer which is 
which;  clearer at least to me :-)


2. Explicitly say in the description that the phrase "regex pattern" is 
required if filterPattern is to be interpreted as a regex, while the 
alternative "wildcard pattern" is optional and is the default, used if 
neither phrase occurs.

Hmmm - neither of these descriptions is very clear :-(   but maybe 
they'll help :-)
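
To make the distinction concrete, here is a minimal sketch of the three 
forms as I read them from the syntax above. The handler and the variable 
names (tFruit, tWild, tExplicit, tRegex) are made up for illustration 
only; they are not taken from the docs.

```livecode
on mouseUp
   local tFruit, tWild, tExplicit, tRegex
   put "apple" & return & "banana" & return & "cherry" into tFruit

   -- no keyword at all: filterPattern is treated as a wildcard pattern
   filter lines of tFruit with "b*" into tWild

   -- explicit "wildcard pattern": same meaning as the form above
   filter lines of tFruit with wildcard pattern "b*" into tExplicit

   -- "regex pattern" must be present for filterPattern to be read as a regex
   filter lines of tFruit with regex pattern "^b.*" into tRegex
end mouseUp
```

If a regular expression is passed using the first form, it is read as a 
wildcard pattern, which would explain why Tom's original attempts 
matched nothing.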

Alex.



On 23/10/2018 17:09, Sean Cole (Pi) via use-livecode wrote:
> Tom,
> I've looked at the docs and I'm not sure how to make it much clearer.
> Perhaps you have a suggestion?
> The examples given are:
>
> local tVar
> put the propertyNames into tVar
> filter tVar with "[az]*"
> -- tVar contains all property names beginning with a or z
>
> This one is NOT a regular expression (regEx) but is the same as specifying
> it to be a wildcard pattern.
>
> -- Filtering a string literal causes the filtered string to be placed in
> -- the it variable
> filter items of "apple,banana,cherry" with regex pattern "b.*"
> -- it contains "banana"
>
> This one is a regex, and is clearly indicated as such.
>
> Perhaps we could move things about a bit.
> The description says:
>
> The filter...without form and the filter...not matching form discard the
> lines, items, keys or elements that do not contain a match for the
> specified filterPattern.
>
> This could possibly be written better to read:
>
> The 'filter...without' form and the 'filter...not matching' form discard
> the lines, items, keys or elements that do not contain a match for the
> specified filterPattern.
>
> This could then be followed by the 'wildcard pattern' description, which is
> NOT regex; it is the same as not specifying a pattern type. The regex
> description would then move further towards the end, making it obvious that
> it is something different from *filter x with/without [wildcard pattern]*
>
> What does everyone think?
>
> Sean
>
>
> On Tue, 23 Oct 2018 at 16:51, Sean Cole (Pi) <sean at pidigital.co.uk> wrote:
>
>> Maybe the documentation needs a tweak.
>>
>>
>> Will do...
>>
>> Sean
>>
>>
>> On Tue, 23 Oct 2018 at 15:04, Tom Glod via use-livecode <
>> use-livecode at lists.runrev.com> wrote:
>>
>>> <SOLVED>
>>>
>>> I originally used
>>>
>>> filter the lines of my_container with myregex
>>>
>>> or filter my_container with myregex
>>>
>>> none of the expressions worked..... just blanked out.
>>>
>>> When I used the same form as Steve...it worked.
>>>
>>> filter my_container with regex pattern myregex   into my_variable
>>>
>>> So is that a bug in the filter command? literally lost hours on this.
>>>
>>>
>>>
>>> Thank you all ..... onto some working code. :)
>>>
>>>
>>> On Tue, Oct 23, 2018 at 9:27 AM Stephen MacLean via use-livecode <
>>> use-livecode at lists.runrev.com> wrote:
>>>
>>>> Hi Tom,
>>>>
>>>> Don’t know if you found a solution yet, but this is from the rsIsValid
>>>> suite I put together a few years back.
>>>>
>>>> http://forums.livecode.com/viewtopic.php?f=16&t=26653&p=138698#p138698
>>>> https://github.com/renegadesteve/rsIsValid
>>>>
>>>>
>>>> Below is the LCS version:
>>>> function rsIsValidURL pURL
>>>>
>>>> put pURL into tCC
>>>>
>>>> // get scheme
>>>>
>>>> put "^(?<scheme>[a-z][a-z0-9+\-.]*):" into tSchemeRegex
>>>>
>>>> get matchText(tCC, tSchemeRegex,tScheme)
>>>>
>>>> // bail out if no scheme was found (the checks below accept http, https, ftp, ftps, mailto, nntp and news)
>>>>
>>>> if it <> true then return false
>>>>
>>>> if tScheme = "mailto" then
>>>>
>>>> //get the email address from the URL and then validate it
>>>>
>>>> delete char 1 to 7 of tCC
>>>>
>>>> return rsIsValidEmail_LC(tCC)
>>>>
>>>> else
>>>>
>>>> // setup the regex pattern
>>>>
>>>> put "^(?:http://|https://|ftp://|ftps://|nntp://|news://)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]-*)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]-*)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$" into tURLRegex
>>>>
>>>> // run filter against regex pattern and set result to true if it matches.
>>>> filter tCC with regex pattern tURLRegex into tMatch
>>>>
>>>> if tMatch <> empty then
>>>>
>>>> return true
>>>>
>>>> else
>>>>
>>>> return false
>>>>
>>>> end if
>>>>
>>>> end if
>>>>
>>>> end rsIsValidURL
>>>>
>>>>
>>>> Best,
>>>>
>>>> Steve MacLean
>>>>
>>>>> On Oct 22, 2018, at 10:07 PM, Tom Glod via use-livecode <
>>>> use-livecode at lists.runrev.com> wrote:
>>>>> Hi peeps,
>>>>>
>>>>> I'm trying to use regex to validate a list of URLs
>>>>>
>>>>> I've tried 4 or 5 different "regular" expressions that supposedly work
>>>>> ..... but LC does not give me anything back.  None of them work,
>>>>>
>>>>> Like for example ... this one.
>>>>>
>>>>>
>>>>> ^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$
>>>>> or this one
>>>>>
>>>>> /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
>>>>>
>>>>> or this one just to make sure
>>>>>
>>>>> (https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)
>>>>>
>>>>> Does anyone know how to get these or others to work in LC?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Tom
>>>>> _______________________________________________
>>>>> use-livecode mailing list
>>>>> use-livecode at lists.runrev.com
>>>>> Please visit this url to subscribe, unsubscribe and manage your
>>>> subscription preferences:
>>>>> http://lists.runrev.com/mailman/listinfo/use-livecode



