Why No Built in GetMyIP call in LiveCode?
John Patten
johnpatten at me.com
Sat Jun 11 11:45:28 EDT 2011
Good thing I read this explanation of regex Saturday morning ;-) ... I think?
Seems like matchtext/regex would be useful for HTML scraping strategies too.
Thanks for the explanation Jim!
John Patten
SUSD
Sent from my iPad
On Jun 10, 2011, at 9:20 PM, Jim Ault <jimaultwins at yahoo.com> wrote:
>
> On Jun 10, 2011, at 7:39 PM, J. Landman Gay wrote:
>> On 6/10/11 8:21 PM, Jim Ault wrote:
>>> The () parens are telling the engine to capture any chars that meet the
>>> conditions inside and assign them to the first variable specified. In
>>> this case, it is 'retVal'
>>> If there were a second set of (), then those chars would be assigned to
>>> the second variable specified.
>> Good explanation, I like it when regex gets explained. But what I don't get is why the first set of parentheses isn't put into the variable:
>>
>> get matchText(tEthernetConfig,"(?s)inet (.*?) ",retVal)
>>
>> The LC engine ignores the "(?s)". That's good and as it should be, but I'm not sure why.
>
> LC honors the (?s), but as a directive, not a capture.
>
>
> When a paren is read and is followed by a ?
> this signals an 'operation' rather than a 'capture'
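
A small sketch of the directive-versus-capture point above. It is written in Python rather than LiveCode, on the assumption that Python's re module treats (?s) and (.*?) the same way for this pattern; the ifconfig-style text and the variable names are made up for illustration.

```python
import re

# Hypothetical ifconfig-style text, invented for this example.
ethernet_config = "en0: flags=8863\n\tinet 192.168.1.20 netmask 0xffffff00\n"

pattern = re.compile(r"(?s)inet (.*?) ")

# (?s) only sets an option, so the pattern contains exactly one capture group.
group_count = pattern.groups

match = pattern.search(ethernet_config)
# Group 1 is the first real capture, the counterpart of retVal in matchText.
ret_val = match.group(1) if match else ""

print(group_count, ret_val)
```

So the first variable receives the text matched by the first capturing paren, and the (?s) paren is never counted.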
> --
> Additional regex conditions or qualifiers are
> Lookahead and Lookbehind ... scanning operations designated by
>
> (?<=  (?<!   lookbehind, positive and negative logic
> (?=   (?!    lookahead, positive and negative logic
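
To make the four lookaround forms concrete, here is a rough sketch in Python's re module, which accepts the same (?<= (?<! (?= (?! syntax. The sample text is invented, and LiveCode's engine may differ in details I have not checked.

```python
import re

# Invented sample text for illustration.
text = "inet 192.168.1.20 netmask 0xffffff00 inet6 fe80::1"

# (?<=...) positive lookbehind: an address only when "inet " sits right before it.
ipv4 = re.findall(r"(?<=inet )\d+(?:\.\d+){3}", text)

# (?=...) positive lookahead: a word only when " 0x" follows it.
before_hex = re.findall(r"\w+(?= 0x)", text)

# (?!...) negative lookahead: "inet" only when it is NOT followed by "6".
plain_inet = [m.start() for m in re.finditer(r"inet(?!6)", text)]

# (?<!...) negative lookbehind, combined with a lookahead:
# a number that is not preceded by a dot but is followed by one.
first_octet = re.findall(r"(?<!\.)\b\d+(?=\.)", text)

print(ipv4, before_hex, plain_inet, first_octet)
```

None of the lookaround parens capture anything; they only test what surrounds the match.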
> --
> (?Usi) means shortest match, allow multiple lines, disregard case
> (?U) means shortest match, single line, case sensitive
> (?s) means longest match, allow multiple lines, case sensitive
> if no directive is given, then
> default = longest match, single line, case sensitive
>
> What is meant by 'single line' is that the dot (.) does not match a return char, so a scan such as .* stops at the end of the line.
> Multi line, i.e. (?s), means the return is seen as just another char in the text block, so the repeat loops can keep going across lines to find the longest match.
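
A quick sketch of longest versus shortest match, with and without (?s), again using Python's re module as a stand-in. One assumption to flag: as far as I know Python has no (?U) directive, so the shortest match is requested with a ? after the quantifier instead. The sample text is made up.

```python
import re

# Invented sample: two tags on the first line, one on the second.
text = "<b>one</b> <b>two</b>\n<b>three</b>"

# Default: longest match, but the dot stops at the return char.
greedy = re.search(r"<b>(.*)</b>", text).group(1)

# Shortest match: stop at the first closing tag.
lazy = re.search(r"<b>(.*?)</b>", text).group(1)

# (?s): the dot also matches the return, so the longest match spans both lines.
greedy_dotall = re.search(r"(?s)<b>(.*)</b>", text).group(1)

# (?s) plus shortest match: still stops at the first closing tag.
lazy_dotall = re.search(r"(?s)<b>(.*?)</b>", text).group(1)

print(repr(greedy), repr(lazy), repr(greedy_dotall), repr(lazy_dotall))
```

The only case that crosses the line break is the greedy one with (?s) switched on.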
> ---
>
> Think of the regex engine as a complex series of nested repeat loops that are a combination of
> repeat while
> repeat until
> making many, many char by char scans, moving forward through the block and backing up (backtracking) whenever a branch fails, to find the longest result at the first position that matches, unless told to be ungreedy (shortest result)
>
> The repeat loops are designed to accept strings and operators in series,
> so a given block of text may be rescanned many times in order to implement logic patterns.
>
> A greedy quantifier such as .* first runs forward as far as it can, then gives chars back one at a time until the rest of the pattern fits.
>
> Simple rules don't show you all the multiple scans (repeat loops) that are used to arrive at the parsed result.
> Large blocks of text can take several minutes to scan depending on conditions and conditionals.
>
>
> The paren as a directive works like this. These two are equivalent:
> [a-zA-Z] means a to z lower and upper for a single char
> or
> (?i)[a-z] means a to z lower and upper for a single char
> --since the 'i' means case insensitive
> (?i)([a-z]) -- will capture a single char if it is a-z either case.
> If the test fails there is no value assigned.
> LC will allow a test for empty, but Perl and others will report an error like 'undefined' since there was no match, no capture, and no assignment. Also, in Perl, etc, you must define variables ahead of the regex if you want to avoid 'undefined'.
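
For comparison, a minimal sketch of the same capture-or-nothing behaviour in Python's re module. The helper name first_letter and the sample strings are hypothetical. Python sits somewhere between the two cases described above: a failed search returns None, so the result has to be tested before the capture is read.

```python
import re

def first_letter(text):
    """Return the first a-z letter (either case), or "" when nothing matches."""
    match = re.search(r"(?i)([a-z])", text)
    # A failed search gives None, so check before reading group 1.
    return match.group(1) if match else ""

hit = first_letter("42 Q")    # capture succeeds
miss = first_letter("1234")   # no match, so we fall back to empty

# On plain ASCII text, [a-zA-Z] and (?i)[a-z] pick out the same chars.
sample = "LiveCode 4.6"
same = re.findall(r"[a-zA-Z]", sample) == re.findall(r"(?i)[a-z]", sample)

print(repr(hit), repr(miss), same)
```

Returning "" on failure mimics the "test for empty" style that LC allows.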
>
> Hope this makes a little light reading for the weekend.
>
> Jim Ault
> Las Vegas
>
>
>