testing on case

Dar Scott dsc at swcp.com
Fri Dec 10 15:18:17 EST 2004


On Dec 10, 2004, at 9:12 AM, James.Cass at sealedair.com wrote:

>         put matchText( z, "^[A-Z]")
> I definitely get "false" returned.

matchText( "mom", "^[A-Z]")
==>
false

matchText( "Mom", "^[A-Z]")
==>
true

matchText( "=Mom", "^[A-Z]")
==>
false

matchText( "\Mom", "^[A-Z]")
==>
false

This is consistent with testing whether the first character is a 
capital (ASCII) letter.

The regex looks good.

> But when I do this in the messagebox
>         put matchText( z, "^[aA-zZ]")
> I get true.  This is the way I would expect it to behave.

put matchText( "mom", "^[aA-zZ]")
==>
true

put matchText( "Mom", "^[aA-zZ]")
==>
true

put matchText( "=Mom", "^[aA-zZ]")
==>
false

put matchText( "\Mom", "^[aA-zZ]")
==>
true

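Here is what is going on.  The pattern [aA-zZ] is not two letter 
ranges.  It is the single character a, the range A-z, and the single 
character Z, so it will match any of these characters: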

a
ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
Z

What a range contains depends on the environment, specifically on the 
collating order or coding order the implementation uses.  For ASCII 
and code sets that are supersets of ASCII, the range A-z matches every 
character on the middle line above, including the six punctuation 
characters that sit between Z and a.

charToNum("A")    ==>  65
charToNum("\")    ==>  92
charToNum("z")    ==>  122
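
To see the whole range for yourself, here is a minimal sketch that 
lists every character between two code points.  The handler name 
charsInRange is made up for illustration, and it assumes single-byte 
(ASCII) text.

```livecode
-- Hypothetical helper: returns every character from pFrom to pTo
-- in coding order.  Assumes single-byte characters.
function charsInRange pFrom, pTo
   local tChars
   repeat with tCode = charToNum(pFrom) to charToNum(pTo)
      put numToChar(tCode) after tChars
   end repeat
   return tChars
end charsInRange
```

With that handler in a script on the message path, 
put charsInRange("A", "z") should show the punctuation that sits 
between the two alphabets.  If the intent is "any letter of either 
case", two separate ranges, as in "^[A-Za-z]", leave those extra 
characters out.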

An alternative that might be better for the future is this:

     matchText( "mom", "^[[:upper:]]")
or better
     matchText( "mom", "\A[[:upper:]]")

The matching is currently ASCII-only, but the regex library has some 
support for UTF-8.  If Revolution is ever extended to use it, that 
pattern should keep working unchanged.  The library might also be 
extended to handle other popular encodings of high (non-ASCII) 
characters.
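
If the test is needed in several places, it may help to wrap it in a 
function.  This is only a sketch: the name startsWithCapital is 
hypothetical, and it relies on nothing beyond matchText and the 
\A[[:upper:]] pattern.

```livecode
-- Hypothetical wrapper: true if pText begins with an uppercase letter.
-- \A anchors the match at the very start of the string;
-- [[:upper:]] is the POSIX class for uppercase letters.
function startsWithCapital pText
   return matchText(pText, "\A[[:upper:]]")
end startsWithCapital
```

Keeping the pattern in one place means that a later change, such as 
accepting a leading space, only has to be made once.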

Dar

****************************************
     Dar Scott Consulting
     http://www.swcp.com/dsc/
     Programming Services
****************************************


