capitalize

Dar Scott dsc at swcp.com
Thu Nov 2 14:22:10 EST 2006


On Nov 2, 2006, at 7:13 AM, Richmond Mathewson wrote:

> for those of us who stray into the Cyrillic alphabet
> and other non-Roman writing systems . . .
>
> The RR documentation points out that toLower and
> toUpper only function with the first 128 ASCII codes.
>
> Which is a shame.
>
> Which means that RR is not entirely Unicode compliant.

The use of Unicode is obviously the way to go in handling fully  
flexible and globalized text.

I expect that, as features improve, Revolution will grow in that  
direction.

I don't see a general toUpper or toLower as a high priority in that
growth.  One reason is that people can write (and, I think, have
written) language-specific functions as they need them.  The other is
that in the general case, the concept gets hazy.  Many languages
don't have the concept of upper and lower case.  Some have different
upper case letters depending on the context.  I expect third-party
libraries might fit the bill.  Even so, I would not be surprised if
these functions are eventually improved.  Even when Revolution becomes
fully Unicode in some sense, folks might rely on Unicode database
functions instead of toUpper and toLower anyway.
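
To make "hazy" a bit more concrete, here is a small sketch.  It is in
Python rather than Revolution script, and it uses Python's built-in,
locale-independent case mapping only as a stand-in for what a "general"
toUpper/toLower might do.  The helper name has_case is made up for
illustration.

```python
# Sketch only: Python's built-in Unicode case mapping stands in for a
# hypothetical "general" toUpper/toLower. has_case is an illustrative name.

def has_case(text):
    """Return True if upper- or lowercasing changes the text at all."""
    return text.upper() != text or text.lower() != text

# Some scripts have no upper/lower distinction, so there is nothing to convert.
print(has_case("中文"))
print(has_case("Привет"))   # Cyrillic, by contrast, does have case

# German sharp s uppercases to two letters, so the length is not preserved.
print("straße".upper())

# Greek capital sigma lowercases to a different letter at the end of a word.
print("ΟΔΟΣ".lower())

# Turkish expects "I" to lowercase to dotless "ı"; a mapping that knows
# nothing about the language gives "i" instead.
print("I".lower())
```

None of these cases can be handled by shifting character codes, which is
why a language-specific function is often the more practical route.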

However, currently only (about) two 8-bit character sets are used as  
the primary character sets in Revolution.  There is some room for  
improvement here, but that improvement might best come when a
Unicode-based Revolution comes.

I know RunRev is working hard on improvements and global text is on  
the list.  Revolution has a few features related to Unicode that help  
in the meantime.  Revolution is not "Unicode compliant" and I don't
think there have been such claims.

So, the nature of these functions is consistent with the history and  
growth-point of Revolution.  Lack of universal language  
capitalization is not a shortcoming of Revolution, but is an
opportunity for RunRev or third parties.

As for Cyrillic used for Russian, most of the Unicode code points for
the alphabet are sequential (but not all: Ё and ё sit outside the main
run), which might be useful in writing a converter.
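
A rough sketch of such a converter follows, in Python rather than
Revolution script.  It assumes only the modern Russian alphabet: the
runs А..Я (U+0410 to U+042F) and а..я (U+0430 to U+044F), which differ
by a constant 0x20, plus Ё (U+0401) and ё (U+0451) handled separately.
The function names are made up for illustration, and other Cyrillic
letters are passed through untouched.

```python
# Sketch of a Russian-only case converter built on code point arithmetic.
# Assumption: only the modern Russian alphabet needs converting.

UPPER_FIRST, UPPER_LAST = 0x0410, 0x042F   # А .. Я
LOWER_FIRST, LOWER_LAST = 0x0430, 0x044F   # а .. я
OFFSET = LOWER_FIRST - UPPER_FIRST         # 0x20
YO_UPPER, YO_LOWER = 0x0401, 0x0451        # Ё, ё (outside the main runs)

def cyrillic_to_upper(text):
    """Uppercase Russian letters; leave every other character alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if LOWER_FIRST <= cp <= LOWER_LAST:
            out.append(chr(cp - OFFSET))
        elif cp == YO_LOWER:
            out.append(chr(YO_UPPER))
        else:
            out.append(ch)
    return "".join(out)

def cyrillic_to_lower(text):
    """Lowercase Russian letters; leave every other character alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if UPPER_FIRST <= cp <= UPPER_LAST:
            out.append(chr(cp + OFFSET))
        elif cp == YO_UPPER:
            out.append(chr(YO_LOWER))
        else:
            out.append(ch)
    return "".join(out)

print(cyrillic_to_upper("привет, ёж"))
print(cyrillic_to_lower("ПРИВЕТ, ЁЖ"))
```

The same shape (two range checks and one special case) should carry
over to a Revolution function working on the character codes.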

Dar

PS I hope to get my Unicode 5 book in the mail any day now (for  
casual reading) and might change my mind on things based on that.


