best/fastest way to tell if a field contains unicode text?

Geoff Canyon gcanyon at gmail.com
Wed Mar 26 20:00:49 EDT 2014


On Wed, Mar 26, 2014 at 12:03 PM, Fraser Gordon <fraser.gordon at runrev.com> wrote:

> On 26 Mar 2014, at 15:37, Geoff Canyon <gcanyon at gmail.com> wrote:
>
> >
> > "somebody, somewhere, might be depending on the fact that it interprets
> > the number modulo 256"
>
> We've already had a bug report against 7.0 because it wasn't doing that in
> certain cases.
>

One bug report doesn't make it a requirement. You'll never hear from the
people who are confused by a more complex syntax (which already has
hundreds and hundreds of tokens). Think of me as the Lorax: I speak for
clearer syntax.


> The problem with making non-compatible changes to existing APIs is that
> changing things to work against the new API isn't a one-time cost. If you
> take LiveCode itself as an example, there are a number of run-time checks
> where things are done differently depending on the operating system version
> because something broke when a new version of the OS was released.
> Sometimes, we are lucky and the fix works on old versions too but (more
> often) it doesn't so both code paths have to remain until support is
> dropped for the older version.
>

Are there a significant number of people who will have to maintain their
code under both 7.0 and previous versions, and who use syntax that would
require this sort of dual maintenance?


>
> >
> > My point is that we will *all* suffer with poor, confusing syntax,
> > *forever*, so that hypothetical person doesn't have to fix their use of
> > numToChar.
>
> Backwards compatibility in LiveCode only needs to be sustained until Open
> Language: at that point, the core language will be cleaned up so that these
> legacy issues can be forgotten (though the old language will hang around as
> a backwards-compatibility mode for existing scripts but can be ignored when
> writing new scripts).
>

Perhaps a nitpick, but if you have a backwards-compatibility mode, then
backwards compatibility is being sustained beyond Open Language. But yes, I
am *very* much looking forward to open language. When can I get it? ;-)

> > One of the main advantages of LiveCode is the natural syntax. Sacrificing
> > that to backwards compatibility is a poor trade-off.
> >
>
>
> I agree that clean syntax is better but there are limits to what we can do
> - it can be very discouraging when you upgrade to the latest-and-greatest
> version of some software and suddenly something doesn't work (I get that
> feeling when having to switch between the various versions of Xcode that we
> use!).
>

Good release notes are key, but only if you buy into the concept of
abandoning the past in the first place, obviously.

> "numToChar" and "charToNum" are unsound concepts anyway - imagine you have
> the sequence of Unicode "characters" (really codepoints) {"e",
> combining-acute} - these will display as a single grapheme "é" which means
> that in the LiveCode language they are one character. However, there are
> two codepoints - how do you map that to a single number? This may seem
> esoteric, but on MacOS X, accented characters in file names are returned in
> such a form. Either charToNum will have to not match what "char" means in
> LiveCode or it will have to fail on what is a relatively common occurrence.
> Thus, doing the wrong, backwards-compatible thing seemed a better choice
> than doing a new wrong thing.
>
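
To make the quoted point concrete, here is a minimal sketch in Python (not
LiveCode, and not a claim about how the LiveCode engine handles this). It
assumes only the standard unicodedata module and shows that the sequence
"e" + combining acute is two codepoints that render as one grapheme:

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: displays as a single "é"
decomposed = "e\u0301"

# NFC normalization composes the pair into the precomposed codepoint U+00E9
composed = unicodedata.normalize("NFC", decomposed)

decomposed_codepoints = [ord(c) for c in decomposed]  # two numbers
composed_codepoints = [ord(c) for c in composed]      # one number

# Not every grapheme has a precomposed form: "q" + combining acute
# remains two codepoints even after NFC normalization.
q_acute = unicodedata.normalize("NFC", "q\u0301")

print(decomposed_codepoints, composed_codepoints, len(q_acute))
```

So every codepoint has a number, but a displayed character may consist of
several codepoints, and normalization only sometimes collapses them to one.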

I'm probably being naive, but are you saying that there isn't a numerical
equivalent to each character in unicode?

As an aside, I suddenly thought (for the first time ever) of the byte
chunk. I assume it will continue to represent 8 bits when char has left it
behind for unicode? Also, I would have expected byte to return a value that
can be treated as a number from 0-255, but maybe I'm thinking about this
wrongly.
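
For illustration only, a minimal Python sketch (again, not LiveCode) of what
I mean by bytes being 8-bit values in the range 0-255. The choice of UTF-8
here is an assumption, since the encoding LiveCode uses internally is not
stated in this thread:

```python
# The single character "é" (U+00E9), encoded as UTF-8 (assumed encoding)
text = "\u00e9"
utf8_bytes = list(text.encode("utf-8"))

# One character, but two bytes, each usable as a number from 0 to 255
all_in_range = all(0 <= b <= 255 for b in utf8_bytes)

print(len(text), utf8_bytes, all_in_range)
```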

gc


