Fwd: New chunks

Benjamin Beaumont ben at runrev.com
Thu Mar 13 12:53:18 EDT 2014


Dear List Members,

This discussion has been very interesting and there have been a lot of
suggestions made. Ultimately I do not feel that now is the correct time to
change the fundamental meaning of any of the syntax of LiveCode. My team
and I have been very carefully crafting the 7.0 release so that existing
applications will run and (with virtually no changes in most cases) work
exactly the same as before except that they will allow arbitrary language
strings as input rather than ones which are just limited to the 'native'
character set of the platform (i.e. MacRoman / Latin-1). As Richard pointed
out in a recent post, this has been a huge endeavour and if we were to
change fundamental syntax then it would make it a lot harder to determine
where problems might lie as we stabilize the 7.0 engine and get it
release-worthy.

One of the next big projects we will be undertaking when 7.0 is
release-worthy is integrating the Open Language ideas that have been
discussed and mentioned (albeit in not much depth) before. At this point we
will be completely at liberty to experiment with, adapt and even change
syntax to ensure it is the best it can be, and the most appropriate. One
part of this project will be the refinement of all existing syntax - there
will be two parsers, the existing one and the new one, together with a
script translation system that should make upgrading to the new syntax
straightforward (and only required when you wish to use the new Open
Language syntax in an existing script - you won't have to update all
scripts at once, and perhaps never at all for some). This is one point at
which we can correct some historical syntax which is perhaps not the best.

Moreover, eventually the Open Language project will mean that on (at the
very least) a project-by-project basis you will be able to tailor your
syntax environment to how you want it. If you prefer the current definition
of 'word' then you will be able to continue to use that - just plug in a
module that maps the 'word' syntax to the existing semantics.

It is also worth stressing just how important the current 'word' chunk
actually is in script - it's been interesting to see people go from "we
should change the meaning of word" to, "hmmm, perhaps we shouldn't - I
looked at my scripts last night and it would be a nightmare to change".
Currently LiveCode's 'word' chunk is inherited from HyperCard and is deeply
ingrained in the language - it is a programmatic construct which is
convenient for numerous things; it is not an attempt at proper word
boundary analysis. [ A good example of usage, originally cited by Monte, is
things like 'word 2 of the name of tObject' ].
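
To make the distinction concrete, here is a rough sketch in Python (not
LiveCode, and not engine code). It assumes a simplified model of the
existing 'word' chunk (runs of non-whitespace, with a double-quoted run
counted as one word) and uses a plain regular expression as a stand-in for
word boundary analysis. The function names legacy_words and boundary_words
are hypothetical, chosen only for this illustration.

```python
import re

def legacy_words(text):
    # Assumed, simplified model of the HyperCard-style 'word' chunk:
    # a double-quoted run is one word (quotes kept), otherwise a word
    # is any run of non-whitespace characters.
    return re.findall(r'"[^"]*"|\S+', text)

def boundary_words(text):
    # Crude stand-in for word boundary analysis: runs of letters,
    # digits and underscores. Real boundary rules are more involved.
    return re.findall(r"\w+", text)

# An object name such as: button "My Button"
name = 'button "My Button"'
print(legacy_words(name)[1])    # "My Button"  (word 2, quotes included)
print(boundary_words(name))     # ['button', 'My', 'Button']

print(legacy_words("Hello, world!"))    # ['Hello,', 'world!']
print(boundary_words("Hello, world!"))  # ['Hello', 'world']
```

Under this model, 'word 2 of the name of tObject' keeps working because the
quoted object name stays in one piece, which is exactly what a chunk based
on natural word boundaries would not give you.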

So with that in mind, I really do think the only option we have now if we
want a more word-like word chunk is to choose a different name for it. This
gives existing scripts access to the ability without having to change them
in any way, but doesn't close the door to a more radical change in the
future (when we have Open Language).

The original suggestion was 'naturalWord' to suggest 'natural language' -
however that does not seem popular (I could mention 'lead balloon' here,
but I think the discussion on the lists speaks for itself).

The suggestion of 'unicodeWord' (or variants thereof) I do think would be
an incorrect path - the notion of word boundaries is not somehow just
applicable to 'Unicode' text, it applies to existing (non-Unicode) text
also. Indeed, in 7.0 there will not be a difference - all of the tokens
which mention 'unicode' will be deprecated as they are no longer needed for
new code. That is, the idea is that you just have text - word boundary
analysis applies to all text, regardless of whatever internal encoding
might be needed to store it.

Another suggestion has been 'wordUnit' - however (to me at least) this just
does not seem to suggest anything meaningful. We are talking about 'words'
which is something everyone has some idea about - indeed, as has been
pointed out, the fact that LiveCode's current 'word' is not really like a
'word' as we might intuitively expect can trip up people new to the
language and its history. I'm not sure adding 'unit' on the end of 'word'
adds anything really related to the underlying concept nor helps quantify
the difference.

Given that we are talking about adding a chunk type that is more
'real-world' or 'true' to intuitive expectation of what a word should be -
the best suggestion I've seen so far is 'trueWord' (thanks Richard!).

Warm regards,

Mark



More information about the use-livecode mailing list