[DOCs] Split and Combine - a Scripting Challenge?

Kay C Lan lan.kc.macmail at gmail.com
Tue Aug 7 04:33:34 EDT 2007


OK, let's just say I'm a sceptic. Not a full card-carrying member of
Sceptics Dot Org, but enough to annoy my wife ;-)

For reasons that may become apparent in a couple of days, I was looking at the
Rev Docs and noticed that for the itemDelimiter I can use "a single
character". Although it doesn't say so specifically, that would imply a
numToChar up to 255. Interestingly, although you get the error:

'Error description: Chunk: source is not a number'

if you try to:  set the itemDelimiter to "kk"

You DON'T get an error if you try to set the itemDelimiter to numToChar(666)
[assuming you've NOT set useUnicode to true]. The catch is that it doesn't
actually set the itemDelimiter to a numToChar greater than 255, but sets it
to the 'first' char of the two byte char. This can be mathematically
calculated by subtracting 256 repeatedly until the result is below 256 (that
is, the number mod 256); in the above case it would be numToChar(154).

This is easy to demonstrate by pasting the following into the msg box:

set the itemDelimiter to numToChar(666)
put charToNum(the itemDelimiter) into msg

Substitute any number for 666; even if it's a number beyond the normal 65535
Unicode limit, Rev still sets it to a char below 256.
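
As a rough sketch, the mod 256 arithmetic can be compared with what the
engine actually sets by pasting something like the following into the msg
box. The variable names tNum and tExpected are just placeholders, and the
comparison only restates the behaviour described above for Rev 2.8.1 with
useUnicode left at false; other versions may well behave differently.

put 666 into tNum
put tNum mod 256 into tExpected -- 666 - 512 = 154
set the itemDelimiter to numToChar(tNum)
put charToNum(the itemDelimiter) && "expected" && tExpected into msg

If the two numbers shown agree, the delimiter was reduced to the char
predicted by the arithmetic.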

Interestingly, if you do 'set the useUnicode to true' then you will end up
with an error message.

But that is not the reason for this ramble. No, the reason for this ramble is
that the Rev Docs say that for Split and Combine the character used must be
in the ASCII range, 1 to 127; and that's the assumption I've been working on
for ages. But the sceptic in me just wanted to see what happens if I do use
something outside that range.

My initial 'crude manual' tests indicated that there was actually no problem
with Split or Combine using a character up to 255. So I set about writing a
more robust 'automatic' test. Here's what I came up with:

[commented numbers are explained below]

-------------------------------------

repeat with x = 128 to 255
 --1
 set the itemDelimiter to numToChar(x)
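 --2
 -- clear the lists left over from the previous pass
 put empty into tStartList
 put empty into tEndList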
 repeat with y = 1 to 100
  repeat with z = 1 to 5
   put numToChar(x-y-z)  into char z of item y of tStartList
  end repeat
 end repeat
 put tStartList into tArray
 --3
 split tArray by the itemDelimiter
 --4
 repeat with y = 1 to 100
  put tArray[y] into item y of tEndList
 end repeat
 --5
 if (tEndList <>  tStartList) then
  answer "Split failed with char: " & numToChar(x) titled "Failed"
  put tEndList & cr & tStartList into msg
  exit to top
 else
  --put tEndList & cr & tStartList & cr  into msg
 end if
 --6
 combine tArray by the itemDelimiter
 --7
 repeat for each item tItem in tArray
  if (tItem is not among the items of  tStartList) then
   answer "Combine failed with char: " & numToChar(x) titled "Failed"
   put tItem & " is not among " & tStartList into msg
   exit to top
  else
   --put tItem & " combine checks OK" into msg
  end if
 end repeat
end repeat
--8
put "Everything Checked OK!" into msg

-------------------------------------

--1 Set the itemDelimiter to a char greater than 127

--2 Create a 100-item list.
Initially each item was a single character which, to ensure no conflict with
the itemDelimiter char, was simply numToChar(x-y).
Just for the sake of it I then decided to make the items 5-char 'words', so
added the x-y-z repeat loop.
As a matter of interest, the x-y version runs in the blink of an eye; the
x-y-z version takes a second or two.

--3 create an array with Split using a character above 127

--4 Because Combine doesn't build a list in any specific order, rebuild the
list from the array by key, in the exact order it was created.

--5 Compare the StartList with the EndList and if they are not the same pop
up a dialog. If you 'uncomment' the put line you can watch the script run,
but it adds significantly to the time taken to complete.

--6 Now Combine the array, again using a char outside the allowed range.

--7 A simple check that each item in the new list is in StartList. Again,
report if it isn't; and again, if you uncomment the put line you can watch
the script run, at the cost of much, much more running time.

--8 If it gets to here then what came Out must be what went In so using a
char from 128 to 255 doesn't seem to cause an error.
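
For reference, the single-character x-y version mentioned in note 2 might
look roughly like the sketch below. It is a reconstruction based on the
description above, not the script as originally run. Two things in it are
extras that the script above does not have, added purely as assumptions
about what a slightly stricter test could check: setting caseSensitive to
true (string comparisons otherwise ignore case), and counting the keys of
the array after Split. tCount is just a placeholder name.

-------------------------------------

repeat with x = 128 to 255
 set the itemDelimiter to numToChar(x)
 -- extra (assumption): make the comparisons case-sensitive
 set the caseSensitive to true
 -- start each pass with empty lists
 put empty into tStartList
 put empty into tEndList
 -- build 100 single-character items; x-y can never equal x
 repeat with y = 1 to 100
  put numToChar(x-y) into item y of tStartList
 end repeat
 put tStartList into tArray
 split tArray by the itemDelimiter
 -- extra (assumption): Split should give exactly 100 elements
 put the number of lines in the keys of tArray into tCount
 if tCount <> 100 then
  answer "Split gave " & tCount & " elements with char number: " & x titled "Failed"
  exit to top
 end if
 -- rebuild the list in key order and compare with the original
 repeat with y = 1 to 100
  put tArray[y] into item y of tEndList
 end repeat
 if tEndList <> tStartList then
  answer "Split failed with char number: " & x titled "Failed"
  exit to top
 end if
end repeat
put "Single-character version checked OK" into msg

-------------------------------------

The dialogs report the char number rather than the char itself, simply
because a char above 127 may not display usefully in a dialog.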

Anyone like to point out the error in my logic? Or write a more robust test
that supports the limit in the Docs?

This was tested on an Intel Mac, OSX 10.4.9, Rev 2.8.1 build 472

NOTE: I'm NOT advocating the use of chars above 127 for Split or Combine!!
This is purely for the 'enjoyment' of developing a script that tests a
multitude of cases to see if an error occurs:-)


