binaryEncode/binaryDecode (was Re: Endian conversion problems)

David Beck davethebrv at mac.com
Wed Aug 6 16:45:01 EDT 2003


Well, I finally arrived at an acceptable (IMO) solution to reading and
writing large amounts of big-endian data on a Windows machine. I'll post
part of it at the bottom, and if anybody wants the rest I would be happy to
provide it off list. But in the course of coming up with this solution I
discovered something about the binaryEncode/binaryDecode functions.

Basically these two functions are a mess. The documentation is flat-out
wrong, and even if it were correct they would still be a mess.

for binaryEncode:

The 'n' format is documented as encoding a number string into a signed
2-byte integer in network byte order. The 'N' format is documented to do
the same thing for a 2-byte unsigned integer in network byte order.

This is wrong.

The 'n' format will convert signed OR unsigned values into a 2-byte integer
in network byte order, and the 'N' format will convert signed or unsigned
values into a 4-byte integer in network byte order.

This function works consistently regardless of whether you are running on a
Mac or Windows machine.
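That the encoder can accept either range makes sense: a signed and an
unsigned value with the same bit pattern encode to identical bytes. As a
sketch of the same idea in Python (not Transcript), using the standard
struct module, where ">h"/">H" are 2-byte and ">i"/">I" are 4-byte
big-endian (network order) formats:

```python
import struct

# Encoding -1 as signed and 65535 as unsigned yields the same two
# bytes, which is why one format can accept either range of input.
assert struct.pack(">h", -1) == struct.pack(">H", 65535) == b"\xff\xff"

# The same holds for 4-byte values in network byte order.
assert struct.pack(">i", -1) == struct.pack(">I", 4294967295) == b"\xff\xff\xff\xff"

print("signed and unsigned inputs encode to identical bytes")
```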

for binaryDecode:

The 'n' format is documented to convert signed big-endian 2-byte binary
data into a number string. This works as documented on the Mac side, but if
I build a standalone for Windows, the 'n' flag starts converting the same
2 bytes as unsigned 2-byte data.

The 'N' format is documented to convert unsigned 2-byte binary data in
network byte order, but actually converts 4-byte binary data in network
byte order, as signed data on the Mac side and unsigned data on the
Windows side.

I know the 'I' (uppercase i) flag has similar problems, as I would guess the
's' flag does, but I didn't go there since I didn't have to.
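The ambiguity only bites on the decoding side: the same two bytes are a
perfectly valid signed or unsigned integer, and the two readings differ by
exactly 65536 whenever the high bit is set. A Python sketch (again using
the standard struct module, not Transcript) of why a platform difference
here silently changes your data:

```python
import struct

raw = b"\xff\xfe"  # two bytes in network (big-endian) byte order

signed, = struct.unpack(">h", raw)    # read as signed 16-bit
unsigned, = struct.unpack(">H", raw)  # read as unsigned 16-bit

print(signed, unsigned)  # -2 65534

# With the high bit set, the two interpretations differ by 2^16.
assert unsigned == signed + 65536
```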

Please fix these functions and their documentation!!

As for the solution I settled on for reading a large amount of big-endian
data, here is a cross-platform function for reading a large sequence of
unsigned 2-byte big-endian ints. Basically it reads them all in one go,
stores the lot of them in a temporary variable, and then pops them out of
the temp var one at a time, converting them with binaryDecode (taking into
account the function's little 'quirks') as it goes:

function ReadUInt16 filePath, count
  if the platform is "Win32" then
    read from file filePath for (count * 2)
    put it into rawData
    put "" into finalResult
    put "" into curNum
    repeat while rawData is not empty
      put binaryDecode( "n", char 1 to 2 of rawData, curNum ) into dummy
      -- binaryDecode's "n" can hand back a signed value; fold any
      -- negative result into the unsigned 0..65535 range
      -- (32768 + (curNum + 32768) is just curNum + 65536).
      if curNum < 0 then put 32768 + ( curNum + 32768 ) into curNum
      put curNum & comma after finalResult
      delete char 1 to 2 of rawData
    end repeat
    delete the last char of finalResult
  else
    read from file filePath for count uint2
    put it into finalResult
  end if

  return finalResult
end ReadUInt16

This way you avoid doing a separate file read for every number, which
would slow things down a lot. I also have cross-platform functions to
read/write big-endian unsigned 2-byte ints and unsigned 4-byte ints. If
anybody would like them, drop me a note.
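For comparison, the same bulk decode can be done in one call in Python: a
repeat count in the struct format string converts the whole buffer at
once, and the sign fixup in the script above (32768 + (curNum + 32768)) is
just curNum + 65536. The helper names below are my own, not from the
original post:

```python
import struct

def read_uint16_be(raw):
    """Decode a run of 2-byte big-endian unsigned ints in one call."""
    count = len(raw) // 2
    return list(struct.unpack(">%dH" % count, raw[:count * 2]))

def to_unsigned16(n):
    """Fold a signed 16-bit value into 0..65535; for negative n this
    is equivalent to the script's 32768 + (n + 32768)."""
    return n + 65536 if n < 0 else n

print(read_uint16_be(b"\x00\x01\xff\xff"))  # [1, 65535]
assert to_unsigned16(-1) == 32768 + (-1 + 32768) == 65535
```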

Hopefully this will save some people the same 'discovery process' I went
through.

Cheers,
Dave

on 8/5/03 8:27 PM, Dar Scott at dsc at swcp.com wrote:

> 
> On Tuesday, August 5, 2003, at 08:51 PM, David Beck wrote:
> 
>> The question is: What is the best way to read/write a bunch of 2-byte
>> integers in network byte order on Windows??
> 
> Network order is a good way to go.
> 
>> I looked in to read/writing raw chars, and then using the
>> binaryEncode/binaryDecode functions to translate the raw binary data to
>> numbers, but those functions require a parameter for every value that
>> gets
>> encoded/decoded. I need to be able to read/write sequences of up to
>> 1120
>> 2-byte integers in a reasonable amount of time, and I can neither
>> supply
>> 1120 arguments to those functions, nor can I read/write the ints one
>> at a
>> time, since doing this is unacceptably slow.
> 
> You can always use numToChar() and charToNum(), but I think the
> direction you are heading is probably better in this case.
> 
> Read and write your file with URL "binfile:...".
> 
> For writing, accumulate the file value with 'put ... after...' and then
> put the file.  That is fast.  Create your two byte strings one at a
> time with binaryEncode() in a loop and append them.
> 
> For reading files, you can use "repeat with i = 1 to the length of
> myBinFile by 2".  Then use i to pick off a two-char string and
> binaryDecode() it.  (I'm not sure I have the last cycle of that loop
> right.)
> 
> I don't think that is as slow as it looks.  (Even so, I have suggested
> an accumulation syntax for the format strings to handle data like this;
> we'll see if that goes anywhere.)
> 
> 
>> ps. As a side note, I noticed that there is no support for 4-byte
>> integers
>> in network byte order from the binaryEncode/binaryDecode functions.
>> This
>> isn't a problem for me, since almost all of the data I need to
>> read/write is
>> 2-byte integers, but I thought it was strange since there is support
>> for
>> 4-byte integers in host byte order from the same functions.
> 
> I think this Transcript Dictionary format description is in error:
> 
>> N: convert next amount 4-byte chunks of data to unsigned integers in
>> network byte order
> 
> This is actually signed.  (I think it might accept an unsigned number
> in the high range in binaryEncode(), though.)
> 
> Dar Scott
> 
> 
> 
> 
> _______________________________________________
> use-revolution mailing list
> use-revolution at lists.runrev.com
> http://lists.runrev.com/mailman/listinfo/use-revolution



