randomly order a list

Dar Scott dsc at swcp.com
Wed Jun 5 18:41:24 EDT 2013


MD5 works on 512-bit blocks.  The message is padded so that it is 64 bits short of being a multiple of 512.  It works on all of those blocks iteratively, generating a 128-bit value each iteration and consuming one.  The final 128-bit value is the hash.  

So, it does crank away on all of the bits.  However, if the length is 56 bytes or less, the time is constant.  

Dar


On Jun 5, 2013, at 11:38 AM, Geoff Canyon wrote:

> What code were you using Alex? I thought the first step(s) of the MD5
> process reduce (or grow) whatever input string is given to 128 bits, and
> then everything from there operates on the 128 bit data. Likewise for SHA1,
> in 160 bits.
> 
> In other words, the size of the individual strings should have a limited
> impact on the MD5 algorithm. For example, the two times returned by this
> are nearly identical:
> 
> on mouseUp
>   repeat 5
>      put random(999999999) after S[1]
>   end repeat
>   repeat 5000
>      put random(999999999) after S[2]
>   end repeat
>   repeat with i = 1 to 2
>      put ticks() into T
>      repeat 1000000
>         get MD5digest(S[1])
>      end repeat
>      put i && ticks() - T & cr after R
>   end repeat
>   put R
> end mouseUp
> 
> 
> 
> On Tue, Jun 4, 2013 at 6:26 PM, Alex Tweedly <alex at tweedly.net> wrote:
> 
>> No comments on the "collision-or-not-ness", but some concerns about
>> performance.
>> 
>> The performance of "random() & random()" is conveniently data-independent,
>> but that for md5digest() is not. With nice short lines, it is indeed faster
>> than the random&random version, but as the line size increases, so does the
>> time taken by all of the digest methods. I didn't test it thoroughly, but
>> the swap-over point is fairly low - somewhere around 500 chars per line.
>> 
>> -- Alex.
>> 
>> 
>> On 04/06/2013 18:51, Geoff Canyon wrote:
>> 
>>> At the risk of beating the decaying equus -- the previously suggested
>>> random() solutions should be fine for all purposes --I found an
>>> alternative
>>> that:
>>> 
>>> 1. Is faster than sorting by random(999999999) & random(999999999)
>>> 2. Is about as fast as sorting by random(999999999)
>>> 3. Is (I think) less likely to have duplicate sort keys
>>> 
>>> The drawback is that it is determinative (albeit random) for any given set
>>> of data, unless you are willing to accept performance equivalent to
>>> sorting
>>> by random(999999999) & random(999999999), while providing near-certainty
>>> of
>>> a true sort (I think).
>>> 
>>> The one-time, as fast as any solution so far, sort is:
>>> 
>>>       sort lines of myVar by md5digest(each)
>>> 
>>> Collisions are highly unlikely in 128 bits. Even random(999999999) &
>>> random(999999999) only provides about 60 bits, which, to be clear, is
>>> *more* than enough, but md5 is (I think) even more certain, and faster.
>>> However, it will always produce the same results.
>>> 
>>>       sort lines of myVar by sha1digest(each)
>>> 
>>> Works roughly the same: 160 bits of guaranteed-no-collision-ness, but it's
>>> a little slower, although still much faster than random(999999999) &
>>> random(999999999). Like MD5, it will always sort the same data the same
>>> (random) way.
>>> 
>>> The same-ness for either solution can (I think) be fixed by this:
>>> 
>>>       put ticks() into T
>>>       sort lines of myVar by md5digest(T & each)
>>> 
>>> or
>>> 
>>>       put ticks() into T
>>>       sort lines of myVar by sha1digest(T & each)
>>> 
>>> That should result in random results each time, and is a little faster
>>> (MD5) or about 1/3 slower (SHA1) than random(999999999) &
>>> random(999999999)
>>> 
>>> If anyone has thoughts on the collision-or-not-ness of MD5 or SHA1, feel
>>> free to comment. Otherwise, I hope I'm done now ;-)
>>> ______________________________**_________________
>>> use-livecode mailing list
>>> use-livecode at lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your
>>> subscription preferences:
>>> http://lists.runrev.com/**mailman/listinfo/use-livecode<http://lists.runrev.com/mailman/listinfo/use-livecode>
>>> 
>> 
>> 
>> ______________________________**_________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your
>> subscription preferences:
>> http://lists.runrev.com/**mailman/listinfo/use-livecode<http://lists.runrev.com/mailman/listinfo/use-livecode>
>> 
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode





More information about the use-livecode mailing list