Could Rev do this?

Richard Gaskin ambassador at fourthworld.com
Sat Apr 15 16:11:37 EDT 2006


simplsol wrote:

 >  Richard wrote:
 >> I was thinking about the same problem here the other day,
 >> and while I haven't written anything yet it occurred to me
 >> that Rev's MD5 function could probably be useful: you'd
 >> make a list of all the files, read each file into a
 >> variable and run the variable through MD5, and then any
 >> matching MD5 keys are likely duplicates.
 >>
 >>  Anyone here see a weakness to that crude approach?
 >
 > Richard,
 > A potential danger I see to this approach is if one of the
 > duplicates had been edited. The routine would have to check
 > for modification date (probably safe to keep the one most
 > recently changed) - just not delete any apparent duplicates
 > that have different edit dates?

I think an MD5 approach would account for that, since any edits would 
change the MD5 signature.

The MD5digest function takes any chunk of data and returns a short 
(16-byte) binary "signature" derived in a way that makes it 
extremely improbable that two different sets of data will ever produce 
the same signature by accident.
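
As a rough illustration of that property, here is a small sketch in 
Python rather than Rev. It assumes Python's hashlib.md5 as a stand-in 
for MD5digest; the helper name md5_signature and the sample strings 
are made up for the example.

```python
import hashlib

def md5_signature(data: bytes) -> bytes:
    """Return the raw 16-byte MD5 digest of the given data."""
    return hashlib.md5(data).digest()

# Hypothetical sample data: the second string differs by one character.
original = b"The quick brown fox"
edited = b"The quick brown fax"

sig_a = md5_signature(original)
sig_b = md5_signature(edited)

print(len(sig_a))                         # 16
print(sig_a == sig_b)                     # False: one changed byte changes the digest
print(sig_a == md5_signature(original))   # True: identical data, identical digest
```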

So if you had two files and only one pixel was different between them, 
the MD5digest result would be different.  Only true duplicates, where 
all data is completely identical, would deliver the same MD5 signature.
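
The approach discussed in this thread (list the files, read each one 
into a variable, run it through MD5, and compare the results) might 
look something like the sketch below. It is written in Python, not 
Rev, with hashlib.md5 standing in for MD5digest, and the function name 
find_duplicates is hypothetical.

```python
import hashlib
import os
from collections import defaultdict

def find_duplicates(folder):
    """Group files under folder by MD5 digest; return groups of likely duplicates."""
    by_digest = defaultdict(list)
    for root, _dirs, names in os.walk(folder):
        for name in names:
            path = os.path.join(root, name)
            # Read the whole file as binary data, as in the crude approach
            # described above; very large files would want chunked reading.
            with open(path, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
            by_digest[digest].append(path)
    # Only digests shared by two or more files indicate likely duplicates.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling find_duplicates on a folder returns a list of path groups, 
each group holding files whose contents gave the same digest. A 
cautious version would compare the files in a group byte for byte 
before deleting anything.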

--
  Richard Gaskin
  Managing Editor, revJournal
  _______________________________________________________
  Rev tips, tutorials and more: http://www.revJournal.com
