md5

capellan capellan2000 at yahoo.com
Thu Feb 21 14:12:20 EST 2008


Hi Mark,

This has been discussed before ;-)

On September 25, 2005, Mark Waddingham wrote,
in the message "put url some url into someVar...":

Mark Schonewille was trying to do something very similar a while ago and
filed an enhancement request about being able to do md5 digests on large
files *without* them needing to be loaded. For interest see here:
 
  <http://support.runrev.com/bugdatabase/show_bug.cgi?id=2410>

As suggested (by me) in the bug-report, you needn't take the md5digest
of the entire file at once and can do something like this instead:

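-- 256k, following the suggestion below; an assumed value, tune as needed
constant CHUNK_SIZE = 262144

function quasiMD5 pFile
  local tMD5s, tResult
  open file pFile for binary read
  repeat
    read from file pFile for CHUNK_SIZE chars
    -- capture the read status before any other command runs
    put the result into tResult
    -- digest what was read first, so the final partial chunk is not dropped
    if it is not empty then
      put the md5digest of it after tMD5s
    end if
    -- a non-empty result means EOF (or a read error): stop reading
    if tResult is not empty then
      exit repeat
    end if
  end repeat
  close file pFile
  -- digest of the concatenated per-chunk digests, not the file's plain md5
  return the md5Digest of tMD5s
end quasiMD5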

Where you can make CHUNK_SIZE some suitable size (perhaps 256k/512k).

[ My intuitive analysis of the impact of doing this on the integrity
(i.e. potential for collision) of the digest is that it will be minimal
- but perhaps someone more knowledgeable in this area could comment. ]

Hope this helps,

Mark Waddingham.


Well, I have not tried this code myself, but I am sure that no md5
command-line tool loads a whole Linux installer CD image into memory to
verify the file. Surely they read the file in chunks, much as in the
approach suggested by Mark Waddingham.
That leaves only one thing to settle: what exact size should the chunks
be?
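
To make the chunk-size question concrete, here is a small sketch in Python
(not Revolution, since the thread's premise is that Revolution has no
incremental md5). It assumes Python's standard hashlib module; the function
names streamed_md5, quasi_md5 and make_sample_file are made up for this
illustration. It compares a true streamed md5, which I believe is what the
command-line tools compute, with the digest-of-digests scheme from quasiMD5:

```python
import hashlib
import os
import tempfile


def streamed_md5(path, chunk_size=256 * 1024):
    """True MD5 of the file, fed to one hash object a chunk at a time."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()


def quasi_md5(path, chunk_size=256 * 1024):
    """MD5 of the concatenated per-chunk MD5 digests (the quasiMD5 idea)."""
    digests = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # The last chunk may be shorter than chunk_size; it is still hashed.
            digests += hashlib.md5(chunk).digest()
    return hashlib.md5(digests).hexdigest()


def make_sample_file(size=1_000_000):
    """Write a throwaway test file (hypothetical helper) and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(range(256)) * (size // 256) + b"tail")
    return path


if __name__ == "__main__":
    path = make_sample_file()
    print("streamed 4k :", streamed_md5(path, 4096))
    print("streamed 64k:", streamed_md5(path, 65536))
    print("quasi 4k    :", quasi_md5(path, 4096))
    print("quasi 64k   :", quasi_md5(path, 65536))
    os.remove(path)
```

The streamed digest does not depend on the chunk size and matches the plain
md5 of the file. The quasi digest changes whenever the chunk size changes and
will not match what a tool such as md5sum prints. So for quasiMD5 any chunk
size should do, as long as everyone comparing digests uses the same one.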

alejandro



masmit wrote:
> 
> I need to get the md5 of potentially very large  
> files on disc. I'm hoping to avoid having to load them into memory in  
> order get the digest.
> 
> Mark
> 

