Speeding up get URL

Shari shari at gypsyware.com
Sun Aug 3 10:35:31 EDT 2008


Goal:  Take a long list of website URLs and parse a bunch of data 
from each page; if that succeeds, delete the URL from the list, and 
if not, put the URL on a different list.  I've got it working, but 
it's slow: about an hour per 10,000 URLs.  I sell t-shirts, and I'm 
using this to build informational files for myself that will be 
updated frequently.  I'll probably run this a couple of times a 
month, and I expect my product line to just keep on growing.  I'm 
currently at about 40,000 products but look forward to the day of 
hundreds of thousands :-)  So speed is my need... (Yes, if you're 
interested, my store is in the signature; I opened it last December :-)

How do I speed this up?

# toDoList needs to have the successful URLs deleted, and failed
# URLs moved to a different list
# that's why p counts down to 1, for the delete
# URLs are standard http://www.somewhere.com/somePage

   repeat with p = the number of lines of toDoList down to 1
      put url (line p of toDoList) into tUrl
      # don't want to use *it* because there's a long script that follows
      # and *it* is too easily changed, though I've heard *it* is faster than *put*
      # do the stuff
      if doTheStuffWorked then
         delete line p of toDoList
      else
         # move the failed URL itself (not just its line number) to failedList
         put line p of toDoList & return after failedList
         delete line p of toDoList
      end if
      updateProgressBar # another slowdown, but necessary; it shows
      # a count of how many are left to do
   end repeat
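
One direction that might help (a rough, untested sketch, assuming the 
standard Internet library is available): the blocking "put url ... into" 
waits for each page in turn, while the non-blocking "load" command lets 
several downloads run at the same time and sends you a callback message 
when each one finishes.  Handler names like startLoads and urlLoaded 
below are placeholders, and toDoList/failedList would need to be script 
locals so the callback can see them:

   # queue a handful of non-blocking downloads at once
   on startLoads
      repeat with p = 1 to min(10, the number of lines of toDoList)
         load URL (line p of toDoList) with message "urlLoaded"
      end repeat
   end startLoads

   # libURL sends this with the URL and its status as parameters
   on urlLoaded pUrl, pStatus
      if pStatus is "cached" then
         put URL pUrl into tData  # reads from the cache, no network wait
         # do the stuff with tData
         unload URL pUrl          # free the cached copy
      else
         # on failure pStatus starts with "error"
         put pUrl & return after failedList
      end if
      # start the next queued download and update the progress bar here
   end urlLoaded

With something like this the hour per 10,000 URLs would be dominated by 
the slowest pages in each batch rather than the sum of every request, 
though I haven't measured it.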
-- 
   Dogs and bears, sports and cars, and patriots t-shirts
   http://www.villagetshirts.com
  WlND0WS and MAClNT0SH shareware games
  http://www.gypsyware.com



More information about the use-livecode mailing list