Speeding up get URL

Sarah Reichelt sarah.reichelt at gmail.com
Sun Aug 3 20:31:50 EDT 2008


On Mon, Aug 4, 2008 at 12:35 AM, Shari <shari at gypsyware.com> wrote:
> Goal:  Get a long list of website URLS, parse a bunch of data from each
> page, if successful delete the URL from the list, if not put the URL on a
> different list.  I've got it working but it's slow.  It takes about an hour
> per 10,000 urls.  I sell tshirts.  Am using this to create informational
> files for myself which will be frequently updated.  I'll probably be running
> this a couple times a month and expect my product line to just keep on
> growing.  I'm currently at about 40,000 products but look forward to the day
> of hundreds of thousands :-)  So speed is my need... (Yes, if you're
> interested my store is in the signature, opened it last December :-)
>
> How do I speed this up?

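Shari, I think the delay is mostly due to the connection to the server,
not your script, so there may not be a lot you can do about it. An hour
per 10,000 URLs works out to about 0.36 seconds per page, which is
plausible for network round trips alone.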

I did have one idea: can you try getting more than one URL at the same
time? Build a list of the URLs to check, then have a script grab the
first one on the list and send a non-blocking request to that site,
with a message to call when all the data has arrived. While waiting
for that reply, start loading the next site, and so on.
Bookmark-checking software seems to work like this.
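
As a rough illustration of that idea, here is a minimal sketch in
Python rather than LiveCode (where `load URL ... with message` would
presumably be the natural fit). It uses a thread pool as a stand-in for
non-blocking requests with a callback. The names `fetch`,
`process_urls`, `parse` and `max_workers` are all hypothetical, and
`parse` stands for whatever routine pulls the product data out of a
page. URLs that download and parse cleanly end up in `results`; the
rest go onto a separate `failed` list, as in the original goal.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def process_urls(urls, parse, fetch=fetch, max_workers=8):
    """Fetch several URLs at once; return (results, failed).

    results maps each successful URL to parse(page_text);
    failed lists the URLs whose download or parse raised an error.
    """
    results, failed = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Queue every download; the pool runs max_workers of them at a time.
        pending = {pool.submit(fetch, url): url for url in urls}
        # Handle each page as soon as it arrives, in completion order.
        for future in as_completed(pending):
            url = pending[future]
            try:
                results[url] = parse(future.result())
            except Exception:
                failed.append(url)
    return results, failed
```

Something like `results, failed = process_urls(url_list, parse=my_parser)`
would then run the whole list. How many simultaneous requests help
before the server or the connection becomes the bottleneck is
something to measure; 8 is only a guess.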

Would opening a socket and reading from the socket be any faster? I
don't imagine that it would be, but it might be worth checking.

The other option is just to arrange things so the wait is not
intrusive, e.g. have it download the sites overnight and save them all
for processing when you are ready, or have a background app that does
the downloading slowly (so it doesn't overload your system).
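
The download-now, process-later approach might look something like the
following Python sketch. It is only an illustration under assumed
names: `download_slowly`, `cache_path`, `cache_dir` and `delay` are
hypothetical, and `fetch` is any function that returns a page's text
for a URL. Pages are saved one file per URL, already-saved pages are
skipped on a rerun, and the pause between requests keeps the job
gentle on both your machine and the server.

```python
import hashlib
import time
from pathlib import Path


def cache_path(url, cache_dir):
    """Map a URL to a stable file name inside cache_dir."""
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html"
    return Path(cache_dir) / name


def download_slowly(urls, fetch, cache_dir, delay=1.0):
    """Save each page to disk, pausing between requests; return failed URLs."""
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for url in urls:
        target = cache_path(url, cache_dir)
        if target.exists():
            continue  # already saved on an earlier run
        try:
            target.write_text(fetch(url), encoding="utf-8")
        except Exception:
            failed.append(url)
        time.sleep(delay)  # throttle between requests
    return failed
```

The parsing step can then read the saved files whenever convenient,
without touching the network at all.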

Cheers,
Sarah


