URL exists?
Dave Cragg
dcragg at lacscentre.co.uk
Fri Nov 7 11:48:24 EST 2003
At 1:23 am -0500 7/11/03, Shari wrote:
>What's the fastest way to check if an URL exists?
>
>I tried:
>
>if URL "http://www.whatever.com/something/else.html" is empty then
> blah blah blah
>end if
>
>Very very slow. Is there a faster way?
You'll get different results depending on whether it's the host server
(www.whatever.com) or the resource on the server
(/something/else.html) that doesn't exist. If it's the latter, you'll
typically get a 404 response (the result will return "error 404" plus
a descriptive string, often "file not found"). But data is usually
returned from such a url call: the "missing page" html message that
http servers usually send. So checking for empty won't work in this
case. However, the whole process shouldn't be slow.
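As a minimal sketch of checking the result rather than testing for empty data (urlExists is a hypothetical name, not a libUrl routine, and this still performs a full GET):

```livecode
## urlExists is a hypothetical helper name, not part of libUrl
function urlExists pUrl
  put URL pUrl into tData
  ## libUrl reports failures in the result, e.g. "error 404 ..." or
  ## "invalid host address"; tData may still hold the server's
  ## "missing page" html, so it is deliberately not tested here
  return the result is empty
end urlExists
```

It could then be used as: if not urlExists("http://www.whatever.com/something/else.html") then ...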
The "missing host" case shouldn't be slow either if the DNS lookup
returns quickly. (libUrl does a lookup with hostNameToAddress before
making any requests.) "invalid host address" is returned in the result.
A delay is most likely to occur because of connection problems, or
when using numerical IP addresses (192.168.1.1, etc.) that don't
exist on the local network.
At 2:33 am -0500 7/11/03, Brian Yennie wrote:
>I think the only way that will be technically faster than this would
>be to use sockets, and make a HEAD request, rather than a GET for
>the url in question.
>
>You'd have to look up the http protocol, but basically what it does is
>let you get just the http headers for a page rather than the
>contents. You'd also have to parse the return code...
In theory, it should be possible to make a HEAD request using the
libUrlSetCustomHttpHeaders routine. But I just checked, and it's
flawed. libUrl tries to read for data beyond the returned headers,
and you have to wait for a timeout before it returns. I'll try and
fix this in a future revision, probably with a specific routine for
using HEAD. Meanwhile, here's a *quickly* concocted routine for
sending HEAD requests. (WARNING: it really needs socketError and
socketTimeout handlers and a way to jump out of the waits in the main
routine if something goes wrong.)
## this in a button somewhere
## substitute your values for tHost, tResource, tPort
on mouseUp
  put "www.whatever.com" into tHost
  put "80" into tPort
  put "/something/else.html" into tResource
  put empty into tData
  doHEAD tHost, tPort, tResource, tData
  if the result <> empty then
    answer the result
  else
    answer tData
  end if
end mouseUp
------------------------------------
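The data shown by the button script is the raw header block, so the "does it exist" answer still has to be read from the status line. A minimal sketch, assuming the response begins with a standard status line such as "HTTP/1.1 404 Not Found" (httpStatus is a hypothetical helper name):

```livecode
## httpStatus is a hypothetical helper, not part of libUrl
function httpStatus pResponse
  ## the first line of an http response is the status line,
  ## e.g. "HTTP/1.1 404 Not Found"; word 2 is the numeric code
  return word 2 of line 1 of pResponse
end httpStatus
```

In the mouseUp handler, "answer tData" could then become something like: if httpStatus(tData) is 200 then answer "exists". Redirect codes such as 301 or 302 would need their own decision.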
## this part in the message path (card, stack, library)
local lvWritten, lvReadData, lvReaded

on doHEAD pHost, pPort, pResource, @pResponse
  ## build the request: request line, headers, then a blank line
  put "HEAD" && pResource && "HTTP/1.1" into tHeaders
  put crlf & "Host:" && pHost after tHeaders
  put crlf & "User-Agent: Metacard" after tHeaders ##optional
  put crlf & crlf after tHeaders
  put pHost & ":" & pPort into tSocket
  open socket to tSocket
  if the result is not empty then return the result
  ## send the request and wait for the "written" callback
  put empty into lvWritten
  write tHeaders to socket tSocket with message "written"
  if the result <> empty then
    put the result into tRes
    close socket tSocket
    return tRes
  end if
  wait until lvWritten <> empty with messages
  ## read the response headers and wait for the "readed" callback
  put empty into lvReaded
  read from socket tSocket with message "readed"
  if the result <> empty then
    put the result into tRes
    close socket tSocket
    return tRes
  end if
  wait until lvReaded <> empty with messages
  put lvReadData into pResponse
  close socket tSocket
  return empty
end doHEAD

## x is the socket id
on written x, y
  put true into lvWritten
end written

## x is the socket id, y is the data that was read
on readed x,y
  put y into lvReadData
  put true into lvReaded
end readed
-----------------------------------
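The warning above mentions socketError and socketTimeout handlers. An untested sketch of what they might look like, assuming they sit in the same script as doHEAD so they can see its script locals; all they do is fill those locals so the "wait until ..." lines stop waiting:

```livecode
## Sketch only: place in the same script as doHEAD.
## pSocket is the socket id, pError the error string.
on socketError pSocket, pError
  put "socket error:" && pError into lvReadData
  put true into lvWritten
  put true into lvReaded
end socketError

## sent when a socket operation exceeds the socketTimeoutInterval
on socketTimeout pSocket
  put "socket timeout" into lvReadData
  put true into lvWritten
  put true into lvReaded
end socketTimeout
```

With only these handlers, the error text ends up in the response parameter rather than in the result, and doHEAD resets lvReaded before its read, so an error that arrives during the write may still leave the second wait hanging. A fuller fix would have doHEAD check for an error after each wait.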
Cheers
Dave