Grabbing HTTP Headers
Sannyasin Sivakatirswami
katir at hindu.org
Mon May 5 18:15:01 EDT 2003
Dan:
Well, if you are using Apache it's easy... you can just read the globals,
which are filled in from the header when the request comes down the pipe.
Maybe the following little script will help. I have put the entire script
here in case it is useful to others doing CGI stuff. We recently converted
our entire HinduismToday.com site to .shtml and restructured the
directories, and this was created to deal with the fallout.
Delete the stuff pertaining to our particular situation. It is also a
simple solution for anyone who wants to handle 404s on their site.
Basically this is a redirect solution: the incoming 404 is checked for the
requested page. If the page has simply been renamed to .shtml or moved
under /archives, we redirect there. If the redirect is otherwise known, it
can be found in the 404s-map.txt file. If the page definitely does not
exist, or for some reason we don't want to map it to another page, then we
simply return a sweet 404 page to the user with suggestions etc.
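For anyone who wants to see the lookup order outside of MetaCard, here is a rough sketch in Python (not the language of the script that follows). It assumes the map file holds one "old=new" pair per line, which is what the split in the script implies; the names resolve_request, load_url_map and doc_root are made up for illustration and do not appear in our code.

```python
import os


def load_url_map(text):
    """Parse 'old=new' lines (the layout assumed for 404s-map.txt)."""
    url_map = {}
    for line in text.splitlines():
        old, sep, new = line.partition("=")
        if sep and old.strip():
            url_map[old.strip()] = new.strip()
    return url_map


def resolve_request(request_uri, doc_root, url_map):
    """Return ("redirect", target) or ("not_found", request_uri)."""
    candidate = request_uri.replace(".html", ".shtml")
    relative = candidate.lstrip("/")
    # 1. Same path, renamed from .html to .shtml
    if os.path.isfile(os.path.join(doc_root, relative)):
        return ("redirect", candidate)
    # 2. Same path, moved under /archives
    if os.path.isfile(os.path.join(doc_root, "archives", relative)):
        return ("redirect", "/archives" + candidate)
    # 3. Hand-maintained map of known redirects
    if request_uri in url_map:
        return ("redirect", url_map[request_uri])
    # 4. Nothing matched: the caller should serve the friendly 404 page
    return ("not_found", request_uri)
```

The order matters: the cheap file-existence checks come first, so the map only needs entries for pages that were genuinely moved or renamed by hand.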
We have other Rev tools that FTP the log files from the remote server,
strip out all dups and present a clean list of this week's 404s, which
are then used to build more cases in the switch below or update the
URL map.
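Our log tools are written in Rev and are not shown here, but the "strip out all dups" step is easy to picture. A minimal sketch in Python, assuming the server writes access-log lines in the usual Apache Common Log Format (the function name unique_404s and the pattern are illustrative, not taken from our tools):

```python
import re

# Matches the quoted request and the status code that follows it, e.g.
#   "GET /old.html HTTP/1.0" 404 312
LOG_LINE = re.compile(r'"[A-Z]+ (\S+)[^"]*" (\d{3}) ')


def unique_404s(log_lines):
    """Return each path that produced a 404, once, in first-seen order."""
    seen = set()
    result = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group(2) == "404":
            path = match.group(1)
            if path not in seen:
                seen.add(path)
                result.append(path)
    return result
```

Each path in the resulting list is a candidate for a new line in the URL map.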
#!/export/vhost/com/h/hinduismtoday/www/public_html/cgi-bin/mc
on startup
  ## This CGI is triggered by the ErrorDocument directive in our
  ## .htaccess file. Apache passes the HTTP header and other data
  ## from the GET request to the CGI as environment variables,
  ## which show up in "the globals" with a $ prefix.
  ## We simply take this data, restructure it a bit and turn it into
  ## a little array.
  repeat for each item i in the globals
    put i & "=" & value(i) & cr after tGlobes
  end repeat
  split tGlobes by cr and "="
  ## now you have all the header info in a nice little array
  ## Now we read our redirect matrix/map into an array.
  put url "file:../ssi/404s-map.txt" into urlMap
  split urlMap by cr and "="
  ## Now extract from the globals array the page the person
  ## is trying to reach and put it into a variable.
  put tGlobes["$REQUEST_URI"] into tRequest
  ##### logging feature
  ## Here we extract certain parts of the HTTP header
  ## and pass them to our own 404 log... I did this just to
  ## be able to "watch" the 404s a bit more, and this passes
  ## a bit more data than what we get in the error logs.
  ## Later we would eliminate this log... it's redundant and
  ## makes another maintenance task: cleaning up that log file.
  put "*" & the date & cr after logEntry
  put "$REQUEST_URI" & tab & tGlobes["$REQUEST_URI"] & cr after logEntry
  put "$REMOTE_HOST" & tab & tGlobes["$REMOTE_HOST"] & cr after logEntry
  put "$REMOTE_ADDR" & tab & tGlobes["$REMOTE_ADDR"] & cr after logEntry
  put "$0" & tab & tGlobes["$0"] & cr after logEntry
  put cr & cr after logEntry
  switch
    case tRequest contains ".html"
      ## replace .html with .shtml
      put tRequest into tempRequest
      replace ".html" with ".shtml" in tempRequest
      ## check for existence, then redirect if yes
      ## (the parentheses make sure the whole path is tested,
      ## since "there is a" binds tighter than "&")
      if there is a file ("../" & tempRequest) then
        redirect_um(tempRequest)
        break
      end if
      ## test to see if it is an HT article (add ../archives)
      if there is a file ("../archives/" & tempRequest) then
        redirect_um("/archives" & tempRequest)
        break
      end if
      ## if not, see if it is in the map
      if tRequest is among the lines of (keys(urlMap)) then
        redirect_um(urlMap[tRequest])
        break
      else
        error_page tRequest,logEntry
        break
      end if
      ## break ## shouldn't be needed
    ## check to see if it is an HT article index page
    ## (it's not an html file)
    case (char 1 to 5 of tRequest contains "2")
      put tRequest into tempRequest
      replace ".html" with ".shtml" in tempRequest
      ## put archives in front, then test for index.shtml
      if there is a file ("../archives/" & tempRequest & "index.shtml") then
        redirect_um("/archives" & tempRequest)
        break
      end if
      ## put archives in front, then test for index.shtml
      ## in the format (/1995/1)
      if there is a file ("../archives/" & tempRequest & "/index.shtml") then
        redirect_um("/archives" & tempRequest & "/")
        break
      end if
      ## check to see if it is another kind of file, like a pdf,
      ## in the archives folder
      if there is a file ("../archives/" & tRequest) then
        redirect_um("/archives" & tRequest)
        break
      end if
      ## if not, see if it is in the map
      if tRequest is among the lines of (keys(urlMap)) then
        redirect_um(urlMap[tRequest])
        break
      else
        error_page tRequest,logEntry
        break
      end if
    case (char 1 to 5 of tRequest contains "19") -- means it's a directory
      put tRequest into tempRequest
      replace ".html" with ".shtml" in tempRequest
      ## put archives in front, then test for index.shtml
      if there is a file ("../archives/" & tempRequest & "index.shtml") then
        redirect_um("/archives" & tempRequest)
        break
      end if
      ## put archives in front, then test for index.shtml
      ## in the format (/1995/1)
      if there is a file ("../archives/" & tempRequest & "/index.shtml") then
        redirect_um("/archives" & tempRequest & "/")
        break
      end if
      ## if not, see if it is in the map
      if tRequest is among the lines of (keys(urlMap)) then
        redirect_um(urlMap[tRequest])
        break
      else
        error_page tRequest,logEntry
        break
      end if
    default
      ## if not, see if it is in the map
      if tRequest is among the lines of (keys(urlMap)) then
        redirect_um(urlMap[tRequest])
        break
      else
        error_page tRequest,logEntry
        break
      end if
  end switch
end startup
on error_page theError,logEntry
  ## The request is not on our map...
  ## We send the user a page that helps him out a bit
  ## with his predicament... we need to work on this page,
  ## it could be better done... this was a first draft.
  put url "file:../ssi/redirect_404.shtml" into buffer
  replace "***requestedPage***" with theError in buffer
  put "Content-Type: text/html" & cr
  put "Content-Length:" && the length of buffer & cr & cr
  put buffer
  put logEntry after url "file:../webmaster/logs/404s.txt"
end error_page
on redirect_um pageToRedirectTo
  put pageToRedirectTo into newURL
  put "Status: 301 Moved Permanently" & cr
  ## in theory the above updates caches
  ## in search engines, the local browser and such
  put "Location: " & newURL & cr & cr
end redirect_um
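On the original question of grabbing the headers: the first step of the script depends only on the environment variables the web server hands to any CGI, so the same idea carries over to other languages. A small sketch in Python, for comparison only (cgi_headers is a made-up name; REQUEST_URI is an Apache extension and may be missing on other servers):

```python
import os


def cgi_headers(environ=None):
    """Collect the request details a CGI receives as environment variables."""
    env = os.environ if environ is None else environ
    # The same three values the script above writes to its 404 log
    wanted = ("REQUEST_URI", "REMOTE_HOST", "REMOTE_ADDR")
    info = {name: env.get(name, "") for name in wanted}
    # Request headers themselves arrive as HTTP_* variables,
    # for example HTTP_USER_AGENT or HTTP_REFERER
    for name, value in env.items():
        if name.startswith("HTTP_"):
            info[name] = value
    return info
```

Passing a plain dictionary instead of the real environment makes it easy to try the function outside a web server.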
Good Luck!
Sannyasin Sivakatirswami
Himalayan Academy Publications
at Kauai's Hindu Monastery
katir at hindu.org
www.HimalayanAcademy.com,
www.HinduismToday.com
www.Gurudeva.org
www.Hindu.org
On Monday, May 5, 2003, at 10:16 AM, Dave Cole wrote:
> From: Dave Cole <runrev at colegroup.com>
> Date: Mon May 5, 2003 10:16:46 AM Pacific/Honolulu
> To: use-revolution at lists.runrev.com
> Subject: Re: Grabbing HTTP Headers
> Reply-To: use-revolution at lists.runrev.com
>
>> OK, I give up. I've spent a few hours researching this and haven't
>> found an answer.
>>
>
> dan:
>
> are we talking a mac server environment here (ie: webstar and/or
> machttp)?
>
> if so, i have a slew of hypertalk code on grabbing headers from a
> webstar server that i will share either with you directly or with the
> list.
>
> i wrote a bunch of webstar cgis in hc back in 1997 ... it all
> leverages appleEvents. ...
>
> \dmc