Grabbing HTTP Headers

Sannyasin Sivakatirswami katir at hindu.org
Mon May 5 18:15:01 EDT 2003


Dan:

Well, if you are using Apache, it's easy: you can just read the globals,
which are populated from the request when it comes down the pipe.

Maybe the following little script will help. I have put the entire script
here in case it is useful to others doing CGI stuff. We recently converted
our entire HinduismToday.com site to .shtml and restructured directories,
and this was created to deal with the fallout.

Delete the stuff pertaining to our particular situation. It is also a
simple solution for anyone who wants to handle 404s on their site.
Basically this is a redirect solution: the incoming 404 is checked for
the requested page. If the redirect is known, it can be found in the
404s-map.txt file. If the page definitely does not exist, or for some
reason we don't want to map it to another page, then we simply return a
sweet 404 page to the user with suggestions etc.
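
The map lookup described above can be pulled out into a small function. This is only a sketch: lookupRedirect and its parameter names are made up for illustration, and it assumes 404s-map.txt holds one "oldPath=newPath" pair per line, which is the layout the script's own split by cr and "=" implies.

```livecode
-- Hypothetical helper, not part of the original script.
-- pMapText is assumed to hold one oldPath=newPath pair per line.
function lookupRedirect pRequest, pMapText
   -- turn the text into an array keyed by the old path
   split pMapText by cr and "="
   if pRequest is among the lines of keys(pMapText) then
      return pMapText[pRequest]
   end if
   -- empty means "no redirect known"
   return empty
end lookupRedirect
```

A call such as lookupRedirect(tRequest, url "file:../ssi/404s-map.txt") would then give either the new location or empty, and empty can be treated as the signal to send the 404 page.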

We have other Rev tools that FTP the log files from the remote server,
strip out all dups and present a clean list of this week's 404s, which
are then used to build more cases in the switch below or update the
URL map.
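
The "strip out all dups" step can be sketched with an array, since each key can only exist once. This is not our actual tool, just an illustration: uniqueLines is a hypothetical name, and it assumes the requested paths have already been pulled out of the log, one per line (the log parsing itself depends on your server's log format and is not shown).

```livecode
-- Hypothetical helper, not the tool mentioned above.
-- pText is assumed to hold one requested path per line.
function uniqueLines pText
   repeat for each line tLine in pText
      -- skip blank lines
      if tLine is empty then next repeat
      -- using the path as an array key collapses duplicates
      put true into tSeen[tLine]
   end repeat
   put keys(tSeen) into tResult
   sort lines of tResult
   return tResult
end uniqueLines
```

The sorted result is the kind of clean list you can scan by eye when deciding which paths deserve a new line in the map.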


#!/export/vhost/com/h/hinduismtoday/www/public_html/cgi-bin/mc

on startup
## this CGI is triggered by the .htaccess file's invocation of
## Apache's "ErrorDocument" directive, which passes the
## HTTP headers and other data from the GET request
## to the script as environment variables. These show up as
## "$" globals, and "the globals" lists their names.
## we simply take this data, restructure it a bit and turn it into
## a little array.

repeat for each item i in the globals
put i & "=" & value(i) & cr after tGlobes
end repeat
split tGlobes by cr and "="
## now  you have all the header info in a nice little array

## Now we read in our redirect matrix/map into an array.

put url "file:../ssi/404s-map.txt" into urlMap
split urlMap by cr and "="

## now extract from the globals array the page
## the person is trying to reach and put it
## into a variable.

put tGlobes["$REQUEST_URI"]  into tRequest

##### logging feature
## here we extract certain parts of the http header
## and pass them to our own 404 log... I did this just
## to be able to "watch" the 404s a bit more, and this passes
## a bit more data than what we get in the error logs.
## but later we would eliminate this log... it's redundant and
## makes another maintenance task: to clean up that log file.

put "*" & the date & cr after logEntry
put "$REQUEST_URI"  & tab & tGlobes["$REQUEST_URI"] & cr after logEntry
put "$REMOTE_HOST"  & tab & tGlobes["$REMOTE_HOST"] & cr after logEntry
put "$REMOTE_ADDR"  & tab & tGlobes["$REMOTE_ADDR"] & cr after logEntry
put "$0"  & tab & tGlobes["$0"] & cr after logEntry
put  cr & cr after logEntry

     switch
     case tRequest contains ".html"
       ##replace .html with .shtml
       put tRequest into tempRequest
       replace ".html" with ".shtml" in tempRequest

       ##check for existence then redirect if yes
       if there is a file "../" & tempRequest then
           redirect_um(tempRequest)
           break
       end if


       ##test to see if it is an ht article (add ../archives)
       if there is a file "../archives/" & tempRequest then
           redirect_um("/archives" & tempRequest)
           break
       end if

       ##if not see if it is in the map
       if tRequest is among the lines of (keys(urlMap)) then
         redirect_um(urlMap[tRequest])
         break
       else
         error_page tRequest,logEntry
         break
       end if
     ##break ## shouldn't be needed

     ##check to see if it is an HT article index page (it's not an html file)
     case (char 1 to 5 of tRequest contains "2")
       put tRequest into tempRequest
       replace ".html" with ".shtml" in tempRequest

       ##put archives in front then test for index.shtml
       if there is a file "../archives/" & tempRequest & "index.shtml" then
          redirect_um("/archives" & tempRequest)
          break
       end if

       ##put archives in front then test for index.shtml in format (/1995/1)
       if there is a file "../archives/" & tempRequest & "/index.shtml" then
          redirect_um("/archives" & tempRequest & "/")
          break
       end if

       ##check to see if it is another kind of file like pdf in the archives folder
       if there is a file "../archives/" & tRequest then
          redirect_um("/archives" & tRequest)
          break
       end if


       ##if not see if it is in the map
       if tRequest is among the lines of (keys(urlMap)) then
         redirect_um(urlMap[tRequest])
         break
       else
         error_page tRequest,logEntry
         break
       end if


     case (char 1 to 5 of tRequest contains "19")--means it's a directory
       put tRequest into tempRequest
       replace ".html" with ".shtml" in tempRequest

       ##put archives in front then test for index.shtml
       if there is a file "../archives/" & tempRequest & "index.shtml" then
          redirect_um("/archives" & tempRequest)
          break
       end if

       ##put archives in front then test for index.shtml in format (/1995/1)
       if there is a file "../archives/" & tempRequest & "/index.shtml" then
          redirect_um("/archives" & tempRequest & "/")
          break
       end if

       ##if not see if it is in the map
       if tRequest is among the lines of (keys(urlMap)) then
         redirect_um(urlMap[tRequest])
         break
       else
         error_page tRequest,logEntry
         break
       end if

     default
       ##if not see if it is in the map
       if tRequest is among the lines of (keys(urlMap)) then
         redirect_um(urlMap[tRequest])
         break
       else
         error_page tRequest,logEntry
         break
       end if

     end switch


end startup



on error_page theError,logEntry
## the request is not on our map...
## we send the user a page that helps him out a bit
## with his predicament... we need to work on this page
## it could be better done.. this was a first draft.
    put url "file:../ssi/redirect_404.shtml" into buffer
    replace "***requestedPage***" with theError in buffer

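    ## explicit Status header (added) so the 404 is propagated to the client
    put "Status: 404 Not Found" & cr
    put "Content-Type: text/html" & cr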
    put "Content-Length:" && the length of buffer & cr & cr
    put buffer
    put logEntry after url "file:../webmaster/logs/404s.txt"
end error_page


on redirect_um pageToRedirectTo
   put pageToRedirectTo into newURL
   put "Status: 301 Moved Permanently" & cr
   ## in theory the above updates caches
   ## in search engines, the local browser and such

   put "Location: "  & newURL & cr & cr
end redirect_um


Good Luck!

Sannyasin Sivakatirswami
Himalayan Academy Publications
at Kauai's Hindu Monastery
katir at hindu.org

www.HimalayanAcademy.com,
www.HinduismToday.com
www.Gurudeva.org
www.Hindu.org



On Monday, May 5, 2003, at 10:16  AM, Dave Cole wrote:

> From: Dave Cole <runrev at colegroup.com>
> Date: Mon May 5, 2003  10:16:46  AM Pacific/Honolulu
> To: use-revolution at lists.runrev.com
> Subject: Re: Grabbing HTTP Headers
> Reply-To: use-revolution at lists.runrev.com
>
>> OK, I give up. I've spent a few hours researching this and haven't 
>> found an answer.
>>
>
> dan:
>
> are we talking a mac server environment here (ie: webstar and/or 
> machttp)?
>
> if so, i have a slew of hypertalk code on grabbing headers from a 
> webstar server that i will share either with you directly or with the 
> list.
>
> i wrote a bunch of webstar cgis in hc back in 1997 ... it all 
> leverages appleEvents. ...
>
> \dmc



