CGI Environmental Variables
One of the ways the web server passes information to a CGI script
is through environmental variables. The server creates them and assigns appropriate values
within the environment that it spawns for the CGI script. They can be accessed
like any other environmental variable, for example with getenv() (in C) or
$ENV{'VARIABLE_NAME'} (in Perl). Many of them contain important information that
most CGI programs need to take into account.
This list highlights some of the most
commonly used ones, along with a brief description and notes on possible uses for them.
It is by no means a complete reference; many servers pass their own extra
variables, or use different names for some of them, so check your server's
documentation. The purpose of this list is only to suggest some good common uses
for some of the server-passed information.
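A quick way to see what your own server passes is to have a script print its whole environment. The following is only a minimal sketch; the helper name env_report is made up for this example.

```perl
use strict;
use warnings;

# Return one "NAME=value" line per environment variable, sorted by name.
# env_report is a hypothetical helper name, not part of any library.
sub env_report {
    my (%env) = @_;
    return join '', map { "$_=$env{$_}\n" } sort keys %env;
}

# A CGI response is a header block, a blank line, then the body.
print "Content-type: text/plain\n\n";
print env_report(%ENV);
```

Run through the web server, this lists every variable the script received, which makes it easy to compare against your server's documentation.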
CONTENT_LENGTH
The length, in bytes, of the input stream that is being passed through standard input.
This is needed when a script is processing input with the POST method, in order to read
the correct number of bytes from the standard input. Some servers end the input string
with EOF, but this is not guaranteed behaviour, so to be sure that you read the
correct input length you can do something like:
read(STDIN, $input, $ENV{CONTENT_LENGTH});
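A single read() call is allowed to return fewer bytes than it was asked for, so a slightly more careful version loops until CONTENT_LENGTH bytes have arrived. This is a sketch under that assumption; read_body is a hypothetical helper name.

```perl
use strict;
use warnings;

# Read exactly $length bytes from $fh. read() may return less than was
# requested, so keep reading until everything has arrived or input ends.
sub read_body {
    my ($fh, $length) = @_;
    my $body = '';
    while (length($body) < $length) {
        # The fourth argument appends at the current end of $body.
        my $got = read($fh, $body, $length - length($body), length($body));
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # input ended before CONTENT_LENGTH bytes
    }
    return $body;
}

my $length = $ENV{CONTENT_LENGTH} // 0;
$length = 0 unless $length =~ /^\d+$/;    # ignore a malformed value
binmode STDIN;
my $input = read_body(\*STDIN, $length);
```

If the input ends early, the sketch simply returns what it has; a real script may prefer to treat that as an error.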
DOCUMENT_ROOT
The directory against which the server resolves all www document paths.
Sometimes it is useful to know the server's document root, in order to compose
absolute file paths when all the script is given as a parameter is the
relative path of the file within the www directory. It is also good practice
to have your script resolve paths in this way, both for security reasons and
for portability. Another common use is to work out what the URL
of a file will be when you only know its absolute path and the hostname (the
hostname is available in SERVER_NAME, described below).
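Both uses can be sketched in a few lines. The helper names, the refusal of ".." components and the plain http:// scheme are assumptions made for this example, not something the server does for you.

```perl
use strict;
use warnings;

# Join a site-relative path onto the document root, refusing any path
# that tries to climb out of the root with ".." components.
sub resolve_doc_path {
    my ($root, $relative) = @_;
    return if grep { $_ eq '..' } split m{/}, $relative;
    $root     =~ s{/+$}{};    # drop trailing slashes
    $relative =~ s{^/+}{};    # drop leading slashes
    return "$root/$relative";
}

# The reverse mapping: turn an absolute path under the document root
# into a URL on the given host. Assumes plain HTTP on the default port.
sub path_to_url {
    my ($root, $host, $absolute) = @_;
    $root =~ s{/+$}{};
    return unless index($absolute, "$root/") == 0;
    return "http://$host" . substr($absolute, length $root);
}

# Typical call in a CGI script:
#   my $file = resolve_doc_path($ENV{DOCUMENT_ROOT}, 'docs/a.html');
```

Both functions return nothing (undef in scalar context) when the path falls outside the document root, so the caller can refuse the request.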
HTTP_REFERER
The URL of the page that referred the web client (via a link or redirection) to
the script. Typed URLs and bookmarks usually result in this variable being
left blank.
In many cases a script may need to behave differently depending on the
referer. For example, you may want to restrict your counter script to
operate only if it is called from one of your own pages, to prevent
someone from using it from another web page without your permission.
Or the referer may even be the actual data that the script needs
to process. Extending the example above, you might install
your counter on many pages and have the script figure out from the referer
which page generated the call and increment the appropriate count, keeping
a separate count for each individual URL. A snippet for the referer blocking
example could be:
# Anchored at the start, with the dots in $mydomain taken literally.
die unless ($ENV{HTTP_REFERER} || '') =~ m{^http://(www\.)?\Q$mydomain\E/};
HTTP_USER_AGENT
The name/version of the client issuing the request to the script.
As with referers, one might need to implement behaviours that vary
with the client software used to call the script. A redirection script
could make use of this information to point the client to a page optimized
for a specific browser, or you may want to have it block requests from
specific clients, like robots or clients that are known not to support
features that the script's normal output relies on.
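A redirection script of this kind might look roughly like the sketch below. The patterns and page names are placeholders invented for the example, and since the user agent string is supplied by the client it can be faked, so treat it as a hint rather than a guarantee.

```perl
use strict;
use warnings;

# Pick a page for the client based on its user agent string.
# The patterns and page paths here are illustrative placeholders.
sub page_for_agent {
    my ($agent) = @_;
    $agent = '' unless defined $agent;
    return '/no-robots.html' if $agent =~ /bot|crawler|spider/i;
    return '/text-only.html' if $agent =~ /^Lynx/;
    return '/index.html';
}

# Send the client on to the chosen page.
my $page = page_for_agent($ENV{HTTP_USER_AGENT});
print "Location: $page\n\n";
```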
PATH_INFO
The extra path information following the script's path in the URL.
A URL that refers to a script may contain additional information,
commonly called 'extra path information'. This is appended to
the URL and marked by a leading slash. The server puts this information
in the PATH_INFO variable, which can be used as a method
to pass arguments to the script.
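For instance, a request for a hypothetical URL ending in /cgi-bin/archive.pl/2008/03/report would give the script a PATH_INFO of /2008/03/report. A minimal sketch of turning that into separate arguments (path_segments is a made-up helper name):

```perl
use strict;
use warnings;

# Split extra path information such as "/2008/03/report" into its parts,
# skipping the empty strings produced by leading or doubled slashes.
sub path_segments {
    my ($path_info) = @_;
    return () unless defined $path_info;
    return grep { length } split m{/}, $path_info;
}

# Each segment becomes one positional argument; any that are missing
# are simply left undefined.
my ($year, $month, $name) = path_segments($ENV{PATH_INFO});
```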
PATH_TRANSLATED
The PATH_INFO mapped onto DOCUMENT_ROOT.
Usually PATH_INFO is used to pass a path argument to the script. For example,
a counter might be passed the path to the file where counts should be stored.
The server also maps the PATH_INFO variable onto the document
root path and stores the result in PATH_TRANSLATED, which can be used directly as an
absolute file path.
QUERY_STRING
Contains query information passed via the calling URL, following a question
mark after the script location.
QUERY_STRING is the equivalent of content passed through STDIN in POST, but
for scripts called with the GET method. Query arguments are written in this
variable in their URL-encoded form, just as they appear in the calling
URL. You can process this string to extract useful parameters for the script.
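Processing the string by hand could look like the sketch below. The helper names are made up for this example, and it keeps only the last value when a name is repeated; the CGI.pm library mentioned under Suggested Reading does the same job through its param method and handles more cases.

```perl
use strict;
use warnings;

# Decode one URL-encoded component: "+" means space, %XX is a hex byte.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split "a=1&b=two+words" into a hash of decoded names and values.
# If a name is repeated, the last value wins in this sketch.
sub parse_query {
    my ($query) = @_;
    my %param;
    for my $pair (split /[&;]/, $query // '') {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $param{ url_decode($name) } = url_decode($value // '');
    }
    return %param;
}

my %param = parse_query($ENV{QUERY_STRING});
```

Note that the string is split on the separators first and decoded afterwards, so an encoded & or = inside a value is not mistaken for a separator.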
REMOTE_ADDR
The IP address from which the client is issuing the request.
This can be useful either for logging accesses to the script (for example,
a voting script might want to log voters in a file by their IP in order
to prevent them from voting more than once) or to block or behave differently
for particular IP addresses. (This might be a requirement in a script that
has to be restricted to your local network, and that may perform different
tasks for each known host.)
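Both ideas are sketched below with hypothetical helper names. The vote record is kept in an in-memory hash only to keep the example short; since each CGI request normally runs as a new process, a real script would store it in a file, as described above.

```perl
use strict;
use warnings;

# True if the address starts with one of the allowed prefixes.
# End each prefix with a dot, so that "192.168.1." does not also
# match 192.168.10.x.
sub is_local_address {
    my ($addr, @prefixes) = @_;
    return 0 unless defined $addr;
    return (grep { index($addr, $_) == 0 } @prefixes) ? 1 : 0;
}

# Record a vote unless this address has voted already.
# Returns 1 if the vote was counted, 0 if it was a repeat.
sub record_vote {
    my ($seen, $addr) = @_;
    return 0 if $seen->{$addr}++;
    return 1;
}

# Typical calls in a CGI script:
#   is_local_address($ENV{REMOTE_ADDR}, '192.168.1.')
#   record_vote(\%seen, $ENV{REMOTE_ADDR})
```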
REMOTE_HOST
The name of the host from which the client issues the request.
Just like REMOTE_ADDR above, except that this is the hostname of the remote
machine (if it is known via reverse lookup).
REQUEST_METHOD
The method used for the request. (usually GET, POST or HEAD)
It is wise to have your script check this variable before doing anything. You
can determine where the input will be (STDIN for POST, QUERY_STRING for GET)
or choose to permit operation only under one of the two methods. Also, it is
a good idea to exit with an explanatory error message if the script is called
from the command-line accidentally, in which case the variable is not defined.
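The checks described above might be combined as in the following sketch. raw_input is a hypothetical helper name; it takes the environment and the input handle as arguments so that it can be tried out without a web server.

```perl
use strict;
use warnings;

# Return the raw, still URL-encoded input for the current request,
# looking in the right place for each request method.
sub raw_input {
    my ($env, $stdin) = @_;
    my $method = $env->{REQUEST_METHOD};
    die "This script must be run by a web server, not from the command line.\n"
        unless defined $method;

    if ($method eq 'GET' or $method eq 'HEAD') {
        return $env->{QUERY_STRING} // '';
    }
    if ($method eq 'POST') {
        my $input = '';
        read($stdin, $input, $env->{CONTENT_LENGTH} // 0);
        return $input;
    }
    die "Unsupported request method: $method\n";
}

# Typical call in a CGI script:
#   my $input = raw_input(\%ENV, \*STDIN);
```

To permit only one method, replace the branch you do not want with a die of its own.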
SCRIPT_NAME
The virtual path from which the script is executed.
This is very useful if your script will output HTML code that contains
calls to itself. Having the script determine its virtual path (and hence,
along with SERVER_NAME and SERVER_PORT, its full URL) is much more portable than hard
coding it in a configuration variable. Also, if you like to keep a log
of all script accesses in some file, and want to have each script report
its name along with the calling parameters or time, it is very portable to
use SCRIPT_NAME to print the path of the script.
SERVER_NAME
The web server's hostname or IP address.
Much like SCRIPT_NAME, this value can be used to create more portable
scripts in case they need to assemble URLs on the local machine. In scripts
that are made publicly accessible on a system with many virtual hosts, this
can provide the ability to have different behaviours depending on the virtual
server that's calling the script.
SERVER_PORT
The web server's listening port.
Complements SERVER_NAME above in forming URLs to the local system.
This is commonly overlooked, but your script will be more portable
if you keep in mind that not all servers run on the default port, and
that those servers need an explicit port in the server address part of the URL.
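Putting SERVER_NAME, SERVER_PORT and SCRIPT_NAME together, a script can assemble its own URL roughly as follows. This is a sketch that assumes plain HTTP, so it does not handle servers running over HTTPS; self_url is a made-up helper name.

```perl
use strict;
use warnings;

# Build the script's own URL from the server-supplied variables,
# leaving the port out when it is the default for plain HTTP (80).
sub self_url {
    my ($env) = @_;
    my $host = $env->{SERVER_NAME} // 'localhost';
    my $port = $env->{SERVER_PORT} // 80;
    my $url  = "http://$host";
    $url .= ":$port" if $port != 80;
    return $url . ($env->{SCRIPT_NAME} // '');
}

# Typical call in a CGI script:
#   my $action = self_url(\%ENV);
```

The result can then be used, for example, as the action of a form that the script prints and that posts back to the same script.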
Online Documentation/Tutorials
- Your web server's documentation should provide a complete reference of all environmental variables passed to
CGI scripts.
Comments
Don Park | Posted at 2:17pm on Thursday, March 27th, 2008 | HTTP_REFERRER has four Rs |
Richard | Posted at 10:48pm on Monday, April 14th, 2008 | Not on my Apache server it doesn't. |
Aaron | Posted at 10:25am on Wednesday, June 25th, 2008 | Not on my IIS server either. |
Anon | Posted at 12:30pm on Friday, July 11th, 2008 | I read somewhere about how referrer was misspelled in the initial creation of the internet and since then programmers have adapted code to accept both spellings. |
Anon | Posted at 10:43am on Monday, August 18th, 2008 | Just remember it's double Rs in JavaScript (document.referrer), and it can't be uppercase. :) |
Anonymous | Posted at 11:54am on Thursday, October 9th, 2008 | Javascript is client-side, so this whole document has no bearing |
Anonymous | Posted at 2:10pm on Sunday, November 16th, 2008 | when spelling nazi attack.... |
Elvis | Posted at 11:29pm on Friday, March 20th, 2009 | Lucky to find you, keep on the good workk guys! Best of luck.i |
Comments to date: 8.
Suggested Reading
The Official CGI.pm Programming Guide is the definitive manual and guidebook for writing CGI programs
with Perl and the CGI library. While the manual distributed with the library as part of Perl's documentation is well
written and covers almost anything you'd need to know about using CGI.pm, this book is a useful companion for anyone
making CGI scripts with Perl.
CGI Programming is an introductory book on CGI programming, though perhaps not the best book I've read.
It covers most topics about the CGI protocol and how to write server-side programs to work with it.
Most if not all of the information in this book (as with most books that discuss CGI programming)
can be found in tutorials and references on the web, but if you feel like buying a book anyway, you may want to
consider this one.
Webmaster in a Nutshell is a catch-all reference book for webmasters and programmers. It does
not have anything that you can't find online, but if you're like me you might want to have all the
stuff you refer to frequently nicely laid out in a well-organized book lying on your desk. If you're looking
for something like that then you'll be happy with this book.
The Perl Cookbook is full of quick solutions to everyday programming problems in Perl, with explanations
and tips that are easy to understand even for beginners, but also frequently useful to more experienced programmers.
The code is clear and straightforward, and the topics covered are well thought out and correspond to real-world examples,
so you can frequently copy code snippets from the book and fit them into your program. It is a nice complement
to the Camel Book on your bookshelf.