Perlfect Solutions

Pre-processing Url-Encoded CGI Input

So, you're writing your first CGI, and you're stuck right at the very beginning... Don't worry, that's very common. As soon as you set out to write your first script that will process data from a form you get to some of the most commonly asked newbie questions. This article will help you sort them out once and for all. As you'll soon see, it's all quite simple.

Where do I get my input from?

So, you wrote your simple form, say something like this:

    <FORM METHOD="GET" ACTION="/cgi-perlfect/test.pl">
    Your name: <INPUT TYPE="text" NAME="name">
    Your e-mail: <INPUT TYPE="text" NAME="email">
    <INPUT TYPE="submit">
    </FORM>

Now, you expect your CGI (test.pl) to get the values for name and email somehow... but how? There are two answers to that question, because there are two methods to submit a form, POST and GET. For our purposes, the difference that matters between the two methods lies in the way that the input is given to the script. Here's what happens in each case:

  1. GET: The input of a GET request is stored in a special environment variable (if you don't know what that is, don't worry) called QUERY_STRING. To access an environment variable from your script, use the %ENV hash that perl nicely provides. All you need to say is:

         $my_input = $ENV{QUERY_STRING};

     and the form's data will be put into $my_input.
  2. POST: The POST method does not put your input in a variable. Instead, the web server, upon executing your script, passes the CGI input to the script on STDIN (the standard input). So all you have to do is read from the STDIN filehandle to get your input. The only tricky thing is that an EOF (end-of-file) is not guaranteed at the end of the string. So how do you figure out how much you need to read? Not coincidentally, the server puts the length of the input string into an environment variable, CONTENT_LENGTH. So you have to read CONTENT_LENGTH bytes from STDIN. The line that does this is:

         read(STDIN, $my_input, $ENV{CONTENT_LENGTH});

     Isn't perl lovely? A whole paragraph of jargon can be written in one simple line!
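
If you want one script to cope with both methods, the two cases above can be combined roughly like this. This is only a sketch: the helper name read_cgi_input is made up for illustration, and it assumes the server sets the standard CGI variable REQUEST_METHOD (which the text above does not mention) to tell you which method was used.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl): returns the raw, still-encoded
# form data, whichever method the form used.
sub read_cgi_input {
    # Assumption: the server sets REQUEST_METHOD; fall back to GET if not.
    my $method = $ENV{REQUEST_METHOD} || 'GET';
    my $input  = '';

    if ($method eq 'POST') {
        # POST: read exactly CONTENT_LENGTH bytes from standard input.
        my $length = $ENV{CONTENT_LENGTH} || 0;
        read(STDIN, $input, $length) if $length > 0;
    }
    else {
        # GET: the data is already waiting in QUERY_STRING.
        $input = defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '';
    }
    return $input;
}
```

Calling my $my_input = read_cgi_input(); then gives you the same string that either of the two lines above would.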

Now try the following one-liner, which prints the input string it got from the form back to the browser:

    print "Content-type: text/plain\n\n$ENV{QUERY_STRING}";

Hell, but this comes all scrambled! How do I decode it?

For various reasons that we will not discuss at the moment, CGI input comes in a somewhat messy form that has the fancy name Url-Encoded. The general form is the following:

    field1=value1&field2=value2&...&fieldn=valuen

where field1...fieldn are the names (as in the name attribute of the input tag in the HTML form) of the input fields and value1...valuen are the corresponding values (what the user typed in or selected). In addition, the resulting string is encoded by replacing all spaces with pluses (+) and replacing certain other characters (like / and ~ and :) with a % followed by the two-digit hexadecimal ASCII code representing them. Now this looks messy, but don't be very scared (just a little), because with perl it only takes a couple of lines to decode. So, let's see how it's done.
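
To make the format concrete before decoding it, here is a rough sketch that goes the other way and builds such a string by hand. The helper name url_encode and the sample values are made up for illustration, it assumes plain ASCII input, and real browsers may leave a slightly different set of characters unencoded.

```perl
use strict;
use warnings;

# Hypothetical helper: encodes one field name or value.
sub url_encode {
    my ($text) = @_;
    # Everything except letters, digits, a few safe characters and the
    # space becomes % followed by its two-digit hexadecimal code.
    $text =~ s/([^A-Za-z0-9_.\- ])/sprintf("%%%02X", ord($1))/eg;
    # Spaces become pluses.
    $text =~ tr/ /+/;
    return $text;
}

my %form = (name => 'Nick Z', email => 'nick@perlfect.com');

# Join field=value pairs with & (keys sorted only to keep the output stable).
my $encoded = join '&',
    map { url_encode($_) . '=' . url_encode($form{$_}) } sort keys %form;

print "$encoded\n";   # prints: email=nick%40perlfect.com&name=Nick+Z
```

The decoding steps that follow simply undo this: split on &, split on =, turn pluses back into spaces, and turn each %XX back into a character.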

First of all you'll have to separate each field-value pair from the rest. We use the split function to do that:

    @fv_pairs = split /\&/, $my_input;

The line above simply tells perl to split up the input string on the & symbol (which is the separator between two pairs in url-encoded form) and put the list of resulting field-value pairs in the array @fv_pairs.

Then, we will have to take each of those pairs, separate the fieldname from the value, decode the value, and store it in a hash (associative array) so that each fieldname is the key to its value. Here's how we do it:

    foreach $pair (@fv_pairs) {
        if ($pair =~ m/([^=]+)=(.*)/) {
            $field = $1;
            $value = $2;
            $value =~ s/\+/ /g;                                  # pluses back to spaces
            $value =~ s/%([\dA-Fa-f]{2})/pack("C", hex($1))/eg;  # %XX back to characters
            $INPUT{$field} = $value;
        }
    }

Now, what this does is pretty straightforward. We try each pair in turn, and if it is of the form field=value we store the fieldname and value in two separate variables. Then we replace all pluses in the value with spaces and decode the hexadecimal sequences. (Well, if you don't really follow the regular expressions, don't worry... you don't need to understand how they work, so long as you know how to use them. You will learn more with time.) Finally, we put the value we get in an associative array with the relevant fieldname as a key.
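
If you prefer to keep the whole decoding step in one place, it could be wrapped up roughly as follows. Treat this as a sketch under a few assumptions rather than the one true way: the helper names url_decode and parse_form_data are made up, field names are decoded as well as values, and a pair with no = sign is treated as a field with an empty value.

```perl
use strict;
use warnings;

# Hypothetical helper: decodes one url-encoded name or value.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;                               # pluses back to spaces
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # %XX back to characters
    return $text;
}

# Hypothetical helper: turns "a=1&b=2" into a reference to a hash.
sub parse_form_data {
    my ($raw) = @_;
    my %input;
    foreach my $pair (split /&/, $raw) {
        # Split on the first = only, so values may contain = themselves.
        my ($field, $value) = split /=/, $pair, 2;
        next unless defined $field && length $field;   # skip empty pairs
        $value = '' unless defined $value;             # assumption: no "=" means empty
        $input{ url_decode($field) } = url_decode($value);
    }
    return \%input;
}

my $form = parse_form_data('name=Nick+Z&email=nick%40perlfect.com&home+page=%7Enick');
print "$_ => $form->{$_}\n" for sort keys %$form;
# prints:
# email => nick@perlfect.com
# home page => ~nick
# name => Nick Z
```

Note that, just like the loop above, this keeps only the last value when a field name appears more than once in the input.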

So, now, all we need to do is use the associative array %INPUT to access our form's data. For example, the following lines:

    print "Content-type: text/plain\n\n";
    print "Your name is $INPUT{name} and your email is $INPUT{email}";

will print to the browser something like:

    Your name is Nick and your email is nick@perlfect.com

for the example form we saw earlier.

That's basically all you need to know to get started. Trying things out and experimenting is your best bet at learning, so go on and try for yourself...

Happy CGI writing!

Suggested Reading

Official CGI.pm Programming Guide: The Official CGI.pm Programming Guide is the definitive manual and guidebook for writing CGI programs with perl and the CGI library. While the manual distributed with the library as part of perl's documentation is well written and covers almost anything you'd need to know about using CGI.pm, this book is a useful companion for anyone making CGI scripts with perl.
CGI Programming: CGI Programming is an introductory book on CGI programming, perhaps not the best book I've read. It covers most topics about the CGI protocol and how to write server-side programs to work with it. Most if not all of the information in this book (as with most books that discuss CGI programming) can be found in tutorials and references on the web, but if you feel like buying a book anyway, you may want to consider this one.
Webmaster In A Nutshell: Webmaster In A Nutshell is a catch-all reference book for webmasters and programmers. It does not have anything that you can't find online, but if you're like me you might want to have all the stuff you refer to frequently nicely laid out in a well-organized book lying on your desk. If you're looking for something like that then you'll be happy with this book.
Perl Cookbook: The Perl Cookbook is full of quick solutions to everyday programming problems in perl, with explanations and tips that are easy to understand even for beginners but frequently useful to more experienced programmers too. The code is clear and straightforward, and the topics covered are well thought out and correspond to real-world examples, so you can often literally copy code snippets from the book and fit them into your program. It is a nice complement to the Camel Book on your bookshelf.