ACADEMIC COMPUTING and COMMUNICATIONS CENTER
 
CGI Programming at UIC

2. Background

   
 
     
Web Server
 

What is a Web server? Just a program that listens for incoming connections. When a connection is made, the browser sends the Web server a message using the Hypertext Transfer Protocol (HTTP). The Web server interprets the message to figure out what was requested, sends a return message, and then closes the connection.
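The exchange described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how any production server is written: the function names, the port number, and the reply text are all made up for the example, and a real server would also handle malformed requests, large requests, and many clients at once.

```python
import html
import socket


def parse_request_line(raw_request: bytes):
    """Split the first line of an HTTP request into (method, path, version)."""
    first_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    method, path, version = first_line.split(" ")
    return method, path, version


def build_response(body: bytes, content_type: str = "text/html") -> bytes:
    """Wrap 'a little HTTP' around a body: status line, headers, blank line."""
    headers = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + body


def handle_one_exchange(raw_request: bytes) -> bytes:
    """Interpret one request message and produce one return message."""
    method, path, _version = parse_request_line(raw_request)
    page = f"<html><body>{method} request for {html.escape(path)}</body></html>"
    return build_response(page.encode("ascii"))


def serve_once(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Listen, answer a single connection, then close it (hypothetical port)."""
    with socket.create_server((host, port)) as listener:
        conn, _addr = listener.accept()
        with conn:
            # One recv() is enough for a small request in this sketch.
            request = conn.recv(65536)
            conn.sendall(handle_one_exchange(request))
```

Calling serve_once() and then visiting http://127.0.0.1:8080/ in a browser should show the one-line page, after which the program exits.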

In particular, HTTP is stateless. This means that each connection and single exchange of messages is independent; when the server answers a connection, it has no intrinsic memory of any previous connection. Every click on every link is a separate connection.

Normally, the Web server receives a message to fetch an HTML file. Simple enough: the server reads the appropriate file from disk, wraps a little HTTP around it, and sends it back to the browser. The process is the same if a GIF or JPEG or other kind of file is requested; the only difference is that the Web server uses an HTTP header (Content-Type) to specify what kind of file is being returned. Technically, the browser does not know what kind of file it is requesting, but it does know what kind of file is returned because the HTTP response tells it.

Files are classified by MIME type. For example, text/html is the MIME type for ordinary HTML, and image/gif is used for a GIF image. You only have to know a little about this, because your CGI script will have to specify the MIME type of what it returns.
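As a small illustration of how a file name maps to a MIME type, here is a sketch using Python's standard mimetypes module. The helper name mime_type_for and the fallback value are choices made for this example; a real Web server uses its own configuration tables.

```python
import mimetypes


def mime_type_for(filename: str) -> str:
    """Guess a MIME type from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Unknown extensions fall back to a generic binary type (an assumption
    # made for this sketch; servers differ in their default).
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    for name in ("index.html", "logo.gif", "photo.jpeg"):
        print(name, "->", mime_type_for(name))
```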

Suppose a browser requests a certain URL, and the Web server returns an HTML file. The browser cannot know whether that file existed on disk or was generated on the fly by a program. This is all a CGI script is: instead of fetching a file, the Web server executes a program and returns the output of that program as if it were a file.

How does the Web server know when to fetch a file and when to run a program? Something in the URL tells it, and the rule can differ from server to server. Sometimes it is the file extension, sometimes the file's permissions, sometimes the fact that the file lives in a special directory. No matter: a properly configured Web server knows the difference.
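To make the idea concrete, here is a sketch of the kind of rule a server might apply. The directory name and the extensions below are hypothetical; every server has its own configuration, and this is not the rule used by any particular one.

```python
from pathlib import PurePosixPath

# Hypothetical configuration, chosen only for this example.
CGI_DIRECTORY = "/cgi-bin/"
CGI_EXTENSIONS = {".cgi", ".pl"}


def is_cgi_request(url_path: str) -> bool:
    """Decide whether a URL path should run a program instead of fetching a file."""
    # Ignore any query string ("?name=value") when looking at the path.
    path_only = url_path.split("?", 1)[0]
    if path_only.startswith(CGI_DIRECTORY):
        return True  # anything in the special directory is a script
    return PurePosixPath(path_only).suffix in CGI_EXTENSIONS
```

With these made-up rules, /cgi-bin/guestbook and /forms/survey.cgi?id=7 would be executed, while /index.html would be read from disk as usual.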

 
     
CGI Script
 

How does the CGI script know what to do? Where does it get its input, and how does it send output to the browser? Input information is sent either in a complicated URL or via submission of a web form. Either way, it arrives in the HTTP request, and the Web server hands it to the CGI script through the script's environment or, for a form submitted with POST, on the script's standard input. The script simply examines its environment or reads its input. (Well, it's simple in principle. Fortunately, there are libraries that make it simple in practice, too.)
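A rough sketch of the input side, in Python: the environment variable names (REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH) are the standard CGI ones, but the function name read_form_input is made up for this example, and the sketch assumes ordinary URL-encoded form data rather than file uploads.

```python
import os
import sys
from urllib.parse import parse_qs


def read_form_input(environ=os.environ, stdin=sys.stdin):
    """Return the submitted fields as a dict mapping names to lists of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # For POST, the form data arrives on standard input; the server
        # says how much to read in CONTENT_LENGTH.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # For GET, the data is the part of the URL after the "?".
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)
```

Because the environment and the input stream are passed in as parameters, the function can also be tried from the command line with a plain dictionary, without going through the Web server.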

Output is simpler. The CGI script just writes to STDOUT and the Web server makes sure this is sent back to the browser. Some caveats, though. The CGI script must indicate the MIME type of the output in a fixed format: a Content-type header line followed by a blank line, before any other output. Worse, CGI scripts often don't handle errors well. In normal programming, if you screw up, the compiler probably gives you some informative messages about your error and where it occurred. With CGI, you have to be very careful, or all you get is "Premature end of script headers" and absolutely no clue as to what the real problem is.
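One way to avoid the bare "Premature end of script headers" message is to make sure the header is always written, even when the script fails. The following is a minimal sketch of that idea, not a standard recipe: the names emit and run_cgi are invented for the example, and showing a traceback in the browser is only reasonable while debugging, since it can reveal details of the script to strangers.

```python
import html
import sys
import traceback


def emit(body, content_type="text/html", status=None, out=sys.stdout):
    """Write the CGI header block, a blank line, and then the body."""
    if status is not None:
        out.write(f"Status: {status}\n")
    out.write(f"Content-Type: {content_type}\n\n")
    out.write(body)


def run_cgi(main, out=sys.stdout):
    """Run main() and always send headers, even if main() raises."""
    try:
        body = main()
    except Exception:
        # Debugging aid only: report the error in the page instead of
        # dying before any header has been printed.
        report = "<pre>" + html.escape(traceback.format_exc()) + "</pre>"
        emit(report, status="500 Internal Server Error", out=out)
        return
    emit(body, out=out)


if __name__ == "__main__":
    run_cgi(lambda: "<html><body><p>Hello from CGI</p></body></html>")
```

The page is built completely before anything is printed, so a failure halfway through cannot leave the browser with half a header.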

 
     
More Info
 

There's lots of info on the web. I'd normally start with Yahoo and look for the special sections on HTTP, HTML, or CGI. You may not need to understand too much about HTTP, but you will definitely want to understand HTML forms. HTTP may be useful for advanced features, such as cookies.

I mentioned that CGI scripts get information from their environment. I use a page from NCSA to remind me exactly which environment variables to use.

When starting a CGI script in Perl, there is a standard Perl module, CGI.pm, that is particularly useful in fetching the input sent by the browser. Actually, it has many other features, some of which seem like overkill to me. But one useful feature is that it lets you debug your script from the command line, rather than forcing you to run everything through the Web server. Check out the man page on tigger or icarus.

There are lots of books on CGI programming, and other books on Perl. You can, of course, download publicly available scripts and examine them for ideas. I normally recommend this practice for HTML, but be careful with CGI. Firstly, of course, you can only download the CGI source if the author has made it possible; running the script gets you the output, not the source. But the main problem is that it is hard for a beginner to tell a good, useful script from a bad, insecure script. This has mostly to do with straight programming, and somewhat to do with the ins and outs of CGI.

 
 



2006-9-29  wwwtech@uic.edu