ACCC Home Page Academic Computing and Communications Center  
 
The ADN Connection, July/August/September 1997 The A3C Connection

Using SearchUIC

 
The Campus Beat
WWW Everyone

Putting information out on the World Wide Web is great, but what if you want to ask questions to get specific information back from your visitors? That's what HTML forms are for, and they aren't much harder to produce than simple HTML. But you'll need a program to receive the submitted forms and email them on to you. The programs that do this are called CGI scripts (Common Gateway Interface); they're simply a way to get a Web server to run a program rather than just fetching a file.
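To make the idea of "running a program rather than fetching a file" concrete, here is a minimal sketch of what a CGI-style program does. It is not the FormMail code; the field name "name" and the reply text are made up for illustration. It assumes the Web server passes the submitted form data in the QUERY_STRING environment variable, as the CGI convention does for GET requests.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response from the form data in the query string."""
    # parse_qs turns "name=Ada&dept=Physics" into {"name": ["Ada"], ...}.
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; fall back to a generic greeting.
    name = fields.get("name", ["visitor"])[0]
    # A CGI program writes a header block, a blank line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + f"Thanks for writing, {name}.\n"


if __name__ == "__main__":
    # When run by a Web server, the response goes to standard output.
    sys.stdout.write(handle_request(os.environ))
```

The server runs the program once per submitted form and returns whatever the program prints to the visitor's browser.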

Do you want to ask people specific questions and to receive their answers by email, or do you want to let people search your myriad pages for specific words? You are now in luck! FormMail and SearchUIC will let you do exactly that, with features you might not have known you needed. And you won't even need to write any programs; the FormMail and SearchUIC CGI scripts are publicly available on both tigger and icarus; just direct the output of your HTML form to these scripts, and let the Web server do the rest.

FormMail and SearchUIC are two of a growing list of advanced Web tools available at UIC. The list and how-to instructions for each tool, including FormMail and SearchUIC, can be found on the Web (of course!) at: http://www.uic.edu%url[SERVICES-PAGE}%#SERV-WEB

 
     
 
     
SearchUIC Version 1.0
  You may have noticed the new search link on the UIC home page. (See Figure 3.) All the Web pages that you can get to from the UIC Web page (and further down the "tree", for those pages located on tigger or icarus) are now indexed. You can search for specific words in the pages' titles, their URLs, or even their text.

Figure 3: Searching the UIC Web Pages

This figure shows the results of a simple search for the keyword UICycle (Facilities Management's recycling program). To get there from the UIC home page at http://www.uic.edu/, select the search link, type uicycle in the box labeled "Search words", then click the Submit button. The advanced search form (at the bottom of the screen) allows you to do more specific searches, such as to match all the words or to match them as a phrase, to limit the search to document titles, or to limit the search to specific URLs.

Would you like to provide a similar service for your departmental pages? Would you like to use a customized HTML form that lets visitors search through your own set of pages, and even put your own logo on the input and output screens? Can do.

The job of gathering up the files and indexing them is done for you, courtesy of the Netscape Catalog Server. Twice a week, a Web robot starts at the UIC home page, and traverses the Web on tigger and icarus (less often for other machines), gathering and indexing all the pages it finds. This becomes the database searched by the existing search form.

You can use the same database. But when you build your HTML search form, use hidden tags to restrict the search to URLs in your own Web space. After that, your search form can be simple ("Type in a few search words") or complex (choices of Boolean expressions, choices of searching in the title, keywords, text, or URL, choices of result format and result ordering, and so on.) The choice is yours.
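As a rough illustration of the hidden-tag idea, the sketch below generates a very simple search form. Everything specific in it is an assumption made for the example: the field names "restrict_url" and "words", the action URL, and the departmental URL prefix are all hypothetical, and the real names that SearchUIC expects are given in its how-to instructions.

```python
from html import escape


def build_search_form(action_url, url_prefix,
                      restrict_field="restrict_url", words_field="words"):
    """Return a simple HTML search form with a hidden URL restriction.

    All field names here are hypothetical placeholders, not SearchUIC's.
    """
    action = escape(action_url, quote=True)
    prefix = escape(url_prefix, quote=True)
    return "\n".join([
        f'<form method="get" action="{action}">',
        # Hidden field: sent with every search, never shown to the visitor.
        f'  <input type="hidden" name="{restrict_field}" value="{prefix}">',
        # Visible field: the only thing the visitor has to fill in.
        f'  Search words: <input type="text" name="{words_field}">',
        '  <input type="submit" value="Submit">',
        '</form>',
    ])


if __name__ == "__main__":
    # Both URLs below are made-up examples.
    print(build_search_form("/cgi-bin/search-placeholder",
                            "http://www.uic.edu/depts/example/"))
```

The visitor sees only the text box and the button; the restriction travels along with the form without being displayed.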

SearchUIC is similar in spirit to FormMail. (Reusing some of the same code helps!) You construct your own HTML search form, and direct the results to the SearchUIC CGI script. In addition, you can create a configuration file that will control the functioning of SearchUIC. The configuration file can control the output format, and even produce different output screens if no hits are returned or if there is an error. As with FormMail, you can use a minimal configuration file to get started, and add features as you desire. Simple to start with and flexible to continue with.


 
     
Notice to Web Authors: How to make your visitors' searches work.
  Your pages are being indexed.

You probably knew the major commercial engines were scanning your pages, and we are now doing it ourselves at UIC. Even if you don't care about customizing HTML search forms, you probably do care that your pages are found through appropriate searches, and you probably care that they are presented well in the resulting hit list. This is not a random process; the structure of your HTML pages has a great influence.

What can you do to help?

Use valid HTML. Many browsers are tolerant of invalid HTML, but if the indexer can't understand the file, it can't index it well. The indexer needs to know more about your content in order to categorize it than the browser needs to know to render it.

Make sure your HTML includes a <title> element within the <head> element. And be very sure the title is well chosen to be as descriptive as possible. The title is not very important to the Web browser (usually it displays at the very top of the browser window, not in the main section). But it is probably the most important indicator to a search engine, which uses it both to find pages and to display them in the list of hits. A poor or missing title might mean your page will be missed.

Use <h1>, <h2>, and the other <hn> tags correctly for headings. Use <b>, <font>, and so on for non-headings. Search engines normally assume that words in headings are more important indicators than words in general text. And the only way they have to identify headings is by looking at the <hn> tags.

Use <meta> tags. This is useful only to fine-tune more complex features. You can add tags like this to the <head> element:
<meta name="author" content="Werner Heisenberg">
<meta name="keywords" content="Uncertainty Principle">
Then later you can search for all documents whose "author" is "Heisenberg". You can't do this kind of search if you don't tag your documents this way.
Note: don't just make up field names such as "author" or "keywords". Use ones that are already designated. (For more information, see "Searchable Fields" in: http://www.uic.edu/depts/accc/webpub/search/ )
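To see why these points matter, it can help to look at a page the way an indexer might. The sketch below is not the Netscape Catalog Server's method; it is a small stand-in, built on Python's standard html.parser module, that collects only the title, the text inside <hn> heading tags, and the named <meta> fields. The class and function names are invented for the example.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class PageSummary(HTMLParser):
    """Collect the title, heading text, and <meta> fields of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []
        self.meta = {}
        self._current = None  # the title or heading tag being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "meta" and name:
            # Store each named field, e.g. meta["author"].
            self.meta[name.lower()] = attrs.get("content") or ""
        elif tag == "title" or tag in HEADING_TAGS:
            self._current = tag
            if tag != "title":
                self.headings.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current is not None:
            self.headings[-1] += data


def summarize(html_text):
    """Parse one page and return what this toy indexer can see."""
    summary = PageSummary()
    summary.feed(html_text)
    summary.close()
    return summary
```

Text marked up with <b> or <font> never reaches the headings list, and a page without a <title> element leaves the title empty, which is the situation the advice above is meant to avoid.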

Once you've changed your documents, wait until the search robot re-indexes them, and try out the searches. Make sure your documents work as well in a search as they do on screen.


 

What if you don't want your pages to be indexed?

That can be arranged. The SearchUIC Web page (at the URL: http://www.uic.edu/depts/accc/webpub/search/ ) explains how. Select "Robots -- Keep Off!" in the table of contents. The method described works both for SearchUIC and other "polite" search engines.
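The SearchUIC page is the authority on the method to use here. As background only, one convention that "polite" robots commonly honor is a robots.txt file listing the paths they should stay out of, and the sketch below assumes that convention. It uses Python's standard urllib.robotparser module to show how such a robot would decide whether it may fetch a page. The robot name, host, and paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: every robot is asked to stay out of /private/.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, agent, url):
    """Return True if a polite robot named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://www.example.edu/private/notes.html",
                "http://www.example.edu/public/index.html"):
        print(url, may_fetch(EXAMPLE_ROBOTS_TXT, "ExampleRobot", url))
```

A robot that ignores the file can still fetch the pages, which is why the method works only for "polite" search engines.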

Comments are welcome; send them to:
Bob Goldstein, bobg@uic.edu

 


1999-9-9  connect@uic.edu