ACCC Home Page ACADEMIC COMPUTING and COMMUNICATIONS CENTER
 
Web Searching / Indexing

Custom Web Search Forms

 

I'll presume you already know about HTML forms in general, and you just need to know the specifics of using Catalog features. This section is technical and probably a bit confusing. It is probably best understood in conjunction with the examples later.

The general idea is that you prepare an HTML search form, and point the action back to a common CGI script. That script reads the form input, does the search, and returns the results. I've already written this CGI script; all you have to do is use it.

Actually, the CGI script will happily accept either POST or GET submissions, so you could construct a URL to do a fixed search, rather than use an HTML form, if this is useful to you.
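As a rough illustration of the fixed-search idea, here is a minimal Python sketch that builds such a URL. The script URL and the field names (query:title:and, s_maxhits) come from this section; the helper name fixed_search_url is made up for the example, and it assumes the script decodes percent-encoded field names the way CGI programs normally do.

```python
from urllib.parse import urlencode

# Script URL as given in this section.
SEARCH_URL = "http://www.uic.edu/htbin/search/SearchUIC"


def fixed_search_url(fields):
    """Return a GET URL carrying the same name/value pairs a form would submit.

    `fields` maps input field names to values. The helper itself is
    hypothetical; only the field names come from this document.
    """
    return SEARCH_URL + "?" + urlencode(fields)


# A fixed search for titles containing both words, limited to 20 hits.
url = fixed_search_url({"query:title:and": "Network Modem", "s_maxhits": "20"})
print(url)
```

The colons in the field name are percent-encoded and the space becomes a plus sign, which is the standard encoding for form data.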

If you don't like the default output format, you can prepare a special config file and tell the CGI script (in the <FORM> tag) where to find it. The config file gives you quite a bit of control. It may look complicated, but it doesn't have to be. It is even possible to embed the original HTML form inside the config file, so that you don't need a separate HTML form at all. You'll need to see the examples for this.

Note: If you use both an HTML form and a config file, and happen to have the same variable set in both, the value in the HTML form takes precedence.
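The precedence rule can be pictured as a simple dictionary merge. This is only a sketch of the rule as stated, not the CGI script's actual code; the function name and the sample values are invented for illustration.

```python
def effective_settings(config_values, form_values):
    """Merge settings so that values from the HTML form override the config file."""
    merged = dict(config_values)   # start from the config file
    merged.update(form_values)     # form values win on any shared name
    return merged


# Hypothetical values: both sources set s_maxhits, only the config sets s_sortby.
settings = effective_settings(
    {"s_maxhits": "25", "s_sortby": "DATE"},
    {"s_maxhits": "10"},
)
print(settings)
```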

 
   
 
     
<Form> tag
  The first point is that everyone can use the same CGI script to process the input from a Catalog search form. So near the top of your form, you'll need the line:

<FORM Method="post" Action="http://www.uic.edu/htbin/search/SearchUIC">

If you do happen to use a config file (don't worry about the details yet), this tag must be modified. You will need to append the path part of the config file's URL to the above Action attribute. For example, suppose I have a config file named myconfig.sgml located in ~bobg/public_html. The URL of this config file would be http://www.uic.edu/~bobg/myconfig.sgml, and therefore I'd prepare my <FORM> tag like this:

<FORM Method="post" Action="http://www.uic.edu/htbin/search/SearchUIC/~bobg/myconfig.sgml">

(If you do have a config file, and the config file resides on icarus, you should be sure to use www2.uic.edu instead of www.uic.edu in the Action attribute.)
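If you generate forms from a program, the Action value can be derived from the config file's URL. The sketch below does this in Python; the function name form_action is hypothetical, and it does not handle the www2.uic.edu case for config files on icarus.

```python
from urllib.parse import urlparse

# Script URL as given in this section.
SEARCH_URL = "http://www.uic.edu/htbin/search/SearchUIC"


def form_action(config_url):
    """Append the path part of the config file's URL to the script URL."""
    path = urlparse(config_url).path   # e.g. "/~bobg/myconfig.sgml"
    return SEARCH_URL + path


action = form_action("http://www.uic.edu/~bobg/myconfig.sgml")
print(action)
```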
 
     
Input Fields
  The second point is that a form's information is contained in the input fields (or select elements or radio buttons or checkboxes or hidden fields). All that counts is the name of the input field, and the value of the input field. The CGI script creates an internal query string based on the words typed into the HTML form and on the names of the HTML input fields. The input field names must be constructed to tell the CGI script how to process the query.

Don't confuse input field names with search field names. On an HTML form, each entry field (such as a text entry, checkbox, etc) has a name embedded in the HTML tag. For example, in

<input type=text name=searchme size=50>
the name of the input field is searchme. In contrast, when a file is indexed by the Verity engine, different words are put into search fields, with names such as title, partial-text, author, keywords, and so forth. In order for a user to search for the word Shakespeare in the author search field (of all documents cataloged by a given search engine), the search field name must appear in (or be associated with) the input field name on the HTML form.

Thus to make a form, you must know what names of the input fields are allowed, and how those input fields will affect the query. Some of the input field names are fixed, but others depend on the names of the search fields from the set of documents indexed. Some input fields will affect what results are returned; other fields will affect how the results are presented.

The easiest query to construct is simply a text input field with the name "query". This works, but the user must enter the entire query, including any Boolean expressions, parentheses, names of search fields, and so forth. This is great for a sophisticated user who knows both the underlying Verity syntax and the search fields used in indexing the document collection. For more typical users, however, the HTML form designer (i.e. you) can make life easier by constructing more complex input fields that hide these complications.

 
     
Query Fields
  Each input field can be associated with two search attributes, a search field and a search operator. The search field, as described above, is one of the fields indexed by the Verity engine, such as author or title. If no search field is given, the default is to search all available search fields together.

There is also a third attribute, moperator, a search operator used in conjunction with a multiple-select HTML widget. If you don't use multiple-select boxes, don't worry about it.

The search operator is a Boolean operator or something fancier that tells the CGI script how to process the search words. For example, the operator AND tells the CGI script to AND together all the input words from that particular input field, whereas the operator PHRASE tells the CGI script to search for those exact words in the order given. By default, the search operator is OR unless otherwise specified.
Available Search Operators
Operator   Input Value   Meaning
AND        text          text is AND'ed together and applied only to search_field
OR         text          text is OR'ed together and applied only to search_field
PHRASE     text          text is quoted as a phrase and applied only to search_field
WORD       text          words are matched exactly in search_field, without stemmed variations
GT         date          find values of search_field representing dates after (Greater Than) the specified date
LT         date          find values of search_field representing dates before (Less Than) the specified date
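To make the first three operators concrete, here is a small Python sketch that turns one input field into a query clause. It is a guess at the behavior modeled on the query strings shown in the Quick Examples, not the CGI script's real code. The function name clause is invented, and WORD, GT and LT are left out because this page does not show the query syntax they produce.

```python
def clause(search_field, operator, text):
    """Build one query clause from a search field, an operator and the typed text.

    Covers only AND, OR and PHRASE. A blank operator defaults to OR,
    as described above.
    """
    op = (operator or "or").upper()
    if op == "PHRASE":
        # The whole input is quoted and matched as a phrase.
        return f'({search_field} <CONTAINS> "{text}")'
    if op in ("AND", "OR"):
        # Each word becomes its own test, joined by the operator.
        terms = [f"{search_field} <CONTAINS> {word}" for word in text.split()]
        return "(" + f" {op} ".join(terms) + ")"
    raise ValueError(f"operator not covered by this sketch: {operator}")


print(clause("title", "and", "Network Modem"))
print(clause("author", "", "Shakespeare Milton"))
print(clause("author", "phrase", "Ed Purcell"))
```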

 
     
Query Field Format
  Now that you know how to choose a search field and search operator, you need to know how to construct an HTML input field that refers to this search field and operator. There are three ways to do this.
  1. Single Tag. Construct a single HTML input tag, whose name starts with "query" and contains the search field and operator, separated by colons:
    <input name=query:search_field:operator:moperator>
    
    Note that search_field in the above input field name should be replaced with the name of a search field, or left blank so that the input applies to all search fields. Likewise, operator should be replaced with an operator from the above table, or left blank to indicate OR.
  2. Double Tag. Construct two HTML tags, one to receive the search words to look for, and the other to receive the search field and operator. This is useful if you want to give the user an option as to what search field to use, yet use one HTML field for the search words. The format is:
    <input name=queryname:myname>
    <input name=querytype:myname value=search_field:operator:moperator>
    
    You may choose any reasonable unique string for myname; its only purpose is to tie the two HTML fields together in search function.
  3. Multi-Tag. Construct four HTML tags: one to receive the search words to look for, one to receive the search field, one to receive the search operator, and one to receive the moperator. This is useful if both the search field and search operator must be selected separately from select boxes. The format is:
    <input name=queryname:myname>
    <input name=queryfield:myname value=search_field>
    <input name=queryop:myname value=operator>
    <input name=querymop:myname value=moperator>
    

If multiple input fields are used, the final query is the ANDed value of all query input values.
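The three naming schemes above can be read back into a common form. The following Python sketch groups submitted name/value pairs into one record per query input. It only illustrates the naming conventions described here and is not the CGI script itself; the function name parse_fields and the record layout are assumptions.

```python
def parse_fields(pairs):
    """Group submitted (name, value) pairs into one spec per query input.

    Single-tag inputs are keyed by their full name; double-tag and
    multi-tag inputs are keyed by the shared `myname` part.
    """
    specs = {}

    def spec(key):
        return specs.setdefault(
            key, {"words": [], "field": "", "op": "or", "mop": ""}
        )

    def split3(text):
        # Pad so that missing trailing parts come back as empty strings.
        return (text.split(":") + ["", "", ""])[:3]

    for name, value in pairs:
        prefix, _, rest = name.partition(":")
        if prefix == "query":
            # Single tag: query:search_field:operator:moperator
            field, op, mop = split3(rest)
            s = spec(name)
            s["words"].append(value)
            s["field"], s["op"], s["mop"] = field, op or "or", mop
        elif prefix == "queryname":
            spec(rest)["words"].append(value)
        elif prefix == "querytype":
            # Double tag: value is search_field:operator:moperator
            field, op, mop = split3(value)
            s = spec(rest)
            s["field"], s["op"], s["mop"] = field, op or "or", mop
        elif prefix == "queryfield":
            spec(rest)["field"] = value
        elif prefix == "queryop":
            spec(rest)["op"] = value or "or"
        elif prefix == "querymop":
            spec(rest)["mop"] = value
    return specs


double = parse_fields([("queryname:bob", "Milton"), ("querytype:bob", "author:or")])
print(double)
```

A multiple-select box simply submits the same queryname more than once, so its selections accumulate in the words list.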

Quick Examples

There are more complete examples in a later section, but here are some ways to construct queries.

  1. Suppose the following two tags were used:
    <INPUT Type="text" Name="query:title:and" >
    <INPUT Type="text" Name="query:author:or" >
    
    and suppose the end user put the words Network Modem in the first text entry box, and the words Shakespeare Milton in the second text entry box. This would translate into a search for
    (title <CONTAINS> Network AND title <CONTAINS> Modem ) AND (author <CONTAINS> Shakespeare OR author <CONTAINS> Milton )
    Presumably this would retrieve all the HTML files written by either Shakespeare or Milton (or both) that had both the words Network and Modem somewhere in the title.
  2. Suppose you want to let the user search in either the author or title search fields.
    <INPUT Type=text Name="queryname:bob" Size=50>
    <SELECT Name="querytype:bob">
    <Option Value="author:or">Author
    <Option Value="title:or">Title
    </SELECT>
    
    Note that for each queryname, there must be exactly one querytype value. In the above example, you should not allow the user to select multiple values, or the results will be unpredictable.
  3. Here's an example using moperator with a multiple-select box.
    <SELECT MULTIPLE SIZE=2 Name="queryname:bob" >
    <Option> Ed Purcell
    <Option> Felix Bloch
    </SELECT>
    <INPUT Type=hidden Name="queryfield:bob" value="author" >
    <INPUT Type=hidden Name="querymop:bob" value="or" >
    <INPUT Type=hidden Name="queryop:bob" value="phrase" >
    
    Note that a multiple-select box means you can make multiple selections from the same input widget. This is different from making single selections from multiple widgets. In the above case, if both items are selected from the select box, the query would be:
    (author<CONTAINS>"Ed Purcell") OR (author<CONTAINS>"Felix Bloch")
    The OR comes from the moperator, and applies to the link between the two selections. The quotes come from the operator (that is, the "phrase" in the queryop field), and apply to each selection. In this case, the query would find those articles written by either Ed Purcell or Felix Bloch.
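The way operator and moperator combine for a multiple-select box can be sketched in a few lines of Python. This mimics the query format shown in the third example, with the author field taken from the hidden queryfield input. The function name is made up, and only the phrase operator is handled.

```python
def multi_select_query(search_field, moperator, selections):
    """Join one phrase clause per selected option with the moperator.

    The phrase operator quotes each selection; the moperator links
    the selections to each other.
    """
    joiner = f" {moperator.upper()} "
    return joiner.join(
        f'({search_field}<CONTAINS>"{choice}")' for choice in selections
    )


print(multi_select_query("author", "or", ["Ed Purcell", "Felix Bloch"]))
```

With a single selection there is nothing to join, so the moperator has no effect.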
 
     
Other Input Fields
  Some options affect the search or display of results. These values can be set in the HTML form, or in the underlying SGML config file, if used.
Query Options
Input Field Name   Possible Values   Meaning
s_maxhits          integer           Maximum number of returned hits (This can be less than 50, but not greater.)
s_starthits        integer           Where to start listing hits. Default 1 indicates beginning.
s_sortby           ALPHA             Sort alphabetically by title
                   DATE              Sort by date last modified
                   SCORE             Sort by relevance score
s_resultsby        ALL               Show all fields
                   fields            Comma-delimited list of search fields to show (If there are one or two fields, and if the first one is title, then output is one line/hit. Otherwise each field is output in one or more lines.) Example: title,description
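One plausible use of s_maxhits and s_starthits together is paging through results with fixed-search URLs. The Python sketch below assumes that s_starthits is a 1-based hit number (consistent with the default of 1) and that s_maxhits limits each request; both are readings of the table above rather than confirmed behavior. The function name page_url and the default page size are invented.

```python
from urllib.parse import urlencode

# Script URL as given earlier in this section.
SEARCH_URL = "http://www.uic.edu/htbin/search/SearchUIC"
MAX_HITS_LIMIT = 50   # upper bound on s_maxhits stated in the table


def page_url(query, page, per_page=20, sort_by="SCORE"):
    """Return a GET URL for one page of results (pages are numbered from 1)."""
    per_page = min(per_page, MAX_HITS_LIMIT)
    params = {
        "query": query,
        "s_maxhits": per_page,
        "s_starthits": (page - 1) * per_page + 1,   # assumed 1-based
        "s_sortby": sort_by,
    }
    return SEARCH_URL + "?" + urlencode(params)


print(page_url("modem", 2))
```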
 
 



2002-6-29  wwwtech@uic.edu