| ACADEMIC COMPUTING and COMMUNICATIONS CENTER | |||||||||
| |||||||||||||||||
Introduction | |||||||||||||||||
|
Send comments or bug reports to wwwtech@uic.edu. |
|||||||||||||||||
| Introduction | |||||||||||||||||
|
This document will help you build a custom index and query form for your Web documents. At the very least, you will learn how to adjust your documents so that the existing index and query forms find them when they should.
There are now two options: Google and the UIC Search Engine. Most of this document is about the older UIC Search Engine. Consider using a custom Google Search if it can meet your requirements. If not, keep reading.
Note: I assume you are already familiar with HTML. If you intend to build your own query page, you must also be familiar with HTML forms. You don't need to be a programmer, but attention to detail is quite helpful.
The ADN is running the Netscape Catalog Web indexing system at UIC. Netscape Catalog is a complicated system with many moving parts, but it allows a great deal of flexibility in customizing which documents are indexed, how they are indexed, how one performs a query, and how the results are presented. We have tried to configure this system to give individual users as much flexibility as possible, consistent with general Web security practices. At the same time, we've tried to make this system usable by those who don't want to understand every last detail. But because ease of use and flexibility pull in opposite directions, some compromise had to be reached. You will need to know some details, although I hope they aren't too hard to learn.
If you want to rely on the default indexing and searching, you still might want to know a little about Netscape Catalog, because the nature of your HTML documents can affect how they are indexed and, therefore, how easily they are found by the right people. You should read the Bare Minimum section, just to make sure your documents turn up in the right searches.
This document is a bit long. If you want to get started quickly, go to the example section, copy one of the example search forms to your own Web directory, and make changes to it. After you understand how Netscape Catalog works and how to formulate queries, you'll be better able to customize a search for others.
Even though this document is long, it is also incomplete. The material here is oriented toward constructing search forms using Netscape Catalog and the relevant UIC modifications. A real discussion of how the Catalog Server works is beyond the scope of this document, as is a full discussion of the Verity search engine. Also, some features related to use at UIC are still under construction. |
|||||||||||||||||
| Create Your Own Search Form -- Bird's Eye View | |||||||||||||||||
|
If you want to create your own search form, either to create your own look-and-feel or to easily search through a subset of all the pages at UIC, here are the general steps.
The documentation is written mostly as a reference, and can be a bit complicated. You might want to start with the Examples and Tutorial. |
|||||||||||||||||
| Make Your Files Findable -- Bare Minimum | |||||||||||||||||
|
Even if you don't create a custom query page, there are a few things to keep in mind about your HTML documents.
|
|||||||||||||||||
| General Scheme | |||||||||||||||||
|
Netscape Catalog reads and indexes a set of Web documents. Most of them are HTML, but it handles other formats, too. This reading and indexing takes place automatically: it is done centrally, and hardly anyone needs to worry about the details. All of the files are put into one large index, but appropriately worded queries can be constructed that search only a selected part of this data.
Roughly speaking, a file is indexed or refreshed once per week, and its index entry is deleted after about 3 months if it is not refreshed. So once you create or change a file, it may take as long as a week for the file to show up in the index. Once you delete a file, its index entry will live on for about another 3 months.
Netscape Catalog holds the index in a special web server called a "Catalog Server", whose function is to build the index and to respond to queries. The search engine used by the Catalog Server is from Verity.
One queries the Catalog Server through a web form. The data entered on the web form is transmitted to a CGI script. The CGI script then connects to the Catalog Server, performs the search, reformats the output, and sends the list of hits back to the end user.
The owner of a set of web pages may want to customize this search procedure. There are several opportunities for user customization.
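The indexing schedule described above can be turned into a rough date calculation. The following Python sketch is only an illustration and not part of Netscape Catalog or the UIC setup: the function names are made up, and the 7-day and 90-day figures are assumptions standing in for "once per week" and "about 3 months".

```python
from datetime import date, timedelta

# Assumed approximations of the schedule described in the text.
REFRESH_INTERVAL = timedelta(days=7)   # "once per week"
ENTRY_LIFETIME = timedelta(days=90)    # "about 3 months"


def latest_index_date(changed_on: date) -> date:
    """Approximate latest date by which a new or changed file is indexed."""
    return changed_on + REFRESH_INTERVAL


def latest_expiry_date(last_refreshed_on: date) -> date:
    """Approximate date by which the entry of a deleted file leaves the index.

    The entry lives on until it has gone unrefreshed for ENTRY_LIFETIME.
    """
    return last_refreshed_on + ENTRY_LIFETIME


if __name__ == "__main__":
    changed = date(2008, 2, 12)
    print("Searchable by about:", latest_index_date(changed))
    print("If deleted after that refresh, gone by about:",
          latest_expiry_date(changed))
```

Treat the results as estimates only. The text gives the schedule as "roughly speaking", so the real dates may differ by several days.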
|
|||||||||||||||||
| 2008-2-12 wwwtech@uic.edu |
|