INFO & DEC SCI 480                                       SPRING, 1994
CLUSTER ANALYSIS                                               SCLOVE

     OBTAINING "NORMIX", WOLFE'S MIXTURE-MODEL CLUSTERING PROGRAM

Notes on Chapter 3 (A Review of Clustering Methods), cont'd

Wolfe (1971) gives a procedure for testing

     H0: number of clusters is r   vs   H1: number of clusters is r' > r.

Let

     likelihood ratio = max L(r) / max L(r'),

where max L(k) is the likelihood for k clusters, maximized over the
unknown parameters (i.e., with max likelihood estimates put in for
unknown parameter values).

In many statistical problems, minus 2 times the (natural) log of the
likelihood ratio has a distribution which is approximated by a
chi-square distribution, the d.f. being equal to the difference in the
numbers of parameters. In the mixture case, the distribution of this
statistic is instead approximated by a constant times a chi-square
distribution with an appropriate number of d.f.; the constants are
given in Wolfe (1971).

Obtaining the NORMIX Program
----------------------------

The NORMIX program has been converted to run on 386/486 MSDOS systems.
It uses extended memory as needed to run large problems. The
computational procedures are identical to the 1978 mainframe version,
but a number of improvements have been made to the I/O to make it more
user-friendly. In addition, it runs about 25 times faster than it did
on old mainframes.

A shareware version, NORMIX20.ZIP, may be downloaded by anonymous ftp
from SIMTEL20 mirror sites, such as:

     SITE                 DIRECTORY
     OAK.Oakland.Edu      /pub/msdos/statstic
     archive.orst.edu     /pub/mirrors/simtel20/msdos/statstic

DIRECTIONS: FTP to one of the above sites and change to the
appropriate directory using the cd command. Put the FTP program into
BINARY or IMAGE mode. Then get NORMIX20.zip.

Copy the NORMIX20.zip file to your C: drive root directory. Unzip it
with PKUNZIP -D NORMIX20.ZIP, or UNZIP -D NORMIX20.ZIP. Then follow
the directions in the README and documentation files.
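
Illustration (not part of NORMIX): the likelihood ratio test described
in the chapter notes above can be sketched in a few lines of Python.
This is only a sketch under stated assumptions. The function names,
the log-likelihood values, the scaling constant, and the d.f. are all
made up for illustration; the actual constant and d.f. must be taken
from Wolfe (1971) and are not reproduced here. The tail-probability
helper uses the closed form that holds only for an even number of d.f.

```python
import math

def lr_statistic(loglik_r, loglik_r_prime, scale=1.0):
    """Return (-2 ln(lambda), scaled statistic).

    lambda = max L(r) / max L(r'), so
    -2 ln(lambda) = -2 * (loglik_r - loglik_r_prime).
    If -2 ln(lambda) is approximately `scale` times a chi-square
    variable, then dividing by `scale` gives a quantity to compare
    with the chi-square distribution. `scale` is an assumption
    supplied by the caller (see Wolfe, 1971, for the real value).
    """
    minus2_log_lambda = -2.0 * (loglik_r - loglik_r_prime)
    return minus2_log_lambda, minus2_log_lambda / scale

def chi2_upper_tail_even_df(x, df):
    """P(chi-square with `df` d.f. > x), valid for even df only.

    Uses exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here needs positive even df")
    half = x / 2.0
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)

# Hypothetical maximized log-likelihoods for r and r' clusters.
stat, scaled = lr_statistic(-250.0, -240.0, scale=1.25)
print(stat, scaled)  # prints: 20.0 16.0

# Hypothetical d.f. (4), chosen only to show the call.
p_value = chi2_upper_tail_even_df(scaled, df=4)
print(p_value)
```

A small p-value would favor H1 (r' clusters) over H0 (r clusters).
For an odd number of d.f., a library routine such as
scipy.stats.chi2.sf would be needed in place of the closed form.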
---------------------------------------------------------------------

References
----------

Wolfe, J. H. (1970). Pattern clustering by multivariate mixture
     analysis. Multivariate Behavioral Research 5, 329-350.

Wolfe, J. H. (1971). A Monte Carlo study of the sampling distribution
     of the likelihood ratio for mixtures of multinormal
     distributions. Naval Personnel and Training Research Laboratory
     Technical Bulletin STB 72-2. San Diego, California.

SLS:ss/NORMIX.480s94                                        30-Mar-94