C$PAGES=40 C$STATEMENTS=999999 C$T=19 C CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC C C C C C CLUSPAC: Computer Programs for Mixture-Model Clustering C C C C C C C CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC C C C C C CMS DSN = MIX1DTA CLUSPAC C C C C C C THE PROGRAMS MIX* IN THE SUBLIBRARY CLUSPAC OF THE C C LIBRARY CLUSPAC ARE FOR CLUSTERING DATA BY C C ITERATIVE MAXIMIZATION OF THE MIXTURE-MODEL LIKELIHOOD C C C C N K C C --- -- C C L = | | > P(C)*F(X(I)|C), C C | | -- C C I=1 C=1 C C C C WHERE C C C C N = NUMBER OF OBSERVATIONS ("SAMPLE SIZE"), C C K = NUMBER OF CLUSTERS, C C X(I) = I-TH OBSERVATION, I = 1,2,...,N, C C F(X|C) = VALUE AT X OF THE C-TH CLASS-CONDITIONAL C C DENSITY FUNCTION (C=1,2,...,K) C C AND C C P(C) = PRIOR PROBABILITY OF CLASS C. C C C C C C REFERENCE FOR CLUSTERING BY MIXTURE MODEL: C C C C Wolfe, J. H. (1970). Pattern clustering by multivariate C C mixture analysis. Multivariate Behavioral Research 5, 329-350. C C C C C C IN THE PROGRAM NAME C C C C MIX1DTA C C C C THE "1" MEANS THAT THE PROGRAM IS FOR UNIVARIATE C C (1-DIMENSIONAL) DATA; C C C C THE "DT" MEANS THE VARIANCES ARE ALLOWED TO VARY ACROSS C C CLUSTERS; C C C C AND THE "A" MEANS THAT THERE IS AUTOMATIC SETTING OF NUMBERS C C OF CLUSTERS AND INITIAL MEANS. C C C C C CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC C C C PROGRAMMED BY C C DR. STANLEY L. SCLOVE 312/996-2681 C C DEPARTMENT OF C C INFORMATION & DECISION SCIENCES M/C 294 C C COLLEGE OF BUSINESS ADMINISTRATION C C UNIVERSITY OF ILLINOIS AT CHICAGO C C BOX 4348 C C CHICAGO, IL 60680-4348 C C C C C C VERSION 3.0 20-OCT-92 C C C C COPYRIGHT 1991, 1992 STANLEY LOUIS SCLOVE. C C C CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC C C C RESTRICTIONS (CAN BE MODIFIED): C C N, SAMPLE SIZE, AT MOST 9999; C C K, NUMBER OF CLUSTERS, AT MOST 29; C C ITER, MAXIMUM NUMBER OF ITERATIONS, 20. C C C CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC C C C CONTROL CARDS: C C C C (1) DATASET TITLE C C (2) N, IN FORMAT (2X,I4) C C (3) FMT, IN FORMAT (18A4), E.G., (1X,F4.1). C C ALLOW AT LEAST ONE BLANK IN FMT: IT WILL ALSO BE USED C C FOR OUTPUT, WHERE CC1 IS FOR CARRIAGE CONTROL. C C ALLOW A CC FOR THE DECIMAL POINT ON OUTPUT, C C WHETHER OR NOT THERE IS ONE ON INPUT. C C (4) DATA, IN FORMAT SPECIFIED BY FMT C C C C C CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC C C C C C C DIMENSION X(9999),XMNDSQ(9999),ICLUS(9999),IOTA(9999) DIMENSION XJ(9,9) DIMENSION DSQ(29),SUM(29) DIMENSION TITLE(18) DIMENSION NC(29),XMEAN(29),XMEANOL(29) DIMENSION BPLUS(29),BMINUS(29) C DIMENSION FMT(18) DIMENSION SS(29),SSD(29) DIMENSION SD(29) DIMENSION VAR(29) DIMENSION ICLSOL(9999) DIMENSION F(9999,29) DIMENSION P(29), XNC(29) DIMENSION PP(29,9999) DIMENSION XMXPR(9999) DIMENSION DENOM(9999) DIMENSION AICVEC(29),SCHVEC(29),XKSVEC(29) DOUBLE PRECISION SUM,SS,F,P,PP,SSD,XNC C C CONTROL CARDS: C C (1) DATASET TITLE C (2) N, IN FORMAT (2X,I4) C (3) FMT, IN FORMAT (18A4), E.G., (1X,F4.1). C ALLOW AT LEAST ONE BLANK IN FMT: IT WILL ALSO BE USED C FOR OUTPUT, WHERE CC1 IS FOR CARRIAGE CONTROL. C ALLOW A CC FOR THE DECIMAL POINT ON OUTPUT, WHETHER OR NOR C THERE IS ONE ON INPUT. C (4) DATA, IN FORMAT SPECIFIED BY FMT C C READ(5,15000) TITLE C C WRITE PROGRAM INFORMATION. WRITE(6,20000) WRITE(6,21000) WRITE(6,22000) C WRITE(6,16000) TITLE C C READ SAMPLE SIZE, N. READ(5,10000) N XN = N WRITE(6, 1051) N C C READ DATA FORMAT. C Remove next statement for WATFIV version: READ(5,15000) FMT C C READ DATA AND C COMPUTE STATISTICS OF WHOLE SAMPLE: TOTAL=0.0 SUMSQS=0.0 SSDEVS=0.0 READ(5,FMT) ( X(I), I = 1,N ) C For WATFIV version: C READ(5, * ) ( X(I), I = 1,N ) C Initialize XMAX and XMIN: XMAX = X(1) XMIN = X(1) C DO 100 I = 1,N TOTAL = TOTAL + X(I) SUMSQS = SUMSQS + X(I)*X(I) IF (X(I) .LT. XMIN) XMIN=X(I) IF (X(I) .GT. XMAX) XMAX=X(I) 100 CONTINUE C C IF IT IS DESIRED TO PRINT OUT THE DATA, REMOVE THE "C"S C FROM CC1 IN THE APPROPRIATE FOLLOWING STATEMENTS: C WRITE DATA: C WRITE(6, 1009) C1009 FORMAT(1X,'DATA:'/) C WRITE(6, FMT) (X(I), I=1, N) C C WRITE SUMMARY STATISTICS FOR WHOLE SAMPLE: WRITE(6, 1151) WRITE(6,FMT) XMIN C WRITE(6, * ) XMIN C C WRITE(6, 1152) WRITE(6,FMT) XMAX C WRITE(6, * ) XMAX XBAR = TOTAL/XN SSDEVS = SUMSQS- TOTAL*TOTAL/XN VARHAT1 = SSDEVS/XN WRITE(6, 1112) XBAR WRITE(6, 1113) VARHAT1 PI = 3.141593 TEMP = 2.0*PI*SSDEVS/N XMN2LL = N*(1.0 + ALOG(TEMP)) STDDEV = SQRT(VARHAT1) WRITE(6,66000) XMN2LL WRITE(6,68000) SSDEVS, STDDEV C NO. OF PARAMETERS FOR UNCLUSTERED SAMPLE IS: C 1 MEAN + 1 VARIANCE = 2 PARAMETERS NOPARM = 1 + 1 C AIC = XMN2LL + 2.0*NOPARM WRITE(6,60000) AIC SCH = XMN2LL + ALOG(XN)*NOPARM XKASH = SCH - ALOG(2*VARHAT1**3) WRITE(6,58000) SCH WRITE(6,17000) XKASH C A TABLE OF MODEL SELECTION CRITERIA-VALUES FOR VARIOUS K C WILL BE PRINTED AT THE END. THE NEXT INSTRUCTIONS C SET UP THE FIRST ENTRIES FOR THAT TABLE. AICVEC(1) = AIC SCHVEC(1) = SCH XKSVEC(1) = XKASH C C C C SET CONSTANTS. C XJ(IC,K), K=1 TO 9, IC = 1 TO K C OPTIMAL CLASS PROBABILITIES - NORMAL DISTRIBUTION C JOHARI & SCLOVE (1975). "PARTITIONING A DISTRIBUTION." C COMMUNICATIONS IN STATISTICS. XJ(1,2)=.5 XJ(2,2)=.5 XJ(1,3)=.2703 XJ(2,3)=.4594 XJ(3,3)=.2703 XJ(1,4)=.1631 XJ(2,4)=.3369 XJ(3,4)=.3369 XJ(4,4)=.1631 XJ(1,5)=.1068 XJ(2,5)=.2444 XJ(3,5)=.2976 XJ(4,5)=.2444 XJ(5,5)=.1068 XJ(1,6)=.0739 XJ(2,6)=.1810 XJ(3,6)=.2451 XJ(4,6)=.2451 XJ(5,6)=.1810 XJ(6,6)=.0739 XJ(1,7)=.0536 XJ(2,7)=.1375 XJ(3,7)=.1986 XJ(4,7)=.2106 XJ(5,7)=.1986 XJ(6,7)=.1375 XJ(7,7)=.0536 XJ(1,8)=.0402 XJ(2,8)=.1067 XJ(3,8)=.1613 XJ(4,8)=.1918 XJ(5,8)=.1918 XJ(6,8)=.1613 XJ(7,8)=.1067 XJ(8,8)=.0402 XJ(1,9)=.0310 XJ(2,9)=.0845 XJ(3,9)=.1324 XJ(4,9)=.1643 XJ(5,9)=.1756 XJ(6,9)=.1643 XJ(7,9)=.1324 XJ(8,9)=.0845 XJ(9,9)=.0310 C C PERFORM CLUSTERING FOR K = 2 TO KUP GROUPS. C KUP = 17 C DO 995 K = 2,KUP WGSS = 0.0 WRITE(6,23000) K C C COMPUTE INITIAL MEANS, EQUALLY SPACED THROUGH RANGE OF DATA, C INITIAL VALUE OF VARIANCES C AND INITIAL PRIOR PROBABILITIES (EQUAL): XK = K DO 101 IC=1,K C XC = IC C C TO spread out the means to the min and max, use: C XMEAN(IC) = XMIN + (IC-1)*(XMAX-XMIN)/(K-1) C C To center the means more, instead use: XMEAN(IC) = XMIN + IC*(XMAX-XMIN)/(K+1) C C TO SET INITIAL VALUE OF VARIANCES, USE THE VARIANCE-COMPONENTS C RELATIONSHIP C MS(TOTAL) = MS(BETWEEN) + MS(WITHIN) C WITH AN (ARBITRARY) ASSUMPTION THAT MS(WITHIN) = MS(TOTAL)/K. VAR(IC) = VARHAT1/K C SET INITIAL VALUES OF PRIOR PROBABILITIES EQUAL TO THE C OPTIMAL VALUES FOR PARTITIONING A NORMAL DISTRIBUTION: IF ( KUP .GT. 9 ) GO TO 103 P(IC) = XJ(IC,K) C TO SET INITIAL VALUES OF PRIOR PROBABILITIES EQUAL TO C BINOMIAL(X=IC-1;N=K-1,P=1/2). C FOR IC = 1,2,...,K, C REMOVE THE "C" FROM CC1 OF THE NEXT LINE: C P(IC) = GAMMA(XK)/(GAMMA(XC)*GAMMA(XK-XC+1)*(2**K)) C C FOR EQUAL INITIAL VALUES OF PRIOR PROBABILITIES: C 103 CONTINUE P(IC) = 1.0/K C C 101 CONTINUE C C C WRITE INITIAL VALUES OF PRIOR PROBS: WRITE(6,25000) C WRITE(6, 1025) ( P(IC), IC = 1, K ) C C WRITE INITIAL MEANS: WRITE(6,14000) C WRITE(6,19000) ( XMEAN(IC), IC=1, K ) C WRITE INITIAL VARIANCE: WRITE(6,24000) C WRITE(6, 1017) VAR(1) DO 105 INTGER=1,N IOTA(INTGER) = INTGER 105 CONTINUE C ITER = 1 C 601 CONTINUE C IF (ITER .EQ. 1) GO TO 560 C STORE OLD CLUSTERING: DO 565 I = 1,N ICLSOL(I) = ICLUS(I) 565 CONTINUE C STORE OLD CLUSTERING RESULTS FOR TESTING CONVERGENCE: XMN2LLOL = XMN2LL DO 850 IC = 1,K XMEANOL(IC) = XMEAN(IC) 850 CONTINUE C C COMMENCE DISTANCE COMPUTATIONS. 560 CONTINUE DO 102 I = 1,N DO 102 IC = 1,K DSQ(IC) = ( XMEAN(IC) - X(I) )**2 ZSQ = DSQ(IC)/VAR(IC) IF ( ZSQ .LE. 174.673 ) GO TO 110 F(I,IC) = 0.0 GO TO 102 110 CONTINUE C Note that pi is included in the computations for comparability C even though it is just a constant. F(I,IC) = EXP(-ZSQ/2.0)/SQRT(2*PI*VAR(IC)) 102 CONTINUE C C COMPUTE POSTERIOR PROBABILITIES OF GROUP MEMBERSHIP C [PP(IC,I) = CONDITIONAL PROBABILITY OF CLUSTER IC, GIVEN X(I)]: DO 405 I = 1,N DENOM(I) = 0.0 DO 405 IC=1,K DENOM(I) = DENOM(I) + P(IC)*F(I,IC) 405 CONTINUE DO 406 I = 1,N DO 406 IC=1,K C IF ( DENOM(I) .EQ. 0.0 ) DENOM(I)=0.0001 PP(IC,I)= P(IC)*F(I,IC)/DENOM(I) 406 CONTINUE C C COMPUTE NEW LABELS BY MAX POSTERIOR PROBABILITY: DO 8 I = 1,N XMXPR(I) = PP(1,I) ICLUS(I) = 1 DO 8 IC = 2,K IF ( PP(IC,I) .GT. XMXPR(I) ) GO TO 9 GO TO 8 9 XMXPR(I) = PP(IC,I) ICLUS(I) = IC 8 CONTINUE C C WRITE NEW LABELS: IF (N .GE. 31) GO TO 200 WRITE(6, 11000) WRITE(6,12000) (IOTA(I), I=1, N) WRITE(6,13000) (ICLUS(I), I=1, N) 200 CONTINUE C C UPDATE CLUSTER PRIOR PROBABILITIES P(IC), MEANS XMEAN(IC) AND C VARIANCE VARHAT: WGSS = 0.0 DO 6 IC = 1,K XNC(IC) = 0.0 C XNC(IC) WILL BE THE SUM OVER ALL N OBSERVATIONS OF THEIR C POSTERIOR PROBABILITIES OF MEMBERSHIP IN CLUSTER IC. SUM(IC) = 0.0 SS(IC) = 0.0 DO 67 I = 1,N XNC(IC) = XNC(IC) + PP(IC,I) SUM(IC) = SUM(IC) + PP(IC,I)*X(I) SS(IC) = SS(IC) + PP(IC,I)*X(I)*X(I) 67 CONTINUE C C IF ONE OF THE K CLUSTERS BECOMES EMPTY, THE PROCEDURE C WILL GO ON TO THE NEXT VALUE OF K: IF ( XNC(IC) .EQ. 0.0 ) GO TO 125 XMEAN(IC) = SUM(IC)/XNC(IC) SSD(IC) = SS(IC) - SUM(IC)*SUM(IC)/XNC(IC) C VAR(IC) = SSD(IC)/XNC(IC) C SD(IC) = SQRT(VAR(IC)) C C Recall that XNC(IC) is the sum over all N observations of their C posterior probabilities of membership in cluster IC. P(IC) = XNC(IC)/XN WGSS = WGSS + SSD(IC) 6 CONTINUE C C COUNT NUMBERS IN CLUSTERS: DO 66 IC = 1,K NC(IC) = 0 66 CONTINUE DO 400 I = 1,N IGROUP = ICLUS(I) NC(IGROUP) = NC(IGROUP) + 1 400 CONTINUE C C C VARHAT = WGSS/XN C SUMLNF = 0.0 DO 161 I=1,N SUMPXF = 0.0 DO 159 IC=1,K SUMPXF = SUMPXF + P(IC)*F(I,IC) 159 CONTINUE C F(I,IC) HAD NOT PREVIOUSLY BEEN DIVIDED BY SQRT(2*PI*VARHAT): C Oh, yes it had. C SUMPXF = SUMPXF/SQRT(2.0*PI*VARHAT) SUMLNF = SUMLNF + ALOG(SUMPXF) 161 CONTINUE C XMN2LL = -2.0*SUMLNF C WGMS = WGSS/(N-K) WRITE(6,66000) XMN2LL WRITE(6,64000) WGSS, WGMS STDERR = SQRT(WGMS) WRITE(6,71000) STDERR C KM1 = K-1 DO 500 IC=1,KM1 ICP1 = IC+1 C Compute boundary between c-th and c+1-st classes. B = XMEAN(IC)/VAR(IC) - XMEAN(ICP1)/VAR(ICP1) A = 0.5*(1.0/VAR(ICP1) - 1.0/VAR(IC)) C = DLOG(P(IC)/P(ICP1)) C = ALOG(SD(ICP1)/SD(IC)) + C TEMP=(XMEAN(ICP1))**2/(2.0*VAR(ICP1))-(XMEAN(IC))**2/(2.0*VAR(IC)) C = C + TEMP DISCR = B**2 - 4.0*A*C BPLUS(IC) = (-B + SQRT(DISCR))/(2.0*A) BMINUS(IC) = (-B - SQRT(DISCR))/(2.0*A) 500 CONTINUE WRITE(6, 1036) ITER WRITE(6, 1035) (BMINUS(IC), IC=1, KM1) WRITE(6, 1020) (XMEAN(IC), IC=1, K) IF (ITER .EQ. 1) GO TO 600 DO 555 I = 1,N IF (ICLUS(I) .EQ. ICLSOL(I)) GO TO 555 GO TO 600 555 CONTINUE 600 CONTINUE C Even if no case changes cluster, another iteration will be C performed if the parameter estimates have changed much. DO 2450 IC = 1,K TEST=(XMEAN(IC)-XMEANOL(IC))/XMEANOL(IC) TOL = 0.02 IF (ABS(TEST) .LE. TOL ) GO TO 2450 GO TO 2500 2450 CONTINUE C If all changes are small, write results for this value of K. GO TO 530 C Otherwise, iterate further. 2500 CONTINUE ITER = ITER + 1 C C IF PROCEDURE HAS NOT CONVERGED IN 40 ITERATIONS, IT WILL C GO ON TO THE NEXT VALUE OF K. IF (ITER.GE.41) GO TO 570 GO TO 601 570 WRITE(6, 1160) GO TO 2900 C C IF ONE OF THE K CLUSTERS BECOMES EMPTY, THE PROCEDURE C WILL GO ON TO THE NEXT VALUE OF K. 125 CONTINUE WRITE(6, 1235) GO TO 2900 C C 530 CONTINUE C 2900 CONTINUE WRITE(6, 1040) (SUM(IC), IC=1, K) WRITE(6, 1045) (NC(IC), IC=1, K) WRITE(6,70000) (VAR(IC), IC=1, K) WRITE(6, 1055) (SD(IC), IC=1, K) WRITE(6, 1060) ( P(IC), IC=1, K) C VARHAT IS MLE OF VARIANCE. VARHAT = WGSS/N WRITE(6, 1100) VARHAT C C C C COMPUTE MODEL-SELECTION CRITERIA: C NO. PARAMETERS = K MEANS + K VARIANCES + (K-1) PROBS. NOPARM = K + K + (K-1) C WRITE(6,72000) NOPARM AIC = XMN2LL + 2.0*NOPARM SCH = XMN2LL + ALOG(XN)*NOPARM C XKASH = SCH - ALOG(2*VARHAT**3) C WRITE(6,60000) AIC WRITE(6,58000) SCH WRITE(6,17000) XKASH C C C C AICVEC(K) = AIC SCHVEC(K) = SCH XKSVEC(K) = XKASH C C GO ON TO NEXT VALUE OF K: 995 CONTINUE C C WRITE VALUES OF MSC'S FOR VARIOUS K: WRITE(6,34000) DO 605 K = 1,KUP WRITE(6,33000) K, AICVEC(K), SCHVEC(K), XKSVEC(K) 605 CONTINUE C 10000 FORMAT(2X,I4) 11000 FORMAT(1X,'CLUSTERING') 12000 FORMAT(1X,'POINT: '/, (1X,40I3)) 13000 FORMAT(1X,'CLUSTER: '/, (1X,40I3)) 33000 FORMAT(1X,'K=',I3, 3F15.2) 34000 FORMAT(1X,'MODEL SELECTION CRITERIA'/ X' AIC SCHWARZ KASHYAP '/) 14000 FORMAT(/1X,'INITIAL MEANS') 25000 FORMAT(/1X,'INITIAL VALUES OF PRIOR PROBS') 19000 FORMAT(1X, 9F13.2/) 1025 FORMAT(1X, 9F13.4/) 24000 FORMAT(/1X,'INITIAL VALUE OF VARIANCES') 1017 FORMAT(1X, F15.4/) 1100 FORMAT(1X, 'M.L. ESTIMATE OF COMMON VARIANCE = ',F14.5/) 21000 FORMAT(1X,'CMS DSN = MIX1DTA CLUSPAC') 22000 FORMAT(1X,'Copyright 1991, 1992 Stanley L. Sclove '/) 23000 FORMAT('1',1X,'K = ',I2,' CLUSTERS') C 1020 FORMAT(1X,'MEANS: ',9F13.2) 64000 FORMAT(/1X,' POOLED STATISTICS: '/' WGSS = ',F14.4, X' WGMS = ',F14.4/) 1035 FORMAT(1X,'BOUNDARIES:', 9X, 9F9.1) C 1151 FORMAT(1X,'MINIMUM OF SAMPLE: ') 1152 FORMAT(1X,'MAXIMUM OF SAMPLE: ') 1112 FORMAT(/,1X,'MEAN = ', F14.4) 1113 FORMAT(1X, 'M.L. ESTIMATE OF VARIANCE = ',F14.5/) 66000 FORMAT(/1X,' MINUS 2 LOG LIKELIHOOD = ',F14.4/) 68000 FORMAT(/1X,' SSDEVS = ',F14.4,' STDDEV = ',F14.4/) 20000 FORMAT('1','****************************************', A/,1X,'PROGRAM MIX1DTA CLUSPAC '/ B,1X,'FOR CLUSTERING UNIVARIATE DATA (DATA ON THE LINE)'/ C1X,'DEVELOPED AND PROGRAMMED BY DR. STANLEY L. SCLOVE' D/,1X,'VERSION 3.0 20-OCT-92 '/) 1036 FORMAT(1X,'ITERATION ', I2) 1040 FORMAT(1X,'SUMS:',6X,9F13.2) 1045 FORMAT(1X,'NUMBERS:',3X,9(I10,3X)) 15000 FORMAT(18A4) 1051 FORMAT(1X,'N = ',I4/) 1055 FORMAT(1X,'STD.DEVS.: ',9F13.2) 1060 FORMAT(1X,'PROBS.: ',9F13.2) 16000 FORMAT(1X,18A4/) 70000 FORMAT(1X,'VARIANCES: ',9F13.2) 71000 FORMAT(1X,'STD.ERROR=SQRT(WGMS) = ',F13.4/) C 1160 FORMAT(1X,'THIS STEP HAS NOT CONVERGED IN 40 ITERATIONS. STOP') 60000 FORMAT(1X,'AIC = ', F14.4/) 58000 FORMAT(1X,'SCHWARZ CRITERION = ', F14.4/) 17000 FORMAT(/,1X,'KASHYAP CRITERION = ', F24.4) C 72000 FORMAT(/1X,'NUMBER OF PARAMETERS = ',I4/) 1235 FORMAT(1X,'NO OBSERVATIONS IN GROUP ',I3,'. STOP') C 1995 STOP END C$DATA TRYPANOS DATA: LENGTHS OF 500 ROUNDWORMS (Lancaster, H. O. 1969) N=0500 15 15 15 15 15 15 16 16 16 16 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 23 23 23 23 23 23 23 23 23 23 23 23 23 23 24 24 24 24 24 24 24 24 24 24 24 24 24 25 25 25 25 25 25 25 25 25 25 25 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 32 32 32 32 32 32 32 32 32 32 32 32 32 33 33 33 33 33 33 33 34 34 34 35 35 /* #TRYPANOS DATA D #LENGTHS OF 500 ROUNDWORMS #See: #Lancaster, H. O., The Chi-Squared Distribution, Wiley, New York, 1969. #Frequency distribution: #value: 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 #freq.: 6 4 27 57 94 49 37 22 14 13 11 19 25 23 29 27 18 13 7 3 2 /*