CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C                                                                   C
C  CLUSPAC: Computer Programs for Mixture-Model Clustering          C
C                                                                   C
C  COPYRIGHT (C) 1991, 1992 STANLEY L. SCLOVE                       C
C                                                                   C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C                                                                   C
C  CMS DSN = MIX1CMA CLUSPAC                                        C
C                                                                   C
C  THE PROGRAMS MIX* CLUSPAC IN ISOPAC ARE FOR CLUSTERING DATA      C
C  BY ITERATIVE MAXIMIZATION OF THE MIXTURE-MODEL LIKELIHOOD        C
C                                                                   C
C              N     K                                              C
C             ---    --                                             C
C        L =  | |    >  p(c)*f(x(i)|c),                             C
C             | |    --                                             C
C             i=1   c=1                                             C
C                                                                   C
C  WHERE                                                            C
C                                                                   C
C  N      = NUMBER OF OBSERVATIONS ("SAMPLE SIZE"),                 C
C  K      = NUMBER OF CLUSTERS,                                     C
C  x(i)   = i-TH OBSERVATION, i = 1,2,...,N,                        C
C  f(x|c) = VALUE AT x OF THE c-TH CLASS-CONDITIONAL                C
C           DENSITY FUNCTION (c=1,2,...,K)                          C
C  AND                                                              C
C  p(c)   = PRIOR PROBABILITY OF CLASS c.                           C
C                                                                   C
C                                                                   C
C  REFERENCE FOR CLUSTERING BY MIXTURE MODEL:                       C
C                                                                   C
C  Wolfe, J. H. (1970). Pattern clustering by multivariate          C
C  mixture analysis. Multivariate Behavioral Research 5, 329-350.   C
C                                                                   C
C                                                                   C
C  THE "1" IN THE PROGRAM NAME "MIX1CMA" MEANS THAT                 C
C  THE PROGRAM IS FOR UNIVARIATE (1-DIMENSIONAL) DATA               C
C  (DATA ON THE LINE); THE "CM" MEANS THAT A COMMON VARIANCE IS     C
C  ASSUMED ACROSS CLUSTERS; AND THE "A" INDICATES THAT THERE IS     C
C  AUTOMATIC SETTING OF NUMBERS OF CLUSTERS AND INITIAL MEANS.      C
C                                                                   C
C                                                                   C
C  PROGRAMMED BY                                                    C
C  DR. STANLEY L. SCLOVE 312/996-2681                               C
C  DEPARTMENT OF                                                    C
C  INFORMATION & DECISION SCIENCES M/C 294                          C
C  COLLEGE OF BUSINESS ADMINISTRATION                               C
C  UNIVERSITY OF ILLINOIS AT CHICAGO                                C
C  BOX 4348                                                         C
C  CHICAGO, IL 60680-4348                                           C
C                                                                   C
C                                                                   C
C  VERSION 1.3 21-MAY-91                                            C
C                                                                   C
C  COPYRIGHT (C) 1991, 1992 STANLEY L. SCLOVE                       C
C                                                                   C
C                                                                   C
C  RESTRICTIONS (CAN BE MODIFIED):                                  C
C     N, SAMPLE SIZE, AT MOST 999;                                  C
C     K, NUMBER OF CLUSTERS, AT MOST 29;                            C
C     ITER, MAXIMUM NUMBER OF ITERATIONS, 20.                       C
C                                                                   C
C                                                                   C
C  CONTROL CARDS:                                                   C
C                                                                   C
C  (1) DATASET TITLE                                                C
C  (2) N, IN FORMAT (2X,I4)                                         C
C  (3) FMT, IN FORMAT (18A4), E.G., (1X,F4.1).                      C
C      ALLOW AT LEAST ONE BLANK IN FMT: IT WILL ALSO BE USED        C
C      FOR OUTPUT, WHERE CC1 IS FOR CARRIAGE CONTROL.               C
C      ALLOW A CC FOR THE DECIMAL POINT ON OUTPUT,                  C
C      WHETHER OR NOT THERE IS ONE ON INPUT.                        C
C  (4) DATA, IN FORMAT SPECIFIED BY FMT                             C
C                                                                   C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      DIMENSION X(999),XMNDSQ(999),ICLUS(999),IOTA(999)
      DIMENSION XJ(9,9)
      DIMENSION DSQ(29),C(29),SUM(29)
      DIMENSION TITLE(18)
      DIMENSION B(29),NC(29),XMEAN(29)
      DIMENSION FMT(18)
      DIMENSION SS(29),SSD(29)
      DIMENSION SD(29)
      DIMENSION VAR(29)
      DIMENSION ICLSOL(999)
      DIMENSION F(999,29)
      DIMENSION P(29), XNC(29)
      DIMENSION PP(29,999)
      DIMENSION XMXPR(999)
      DIMENSION DENOM(999)
      DIMENSION AICVEC(29),SCHVEC(29),XKSVEC(29)
C
      DOUBLE PRECISION SUM,SS,F,P,PP
C
C     CONTROL CARDS:
C
C     (1) DATASET TITLE
C     (2) N, IN FORMAT (2X,I4)
C     (3) FMT, IN FORMAT (18A4), E.G., (1X,F4.1).
C         ALLOW AT LEAST ONE BLANK IN FMT: IT WILL ALSO BE USED
C         FOR OUTPUT, WHERE CC1 IS FOR CARRIAGE CONTROL.
C         ALLOW A CC FOR THE DECIMAL POINT ON OUTPUT, WHETHER OR NOT
C         THERE IS ONE ON INPUT.
C     (4) DATA, IN FORMAT SPECIFIED BY FMT
C
C
      READ(5,15000) TITLE
C
C     WRITE PROGRAM INFORMATION.
      WRITE(6,20000)
      WRITE(6,21000)
      WRITE(6,22000)
C
      WRITE(6,16000) TITLE
C
C     READ SAMPLE SIZE, N.
      READ(5,10000) N
      XN = N
      WRITE(6, 1051) N
C
C     READ DATA FORMAT.
      READ(5,15000) FMT
C
C     READ DATA AND
C     COMPUTE STATISTICS OF WHOLE SAMPLE:
      TOTAL=0.0
      SUMSQS=0.0
      SSDEVS=0.0
C
C     READ(5, * ) ( X(I), I = 1,N )
C
      READ(5,FMT) ( X(I), I = 1,N )
      DO 100 I = 1,N
      TOTAL = TOTAL + X(I)
      SUMSQS = SUMSQS + X(I)*X(I)
      IF (I .EQ. 1) GO TO 498
      GO TO 499
  498 XMAX = X(1)
      XMIN = X(1)
  499 CONTINUE
      IF (X(I) .LT. XMIN) XMIN=X(I)
      IF (X(I) .GT. XMAX) XMAX=X(I)
  100 CONTINUE
C     IF IT IS DESIRED TO PRINT OUT THE DATA, REMOVE THE "C"S
C     FROM CC1 IN THE APPROPRIATE FOLLOWING STATEMENTS:
C     WRITE DATA:
C     WRITE(6, 1009)
C1009 FORMAT(1X,'DATA:'/)
C     WRITE(6, FMT) (X(I), I=1, N)
C     WRITE SUMMARY STATISTICS FOR WHOLE SAMPLE:
      WRITE(6, 1151)
      WRITE(6, * ) XMIN
      WRITE(6, 1152)
      WRITE(6, * ) XMAX
      XBAR = TOTAL/XN
      SSDEVS = SUMSQS- TOTAL*TOTAL/XN
      VARHAT = SSDEVS/XN
      WRITE(6, 1112) XBAR
      WRITE(6, 1113) VARHAT
      PI = 3.1415927
      TEMP = 2.0*PI*SSDEVS/N
      XMN2LL = N*(1.0 + ALOG(TEMP))
      STDDEV = SQRT(VARHAT)
      WRITE(6, 1121) SSDEVS, XMN2LL, STDDEV
C     NO. OF PARAMETERS FOR UNCLUSTERED SAMPLE IS:
C     1 MEAN + 1 VARIANCE = 2 PARAMETERS
      NOPARM = 1 + 1
C
      AIC = XMN2LL + 2.0*NOPARM
      WRITE(6, 1205) AIC
      SCH = XMN2LL + ALOG(XN)*NOPARM
      XKASH = SCH - ALOG(2*VARHAT**3)
      WRITE(6,18000) SCH
      WRITE(6,17000) XKASH
C     A TABLE OF MODEL SELECTION CRITERIA-VALUES FOR VARIOUS K
C     WILL BE PRINTED AT THE END.  THE NEXT INSTRUCTIONS
C     SET UP THE FIRST ENTRIES FOR THAT TABLE.
      AICVEC(1) = AIC
      SCHVEC(1) = SCH
      XKSVEC(1) = XKASH
C
C
C
C     SET CONSTANTS.
C     XJ(IC,K), K=2 TO 9, IC = 1 TO K
C     OPTIMAL CLASS PROBABILITIES - NORMAL DISTRIBUTION
C     JOHARI & SCLOVE (1975).  "PARTITIONING A DISTRIBUTION."
C     COMMUNICATIONS IN STATISTICS.
      XJ(1,2)=.5
      XJ(2,2)=.5
      XJ(1,3)=.2703
      XJ(2,3)=.4594
      XJ(3,3)=.2703
      XJ(1,4)=.1631
      XJ(2,4)=.3369
      XJ(3,4)=.3369
      XJ(4,4)=.1631
      XJ(1,5)=.1068
      XJ(2,5)=.2444
      XJ(3,5)=.2976
      XJ(4,5)=.2444
      XJ(5,5)=.1068
      XJ(1,6)=.0739
      XJ(2,6)=.1810
      XJ(3,6)=.2451
      XJ(4,6)=.2451
      XJ(5,6)=.1810
      XJ(6,6)=.0739
      XJ(1,7)=.0536
      XJ(2,7)=.1375
      XJ(3,7)=.1986
      XJ(4,7)=.2106
      XJ(5,7)=.1986
      XJ(6,7)=.1375
      XJ(7,7)=.0536
      XJ(1,8)=.0402
      XJ(2,8)=.1067
      XJ(3,8)=.1613
      XJ(4,8)=.1918
      XJ(5,8)=.1918
      XJ(6,8)=.1613
      XJ(7,8)=.1067
      XJ(8,8)=.0402
      XJ(1,9)=.0310
      XJ(2,9)=.0845
      XJ(3,9)=.1324
      XJ(4,9)=.1643
      XJ(5,9)=.1756
      XJ(6,9)=.1643
      XJ(7,9)=.1324
      XJ(8,9)=.0845
      XJ(9,9)=.0310
C
C     PERFORM CLUSTERING FOR K = 2 TO 9 GROUPS.
      DO 995 K = 2,9
      WGSS = 0.0
      WRITE(6,23000) K
C
C     COMPUTE INITIAL MEANS, EQUALLY SPACED THROUGH RANGE OF DATA,
C     INITIAL PRIOR PROBABILITIES AND INITIAL VALUE
C     OF COMMON VARIANCE:
      XK = K
      DO 101 IC=1,K
C
      XC = IC
      XMEAN(IC) = XMIN + (IC-1)*(XMAX-XMIN)/(K-1)
C
C     For more centered spacing, use the following:
C     XMEAN(IC) = XMIN + IC*(XMAX-XMIN)/(K+1)
C
C     SET INITIAL VALUES OF PRIOR PROBABILITIES EQUAL TO THE
C     OPTIMAL VALUES FOR A NORMAL DISTRIBUTION:
      P(IC) = XJ(IC,K)
C     TO SET INITIAL VALUES OF PRIOR PROBABILITIES EQUAL TO
C     BINOMIAL(X=IC-1;N=K-1,P=1/2),
C     FOR IC = 1,2,...,K,
C     REMOVE THE "C" FROM CC1 OF THE NEXT LINE:
C     P(IC) = GAMMA(XK)/(GAMMA(XC)*GAMMA(XK-XC+1)*(2**(K-1)))
C
C     FOR EQUAL INITIAL VALUES OF PRIOR PROBABILITIES,
C     REMOVE THE "C" FROM CC1 OF THE NEXT LINE:
C     P(IC) = 1.0/K
C
  101 CONTINUE
      VARHAT = (XMAX - XMIN)/4
      VARHAT =(VARHAT**2)/K
C
C
C
C     WRITE INITIAL VALUES OF PRIOR PROBS:
      WRITE(6,25000)
C
      WRITE(6, 1025) ( P(IC), IC = 1, K )
C
C     WRITE INITIAL MEANS:
      WRITE(6,14000)
C
      WRITE(6,19000) ( XMEAN(IC), IC=1, K )
C     WRITE INITIAL VARIANCE:
      WRITE(6,24000)
C
      WRITE(6, 1017) VARHAT
      DO 105 INTGER=1,N
      IOTA(INTGER) = INTGER
  105 CONTINUE
C
      ITER = 1
C
  601 CONTINUE
C
      IF (ITER .EQ. 1) GO TO 560
C     STORE OLD CLUSTERING:
      DO 565 I = 1,N
      ICLSOL(I) = ICLUS(I)
  565 CONTINUE
C     COMMENCE DISTANCE COMPUTATIONS.
  560 CONTINUE
      DO 102 I = 1,N
      DO 102 IC = 1,K
      DSQ(IC) = ( XMEAN(IC) - X(I) )**2
      ZSQ = DSQ(IC)/VARHAT
      IF ( ZSQ .LE. 174.673 ) GO TO 110
      F(I,IC) = 0.0
      GO TO 102
  110 CONTINUE
      F(I,IC) = EXP(-ZSQ/2.0)
  102 CONTINUE
C
C     COMPUTE POSTERIOR PROBABILITIES OF GROUP MEMBERSHIP:
      DO 405 I = 1,N
      DENOM(I) = 0.0
      DO 405 IH=1,K
      DENOM(I) = DENOM(I) + P(IH)*F(I,IH)
  405 CONTINUE
      DO 406 I = 1,N
      DO 406 IC=1,K
C
      IF ( DENOM(I) .EQ. 0.0 ) DENOM(I)=0.0001
      PP(IC,I)= P(IC)*F(I,IC)/DENOM(I)
  406 CONTINUE
C
C     COMPUTE NEW LABELS BY MAX POSTERIOR PROBABILITY:
      DO 8 I = 1,N
      XMXPR(I) = PP(1,I)
      ICLUS(I) = 1
      DO 8 IC = 2,K
      IF ( PP(IC,I) .GT. XMXPR(I) ) GO TO 9
      GO TO 8
    9 XMXPR(I) = PP(IC,I)
      ICLUS(I) = IC
    8 CONTINUE
C
      IF (N .GE. 31) GO TO 200
C     WRITE NEW LABELS:
      WRITE(6, 11000)
      WRITE(6,12000) (IOTA(I), I=1, N)
      WRITE(6,13000) (ICLUS(I), I=1, N)
  200 CONTINUE
C
C     UPDATE CLUSTER PRIOR PROBABILITIES P(IC), MEANS XMEAN(IC) AND
C     VARIANCE VARHAT:
      WGSS = 0.0
      DO 6 IC = 1,K
      XNC(IC) = 0.0
C     XNC(IC) WILL BE THE SUM OVER ALL N OBSERVATIONS OF THEIR
C     POSTERIOR PROBABILITIES OF MEMBERSHIP IN CLUSTER IC.
      SUM(IC) = 0.0
      SS(IC) = 0.0
      DO 67 I = 1,N
      XNC(IC) = XNC(IC) + PP(IC,I)
      SUM(IC) = SUM(IC) + PP(IC,I)*X(I)
      SS(IC) = SS(IC) + PP(IC,I)*X(I)*X(I)
   67 CONTINUE
C
C     IF ONE OF THE K CLUSTERS BECOMES EMPTY, THE PROCEDURE
C     WILL GO ON TO THE NEXT VALUE OF K:
      IF ( XNC(IC) .EQ. 0.0 ) GO TO 125
      XMEAN(IC) = SUM(IC)/XNC(IC)
      SSD(IC) = SS(IC) - SUM(IC)*SUM(IC)/XNC(IC)
      VAR(IC) = SSD(IC)/( XNC(IC) )
C
      IF ( VAR(IC) .LE. 0.0 ) VAR(IC) = 0.0001
      SD(IC) = SQRT(VAR(IC))
      P(IC) = XNC(IC)/XN
      WGSS = WGSS + SSD(IC)
    6 CONTINUE
C
C     COUNT NUMBERS IN CLUSTERS:
      DO 66 IC = 1,K
      NC(IC) = 0
   66 CONTINUE
      DO 400 I = 1,N
      IGROUP = ICLUS(I)
      NC(IGROUP) = NC(IGROUP) + 1
  400 CONTINUE
C
C
C
      VARHAT = WGSS/XN
C
      SUMLNF = 0.0
      DO 161 I=1,N
      SUMPXF = 0.0
      DO 159 IC=1,K
      SUMPXF = SUMPXF + P(IC)*F(I,IC)
  159 CONTINUE
C     F(I,IC) HAD NOT PREVIOUSLY BEEN DIVIDED BY SQRT(2*PI*VARHAT):
      SUMPXF = SUMPXF/SQRT(2.0*PI*VARHAT)
      SUMLNF = SUMLNF + ALOG(SUMPXF)
  161 CONTINUE
C
      XMN2LL = -2.0*SUMLNF
C
      WGMS = WGSS/(N-K)
      WRITE(6, 1021) WGSS, XMN2LL, WGMS
      STDERR = SQRT(WGMS)
      WRITE(6,71000) STDERR
C
      KM1 = K-1
      DO 500 IC=1,KM1
      ICP1 = IC+1
C     B(IC) IS BOUNDARY BETWEEN IC-TH AND (IC+1)-ST CLASSES.
      B(IC) = ( XMEAN(IC) + XMEAN(ICP1) )/2.0
      BDYADJ = DLOG(P(ICP1)) - DLOG(P(IC))
      BDYADJ = BDYADJ/(XMEAN(ICP1) - XMEAN(IC))
      BDYADJ = VARHAT*BDYADJ
      B(IC) = B(IC) - BDYADJ
  500 CONTINUE
      WRITE(6, 1036) ITER
      WRITE(6, 1035) (B(IC), IC=1, KM1)
      WRITE(6, 1020) (XMEAN(IC), IC=1, K)
      IF (ITER .EQ. 1) GO TO 600
      DO 555 I = 1,N
      IF (ICLUS(I) .EQ. ICLSOL(I)) GO TO 555
      GO TO 600
  555 CONTINUE
      GO TO 530
  600 CONTINUE
      ITER = ITER + 1
C
C     IF PROCEDURE HAS NOT CONVERGED IN 20 ITERATIONS, IT WILL
C     GO ON TO THE NEXT VALUE OF K.
      IF (ITER.GE.21) GO TO 570
      GO TO 601
  570 WRITE(6, 1160)
      GO TO 995
C
C     IF ONE OF THE K CLUSTERS BECOMES EMPTY, THE PROCEDURE
C     WILL GO ON TO THE NEXT VALUE OF K.
  125 CONTINUE
      WRITE(6, 1235) IC
      GO TO 995
C
C
  530 CONTINUE
C
      WRITE(6, 1040) (SUM(IC), IC=1, K)
      WRITE(6, 1045) (NC(IC), IC=1, K)
      WRITE(6,70000) (VAR(IC), IC=1, K)
      WRITE(6, 1055) (SD(IC), IC=1, K)
C     VARHAT IS MLE OF VARIANCE.
      VARHAT = WGSS/N
      WRITE(6, 1100) VARHAT
C
C
C
C     COMPUTE MODEL-SELECTION CRITERIA:
C     NO. PARAMETERS = K MEANS + 1 VARIANCE + (K-1) PROBS.
      NOPARM = K + 1
      NOPARM = NOPARM + K - 1
      WRITE(6,72000) NOPARM
      AIC = XMN2LL + 2.0*NOPARM
      SCH = XMN2LL + ALOG(XN)*NOPARM
C
      XKASH = SCH - ALOG(2*VARHAT**3)
C
      WRITE(6, 1205) AIC
      WRITE(6,18000) SCH
      WRITE(6,17000) XKASH
C
C
C     Store values of model-selection criteria for this value of K:
C
      AICVEC(K) = AIC
      SCHVEC(K) = SCH
      XKSVEC(K) = XKASH
C
C     GO ON TO NEXT VALUE OF K:
  995 CONTINUE
C
C     WRITE VALUES OF MSC'S FOR VARIOUS K:
      WRITE(6,34000)
      DO 605 K = 1,9
      WRITE(6,33000) K, AICVEC(K), SCHVEC(K), XKSVEC(K)
  605 CONTINUE
C
10000 FORMAT(2X,I4)
11000 FORMAT(1X,'CLUSTERING')
12000 FORMAT(1X,'POINT: '/, (1X,40I3))
13000 FORMAT(1X,'CLUSTER: '/, (1X,40I3))
33000 FORMAT(1X,'K=',I3, 3F15.2)
34000 FORMAT(1X,'MODEL SELECTION CRITERIA'/
     X' AIC SCHWARZ KASHYAP '/)
14000 FORMAT(/1X,'INITIAL MEANS')
25000 FORMAT(/1X,'INITIAL VALUES OF PRIOR PROBS')
19000 FORMAT(1X, 9F13.2/)
 1025 FORMAT(1X, 9F13.4/)
24000 FORMAT(/1X,'INITIAL VARIANCE')
 1017 FORMAT(1X, F15.4/)
 1100 FORMAT(1X, 'M.L. ESTIMATE OF COMMON VARIANCE = ',F14.5/)
21000 FORMAT(1X,'CMS DSN = MIX1CMA CLUSPAC')
22000 FORMAT(1X,'COPYRIGHT (C) 1991, 1992 STANLEY L. SCLOVE.'/)
23000 FORMAT('1',1X,'K = ',I1,' CLUSTERS')
 1020 FORMAT(1X,'MEANS: ',9F13.2)
 1021 FORMAT(/1X,'WGSS = ',F14.4,' MINUS 2 LOG LIKELIHOOD = ',
     XF14.4, ' WGMS = ',F14.4/)
 1035 FORMAT(1X,'BOUNDARIES:', 8X, 9F13.2)
 1151 FORMAT(1X,'MINIMUM OF SAMPLE: ')
 1152 FORMAT(1X,'MAXIMUM OF SAMPLE: ')
17000 FORMAT(/,1X,'KASHYAP CRITERION = ', F14.4)
 1112 FORMAT(/,1X,'MEAN = ', F14.4)
 1113 FORMAT(1X, 'ESTIMATE OF VARIANCE = ', F14.5/)
 1121 FORMAT(/1X,'SSDEVS = ',F14.4,' MINUS 2 LOG LIKELIHOOD = ',
     XF14.4, ' STDDEV = ',F14.4/)
20000 FORMAT('1','-------------------------------------------',
     A/,1X,'PROGRAM MIX1CMA CLUSPAC '/
     B,1X,'FOR CLUSTERING UNIVARIATE DATA (DATA ON THE LINE)'/
     C1X,'DEVELOPED AND PROGRAMMED BY DR. STANLEY L. SCLOVE'
     D/,1X,'VERSION 1.3 21-MAY-91 '/)
 1036 FORMAT(1X,'ITERATION ', I2)
 1040 FORMAT(1X,'SUMS:',6X,9F13.2)
 1045 FORMAT(1X,'NUMBERS:',3X,9(I10,3X))
15000 FORMAT(18A4)
 1051 FORMAT(1X,'N = ',I3/)
 1055 FORMAT(1X,'STD.DEVS.: ',9F13.2)
16000 FORMAT(1X,18A4/)
70000 FORMAT(1X,'VARIANCES: ',9F13.2)
71000 FORMAT(1X,'STD.ERROR=SQRT(WGMS) = ',F13.4/)
C
 1160 FORMAT(1X,'PROGRAM HAS NOT CONVERGED IN 20 ITERATIONS. STOP')
 1205 FORMAT(1X,'AIC = ', F14.4/)
18000 FORMAT(1X,'SCHWARZ CRITERION = ', F14.4/)
72000 FORMAT(/1X,'NUMBER OF PARAMETERS = ',I4/)
 1235 FORMAT(1X,'NO OBSERVATIONS IN GROUP ',I3,'. STOP')
C
 1995 STOP
      END
C$DATA
TRYPANOS DATA: LENGTHS OF 500 ROUNDWORMS (Lancaster, H. O.
1969) N=0500 15 15 15 15 15 15 16 16 16 16 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 23 23 23 23 23 23 23 23 23 23 23 23 23 23 24 24 24 24 24 24 24 24 24 24 24 24 24 25 25 25 25 25 25 25 25 25 25 25 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 28 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 32 32 32 32 32 32 32 32 32 32 32 32 32 33 33 33 33 33 33 33 34 34 34 35 35 /* #TRYPANOS DATA D #LENGTHS OF 500 ROUNDWORMS #See: #Lancaster, H. O., The Chi-Squared Distribution, Wiley, New York, 1969. #Frequency distribution: #value: 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 #freq.: 6 4 27 57 94 49 37 22 14 13 11 19 25 23 29 27 18 13 7 3 2 /*