IDS 460 (SAMPLE SURVEYS)
SPRING, 1995
TEXT: SCHEAFFER/MENDENHALL/OTT, 4th ed.
SCLOVE

SOLUTIONS TO IDS 460 SPRING '95 FINAL EXAMINATION


This examination was cumulative, comprehensive, and multiple choice. The questions varied in style and level of difficulty.


PART 1. FIGURES FOR A POPULATION

The distribution of ages for N = 8700 graduate and professional students at Minnesota State University is as follows.

 
-------------------------------------------------------------------
Age             21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36  37
% of students    1 10  5  5  5  5  5  5  6  7  8  9 10  9  8  1   1
Cumulative %     1 11 16 21 26 31 36 41 47 54 62 71 81 90 98 99 100
-------------------------------------------------------------------
 
1. How many of the students are exactly 22 years of age?
(A) 10   (B) 11   (C) 87   (D) 870  (E) 957   students
                            ------
10% of 8700 = 870
 
2. What percentage of the population is over 30 years of age?
(A) 38   (B) 46   (C) 53   (D) 54   (E) 62 %
         ------
100 - 54 (the cumulative % through age 30) = 46
 
3. What is the mean age in this population?
(A) 28.95  (B) 29.15  (C) 29.95  (D) 32.95  (E) 33.95 years of age
           ---------
Mean = sum of products = 21(.01) + 22(.10) + ... + 37(.01) = 0.21 + 2.20 + ... + 0.37 = 29.15
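
The three answers in Part 1 can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the table as printed; the variable names are illustrative.

```python
# Age distribution from the table above: ages 21..37 and the % of students at each age.
ages = list(range(21, 38))
percents = [1, 10, 5, 5, 5, 5, 5, 5, 6, 7, 8, 9, 10, 9, 8, 1, 1]
N = 8700  # population size

assert sum(percents) == 100  # the percentages should account for everyone

# Question 1: number of students exactly 22 years old.
count_age_22 = N * percents[ages.index(22)] // 100

# Question 2: percentage of students strictly older than 30.
pct_over_30 = sum(p for a, p in zip(ages, percents) if a > 30)

# Question 3: mean age = sum of (age x proportion at that age).
mean_age = sum(a * p for a, p in zip(ages, percents)) / 100

print(count_age_22, pct_over_30, round(mean_age, 2))
```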

PART 2. COMPUTATIONS FOR A SAMPLE

The times to completion of the final exam for a class of 9 students were as follows.

35, 40, 45, 50, 55, 60, 65, 80, 80 minutes

4. What is the median time to completion?

(A) 40   (B) 45   (C) 50   (D) 55   (E) 60   minutes
                           ------
5. What is the lower quartile?
(A) 40   (B) 45   (C) 50   (D) 55   (E) 60   minutes
         ------
6. What is the upper quartile?
(A) 50   (B) 55   (C) 60   (D) 65   (E) 80   minutes
                           ------
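
The keyed quartiles (45 and 65) are consistent with taking each quartile as the median of its half of the ordered data, with the overall median included in both halves when n is odd. That convention is an assumption here, since the key does not state which one was used; other quartile rules give 42.5 and 72.5. A minimal sketch:

```python
from statistics import median

times = sorted([35, 40, 45, 50, 55, 60, 65, 80, 80])
n = len(times)
half = (n + 1) // 2  # for odd n, both halves include the middle value

med = median(times)                        # Question 4
lower_quartile = median(times[:half])      # Question 5: median of lower half
upper_quartile = median(times[n - half:])  # Question 6: median of upper half

print(med, lower_quartile, upper_quartile)
```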

PART 3. SIMPLE RANDOM SAMPLING

7. Suppose a sample of 2 is to be drawn with replacement from a population of size 4. What is the probability that the sample consists of two different elements?
(A) 1/16  (B) 1/4   (C) 1/2   (D) 3/4     (E) 15/16
                              -------
ANS.: 4 x 3/(4x4) = 3/4
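
The count 4 x 3 out of 4 x 4 can be confirmed by listing every ordered sample. The sketch below uses hypothetical element labels a, b, c, d.

```python
from fractions import Fraction
from itertools import product

population = ["a", "b", "c", "d"]  # hypothetical labels for the 4 elements

# All ordered samples of size 2 drawn with replacement: 4 x 4 = 16 of them.
samples = list(product(population, repeat=2))

# Samples whose two draws are different elements: 4 x 3 = 12 of them.
distinct = [s for s in samples if s[0] != s[1]]

prob_distinct = Fraction(len(distinct), len(samples))
print(prob_distinct)
```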

PART 4. (0,1)-VARIABLES

Suppose y is a variable that takes the value 1 with probability p and 0 with probability q (q = 1-p).
 
8. What is the expected value of y?
(A) p   (B) q    (C) pq    (D) 1/2  (E) 1/4
-----
9. What is the variance of y?
(A) p   (B) q    (C) pq    (D) 1/2  (E) 1/4
                 ------
10. What is the maximum value of the variance of y?
 
(A) p   (B) .7071    (C) pq    (D) 1/2  (E) 1/4
                                        -------
11. What is the maximum value of the standard deviation of y?
 
(A) p   (B) .7071    (C) pq    (D) 1/2  (E) 1/4
                               -------
12. What is the expected value of (1-y) ?
 
(A) p   (B) q    (C) pq    (D) p-q  (E) q-p
        -----
13. What is the expected value of (y-p) ?
 
(A) 0   (B) 1    (C) p     (D) q    (E) q-p
-----
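
The answers to Questions 8-13 all follow from E[g(y)] = g(1)p + g(0)q. The sketch below evaluates that expression for one arbitrary p (the value 3/10 is illustrative, not from the exam) and searches a grid of p values for the largest variance.

```python
from fractions import Fraction

def expect(g, p):
    """E[g(y)] for y = 1 with probability p and y = 0 with probability 1 - p."""
    return g(1) * p + g(0) * (1 - p)

p = Fraction(3, 10)   # arbitrary illustrative value, not from the exam
q = 1 - p

mean_y = expect(lambda y: y, p)                 # Question 8: equals p
var_y = expect(lambda y: (y - mean_y) ** 2, p)  # Question 9: equals pq
mean_1_minus_y = expect(lambda y: 1 - y, p)     # Question 12: equals q
mean_y_minus_p = expect(lambda y: y - p, p)     # Question 13: equals 0

# Questions 10 and 11: search a grid of p values for the largest variance.
grid = [Fraction(k, 100) for k in range(101)]
max_var = max(pp * (1 - pp) for pp in grid)     # attained at p = 1/2
max_sd = float(max_var) ** 0.5

print(mean_y, var_y, mean_1_minus_y, mean_y_minus_p, max_var, max_sd)
```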

PART 5. STANDARD ERROR OF THE MEAN

14. A simple random sample of size 100 is to be drawn without replacement from a population of size 400 with a standard deviation of 100. What is the standard error of the mean?
 
(A) .25       (B) 1.0    (C) 8.67    (D) 10.0    (E) 100.0
                         --------
SOLUTION: Var(mean) = (var/n)[(N-n)/(N-1)] = (100**2/100)[(400-100)/(400-1)], or about 75; s.e.(mean) = square root of this, or about 8.67.
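
A short numerical check of the formula above; the function name is illustrative.

```python
from math import sqrt

def se_mean(sigma, n, N):
    """Standard error of the mean of a simple random sample of size n drawn
    without replacement from a population of size N with s.d. sigma."""
    return sqrt(sigma ** 2 / n * (N - n) / (N - 1))

se = se_mean(sigma=100, n=100, N=400)
print(round(se ** 2, 2), round(se, 2))
```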

PART 6. SIMPLE RANDOM SAMPLING: CONFIDENCE INTERVAL FOR A MEAN

15. If GMAT scores have a standard deviation specified as 100, and a sample of n = 49 from a population of N = 490 gave a mean of 560, then a 95% confidence interval for the mean of that population is (560-A, 560+A), where A = ?
 
(A) 1.96(100/49)
(B) 1.96(100/7)
(C) 1.96(100/49)(490-49)/(490-1)
(D) 1.96(100/49)(.950)
(E) 1.96(100/7)(.950)
---------------------
 
16. About what sample size n would be required (instead of 49) so that the margin of error A would be 20?
 
(A) 8    (B) 50    (C) 83      (D) 101    (E) 111
                   ------
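
The key gives no working for Questions 15 and 16, so the following is only a sketch. It assumes the same finite-population correction as in Question 14, and for the sample size it assumes the analogue for a mean of the textbook formula quoted in Question 18, n = N(sigma**2)/[(N-1)B**2/4 + sigma**2], which uses 2 in place of 1.96.

```python
from math import sqrt

N, n, sigma = 490, 49, 100

# Question 15: margin of error with the finite-population correction.
fpc = sqrt((N - n) / (N - 1))      # about .950
A = 1.96 * sigma / sqrt(n) * fpc   # 1.96(100/7)(.950)

# Question 16: sample size for a margin of error of B = 20
# (assumed formula; "two-sigma" bound, so B**2/4 appears).
B = 20
n_required = N * sigma ** 2 / ((N - 1) * B ** 2 / 4 + sigma ** 2)

print(round(fpc, 3), round(A, 2), round(n_required))
```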
 

PART 7. SIMPLE RANDOM SAMPLING: CONFIDENCE INTERVAL FOR A PROPORTION

17. If 480 persons out of a random sample of n = 1600 from a population of N = 10,000 favor H. Roth Parrot for President, then a 95% confidence interval for the proportion of that population in favor of Parrot is (.30-A, .30+A), where A = ?
 
(A) 1.96(.3)(.7)/1600
(B) 1.96(.458)/1600
(C) 1.96(.3)(.7)/40
(D) 1.96(.458/1600)(10000-1600)/(10000-1)
(E) 1.96(.458/40)(.9166)
------------------------
 
18. About what sample size n is required (instead of 1600) to make the margin of error A = .01 ? [Use p = .5 in the formula to get a conservative estimate of the required sample size.]
 
(A) 50   (B) 400   (C) 500     (D) 600    (E) 5000
                                          --------
From the formula in the textbook, n = Npq/[(N-1)B**2/4 + pq] = 10000(.5)(.5)/[9999*.01**2/4 + (.5)(.5)] or about 5000.
Alternatively, set two times the standard error equal to .01 and solve for n.
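
Both parts of Part 7 can be checked numerically. The sketch below evaluates choice (E) of Question 17, applies the textbook formula quoted above, and then substitutes n = 5000 back into two times the standard error.

```python
from math import sqrt

N = 10_000

# Question 17: margin of error for p-hat = .30 with n = 1600.
n, p = 1600, 0.30
A = 1.96 * sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))

# Question 18: sample size for B = .01 using the conservative p = .5.
B, p0 = 0.01, 0.5
n_required = N * p0 * (1 - p0) / ((N - 1) * B ** 2 / 4 + p0 * (1 - p0))

# Check: two standard errors at n = 5000 should be about .01.
n_check = 5000
two_se = 2 * sqrt(p0 * (1 - p0) / n_check * (N - n_check) / (N - 1))

print(round(A, 4), round(n_required), round(two_se, 4))
```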

PART 8. PROBLEM 4: TWO-WAY CLASSIFICATION WITH SURVEY DATA

19. For the dataset CREDIT DAT, what is the difference between mean monthly income for married and unmarried females?
 
(A) $53.20 (B) $234.80 (C) $388.60 (D) $623.40  (E) $676.60
----------
From one of the Problems:
  ROWS: MSTATUS     COLUMNS: GENDER
            0        1      ALL
 
   0    540.7    929.3    825.7
   1    487.5   1164.1   1119.7
  ALL   527.4   1078.0    994.9
 
   CELL CONTENTS --
            JOBINC:MEAN
Using the GENDER = 0 column: $540.70 - $487.50 = $53.20
 

PART 9. CHAPTER 4: STRATIFIED SAMPLING

20. Two strata are described below.
 
     ----------------------------------------
     Stratum, i                  1          2
     Stratum sizes            4000       6000
     Standard deviation          3          4
     Cost per observation      $16         $9
     ----------------------------------------
 
A sample of size n = 100 can be afforded. What is the optimal allocation to the two strata?
 
(A) 50, 50   (B) 40, 60   (C) 33, 67   (D) 27, 73   (E) 25, 75
                                       ----------
n1:n2::N1(sigma1)/sqrt(c1):N2(sigma2)/sqrt(c2)
       N1(sigma1)/sqrt(c1) = 4000(3)/4 = 3000
       N2(sigma2)/sqrt(c2) = 6000(4)/3 = 8000
n1:n2::3000:8000, i.e., n1=100 x 3/11 = 27, n2 = 73.
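
The allocation rule above, n_i proportional to N_i(sigma_i)/sqrt(c_i), can be sketched as follows. Rounding each share separately happens to sum to 100 here; in general the rounded shares may need adjusting.

```python
from math import sqrt

sizes = [4000, 6000]   # stratum sizes N_i
sds = [3, 4]           # stratum standard deviations
costs = [16, 9]        # cost per observation in each stratum
n = 100                # total sample size that can be afforded

# Allocation weights N_i * sigma_i / sqrt(c_i): 3000 and 8000.
weights = [N_i * s / sqrt(c) for N_i, s, c in zip(sizes, sds, costs)]

# Split n in proportion to the weights and round to whole observations.
allocation = [round(n * w / sum(weights)) for w in weights]

print(weights, allocation)
```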
 
21. (continuation) If the mean of stratum 1 is 64 and the mean of stratum 2 is 68, what is the population mean?
 
(A) 66.0   (B) 66.1   (C) 66.2   (D) 66.3   (E) 66.4
                                            --------
.4(64) + .6(68) =66.4

22. (continuation) What is the variance of the population?

 
(A) 9.0   (B) 13.2   (C) 15.0   (D) 16.0   (E) 17.04
                                           ---------
N x pop.var. = BGSS + WGSS
      = 4000[(64-66.4)**2 + 3**2] + 6000[(68-66.4)**2 + 4**2]
      = 4000[ 5.76        +  9  ] + 6000[  2.56      + 16  ]
      = 4000[ 14.76             ] + 6000[   18.56          ]
      = 59040                     + 111360                   = 170400,
so pop.var. = 170400/10000 = 17.04 .
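
The population mean of Question 21 and the between-plus-within decomposition of Question 22 can be checked together. As in the key, the stratum standard deviations are treated as population values (divisor N_i).

```python
sizes = [4000, 6000]   # stratum sizes N_i
means = [64, 68]       # stratum means
sds = [3, 4]           # stratum standard deviations

N = sum(sizes)

# Question 21: population mean is the size-weighted mean of the stratum means.
pop_mean = sum(N_i * m for N_i, m in zip(sizes, means)) / N

# Question 22: N x pop.var. = sum over strata of N_i[(mean_i - mean)**2 + sd_i**2].
total_ss = sum(N_i * ((m - pop_mean) ** 2 + s ** 2)
               for N_i, m, s in zip(sizes, means, sds))
pop_var = total_ss / N

print(round(pop_mean, 1), round(total_ss), round(pop_var, 2))
```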
 

PART 10. PPS SAMPLING

23. Use Table 4.6 to find the probability that the estimate of the total is equal to 10. [Note: This is Table 8.3, p. 316, in the new 5th edition.]
 
(A) .01   (B) .08   (C) .16   (D) .24   (E) .25
                                        -------
Prob(estimate = 10) = .08+.01+.16 = .25

24. What is the probability that the estimate of the total is equal to the actual value of the total?

 
(A) .01   (B) .08   (C) .16   (D) .24   (E) .25
                                        -------
The true value is 10, so this is the same probability as in Question 23: .25.

PART 11. PROBLEM 7: ZIPCODE DATA

25. How many of the 1000 zipcodes have a SportsPI at least two standard deviations above the mean?
 
(A) 16     (B) 18     (C) 20     (D) 22     (E) 30
           ------
From solution to Problem 7:
 
 MTB > # 5.  List the high SportsPI zipcodes:  All zipcodes having a
 MTB > # SportsPI at least two standard deviations above the mean.
 MTB > #
 MTB >       ERASE C21-C40
 MTB >       LET K21 = MEAN('SportsPI')
 MTB >       LET K22 = STDEV('SportsPI')
 MTB >       LET K23 = K21 + 2*K22
 MTB >       LET K24 = MAX('SportsPI')
 MTB >       COPY C1-C20 to C21-C40;
 SUBC>       USE rows with 'SportsPI' = K23:K24.
 MTB >       PRINT C21-C40
 
  ROW    C21   C22   C23     C24     C25     C26    C27     C28
 
    1     31     1    14   60090   59852   21660   31.2   41593
    2    347     2    30    3033    2052     669   31.7   38570
    3    363     2    31    7970    5994    1789   30.1   62358
    4    369     2    31    8536    5820    3346   34.4   37414
    5    375     2    34   10518     762     247   33.3   56568
    6    396     2    34   12570    3278    1082   30.9   34050
    7    413     2    34   13732    8295    2580   28.9   37172
    8    478     2    38   18968     356     125   33.2   31833
    9    554     3     9   33322   24945   11118   56.4   27374
   10    555     3     9   33445   28337   14535   64.5   24403
   11    557     3     9   33548    5848    3266   67.2   39552
   12    637     3    20   21082    1009     338   34.7   50901
   13    760     3    43   77077   37253   14860   31.4   49553
   14    791     3    45   22066   10346    3305   32.7   74308
   15    872     4     5   92056    1771     726   34.4   26880
   16    892     4     5   94807    1993     831   31.1   33826
   17    912     4     6   80134   14662    4494   31.1   52796
   18      0     4    50   83013     336     141   32.9   29167
 

PART 12. GEO-DEMOGRAPHICS

26. The U.S. population is about how many millions?
 
(A) 2.6   (B) 26   (C) 27    (D) 260    (E) 2600
                             -------
27. The number of zipcodes in the U.S. is about ?
 
(A) 400    (B) 4,000  (C) 40,000  (D) 400,000  (E) 4,000,000
                      ----------

PART 13. PROBLEM 8: PROBABILITY; TWO-WAY TABLES

28. In Little Town one-fourth of all dog owners have cats and one-half of all cat owners have dogs. What is the ratio of the number of dog owners to cat owners?
 
(A) 3:2   (B) 2:1     (C) 5:2     (D) 3:1     (E) 7:2
          -------
 
                     DOG?
              |  Yes        No  |
          ----|-----------------|----
          Yes |    a         b  |  R1
 CAT?         |                 |
          No  |    c         d  |  R2
          ----|-----------------|----
              |   C1        C2  |
 
Here C1 = number of dog owners, R1 = number of cat owners, and a = number who own both.
a = (C1)/4 = (R1)/2, so (C1) = 4(R1)/2 = 2(R1), i.e., C1:R1::2:1
 

PART 14. RATIO ESTIMATION

29. If the sugar content of a sample of oranges weighing 50 pounds is 5 pounds, and the total weight of the truckload of oranges is 1000 pounds, what is the ratio estimate of the sugar content of the truckload of oranges?
 
(A) 10 lbs. (B) 15 lbs.  (C) 20 lbs.  (D) 25 lbs.  (E) none of these
                                                   -----------------
5:50::x:1000, x = 100 lbs., which is not among choices (A)-(D).

PART 15. DIFFERENCE ESTIMATION

Consider the following data, similar to Example 6.10 (in the 4th edition). The population contains N = 200 inventory items with a stated total book value of $12,000. Thus the true (population) mean book value (mean of x) is $60. A simple random sample of n = 6 items yields the results shown in the table.
 
 
____________________________________________________________________
 
   Item, i     Audit Value, y_i    Book Value, x_i    Difference, d_i
____________________________________________________________________
 
       1               9                10                     -1
       2              15                12                     +3
       3               7                 8                     -1
       4              29                26                     +3
       5              45                44                     +1
       6              54                50                     +4
____________________________________________________________________
 
         Sums:       159               150                     +9
____________________________________________________________________
 
 
30. Consider the following statements.
     I.  The sample variance of the differences is 4.70.
         TRUE
    II.  The sample standard deviation of the differences is $2.17
         (to the nearest cent).
         TRUE
   III.  The estimated standard error of the sample mean is 0.846
         (to the nearest tenth of a cent).
         FALSE: Var of mean difference = (4.70/n)(N-n)/N
         = (4.70/6)(200-6)/200 = 0.760; sqrt(0.760) = $0.872.
 
(A) Only I is true.
(B) Only II is true.
(C)  Only III is true.
(D)  All three statements are true.
(E)  None of the above.
----------------------
 
31. (continuation) Construct "two-sigma" confidence limits (to the nearest cent) for the true difference between the means of y and x. The limits are ?
 
(A) (-0.24,3.24)   (B) (-1.69, 1.69)   (C) (-0.19, 3.19)
---------------
(D) (-0.19, 0.19)    (E) (-3.19, 3.19)
 
mean diff = 1.5; conf.int. is (mean diff - A, mean diff + A), where A = 2 x $0.872 = $1.74.
 
32. (continuation) The difference-method estimate of the true mean of y is ?
 
(A) 60.50    (B) 61.00  (C) 61.50    (D) 74.50    (E) None of these
                        ---------
Estimate = true mean of x + mean difference = 60.0 + 1.5 = 61.5
 
33. (continuation) The estimated variance of the estimator of the mean of y is ? (to three decimals)
 
(A) 0.590   (B) 0.764   (C) 0.780   (D) 0.975   (E) None of these
                                                -----------------
Same as the variance of the mean difference, 0.760, computed above; this is not among choices (A)-(D).
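
The computations for Questions 30-33 can be reproduced from the six sample pairs. A sketch using the standard library's sample variance (divisor n-1) and the finite-population factor (N-n)/N used in the key; variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance

audit = [9, 15, 7, 29, 45, 54]   # y_i
book = [10, 12, 8, 26, 44, 50]   # x_i
N = 200                          # number of inventory items
mu_x = 60                        # true mean book value

d = [y - x for y, x in zip(audit, book)]
n = len(d)

d_bar = mean(d)                  # mean difference
s2_d = variance(d)               # Question 30, statement I
s_d = stdev(d)                   # Question 30, statement II

var_d_bar = s2_d / n * (N - n) / N   # Questions 30 (III) and 33
se_d_bar = sqrt(var_d_bar)

limits = (d_bar - 2 * se_d_bar, d_bar + 2 * se_d_bar)  # Question 31
mu_y_hat = mu_x + d_bar                                # Question 32

print(d_bar, round(s2_d, 2), round(s_d, 2), round(var_d_bar, 3),
      round(se_d_bar, 3), [round(v, 2) for v in limits], mu_y_hat)
```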

PART 16. TWO-STAGE CLUSTER SAMPLING

34. In Example 9.5, pp. 350-351 of the 5th edition (was page 299 in the 4th edition), what would be the optimal value of m if the cost of cutting a battery open were 24 times (instead of six times) the cost of measuring a plate?
 
(A) 2.09   (B) 4.18   (C) 8.36   (D) 16.72   (E) 33.44
                      --------

PART 17. INVERSE SAMPLING

35. A fair coin is tossed until the fifth Head occurs. What is the probability that this requires exactly 8 tosses?
 
(A) 1/32      (B) 1/8      (C) 35/256   (D) 35/128     (E) 56/256
                           ----------
Pr(4 Heads in the first 7 tosses and Heads on the 8th toss) = C(7,4)(1/2)**8 = [7x6x5/(3x2x1)](1/2)**8 = 35(1/256) = 35/256
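
The expression C(7,4)(1/2)**8 is a negative binomial probability and can be evaluated exactly. A minimal sketch; the function name is illustrative.

```python
from fractions import Fraction
from math import comb

def prob_kth_success_on_trial(k, n, p):
    """Probability that the k-th success occurs on exactly the n-th trial:
    k-1 successes in the first n-1 trials, then a success on trial n."""
    return comb(n - 1, k - 1) * p ** k * (1 - p) ** (n - k)

prob = prob_kth_success_on_trial(k=5, n=8, p=Fraction(1, 2))
print(comb(7, 4), prob)
```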

PART 18. RANDOMIZED RESPONSE MODEL

36. A simple random sample of 200 dog owners was selected from the population of dog owners in the city. Two questions were written down:

Question 1. Has your dog been vaccinated against rabies?
Question 2. Is the last digit of your social security number odd?

A sheet of random numbers was used. The following procedure was followed: (i) A two-digit random number was chosen. If the random number was 80 or above, Question 1 was answered. (ii) If the random number was 79 or below, Question 2 was answered. There were 90 responses of "yes" and 110 responses of "no". From this, estimate the percentage of the dog owners whose dogs had been vaccinated (to the nearest one-tenth of a percent).

 
(A) 57.1%   (B) 87.4%   (C) 87.5%   (D) 87.6%   (E) None of these
                                                -----------------
         P(Q1) = .2       P(Yes|Q1) = p, to be estimated
 
         P(Q2) = .8       P(Yes|Q2) = 1/2
 
Set  90/200 = P(Yes) = .2p + .8(1/2), giving p = (.45-.4)/.2 = .25
or 25%.
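
The same estimate can be obtained by solving P(Yes) = P(Q1)p + P(Q2)(1/2) for p in exact arithmetic. A sketch, assuming two-digit random numbers 00-99 so that P(Q1) = 20/100:

```python
from fractions import Fraction

yes, n = 90, 200

p_q1 = Fraction(20, 100)   # random number 80..99 -> Question 1 (assumes 00-99)
p_q2 = 1 - p_q1            # random number 00..79 -> Question 2
p_yes_q2 = Fraction(1, 2)  # last SSN digit odd

# Solve  yes/n = p_q1 * p + p_q2 * p_yes_q2  for p.
p_hat = (Fraction(yes, n) - p_q2 * p_yes_q2) / p_q1

print(p_hat, f"{float(p_hat):.1%}")
```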
 

SLS:ss/460ofsln.html
20-Nov-95
latest revision 16:31, 23-Apr-96