University of Illinois at Chicago
College of Business Administration
Department of Information & Decision Sciences

Short Review of IDS 270-371 (Business Statistics I-II)
Professor Stanley L. Sclove
Familiarity with the following topics will be assumed, along with the accompanying Minitab or Excel commands:
Descriptive Statistics
Univariate:
    Frequency distribution; histogram
    quartiles
    Mean absolute deviation
    Standard deviation
    Minitab: DESCribe, DOTPlot, HISTogram
Bivariate:
    Covariance and Correlation
    Minitab:  PLOT, LPLOt, CORR
Probability Theory:
    unions, intersections, complements of sets
    mean and variance of a discrete random variable
Binomial Distributions
    Bernoulli variables
    mean, variance
Discrete and Continuous Random Variables
    Minitab: CDF, INVCDF
Normal Family of Distributions
    two parameters: mean and variance
Confidence Intervals and Hypothesis Testing
    one-sample problem for means
    one-sample problem for proportions
    matched-sample problem for means
    matched-sample problem for proportions
    two-sample problem for means
    two-sample problem for proportions
    General: p-values; power
    Minitab: TINT, TTESt, TWOSample
Regression
    Basics of simple and multiple regression
    Minitab: REGRess, STEPwise, BESTregression
Categorical Data
    Two-Way Frequency (Count) Tables (Contingency Tables)
    Test of Equality of Row Distributions
    Minitab: TABLe, CHISquare


EXERCISES
 
        PART 1.  COMPUTATIONS FOR A SAMPLE
 
             For a sample of  n = 3  observations,  the sum is  6  and the
        sum of squares is  50.
 
        1.  The sum of squared deviations (from the sample mean) is  ?
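
        A minimal sketch of the computation, assuming the usual shortcut
        identity  sum (x - xbar)^2 = sum x^2 - (sum x)^2 / n.  The variable
        names are illustrative and not part of the course materials.

```python
# Sum of squared deviations from summary totals (data of Exercise 1).
n = 3           # sample size
total = 6       # sum of the observations
total_sq = 50   # sum of the squared observations

# Shortcut identity: sum (x - xbar)^2 = sum x^2 - (sum x)^2 / n
ss_dev = total_sq - total**2 / n
print(ss_dev)  # 38.0
```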
 
 
 
 
        --------------------------------------------------------------------
 

 
 
        PART 2.  CALCULATIONS FROM A SAMPLE
 
        2.  A sample of three observations has a sum of 6 and a sum of
        squares equal to 50.  One observation is -3.  What are the other two
        observations?
 
 
 
 
 
 
 
 
 
 
        3.  A sample of 10 observations has a mean of 100.  The sum of 9 of
        the observations is 900.  What is the value of the other observation?
 
 
 
 
        4.  A sample of n = 14 has a standard deviation of 3.1.  What is the
        sum of squared deviations?
 
 
 
 
        5.  Consider the following table.
 
            Distribution of Number of Magazine Subscriptions in Households
 
               -----------------------------------------------
               Number of subscriptions     0     1     2     3
               Number of households       10    40    30     f
               -----------------------------------------------
 
        Find the value of  f  such that the mean number of subscriptions
        per household is 2.0.   f  =  ?
 
 
 
 
        ----------------------------------------------------------
 

 
 
        PART 3.  PROBABILITY:  EQUALLY LIKELY CASES
 
              A custodian is asked to rank four brands (A, B, C, D) of
        common household cleanser according to his preference, number 1 being
        the cleanser he prefers most, and so on.  Suppose the custodian
        really has no preference among the four brands and hence all orders
        are equally likely to occur.
 
        6.  What is the probability that C is first and D is third in the
        ranking?
 
 
 
 
        7.  What is the probability that A is ranked either second or third?
 
 
 
 
        ---------------------------------------------------------------------
 
        PART 4.  PROBABILITY:  COMPOUND EVENTS
 
        8.  A state highway department has contracted for the delivery of
        sand, gravel, and cement at a construction site.  Due to other work
        commitments and labor force problems, contracting firms cannot always
        deliver items on the agreed delivery date.  Based on past evidence,
        the probabilities that sand, gravel, and cement will be delivered on
        the promised delivery dates by the contracting firms are .3, .6 and
        .8, respectively.  Assume that the delivery or nondelivery of one
        material is independent of another.
 
              Find the probability that all three materials will be delivered
        on time.
 
 
 
 

 
 
        PART 5.  EXPECTED VALUE OF A RANDOM VARIABLE
 
        9.  Consider the following probability distribution of a random
        variable  x.
 
               -------------------------------------
                     v             -3     5     10
               P(x = v)            .2     p    .8-p
               -------------------------------------
 
        What is the value of  p  so that the expected value of  x  is 5.0?
 
 
 
 
        ----------------------------------------------------------
 
        PART 6.   DISCRETE DISTRIBUTIONS
 
        10.  Often  the  values  1,  2, 3, 4 and 5 are assigned to categories
        such as "Strongly Disagree," "Disagree Somewhat,"  "Neutral,"  "Agree
        Somewhat,"  "Strongly  Agree."  This  question  has to do with such a
        "scaling."
 
              Of all random variables taking the values 1, 2, 3,  4,  5,  the
        one  with   P(x=1)  =  1/2   and   P(x=5)  = 1/2 has maximum standard
        deviation.  What is the value of this standard deviation?
 
 
 
 
 
        ----------------------------------------------------------
 
        PART 7.  BINOMIAL DISTRIBUTIONS
 
        11.  Suppose a binomial distribution has a mean of 6 and  a  variance
        of  3.   Then  what  are  the values of the parameters n and p of the
        distribution?
 
 
 
 
 
 

 
        PART 8.  SAMPLING FROM A FINITE POPULATION
 
        12.  A sample of n = 400 is to be drawn (without replacement) from a
        population of N = 2000 with a standard deviation of $4000.  What is
        the standard deviation of the sample mean?
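
        One way to set up Exercise 12, sketched under the assumption that
        the standard finite population correction  sqrt((N - n)/(N - 1))
        is the intended method.  Variable names are illustrative.

```python
import math

N = 2000        # population size
n = 400         # sample size
sigma = 4000.0  # population standard deviation, in dollars

# Sampling without replacement: multiply sigma / sqrt(n) by the
# finite population correction sqrt((N - n) / (N - 1)).
fpc = math.sqrt((N - n) / (N - 1))
sd_mean = sigma / math.sqrt(n) * fpc
print(round(sd_mean, 2))
```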
 
 
 
 
        ---------------------------------------------------------------------
 
        PART 9.  NORMAL DISTRIBUTIONS:  PERCENTILES
 
        13.  What is the 95th percentile of the standard normal
        distribution?
 
 
        14.  What is the 75th percentile of the standard normal
        distribution?
 
 
        ----------------------------------------------------------
 
        PART 10.  RANDOM SAMPLING FOR MEASURED CHARACTERISTICS; THE NORMAL
                 DISTRIBUTION
 
             In quality control, samples are selected from a production line
        and various quality characteristics are measured
        in order to check that the process is "in control."
        Suppose that a bottling process is intended to fill bottles
        with, on average, 21 fluid ounces of beverage.  Variation around
        this mean follows the normal distribution with a standard deviation
        of 0.5 fluid ounces.
 
        15.  If a technician samples 25 bottles (when the process is
        "in control") and measures the amount of
        beverage in each, what is the probability
        that the sample average (for the 25 bottles) will exceed 21.2
        fluid ounces?
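
        A sketch of Exercise 15 using Python's standard-library NormalDist
        in place of the Minitab CDF command.  It relies on the sample mean
        of n normal observations being normal with standard deviation
        sigma / sqrt(n); the names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 21.0, 0.5, 25

# Sampling distribution of the mean: Normal(mu, sigma / sqrt(n))
se = sigma / math.sqrt(n)
sampling_dist = NormalDist(mu, se)

# Upper-tail probability that the sample average exceeds 21.2
p_exceed = 1 - sampling_dist.cdf(21.2)
print(round(p_exceed, 4))
```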
 
 
 
 

 
 
        PART 11.  NORMAL APPROXIMATION TO THE BINOMIAL
 
             Statistics released by the National Highway Traffic Safety
        Administration and the National Safety Council
        show that on an average weekend night,  1  out of
        every 10 drivers on the road is drunk.  If 400 drivers are
        randomly checked next Saturday night, what is the probability
        that the number of drunk drivers will be
 
        16.  More than 49?
 
 
 
 
        17.  Exactly 40?
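
        A sketch for Exercises 16 and 17.  It assumes the normal
        approximation is applied with a continuity correction of 0.5,
        which is one common convention; the exercise text does not say
        whether the correction is expected.

```python
import math
from statistics import NormalDist

n, p = 400, 0.1
mean = n * p                      # binomial mean
sd = math.sqrt(n * p * (1 - p))   # binomial standard deviation

approx = NormalDist(mean, sd)

# "More than 49" means 50 or more, so start the tail at 49.5.
p_more_than_49 = 1 - approx.cdf(49.5)

# "Exactly 40" is approximated by the area between 39.5 and 40.5.
p_exactly_40 = approx.cdf(40.5) - approx.cdf(39.5)

print(round(p_more_than_49, 4), round(p_exactly_40, 4))
```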
 
 
 
 
        ---------------------------------------------------------------------
 
        PART 12.  INTERVAL ESTIMATION OF A BINOMIAL PROBABILITY
 
             In a random sample of n = 625 persons,   300 favor Paul Parrot
        for President.
 
        18.  What is the estimate of the  standard deviation  of the
        sampling distribution  of the sample proportion ("p-hat")?
 
 
        ---------------------------------------------------------------------
 
        PART 13.  SAMPLE SIZE DETERMINATION FOR A CONFIDENCE INTERVAL
 
             Suppose that GMAT scores have a known standard deviation of 100.
        A sample of scores of UIC MBA students is to be taken to estimate the
        mean in that population.  Determine how large a sample is required to
        form a 95% confidence interval with a margin of error (half-width) of
        25 points.
 
        19.  The required sample size is about  ?
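
        A sketch of the sample-size calculation, assuming the usual formula
        n = (z * sigma / E)^2  rounded up, with z the upper 2.5% point of
        the standard normal.  Names are illustrative.

```python
import math
from statistics import NormalDist

sigma = 100.0   # known standard deviation of GMAT scores
margin = 25.0   # desired half-width of the interval
conf = 0.95     # confidence level

# Two-sided critical value, about 1.96 for 95% confidence
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)

n_exact = (z * sigma / margin) ** 2
n_required = math.ceil(n_exact)   # round up to a whole observation
print(n_required)  # 62
```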
 
 
 
        ---------------------------------------------------------------------
 

 
        PART 14.  OBSERVED SIGNIFICANCE LEVEL:  p-VALUE
 
        20.  In  a  test  of  the  null hypothesis that the mean is 18 versus
        alternatives that the mean is greater than 18, a sample of  n  =  100
        observations  gave  a  mean  of  19.5 and standard deviation of 6.00.
        What is the p-value (i.e., the achieved level of significance)?
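
        A sketch of the p-value calculation.  It assumes that, with
        n = 100, the test statistic is referred to the standard normal
        distribution; using t with 99 degrees of freedom would give a
        slightly larger value.

```python
import math
from statistics import NormalDist

mu0 = 18.0     # hypothesized mean
xbar = 19.5    # sample mean
s = 6.0        # sample standard deviation
n = 100

# One-sided test: large values of z favor the alternative (mean > 18).
z = (xbar - mu0) / (s / math.sqrt(n))
p_value = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_value, 4))
```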
 
 
 
 
        ---------------------------------------------------------------------
 
        PART 15.  CONFIDENCE INTERVAL FOR A DIFFERENCE BETWEEN TWO MEANS
 
             Lifetimes of two types  of  batteries  were  compared.   Summary
        statistics  are given in the table.  Note that the means are given in
        hours and the standard deviations are given in minutes.
 
        TABLE.  Statistics from samples of battery lifetimes
 
                                             standard
                                 n    mean  deviation
                  -----------------------------------
                  Battery A     64   7 hr      30 min
                  Battery B    100   6 hr      15 min
                  -----------------------------------
 
        In  what  follows,  do not pool the variances since, judging from the
        ratio of 2 between the two sample  standard  deviations,  it  appears
        that   the   population  variances  differ.   Also,  use  the  normal
        distribution (rather than t) due to the large sample sizes.
 
        21.  What is  the  estimate  of  the   standard  deviation    of  the
        difference between sample means?
 
 
 
        22.  What is the 95% confidence interval for the difference
        between means?
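
        A sketch for Exercises 21 and 22.  Everything is converted to
        minutes first, since the table mixes hours and minutes.  As the
        instructions above direct, the variances are not pooled and the
        normal critical value 1.96 is used.  Names are illustrative.

```python
import math

# Summary statistics, all in minutes
n_a, mean_a, sd_a = 64, 7 * 60, 30
n_b, mean_b, sd_b = 100, 6 * 60, 15

# Unpooled standard error of the difference between sample means
se_diff = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)

diff = mean_a - mean_b
z = 1.96   # 95% two-sided normal critical value
lower = diff - z * se_diff
upper = diff + z * se_diff
print(round(se_diff, 2), round(lower, 1), round(upper, 1))
```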
 
 
 
 
        ---------------------------------------------------------------------
 
        PART 16.  CHI-SQUARE GOODNESS OF FIT TEST
 
        23.  The number of accidents per day was recorded in a plant for
        100 days.  The data are tabulated below.
        Compute the value of chi-square for testing the hypothesis
        that the following data came from the distribution (.4, .3, .2, .1)
        over the values 0, 1, 2, 3 or more.
 

 
 
        ---------------------------------------------------------------------
        Number of accidents:                                    0   1   2  >2
        Number of days on which this many accidents occurred:  45  25  20  10
        ---------------------------------------------------------------------
        Hypothesized prob. of this no. of accidents            .4  .3  .2  .1
        ---------------------------------------------------------------------
 
        The value of the chi-square test statistic is  ?
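
        A sketch of the goodness-of-fit statistic, the sum over categories
        of (observed - expected)^2 / expected, where each expected count is
        100 times the hypothesized probability.  Names are illustrative.

```python
observed = [45, 25, 20, 10]       # days with 0, 1, 2, >2 accidents
probs = [0.4, 0.3, 0.2, 0.1]      # hypothesized probabilities
n_days = sum(observed)            # 100

# Expected counts under the hypothesized distribution
expected = [n_days * p for p in probs]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 3))
```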
 
 
 
 
 
        PART 17.   CONTINGENCY TABLES
 
                                    OWN A CAR?
                                |  Yes       No  |
                     ___________|________________|_____
                                |                |
                     Full time  |   16        1  |   17
        EMPLOYMENT   Part time  |   68       15  |   83
                     None       |   50       19  |   69
                     ___________|________________|_____
                                |                |
                                |  134       35  |  169
 
        24.  Of all 169 people, what percentage are employed full time and
        own a car?
 
 
 
        25.  Of those who are employed full time, what percentage own a car?
 
 
 
        26.  Of those who own a car, what percentage are employed full time?
 
 
 
        27.  What is the number of degrees of freedom associated with the
        chi-square test statistic for this table?
 
 
 

 
 
        28.  Compute the value of the chi-square test statistic.
        (Answer:  4.586)
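
        The stated answer can be checked with a short sketch, assuming the
        usual expected counts  (row total)(column total)/(grand total).
        Names are illustrative.

```python
# Rows: Full time, Part time, None.  Columns: Yes, No.
table = [[16, 1],
         [68, 15],
         [50, 19]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        # Expected count if employment and car ownership were unrelated
        exp = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - exp) ** 2 / exp

print(round(chi_sq, 3))  # 4.586
```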
 
 
 
 
 
        29. (continuation)  Find the corresponding p-value.
 
 
 
 
 
        ---------------------------------------------------------------------
 
        PART 18.  REGRESSION
 
             Suppose  that  for  speeds  between  5  and 95 mph the miles per
        gallon (G) and speed (S) are  related  according  to  the  regression
        equation
 
                     Y   =   7.216  -  1.073 X
 
        where  Y = ln G  and  X = ln S.
 
        30.  If  the speed (S) is 65, then the predicted gasoline mileage (G)
        is  ?
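
        A sketch of the back-transformation for Exercise 30.  Since
        Y = ln G, the prediction on the original scale is taken here as
        G = exp(Y); no correction for retransformation bias is applied,
        which is an assumption of this sketch.

```python
import math

intercept, slope = 7.216, -1.073   # coefficients of Y = ln G on X = ln S
speed = 65.0                       # S, in mph

x = math.log(speed)                # X = ln S
y_hat = intercept + slope * x      # predicted ln G
mpg_hat = math.exp(y_hat)          # back to miles per gallon
print(round(mpg_hat, 1))
```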
 

Created   25 April 1996       Updated 6 Jan 2001