University of Illinois at Chicago
College of Business Administration
Fall Semester, 1997
MBA 503 (Statistics Module for the MBA Program):
Data Analysis for Managerial Decisions
Instructor: Prof. Stan Sclove
Textbook: Levine, Berenson & Stephan
NOTES TO ACCOMPANY LBS CHAPTER 1: INTRODUCTION AND DATA COLLECTION
These notes Copyright © 1998 Stanley Louis Sclove
"Statistics" (singular) is the body of techniques for dealing with "statistics" plural. Each technique has its own special capabilities and unique instances of applicability. The family of techniques can assist in the accomplishment of description, explanation, prediction and control.
Commonly, several variables are observed for each individual in a sample. The result is often called a "data set." It is customary to envision a data set as composed of rows and columns. Each row pertains to one observation, such as one person or one completed questionnaire in a large survey. Each column pertains to one variable, such as a response or an observed characteristic recorded for each person.
Rows: records; individuals; cases; respondents; subjects; patients; etc.
Columns: fields; variables; characteristics; responses; etc.
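The row-and-column layout can be sketched in a few lines of Python. This is only an illustration: the variable names and the four records are made up for the example and do not come from the textbook.

```python
# Hypothetical survey data: each row is one respondent, each column one variable.
columns = ["age", "income_band", "satisfaction"]
rows = [
    [34, "medium", 4],
    [51, "high", 2],
    [27, "low", 5],
    [45, "medium", 3],
]

n_rows = len(rows)          # number of individuals (records, cases)
n_cols = len(columns)       # number of variables (fields)
n_cells = n_rows * n_cols   # total cells of data in the table

# One row read as a record: variable name -> value for the first respondent.
first_record = dict(zip(columns, rows[0]))

# One column read as a variable: the same characteristic for every respondent.
satisfaction = [row[columns.index("satisfaction")] for row in rows]

print(n_rows, n_cols, n_cells)
print(first_record)
print(satisfaction)
```

Reading across a row gives everything known about one individual; reading down a column gives one variable for all individuals.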
Data sets can be immense; a single study may have a sample of 1,000 or more respondents, each of whom answers 100 questions. Here the data set would be 1,000 by 100, or 100,000 cells of data. The need for summarization is evident. Though such a data set is very large, cluster analysis might show that the individuals in the sample fall into only five clusters, or types. Multiple regression might identify six significant predictor variables from among the many variables represented.
Metric: quantitative/numerical
Nonmetric: qualitative/categorical
Likert scales: Very low/low/medium/high/very high scaled as 1,2,3,4,5
Semantic differential: agree strongly/agree somewhat/neutral/disagree somewhat/disagree strongly, scaled as 1,2,3,4,5
Summated rating scale: a variable defined by summing over several Likert or semantic-differential variables presumed to measure the same thing
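As a minimal sketch of a summated rating scale, the Python below codes the five Likert labels as 1 through 5 and sums three items per respondent. The respondent IDs, the three-item scale, and the function name summated_score are assumptions made for illustration.

```python
# Numeric coding of the Likert labels, as in the list above.
LIKERT_CODES = {"very low": 1, "low": 2, "medium": 3, "high": 4, "very high": 5}

def summated_score(responses):
    """Sum the numeric codes of several Likert items for one respondent."""
    return sum(LIKERT_CODES[r] for r in responses)

# Hypothetical answers to three items presumed to measure the same thing.
respondents = {
    "R1": ["high", "very high", "medium"],
    "R2": ["low", "low", "very low"],
}

scores = {rid: summated_score(items) for rid, items in respondents.items()}
print(scores)  # {'R1': 12, 'R2': 5}
```

With three items coded 1 to 5, the summated score can range from 3 to 15.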
The most common type of data is a table of rows and columns, the rows corresponding to individuals; the columns, to variables. This is two-way, two-mode data. "Two-way" refers to its being in a two-way array (matrix). The two "modes" are individuals and variables.
A miles-between-cities table is an example of two-way, one-mode data. The mode, cities, is arrayed along both dimensions, rows and columns.
Three-mode data. Persons by variables by occasions-of-measurement (time) would be three-way, three-mode data.
Distances between planets, recorded by date, would be three-way, two-mode data (modes = planets and dates; ways = planets x planets x dates), as would paired comparisons made on, say, three separate occasions.
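One way to keep "ways" and "modes" straight is to label each dimension of the array with the kind of entity arrayed along it. The number of ways is then the number of dimensions, and the number of modes is the number of distinct labels. The short Python sketch below applies that counting rule to the examples above; the function name describe is an illustrative choice, not standard terminology.

```python
def describe(axes):
    """Return (ways, modes) for an array whose dimensions carry the given labels.

    ways  = number of dimensions of the array
    modes = number of distinct kinds of entity among those dimensions
    """
    return len(axes), len(set(axes))

examples = {
    "persons x variables": ("persons", "variables"),
    "cities x cities (mileage table)": ("cities", "cities"),
    "persons x variables x occasions": ("persons", "variables", "occasions"),
    "planets x planets x dates": ("planets", "planets", "dates"),
}

for name, axes in examples.items():
    ways, modes = describe(axes)
    print(f"{name}: {ways}-way, {modes}-mode")
```

The mileage table comes out as two-way, one-mode because the same label, cities, appears on both dimensions.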