I. Statistics
What you need to know: Statistics provide valid and reliable results only when the data collection and research methods follow established scientific procedures.
The term distribution refers to a collection of numbers. A frequency distribution is a collection of scores ordered by magnitude, each paired with its frequency, meaning how often that score occurs. Think of the frequency of something's magnitude. How about UT's national standing in football? What about its academic standing? What if the measure is financing education, as in funding levels compared to other states and universities? Where do we stand? What about the distribution of a population on a public opinion survey? See how we transform data into proportions or percentages? It helps us make sense of the world and make predictions.
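To make the idea concrete, here is a minimal Python sketch of a frequency distribution. The survey responses and the function name frequency_distribution are made up for illustration; they are not course data.

```python
from collections import Counter

# Hypothetical survey responses on a 1-5 agreement scale (made-up data).
responses = [3, 5, 4, 3, 2, 3, 4, 5, 3, 1, 4, 3]

def frequency_distribution(scores):
    """Return (score, frequency, percentage) rows ordered by magnitude."""
    counts = Counter(scores)
    total = len(scores)
    return [(score, counts[score], 100 * counts[score] / total)
            for score in sorted(counts)]

# Print each score with its frequency and its share of all responses.
for score, freq, pct in frequency_distribution(responses):
    print(f"score {score}: frequency {freq}, {pct:.1f}%")
```

The percentage column is the "transform data into proportions or percentages" step: each frequency is divided by the total number of scores.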
By a similar token, the median wouldn't tell us much of anything in nominal data. Using gender again, there are only two answers, 1 for males and 2 for females. So what's the middle score? If we had 35 respondents and 17 were males and 18 were females, the middle score (the 18th of the 35 when they are sorted) would be a 2. But in actuality, this is just telling us there are more females than males in that sample, which is what the mode already tells us, since the mode is the score occurring most often in a distribution.
Median: The midpoint of a distribution (half the scores lie above it, half below it). Just look at the ordered list. What's in the middle? The median is appropriate for variables measured at the ordinal level or above.
Mean: The average of a set of scores. The sum of all the scores in a list divided by the number of scores in the list. The mean is an appropriate statistic for variables measured at the interval and ratio levels, but not for ordinal or nominal measures. And remember the mean is sensitive to outliers, or extreme scores.
Can you think of an example in public opinion on public affairs where the extreme might skew the sample? What if a large block of religious-right voters refused to answer or tell the truth on public opinion polls but showed up at voting booths in large numbers? The polls might be wrong because the sample was skewed. We consider this a subversion of the research process, and it is not good for the measures of our democracy.
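The gender example can be checked with a short Python sketch using the standard library's statistics module. The gender list reproduces the 17 and 18 respondents from the text; the incomes list is a hypothetical set of scores added only to show how one extreme score pulls the mean while the median barely moves.

```python
import statistics

# The sample from the text: 17 males coded 1 and 18 females coded 2.
gender = [1] * 17 + [2] * 18

print(statistics.mode(gender))    # 2: the most frequent score
print(statistics.median(gender))  # 2: the 18th of 35 sorted scores

# Hypothetical ratio-level scores (made-up data, in thousands of dollars).
incomes = [30, 32, 35, 38, 40]
with_outlier = incomes + [500]  # add one extreme score

# Compare how the mean and the median respond to the outlier.
print(f"mean {statistics.mean(incomes)}, median {statistics.median(incomes)}")
print(f"mean {statistics.mean(with_outlier)}, "
      f"median {statistics.median(with_outlier)}")
```

Notice that for the nominal gender codes the median and the mode give the same answer, which is why the median adds nothing there.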
Variance: A mathematical index of the degree to which scores deviate from, or are at variance with, the mean. A small variance indicates that most of the scores in the distribution lie fairly close to the mean; a large variance represents widely scattered scores. Valuable in analyzing multiple populations with ANOVA.
Standard deviation: The square root of the variance, a measure of how far scores typically spread about the mean. In a normal distribution, approximately 68 percent of all cases fall within one standard deviation above and below the mean. Approximately 95 percent of all cases fall within two standard deviations of the mean, and nearly all cases (about 99.7 percent) fall within three.
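As a rough check on these definitions, the following Python sketch computes the variance and standard deviation by hand and then counts how many simulated scores fall within one, two, and three standard deviations of the mean. Several things here are assumptions for illustration: the scores are randomly generated from a normal distribution (mean 100, standard deviation 15), the helper names are made up, and the variance divides by N (the population form). Statistical packages often report the sample form, which divides by N - 1.

```python
import math
import random

def variance(scores):
    """Population variance: the mean of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """The square root of the variance."""
    return math.sqrt(variance(scores))

def share_within(scores, k):
    """Proportion of scores lying within k standard deviations of the mean."""
    mean = sum(scores) / len(scores)
    sd = standard_deviation(scores)
    inside = sum(1 for x in scores if abs(x - mean) <= k * sd)
    return inside / len(scores)

# Simulated, normally distributed scores (assumed data, not real results).
random.seed(0)
scores = [random.gauss(100, 15) for _ in range(10_000)]

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(scores, k):.3f}")
```

With normally distributed data the three proportions should land close to 0.68, 0.95, and 0.997. Data that are not normally distributed need not follow this pattern.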
Some of the inferential statistical analyses we'll cover in this course include nonparametric procedures such as the chi-square "goodness-of-fit" test and Contingency Table Analysis, and the parametric procedures of the t-test, Analysis of Variance (ANOVA) and Correlation. Not so long ago, statistics were calculated by hand. Today, we have powerful computers to do the calculations, thus speeding up the research process. Though computers can do the calculations amazingly fast, there is still the possibility of human error while entering the data, or as my chemistry teacher used to say, "The problem's not with the calculator, but the calculator operator." We'll use SPSS for the majority of our work, but there are many statistical sites online, including:
Revised 092811 — http://www.uamont.edu/FacultyWeb/sitton/crz/mrea/statintro.html