Mass Communication Research
Sampling

Population and Sample

Often in scientific research our goal is to describe a population. A population can be defined as the complete set of subjects, variables, or concepts under consideration. If we examine every member of a population, we call that a census. Like the US Census, this means counting or examining every member of a population. Often we do not have the time or resources to do an entire census, so we take a sample. A sample is a subset of a population that is representative of the entire population. If it's not representative, it's not generalizable to the larger population.

The accuracy of systematic sampling depends on the adequacy of the sampling frame, or complete list of the members of a population. The phone book is an inadequate sampling frame for phone numbers, for example, because not everyone in the population is listed. A better sampling frame would be all known working telephone numbers in a block of all numbers in the area under study.

Types of Samples

Non-Probability Samples (not random or mathematical)

  1. Intercept (mall or in the street), not really a sample;
  2. Available or convenience, subject to error;
  3. Volunteers may differ from the general population and produce erroneous results;
  4. Purposive or quota samples are not generalizable.

Probability Samples

  1. Simple random: Every subject, unit or person in a population has an equal chance of being selected;

    • Random without replacement (the most common) means a selected subject is removed from the pool and cannot be selected again;
    • Random with replacement means a selected subject is returned to the pool and may be chosen again on a later draw.

  2. Systematic (fixed interval) sampling involves picking every nth subject, unit or person based on the sample size and starting point. (Widely used in survey research);
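  3. Stratified sampling is used to get adequate representation of a subsample: the population is divided into subgroups (strata) and subjects are drawn from each one;
  4. Cluster sampling is used when there is no way to get a good sampling frame, or population list. The population is subdivided into groups (clusters), and the groups, rather than individuals, are sampled;
  5. Multistage sampling is sometimes used on national surveys, where sampling proceeds in successive stages until individual households or persons, instead of groups of numbers, are selected.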
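
The probability samples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 1,000 numbered subjects, the sample size of 50, and the "urban"/"rural" strata are all hypothetical and do not come from any real study.

```python
import random

# Hypothetical sampling frame: 1,000 numbered subjects (IDs 1..1000).
frame = list(range(1, 1001))
sample_size = 50
rng = random.Random(42)  # fixed seed so the draws are repeatable

# Simple random sample WITHOUT replacement: no subject can appear twice.
without_replacement = rng.sample(frame, sample_size)

# Simple random sample WITH replacement: every draw is made from the full
# frame, so the same subject may be selected more than once.
with_replacement = rng.choices(frame, k=sample_size)

# Systematic (fixed-interval) sample: choose a random starting point inside
# the first interval, then take every nth subject after it.
interval = len(frame) // sample_size   # n = 1000 // 50 = 20
start = rng.randrange(interval)        # random start index, 0..19
systematic = frame[start::interval]

# Stratified sample: draw separately inside each stratum so that a small
# subgroup still gets adequate representation. The strata are made up.
strata = {"urban": frame[:800], "rural": frame[800:]}
per_stratum = 25
stratified = {
    name: rng.sample(members, per_stratum)
    for name, members in strata.items()
}

print(len(without_replacement), len(with_replacement), len(systematic))
```

Note that the systematic sample is only as good as the ordering of the frame: if the list has a repeating pattern that matches the interval, the sample will be biased.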

Sample Size

Researchers should always use as large a sample as possible. Sample size calculators are available on the Internet. How large a sample is needed depends on a few general principles:
  1. Research method: Focus groups use smaller samples, but the results are not meant to be generalizable. Samples with 10-50 subjects are used in pretesting measurement instruments and pilot studies.
  2. Subsamples often consist of 50, 75, or 100 subjects.
  3. Cost and time constraints. The sample size usually conforms to a project's budget.
  4. Larger samples are required for multivariate studies because they analyze multiple response data.
  5. Researchers should select a larger sample than required for panel studies, focus groups, and other prerecruit projects. Expected mortality (subjects who drop out) for such projects is between 10 and 25 percent, and 50 percent mortality is not uncommon.
  6. A literature review will indicate the sample size of similar studies. Representative samples of 400 have been used in many studies with reliable results.
  7. A large unrepresentative sample is as meaningless as a small unrepresentative sample.
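
To see what the mortality figures in principle 5 mean for recruiting, here is a minimal sketch. The function name `recruits_needed` and the target of 100 completed subjects are hypothetical, and the calculation assumes dropout is the only loss and happens at a known, constant rate.

```python
import math

def recruits_needed(completed_target: int, mortality_rate: float) -> int:
    """Subjects to recruit so that about `completed_target` remain after a
    fraction `mortality_rate` of them drops out (illustrative helper)."""
    if not 0 <= mortality_rate < 1:
        raise ValueError("mortality_rate must be at least 0 and less than 1")
    # Divide by the share expected to stay, then round up to a whole person.
    return math.ceil(completed_target / (1 - mortality_rate))

# Recruiting targets for 100 completed subjects at the rates named above.
for rate in (0.10, 0.25, 0.50):
    print(f"{rate:.0%} mortality: recruit {recruits_needed(100, rate)}")
```

With 50 percent mortality the recruiting target doubles, which is why mortality belongs in the budget discussion from principle 3.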

All research involves error. Sampling error is the error that arises because we measure a sample rather than the whole population; it depends on the variance of the variable being measured and is closely tied to sample size. The error figure improves as the sample size is increased, but in decreasing increments. Decreasing a sample size from 1,000 to 400, for example, will increase the sampling error by only a small amount (roughly two percentage points at the 95 percent confidence level). Many sampling error calculators are available online, including one from Wimmer-Hudson Research & Development.
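
The diminishing returns from larger samples can be checked with a short calculation. This sketch uses the standard margin-of-error formula for a proportion, z * sqrt(p(1-p)/n), which is not given in this note; it assumes a simple random sample from a large population, the worst case of p = 0.5, and z = 1.96 for the 95 percent confidence level. The function name `sampling_error` is hypothetical.

```python
import math

def sampling_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for a proportion from a simple random sample.

    Assumptions: large population, p = 0.5 as the most conservative case,
    z = 1.96 for the 95 percent confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Each increase in n buys a smaller reduction in error than the last.
for n in (100, 400, 1000, 2000):
    print(f"n = {n:>4}: +/- {sampling_error(n) * 100:.1f} percentage points")
```

Under these assumptions a sample of 400 gives about plus or minus 4.9 points and a sample of 1,000 about plus or minus 3.1 points, so more than doubling the sample gains less than two points of precision.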

So, how close do you want to be? The confidence interval is the plus-or-minus figure usually reported with poll results. We most often use plus or minus 5 percent, although it's not unusual to get that down to plus or minus 3 percent if the sample is large enough and you are confident of the list's accuracy. For example, if you use a confidence interval of +/- 3 percent, and 47 percent of the sample chooses an answer, you could be "sure" that if you'd asked the question of the entire relevant population, between 44 percent (47-3) and 50 percent (47+3) would have chosen that answer.
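
The arithmetic in the example can be written out as a small helper. The function name `confidence_interval` is hypothetical; the only addition to the example is that the range is kept between 0 and 100 percent.

```python
def confidence_interval(sample_pct: float, margin_pct: float) -> tuple[float, float]:
    """Return the (lower, upper) range around a sample percentage."""
    lower = max(0.0, sample_pct - margin_pct)    # cannot go below 0 percent
    upper = min(100.0, sample_pct + margin_pct)  # cannot go above 100 percent
    return lower, upper

# The example from the text: 47 percent of the sample, +/- 3 percent.
low, high = confidence_interval(47, 3)
print(f"Between {low:g} and {high:g} percent of the population")
```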

How sure do you want to be? The confidence level is expressed as a percentage and represents how often the true percentage of the population who would pick an answer lies within the confidence interval. We most often use 95%. Any less may be too unreliable, and getting it higher is often too difficult and not worth the cost or time.
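
The two questions, how close and how sure, together determine the sample size. Here is a rough sketch of what an online sample size calculator might do; it is not the method of any particular calculator. It uses the standard formula n = z² p(1-p) / e², and assumes a simple random sample, a large population, and the worst case of p = 0.5. The function name `required_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(margin: float, confidence: float = 0.95,
                         p: float = 0.5) -> int:
    """Smallest simple random sample that reaches the requested margin.

    margin     -- confidence interval as a proportion (0.05 = +/- 5 percent)
    confidence -- confidence level as a proportion (0.95 = 95 percent)
    p          -- assumed population proportion (0.5 is the worst case)
    """
    # Two-sided z value for the confidence level (about 1.96 for 95 percent).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

print(required_sample_size(0.05))        # +/- 5 percent at 95 percent
print(required_sample_size(0.03))        # +/- 3 percent at 95 percent
print(required_sample_size(0.05, 0.99))  # +/- 5 percent at 99 percent
```

Under these assumptions, plus or minus 5 percent at the 95 percent level needs 385 subjects, which is close to the samples of 400 mentioned earlier, while plus or minus 3 percent needs 1,068.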

If you don't understand something in this Web note, please e-mail Dr. Sitton.

©M. Mark Miller & Ronald W. Sitton 2009
Revised 092811 — http://www.uamont.edu/FacultyWeb/sitton/crz/mrea/sampling.html