Many of you have no doubt heard the term “Sampling Error” and have asked why there are always “errors” in marketing research. What is sampling error? How can we conduct research that is error free?
Well, the answer is that you can’t. When you use a sample of the population to measure characteristics of the entire population you actually create sampling error. But, that doesn’t mean that you’re doing anything wrong!
Sampling error is simply the difference between the result you get from a sample and the result you would get by measuring the entire population (known as a census). So unless you’re going to pay for a census, which means many more completed surveys than you need, you’re going to have sampling error.
A simple example will illustrate. Let’s say you had a group of 20 men and wanted to determine the average height of the group. With only 20 men, it is easy to measure each man and calculate an exact average height, as follows:
Because you measured the entire population, you have no sampling error. Now let’s take a sample of every other man. What will the average of this sample of 10 be? As you can see below, with a sample of 10 you get close to the actual average, but are 0.4 inches off. This is the impact of sampling error in your sample.
What if you took only every fourth man, for a sample of 5? With the smaller sample, the amount of error increases to 2.4 inches.
This, of course, is an oversimplified illustration. In the real world we are dealing with populations in the hundreds of thousands or millions, where it is impossible or just too expensive to do a census. Sampling error can never be eliminated entirely, but using statistical theory, it can be reduced and measured. The most direct way of reducing sampling error is to increase the sample size. As seen in our simple example, the error for the sample of 10 men was lower than for the sample of 5 men.
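To see that this pattern is not a fluke of one particular sample, here is a minimal Python sketch. The 20 heights are made-up values chosen only for illustration (they are not the figures from the example above), and `typical_error` is a hypothetical helper name. Rather than picking every other man, it checks every possible sample of 5 and of 10 and reports how far a sample average typically lands from the true average.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Hypothetical heights in inches for a group of 20 men (made-up values,
# not the figures from the original example).
heights = [64, 65, 66, 67, 67, 68, 68, 69, 69, 70,
           70, 70, 71, 71, 72, 72, 73, 74, 75, 77]

true_mean = mean(heights)  # the "census" answer: no sampling error


def typical_error(sample_size):
    """Root-mean-square gap between the sample average and the true
    average, taken over every possible sample of the given size."""
    squared_gaps = [
        (sum(sample) / sample_size - true_mean) ** 2
        for sample in combinations(heights, sample_size)
    ]
    return sqrt(sum(squared_gaps) / len(squared_gaps))


error_5 = typical_error(5)
error_10 = typical_error(10)

print(f"True average height: {true_mean:.2f} inches")
print(f"Typical error with a sample of 5:  {error_5:.2f} inches")
print(f"Typical error with a sample of 10: {error_10:.2f} inches")
```

Any single sample of 10 can still be unlucky, but across all possible samples the larger sample lands closer to the truth, and a "sample" of all 20 men has no error at all.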
Statistically, sampling error is measured as a margin of error. The margin of error is a statistic expressing the amount of random sampling error in a survey’s results. It asserts a likelihood (not a certainty) that the result from a sample is close to the number one would get with a census, if the whole population had been queried. The likelihood of a result being “within the margin of error” is itself a probability, commonly 95%. The larger the margin of error, the less confidence one should have that the poll’s reported results are close to the true figures; that is, the figures for the whole population.
There is a statistical formula to calculate the margin of error. But, thanks to the Internet, there are a number of websites that provide free, easy-to-use calculators to determine either the sample size needed for a particular margin of error or the margin of error inherent in a particular sample size. You can find these online sample size calculators here or here.
For example, if you select a sample of 400 people, your margin of error for a survey percentage would be +/- 4.9 percentage points with 95% confidence. This means that if you drew a very large number of samples of 400 from the population, your estimated percentage would fall within +/- 4.9 points of the true population figure about 95% of the time. Decreasing the sample size to 300 would result in a margin of error of 5.66%, and increasing the sample to 500 would decrease the margin of error to 4.38%.
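If you would rather check these numbers yourself than rely on an online calculator, here is a minimal Python sketch. It assumes a simple random sample from a very large population, uses the common worst-case assumption that the true percentage is 50%, and uses 1.96 as the z-score for 95% confidence. The function names `margin_of_error` and `sample_size_for` are made up for this illustration.

```python
from math import ceil, sqrt

Z_95 = 1.96  # z-score for 95% confidence (assumed confidence level)


def margin_of_error(sample_size, proportion=0.5, z=Z_95):
    """Margin of error for a percentage, as a fraction (0.049 = 4.9 points).

    Assumes a simple random sample from a very large population.
    proportion=0.5 is the worst case and gives the widest margin.
    """
    return z * sqrt(proportion * (1 - proportion) / sample_size)


def sample_size_for(margin, proportion=0.5, z=Z_95):
    """Smallest sample size whose margin of error is at most `margin`."""
    return ceil(z ** 2 * proportion * (1 - proportion) / margin ** 2)


for n in (300, 400, 500):
    print(f"n = {n}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

Under these assumptions the sketch reproduces the 5.66, 4.9 and 4.38 figures quoted above. Going the other way, `sample_size_for(0.05)` gives the number of completed surveys needed for a margin of +/- 5 points.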
So, the art of selecting a sample size is simply balancing the cost of additional sample against the size of the sampling error to achieve an acceptable margin of error. There are other sources of error in addition to sampling error, but let’s hold those for future blogs!