One of the basic statistical concepts that we in research deal with every day is the notion of “statistically significant” differences. This – in my experience – is also one of the most misunderstood concepts among my clients. So, here is a plain English explanation of everything you need to know about “statistically significant.”
What I most often hear from clients is that they want a “statistically significant sample,” or they ask, “How many people do I need to be statistically significant?” This is the wrong question, and it comes from confusing a few different ideas with one another:
- “Statistically valid”: A sample of at least 30 records (people, households, buildings, companies, whatever your sampling unit is) that was selected from the population using some appropriate sampling procedure.
- “Statistically significant”: There is a very high probability that the difference between two measures, or between a measure and a benchmark value, is not the result of random variation, but of real differences between the items compared. The idea of statistical significance exists only when comparing two or more values.
- “Significance level”: The threshold probability to be considered “statistically significant.” Usually expressed as a percent. Commonly used significance levels in market research are 80%, 90%, and 95%.
- “Margin of error”: The margin of error defines the range around a measured value that is expected to contain the “true” value. The margin of error is related to the significance level, the sample size, the variance, and the measured value itself (for proportions). When you see a mention of a survey being “+/- 3%,” you are seeing the margin of error.
So, when a research result such as the following is reported:
“We tested the difference in our measurement of product interest for Group A and Group B using a significance level of 95% and found the differences to be statistically significant.”
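What we mean is that, if the two groups were really the same, random variation alone would produce a difference this large less than 5% of the time. In plain English: we are confident, at the 95% level, that the measured values for Group A and Group B are different because of real differences between the groups, not random variation.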
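To make this concrete, here is a minimal Python sketch of one common way such a comparison is run: a two-proportion z-test. The counts for Group A and Group B are made-up numbers for illustration, the function name is my own, and a real study might call for a different test depending on the data.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion: our best estimate if the two groups do not really differ.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Probability of a gap at least this large from random variation alone.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 124 of 200 people in Group A and 100 of 200 in Group B
# say they are interested in the product.
z, p_value = two_proportion_z_test(124, 200, 100, 200)

significance_level = 0.95  # the "95%" from the example above
is_significant = p_value < 1 - significance_level

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"Statistically significant at {significance_level:.0%}: {is_significant}")
```

With these illustrative numbers, the difference clears the 95% threshold. With much smaller groups, the same 62% versus 50% gap might not.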
When we say: “The purchase intent for the new product is 62%, with a margin of error of plus or minus 3%.”
We mean that the actual purchase intent is most likely somewhere between 59% and 65%. The significance level also comes into play here. If the significance level was 95%, then we’re saying that we are 95% confident that the actual value is between 59% and 65%: if we ran the same survey many times, about 95 out of 100 of the resulting ranges would contain the true value. For any given measurement, the higher the significance level (i.e., closer to 100%), the bigger the margin of error. This is simply saying that we can be more confident that a value falls within a larger range.
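The relationship between the significance level and the margin of error can be sketched in a few lines of Python. This uses the standard normal-approximation formula for a proportion; the sample size of 1,000 is an assumption I picked because it gives roughly the +/- 3% in the example, and the function name is my own.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(proportion, sample_size, level):
    """Margin of error for a proportion, using the normal approximation."""
    # z-score that leaves (1 - level) split evenly between the two tails.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * sqrt(proportion * (1 - proportion) / sample_size)


purchase_intent = 0.62  # the 62% from the example above
sample_size = 1000      # hypothetical; not stated in the example

# Same measurement, three commonly used significance levels.
margins = {
    level: margin_of_error(purchase_intent, sample_size, level)
    for level in (0.80, 0.90, 0.95)
}

for level, moe in margins.items():
    low, high = purchase_intent - moe, purchase_intent + moe
    print(f"{level:.0%} level: +/- {moe:.1%} (range {low:.1%} to {high:.1%})")
```

Running this shows the margin of error growing as the significance level rises, even though the measurement and the sample are unchanged.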
Returning to our original question: “How many people do I need to be statistically significant?” should really be “How many people do I need to be statistically valid?” That question always has the same answer: at least 30, selected from the population using an appropriate sampling procedure.
I hope this has helped shed some light on these important ideas in market research. For further reading, the best introduction to basic statistics I have ever read is:
The Cartoon Guide to Statistics: http://www.assoc-amazon.com/e/ir?t=wwwericksonmr-20&l=ur2&o=1 by Gonick & Smith. As the name implies, it’s a comic book. It does an incredible job of explaining many basic statistical concepts in a very easy-to-understand way.