How to approach non-significant results
A non-significant result generally means that the study was inconclusive.
A non-significant result does not mean that the phenomenon doesn't exist, that the groups are equivalent, or that the independent variable does not affect the outcome.
With null-hypothesis significance testing (NHST), a non-significant result only lets you say that you cannot reject the null hypothesis (which is typically that the effect-size is 0). You cannot use this as evidence to accept the null hypothesis: that claim requires different statistical tests (see Other Statistical Learning). You therefore cannot evaluate the truth-value of the null hypothesis at all: you can neither reject it nor accept it. You still don't know, just as you didn't know before you ran the study. Your study was inconclusive.
Not finding an effect is different from demonstrating that there is no effect.
Put another way: "absence of evidence is not evidence of absence".
To claim "the null hypothesis is true", one would need to run specific statistics (called an equivalence test) that show that the effect-size is approximately 0.
Small Sample Sizes and Power
Small samples are a major reason that studies return inconclusive results.
The underlying problem, however, is insufficient power: a small sample is simply the most common way to end up underpowered.
Power depends on the study design, the sample size, the significance threshold, and the size of the effect you're trying to measure.
Together, these determine the minimum effect-size that a study can reliably detect, i.e. the smallest true effect likely to produce a significant p-value.
In fact, when a study does find statistically significant results with a small sample, chances are that the estimated effect-size has been wildly inflated by noise. Small samples capitalize on chance: only estimates that happen to land unusually high clear the significance threshold, so "statistically significant" effect-size estimates from small studies tend to be far too large, making the result particularly unlikely to replicate, even under similar conditions.
With small samples, you're damned if you do find something (your effect-size will be wrong) and you're damned if you don't find anything (your study was inconclusive so it was a waste of resources).
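A small simulation makes both problems concrete. This is a sketch with made-up numbers (a true standardized effect of d = 0.3 and n = 15 per group; real studies vary):

```python
# Simulation sketch: a real but modest effect (Cohen's d = 0.3) studied
# with a small sample (n = 15 per group). All numbers are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_d, n, n_studies = 0.3, 15, 10_000

significant_d_hats = []
for _ in range(n_studies):
    a = rng.normal(0.0, 1.0, n)      # control group
    b = rng.normal(true_d, 1.0, n)   # treatment group, true effect d = 0.3
    t, p = stats.ttest_ind(b, a)
    if p < 0.05:
        # Estimated standardized effect (pooled-SD Cohen's d).
        pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        significant_d_hats.append((b.mean() - a.mean()) / pooled_sd)

print(f"Studies reaching significance: {len(significant_d_hats) / n_studies:.1%}")
print(f"Mean estimated d among significant studies: "
      f"{np.mean(significant_d_hats):.2f} (true d = {true_d})")
# Roughly 1 in 8 studies comes out significant (the rest are inconclusive),
# and the significant ones report d around 0.8, far above the true 0.3.
```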
Run a priori power analyses to determine the sample size needed to detect the minimum effect-size of interest.
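The snippet below is one way to do this in Python with statsmodels; the targets (smallest effect-size of interest d = 0.3, alpha = 0.05, 80% power) are placeholders you would set for your own study.

```python
# A priori power analysis sketch for a two-sample t-test (statsmodels).
# The target values below are placeholders, not recommendations.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.3,  # smallest effect of interest
                                   alpha=0.05,       # significance threshold
                                   power=0.80)       # desired power
print(f"Required sample size per group: {n_per_group:.0f}")
# For d = 0.3 this comes out to roughly 175 per group, far more than
# the n = 15 in the simulation above.
```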
Index
Return to Statistics
Jump to Effect-Sizes