How to Choose Between Normal and t Distribution
Choosing between a normal distribution and a t distribution for statistical analysis depends on the context of your data, particularly the sample size and knowledge of population variance. By following these guidelines, you can select the appropriate distribution for your analysis.
When to Use the Normal Distribution
Use the normal distribution when both of the following conditions are true:
- Sample size is large. When the sample size is large (n ≥ 30 is a common rule of thumb), the central limit theorem ensures the sampling distribution of the sample mean is approximately normal, even if the population distribution is not.
- Population variance is known. If the population variance (σ²) is known and sample size is large, the test statistic follows a normal distribution.
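As a minimal sketch of the normal-distribution case, the snippet below computes a two-sided confidence interval for a mean when the population standard deviation is known. The function name `z_confidence_interval` and the example numbers are hypothetical, chosen only for illustration; it uses `statistics.NormalDist` from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for a mean, assuming sigma is known."""
    # Critical value from the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error uses the known population standard deviation.
    margin = z_crit * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Illustrative inputs (made up): n >= 30 and sigma known, so both conditions hold.
low, high = z_confidence_interval(sample_mean=100.0, sigma=15.0, n=36)
print(f"95% interval: ({low:.2f}, {high:.2f})")
```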
When to Use the t Distribution
Use the t distribution when either of the following conditions is true:
- Sample size is small. When the sample size is small (n < 30), the t distribution accounts for the additional uncertainty in estimating the population standard deviation from the sample. When the sample size is large (n ≥ 30), the sample standard deviation is more likely to be a good estimate of the population standard deviation, so this extra uncertainty matters less.
- Population variance is unknown. If the population variance is unknown and you are estimating it using the sample standard deviation (s), the test statistic follows a t distribution.
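To illustrate the t-distribution case, here is a small sketch of a one-sample t statistic, where the sample standard deviation (s) stands in for the unknown population value. The function name `one_sample_t_statistic`, the sample values, and the hypothesized mean `mu0` are all made up for this example; the degrees of freedom shown (n - 1) apply to the one-sample case only.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t_statistic(sample, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample test."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t_stat = (mean(sample) - mu0) / (s / sqrt(n))
    return t_stat, n - 1


# Hypothetical small sample (n = 8) with unknown population variance.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
t_stat, df = one_sample_t_statistic(sample, mu0=12.0)
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```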
Warning: The t distribution assumes that the population from which the sample is drawn is approximately normal. If sample size is small (n < 30) and the population distribution is distinctly not normal (e.g., heavily skewed or containing outliers), using the t distribution can lead to unreliable results. In such cases, non-parametric tests may be more appropriate, or the data may need to be transformed.
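The guidelines in these two sections can be summarized as a short decision helper. This is only a sketch of the rules as stated here, not a substitute for judgment; the function name `choose_distribution` and its parameters are hypothetical, and the n = 30 cutoff is the rule of thumb used in this article.

```python
def choose_distribution(n, sigma_known, approx_normal=True):
    """Suggest a distribution following this article's guidelines.

    n             -- sample size
    sigma_known   -- True if the population variance is known
    approx_normal -- True if the population looks approximately normal
    """
    # Normal: large sample AND known population variance.
    if n >= 30 and sigma_known:
        return "normal"
    # Small sample from a clearly non-normal population: t is unreliable.
    if n < 30 and not approx_normal:
        return "consider a non-parametric test or a transformation"
    # Otherwise (small sample OR unknown variance): t distribution.
    return "t"


print(choose_distribution(n=50, sigma_known=True))
print(choose_distribution(n=50, sigma_known=False))
print(choose_distribution(n=12, sigma_known=False, approx_normal=False))
```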
Other Considerations
When the population distribution is not heavily skewed and does not have outliers, the t distribution is often the safest choice.
- Robustness. If the sample size is large, the normal and t distributions give nearly identical results, so the choice between them becomes less critical.
- Software defaults. Many statistical software tools automatically choose the t distribution when the population variance is unknown and the sample size is small.
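The robustness point can be checked numerically by comparing two-sided 95% critical values from the t distribution with the normal value as the degrees of freedom grow. The sketch below uses only the Python standard library, so the t quantile is computed by numerically integrating the density and bisecting; the helper names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical, and in practice a statistics library would normally supply these values.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(df, confidence=0.95):
    """Two-sided critical value, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z_crit = NormalDist().inv_cdf(0.975)
for df in (5, 30, 1000):
    print(f"df={df:>4}: t critical = {t_critical(df):.4f}, z critical = {z_crit:.4f}")
```

With few degrees of freedom the t critical value is noticeably larger than the normal one; as the degrees of freedom increase, the two converge.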
If you do choose the t distribution, you will have to specify degrees of freedom. Guidelines for calculating degrees of freedom are described at https://stattrek.com/statistics/degrees-of-freedom.