Stat Trek

Teach yourself statistics

How to Choose Between Normal and t Distribution

Choosing between a normal distribution and a t distribution for statistical analysis depends on the nature of the data. Factors to consider include the type of statistic (e.g., proportion, sample mean), sample size, and knowledge of population variance. By following these guidelines, you can select the appropriate distribution for your analysis.

When to Use the Normal Distribution

Use the normal distribution with a sample mean when both of the following conditions are true:

  • Sample size is large. When the sample size is large (as a rule of thumb, n ≥ 30), the central limit theorem ensures that the sampling distribution of the sample mean is approximately normal, even if the population distribution is not.
  • Population variance is known. If the population variance (σ²) is known and the sample size is large, the test statistic follows a normal distribution.
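
As an illustration of the case where both conditions hold, here is a minimal sketch of a confidence interval for a mean based on the normal distribution. The function name z_interval and the input values (a sample mean of 100, a known σ of 15, and n = 36) are hypothetical, chosen only for the example.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is known and n is large."""
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of the mean uses the known population sigma.
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs, for illustration only.
low, high = z_interval(sample_mean=100.0, sigma=15.0, n=36)
print(f"95% CI: ({low:.2f}, {high:.2f})")  # 95% CI: (95.10, 104.90)
```

Because σ is treated as known, the critical value comes from the standard normal distribution and does not depend on the sample size.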

Use the normal distribution with a sample proportion when all of the following conditions are true:

  • Population size (N) is at least 10 times sample size (n).
  • The sampling method is simple random sampling.
  • n * p ≥ 10, where p is the sample proportion.
  • n * (1 - p) ≥ 10.

Note: When the sample proportion p equals 0.5, the last two conditions require a sample of at least 20 observations for the sampling distribution to be approximately normal. When p is farther from 0.5, more observations are required.
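
The four conditions for a sample proportion can be checked mechanically. The sketch below is one way to do so; the function name normal_ok_for_proportion and its arguments are hypothetical, and whether the sample is a simple random sample is something you have to assert yourself, since it cannot be computed from the data.

```python
def normal_ok_for_proportion(n, p, population_size, simple_random_sample=True):
    """Check the four conditions listed above for a sample proportion.

    Returns (all_conditions_met, individual_checks).
    """
    checks = {
        "population at least 10 times sample": population_size >= 10 * n,
        "simple random sampling": simple_random_sample,
        "n * p >= 10": n * p >= 10,
        "n * (1 - p) >= 10": n * (1 - p) >= 10,
    }
    return all(checks.values()), checks


# Hypothetical inputs: p = 0.5 needs only 20 observations...
ok_balanced, _ = normal_ok_for_proportion(n=20, p=0.5, population_size=1000)
# ...but p = 0.1 with n = 50 gives n * p = 5, which fails the third condition.
ok_extreme, detail = normal_ok_for_proportion(n=50, p=0.1, population_size=10000)
print(ok_balanced, ok_extreme)  # True False
```

Returning the individual checks alongside the overall result makes it easy to see which condition failed.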

When to Use the t Distribution

Use the t distribution with a sample mean when either of the following conditions is true:

  • Sample size is small. When sample size is small (n < 30), the distribution of the sample mean is not well approximated by the normal distribution. The t distribution more accurately represents the distribution of the mean with small samples.
  • Population variance is unknown. If the population variance is unknown and you are estimating it using the sample standard deviation (s), the standardized sample mean (the t statistic) follows a t distribution.

Warning: The t distribution assumes that the population from which the sample is drawn is approximately normal. If sample size is small (n < 30) and the population distribution is distinctly not normal (e.g., heavily skewed or containing outliers), using the t distribution can lead to unreliable results. In such cases, non-parametric tests may be more appropriate or transformations of the data may be required.
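
The guidelines for a sample mean, including the warning above, can be condensed into a small decision helper. This is only a sketch of the rules as stated on this page; the function name choose_distribution and its return strings are hypothetical, and judging whether a population is roughly normal is left to the analyst.

```python
def choose_distribution(n, sigma_known, population_roughly_normal=True):
    """Suggest a distribution for inference about a sample mean.

    Follows the rules of thumb on this page; n >= 30 counts as "large".
    """
    # Normal: large sample AND known population variance.
    if n >= 30 and sigma_known:
        return "normal"
    # Small sample from a clearly non-normal population: t is unreliable.
    if n < 30 and not population_roughly_normal:
        return "consider a non-parametric test or a transformation"
    # Otherwise: small sample OR unknown population variance.
    return "t"


print(choose_distribution(n=50, sigma_known=True))    # normal
print(choose_distribution(n=50, sigma_known=False))   # t
print(choose_distribution(n=12, sigma_known=False))   # t
print(choose_distribution(n=12, sigma_known=False,
                          population_roughly_normal=False))
```

The last call returns the advice to consider a non-parametric test or a transformation, mirroring the warning.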

Other Considerations

When the population distribution is not heavily skewed and does not have outliers, the t distribution is often the safest choice.

  • Robustness. If the sample size is large, the normal and t distributions give nearly identical results, so the choice between them becomes less critical.
  • Software defaults. Some statistical software tools automatically choose the t distribution when the population variance is unknown, regardless of sample size.
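
To see the robustness point numerically, the sketch below compares 95% two-sided critical values from the two distributions as sample size grows. It assumes a one-sample mean, where degrees of freedom equal n - 1. To stay within the Python standard library, the t critical value is approximated by numerical integration and bisection; the helper names (t_pdf, t_cdf, t_critical) are hypothetical, and in practice a statistics library (for example, scipy.stats.t.ppf) would normally be used instead.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """Approximate CDF for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Approximate two-sided critical value, found by bisection on the CDF."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z = NormalDist().inv_cdf(0.975)
for n in (5, 10, 30, 100, 1000):
    t = t_critical(0.95, df=n - 1)  # df = n - 1 for a one-sample mean
    print(f"n={n:5d}  t={t:.3f}  z={z:.3f}  difference={t - z:.3f}")
```

The t critical value is always somewhat larger than the normal one, so intervals based on t are wider; the gap shrinks steadily as n increases.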

If you do choose the t distribution, you will have to specify degrees of freedom. Guidelines for calculating degrees of freedom are described at https://stattrek.com/statistics/degrees-of-freedom.