Confidence Interval for a Population Mean, when the Distribution is Non-normal

When the distribution is normal, we use the z-statistic if the population variance is known and the t-statistic if it is unknown.

However, when the distribution is not normal and the sample size is small (n < 30), neither the z-statistic nor the t-statistic gives a reliable confidence interval for the mean.

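If the sample size is large (n ≥ 30) and the distribution is non-normal, the Central Limit Theorem makes the sampling distribution of the mean approximately normal, so: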

  • If the population variance is known, we use the z-statistic
  • If the population variance is unknown, we use the t-statistic. The z-statistic is also acceptable at this sample size, but the t-statistic is more conservative and more common.
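The rules above can be written as a small decision function. This is only a sketch: the function name `choose_statistic` and its return labels are hypothetical, and treating n = 30 as a large sample is an assumption that follows the common textbook convention.

```python
def choose_statistic(is_normal: bool, variance_known: bool, n: int) -> str:
    """Return the statistic suggested by the rules above for a CI on the mean."""
    if not is_normal and n < 30:
        # Small sample from a non-normal population: z and t are both unreliable.
        return "not available"
    # Normal population (any n), or non-normal population with n >= 30 (CLT).
    return "z" if variance_known else "t"


print(choose_statistic(is_normal=True, variance_known=True, n=25))    # z
print(choose_statistic(is_normal=False, variance_known=False, n=60))  # t
print(choose_statistic(is_normal=False, variance_known=False, n=20))  # not available
```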

Application in Finance

Let's see how this applies to stock market returns:

Case 1: Normal Distribution

Consider the daily returns of the S&P 500 index, and assume they are approximately normally distributed:

  • Known Population Variance:

    • Let's say we have historical volatility (σ) = 1% daily
    • Sample mean return = 0.05% daily
    • n = 25 days
    • Here, we would use z-statistic since population variance is known
  • Unknown Population Variance:

    • Let's take the same scenario but without known historical volatility
    • In this case, we would use t-statistic with sample standard deviation
    • This is more common in reality as true population variance is rarely known
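To make Case 1 concrete, the sketch below computes both intervals from the numbers in the example. Several values are assumptions that are not in the text: the 95% confidence level, a sample standard deviation of 1% for the unknown-variance case, and the t critical value 2.064 (the usual table value for 24 degrees of freedom). The helper name `mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(mean, std_dev, n, critical_value):
    """Two-sided interval: mean +/- critical_value * std_dev / sqrt(n)."""
    margin = critical_value * std_dev / sqrt(n)
    return mean - margin, mean + margin


mean_return = 0.05  # sample mean daily return, in percent (from the example)
n = 25              # number of days (from the example)

# Known population variance: z-statistic with sigma = 1% daily.
z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% level (assumed)
z_low, z_high = mean_ci(mean_return, 1.0, n, z_crit)

# Unknown population variance: t-statistic with sample standard deviation.
s = 1.0         # hypothetical sample standard deviation, in percent
t_crit = 2.064  # table value for df = n - 1 = 24 at the 95% level
t_low, t_high = mean_ci(mean_return, s, n, t_crit)

print(round(z_low, 3), round(z_high, 3))  # -0.342 0.442
print(round(t_low, 3), round(t_high, 3))  # -0.363 0.463
```

With the same standard deviation, the t-interval is slightly wider than the z-interval, which reflects the extra uncertainty from estimating the variance.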

Case 2: Non-Normal Distribution

As a second example, take Bitcoin daily returns, which are typically not normally distributed (they show high kurtosis and skewness):

  • If our sample size is small, say we're analyzing 20 days of returns (n < 30):

    • We cannot build a reliable confidence interval with the z-statistic or the t-statistic
    • We need to use non-parametric methods instead
  • If we're analyzing 60 days of returns (n ≥ 30):

    • We can create confidence intervals thanks to the Central Limit Theorem
    • We use the t-statistic, as the population variance is unknown
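For the small-sample case, one possible non-parametric approach is a percentile bootstrap. The text does not name a specific method, so this is a swapped-in illustration rather than the only option. The function name `bootstrap_mean_ci`, the 95% level, the resample count, and the 20 return values are all made up for the sketch and are not real Bitcoin data.

```python
import random


def bootstrap_mean_ci(returns, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `returns`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(returns)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(returns, k=n)) / n for _ in range(n_resamples)
    )
    alpha = 1 - level
    low = means[int(alpha / 2 * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Hypothetical daily returns in percent (n = 20), with a few large moves.
sample_returns = [
    0.8, -1.2, 2.5, -0.4, 0.1, -3.9, 1.7, 0.3, -0.6, 5.2,
    -2.1, 0.9, -0.2, 1.1, -7.5, 0.4, 2.0, -1.0, 0.6, 3.3,
]

low, high = bootstrap_mean_ci(sample_returns)
print(f"95% bootstrap interval for the mean: ({low:.2f}%, {high:.2f}%)")
```

With only 20 observations a bootstrap interval can still be unstable, so it is best read as a rough guide rather than a precise bound.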

Key Statistical Tests

Before applying these methods, it's crucial to:

  1. Test for normality using:

    • Jarque-Bera test (common in finance)
    • Shapiro-Wilk test
    • Visual inspection of Q-Q plots
  2. Consider sample size:

    • Small samples require stricter distributional assumptions
    • Larger samples are more forgiving because of the Central Limit Theorem
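As an illustration of the normality check, the Jarque-Bera statistic can be computed directly from the sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4). The sketch below uses only the standard library. The function name `jarque_bera` and the input values are hypothetical, and the p-value relies on the asymptotic chi-square distribution with 2 degrees of freedom, which is only a rough approximation in small samples.

```python
from math import exp


def jarque_bera(xs):
    """Return (JB statistic, asymptotic p-value) for a sample `xs`."""
    n = len(xs)
    mean = sum(xs) / n
    # Central moments (population form, dividing by n).
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = exp(-jb / 2)
    return jb, p_value


# Hypothetical returns with one extreme observation.
jb, p = jarque_bera([0.0] * 29 + [10.0])
print(f"JB = {jb:.1f}, p = {p:.3g}")  # a small p-value suggests non-normality
```

In practice, SciPy offers `scipy.stats.jarque_bera` and `scipy.stats.shapiro` for these tests; the hand-rolled version here is only meant to show what the statistic measures.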