Confidence Intervals

Motivation

Once we have our estimate and our sampling distribution, we can use the sampling distribution to provide an interval that contains likely values of the population parameter of interest

Interpretation

True or False: Once I produce a 95% confidence interval, there is a 95% probability that the true parameter value lies in that interval.
True or False: Once I produce a 95% confidence interval, there is a 95% probability that the sample estimate lies in that interval.

Two ways of producing confidence intervals

Analytical method
Resampling method

Analytical method

In some cases, we can rely on certain assumptions about the population and characterize the sampling distribution, e.g., when the assumptions for the CLT hold and we are interested in the sample mean.

In this case, we can generate a confidence interval of the form: \[ \bar{x} \pm z^* \times \sigma/\sqrt{n} \]

Or, more generally: \(\hat{\theta} \pm t^* \times \sigma_{\hat{\theta}}\)

Resampling method

In other cases, it is difficult or impossible to characterize the sampling distribution. In these cases, a resampling method called the bootstrap is useful.

We mimic the process of collecting many different samples from the population by sampling with replacement from the data
We calculate the statistic of interest in each bootstrap sample to form the estimated sampling distribution
We use the estimated sampling distribution to calculate a confidence interval

Factors that affect confidence intervals

Sample size
Standard error
Confidence level