# Difference between confidence intervals and prediction intervals

The difference between a prediction interval and a confidence interval is the standard error.

The standard error for a confidence interval on the mean takes into account the uncertainty due to sampling. The line you computed from your sample will be different from the line that would have been computed if you had the entire population, and the standard error accounts for this uncertainty.

The standard error for a prediction interval on an individual observation takes into account the uncertainty due to sampling, as above, but also the variability of the individuals around the predicted mean. The standard error for the prediction interval will therefore be larger than for the confidence interval, and hence the prediction interval will be wider than the confidence interval.

Your question isn't quite correct. A confidence interval gives a range for $\text{E}[y \mid x]$, as you say. A prediction interval gives a range for $y$ itself. Naturally, our best guess for $y$ is $\text{E}[y \mid x]$, so the intervals will both be centered around the same value, $x\hat{\beta}$.

As @Greg says, the standard errors are going to be different: we estimate $\text{E}[y \mid x]$ more precisely than we predict $y$ itself. Predicting $y$ requires including the variance that comes from the true error term.

To illustrate the difference, imagine that we could get perfect estimates of our $\beta$ coefficients. Then, our estimate of $\text{E}[y \mid x]$ would be perfect. But we still wouldn't be sure what $y$ itself was because there is a true error term that we need to consider. Our confidence "interval" would just be a point because we estimate $\text{E}[y \mid x]$ exactly right, but our prediction interval would be wider because we take the true error term into account.

Hence, a prediction interval will be wider than a confidence interval.
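The thought experiment above can be approximated numerically. The sketch below is only an illustration under stated assumptions: a straight-line model, a known error standard deviation `sigma = 1`, made-up predictor values, and a hypothetical helper named `standard_errors`. It uses the standard least-squares result that the variance of the fitted mean at $x_0$ is $\sigma^2 x_0^T(X^TX)^{-1}x_0$. As the sample grows, the standard error of the mean shrinks toward zero, while the standard error of a prediction stays just above `sigma`.

```python
import numpy as np

def standard_errors(n, x0=2.0, sigma=1.0, seed=0):
    """Return (se_mean, se_pred) at x0 for a straight-line fit with known sigma."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 4.0, size=n)        # hypothetical predictor values
    X = np.column_stack([np.ones(n), x])     # design matrix with an intercept
    v = np.array([1.0, x0])                  # the point at which we predict
    h = v @ np.linalg.solve(X.T @ X, v)      # x0^T (X^T X)^{-1} x0
    se_mean = sigma * np.sqrt(h)             # uncertainty in E[y | x0] only
    se_pred = sigma * np.sqrt(1.0 + h)       # adds the true error term
    return se_mean, se_pred

for n in (10, 100, 10_000):
    se_mean, se_pred = standard_errors(n)
    print(f"n={n:>6}  se_mean={se_mean:.4f}  se_pred={se_pred:.4f}")
```

With more and more data the coefficients are estimated almost perfectly, so the confidence interval collapses toward a point, but the prediction interval cannot become narrower than the scatter of the error term allows.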

I found the following explanation helpful:

Confidence intervals tell you how well you have determined the mean. Assume that the data really are randomly sampled from a Gaussian distribution. If you draw many such samples and calculate a confidence interval of the mean from each one, you'd expect about 95% of those intervals to include the true value of the population mean. The key point is that the confidence interval tells you about the likely location of the true population parameter.

Prediction intervals tell you where you can expect to see the next data point sampled. Assume that the data really are randomly sampled from a Gaussian distribution. Collect a sample of data and calculate a prediction interval. Then sample one more value from the population. If you do this many times, you'd expect that next value to lie within that prediction interval in 95% of the samples. The key point is that the prediction interval tells you about the distribution of values, not the uncertainty in determining the population mean.

Prediction intervals must account for both the uncertainty in knowing the value of the population mean and the scatter of the data. So a prediction interval is always wider than a confidence interval.
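The repeated-sampling description above can be checked with a small simulation. This is a sketch, not a definitive implementation: the population values `mu` and `sigma`, the sample size and the number of trials are arbitrary assumptions, and the critical value 2.093 is the 0.975 quantile of the $t$ distribution with 19 degrees of freedom, hard-coded from standard tables to keep the example dependent on NumPy only.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 10.0, 2.0      # hypothetical "true" population values
n, trials = 20, 20_000
t_crit = 2.093             # 0.975 quantile of t with n - 1 = 19 df (from tables)

samples = rng.normal(mu, sigma, size=(trials, n))
next_value = rng.normal(mu, sigma, size=trials)   # one extra draw per sample

mean = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)                   # sample standard deviation

ci_half = t_crit * s / np.sqrt(n)                 # confidence interval for the mean
pi_half = t_crit * s * np.sqrt(1.0 + 1.0 / n)     # prediction interval for the next value

ci_covers_mean = np.mean(np.abs(mean - mu) <= ci_half)
pi_covers_next = np.mean(np.abs(next_value - mean) <= pi_half)
ci_covers_next = np.mean(np.abs(next_value - mean) <= ci_half)

print(f"CI contains the true mean:  {ci_covers_mean:.3f}")
print(f"PI contains the next value: {pi_covers_next:.3f}")
print(f"CI contains the next value: {ci_covers_next:.3f}")
```

The first two proportions should both land near 0.95, while the third is far lower: a confidence interval is the wrong tool for bracketing an individual observation.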

A prediction interval is an interval associated with a random variable yet to be observed (forecasting).

A confidence interval is an interval associated with a parameter and is a frequentist concept.

See the full answer by Rob Hyndman, the creator of the forecast package in R.

One is a prediction of a future observation, and the other is a predicted mean response. I will give a more detailed answer to hopefully explain the difference and where it comes from, as well as how this difference manifests itself in wider intervals for prediction than for confidence.

This example might illustrate the difference between confidence and prediction intervals: suppose we have a regression model that predicts the price of houses based on number of bedrooms, size, etc. There are two kinds of predictions we can make for a given $x_0$:

1. We can predict the price of a specific new house that comes on the market with characteristics $x_0$ ("what is the predicted price for this house $x_0$?"). Its true price will be $y = x_0^T\beta+\epsilon$. Since $E(\epsilon)=0$, the predicted price will be $\hat{y} = x_0^T\hat{\beta}$. In assessing the variance of this prediction, we need to include our uncertainty about $\hat{\beta}$ as well as the variance of $\epsilon$, the error that remains even if $\beta$ were known exactly. This is typically called a prediction of a future value.

2. We can also predict the average price of a house with characteristics $x_0$ ("what would be the average price for a house with characteristics $x_0$?"). The point estimate is still $\hat{y} = x_0^T\hat{\beta}$, but now only the variance in $\hat{\beta}$ needs to be accounted for. This is typically called prediction of the mean response.

Most times, what we really want is the first case. We know that $$\text{Var}(x_0^T\hat{\beta}) = x_0^T(X^TX)^{-1}x_0\sigma^2$$

This is the variance for our mean response (case 2). But for a prediction of a future observation (case 1), recall that we also need the variance of $\epsilon$, which is $\sigma^2$. Because $\epsilon$ is assumed to be independent of $\hat{\beta}$, the two variances add, giving $\sigma^2\left(x_0^T(X^TX)^{-1}x_0 + 1\right)$. Replacing $\sigma$ with its estimate $\hat{\sigma}$ results in the following intervals:

1. Prediction interval for a single future response at $x_0$: $$\hat{y}_0\pm t_{n-p}^{(\alpha/2)}\hat{\sigma}\sqrt{x_0^T(X^TX)^{-1}x_0 + 1}$$

2. Confidence interval for the mean response given $x_0$: $$\hat{y}_0\pm t_{n-p}^{(\alpha/2)}\hat{\sigma}\sqrt{x_0^T(X^TX)^{-1}x_0}$$

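Here $t_{n-p}^{(\alpha/2)}$ is the upper $\alpha/2$ critical value of the $t$ distribution with $n-p$ degrees of freedom, where $n$ is the number of observations and $p$ is the number of estimated coefficients.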
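The two formulas can be sketched in NumPy as follows. Everything specific here is an assumption made for illustration: the function name `regression_intervals`, the simulated house data, and the coefficients used to generate it. The critical value is passed in as `t_crit`; the value 2.052 below is the 0.975 quantile of the $t$ distribution with 27 degrees of freedom taken from standard tables, and if SciPy is available, `scipy.stats.t.ppf(1 - alpha / 2, n - p)` could supply it instead.

```python
import numpy as np

def regression_intervals(X, y, x0, t_crit):
    """Return (y0_hat, ci_half_width, pi_half_width) at the point x0."""
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares fit
    resid = y - X @ beta_hat
    sigma_hat = np.sqrt(resid @ resid / (n - p))       # residual standard error
    h = x0 @ np.linalg.solve(X.T @ X, x0)              # x0^T (X^T X)^{-1} x0
    y0_hat = x0 @ beta_hat                             # shared centre of both intervals
    ci_half = t_crit * sigma_hat * np.sqrt(h)          # mean response
    pi_half = t_crit * sigma_hat * np.sqrt(h + 1.0)    # single future response
    return y0_hat, ci_half, pi_half

# Hypothetical data: price (in thousands) against bedrooms and floor area.
rng = np.random.default_rng(1)
n = 30
bedrooms = rng.integers(1, 6, size=n).astype(float)
area = rng.uniform(50.0, 250.0, size=n)
X = np.column_stack([np.ones(n), bedrooms, area])
y = 50.0 + 20.0 * bedrooms + 1.5 * area + rng.normal(0.0, 25.0, size=n)

x0 = np.array([1.0, 3.0, 120.0])   # intercept, 3 bedrooms, area of 120
t_crit = 2.052                     # 0.975 quantile of t with n - p = 27 df

y0_hat, ci_half, pi_half = regression_intervals(X, y, x0, t_crit)
print(f"mean response:   {y0_hat:.1f} +/- {ci_half:.1f}")
print(f"future response: {y0_hat:.1f} +/- {pi_half:.1f}")
```

Both intervals share the centre `y0_hat`; only the extra `+ 1.0` under the square root separates them, which is the contribution of the error variance.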

Hopefully this makes it a bit clearer why the prediction interval is always wider, and what the underlying difference between the two intervals is. This example was adapted from Faraway, Linear Models with R, Sec. 4.1.

This answer is for those readers who could not fully understand the previous answers. Let's discuss a specific example. Suppose you try to predict people's weight from their height, sex (male, female) and diet (standard, low carb, vegetarian). Currently, there are more than 8 billion people on Earth. Of course, you can find many thousands of people with the same height and the same values of the other two variables but different weights. Their weights differ wildly because some of them are obese and others may suffer from starvation. Most of those people will be somewhere in the middle.

One task is to predict the average weight of all the people having the same values of all three explanatory variables. Here we use the confidence interval. Another task is to forecast the weight of one specific person, whose living circumstances we don't know. Here the prediction interval must be used. It is centered around the same point, but it must be much wider than the confidence interval.