Point Prediction Intervals

Point Prediction uses the models fit during analysis and the factor settings specified on the factors tool to compute the point predictions and interval estimates. The predicted values are updated as the levels are changed. Prediction intervals (PI) are found under the Confirmation node.

Math Details

(1-α)*100% Confidence Interval

\(\hat{y} \pm t_{(1 - \frac{\alpha}{2},\, n-p)} \cdot SE_{mean}\)

where \(SE_{mean} = s \cdot \sqrt{x_0(X^TX)^{-1}x_0^T}\)

(1-α)*100% Prediction Interval

\(\hat{y} \pm t_{(1 - \frac{\alpha}{2},\, n-p)} \cdot SE_{prediction}\)

where \(SE_{prediction} = s \cdot \sqrt{\frac{1}{N} + x_0(X^TX)^{-1}x_0^T}\)
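
The two intervals differ only in the term under the square root: the confidence interval uses the leverage \(x_0(X^TX)^{-1}x_0^T\) alone, while the prediction interval adds \(\frac{1}{N}\) to it. The following is a minimal sketch of that arithmetic, not the software's implementation. The function names, the small two-level design, and the values of `y_hat`, `s`, and `t_crit` are all assumptions chosen for illustration. The critical value is passed in by the caller; 12.706 is the tabulated t value for a cumulative probability of 0.975 with 1 degree of freedom.

```python
import numpy as np

def leverage(X, x0):
    """Return x0 (X^T X)^{-1} x0^T for model matrix X and point vector x0."""
    X = np.asarray(X, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    # Solve (X^T X) z = x0 rather than forming the inverse explicitly.
    z = np.linalg.solve(X.T @ X, x0)
    return float(x0 @ z)

def interval(y_hat, s, t_crit, h, N=None):
    """Confidence interval when N is None; otherwise a prediction
    interval for the mean of N future observations."""
    extra = 0.0 if N is None else 1.0 / N
    half_width = float(t_crit * s * np.sqrt(extra + h))
    return y_hat - half_width, y_hat + half_width

# Hypothetical two-level design in coded units: intercept, A, B.
X = [[1, -1, -1],
     [1,  1, -1],
     [1, -1,  1],
     [1,  1,  1]]
x0 = [1, 0, 0]            # point vector for the center of the design
h = leverage(X, x0)       # X^T X = 4 I here, so h = 0.25

# n - p = 4 - 3 = 1 residual df; illustrative y_hat and s.
ci = interval(y_hat=10.0, s=2.0, t_crit=12.706, h=h)          # confidence interval
pred = interval(y_hat=10.0, s=2.0, t_crit=12.706, h=h, N=1)   # prediction interval, one future run
```

Because \(\frac{1}{N}\) is always positive, the prediction interval is wider than the confidence interval at the same point, and it narrows toward the confidence interval as N grows.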

(1-α)*100% Tolerance Interval for P * 100% of the population

\(\hat{y} \pm s \cdot TI\, Multiplier\)

where \(TI\, Multiplier = t_{(1-\alpha,\, n-p)} \cdot \sqrt{x_0(X^TX)^{-1}x_0^T} + \Phi^{-1}\left(0.5 + \frac{P}{2}\right) \cdot \sqrt{\frac{n-p}{\chi_{(\alpha,\, n-p)}^2}}\)

Note that the TI multiplier uses α rather than α/2 in both the t and chi-square critical values, even though the resulting interval is two-sided.
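
The TI multiplier can be sketched with the Python standard library alone if the t and chi-square critical values are supplied by the caller, for example from tables or a statistics package. This is an illustrative sketch, not the software's code. The function name and arguments are assumptions. It also assumes that the subscripts denote cumulative probabilities, so that the chi-square value is the lower-tail quantile at α.

```python
from math import sqrt
from statistics import NormalDist

def ti_multiplier(t_crit, chi2_crit, h, P, df):
    """TI multiplier = t * sqrt(h) + z_P * sqrt(df / chi2).

    t_crit    : t critical value at cumulative probability 1 - alpha, df degrees of freedom
    chi2_crit : chi-square critical value at cumulative probability alpha, df degrees of freedom
    h         : x0 (X^T X)^{-1} x0^T
    P         : proportion of the population to cover, 0 < P < 1
    df        : residual degrees of freedom, n - p
    """
    # Normal score that leaves (1 - P) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + P / 2)
    return t_crit * sqrt(h) + z * sqrt(df / chi2_crit)

# Illustrative inputs: alpha = 0.05, df = 10, tabulated critical values.
multiplier = ti_multiplier(t_crit=1.8125, chi2_crit=3.9403, h=0.25, P=0.95, df=10)
```

The tolerance interval is then \(\hat{y} \pm s \cdot\) `multiplier`. The first term accounts for uncertainty in the predicted mean, and the second inflates the normal score to allow for uncertainty in the estimated standard deviation.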

Where:

\(\hat{y}\) = predicted value at x0

s = estimated standard deviation

t = Student’s t critical value

α = acceptable type I error rate (1 - confidence level)

n = number of runs in the design

N = number of observations in the future sample

p = number of terms in the model, including the intercept

P = proportion of the population contained in the tolerance interval

X = expanded model matrix [*]

x0 = expanded point vector [†]

\(\Phi^{-1}\) = inverse of the standard normal cumulative distribution function, which converts the proportion to a normal score

\(\chi^2\) = chi-square critical value

n-p is also the residual degrees of freedom (df) from the ANOVA. [‡]

The superscript T indicates the previous matrix is transposed.

The superscript -1 indicates the previous matrix is inverted.

[*]The expanded model matrix (X) has one row for each run in the design and one column for each term in the model. The values in the X matrix are assumed to be coded values. The first column is typically all 1’s to represent the intercept term of the model. For mixture designs, the Scheffé polynomial has no intercept term, so this column is not present.
[†]The expanded point vector is a way to represent the settings of the factors for a particular location for purposes of prediction. Think of it as a matrix with one row. It has a similar structure to a row of the expanded model matrix; one element for each term in the model. The order of the terms represented by the model matrix’s columns and the point vector’s elements must match.
[‡]The residual (error) degrees of freedom for a split-plot design are estimated using the procedure outlined by Kenward and Roger. This value is shown on the point prediction output.
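
As an illustration of the expanded model matrix and point vector described in the notes above, the sketch below builds both for a hypothetical two-factor model with an interaction term (intercept, A, B, AB). The `expand` helper, the run list, and the factor settings are assumptions made for the example. The same function produces every row of X and the point vector, which keeps the term order identical.

```python
def expand(a, b):
    """Expand coded factor settings (a, b) into model terms: intercept, A, B, AB."""
    return [1.0, a, b, a * b]

# Coded factor settings for each run of a hypothetical two-level design.
runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]

X = [expand(a, b) for a, b in runs]   # one row per run, one column per term
x0 = expand(0.5, -1)                  # point vector: [1.0, 0.5, -1, -0.5]
```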

References

  • DeGryze, Langhans, and Vandebroek. Using the correct intervals for prediction: a tutorial on tolerance intervals of ordinary least-squares regression. Chemometrics and Intelligent Laboratory Systems, 87(2):147–154, 2007.
  • Hahn and Meeker. Statistical Intervals, A Guide for Practitioners. 1991.
  • Kenward and Roger. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 53(3):983–997, 1997.