Note

Choose View, Annotated ANOVA to activate blue hints and tips for how to interpret the ANOVA results.

At the top are the name of the response, its number, and the name given when the design was built.

The next line gives a brief description of the model being fit, followed by the type of sum of squares used for the calculations.

**Block**: The block row shows how much variation in the response is attributed
to blocks. Block variation is removed from the analysis. Blocks are not tested
(no F or p-value) because they are considered a non-replicated, hard-to-change
factor. Blocks are assumed not to interact with the factors. If there are no
blocks in the design, this row is not present.

**Model**: The model row shows how much variation in the response is explained by
the model, along with the overall test of model significance.

**Terms**: The model is separated into individual terms and tested independently.

**Residual**: The residual row shows how much variation in the response is still
unexplained.

**Lack of Fit**: The amount by which the model predictions miss the observations.

**Pure Error**: The amount of difference between replicate runs.

**Cor Total**: This row shows the amount of variation around the mean of the
observations. The model explains part of it, the residual explains the rest.

**Source**: A meaningful name for each row.

**Sum of Squares**: The amount of variation attributed to that row's source,
computed as a sum of squared deviations.

**df**: Degrees of Freedom: The number of estimated parameters used to compute
the source’s sum of squares.

**Mean Square**: The sum of squares divided by the degrees of freedom. Also
called variance.

**F Value**: The test statistic for comparing the source's mean square to the
residual mean square.
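As a sketch, the F value chains together the sum of squares, df, and mean square entries described above. The sums of squares and degrees of freedom below are hypothetical illustrative numbers, not output from any particular design:

```python
# Hypothetical sums of squares and degrees of freedom (illustrative only)
ss_model, df_model = 120.0, 3
ss_residual, df_residual = 30.0, 12

ms_model = ss_model / df_model           # mean square for the source
ms_residual = ss_residual / df_residual  # residual mean square
f_value = ms_model / ms_residual         # F = source mean square / residual mean square
print(f_value)  # 16.0
```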

**Prob > F**: (p-value) Probability of seeing the observed F-value if the null
hypothesis is true (there are no factor effects). Small probability values call
for rejection of the null hypothesis. The probability equals the integral under
the curve of the F-distribution that lies beyond the observed F-value.

In “plain English”, if the Prob > F value is very small (less than 0.05 by default) then the source is statistically significant. Significant model terms probably have a real effect on the response. Significant lack of fit indicates the model does not fit the data within the observed replicate variation.

**Std Dev**: (Root MSE) Square root of the residual mean square. Consider this
to be an estimate of the standard deviation associated with the experiment.

**Mean**: Overall average of all the response data.

**C.V.**: Coefficient of Variation, the standard deviation expressed as a
percentage of the mean. Calculated by dividing the Std Dev by the Mean and
multiplying by 100.
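The Std Dev and C.V. entries chain together directly; a minimal sketch with an assumed residual mean square and overall mean (both hypothetical):

```python
import math

# Hypothetical ANOVA quantities for illustration
ms_residual = 2.5     # residual mean square
mean_response = 50.0  # overall average of the response data

std_dev = math.sqrt(ms_residual)    # Std Dev (root MSE)
cv = std_dev / mean_response * 100  # C.V.: Std Dev as a percentage of the mean
print(round(cv, 3))  # 3.162
```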

**PRESS**: Predicted Residual Error Sum of Squares – A measure of how well the
model fits each point in the design. PRESS is computed by first predicting where
each point should be from a model fit to all the other points, leaving out the
one in question. The squared residuals (difference between actual and predicted
values) are then summed.

\(e_{-i} = y_i\, -\, \hat{y}_{-i}\, =\, \frac{e_i}{1\, -\, h_{ii}}\)

\(PRESS = \sum_{i=1}^n(e_{-i})^2\)

\(e_{-i}\) is a deletion residual computed by fitting a model without the \(i^{th}\) run then trying to predict the \(i^{th}\) observation with the resulting model.

\(e_i\) is the residual for each observation left over from the model fit to all the data.

\(h_{ii}\) is the leverage of the run in the design.
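Putting those three definitions together, PRESS can be computed without refitting the model n times, using only the ordinary residuals and leverages. The residuals and leverages below are hypothetical:

```python
# Hypothetical residuals e_i and leverages h_ii from a fitted model
residuals = [0.4, -0.6, 0.1, 0.3, -0.2]
leverages = [0.5, 0.3, 0.2, 0.4, 0.1]

# e_{-i} = e_i / (1 - h_ii); PRESS is the sum of squared deletion residuals
press = sum((e / (1 - h)) ** 2 for e, h in zip(residuals, leverages))
```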

**R-squared**: A measure of the amount of variation around the mean explained by the
model.

\(R^2 = 1 - \left[\frac{SS_{residual}}{SS_{residual} + SS_{model}}\right] = 1 - \left[\frac{SS_{residual}}{SS_{total} - SS_{curvature} - SS_{block}}\right]\)

**Adj R-squared**: A measure of the amount of variation around the mean explained
by the model, adjusted for the number of terms in the model. The adjusted
R-squared decreases as the number of terms in the model increases if those
additional terms don’t add value to the model.

\(Adj. R^2 = 1 - \left[\left(\frac{SS_{residual}}{df_{residual}}\right) / \left(\frac{SS_{residual} + SS_{model}}{df_{residual} + df_{model}}\right)\right] = 1 - \left[\left(\frac{SS_{residual}}{df_{residual}}\right) / \left(\frac{SS_{total} - SS_{curvature} - SS_{block}}{df_{total} - df_{curvature} - df_{block}}\right)\right]\)

**Pred R-squared**: A measure of the amount of variation in new data explained by
the model.

\(Pred. R^2 = 1 - \left[\frac{PRESS}{SS_{residual} + SS_{model}}\right] = 1 - \left[\frac{PRESS}{SS_{total} - SS_{curvature} - SS_{block}}\right]\)

The predicted R-squared and the adjusted R-squared should be within 0.20 of each other; otherwise, there may be a problem with either the data or the model. Look for outliers, consider transformations, or consider a different-order polynomial.
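The three R-squared statistics and the 0.20 rule of thumb can be sketched from the same ANOVA quantities. The numbers below are hypothetical, and the design is assumed to have no curvature or block terms:

```python
# Hypothetical ANOVA quantities; no curvature or block sums of squares assumed
ss_model, df_model = 120.0, 3
ss_residual, df_residual = 30.0, 12
press = 45.0

ss_total = ss_model + ss_residual

r_sq = 1 - ss_residual / ss_total
adj_r_sq = 1 - (ss_residual / df_residual) / (ss_total / (df_model + df_residual))
pred_r_sq = 1 - press / ss_total

# Rule of thumb: adjusted and predicted R-squared within 0.20 of each other
agree = abs(adj_r_sq - pred_r_sq) <= 0.20
print(r_sq, adj_r_sq, pred_r_sq, agree)  # 0.8 0.75 0.7 True
```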

**Adequate Precision**: This is a signal-to-noise ratio. It compares the range
of the predicted values at the design points to the average prediction error.
Ratios greater than 4 indicate adequate model discrimination.

\(\frac{max(\hat{Y})\, -\, min(\hat{Y})}{\sqrt{\bar{V}_{\hat{Y}}}}\, >\, 4\)

\(\bar{V}_{\hat{Y}}\, =\, \frac{p\hat{\sigma}^2}{n}\)
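A sketch of the signal-to-noise ratio using those two formulas; the predicted values, parameter count, and residual mean square below are all hypothetical:

```python
import math

# Hypothetical predictions at the design points and ANOVA quantities
y_hat = [10.2, 14.8, 12.1, 16.5, 11.0, 15.3]
p = 4            # number of model parameters, including the intercept
n = len(y_hat)   # number of runs
sigma_sq = 0.6   # residual mean square from the ANOVA table

mean_pred_var = p * sigma_sq / n  # average prediction variance, p * sigma^2 / n
adeq_precision = (max(y_hat) - min(y_hat)) / math.sqrt(mean_pred_var)
adequate = adeq_precision > 4     # ratios above 4 indicate adequate discrimination
```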

**-2 Log Likelihood**: This is derived by iteratively improving the coefficient
estimates for the chosen model to maximize the likelihood that the fitted model
is the correct model. For balanced, orthogonal designs, this is exactly the same
result as least squares regression. The -2 log likelihood is used to compute the
following penalized modeling statistics.

**BIC**: Bayesian information criterion, a penalized likelihood statistic for
large designs, used to choose the best model.

\(BIC(M) = -2\, \cdot\, \ln(L[M|Data])\, +\, \ln(n)\, \cdot\, p\)

**AICc**: Corrected Akaike information criterion, a penalized likelihood
statistic for small to medium designs (most designs), used to choose the best
model.

\(AIC(M) = -2\, \cdot\, \ln(L[M|Data])\, +\, 2\, \cdot\, p\)

\(AICc(M) = AIC(M)\, +\, \frac{2p(p\, +\, 1)}{n\, -\, p\, -\, 1}\)

\(p\) = number of model parameters (including intercept (b0) and any block coefficients)

\(n\) = number of runs in the experiment

\(\hat{\sigma}^2\) = residual mean square from the ANOVA table
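The three criteria follow directly from those definitions. The -2 log likelihood, n, and p below are hypothetical values for illustration:

```python
import math

# Hypothetical inputs for the penalized likelihood statistics
neg2_log_l = 40.0  # -2 * ln(L[M|Data]) for the fitted model
n = 16             # number of runs in the experiment
p = 5              # parameters, including intercept and block coefficients

bic = neg2_log_l + math.log(n) * p
aic = neg2_log_l + 2 * p
aicc = aic + 2 * p * (p + 1) / (n - p - 1)
print(round(bic, 2), aic, aicc)  # 53.86 50.0 56.0
```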

This table has one row per estimated term in the model.

The number of columns depends on the type of analysis.

**Factor**: Experimental variables selected for inclusion in the predictive model.

**Coefficient Estimate**: Regression coefficient representing the expected change
in response Y per unit change in X when all remaining factors are held constant.
In orthogonal two-level designs, it equals one-half the factorial effect.
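For instance, with hypothetical low- and high-level averages, the factorial effect is the change in average response from the low to the high level, and the coded coefficient is half of that:

```python
# Hypothetical average responses at the low and high levels of a factor
avg_low, avg_high = 50.0, 58.0

effect = avg_high - avg_low  # factorial effect of the factor
coefficient = effect / 2     # regression coefficient in coded units
print(effect, coefficient)  # 8.0 4.0
```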

**Coefficient Estimate for General Factorial Designs**: Coefficients for multi-level
categorical factors are not as simple to interpret. Beta(1) is the difference of
level 1’s average from the overall average. Beta(2) is the difference of level
2’s average from the overall average. Beta(k-1) is the difference of level
(k-1)’s average from the overall average. The negative sum of the coefficients
will be the difference of level k’s average from the overall average. Don’t use
these coefficients for interpretation of the model – use the model graphs!
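A sketch of how those beta coefficients relate to the level averages, for a hypothetical three-level categorical factor:

```python
from statistics import mean

# Hypothetical average response at each of k = 3 levels
level_avgs = [12.0, 15.0, 18.0]
overall = mean(level_avgs)

# beta(1) .. beta(k-1): each level's deviation from the overall average
beta = [avg - overall for avg in level_avgs[:-1]]
# Level k's deviation is the negative sum of the coefficients
last_dev = -sum(beta)
print(beta, last_dev)  # [-3.0, 0.0] 3.0
```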

**df**: Degrees of Freedom – equal to one for testing coefficients.

**Standard Error**: The standard deviation associated with the coefficient
estimate.

**95% CI High and Low**: If this range spans 0 (one limit is positive and the
other negative) then a coefficient of 0 is plausible, indicating the term is
not significant.
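The check itself is simple; the confidence limits below are hypothetical:

```python
# Hypothetical 95% confidence limits for a coefficient
ci_low, ci_high = -0.4, 1.2

# The range spans 0 when one limit is negative and the other positive
spans_zero = ci_low < 0 < ci_high
print(spans_zero)  # True
```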

**VIF**: Variance Inflation Factor – Measures how much the variance around the
coefficient estimate is inflated by the lack of orthogonality in the design. If
the factor is orthogonal to all other factors in the model, the VIF is one.
Values greater than 10 indicate that the factors are too highly correlated
(they are not independent). VIFs are a less important statistic when working
with mixture designs and constrained response surface designs.
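As a sketch, a factor's VIF comes from the R-squared of regressing that factor on the other factors in the model; the auxiliary R-squared below is a hypothetical value, not the output of an actual regression:

```python
# Hypothetical R-squared from regressing factor i on the other factors
r_sq_i = 0.75

vif = 1 / (1 - r_sq_i)     # orthogonal factor: r_sq_i = 0, so VIF = 1
too_correlated = vif > 10  # rule of thumb for problematic correlation
print(vif, too_correlated)  # 4.0 False
```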

The **predictive model** is listed in both actual and coded terms. (For mixture
experiments, the prediction equations are given in pseudo, real and actual
values of the components.) The coded (or pseudo) equation is useful for
identifying the relative significance of the factors by comparing the factor
coefficients. All equations give identical predictions when hierarchy is
enforced. These equations, used for prediction, have no block effects. Blocking
is a restriction on the randomization of the experiment, used to reduce error.
It is not a factor being studied. Blocks are only used to fit the observations
from this experiment, not to make predictions.

**For Linear Mixture Models Only**:

The coefficient table is augmented for linear mixture models to include statistics on the adjusted linear effects. Because the linear coefficients cannot be compared to zero, the linear effect of component i is measured by how different the ith coefficient is from the other (q-1) coefficients. The t-test is applicable to the difference in the mixture coefficient estimates. When the design space is not a simplex, the formula for calculating the component effects is adjusted for the differences in the ranges.

The gradient is the estimated slope across the linear response surface projected through a reference blend in both Cox and Piepel’s direction. The total effect of a component is the gradient times the range the component varied. These effects are plotted as a trace plot under the model graphs button.

**For One Factor Designs Only**:

The next section in the ANOVA lists results for each treatment (factor level) and shows the significance of the difference between each pair of treatments.

**Estimated Mean**: The average response at each treatment level.

**Standard Error**: The standard error associated with the calculation of this
mean. It is the standard deviation of the data divided by the square root of
the number of observations at that treatment level.

**Treatment**: This lists each pairwise combination of the factor levels.

**Mean Difference**: The difference between the average response from the two
treatments.

**df**: The degrees of freedom associated with the difference.

**Standard Error**: The standard error associated with the difference between the
two means.

**t value**: Calculated as the mean difference divided by its standard error.
It represents the number of standard deviations separating the two means.
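A minimal sketch with hypothetical treatment means and a hypothetical standard error of the difference:

```python
# Hypothetical pairwise comparison between two treatment means
mean_1, mean_2 = 23.4, 20.1
se_diff = 1.1  # standard error of the difference

mean_difference = mean_1 - mean_2
t_value = mean_difference / se_diff  # standard deviations separating the means
print(round(t_value, 2))  # 3.0
```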

**Prob>t**: This is the probability of getting this t-value if the two means are
truly not different. A value less than 0.05 indicates that there is a
statistically significant difference between the means.