Design-Expert® software provides powerful features to add confidence, prediction, or tolerance intervals to its graphical optimization plots. All users can benefit by seeing how this provides a more conservative ‘sweet spot’. However, this innovative enhancement is of particular value for those in the pharmaceutical industry who hope to satisfy the US FDA’s QbD (quality by design) requirements.
Here are the definitions:
Confidence Interval (CI): an interval that covers a population parameter (like a mean) with a pre-determined confidence level (such as 95%).
Prediction Interval (PI): an interval that covers a future outcome from the same population with a pre-determined confidence level.
Tolerance Interval (TI): an interval that covers a fixed proportion of outcomes from the population with a pre-determined confidence level, allowing for the uncertainty in the estimated population mean and standard deviation. (For example, 99% of the product will be in spec with 95% confidence.)
Note that a confidence interval contains a parameter (σ, μ, ρ, etc.) with “1-alpha” confidence, while a tolerance interval contains a fixed proportion of a population with “1-alpha” confidence.
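To make the three definitions concrete, here are the textbook forms for the simplest case: a single normally distributed sample of size n with mean ȳ and standard deviation s. This is an illustrative simplification, not the software's own calculation, which works from the fitted model. The tolerance factor k₂ uses Howe's approximation, P is the proportion of the population to be covered, and χ²(α, n−1) denotes the lower α quantile of the chi-square distribution.

```latex
\begin{aligned}
\text{CI:}\quad & \bar{y} \pm t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \\
\text{PI:}\quad & \bar{y} \pm t_{1-\alpha/2,\,n-1}\, s\,\sqrt{1+\frac{1}{n}} \\
\text{TI:}\quad & \bar{y} \pm k_2\, s, \qquad
k_2 \approx \sqrt{\frac{(n-1)\left(1+\frac{1}{n}\right) z_{(1+P)/2}^{2}}{\chi^{2}_{\alpha,\,n-1}}}
\end{aligned}
```

As n grows, the CI shrinks toward zero width, while the PI and TI settle at a finite width set by s. That is why the PI and TI bands carve out a more conservative sweet spot than the CI band.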
These intervals are displayed numerically under Point Prediction as shown in Figure 1. They can be added as interval bands in graphical optimization, as shown in Figure 2. (Data is taken from our microwave popcorn DOE case, available upon request.) This pictorial representation is great for QbD purposes because it helps focus the experimenter on the region where they are most likely to get consistent production results. The confidence levels (alpha value) and population proportion can be changed under the Edit Preferences option.
Ever wonder what the difference is between the various response surface method (RSM) optimization design options? To help you choose the best design for your experiment, I’ve put together a list of things you should know about each of the three primary response surface designs—Central Composite, Box-Behnken, and Optimal.
Central Composite Design (CCD)
Box-Behnken Design (BBD)
For an in-depth exploration of both factorial and response surface methods, attend Stat-Ease’s Modern DOE for Process Optimization workshop.
We’ve designed Design-Expert® software to be flexible and user-friendly. For those of you who haven’t had a chance to fully explore its capabilities, here are some tips to help you navigate the software and find options that are useful for you:
A central composite design (CCD) is a type of response surface design that will give you very good predictions in the middle of the design space. Many people ask how many center points (CPs) they need to put into a CCD. The number of CPs chosen (typically 5 or 6) influences how the design functions.
Two things need to be considered when choosing the number of CPs in a central composite design:
1) Replicated center points are used to estimate pure error for the lack of fit test. Lack of fit indicates how well the model you have chosen fits the data. With fewer than five or six replicates, the lack of fit test has very low power. You can compare the critical F-values (with a 5% risk level) for a three-factor CCD with 6 center points, versus a design with 3 center points. With a full quadratic model, the 6 center point design has a critical F-value for lack of fit of 5.05 (5 and 5 degrees of freedom), while the 3 center point design has a critical F-value of 19.30 (5 and 2 degrees of freedom). This means that the design with only 3 center points is far less likely to show a significant lack of fit, even if it is there, making the test almost meaningless.
TIP: True “replicates” are runs that are performed at random intervals during the experiment. It is very important that they capture the true normal process variation! Do not run all the center points back to back, because their variation will then most likely underestimate the real process variation.
2) The default number of center points provides near uniform precision designs. This means that the prediction error inside a sphere that has a radius equal to the +/- 1 levels is nearly uniform. Thus, your predictions in this region (+/- 1) are equally good. Too few center points inflate the error in the region you are most interested in. This effect (a “bump” in the middle of the graph) can be seen by viewing the standard error plot, as shown in Figures 1 & 2 below. (To see this graph, click on Design Evaluation, Graph and then View, 3D Surface after setting up a design.)
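The critical F-values quoted in point 1 can be checked with a short script. The sketch below assumes a standard CCD (full two-level factorial core, two axial points per factor, no blocks), a full quadratic model, and replication at the center point only. The function names are made up for illustration, and the F quantile is found by simple numerical integration and bisection rather than by a statistics library.

```python
import math

def f_cdf(x, d1, d2, n=4000):
    """P(F <= x) for an F(d1, d2) variable; assumes d1 >= 2 and d2 >= 2.

    Applies Simpson's rule (n even) to the regularized incomplete beta integral.
    """
    a, b = d1 / 2, d2 / 2
    z = d1 * x / (d1 * x + d2)  # map the F value onto the beta scale [0, 1)
    g = lambda t: t ** (a - 1) * (1 - t) ** (b - 1)
    h = z / n
    total = g(0.0) + g(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return total * h / 3 / beta

def f_critical(d1, d2, risk=0.05):
    """Upper-tail critical value of F(d1, d2), found by bisection on the CDF."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_cdf(mid, d1, d2) < 1 - risk:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def lack_of_fit_df(k, n_center):
    """Return (lack-of-fit df, pure-error df) for a k-factor CCD, quadratic model."""
    runs = 2 ** k + 2 * k + n_center          # factorial + axial + center runs
    model_terms = (k + 1) * (k + 2) // 2      # intercept, linear, 2FI, squared
    pure_error_df = n_center - 1              # only the center point is replicated
    lof_df = runs - model_terms - pure_error_df
    return lof_df, pure_error_df

for n_center in (6, 3):
    lof_df, pe_df = lack_of_fit_df(3, n_center)
    crit = f_critical(lof_df, pe_df)
    print(f"{n_center} center points: F({lof_df}, {pe_df}) critical = {crit:.2f}")
```

The lack-of-fit degrees of freedom stay at 5 in both cases. Only the pure-error degrees of freedom drop (from 5 to 2), and that is what drives the critical value up so sharply.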
Ask yourself this—where do you want the best predictions? Most likely at the middle of the design space. Reducing the number of center points away from the default will substantially damage the prediction capability here! Although it can seem tedious to run all of these replicates, the number of center points does ensure that the analysis of the design can be done well, and that the design is statistically sound.
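The “bump” in the middle can also be reproduced numerically. The sketch below builds a three-factor rotatable CCD in coded units, fits nothing, and simply evaluates the prediction variance x₀'(X'X)⁻¹x₀ (in units of σ²) for a full quadratic model at the center and at a point on the radius-1 sphere. The helper names, the choice of alpha = (2^k)^(1/4), and the comparison of 6 versus 2 center points are assumptions made for illustration, not output from the software.

```python
import itertools

def ccd_points(k, n_center, alpha):
    """Factorial, axial, and center points of a CCD in coded units."""
    pts = [list(p) for p in itertools.product((-1.0, 1.0), repeat=k)]
    for i in range(k):
        for a in (-alpha, alpha):
            axial = [0.0] * k
            axial[i] = a
            pts.append(axial)
    pts += [[0.0] * k for _ in range(n_center)]
    return pts

def quadratic_row(x):
    """Model terms: intercept, linear, two-factor interactions, squares."""
    k = len(x)
    row = [1.0] + list(x)
    row += [x[i] * x[j] for i in range(k) for j in range(i + 1, k)]
    row += [xi * xi for xi in x]
    return row

def solve(a, b):
    """Solve a @ v = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def prediction_variance(point, n_center, k=3):
    """x0' (X'X)^-1 x0 for a rotatable CCD; multiply by sigma^2 for real units."""
    alpha = (2 ** k) ** 0.25
    X = [quadratic_row(pt) for pt in ccd_points(k, n_center, alpha)]
    n_terms = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(n_terms)]
           for i in range(n_terms)]
    x0 = quadratic_row(point)
    return sum(u * v for u, v in zip(x0, solve(xtx, x0)))

for n_center in (6, 2):
    center = prediction_variance([0.0, 0.0, 0.0], n_center)
    radius1 = prediction_variance([1.0, 0.0, 0.0], n_center)
    print(f"{n_center} center points: center {center:.3f}, radius 1 {radius1:.3f}")
```

With 6 center points the variance at the center is close to the variance at radius 1, which is the near uniform precision property. With 2 center points the variance at the center is roughly three times larger and exceeds the value at radius 1, which is the bump.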
Has a low R² ever disappointed you during the analysis of your experimental results? Is this really the kiss of death? Is all lost? Let’s examine R² as it relates to factorial design of experiments (DOE) and find out.
R² measures are calculated on the basis of the change in the response (Δy) relative to the total variation of the response (Δy + σ) over the range of the independent factor:
Let’s look at an example. Response y is dependent on factor x in a linear fashion: y = β0 + β1x + ε, where ε is random noise with standard deviation σ.
We run a DOE using levels x1 and x2 in Figure 1 (below) to estimate beta1 (β1). Having the independent factor levels far apart generates a large signal-to-noise ratio (Δ12) and it is relatively easy to estimate β1. Because the signal (Δy) is large relative to the noise (σ), R² approaches one.
What if we had run a DOE using levels x3 and x4 in Figure 1 to estimate β1? Having the independent factor levels closer together generates a smaller signal-to-noise ratio (Δ34) and it is more difficult to estimate β1. We can overcome this difficulty by running more replicates of the experiments. If enough replicates are run, β1 can be estimated with the same precision as in the first DOE using levels x1 and x2. But, because the signal (Δy) is smaller relative to the noise (σ), R² will be smaller, no matter how many replicates are run!
In factorial design of experiments our goal is to identify the active factors and measure their effects. Experiments can be designed with replication so active factors can be found even in the absence of a huge signal-to-noise ratio. Power allows us to determine how many replicates are needed. The delta (Δ) and sigma (σ) used in the power calculation also give us an estimate of the expected R² (see the formula above). In many real DOEs we intentionally limit a factor’s range to avoid problems. Success is measured with the ANOVA (analysis of variance) and the t-tests on the model coefficients. A significant p-value indicates an active factor and a reasonable estimate of its effects. A significant p-value, along with a low R², may indicate a properly designed experiment, rather than a problem!
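The trade-off between factor range, replication, and R² can be sketched with a few lines of arithmetic. The example assumes a balanced two-level design with half the runs at each level, so the signal variance is (Δy/2)² and the expected R² is (Δy/2)² / ((Δy/2)² + σ²). This is one common variance-ratio form, not necessarily the exact formula the article displayed, and the function names and numbers are hypothetical.

```python
import math

def expected_r2(delta_y, sigma):
    """Population R-squared for a balanced two-level design (assumed form)."""
    signal_var = (delta_y / 2) ** 2
    return signal_var / (signal_var + sigma ** 2)

def slope_std_error(delta_x, sigma, n_runs):
    """Standard error of the slope estimate with n_runs split evenly over two levels."""
    sxx = n_runs * (delta_x / 2) ** 2   # sum of squared deviations of x from its mean
    return sigma / math.sqrt(sxx)

beta1, sigma = 2.0, 1.0                 # hypothetical true slope and noise

# Wide factor range with few runs versus a narrow range with many replicates.
designs = {"wide": (4.0, 4), "narrow": (1.0, 64)}
for name, (delta_x, n_runs) in designs.items():
    delta_y = beta1 * delta_x
    r2 = expected_r2(delta_y, sigma)
    se = slope_std_error(delta_x, sigma, n_runs)
    print(f"{name}: runs={n_runs}, expected R2={r2:.3f}, SE(beta1)={se:.3f}")
# wide: runs=4, expected R2=0.941, SE(beta1)=0.250
# narrow: runs=64, expected R2=0.500, SE(beta1)=0.250
```

Cutting the range to a quarter takes sixteen times as many runs to recover the same precision on β1, yet the expected R² stays at 0.5 because the number of runs never enters that formula.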
R² is an interesting statistic, but not of primary importance in factorial DOE. Don’t be fooled by R²!