Part 1/3 – The Basics¶
Response Surface Design and Analysis¶
This tutorial, the first of three in this series, shows how to use Design-Expert® software for response surface methodology (RSM). This class of designs is aimed at process optimization. A case study provides a real-life feel to the exercise.
If you are in a rush to get the gist of design and analysis of RSM, hop past all the “Note” sections. However, if/when you find the time, it will be well worth the effort to explore these “by-the-ways.”
Due to the specific nature of this case study, a number of features that could be helpful to you for RSM will not be implemented in this tutorial. Many of these features are used in the earlier tutorials. If you have not completed all these tutorials, consider doing so before starting this one.
We will presume that you are knowledgeable about the statistical aspects of RSM. For a good primer on the subject, see RSM Simplified 2nd edition (Anderson and Whitcomb, Productivity, Inc., New York, 2016). You will find overviews on RSM and how it’s done via Design-Expert in the on-line Help system. To gain a working knowledge of RSM, we recommend you attend our Response Surface Methods for Process Optimization workshop. Call Stat-Ease or visit our website for a schedule at www.statease.com.
The case study in this tutorial involves production of a chemical. The two most important responses, designated by the letter “y”, are:
- y1 - Conversion (% of reactants converted to product)
- y2 - Activity.
The experimenter chose three process factors to study. Their names and levels are shown in the following table.
|Factor||Units||Low Level (-1)||High Level (+1)|
|A - Time||minutes||40||50|
|B - Temperature||degrees C||80||90|
|C - Catalyst||percent||2||3|
Factors for response surface study
You will study the chemical process using a standard RSM design called a central composite design (CCD). It’s well suited for fitting a quadratic surface, which usually works well for process optimization.
The three-factor layout for this CCD is pictured below. It is composed of a core factorial that forms a cube with sides that are two coded units in length (from -1 to +1 as noted in the table above). The stars represent axial points. How far out from the cube these should go is a matter for much discussion between statisticians. They designate this distance “alpha” – measured in terms of coded factor levels. As you will see, Design-Expert offers a variety of options for alpha.
Assume that the experiments will be conducted over a two-day period, in two blocks:
- Twelve runs: composed of eight factorial points, plus four center points.
- Eight runs: composed of six axial (star) points, plus two more center points.
Design the Experiment¶
Start the program and click the blank-sheet icon on the left of the toolbar and then click Response Surface from the list of designs on the left to show the designs available for RSM.
The default selection is the Central Composite design, which is used in this case study.
To see alternative RSM designs for three or more factors, click at far left on Box Behnken (notice 17 runs near the screen bottom) and Miscellaneous designs, where you find the 3-Level Factorial option (32 runs, including 5 center points). Now go back and re-select Central Composite design.
If not already entered, click the up arrow in the Numeric Factors entry box and Select 3 as shown below.
Before entering factors and ranges, click Options. Notice that it defaults to a Rotatable design with the axial (star) points set at 1.68179 coded units from the center – a conventional choice for the CCD.
Many options are statistical in nature, but one that produces less extreme factor ranges is the “Practical” value for alpha. This is computed by taking the fourth root of the number of factors (in this case 3¼ or 1.31607). See RSM Simplified Chapter 8 “Everything You Should Know About CCDs (but dare not ask!)” for details on this practical versus other levels suggested for alpha in CCDs – the most popular of which may be the “Face Centered” (alpha equals one). Press OK to accept the rotatable value. (Note: you won’t get the “center points in each axial block” option until you change to 2 blocks in this design, as below).
Using the information provided in the table on page 1 of this tutorial (or on the screen capture below), type in the details for factor Name (A, B, C), Units, and Low and High levels.
You’ve now specified the cubical portion of the CCD. As you did this, Design-Expert calculated the coded distance “alpha” for placement on the star points in the central composite design.
Alternatively, by clicking the “entered factor ranges in terms of alphas” option you can control how far out the runs will go for each of your factors.
Now return to the bottom of the central composite design form. Leave Type at its default value of Full (the other option is a “small” CCD, which we do not recommend unless you must reduce the number of runs to the bare minimum). You will need two blocks for this design, one for each day, so click the Blocks field and select 2.
Notice the software displays how this CCD will be laid out in the two blocks – for example, 4 center points will go in one and 2 in the other. Click Next to reach the second page of the “wizard” for building a response surface design. You now have the option of identifying Block Names. Enter Day 1 and Day 2 as shown below.
Press Next to enter Responses. Select 2 from the pull down list. Now enter the response Name and Units for each response as shown below.
At any time in the design-building phase, you can return to the previous page by pressing the Back button. Then you can revise your selections. Press Finish to view the design layout (your run order may differ due to randomization).
Design-Expert offers many ways to modify the design and how it’s laid out on-screen. Preceding tutorials, especially Part 2 for General One Factor, delved into this in detail, so go back and look this over if you haven’t already. Click the Tips () button for a refresher.
Save the Data to a File¶
Now that you’ve invested some time into your design, it would be prudent to save your work. Click the File menu item and select Save As.
You can now specify your File name (we suggest tut-RSM) to Save as type “*.dxpx” in the Data folder for Design-Expert (or wherever you want to Save in).
Enter the Response Data – Create Simple Scatter Plots¶
Assume that the experiment is now completed. At this stage, the responses must be entered into Design-Expert. We see no benefit to making you type all the numbers, particularly with the potential confusion due to differences in randomized run orders. Therefore, use the Help, Tutorial Data menu and select Chemical Conversion from the list.
Let’s examine the data! Click on the Design node on the left to view the design spreadsheet. Move your cursor to Std column header and right-click to bring up a menu from which to select Sort Ascending (this can also be done via a double-click on the header).
Now right-mouse click the Select column header (top left cell) and choose Space Point Type.
Notice the new column identifying points as “Factorial,” “Center” (for center point), and so on. Notice how the factorial points align only to the Day 1 block. Then in Day 2 the axial points are run. Center points are divided between the two blocks.
Unless you change the default setting for the Select option, do not expect the Type column to appear the next time you run Design-Expert. It is only on temporarily at this stage for your information.
Before focusing on modeling the response as a function of the factors varied in this RSM experiment, it will be good to assess the impact of the blocking via a simple scatter plot. Click the Graph Columns node branching from the design ‘root’ at the upper left of your screen. You should see a scatter plot with factor A:Time on the X-axis and the Conversion response on the Y-axis.
The correlation grid that pops up with the Graph Columns can be very interesting. First off, observe that it exhibits red along the diagonal—indicating the complete (r=1) correlation of any variable with itself (Run vs Run, etc). Block versus run (or, conversely, run vs block) is also highly correlated due to this restriction in randomization (runs having to be done for day 1 before day 2). It is good to see so many white squares because these indicate little or no correlation between factors, thus they can be estimated independently.
For now, it is most useful to produce a plot showing the impact of blocks because this will be literally blocked out in the analysis. Therefore, on the floating Graph Columns tool click the button where Conversion intersects with Block as shown below.
The graph visually shows there is not much of a difference between block 1 and 2. The points on Day 1 and Day 2 are both spread around about the same average value. Bear in mind that whatever the difference may be between blocks, it will be filtered out mathematically so as not to bias the estimation of factor effects.
Change the Y Axis to Activity (by clicking down the column one box) to see how it’s affected by the day-to-day blocking (not much, either!).
Finally, to see how the responses correlate, change the X Axis to Conversion. Now that we have 2 numeric factors along the axes, we can see the correlation between them. In the upper left of the legend you will see the correlation number is 0.224, showing slight correlation.
You may also note there is a faded pink color in the box you clicked in the grid to get this graph, denoting a slight upward correlation as you go from left to right on the graph.
Feel free to make other scatter plots, but the ones done thus far will be most valuable so it will be best to move on from here.
Notice you can also color selected factors, including run (default). For example, choose Color by Block to see which points were run in block 1 (black) and block 2 (red).
However, do not get carried away with this, because it will be much more productive to do statistical analysis first – before drawing any conclusions.
Analyze the Results¶
Now let’s start analyzing the responses numerically. Under the Analysis branch click the node labeled Conversion. A new set of tabs appears at the top of your screen. They are arranged from left to right in the order needed to complete the analysis. What could be simpler?
Design-Expert provides a full array of response transformations via the Transform option. Click Tips for details. For now, accept the default transformation selection of None.
Now click the Fit Summary tab. At this point Design-Expert fits linear, two-factor interaction (2FI), quadratic, and cubic polynomials to the response. At the top is the response identification, immediately followed below, in this case, by a warning: “The Cubic Model is aliased.” Do not be alarmed. By design, the central composite matrix provides too few unique design points to determine all the terms in the cubic model. It’s set up only for the quadratic model (or some subset).
Next you will see several extremely useful tables for model selection. Each table is discussed briefly via sidebars in this tutorial on RSM.
Use the blue layout buttons to choose how many panes are visible on your screen at once.
The table of “Sequential Model Sum of Squares” (technically “Type I”) shows how terms of increasing complexity contribute to the total model.
The Sequential Model Sum of Squares table: The model hierarchy is described below:
- “Linear vs Block”: the significance of adding the linear terms to the mean and blocks,
- “2FI vs Linear”: the significance of adding the two factor interaction terms to the mean, block, and linear terms already in the model,
- “Quadratic vs 2FI”: the significance of adding the quadratic (squared) terms to the mean, block, linear, and twofactor interaction terms already in the model,
- “Cubic vs Quadratic”: the significance of the cubic terms beyond all other terms.
For each source of terms (linear, etc.), examine the probability (“Prob > F”) to see if it falls below 0.05 (or whatever statistical significance level you choose). So far, Design-Expert is indicating (via bold highlighting) the quadratic model looks best – these terms are significant, but adding the cubic order terms will not significantly improve the fit. (Even if they were significant, the cubic terms would be aliased, so they wouldn’t be useful for modeling purposes.) Move down to the Lack of Fit Tests pane for Lack of Fit tests on the various model orders.
The “Lack of Fit Tests” pane compares residual error with “Pure Error” from replicated design points. If there is significant lack of fit, as shown by a low probability value (“Prob > F”), then be careful about using the model as a response predictor. In this case, the linear model definitely can be ruled out, because its Prob > F falls below 0.05. The quadratic model, identified earlier as the likely model, does not show significant lack of fit. Remember that the cubic model is aliased, so it should not be chosen.
Look over the last pane in the Fit Summary report, which provides “Model Summary Statistics” for the ‘bottom line’ on comparing the options
The quadratic model comes out best: It exhibits low standard deviation (“Std. Dev.”), high “R-Squared” values, and a low “PRESS.”
The program automatically underlines at least one “Suggested” model. Always confirm this suggestion by viewing these tables.
From the main menu select Help, Screen Tips or simply press the lightbulb icon () for more information about the procedure for choosing model(s).
Design-Expert allows you to select a model for in-depth statistical study. Click the Model tab at the top of the screen to see the terms in the model.
The program defaults to the “Suggested” model shown in the earlier Fit Summary table.
If you want, you can choose an alternative model from the Process Order pull-down list. (Be sure to try this in the rare cases when Design-Expert suggests more than one model.)
Also, you could now manually reduce the model by clicking off insignificant effects. For example, you will see in a moment that several terms in this case are marginally significant at best. Design-Expert provides several automatic reduction algorithms as alternatives to the “Manual” method: “Backward,” “Forward,” and “Stepwise.” Click the “Auto Select…” button to see these. From more details, try Screen Tips and/or search Help.
Click the ANOVA tab to produce the analysis of variance for the selected model.
The ANOVA in this case confirms the adequacy of the quadratic model (the Model Prob > F is less than 0.05.) You can also see probability values for each individual term in the model. You may want to consider removing terms with probability values greater than 0.10. Use process knowledge to guide your decisions.
Next, move over to the Fit Statistics pane to see that Design-Expert presents various statistics to augment the ANOVA. The R-Squared statistics are very good — near to 1.
Next, move down to the Coefficients pane to bring the following details to your screen, including the mean effect-shift for each block, that is; the difference from Day 1 to Day 2 in the response.
Press Coded Equation to bring the next section to your screen — the predictive models in terms of coded factors. Click Actual Equation for the the predictive models in terms of actual factors. Block terms are left out. These terms can be used to re-create the results of this experiment, but they cannot be used for modeling future responses.
You cannot edit any ANOVA outputs. However, you can copy and paste the data to your favorite word processor or spreadsheet. Also, as detailed in the One-Factor RSM tutorial, Design-Expert provides a tool to export equations directly to Excel in a handy format that allows you to ‘plug and chug’; that is, enter whatever inputs you like to generate predicted response. This might be handy for clients who are phobic about statistics. ; )
Diagnose the Statistical Properties of the Model¶
The diagnostic details provided by Design-Expert can best be grasped by viewing plots available via the Diagnostics tab. The most important diagnostic — normal probability plot of the residuals — appears in the first pane.
Data points should be approximately linear. A non-linear pattern (such as an S-shaped curve) indicates non-normality in the error term, which may be corrected by a transformation. The only sign of any problems in this data may be the point at the far right. Click this on your screen to highlight it as shown above.
Notice that residuals are “externally studentized” unless you change their form on the drop-down menu at the top of your screen (not advised).
- Externally calculating residuals increases the sensitivity for detecting outliers.
- Studentized residuals counteract varying leverages due to design point locations. For example, center points carry little weight in the fit and thus exhibit low leverage.
Now click the Resid. vs Run tab.
Now you can see that, although the highlighted run does differ more from its predicted value than any other, there is really no cause for alarm due to it being within the red control limits.
Next move to the Cook’s Distance tab.
Nothing stands out here.
Move on to the Leverage tab. This is best explained by the previous tutorial on One-Factor RSM so go back to that if you did not already go through it. Then skip ahead to DFBETAS, which breaks down the changes in the model to each coefficient, which statisticians symbolize with the Greek letter β, hence the acronym DFBETAS — the difference in betas. For the Term click the down-list arrow and select A as shown in the following screen shot.
You can evaluate ten model terms (including the intercept) for this quadratic predictive model (see sidebar below for help).
Click outside the Term field, reposition your mouse over the Term field and simply scroll your mouse wheel to quickly move up and down the list. In a similar experiment to this one, where the chemist changed catalyst, the DFBETAS plot for that factor exhibited an outlier for the one run where its level went below a minimal level needed to initiate the reaction. Thus, this diagnostic proved to be very helpful in seeing where things went wrong in the experiment.
Now move on to the Report tab in the bottom-right pane to bring up detailed case-by-case diagnostic statistics, many which have already been shown graphically.
The footnote below the table (“Predicted values include block corrections.”) alerts you that any shift from block 1 to block 2 will be included for purposes of residual diagnostics. (Recall that block corrections did not appear in the predictive equations shown in the ANOVA report.)
Examine Model Graphs¶
The residuals diagnosis reveals no statistical problems, so now let’s generate response surface plots. Click the Model Graphs tab. The 2D contour plot of factors A versus B comes up by default in graduated color shading.
Design-Expert displays any actual point included in the design space shown. In this case you see a plot of conversion as a function of time and temperature at a mid-level slice of catalyst. This slice includes six center points as indicated by the dot at the middle of the contour plot. By replicating center points, you get a very good power of prediction at the middle of your experimental region.
The Factors Tool appears on the right with the default plot. Move this around as needed by clicking and dragging the top blue border (drag it back to the right side of the screen to “pin” it back in place. The tool controls which factor(s) are plotted on the graph.
Each factor listed in the Factors Tool has either an axis label, indicating that it is currently shown on the graph, or a slider bar, which allows you to choose specific settings for the factors that are not currently plotted. All slider bars default to midpoint levels of those factors not currently assigned to axes. You can change factor levels by dragging their slider bars or by left-clicking factor names to make them active (they become highlighted) and then typing desired levels into the numeric space near the bottom of the tool. Give this a try.
Click the C: Catalyst toolbar to see its value. Don’t worry if the slider bar shifts a bit — we will instruct you how to re-set it in a moment.
Left-Click the bar with your mouse and drag it to the right.
As indicated by the color key on the left, the surface becomes ‘hot’ at higher response levels, yellow in the ’80’s, and red above 90 for Conversion.
To enable a handy tool for reading coordinates off contour plots, go to View, Show Crosshairs Window (click and drag the titlebar if you’d like to unpin it from the left of your screen). Now move your mouse over the contour plot and notice that Design-Expert generates the predicted response for specific factor values corresponding to that point. If you place the crosshair over an actual point, for example – the one at the far upper left corner of the graph now on screen, you also see that observed value (in this case: 66).
P.S. See what happens when you press the Full option for crosshairs.
Now press the Default button on the Factors Tool to place factor C back at its midpoint.
Open the Factors Sheet by clicking the Sheet… button on the Factors Tool.
In the columns labeled Axis and Value you can change the axes settings by right-clicking, or type in specific values for factors. Give this a try. Then close the window and press the Default button.
P.S. The Terms list on the Factors Tool is a drop-down menu from which you can also select the factors to plot. Only the terms that are in the model are included in this list. At this point in the tutorial this should be set at AB. If you select a single factor (such as A) the graph changes to a One-Factor Plot. Try this if you like, but notice how Design-Expert warns if you plot a main effect that’s involved in an interaction.
Wouldn’t it be handy to see all your factors on one response plot? You can do this with the perturbation plot, which provides silhouette views of the response surface. The real benefit of this plot is when selecting axes and constants in contour and 3D plots. See it by mousing to the Graphs Toolbar and pressing Perturbation or pull it up from the View menu.
For response surface designs, the perturbation plot shows how the response changes as each factor moves from the chosen reference point, with all other factors held constant at the reference value. Design-Expert sets the reference point default at the middle of the design space (the coded zero level of each factor).
Click the curve for factor A to see it better. The software highlights it in a different color as shown above. It also highlights the legend. (You can click that too – it is interactive!) In this case, at the center point, you see that factor A (time) produces a relatively small effect as it changes from the reference point. Therefore, because you can only plot contours for two factors at a time, it makes sense to choose B and C – and slice on A.
Contour Plot: Revisited¶
Let’s look at the plot of factors B and C. Start by clicking Contour on the Graphs toolbar. Then in the Factors Tool right-click the Catalyst bar and select X1 axis by left clicking it.
You now see a catalyst versus temperature plot of conversion, with time held as a constant at its midpoint.
Design-Expert contour plots are highly interactive. For example, right-click up in the hot spot at the upper middle and select Add Flag.
That’s enough on the contour plot for now — hold off until Part 3 of this tutorial to learn other tips and tricks on making this graph and others more presentable. Right-click and Delete flag to clean the slate.
3D Surface Plot¶
Now to really get a feel for how the response varies as a function of the two factors chosen for display, select 3D Surface from the Graphs Toolbar. You then will see three-dimensional display of the response surface. If the coordinates encompass actual design points, these will be displayed. On the Factors Tool move the slide bar for A:time to the right. This presents a very compelling picture of how the response can be maximized. Right-click at the peak to set a flag.
You can see points below the surface by rotating the plot. Move your mouse over the graph. When the cursor turns into a hand, click and hold the left mouse button and then drag.
Seeing an actual result predicted so closely lends credence to the model. Things are really looking up at this point!
Remember that you’re only looking at a ‘slice’ of factor A (time). Normally, you’d want to make additional plots with slices of A at the minus and plus one levels, but let’s keep moving — still lots to be done for making the most of this RSM experiment.
Analyze the Data for the Second Response¶
This step is a BIG one. Analyze the data for the second response, activity. Be sure you find the appropriate polynomial to fit the data, examine the residuals and plot the response surface. Hint: The correct model is linear.
Before you quit, do a File, Save to preserve your analysis. Design-Expert will save your models. To leave Design-Expert, use the File, Exit menu selection. The program will warn you to save again if you’ve modified any files.