Issue: Volume 9, Number 8
Date: August 2009
Mark J. Anderson, Stat-Ease, Inc.

Here's another set of frequently asked questions (FAQs) about doing design of experiments (DOE).

P.S. Quote for the month: Something to ponder at the beach.


1. FAQ: Why give up degrees of freedom for multiple blocks?

-----Original Question-----
Research Fellow, Belgium
"I created a design with 6 blocks. Why does this use up 5 degrees of freedom? I would think that 1 degree of freedom should suffice."

Answer (from Stat-Ease Consultant Wayne Adams):
"Although block effects are not included in the predictive model, they do use up degrees of freedom. These are needed to estimate the difference between the overall grand average and the individual block averages. So if there are six blocks, you need to estimate five block differences from the overall average, with the sixth difference being found through subtraction. Our software mathematically removes this difference from the analysis by applying the block effects. We consider all blocks to be random effects. As such, we still cannot use the block effects in the final predictive model; blocking is merely a way to protect the analysis of the factor effects from uncontrolled changes and/or lack of randomization."
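To see why six blocks cost five degrees of freedom, here is a minimal Python sketch. It is only an illustration and not how Design-Expert computes anything internally; the response values and the helper names (block_deviations, block_df) are made up for this example, and it assumes every block has the same number of runs.

```python
from statistics import mean

# Hypothetical responses: 6 blocks with 3 runs each (made-up numbers).
blocks = {
    1: [10.2, 11.1, 10.6],
    2: [12.0, 12.4, 11.9],
    3: [9.8, 10.1, 10.4],
    4: [11.5, 11.2, 11.9],
    5: [13.1, 12.7, 12.9],
    6: [10.9, 11.3, 11.0],
}


def block_deviations(blocks):
    """Return each block's average minus the overall grand average."""
    all_runs = [y for runs in blocks.values() for y in runs]
    grand = mean(all_runs)
    return {b: mean(runs) - grand for b, runs in blocks.items()}


def block_df(blocks):
    """Degrees of freedom used by blocking: number of blocks minus one."""
    return len(blocks) - 1


devs = block_deviations(blocks)

# With equal block sizes the six deviations sum to zero, so once five
# are estimated the sixth follows by subtraction.
sixth_from_others = -sum(d for b, d in devs.items() if b != 6)

print("block deviations:", {b: round(d, 3) for b, d in devs.items()})
print("sixth deviation recovered from the other five:", round(sixth_from_others, 3))
print("degrees of freedom used by blocks:", block_df(blocks))
```

One degree of freedom would only be enough to describe a single difference, such as one block versus another. Six blocks can each sit at a different level, so five independent differences must be estimated.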

2. FAQ: How do the predictive model equations in coded versus actual terms differ and which is most appropriate to use?

-----Original Question-----
A chemist
"How do the predictive model equations in coded versus actual terms differ and which is most appropriate to use?"

Answer (from Stat-Ease Consultant Wayne Adams):
"This is one of those "it depends" answers. As far as the difference, the units of measure must be considered in the actual model, whereas the coded model compares the lows to the highs within the design ranges. For example, consider two factors—temperature and time. Temperature ranges from a low (-1 coded) setting of 20 deg C to a high (+1 coded) of 40 deg C; time ranges from a low (-1 coded) of 30 seconds to a high (+1 coded) of 60 seconds. If you wanted to know the predicted outcome at 30 deg C for 45 seconds you would enter 30 deg C and 45 seconds into the actual model; or 0 temp and 0 time into the coded model. The prediction would be exactly the same. Due to this simplicity and having the model centered on the design space (middle of the space is all 0), the coded model is the best one to use for interpretation. You can easily answer questions like how different the outcome is between low and high settings equally well for any factor. If you are an advanced user and exporting the model into software that performs capability analysis then you want the actual model to correctly account for the factor standard deviation that is reported in the same units of measure as the factor."
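The temperature and time example above can be sketched in Python as follows. The factor ranges come from the answer, but the model coefficients (B0, B_TEMP, B_TIME, B_INT) are hypothetical values chosen only for illustration, and the function names are not part of any Stat-Ease software.

```python
# Factor ranges from the example: (low, high) in actual units.
TEMP = (20.0, 40.0)   # deg C
TIME = (30.0, 60.0)   # seconds

# Hypothetical coded-model coefficients (assumed, not from a real fit).
B0, B_TEMP, B_TIME, B_INT = 50.0, 8.0, 5.0, 2.0


def center_and_half_range(low, high):
    return (high + low) / 2, (high - low) / 2


def to_coded(actual, low, high):
    """Map an actual setting onto the -1 (low) to +1 (high) coded scale."""
    center, half = center_and_half_range(low, high)
    return (actual - center) / half


def predict_coded(a, b):
    """Coded model: a and b are coded temperature and time."""
    return B0 + B_TEMP * a + B_TIME * b + B_INT * a * b


def actual_coefficients():
    """Substitute the coding transform into the coded model and expand."""
    tc, th = center_and_half_range(*TEMP)
    mc, mh = center_and_half_range(*TIME)
    k_int = B_INT / (th * mh)
    k_temp = B_TEMP / th - k_int * mc
    k_time = B_TIME / mh - k_int * tc
    k0 = B0 - B_TEMP * tc / th - B_TIME * mc / mh + k_int * tc * mc
    return k0, k_temp, k_time, k_int


def predict_actual(temp, time):
    """Actual model: inputs in deg C and seconds."""
    k0, k_temp, k_time, k_int = actual_coefficients()
    return k0 + k_temp * temp + k_time * time + k_int * temp * time


print("coded settings:", to_coded(30, *TEMP), to_coded(45, *TIME))
print("coded prediction: ", predict_coded(0.0, 0.0))
print("actual prediction:", predict_actual(30.0, 45.0))
print("actual coefficients:", actual_coefficients())
```

Both equations give the same prediction at any setting, but the actual coefficients depend on the units of measure, so their sizes cannot be compared directly from one factor to the next. The coded coefficients can, which is the reason for interpreting the coded model.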

3. Expert FAQ: What is the difference between the internally and the externally studentized residuals?

-----Original Question-----
A chemist
"I have some doubt about externally studentized and internally studentized residuals—please explain the difference between them and how they help in the model diagnostics."

Studentizing residuals overcomes differences in leverages that cause some points to be more tightly fitted than others. See for the details. Calculating these externally is useful for assessing outliers because each run is set aside as its statistic gets calculated—what's called a "deletion diagnostic." Also see this primer on studentized residuals that, oddly enough, I found on a reference site for astronomers: Further details on these statistics can be found in your Design-Expert® program Help. That's a good place to start whenever you need statistical information. Refer to topic #2 in this year's February issue of the DOE FAQ Alert posted at Last, but not least, check out this out-take from the Stat-Ease "Handbook for Experimenters" on all the diagnostics we provide in our software:

Consultant Wayne Adams adds:
"There isn't a difference between internally and externally studentized residuals if you have an infinite sample size. ;) On a more practical note, keep in mind that the 'residual' is the actual observation less the prediction from a model. The model used for internal statistics is generated to best fit all the observations. The models (note the plural) used for external statistics are generated missing one observation at a time. The residual shows how well the "new" models fit (predict) the "externalized" observation. So to summarize the difference between the two residuals:

-> Internal: one model with all the data; used to check constant variance, independence, and normality of the residuals.
-> External: new model (residual) calculated for each observation; used to check whether or not a point should be considered a significant outlier."
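The two statistics can be compared with a short Python sketch for a straight-line fit. This uses the standard textbook formulas, which may differ in detail from what Design-Expert reports. The data are made up, with one deliberately suspicious point, and the function names are hypothetical. The external version relies on a well-known shortcut that gives the same answer as refitting the model with each run set aside.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def studentized_residuals(x, y):
    """Return (internal, external) studentized residuals for a line fit."""
    n, p = len(x), 2                      # p = number of model terms
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]   # leverages
    sse = sum(e * e for e in resid)

    internal, external = [], []
    for e, h in zip(resid, lev):
        # Internal: error variance estimated from ALL the runs.
        s2_all = sse / (n - p)
        # External: error variance with this run set aside (shortcut
        # equivalent to refitting the model without the run).
        s2_del = (sse - e * e / (1 - h)) / (n - p - 1)
        internal.append(e / sqrt(s2_all * (1 - h)))
        external.append(e / sqrt(s2_del * (1 - h)))
    return internal, external


# Made-up data: the response at x = 7 is suspiciously high.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 18.0, 16.1]

internal, external = studentized_residuals(x, y)
for xi, r, t in zip(x, internal, external):
    print(f"x={xi}: internal={r:7.3f}  external={t:7.3f}")
```

Because the suspect run inflates the error estimate when it is left in, its internal value is damped. Setting it aside removes that damping, so its external value stands out far more clearly.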

4. Info Alert: DOE for process cheese

Food Product Design magazine published an article on "DOE for Process Cheese" that details how senior scientist Mostafa Galal, Ph.D., and senior technologist Michael Scheller applied a general factorial design on emulsifiers. They discovered an ideal ratio of salts for improved appearance without degrading taste or meltability of the processed cheese. See If you are interested in publishing your DOE story, please contact Heidi via or call her at 612.746.2033. See our current collection of DOE case studies and articles at We are especially in need of applications from the life sciences. Factor details can be coded for secrecy's sake, so confidentiality need not be compromised.


5. Webinar alert (2nd): Analyzing historical data

You are invited to attend a free web conference by Stat-Ease Consultant Pat Whitcomb on "Analyzing Historical Data." This free conference, which Pat will present at an intermediate level statistically, will be broadcast on Wednesday, September 15th at 2 PM USA Central Time* (CT). He will repeat his webinar on Thursday, September 16th at 8 AM. It is aimed at those who need help in trying to make any sense out of pre-existing data using regression modeling. As Pat will point out, there are many perils and pitfalls to watch for when working with happenstance data. Stat-Ease webinars vary somewhat in length depending on the presenter and the particular session—mainly due to breaks for questions: Plan for 45 minutes to 1.5 hours, with 1 hour being the target median.

6. Book Giveaway: Three first editions of "RSM Simplified"

(Sorry, due to the high cost of shipping, this offer applies only to residents of the United States and Canada.) Simply reply to this e-mail by February 13 if you'd like (free!) one of three copies of "RSM Simplified" by Anderson & Whitcomb. This book just underwent a new printing*, so we can give away these first editions. I will forward your e-mail entries to my assistant Karen. Do not expect to hear from either of us unless your name is drawn as a winner. However, we do appreciate your participation in these giveaways. Watch for more of these in future DOE FAQ Alerts. Your odds of winning a free book increase by entering each time around!

PS. On August 3rd I received this very kind kudo from a reader: "As an engineer in the semiconductor-industry I enjoyed reading your book RSM Simplified and made use of it very successfully. :) The style of writing is perfectly for learning and using RSM designs! I think there is no comparable book on the market."


7. Events Alert: Talk on practical aspects of algorithmic design of physical experiments

Pat Whitcomb will deliver a talk (co-authored by Wayne Adams) on the practical aspects of algorithmic (optimal) design of physical experiments at the annual conference of the European Network for Business and Industrial Statistics (ENBIS), which will be held September 20-24 in Goteborg, Sweden. For details on this event, see Click for a list of upcoming appearances by Stat-Ease professionals. We hope to see you sometime in the near future!

8. Workshop Alert: See when and where to learn about DOE

Seats are filling fast for the following DOE classes. If possible, enroll at least 4 weeks prior to the date so your place can be assured. However, do not hesitate to ask whether seats remain in classes that are fast approaching!

—> Experiment Design Made Easy (EDME)
> August 18-20, 2009 (Minneapolis)
> November 3-5, 2009 (Minneapolis)

—> Mixture Design for Optimal Formulations (MIX)
> August 11-13 (Minneapolis)
> October 27-29 (Minneapolis)

—> Response Surface Methods for Process Optimization (RSM)
> December 1-3 (Minneapolis)

—> Designed Experiments for Life Sciences (DELS)
> November 10-11, 2009 (Cambridge, MA)

—> DOE for DFSS: Variation by Design (DDFSS)
> November 17-18, 2009 (Minneapolis)

Mark J. Anderson, PE, CQE
Principal, Stat-Ease, Inc.




2021 East Hennepin Avenue, Suite 480
Minneapolis, Minnesota 55413 USA

PS. Quote for the month — something to ponder at the beach:

"Statistics are like a bikini—what they reveal is suggestive, but what they conceal is vital."

—Aaron Levenstein, 1911-1986, Professor of Business Administration, Baruch College 1961-1981.

Trademarks: Stat-Ease, Design-Ease, Design-Expert and Statistics Made Easy are registered trademarks of Stat-Ease, Inc.

