DOE FAQ Alert

**Issue:** Volume 2, Number 12
**Date:** December 2002


Here's another set of frequently asked questions (FAQs) about doing design of experiments (DOE), plus alerts to timely information and free software updates. If you missed previous DOE FAQ Alerts, click on the links below. Feel free to forward this newsletter to your colleagues. They can subscribe by going to http://www.statease.com/doealertreg.html.

Here's an appetizer on how "transmitters could give fans and pundits quick stats fixes" according to "Nature" magazine at http://www.nature.com/nsu/021104/021104-6.html. Who wouldn't want more stats? However, the complexion of soccer (or "football" as it's called outside of the USA) could change completely, perhaps not for the better. The American version of football already makes use of many high-tech gadgets, such as telecommunications equipment allowing coaches to instruct their quarterback (via a helmet receiver) on what play to run. It may be just a matter of time before these sports get taken over completely by electronics. I wonder if this device could be modified so that referees could give badly behaving players a good jolt. Imagine if the fans could take control of this feature!

Here's what I cover in the body text
of this DOE FAQ Alert (topics that delve into statistical detail are designated
"Expert"):

1. FAQ: Explaining differences between predictive models
in coded versus actual units

2. Expert-FAQ: Why predicted R-squared is not computed
for a two-level factorial design (Challenge to readers - analyze this yourself!)

3. Info Alerts: New Six Sigma article on DOE

4. Workshop alert: Upcoming classes, tuition increase

5. Reader contribution: DOE for Design for Six Sigma (DFSS),
more wanted on this subject

PS. Quote for the month - Einstein's supposed experiment
(a joke?) to explain relativity in lay person's terms (link provided to details
from Scientific American - check it out!)

Best wishes for a happy holiday season!
Mark

**1 - FAQ: Explaining differences
between predictive models in coded versus actual units**

**-----Original Question-----
From: California**

"I have looked at the coded and uncoded equations but I have never understood how to use them to determine the level of importance of the factors. The coded equation seems to say that one factor is most important, but the actual equation seems to say other factors are most important. For example, in the analysis detailed by your two-level factorial tutorial (posted at http://www.statease.info/dx6files/manual/DX03-Factorial-Levels-Two.pdf) the equations look quite different:

Final Equation in Terms of Coded Factors:

Filtration Rate = +70.06 + 10.81 * A + 4.94 * C + 7.31 * D - 9.06 * A * C + 8.31 * A * D

Final Equation in Terms of Actual Factors:

Filtration Rate = -36.75000 + 2.37500 * Temperature + 53.54545 * Concentration - 4.96970 * Stir Rate - 1.64773 * Temperature * Concentration + 0.20152 * Temperature * Stir Rate

Can you give me a simple way to understand this?"

**Answer (from Stat-Ease consultant
Pat Whitcomb):**

"For process understanding, use coded values, because:

1. Regression coefficients tell us how the response changes relative to the
intercept. The intercept in coded values is in the center of our design. In
actual values the intercept can be, and usually is, far from the design space.

2. Units of measure are normalized (removed) by coding. Coefficients measure
half the change from -1 to +1 for all factors."

Note to readers: FYI, Pat answered a very similar FAQ last April. You probably will find this informative as a reminder about how the models get coded and uncoded. See FAQ #1 at http://www.statease.com/news/faqalert2-4.html.
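To see the mechanics behind Pat's answer, here is a minimal Python sketch. The coded model is taken from the tutorial above, but the factor low/high ranges are hypothetical, chosen only to illustrate the unit conversion; the point is that predictions agree at every settings combination even though the coefficient magnitudes look completely different.

```python
# Sketch: why coded coefficients rank factor importance while
# actual-unit coefficients do not. The coded model comes from the
# Stat-Ease two-level factorial tutorial; the factor ranges below are
# HYPOTHETICAL, used only to demonstrate the coding arithmetic.

def to_coded(actual, low, high):
    """Map an actual factor setting onto the -1..+1 coded scale."""
    center = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return (actual - center) / half_range

def filtration_coded(a, c, d):
    """Coded-unit model: every factor runs from -1 to +1, so the
    coefficient magnitudes are directly comparable."""
    return 70.06 + 10.81*a + 4.94*c + 7.31*d - 9.06*a*c + 8.31*a*d

def filtration_actual(temp, conc, stir,
                      temp_range=(24.0, 35.0),   # hypothetical low/high
                      conc_range=(2.0, 4.0),     # hypothetical
                      stir_range=(15.0, 30.0)):  # hypothetical
    """Actual-unit prediction obtained by coding first; it must agree
    with the coded model at every point, even though actual-unit
    coefficients would look entirely different."""
    a = to_coded(temp, *temp_range)
    c = to_coded(conc, *conc_range)
    d = to_coded(stir, *stir_range)
    return filtration_coded(a, c, d)

# The coded intercept is the prediction at the design center; the
# actual-unit intercept is the prediction at temp = conc = stir = 0,
# usually far outside the design space.
print(filtration_coded(0, 0, 0))           # 70.06, the design center
print(filtration_actual(29.5, 3.0, 22.5))  # same point in actual units
```

Because every coded factor spans the same -1 to +1 interval, +10.81 for A versus +4.94 for C really does mean A matters roughly twice as much over the design space; the actual-unit coefficients mix in the units of measure and cannot be compared this way.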

(Learn more about analyzing data from two-level factorial designs by attending the 3-day computer-intensive workshop "Experiment Design Made Easy." For a complete description see http://www.statease.com/clasedme.html. Link from this page to the course outline and schedule. Then, if you like, enroll online.)

**2 - Expert-FAQ: Why predicted
R-squared is not computed for a two-level factorial design (Challenge to readers
- analyze this yourself!)**

**-----Original Message-----
From: Georgia**

"Why does your software list the predicted R-squared as being not available (N/A)? It also reports significant curvature in my models. What can I do about this? I am sending you my data with disguised names and coded levels for the factors. Thanks for your help, and for your good software."

Note to readers: Here's the data in standard order for the two responses that came from this questioner's 2^3 factorial design with 3 center points.

**Y1:** 100, 12.5, 40, 10, 45.5, 24.5, 16, 17, 18, 17, 21.5

**Y2:** 100, 12, 78.5, 5.5, 67, 0, 0, 0, 6.5, 3.5, 3.5

Try analyzing this for yourself. If you have access to Design-Ease® or Design-Expert® software (obtain a free trial at http://www.statease.com/soft_ftp.html), set up the design (specify 2 responses), sort it in standard order and then copy and paste the data into the empty response columns. You can then use the software to do the statistical analysis.
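If you'd rather work through the arithmetic by hand, here is a minimal Python sketch of the effect calculation. It assumes the eight factorial runs come first, in Yates standard order ((1), a, b, ab, c, ac, bc, abc), with the three center points last, which is consistent with the clustered values at the end of each response.

```python
# Sketch: estimating the seven factorial effects for a 2^3 design.
# Assumes Yates standard order with the 3 center points appended last.

y1 = [100, 12.5, 40, 10, 45.5, 24.5, 16, 17, 18, 17, 21.5]
y2 = [100, 12, 78.5, 5.5, 67, 0, 0, 0, 6.5, 3.5, 3.5]

def effects(y):
    """Return {effect name: estimate} for a 2^3 factorial."""
    runs = y[:8]  # center points don't enter the effect estimates
    # Coded levels in Yates standard order: A cycles fastest.
    levels = [(a, b, c) for c in (-1, 1) for b in (-1, 1) for a in (-1, 1)]
    cols = {
        "A":   [a for a, b, c in levels],
        "B":   [b for a, b, c in levels],
        "C":   [c for a, b, c in levels],
        "AB":  [a * b for a, b, c in levels],
        "AC":  [a * c for a, b, c in levels],
        "BC":  [b * c for a, b, c in levels],
        "ABC": [a * b * c for a, b, c in levels],
    }
    # Effect = (mean at +1) - (mean at -1) = contrast / 4 for 8 runs.
    return {name: sum(s * r for s, r in zip(col, runs)) / 4.0
            for name, col in cols.items()}

print(effects(y1))
print(effects(y2))
```

On the raw scale the A effect for Y1 comes out strongly negative; try re-running the calculation after transforming the responses, as discussed in the answer that follows.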

**Answer:**

You've got two remarkable responses. However, you could not see this because
for both responses you picked every possible effect. That's why you got the
message in the analysis of variance (ANOVA): "Case(s) with leverage of
1.0000: Pred R-Squared and PRESS statistic not defined." The PRESS (on
which predicted (Pred) R-squared is based) stands for "predicted residual
sum of squares." It tries to refit your model with each point in turn (one
at a time) taken out, thus providing a more acid test of how well it predicts
your response. Unfortunately, PRESS cannot work when you've used up every bit
of information, which is what you did by picking all the estimable effects.
You performed eight unique factorial runs (centerpoints don't help estimate
the factorial effects) from which you estimated the mean (overall average of
response data) plus three main effects (A, B, C), three two-factor interactions
(AB, AC, BC) and one three-factor interaction (ABC).
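The leverage problem is easy to verify numerically. Here is a sketch (using numpy, not the author's software) that builds the full model matrix for the eight factorial runs and shows every leverage equals exactly 1, so the PRESS residual e_i/(1 - h_ii) divides by zero; drop even one term and the leverages fall below 1 again.

```python
# Sketch: why PRESS (and thus predicted R-squared) is undefined for a
# saturated model. With 8 factorial runs and 8 model terms (intercept,
# A, B, C, AB, AC, BC, ABC), every run has leverage exactly 1.
import numpy as np

# Full model matrix for a 2^3 factorial in Yates standard order.
a = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
b = np.array([-1, -1, 1, 1, -1, -1, 1, 1])
c = np.array([-1, -1, -1, -1, 1, 1, 1, 1])
X = np.column_stack([np.ones(8), a, b, c, a*b, a*c, b*c, a*b*c])

# Hat matrix H = X (X'X)^-1 X'; its diagonal holds the leverages.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)
print(leverage)  # every run predicts only itself

# Removing even one term (here ABC) frees up information:
X_reduced = X[:, :-1]
H_reduced = (X_reduced
             @ np.linalg.inv(X_reduced.T @ X_reduced)
             @ X_reduced.T)
print(np.diag(H_reduced))  # leverages drop to 7/8
```

With leverage 1, deleting a run leaves the model no way to predict it, which is exactly the message the software reports.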

Starting over again with the analyses, I noticed that both your responses exhibited very wide ranges - from 10 to 100 for one, and 0 to 100 for the other. In such cases, it almost always helps to apply response transformations.

For the first response I applied a log transformation, which clarified the picture on the half-normal plot of effects considerably. Some main effects and two-factor interactions now stand out at the high end, while the three-factor interaction and other effects now line up with the pure error estimated from your centerpoints. Also, the ANOVA statistics come out much improved. The residual diagnostics look good in this transformed scale. Finally, the Box-Cox plot supports the use of log.

For the second response I applied
a somewhat less severe transformation, the square root, thus avoiding the problem
of logging zeroes (not possible!). This did not appear to help much, but I picked
the biggest effect (A) on the half-normal plot and pressed ahead with ANOVA
and the diagnostics. At this stage it became immediately obvious that you've
got a statistical outlier (more than 6 standard deviations off according to
the Outlier t plot). After ignoring this deviant run (find out what happened!),
things cleared up amazingly on the half-normal plot - two main effects emerged
(A and C). The ANOVA shows no more significant curvature and the residuals look
great. The Box-Cox plot recommends the square root, so this proved to be a good choice for transformation.
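The reason the square root works where the log cannot is worth making concrete. Here is a minimal Python sketch showing that a log transform fails outright on this response's zeros, while the square root is defined everywhere and still compresses the wide 0-to-100 range.

```python
# Sketch: square root vs. log for a response containing zeros.
import math

y2_runs = [100, 12, 78.5, 5.5, 67, 0, 0, 0]  # factorial runs only

# math.log(0) raises ValueError, so the log transform is off the table:
try:
    transformed = [math.log(y) for y in y2_runs]
except ValueError:
    print("log transform impossible: response contains zeros")

# The square root is defined for every y >= 0 and still tames the
# wide range before effect estimation:
y2_sqrt = [math.sqrt(y) for y in y2_runs]
print(y2_sqrt)
```

A common alternative, when a log scale is strongly indicated, is to log (y + small constant), but as the analysis above shows, the milder square root was all this response needed.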

This worked out so well that I'm suspicious you set me up for this by making up data. Is this for real? Do my revised analyses make sense to you?

I got this heart-warming response
back:

"Mark - Thanks for your help and for your prompt response. No, this is
real data (unless someone is setting ME up) and you have saved me and my company
real time, money, and head-scratching. As always, when you do good work, your
reward may be more work to do. But I hope this will teach me a lesson and I
will know a few more things to try. Actually, you taught me the lesson about
transforms when I took your course, I was just prejudiced toward using them
only when the physical situation implied them. And believe me, this one did
not."

**3 - Info alert: Six Sigma article
on DOE**

Here's a link to a short, but informative
article on "How To Compare Data Sets" using analysis of variance:
http://www.isixsigma.com/tools-templates/analysis-of-variance-anova/how-compare-data-sets-anova/.
It shows how to do the calculations via Microsoft Excel, but it refers to output
from our Design-Expert software, which provides a clear picture of the effects.
I like the article because it proves how great I am at bowling. My fellow consultants
at Stat-Ease don't fare so well, especially Shari, whose name got misspelled.
:( The names may not be correct, but the data is reliable, at least according
to this biased bowler. :)

**4 - Workshop alert: Upcoming classes,
tuition increase**

See http://www.statease.com/clas_pub.html for schedule and site information on all Stat-Ease workshops open to the public. To enroll, click the "register online" link at our web site or call Stat-Ease at 1.612.378.9449. If spots remain available, bring along several colleagues and take advantage of quantity discounts in tuition,* or consider bringing in an expert from Stat-Ease to teach a private class at your site. Call us to get a quote.

*(Prices will be going up for individuals
attending public Stat-Ease workshops in 2003. However, if you prepay by December
31, we will hold the tuition to the current level.)

**5 - Reader contribution: Article
posted on DOE for Design for Six Sigma (DFSS), more wanted on this subject**

Peter Peterka submitted his thoughts on DOE for Six Sigma and Design for Six Sigma (DFSS) which are posted at http://www.statease.com/pubs/sixsigma&DOE.pdf. Peter formerly worked at 3M Company as a product development and improvement specialist, where he routinely used Stat-Ease software for DOE. He is now an independent consultant (see http://www.6sigma.us). I've asked Peter to follow up on his submission by detailing an actual application of DOE for DFSS. Thanks, Peter, for your contributions on this vital subject.

Stat-Ease would appreciate more submissions on DOE for Six Sigma, not only for its DOE FAQ Alert, but also on behalf of the new International Journal of Six Sigma, which we have offered to help via its Editorial Advisory Board. The editors of this double-blind refereed journal (still in the works) are asking for articles on Six Sigma, preferably ones that document DOE projects with measurable results. Please send anything you've got on this subject to me at my e-mail address shown below.

I hope you learned something from this issue. Address your questions and comments to me at:

Mark J. Anderson, PE, CQE

Principal, Stat-Ease, Inc. (http://www.statease.com)

Minneapolis, Minnesota USA

**PS. Quote for the month** -
Einstein's supposed experiment (a joke?) to explain relativity in lay person's
terms:

"When a man sits with a pretty girl for an hour, it
seems like a minute. But let him sit on a hot stove for a minute and it's longer
than any hour. That's relativity."

- Albert Einstein (for the rest of
the story, see Scientific American at http://makeashorterlink.com/?L24A12DF1.
Be patient with this link, which got shortened from the original path via an
intermediary site. It takes a moment to process.)

Trademarks: Design-Ease, Design-Expert and Stat-Ease are registered trademarks of Stat-Ease, Inc.

Acknowledgements to contributors:

- Students of Stat-Ease training and users of Stat-Ease software

- Fellow Stat-Ease consultants Pat Whitcomb and Shari Kraber (see http://www.statease.com/consult.html
for resumes)

- Statistical advisor to Stat-Ease: Dr. Gary Oehlert (http://www.statease.com/garyoehl.html)

- Stat-Ease programmers, especially Tryg Helseth (http://www.statease.com/pgmstaff.html)

- Heidi Hansel, Stat-Ease marketing director, and all the remaining staff.

**Interested in previous DOE FAQ Alert e-mail newsletters? To view a past issue, choose it below.**

**#1 - Mar 01, #2 - Apr 01, #3 - May 01, #4 - Jun 01, #5 - Jul 01, #6 - Aug 01, #7 - Sep 01, #8 - Oct 01, #9 - Nov 01, #10 - Dec 01, #2-1 - Jan 02, #2-2 - Feb 02, #2-3 - Mar 02, #2-4 - Apr 02, #2-5 - May 02, #2-6 - Jun 02, #2-7 - Jul 02, #2-8 - Aug 02, #2-9 - Sep 02, #2-10 - Oct 02, #2-11 - Nov 02, #2-12 - Dec 02 (see above)**