Here's another set of
frequently asked questions (FAQs) about doing design of experiments
(DOE), plus alerts to timely information and free software updates.
If you missed previous DOE FAQ Alerts, please click on the links at
the bottom of this page. If you have a question that needs answering, click the Search tab and enter the key words. This finds not only answers from previous Alerts, but also other documents posted to the Stat-Ease web site.
Here's an appetizer to get this Alert off to a good start:
http://www.sciencedaily.com/. It brings up "Science Daily"--a free, award-winning online magazine for the latest scientific discoveries and research. Browse or easily search this comprehensive guide to what's happening in the fields of science, including statistics.
Also, check out the neat posters by the Statistical Graphics section of the American Statistical Association (ASA) offered at http://www.public.iastate.edu/~dicook/Stat.Graphics/posters.html. I especially like poster 3 ("Pick a box..."), which illustrates the value of graphics versus raw statistics (1 graph = 1000 numbers?).
Here's what I cover in the body text of this DOE FAQ Alert (topics that delve into statistical detail are designated "Expert"):
1. Newsletter alert: The May issue of the Stat-Teaser features "Katie's Coke versus Pepsi DOE"--a lesson on aliasing
1. Newsletter alert: The May issue of the Stat-Teaser features "Katie's Coke versus Pepsi DOE"--a lesson on aliasing
Many of you by now may have received a printed copy of the latest Stat-Teaser, but others, by choice or because you reside outside of North America, will get your first look at the May issue at http://www.statease.com/news/news0405.pdf.
The feature article, "Katie's Coke versus
Pepsi DOE," details
a fundamental mistake in design by a novice experimenter--my youngest
daughter. Her Coke versus Pepsi taste test turned out to be a good
lesson on inadvertent aliasing of effects. However, I did experience
a significant jolt from both caffeinated colas!
Other stories in the Stat-Teaser provide:
*(To master these powerful tools of DOE,
attend our "Mixture Design for Optimal Formulation" workshop.
For a description, see http://www.statease.com/clas_mix.html.
Link from this page to the course outline and schedule. You can
enroll online by linking to the Stat-Ease e-commerce page for workshops.)
2. Warning--DOE under attack: Follow the link to "Limitations of Experimental Design..." and read my rebuttal
"CASE STUDY: Limitations of Experimental Design for Function Approximation and Resulting Penalties in Multi-Objective Optimization" by David Lengacher and Andrew Turner
Answer (my rebuttal):
For example, L&T's specification of Y<X (where each factor ranges from 1 to 5) can be rearranged algebraically as 0<X-Y--a multilinear constraint. Using a statistical algorithm called "D-optimal," I had a specialized software package, Design-Expert®, generate a 16-run DOE (shown below) geared for a nonlinear function such as that used by L&T in their simulation.
# Type (X,Y)
This test plan shows under the "Type" column where each run is located geometrically. The points labeled "Vertex" fall at a corner of the triangular space that L&T established as the feasible region for their hypothetical process. The ones identified as "Edge" are located at the centers of the sides forming the constraints. The "Center" is the overall centroid of the experimental region. "Check" points fill in gaps within the interior space.
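The geometry of these point types can be sketched in a few lines of Python. This is only an illustration, assuming the feasible triangle is bounded by X = 5, Y = 1, and Y = X (which follows from the 1-to-5 factor ranges and the Y<X constraint, with the boundary included for convenience). The function names are hypothetical, and none of this is output from Design-Expert.

```python
# Sketch: locate the "Vertex", "Edge", and "Center" points of a triangular
# design region. Assumed region: 1 <= Y <= X <= 5 (boundary included).

def midpoint(p, q):
    """Center of the line segment joining points p and q."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def centroid(points):
    """Average of the given points (the overall centroid for a triangle)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def is_feasible(x, y):
    """True when (x, y) lies in the assumed region."""
    return 1 <= y <= x <= 5

# Corners of the assumed feasible triangle, as (X, Y) pairs.
vertices = [(1.0, 1.0), (5.0, 1.0), (5.0, 5.0)]

# One edge center per side of the triangle.
edges = [midpoint(vertices[i], vertices[(i + 1) % 3]) for i in range(3)]

center = centroid(vertices)

for label, pts in (("Vertex", vertices), ("Edge", edges), ("Center", [center])):
    for x, y in pts:
        print(f"{label:7s} ({x:.2f}, {y:.2f}) feasible={is_feasible(x, y)}")
```

"Check" points and replicates are not generated here; choosing those is the job of the D-optimal algorithm.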
Notice that a number of the points, most at the center, are replicated for estimating error. The run order should be randomized to counteract lurking variables such as machine wear. Thus the replicates would occur at varying intervals, providing a good barometer of any drift in the process.
Using Design-Expert software, I re-created the simulations of L&T for both responses--including the standard deviation. Then I ran my 16-run RSM design. It generated very close approximations throughout the feasible space with no need for extrapolation.
Finally, I ran Design-Expert's numerical optimization with L&T's specification that both responses be minimized. The software produced a most desirable solution of (5, 3.65) for the (X,Y) inputs--very close to the result of (5, 3.75) suggested by L&T.
Readers must draw their own conclusions on the efficacy of DOE versus the alternatives proposed by Lengacher and Turner. However, they should be aware that a fair comparison is possible only with more sophisticated tools, namely D-optimal RSM designs.
Last, but not least, it should be noted that the article by Lengacher and Turner advocates use of happenstance, historical data fitted via neural network, versus a pro-active experiment, such as what I've laid out using appropriate response surface methods for design of experiments. Models based on happenstance data often prove to be very unstable due to highly correlated inputs--very typical from tightly-controlled processes, such as that described by L&T.
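To see why correlated inputs destabilize a fitted model, consider the variance inflation factor (VIF), which for a model with two inputs equals 1/(1 - r^2), where r is the correlation between them. The sketch below compares made-up "happenstance" data, where one input tracks the other, against a two-level factorial design. The data values and function names are hypothetical and chosen only for illustration.

```python
# Sketch: how correlation between two inputs inflates the variance of
# fitted coefficients. All data below are made up for illustration.
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / sqrt(saa * sbb)

def vif_two_inputs(a, b):
    """Variance inflation factor for a model with exactly two inputs."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical happenstance data: x2 drifts almost in lockstep with x1.
x1_hist = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_hist = [1.1, 1.9, 3.2, 3.9, 5.1]

# Designed experiment: a two-level factorial, where the inputs are orthogonal.
x1_doe = [-1.0, 1.0, -1.0, 1.0]
x2_doe = [-1.0, -1.0, 1.0, 1.0]

print("happenstance VIF:", round(vif_two_inputs(x1_hist, x2_hist), 1))
print("designed VIF:    ", round(vif_two_inputs(x1_doe, x2_doe), 1))
```

A VIF of 1 means no inflation; values far above 1 signal coefficient estimates that can swing widely with small changes in the data.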
I would be happy to expand this letter into a complete article.
Mark J. Anderson, Principal, Stat-Ease, Inc.
Here's the response from "Desktop Engineering":
"I must commend you on your alacrity and thoroughness. I have never had an article generate so complete a response so quickly. I will be forwarding this to the authors for consideration.
I welcome your offer to expand your thoughts into a full-length article. I'd be delighted to take it under consideration for publication."
I have since submitted a complete article on this subject.
(Learn more about D-optimal design by attending the
three-day computer-intensive workshop "Response Surface Methods
for Process Optimization." See http://www.statease.com/clas_rsm.html for a complete description. Link from this page to the course outline
and schedule. Then, if you like, enroll online.)
alert: A Link to forums on DOE and other statistical tools
Heidi Hansel, Stat-Ease Marketing Director, has
these suggestions for discussions on statistical tools:
*(For a question on DOE and standard error that I answered, see http://www.isixsigma.com/forum/showmessage.asp?messageID=39709.)
4. Reader reply: Another way to explain degrees of freedom
From: Kip Hilshafer, Research Associate, Stepan Company, Illinois
"The problem is, mathematicians and statisticians may have good ideas, but that does not mean they write well. Your explanation is very good. I came up with an analogy that the chemists at Stepan liked.
Consider fitting a linear equation (y = mx + b) through data. Because there are two parameters, the slope and y-intercept, which must be determined, there are two unknowns. As a system of equations or collection of data, two independent experiments are required for two simultaneous equations for an exact solution (and a "full-rank" matrix). The two independent data pairs of (x,y) constitute two degrees of freedom (df) and each parameter in the line (m and b) requires a df. Thus, to solve, you need two and have two; life is good.
To fit y = mx + b to three ordered pairs, though, you still require two df for the equation, but with three independent measurements, you have 3 df. From there, it's like the First Law of Thermodynamics and not believing in witchcraft--nothing just vanishes into thin air. You have three df and use only two for the line, so one is left over, and the remaining one is error. Your friend has heard of, and may (justifiably) be confused by, df total = df model + df error. Just like thermodynamics, you can't get something for nothing and nothing gets annihilated. Using five points to fit the same line leaves 5 - 2 = 3 df for error.
This is merely a formalization of what most people knew all along: The more data, the better the estimate, because the estimate of error improves. And so on ...
Thus, degrees of freedom might be viewed as a bookkeeping or accounting procedure to keep track of everything. Once you use more stuff, it has to go somewhere and you must know where it goes. It's not witchcraft. Does this help/roil/cause more harm than good?"
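Kip's bookkeeping can be sketched in a few lines of Python. This is a minimal illustration that follows his accounting (each data point supplies one df, and the slope and intercept each consume one). The data values and function names are made up for the example.

```python
# Sketch: degrees-of-freedom bookkeeping for a straight-line fit.
# The (x, y) values are made up for illustration.

def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

def df_ledger(n_points, n_params=2):
    """Split the available df into model df and leftover (error) df."""
    return {"total": n_points, "model": n_params, "error": n_points - n_params}

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, b = fit_line(xs, ys)
ledger = df_ledger(len(xs))

# The error variance estimate divides by the leftover df, not by n.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
sse = sum(r ** 2 for r in residuals)
mse = sse / ledger["error"]

print(f"slope={m:.3f}, intercept={b:.3f}")
print(ledger)
print(f"mean square error={mse:.4f}")
```

With only two points, `df_ledger(2)` leaves zero df for error: the line fits exactly and there is nothing left over to estimate noise.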
[Pat Whitcomb, Stat-Ease Consultant, then replied that having achieved a master's degree in chemical engineering, he liked Kip's analogy to thermodynamics (although only a bachelor's level chemical engineer,* so did I). However, Pat feared that for many readers adding thermodynamics to statistics may be going from bad to worse.]
*(Perhaps what persuaded me to pursue a master's in business
rather than chemical engineering was a University of Minnesota required course
with the terrifying title: "Statistical Thermodynamics". The teacher,
a grad student who clearly resented being called away from his research, made
5. Events alert: Link to my May 2004 Annual Quality Congress talk on screening designs; also, see a list of upcoming appearances
On May 26 in Toronto, I presented a talk (co-authored by Pat Whitcomb) titled "Screening Process Factors in the Presence Of Interactions" to the Annual Quality Congress of the American Society for Quality. It introduced a new, more efficient type of fractional two-level factorial design of experiments (DOE) tailored for screening of process factors. These designs are referred to as "Min Res IV" because they require a minimal number of factor combinations (runs) to resolve main effects from two-factor interactions (resolution IV). To view the proceedings, click on http://www.statease.com/pubs/aqc2004.pdf.
See http://www.statease.com/events.html for a list of appearances by Stat-Ease professionals. We hope to see you sometime in the near future!
6. Workshop alert: See when and where to learn about DOE--response surface methods (RSM) are next on the educational agenda
If you've mastered the basics of DOE, take the next step by attending "Response Surface Methods for Process Optimization"--a three-day, computer-intensive workshop, which will be presented on June 22-24 at the Stat-Ease training center in Minneapolis.
See http://www.statease.com/clas_pub.html for
schedule and site information on all Stat-Ease workshops open to
the public. To enroll, click the "register online" link
on our web site or call Stat-Ease at 1.612.378.9449. If spots remain
available, bring along several colleagues and take advantage of quantity
discounts in tuition, or consider bringing in an expert from Stat-Ease to teach
a private class at your site. Call us to get a quote.
I hope you learned something from this issue. Address your general questions and comments to me at: email@example.com.
Mark J. Anderson, PE, CQE
PS. Quote for the month--Galileo on being a lone voice of reason against the ignorance of the masses:
"In questions of science
the authority of a thousand is not worth the humble reasoning of
a single individual."
Trademarks: Design-Ease, Design-Expert and Stat-Ease are registered trademarks of Stat-Ease, Inc.
Acknowledgements to contributors:
DOE FAQ Alert - Copyright 2004
2021 E. Hennepin Avenue, Ste 480
Minneapolis, MN 55413-2726
p: 612.378.9449, f: 612.378.2152