Has a low R² ever disappointed you during the analysis of your experimental results? Is this really the kiss of death? Is all lost? Let’s examine R² as it relates to factorial design of experiments (DOE) and find out.
R² measures are calculated on the basis of the change in the response (Δy) relative to the total variation of the response (Δy + σ)over the range of the independent factor:
Let’s look at an example. Response y is dependent on factor x in a linear fashion:
We run a DOE using levels x1 and x2 in Figure 1 (below) to estimate beta1 (β1). Having the independent factor levels far apart generates a large signal-to-noise ratio (Δ12) and it is relatively easy to estimate β1. Because the signal (Δy) is large relative to the noise (σ), R² approaches one.
What if we had run a DOE using levels x3 and x4 in figure 1 to estimate β1? Having the independent factor levels closer together generates a smaller signal-to-noise ratio (Δ34) and it is more difficult to estimate β1. We can overcome this difficulty by running more replicates of the experiments. If enough replicates are run, β1 can be estimated with the same precision as in the first DOE using levels x1 and x2. But, because the signal (Δy) is smaller relative to the noise (σ), R² will be smaller, no matter how many replicates are run!
In factorial design of experiments our goal is to identify the active factors and measure their effects. Experiments can be designed with replication so active factors can be found even in the absence of a huge signal-to-noise ratio. Power allows us to determine how many replicates are needed. The delta (Δ) and sigma (Σ) used in the power calculation also give us an estimate of the expected R² (see the formula above). In many real DOEs we intentionally limit a factor’s range to avoid problems. Success is measured with the ANOVA (analysis of variance) and the t-tests on the model coefficients. A significant p-value indicates an active factor and a reasonable estimate of its effects. A significant p-value, along with a low R2, may mean a proper job of designing the experiments, rather than a problem!
R² is an interesting statistic, but not of primary importance in factorial DOE. Don’t be fooled by R²!
Hello, Design-Expert® software users, Stat-Ease clients and statistics fans! I’m Rachel, the Client Specialist here at Stat-Ease. If you’ve ever called our general line, I’m probably the one who picked up; I’m the one who prints and binds your workshop materials when you take our courses. I am not, by any stretch of the imagination, a statistician. So why am I, a basic office administrator who hasn’t taken a math class since high school, writing a blog post for Stat-Ease?It’s because I entered this year’s Minnesota State Fair Creative Activities Contest thanks to Design-Expert and help from the Stat-Ease consultant team.
I’m what you’d call a subject matter expert when it comes to baking challah bread. Challah is a Jewish bread served on Shabbat, typically braided, and made differently depending if you’re of Ashkenazi or Sephardi heritage. I started making challah with my mom when I was 8 years old (Ashkenazi style), and have been making it regularly since I left home for college. As I developed my own cooking and baking styles, I began to feel like my mother’s recipe had gotten a bit stale. So I’ve started to add things to the dough — just a little vanilla extract at first, then a dash of almond extract, then a batch with cinnamon and raisins, another one with chocolate chips, a Rosh Hashanah version that swaps honey for sugar and includes apple bits (we eat apples and honey for a sweet New Year), even one batch with red food coloring and strawberry bits for a breast cancer awareness campaign. None of these additions were tested in a terribly scientific way; I’m a baker, not a lab chemist. So when I decided I wanted to enter the State Fair with my challah this year, I got to wondering: what is actually the best way to make this challah? And lucky me, I’m employed at the best place in the world to find out.
I brought up the idea of running a designed experiment on my bread with my supervisor, and one of our statisticians, Brooks Henderson, was assigned to me as my “consultant” on the project. Before designing the experiment, we first needed to narrow down the factors we wanted to test and the results we wanted to measure. I set a hard line on not changing any of my mother’s original recipe — I know what Mom’s challah tastes like, I know it’s good, and I don’t want to mess with the complex chemistry involved in baking. We settled on adjusting the amount of vanilla and almond extracts I add to the dough, and since the Fair required me to submit a smaller loaf than Mom’s recipe makes, we tested the time and temperature required to bake. For our results, we asked our coworkers to judge 7 attributes of the bread, including taste, texture, and overall appeal. A statistician and I judged the color of each loaf and measured the thickness of the crust.
It sounds so simple, right? That’s what I thought: plug the factors into Design-Expert, let it work its magic, and poof! the best bread recipe. But that just shows you how little I know! If you’re a formulator, or you’ve taken our Mixture & Combined Designs for Optimal Formulations workshop, you know what the first hurdle was: even though we only changed two ingredients, we were still dealing with a combined mixture/process design. Since mixture designs work with ratios of ingredients as opposed to independent amounts, adding 5g of vanilla extract and 3g of almond extract is a different ratio within the dough, and therefore a different mixture, than adding 2g of vanilla and 6g of almond. To make this work, the base recipe had to become a third part of the mixture. Consultant Wayne Adams stepped in at that point to help us design the experiment. He and Brooks built a mixture/numeric combined design that specified proportions of the 3 ingredients (base recipe, vanilla, and almond), along with the time and temperature settings.
Our second major problem was the time constraint. I brought up the idea for this bread experiment on July 18, and I had to bring my loaves to the fairgrounds on the morning of August 20. We wanted our coworkers to taste this bread, and I had a required family vacation to attend that first week of August. When we accounted for that, along with the time it took to design the experiment, we were left with just 14 days of tasting. At a rate of 2 loaves per weeknight, 4 per weekend, and at the cost of my social life, our maximum budget allowed for a design with only 26 runs. I’m sure there are some of you reading this and wondering how on earth I’d get any meaningful model out of a paltry 26 runs. Well, you’ve got reason to: we just barely got information I could use. Brooks ran through a number of different designs before he got one with even halfway decent power, and we also had to accept that, if there were any curvature to the results, we would not be able to model it with much certainty. Our final design had just two center points to find any curvature related to time or temperature, with no budgeted time for follow-up. Since our working hypothesis was that we’d see a linear relationship between time and temperature, not a quadratic one, the center points were to check this assumption and ensure it was correct. We got a working model, yes, but we took a big risk — and the fact that I didn’t even place in the top 5 entries only underlines that.
On top of all these constraints? I’m only human, and as you well know, human operators make mistakes. My process notes are littered with “I messed up and…” Example: the time I stacked my lunchbox on top of a softer loaf of challah in my bicycle bag for the half-hour ride to work. I’ll give you three guesses how that one rated on “uniformity” and “symmetry,” and your first two don’t count. If we had more time, we could have added more runs and gotten data that didn’t have that extra variability, but the fair submission date was my hard deadline. Mark Anderson, a Stat-Ease principal, tells me this is a common issue in many industries. When there is a “real-time rush to improve a product,” it may not be the best science to accept flawed data, but you make do and account for variations as best you can.
During the analysis, we used the Autoselect tool in Design-Expert to determine which factors had significant effects on the responses (mostly starting with the 2FI model). Another statistician here at Stat-Ease, Martin Bezener, just presented a webinar about this incredible tool — visit our web site to view a recording and learn more about it. When all of our tasters’ ratings were averaged together, we got significant models for Aroma, Appeal, Texture, Overall Taste, Color, and Crust Thickness, with Adj. R² values above 0.8 in most cases. This means that our models captured 80% of the variation in the data, with about 20% unexplained variation (noise) leftover. In general, the time and temperature effects seem to be the most important — we didn’t learn much about the two extracts. Almond only showed up as an effect (and a minor one at that) in one model for the aroma response, and vanilla didn’t show up at at all!
The other thing that surprised me was that I expected to be able to block this experiment. Blocking is a technique covered in our Experiment Design Made Easy workshop by which it’s possible to account for variation between any impossible-to-change source of variation, such as personal differences between tasters. However, since our tasters weren’t always present at every tasting and because we had so few runs in the experiment, we had too few degrees of freedom to block the results and still get a powerful model. It turned out that blocking wouldn’t have shown us much. We looked at a few individual tasters’ results individually, and that didn’t seem to illuminate anything different from what we saw before — which tells us that blocking the whole experiment wouldn’t have uncovered anything new, either.
In the end, I’m happy with our kludged-together experiment. I got a lot of practice baking, and determined the best process for my bread. If we were to do this again, I’d want to start in April to train my tasters better, determine appropriate amounts of other additions like chocolate chips, and really delve into ingredient proportions in a proper mixture design. And of course, I couldn’t have done any of this without the Stat-Ease consulting team. If you have questions on how our consultants can help design and analyze your experiments, send us an e-mail.
Stat-Ease, Inc. and CQ Consultancy hosted the 6th European DOE User Meeting in Leuven, Belgium on May 18th-20th, 2016. We offered two design of experiments (DOE) workshops on the 18th, and then the two-day DOE User Meeting on the 19th-20th. The venue at the historic Faculty Club was charming--complete with cobblestone streets, flowers in full bloom, and delicious artistically-arranged cuisine. The technical program was also excellent. We learned about the latest techniques in design of experiments (DOE) from well-known speakers in the field, and practitioners from a variety of industries spoke about their DOE successes, as well as some not-so-successful experiments and what they learned from them.
Attendees had the chance to network with others and consult with experts about their DOE questions. A highlight of the meeting was our special event on the 19th that included a concert by the choral group, Currende, a delicious dinner at Improvisio, and then beer sampling under the stars at The Capital. A good time was had by all. Hopefully you can join us for the 7th DOE User Meeting in 2018. Look for details to come!
Stat-Ease, Inc. and CQ Consultancy are pleased to announce the 6th European DOE User Meeting and Workshops on May 18-20, 2016 in Leuven, Belgium. It will be held at the Faculty Club, which is part of the historic Grand Béguinage. The béguinage originated in the early 13th century and is a UNESCO World Heritage location.
On the first day of the event there will be two workshop tracks offered: Figuring Out Factorials and Optimal Formulation and Processing of Mixtures via Combined Designs. Then on days two and three, a user meeting will be held with presentations by keynote speakers including Pat Whitcomb, Mark Anderson, Peter Goos, and more, as well as case study presentations by DOE practitioners. This is your chance to increase your DOE know-how, network with others, and do some sightseeing in beautiful Leuven. For more information and to register, click here.
The Optimal Formulation and Processing of Mixtures via Combined Designs workshop is a must for those with some familiarity of process experiment design who work in industries that formulate products. In this one-day presentation Stat-Ease experts will lay out the essentials of mixture design and build up from there to experiments that lead to the optimal combination of ingredients and how they get processed, thus hitting the sweet spot for manufacturing the highest-quality product at lowest cost.
Limited seats are left for the 2nd Asian DOE User Meeting and Workshop in Udaipur, India this March 3-5. Don't miss this opportunity to meet Pat Whitcomb, Founder of Stat-Ease, and be introduced to newly released Design-Expert® software, version 10! Increase your DOE skills, learn from others' successes and challenges, and network in this beautiful area voted the best city in the world in 2009 by Travel + Leisure magazine. For more information and to register, click here.