Stat-Ease Blog

Perfecting pound cake via mixture design for optimal formulation

posted by Mark Anderson on Nov. 21, 2024

Thanksgiving is fast approaching—time to begin the meal planning. With this in mind, the NBC Today show’s October 22nd tips for "75 Thanksgiving desserts for the sweetest end to your feast" caught my eye, in particular the Donut Loaf pound cake. My 11 grandkids would love this “giant powdered sugar donut” (and their Poppa, too!).

I became a big fan of pound cake in the early 1990s while teaching DOE to food scientists at Sara Lee Corporation. Their ready-made pound cakes really hit the spot. However, it is hard to beat starting from scratch and baking your own pound cake. The recipe goes back hundreds of years to a time when many people could not read, so it simply called for a pound each of flour, butter, sugar and eggs. Not having a strong interest in baking and wanting to minimize ingredients and complexity (other than adding milk for moisture and baking powder for tenderness), I made this formulation my starting point for a mixture DOE, using the Sara Lee classic pound cake as the standard for comparison.

As I always advise Stat-Ease clients, before designing an experiment, begin with first principles. I took advantage of my work with Sara Lee to gain insights on the food science of pound cake. Then I checked out Rose Levy Beranbaum’s The Cake Bible from my local library. I was a bit dismayed to learn from this research that the experts recommended cake flour, which costs about four times more than the all-purpose (AP) variety. Having worked in a flour mill during my time at General Mills as a process engineer, I was skeptical. Therefore, I developed a way to ‘have my cake and eat it too’: via a multicomponent constraint (MCC), my experiment design incorporated both varieties of flour. Figure 1 shows how to enter this in Stat-Ease software.

Setting up an optimal design with constraints in Stat-Ease software

Figure 1. Setting up the pound cake experiment with a multicomponent constraint on the flours

By the way, as you can see in the screenshot, I scaled back the total weight of each experimental cake to 1 pound (16 ounces by weight), keeping each of the four ingredients in a specified range, with the MCC preventing the combined amount of flour from going out of bounds.
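For readers who like to see constraints spelled out, here is a minimal Python sketch of how a candidate blend could be checked against a fixed total, individual ranges and an MCC on the combined flour. The ranges and flour limits below are hypothetical placeholders, not the values entered in Figure 1, and the function name is my own invention rather than anything from Stat-Ease software.

```python
# Hypothetical bounds in ounces; the actual ranges appear only in Figure 1.
TOTAL_OZ = 16.0
RANGES = {
    "cake_flour": (0.0, 5.0),
    "ap_flour": (0.0, 5.0),
    "butter": (3.0, 5.0),
    "sugar": (3.0, 5.0),
    "eggs": (3.0, 5.0),
}
FLOUR_MCC = (3.0, 5.0)  # hypothetical limits on cake_flour + ap_flour


def is_feasible(blend, tol=1e-9):
    """Return True if the blend meets the total, the ranges and the flour MCC."""
    if abs(sum(blend.values()) - TOTAL_OZ) > tol:
        return False
    for name, (low, high) in RANGES.items():
        if not (low - tol <= blend[name] <= high + tol):
            return False
    flour = blend["cake_flour"] + blend["ap_flour"]
    return FLOUR_MCC[0] - tol <= flour <= FLOUR_MCC[1] + tol


classic = {"cake_flour": 2.0, "ap_flour": 2.0, "butter": 4.0, "sugar": 4.0, "eggs": 4.0}
too_much_flour = {"cake_flour": 4.0, "ap_flour": 3.0, "butter": 3.0, "sugar": 3.0, "eggs": 3.0}

print(is_feasible(classic))         # True
print(is_feasible(too_much_flour))  # False
```

The second blend keeps every single ingredient inside its own range yet still fails, because the two flours together exceed the combined limit. That is the job the MCC does in the design.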

The trace plot shown in Figure 2 gives straightforward ingredient directions for a pound cake that pleases kids (based on the tastes of my young family of 5 at the time): more sugar, fewer eggs, and the cheap AP flour, whose track is not appreciably different from that of the cake flour.

Screenshot of the trace plot in Stat-Ease software

Figure 2. Trace plot for pound cake experiment

For all the details on my pound cake experiment, refer to "Mixing it up with Computer-Aided Design"—the manuscript for a publication by Today's Chemist at Work in their November 1997 issue. This DOE is also featured in “MCCs Made as Easy as Making a Pound Cake” in Chapter 6 of Formulation Simplified: Finding the Sweet Spot through Design and Analysis of Experiments with Mixtures.

The only thing I would do differently nowadays is pour a lot of powdered sugar over the top a la the Today show recipe. One thing that I will not do, despite it being so popular during the Halloween/Thanksgiving season, is add pumpkin spice. But go ahead if you like: do your own thing while experimenting on pound cake for your family’s feast. Happy holidays! Enjoy!

To learn more about MCCs and master DOE for food, chemical, pharmaceutical, cosmetic or any other recipe improvement projects, enroll in a Stat-Ease “Mixture Design for Optimal Formulations” public workshop or arrange for a private presentation to your R&D team.

Tips and tools for modeling counts most precisely

posted by Mark Anderson on July 10, 2024

In a previous Stat-Ease blog, my colleague Shari Kraber provided insights into Improving Your Predictive Model via a Response Transformation. She highlighted the most commonly used transformation: the log. As a follow-up to this article, let’s delve into another transformation: the square root, which deals nicely with count data such as imperfections. Counts follow the Poisson distribution, where the standard deviation is a function of the mean (it equals the square root of the mean). This violates the constant-variance assumption behind normal-theory methods, which can invalidate ordinary least squares (OLS) regression analysis. An alternative modeling tool, called Poisson regression (PR), provides a more precise way to deal with count data. However, to keep it simple statistically (KISS), I prefer the better-known methods of OLS with application of the square root transformation as a work-around.
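To see why the square root helps, here is a small Python sketch that computes the variance of a Poisson count and of its square root directly from the probability mass function. The function names and the chosen means (20, 50 and 100) are my own illustrative assumptions; nothing here comes from Stat-Ease software.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of count k, computed on the log scale for stability."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def variance_of(transform, mean):
    """Variance of transform(Y) for Y ~ Poisson(mean), summed over the pmf."""
    upper = int(mean + 20 * math.sqrt(mean) + 50)  # tail beyond this is negligible
    probs = [poisson_pmf(k, mean) for k in range(upper + 1)]
    first = sum(p * transform(k) for k, p in enumerate(probs))
    second = sum(p * transform(k) ** 2 for k, p in enumerate(probs))
    return second - first ** 2


for mean in (20, 50, 100):
    raw = variance_of(float, mean)
    rooted = variance_of(math.sqrt, mean)
    print(f"mean={mean:>3}  var(Y)={raw:6.1f}  var(sqrt(Y))={rooted:.3f}")
```

The variance of the raw count grows in step with the mean, while the variance of the square-rooted count stays close to one quarter across these means. That roughly constant spread is what OLS needs.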

When Stat-Ease software first introduced PR, I gave it a go via a design of experiment (DOE) on making microwave popcorn. In prior DOEs on this tasty treat I worked at reducing the weight of off-putting unpopped kernels (UPKs). However, I became a victim of my own success by reducing UPKs to a point where my kitchen scale could not provide adequate precision.

With the tools of PR in hand, I shifted my focus to a count of the UPKs to test out a new cell-phone app called Popcorn Expert. It listens to the “pops” and via the “latest machine learning achievements” signals users to turn off their microwave at the ideal moment that maximizes yield before they burn their snack. I set up a DOE to compare this app against two optional popcorn settings on my General Electric Spacemaker™ microwave: standard (“GE”) and extended (“GE++”). As an additional factor, I looked at preheating the microwave with a glass of water for 1 minute—widely publicized on the internet to be the secret to success.

Table 1 lays out my results from a replicated full factorial of the six combinations done in random order (shown in parentheses). Due to a few mistakes following the software’s plan (oops!), I added a few more runs along the way, increasing the number from 12 to 14. All of the popcorn produced tasted great, but as you can see, the yield varied severalfold.

Table 1: Data with run numbers in parentheses
A: Preheat	B: Timing	UPKs, Rep 1	UPKs, Rep 2	UPKs, Rep 3
No	GE	41 (2)	92 (4)
No	GE++	23 (6)	32 (12)	34 (13)
No	App	28 (1)	50 (8)	43 (11)
Yes	GE	70 (5)	62 (14)
Yes	GE++	35 (7)	51 (10)
Yes	App	50 (3)	40 (9)
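As a quick descriptive look at Table 1, the Python sketch below averages the UPK counts in each cell, both on the raw scale and on the square-root scale (squared back to counts). It only summarizes the data. It is not the OLS or Poisson regression analysis done in Stat-Ease software, and the variable names are my own.

```python
import math
from statistics import mean

# UPK counts from Table 1, keyed by (preheat, timing)
upks = {
    ("No", "GE"): [41, 92],
    ("No", "GE++"): [23, 32, 34],
    ("No", "App"): [28, 50, 43],
    ("Yes", "GE"): [70, 62],
    ("Yes", "GE++"): [35, 51],
    ("Yes", "App"): [50, 40],
}

total_runs = sum(len(counts) for counts in upks.values())

raw_means = {cell: mean(counts) for cell, counts in upks.items()}
# Average on the square-root scale, then square to return to counts
sqrt_means = {
    cell: mean([math.sqrt(c) for c in counts]) ** 2 for cell, counts in upks.items()
}

print(f"runs: {total_runs}")
for cell in upks:
    preheat, timing = cell
    print(f"{preheat:<3} {timing:<4} raw={raw_means[cell]:5.1f}  sqrt-scale={sqrt_means[cell]:5.1f}")
```

The back-transformed means can never exceed the raw means, and the gap is widest where the replicates disagree most, such as the two runs with no preheat at the GE setting.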

I then analyzed the results via OLS with and without a square root transformation, and then advanced to the more sophisticated Poisson regression. In this case, PR prevailed: It revealed an interaction, displayed in Figure 1, that did not emerge from the OLS models.

Figure 1: Interaction of the two factors—preheat and timing method

Going to the extended popcorn timing (GE++) on my Spacemaker makes time-wasting preheating unnecessary—actually producing a significant reduction in UPKs. Good to know!

By the way, the app worked very well, but my results showed that I do not need my cell phone to maximize the yield of tasty popcorn.

To succeed in experiments on counts, the counts must be:

discrete whole numbers with no upper bound
kept within a fixed area of opportunity
rarely zero (avoid zeros by setting your area of opportunity, i.e., sample size, large enough to gather 20 counts or more per run on average).

For more details on the various approaches I’ve outlined above, view my presentation on Making the Most from Measuring Counts at the Stat-Ease YouTube Channel.

State Fair Bread — through SCIENCE!

posted by Rachel Poleke on Sept. 13, 2016

Hello, Design-Expert® software users, Stat-Ease clients and statistics fans! I’m Rachel, the Client Specialist [ed. 2024: now I'm the Market Development Manager] here at Stat-Ease. If you’ve ever called our general line, I’m probably the one who picked up; I’m the one who prints and binds your workshop materials when you take our courses. I am not, by any stretch of the imagination, a statistician. So why am I, a basic office administrator who hasn’t taken a math class since high school, writing a blog post for Stat-Ease? It’s because I entered this year’s Minnesota State Fair Creative Activities Contest thanks to Design-Expert and help from the Stat-Ease consultant team.

I’m what you’d call a subject matter expert when it comes to baking challah bread. Challah is a Jewish bread served on Shabbat, typically braided, and made differently depending if you’re of Ashkenazi or Sephardi heritage. I started making challah with my mom when I was 8 years old (Ashkenazi style), and have been making it regularly since I left home for college. As I developed my own cooking and baking styles, I began to feel like my mother’s recipe had gotten a bit stale. So I’ve started to add things to the dough — just a little vanilla extract at first, then a dash of almond extract, then a batch with cinnamon and raisins, another one with chocolate chips, a Rosh Hashanah version that swaps honey for sugar and includes apple bits (we eat apples and honey for a sweet New Year), even one batch with red food coloring and strawberry bits for a breast cancer awareness campaign. None of these additions were tested in a terribly scientific way; I’m a baker, not a lab chemist. So when I decided I wanted to enter the State Fair with my challah this year, I got to wondering: what is actually the best way to make this challah? And lucky me, I’m employed at the best place in the world to find out.

I brought up the idea of running a designed experiment on my bread with my supervisor, and one of our statisticians, Brooks Henderson, was assigned to me as my “consultant” on the project. Before designing the experiment, we first needed to narrow down the factors we wanted to test and the results we wanted to measure. I set a hard line on not changing any of my mother’s original recipe — I know what Mom’s challah tastes like, I know it’s good, and I don’t want to mess with the complex chemistry involved in baking. We settled on adjusting the amount of vanilla and almond extracts I add to the dough, and since the Fair required me to submit a smaller loaf than Mom’s recipe makes, we tested the time and temperature required to bake. For our results, we asked our coworkers to judge 7 attributes of the bread, including taste, texture, and overall appeal. A statistician and I judged the color of each loaf and measured the thickness of the crust.

It sounds so simple, right? That’s what I thought: plug the factors into Design-Expert, let it work its magic, and poof! the best bread recipe. But that just shows you how little I know! If you’re a formulator, or you’ve taken our Mixture Design for Optimal Formulations workshop, you know what the first hurdle was: even though we only changed two ingredients, we were still dealing with a combined mixture/process design. Since mixture designs work with ratios of ingredients as opposed to independent amounts, adding 5g of vanilla extract and 3g of almond extract is a different ratio within the dough, and therefore a different mixture, than adding 2g of vanilla and 6g of almond. To make this work, the base recipe had to become a third part of the mixture. Consultant Wayne Adams stepped in at that point to help us design the experiment. He and Brooks built a mixture/numeric combined design that specified proportions of the 3 ingredients (base recipe, vanilla, and almond), along with the time and temperature settings.
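To make the ratio point concrete, here is a minimal Python sketch that converts gram amounts into mixture proportions for the three components. The 1000 g base weight is a hypothetical stand-in (the real recipe weight is not given here), and the function name is mine.

```python
BASE_GRAMS = 1000.0  # hypothetical weight of the unchanged base recipe


def proportions(vanilla_g, almond_g, base_g=BASE_GRAMS):
    """Convert gram amounts into mixture proportions that sum to 1."""
    total = base_g + vanilla_g + almond_g
    return {
        "base": base_g / total,
        "vanilla": vanilla_g / total,
        "almond": almond_g / total,
    }


blend_a = proportions(vanilla_g=5, almond_g=3)
blend_b = proportions(vanilla_g=2, almond_g=6)

for name, blend in (("A", blend_a), ("B", blend_b)):
    shares = ", ".join(f"{k}={v:.4f}" for k, v in blend.items())
    print(f"blend {name}: {shares}")
```

Both blends use 8 g of extract in total, so the base share is identical. Only the split between vanilla and almond changes, which is why the two count as different mixtures.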

Our second major problem was the time constraint. I brought up the idea for this bread experiment on July 18, and I had to bring my loaves to the fairgrounds on the morning of August 20. We wanted our coworkers to taste this bread, and I had a required family vacation to attend that first week of August. When we accounted for that, along with the time it took to design the experiment, we were left with just 14 days of tasting. At a rate of 2 loaves per weeknight, 4 per weekend, and at the cost of my social life, our maximum budget allowed for a design with only 26 runs. I’m sure there are some of you reading this and wondering how on earth I’d get any meaningful model out of a paltry 26 runs. Well, you’ve got reason to: we just barely got information I could use. Brooks ran through a number of different designs before he got one with even halfway decent power, and we also had to accept that, if there were any curvature to the results, we would not be able to model it with much certainty. Our final design had just two center points to find any curvature related to time or temperature, with no budgeted time for follow-up. Since our working hypothesis was that we’d see a linear relationship between time and temperature, not a quadratic one, the center points were to check this assumption and ensure it was correct. We got a working model, yes, but we took a big risk — and the fact that I didn’t even place in the top 5 entries only underlines that.

On top of all these constraints? I’m only human, and as you well know, human operators make mistakes. My process notes are littered with “I messed up and…” Example: the time I stacked my lunchbox on top of a softer loaf of challah in my bicycle bag for the half-hour ride to work. I’ll give you three guesses how that one rated on “uniformity” and “symmetry,” and your first two don’t count. If we had more time, we could have added more runs and gotten data that didn’t have that extra variability, but the fair submission date was my hard deadline. Mark Anderson, a Stat-Ease principal, tells me this is a common issue in many industries. When there is a “real-time rush to improve a product,” it may not be the best science to accept flawed data, but you make do and account for variations as best you can.

During the analysis, we used the Autoselect tool in Design-Expert to determine which factors had significant effects on the responses (mostly starting with the 2FI model). Another statistician here at Stat-Ease, Martin Bezener, just presented a webinar about this incredible tool; visit our website to view a recording and learn more about it. When all of our tasters’ ratings were averaged together, we got significant models for Aroma, Appeal, Texture, Overall Taste, Color, and Crust Thickness, with Adj. R² values above 0.8 in most cases. This means that most of our models captured more than 80% of the variation in the data, with less than 20% unexplained variation (noise) left over. In general, the time and temperature effects seem to be the most important; we didn’t learn much about the two extracts. Almond only showed up as an effect (and a minor one at that) in one model for the aroma response, and vanilla didn’t show up at all!
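For anyone curious how the adjustment in Adj. R² works, here is a small Python sketch using the standard textbook formula for a regression model with an intercept. The 26 runs come from our design, but the raw R² of 0.85 and the term counts are hypothetical, and Design-Expert may compute its statistic with details not shown here.

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R-squared for a model with n_terms terms besides the intercept."""
    if n_runs - n_terms - 1 <= 0:
        raise ValueError("not enough runs for this many model terms")
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


# 26 runs comes from the bread design; the raw R-squared and term counts are hypothetical.
for n_terms in (2, 5, 10):
    print(n_terms, round(adjusted_r2(0.85, n_runs=26, n_terms=n_terms), 3))
```

With so few runs, each extra model term costs a noticeable slice of the adjusted value, which is one reason a small design like ours rewards keeping the models lean.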

The other thing that surprised me was that I expected to be able to block this experiment. Blocking is a technique covered in our Modern DOE for Process Optimization workshop that accounts for known sources of variation you cannot eliminate, such as personal differences between tasters. However, since our tasters weren’t always present at every tasting and because we had so few runs in the experiment, we had too few degrees of freedom to block the results and still get a powerful model. It turned out that blocking wouldn’t have shown us much. We looked at a few tasters’ results individually, and that didn’t seem to illuminate anything different from what we saw before, which tells us that blocking the whole experiment wouldn’t have uncovered anything new, either.

In the end, I’m happy with our kludged-together experiment. I got a lot of practice baking, and determined the best process for my bread. If we were to do this again, I’d want to start in April to train my tasters better, determine appropriate amounts of other additions like chocolate chips, and really delve into ingredient proportions in a proper mixture design. And of course, I couldn’t have done any of this without the Stat-Ease consulting team. If you have questions on how our consultants can help design and analyze your experiments, send us an email.
