Transformation of the response is an important component of any data analysis. Transformation is needed if the error (residuals) is a function of the magnitude of the response (predicted values). Design-Expert provides extensive diagnostic capabilities to check if the statistical assumptions underlying the data analysis are met. The normal plot of the residuals tests their normality. The residuals versus predicted response values plot will indicate a problem if a pattern exists. Unless the ratio of the maximum response to the minimum response is large, transforming the response will not make much difference.
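The max-to-min ratio mentioned above is easy to check before opening any diagnostics. The following is a minimal sketch, not part of Design-Expert; the function name `max_min_ratio` and the sample data are hypothetical, and no particular cutoff for "large" is assumed.

```python
def max_min_ratio(responses):
    """Return max(response) / min(response) for strictly positive data."""
    lowest, highest = min(responses), max(responses)
    if lowest <= 0:
        # The ratio is not meaningful when the response can be zero or negative.
        raise ValueError("ratio requires strictly positive responses")
    return highest / lowest


# Hypothetical responses: one narrow range, one spanning orders of magnitude.
narrow = [42.0, 45.5, 47.1, 50.3]
wide = [1.2, 8.5, 96.0, 410.0]

print(max_min_ratio(narrow))  # close to 1: a transformation will change little
print(max_min_ratio(wide))    # in the hundreds: a transformation may matter
```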

The Box-Cox plot, available from the Diagnostics button, recommends a transformation from the power family. The two non-power-law transformations, logit for bounded data and arcsine square root for proportions, must be chosen based on the type of response. The Box-Cox plot will often recommend a square-root transformation when proportion data is present, and the log transformation for bounded data.

Design-Expert provides a broad range of possible transformations - most are from the power family, plus there are two additional transformations, the logit and the arcsine square root.

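Most data transformations can be described by a power function. If the standard deviation associated with an observation is proportional to the mean raised to some power α, then transforming the observation by the 1 − α power gives a scale satisfying the equal variance requirement of the statistical model.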

The appropriate choice of a response transformation relies on subject-matter knowledge and/or statistical considerations. The available transformations and examples for their use are:

**Square Root** – count, frequency data

**Natural log** – variance or growth data

**Base 10 log** – variance or growth data

**Inverse square root**

**Inverse** – rate/time, decay rate

**Power** – for more extreme transformation needs
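The named transformations in this list are all members of the power family, with the log taking the place of the zero power. As an illustration only, the sketch below pairs each name with its power and a function that applies it. The `TRANSFORMS` table and `apply_transform` helper are hypothetical names, not Design-Expert's interface.

```python
import math

# name -> (power, function, whether a response of exactly zero is allowed)
# A power of 0 is conventionally read as the log transformation.
TRANSFORMS = {
    "square root": (0.5, math.sqrt, True),
    "natural log": (0.0, math.log, False),
    "base 10 log": (0.0, math.log10, False),
    "inverse square root": (-0.5, lambda y: 1.0 / math.sqrt(y), False),
    "inverse": (-1.0, lambda y: 1.0 / y, False),
}


def apply_transform(name, responses):
    """Apply one of the named transformations to a list of responses."""
    _, fn, zero_ok = TRANSFORMS[name]
    for y in responses:
        if y < 0 or (y == 0 and not zero_ok):
            raise ValueError(f"{name} is undefined for response {y}")
    return [fn(y) for y in responses]


# Hypothetical count data, for which the square root is the suggested choice.
counts = [0, 1, 4, 9, 16]
print(apply_transform("square root", counts))
```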

The **power** transformation allows transformation to any power in the range –3
to +3, provided the data are positive. You may add a constant to the data to
avoid powers of negative numbers. If the standard deviation associated with an
observation is proportional to the mean raised to some power, then transforming
the observation by a power gives a scale satisfying the equal variance
requirement of the ANOVA. The Box-Cox plot is provided in the Diagnostics plots
to help you choose an appropriate power transformation.
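The relationship between standard deviation and mean described above can be explored directly when replicated runs are available. If the standard deviation is proportional to the mean raised to a power α, the slope of log(standard deviation) against log(mean) estimates α, and 1 − α is the variance-stabilizing power. The sketch below assumes this setup. It is a rough companion to the Box-Cox plot, not a reproduction of it, and the names `estimate_power` and `power_transform` are hypothetical.

```python
import math
import statistics


def estimate_power(groups):
    """Estimate alpha (sd proportional to mean**alpha) and the power 1 - alpha.

    groups: one list of replicated, positive responses per run condition.
    """
    log_means = [math.log(statistics.mean(g)) for g in groups]
    log_sds = [math.log(statistics.stdev(g)) for g in groups]
    mean_x = statistics.mean(log_means)
    mean_y = statistics.mean(log_sds)
    # Ordinary least-squares slope of log(sd) on log(mean).
    sxx = sum((x - mean_x) ** 2 for x in log_means)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_means, log_sds))
    alpha = sxy / sxx
    return alpha, 1.0 - alpha


def power_transform(y, power, constant=0.0):
    """Raise (y + constant) to a power in [-3, 3]; power 0 means natural log."""
    if not -3.0 <= power <= 3.0:
        raise ValueError("power must lie between -3 and +3")
    shifted = y + constant
    if shifted <= 0:
        raise ValueError("y + constant must be positive")
    return math.log(shifted) if power == 0 else shifted ** power


# Hypothetical replicates whose spread grows in proportion to the mean.
groups = [[m * 0.9, m * 1.0, m * 1.1] for m in (1.0, 10.0, 100.0)]
alpha, power = estimate_power(groups)
print(alpha, power)  # alpha near 1 and power near 0, pointing to the log
```

The `constant` argument mirrors the option of adding a constant so that negative or zero responses become positive before the power is applied.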

Logistic regression models the odds (or chance) of an outcome based on input factors. Because odds is a ratio, what is actually modeled is the logarithm of the odds, given by:

\[Logit(p) = \ln\left(\frac{p(y=1)}{1 - p(y=1)}\right) = \beta_{0} +
\beta_{1}x_{1} + \beta_{2}x_{2} + \cdots + \beta_{k}x_{k}\]

\[\hat{p}=\frac{e^{\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\cdots+
\beta_{k}x_{k}}}{1+e^{\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\cdots+
\beta_{k}x_{k}}}=\frac{1}{1+e^{-(\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\cdots+
\beta_{k}x_{k})}}\]

We fit a model (*z*) for *Logit(p)*:

\[z = \beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\cdots+ \beta_{k}x_{k}\]

Then we apply the inverse transformation:

\[\hat{p} = \frac{1}{1+e^{-(z)}}\]
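The two steps above, evaluating the linear model *z* and then applying the inverse transformation, can be sketched as follows. The coefficient values are hypothetical and chosen only for illustration; the function names are likewise assumptions rather than anything Design-Expert exposes.

```python
import math


def linear_predictor(coefficients, x):
    """z = b0 + b1*x1 + ... + bk*xk, with coefficients[0] as the intercept."""
    if len(coefficients) != len(x) + 1:
        raise ValueError("need one coefficient per factor plus an intercept")
    return coefficients[0] + sum(b * xi for b, xi in zip(coefficients[1:], x))


def inverse_logit(z):
    """Map z on the log-odds scale back to a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the algebraically equal form to avoid overflow.
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(p):
    """Log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Hypothetical fitted coefficients (intercept, then two factors) and settings.
betas = [-1.0, 0.5, 2.0]
settings = [2.0, 0.25]

z = linear_predictor(betas, settings)
p_hat = inverse_logit(z)
print(z, p_hat)
```

Applying `logit` to `p_hat` returns `z`, which confirms that the two forms are inverses of each other.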

Poisson regression is used to model count data.

**Logit**

The **logit** transformation is used when the response has unreachable lower
and upper physical limits. One example is the yield of a chemical reaction. The
physical bounds are 0% and 100%, but in practice the actual yields will not
quite reach 100% due to impurities, energy loss, etc. The logit transform
spreads out the values near the boundaries. When using this transformation, it
is very important to correctly set the lower and upper limits to the natural
limits of the response.

\[\ln\left[\frac{Y - \text{lower limit of } Y}{\text{upper limit of } Y - Y}\right]\]
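As a minimal sketch of the formula above, the following functions apply the logit transformation with explicit limits and convert a transformed value back to the original scale. The names `bounded_logit` and `inverse_bounded_logit` and the yield values are hypothetical; the limits 0 and 100 follow the chemical-yield example.

```python
import math


def bounded_logit(y, lower, upper):
    """ln[(y - lower) / (upper - y)] for lower < y < upper."""
    if not lower < y < upper:
        raise ValueError("response must lie strictly between the limits")
    return math.log((y - lower) / (upper - y))


def inverse_bounded_logit(t, lower, upper):
    """Return the response on its original scale from a transformed value t."""
    return lower + (upper - lower) / (1.0 + math.exp(-t))


# Hypothetical percent yields; values near 100% are spread out the most.
for y in (50.0, 90.0, 99.0, 99.9):
    print(y, bounded_logit(y, 0.0, 100.0))
```

A response exactly equal to either limit cannot be transformed, which is one reason the limits must be set to the natural, unreachable bounds of the response.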

**Arcsine square root**

The **arcsine square root** should be used for proportion data. A proportion is
a fraction between 0 and 1 inclusive. The assumption is that a batch of size *n*
is generated at the settings of each run. Each individual member of the batch has
a binomial outcome, either passing or failing a specified criterion.

\[\arcsin \begin{pmatrix}{\sqrt{Y}}\end{pmatrix}\]
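A minimal sketch of the arcsine square root transformation and its inverse follows. The function names and the batch counts are hypothetical; the proportion is taken to be the number passing divided by the batch size.

```python
import math


def arcsin_sqrt(p):
    """arcsin(sqrt(p)) in radians, for a proportion 0 <= p <= 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1 inclusive")
    return math.asin(math.sqrt(p))


def inverse_arcsin_sqrt(t):
    """Return the proportion from a transformed value t in [0, pi/2]."""
    if not 0.0 <= t <= math.pi / 2:
        raise ValueError("transformed value must lie between 0 and pi/2")
    return math.sin(t) ** 2


# Hypothetical run: 18 of 20 units in the batch pass the criterion.
passed, batch_size = 18, 20
t = arcsin_sqrt(passed / batch_size)
print(t, inverse_arcsin_sqrt(t))
```

Unlike the logit, this transformation is defined at the endpoints, so proportions of exactly 0 or 1 can be transformed.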
