Re-Expressing Data

If a scatterplot is not linear, the values of x and/or y can be re-expressed to straighten the scatterplot.

This can be done by raising the values to a power, taking their logarithm, or taking their reciprocal. The Ladder of Powers below lists the usual choices.
||~ The Ladder of Powers ||
|| **Power** || **Name** || **Comment** ||
|| 2 || The square of the data values, y^2 || Try this for unimodal distributions that are skewed to the left. ||
|| 1 || The raw data: no change at all. This is "home base." The farther you step from here up or down the ladder, the greater the effect. || Data that can take on both positive and negative values with no bounds are less likely to benefit from re-expression. ||
|| 1/2 || The square root of the data values. || Counts often benefit from a square root re-expression. For counted data, start here. ||
|| "0" || Although mathematicians define the "0th" power differently, for us the place is held by the logarithm. You may feel uneasy about logarithms. Don't worry, the computer or calculator does the work. || Measurements that cannot be negative, and especially values that grow by percentage increases such as salaries or populations, often benefit from a log re-expression. When in doubt, start here. If your data have zeros, try adding a small constant to all values before finding the logs. ||
|| -1/2 || The reciprocal square root || An uncommon re-expression, but sometimes useful. Changing the sign to take the negative of the reciprocal square root preserves the direction of relationships, which can be a bit simpler. ||
|| -1 || The reciprocal || Ratios of two quantities (miles per hour, for example) often benefit from a reciprocal. Often, the reciprocal will have simple units (hours per mile). Change the sign if you want to preserve the direction of relationships. If your data have zeros, try adding a small constant to all values before finding the reciprocal. ||
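
The rungs of the ladder can be tried quickly in code. The following is only a minimal sketch: the function name `re_express`, the `keep_direction` option, and the sample values are made up for illustration, and the logarithm is assumed to be base 10 (the LOG key on a calculator).

```python
import math

def re_express(values, power, keep_direction=True):
    """Re-express data using one rung of the Ladder of Powers.

    power 0 stands for the base-10 logarithm (an assumption of this sketch).
    For negative powers, keep_direction=True flips the sign so that
    larger raw values still give larger re-expressed values.
    """
    result = []
    for v in values:
        if power == 0:
            result.append(math.log10(v))      # the "0" rung is the log
        elif power < 0:
            new_value = v ** power            # e.g. 1/v or 1/sqrt(v)
            result.append(-new_value if keep_direction else new_value)
        else:
            result.append(v ** power)         # 2 -> square, 0.5 -> square root
    return result

data = [1, 10, 100, 1000]                     # made-up positive values
logs = re_express(data, 0)                    # log rung
roots = re_express(data, 0.5)                 # square-root rung
recips = re_express(data, -1)                 # negative reciprocal rung
```

Note that the log and reciprocal rungs fail on zeros, which is why the table suggests adding a small constant to all values first.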

Types of transformed functions using logarithms:

ŷ = a + b log x is a logarithmic function

log ŷ = a + bx is an exponential function

log ŷ = a + b log x is a power function
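
To see where the names come from, each model can be solved for ŷ. The derivation below is a sketch that assumes base-10 logarithms; with natural logs, replace 10 by e throughout.

```latex
\begin{align*}
\text{Logarithmic:} \quad & \hat{y} = a + b \log_{10} x \\
\text{Exponential:} \quad & \log_{10}\hat{y} = a + bx
  \;\Longrightarrow\; \hat{y} = 10^{a + bx} = 10^{a}\,(10^{b})^{x} \\
\text{Power:} \quad & \log_{10}\hat{y} = a + b \log_{10} x
  \;\Longrightarrow\; \hat{y} = 10^{a + b\log_{10} x} = 10^{a}\, x^{b}
\end{align*}
```

In the exponential model x ends up in the exponent, and in the power model x is raised to the power b.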


**Similar re-expression models**

To decide which re-expression is best when two re-expressed scatterplots look about equally straight, compare their r and r^2 values. Choose the re-expression whose r^2 is closest to 1 (that is, whose r is closest to 1 or -1, since a strong negative association is just as linear as a strong positive one).
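
The comparison can be sketched in code. This is only an illustration: the data are made up (y doubles at each step), the helper `pearson_r` is written out by hand, and base-10 logs are assumed.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data that grow by a constant percentage.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 8, 16, 32, 64]

# Candidate re-expressions of y, each paired with the raw x.
candidates = {
    "raw y": y,
    "sqrt(y)": [math.sqrt(v) for v in y],
    "log(y)": [math.log10(v) for v in y],
}

# r^2 for each candidate; squaring removes the sign of r.
r_squared = {name: pearson_r(x, ys) ** 2 for name, ys in candidates.items()}
best = max(r_squared, key=r_squared.get)

for name, value in r_squared.items():
    print(f"{name}: r^2 = {value:.4f}")
print("straightest:", best)
```

For these values the log re-expression comes out on top, which matches the table's advice for data that grow by percentage increases.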

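Remember: when you copy the regression line off the calculator, do NOT forget to write the log in front of whichever variable (x or y) you re-expressed! The calculator only knows the list of logged values as "x" or "y." Example: if x was re-expressed as log x, the calculator's ŷ = 1.94 + 0.0497x should be written as ŷ = 1.94 + 0.0497 log x.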
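
The difference this makes to a prediction can be checked with a short sketch. It uses the coefficients from the example and assumes the re-expression was a base-10 log of x; the function names and the input value 100 are chosen only for illustration.

```python
import math

A = 1.94      # intercept from the example
B = 0.0497    # slope from the example

def predict(x):
    """Correct use: the line was fit to log10(x), so log x first."""
    return A + B * math.log10(x)

def predict_without_log(x):
    """The mistake being warned about: plugging raw x into the line."""
    return A + B * x

correct = predict(100)              # 1.94 + 0.0497 * 2
wrong = predict_without_log(100)    # 1.94 + 0.0497 * 100
print(round(correct, 4), round(wrong, 4))
```

Leaving out the log gives a very different prediction for the same x.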