Scatterplots, Association, and Correlation

A scatterplot is the most common and effective way to display two quantitative variables. Trends and relationships are easy to recognize by just looking at the graph of plotted points. A scatterplot does not connect each data value with a line.

The difference between an association and a correlation is that an association is any relationship between two variables, while a correlation is specifically a linear relationship between two quantitative variables. Association is the more general term: it covers any kind of relationship, linear or not, and does not require the variables to be quantitative.
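To make the distinction concrete, here is a minimal sketch using made-up data (the values and the helper name `pearson_r` are assumptions for illustration, not part of the original notes). It computes the correlation coefficient for a perfectly linear relationship and for a perfectly curved one. Both are strong associations, but only the linear one has a strong correlation.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data chosen for illustration
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # straight-line relationship
curved = [x ** 2 for x in xs]      # U-shaped relationship

r_linear = pearson_r(xs, linear)   # essentially 1: strong linear association
r_curved = pearson_r(xs, curved)   # essentially 0: association, but not linear

print(r_linear, r_curved)
```

The curved data are completely determined by x, yet the correlation is zero because the pattern is symmetric rather than linear.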




**Key Concepts:**
 * **Scatterplots:** A scatterplot shows the relationship between two quantitative variables measured on the same cases.
 * **Looking at Scatterplots:**
 * Direction: If the pattern of the points runs from the upper left to the lower right, the direction is negative. If the pattern of the points runs from the lower left to the upper right, the direction is positive.
 * Form: If there is a straight-line relationship, it will appear as a cloud or swarm of points stretched out in a generally straight form. Look for any other patterns: is it an exponential curve? A sinusoidal wave pattern?
 * Scatter: Ask yourself, do the points follow a single tight stream? Or are they a vague cloud in which we can't really see a trend or pattern?
 * **Explanatory variable, response variable (x-variable, y-variable):** In a scatterplot, you must choose a role for each variable. Assign to the y-axis the variable that you hope to predict or explain. Assign to the x-axis the variable that accounts for, explains, predicts, or is otherwise responsible for the y-variable.
 * **Correlation:** Correlation is the numerical measure of the direction and strength of a linear association. More precisely, it measures the strength and direction of the linear relationship between two quantitative variables.
 * Different conditions:
 * **Quantitative Variables Condition**: Correlation applies __only__ to quantitative variables.
 * **Straight Enough Condition**: Correlation measures __only__ the strength of the linear association, and will be misleading if the relationship is not linear.
 * **Outlier Condition**: An outlier can make an otherwise small correlation look big or hide a large correlation. It can even give an otherwise positive association a negative correlation coefficient (and vice versa). When there's an outlier, it's often a good idea to __report the correlations__ with and without the point.
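The Outlier Condition can be illustrated with a short sketch. The data below are invented for the example, and the function name `correlation` is an assumption. The coefficient is computed here by standardizing both variables and averaging the products of the z-scores, dividing by n - 1, which is one common textbook form. The sketch reports the correlation with and without a single outlying point.

```python
import math

def correlation(xs, ys):
    """Correlation as the sum of z-score products divided by (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    z_products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Hypothetical data: six points with a strong positive linear trend
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

# The same data plus one made-up outlier far from the pattern
xs_out = xs + [20]
ys_out = ys + [-15]

r_without = correlation(xs, ys)          # close to +1
r_with = correlation(xs_out, ys_out)     # negative: one point flips the sign

print(f"without outlier: r = {r_without:.3f}")
print(f"with outlier:    r = {r_with:.3f}")
```

Reporting both values, as the condition suggests, shows the reader how much of the result depends on that single point.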