To define correlation, start with the idea that it is a comparison: a comparison between two datasets, distributions, lines, or curves. It tells you how closely the two move together, and whether they move in the same direction or in opposite directions (a negative relationship). Correlation's strength is its ability to find not only black-and-white relationships, but the gray ones in between as well.

Comparison Measure

The sample correlation coefficient, r, ranges from -1 to 1. In general, the values mean:

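1 Perfect positive correlation. As one variable rises, the other rises in exact proportion. All data points lie on a straight line that slopes upward.

0 No correlation. The datasets have no linear relationship.

-1 Perfect negative correlation. As one variable rises, the other falls in exact proportion. All data points lie on a straight line that slopes downward.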

Of course, r can take any decimal value between these numbers (the gray). As a rough rule of thumb, you are looking for an r above 0.7 (or below -0.7 for a negative relationship) to say that you have a high correlation between the two entities.
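
The three reference values can be checked with a few lines of Python. This is a minimal sketch that assumes the usual Pearson definition of r. The function name pearson_r and the sample data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each point from its own mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Illustrative data only
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [3, 5, 7, 9, 11])    # y = 2x + 1, points on a rising line
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # y = -2x + 12, points on a falling line
r_flat = pearson_r(x, [2, 1, 3, 1, 2])   # no linear trend

print(r_up, r_down, r_flat)
```

With this data, r_up is 1, r_down is -1, and r_flat is 0 up to floating-point rounding.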

The Equation

The sample correlation coefficient is obtained using this equation:
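
In its standard (Pearson) form, for n paired observations (x_i, y_i) with sample means x-bar and y-bar, the coefficient can be written as follows. The symbol names here are a notational assumption.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator measures how the two variables vary together. The denominator rescales that by how much each varies on its own, which is what keeps r between -1 and 1.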

The Regression Correlation Coefficient

The Simple Regression Calculation page shows how correlation is used. In that case, it tells you how well the regression line fits the data from which it is derived.
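
As a rough illustration of that idea: in simple linear regression, the square of r equals the fraction of the variation in y that the fitted line explains. The Python sketch below fits a least-squares line and computes that fraction. The function names fit_line and r_squared and the data are hypothetical, chosen only for this example.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys):
    """Fraction of the variation in ys explained by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    # Variation left over after the fit vs. total variation around the mean
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Illustrative data: roughly y = 2x with a little noise
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys)
# r carries the sign of the slope; max() guards against tiny negative rounding
r = math.copysign(math.sqrt(max(r2, 0.0)), slope)

print(slope, intercept, r2, r)
```

An r close to 1 or -1 means the points sit close to the fitted line. An r near 0 means the line explains little of the variation.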

Summary

With correlation defined, you have seen that it is a tool for comparison. It provides a measure of how strongly, and in which direction, two sets of data are related to each other.