Pearson's correlation coefficient

(noun)

a measure of the linear correlation (dependence) between two variables $X$ and $Y$, giving a value between $+1$ and $-1$ inclusive, where $+1$ is total positive correlation, 0 is no linear correlation, and $-1$ is total negative correlation
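
As a rough illustration of this definition, the sketch below computes the coefficient as the covariance of the two variables divided by the product of their standard deviations. The function name `pearson_r` and the sample data are assumptions made for this example; they do not come from the entry itself.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations. The 1/n or
    # 1/(n-1) factors of the covariance and the standard deviations cancel,
    # so they are omitted here.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data chosen only to show the two extremes of the range.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
r_linear = pearson_r(xs, [2 * x + 1 for x in xs])  # exact increasing line
r_negated = pearson_r(xs, [-x for x in xs])        # exact decreasing line
print(r_linear, r_negated)
```

Under these assumptions, an exact increasing linear relationship sits at the top of the range and an exact decreasing one at the bottom; real data will usually fall somewhere in between.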

Examples of Pearson's correlation coefficient in the following topics:

  • Coefficient of Correlation

    • The most common coefficient of correlation is known as the Pearson product-moment correlation coefficient, or Pearson's $r$.
    • Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations.
    • Pearson's correlation coefficient when applied to a population is commonly represented by the Greek letter $\rho$ (rho) and may be referred to as the population correlation coefficient or the population Pearson correlation coefficient.
    • Pearson's correlation coefficient when applied to a sample is commonly represented by the letter $r$ and may be referred to as the sample correlation coefficient or the sample Pearson correlation coefficient.
    • This fact holds for both the population and sample Pearson correlation coefficients.
  • Rank Correlation

    • It is common to regard these rank correlation coefficients as alternatives to Pearson's coefficient, used either to reduce the amount of calculation or to make the coefficient less sensitive to non-normality in distributions.
    • However, this view has little mathematical basis, as rank correlation coefficients measure a different type of relationship than the Pearson product-moment correlation coefficient.
    • In the same way, if $y$ always decreases when $x$ increases, the rank correlation coefficients will be $-1$ while the Pearson product-moment correlation coefficient may or may not be close to $-1$.
    • This graph shows a Spearman rank correlation of 1 and a Pearson correlation coefficient of 0.88.
    • In contrast, this does not give a perfect Pearson correlation.
  • Values of the Pearson Correlation

    • Give the symbols for Pearson's correlation in the sample and in the population
    • The Pearson product-moment correlation coefficient is a measure of the strength of the linear relationship between two variables.
    • It is referred to as Pearson's correlation or simply as the correlation coefficient.
    • The symbol for Pearson's correlation is "$\rho$" when it is measured in the population and "$r$" when it is measured in a sample.
    • Pearson's r can range from -1 to 1.
  • Hypothesis Tests with the Pearson Correlation

    • Pearson's correlation coefficient, $r$, tells us about the strength of the linear relationship between $x$ and $y$ points on a regression plot.
    • We decide this based on the sample correlation coefficient $r$ and the sample size $n$.
    • If the test concludes that the correlation coefficient is significantly different from 0, we say that the correlation coefficient is "significant."
    • If the test concludes that the correlation coefficient is not significantly different from 0 (it is close to 0), we say that the correlation coefficient is "not significant."
    • Use a hypothesis test in order to determine the significance of Pearson's correlation coefficient.
  • Other Types of Correlation Coefficients

    • Other types of correlation coefficients include intraclass correlation and the concordance correlation coefficient.
    • For example, in a paired data set where each "pair" is a single measurement made for each of two units (e.g., weighing each twin in a pair of identical twins) rather than two different measurements for a single unit (e.g., measuring height and weight for each individual), the ICC is a more natural measure of association than Pearson's correlation.
    • Thus, if we are correlating $X$ and $Y$, where, say, $Y=2X+1$, the Pearson correlation between $X$ and $Y$ is 1: a perfect correlation.
    • Whereas Pearson's correlation coefficient is unaffected by whether the biased or the unbiased estimator of the variance is used, the concordance correlation coefficient is not.
    • Distinguish the intraclass and concordance correlation coefficients from previously discussed correlation coefficients.
  • The Correlation Coefficient r

    • Use the correlation coefficient as another indicator (besides the scatterplot) of the strength of the relationship between x and y.
    • The correlation coefficient, r, developed by Karl Pearson in the early 1900s, is a numerical measure of the strength of association between the independent variable x and the dependent variable y.
    • If r = 1, there is perfect positive correlation.
    • We say "correlation does not imply causation."
    • The correlation coefficient r is the bottom item in the output screens for the LinRegTTest on the TI-83, TI-83+, or TI-84+ calculator (see previous section for instructions).
  • Properties of Pearson's r

    • State the relationship between the correlation of Y with X and the correlation of X with Y
    • A basic property of Pearson's r is that its possible range is from -1 to 1.
    • Pearson's correlation is symmetric in the sense that the correlation of X with Y is the same as the correlation of Y with X.
    • For example, the correlation of Weight with Height is the same as the correlation of Height with Weight.
    • A critical property of Pearson's r is that it is unaffected by linear transformations.
  • Statistical Literacy

    • The graph below showing the relationship between age and sleep is based on a graph that appears on this web page (http://www.shmoop.com/basic-statistics-probability/scatter-plots-correlation-examples.html).
    • Why might Pearson's correlation not be a good way to describe the relationship?
    • Pearson's correlation measures the strength of the linear relationship between two variables.
  • Randomization Tests: Association (Pearson's r)

    • A significance test for Pearson's r is described in the section "Inferential Statistics for b and r."
    • The approach is to consider the X variable fixed and compare the correlation obtained in the actual data to the correlations that could be obtained by rearranging the Y variable.
    • For the data shown in Table 1, the correlation between X and Y is 0.385.
    • There is only one arrangement of Y that would produce a higher correlation.
    • Therefore, there are two arrangements of Y that lead to correlations as high as or higher than that of the actual data.
  • Confidence Intervals
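
The randomization approach described under "Randomization Tests: Association (Pearson's r)" above can be sketched in code: hold X fixed, rearrange Y in every possible order, and count how many arrangements give a correlation at least as high as the observed one. This is a minimal sketch under stated assumptions. The helper names `pearson_r` and `randomization_test` are hypothetical, and the data are made up for illustration; they are not the values from Table 1, which is not reproduced on this page.

```python
import math
from itertools import permutations

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def randomization_test(xs, ys, tol=1e-12):
    """Exact one-tailed randomization test with X fixed and Y rearranged.

    Returns the observed r, the number of arrangements of Y whose r is at
    least as high as the observed r, and the total number of arrangements.
    """
    observed = pearson_r(xs, ys)
    count = 0
    total = 0
    for arrangement in permutations(ys):
        total += 1
        # Small tolerance so floating-point noise does not drop exact ties.
        if pearson_r(xs, arrangement) >= observed - tol:
            count += 1
    return observed, count, total

# Hypothetical data (NOT the Table 1 values referred to in the text).
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
observed, count, total = randomization_test(xs, ys)
p_value = count / total  # one-tailed probability
print(observed, count, total, p_value)
```

Enumerating every arrangement is only practical for very small samples, since the number of arrangements grows factorially with the sample size; for larger data sets, a random subset of arrangements is commonly used instead.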

Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.