Statnote 14: the correlation of two variables (Pearson's 'r')

Anthony Hilton; Richard A. Armstrong

Research output: Contribution to specialist publication or newspaper › Article

Abstract

Pearson's correlation coefficient (‘r’) is one of the most widely used of all statistics. Nevertheless, care needs to be used in interpreting the results because with large numbers of observations, quite small values of ‘r’ become significant and the X variable may only account for a small proportion of the variance in Y. Hence, ‘r squared’ should always be calculated and included in a discussion of the significance of ‘r’. The use of ‘r’ also assumes that the data follow a bivariate normal distribution (see Statnote 17) and this assumption should be examined prior to the study. If the data do not conform to such a distribution, the use of a non-parametric correlation coefficient should be considered. A significant correlation should not be interpreted as indicating ‘causation’ especially in observational studies, in which the two variables may be correlated because of their mutual correlations with other confounding variables.
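The abstract's warning about large samples can be illustrated with a short calculation. The sketch below is not taken from the Statnote; the function names are hypothetical, and the significance threshold uses a normal approximation to the t distribution, which is an assumption that is only reasonable for large n. It computes 'r' for paired data and estimates the smallest 'r' that would reach significance at a given sample size, together with the corresponding 'r squared'.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient for paired samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def smallest_significant_r(n, alpha=0.05):
    """Approximate smallest |r| that is significant at level alpha (two-tailed).

    Rearranges t = r * sqrt(n - 2) / sqrt(1 - r^2) to solve for r.
    Assumption: the t critical value is replaced by the normal quantile,
    so the result is only a rough guide unless n is large.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / math.sqrt(z * z + n - 2)


if __name__ == "__main__":
    # As n grows, ever smaller correlations become "significant",
    # while r squared (the share of variance accounted for) shrinks.
    for n in (30, 100, 1000, 10000):
        r = smallest_significant_r(n)
        print(f"n={n:>6}  smallest significant r={r:.3f}  r squared={r * r:.4f}")
```

Under these assumptions, a correlation of roughly 0.06 would be significant at n = 1000 while accounting for less than 1% of the variance in Y, which is why 'r squared' is worth reporting alongside the significance of 'r'.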

Original language	English
Pages	34-36
Number of pages	3
Volume	2008
Specialist publication	Microbiologist
Publication status	Published - Sept 2008

Keywords

Pearson's correlation coefficient
statistics

Cite this

@misc{f0ab4d036b974e2495fff353ec48c9cf,
title = "Statnote 14: the correlation of two variables (Pearson's 'r')",
abstract = "Pearson's correlation coefficient ({\textquoteleft}r{\textquoteright}) is one of the most widely used of all statistics. Nevertheless, care needs to be used in interpreting the results because with large numbers of observations, quite small values of {\textquoteleft}r{\textquoteright} become significant and the X variable may only account for a small proportion of the variance in Y. Hence, {\textquoteleft}r squared{\textquoteright} should always be calculated and included in a discussion of the significance of {\textquoteleft}r{\textquoteright}. The use of {\textquoteleft}r{\textquoteright} also assumes that the data follow a bivariate normal distribution (see Statnote 17) and this assumption should be examined prior to the study. If the data do not conform to such a distribution, the use of a non-parametric correlation coefficient should be considered. A significant correlation should not be interpreted as indicating {\textquoteleft}causation{\textquoteright} especially in observational studies, in which the two variables may be correlated because of their mutual correlations with other confounding variables.",
keywords = "Pearson's correlation coefficient, statistics",
author = "Hilton, Anthony and Armstrong, {Richard A.}",
year = "2008",
month = sep,
language = "English",
volume = "2008",
pages = "34--36",
journal = "Microbiologist",
issn = "1479-2699",
}

TY  - GEN
T1  - Statnote 14: the correlation of two variables (Pearson's 'r')
AU  - Hilton, Anthony
AU  - Armstrong, Richard A.
PY  - 2008/9
Y1  - 2008/9
AB  - Pearson's correlation coefficient (‘r’) is one of the most widely used of all statistics. Nevertheless, care needs to be used in interpreting the results because with large numbers of observations, quite small values of ‘r’ become significant and the X variable may only account for a small proportion of the variance in Y. Hence, ‘r squared’ should always be calculated and included in a discussion of the significance of ‘r’. The use of ‘r’ also assumes that the data follow a bivariate normal distribution (see Statnote 17) and this assumption should be examined prior to the study. If the data do not conform to such a distribution, the use of a non-parametric correlation coefficient should be considered. A significant correlation should not be interpreted as indicating ‘causation’ especially in observational studies, in which the two variables may be correlated because of their mutual correlations with other confounding variables.
KW  - Pearson's correlation coefficient
KW  - statistics
UR  - http://issuu.com/societyforappliedmicrobiology/docs/sept08micro
M3  - Article
SN  - 1479-2699
VL  - 2008
SP  - 34
EP  - 36
JO  - Microbiologist
JF  - Microbiologist
ER  -
