Statnote 14: the correlation of two variables (Pearson's 'r')

Anthony Hilton, Richard A. Armstrong

Research output: Contribution to specialist publication › Article

Abstract

Pearson's correlation coefficient (‘r’) is one of the most widely used of all statistics. Nevertheless, care is needed in interpreting the results because, with large numbers of observations, quite small values of ‘r’ become significant even though the X variable may account for only a small proportion of the variance in Y. Hence, ‘r squared’ should always be calculated and included in a discussion of the significance of ‘r’. The use of ‘r’ also assumes that the data follow a bivariate normal distribution (see Statnote 17), and this assumption should be examined prior to the study. If the data do not conform to such a distribution, the use of a non-parametric correlation coefficient should be considered. A significant correlation should not be interpreted as indicating ‘causation’, especially in observational studies, in which the two variables may be correlated because of their mutual correlations with other confounding variables.
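The point about large samples can be illustrated with a minimal sketch. It is not taken from the article: the function names, the small data set, and the values r = 0.10 and n = 1000 are all hypothetical, chosen only for illustration. The p-value uses a normal approximation to the t distribution with n - 2 degrees of freedom, which is an assumption that is only reasonable when n is large.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson's r computed directly from its definition."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def approx_two_sided_p(r, n):
    """Approximate two-sided p-value.

    Assumption: the t distribution is replaced by the standard normal,
    which is only sensible for large n.
    """
    z = abs(t_statistic(r, n))
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical, made-up measurements used only to exercise pearson_r
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r_example = pearson_r(x, y)

# Hypothetical weak correlation observed in a large sample
r_weak, n_large = 0.10, 1000
p_weak = approx_two_sided_p(r_weak, n_large)
r_squared_weak = r_weak ** 2  # proportion of variance in Y accounted for by X

print(f"example r = {r_example:.3f}")
print(f"r = {r_weak}, n = {n_large}: p ~ {p_weak:.4f}, r squared = {r_squared_weak:.2f}")
```

In this sketch the weak correlation is significant at the 5% level, yet X accounts for only about 1% of the variance in Y, which is why ‘r squared’ belongs alongside the significance test. For small samples, or where an exact p-value is needed, a statistics package that implements the t distribution would be the safer choice.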
Original language: English
Pages: 34-36
Number of pages: 3
Volume: 2008
Specialist publication: Microbiologist
Publication status: Published - Sep 2008

Keywords

  • Pearson's correlation coefficient
  • statistics

Cite this

Hilton, Anthony ; Armstrong, Richard A. / Statnote 14: the correlation of two variables (Pearson's 'r'). In: Microbiologist. 2008 ; Vol. 2008. pp. 34-36.
@misc{f0ab4d036b974e2495fff353ec48c9cf,
title = "Statnote 14: the correlation of two variables (Pearson's 'r')",
keywords = "Pearson's correlation coefficient, statistics",
author = "Anthony Hilton and Armstrong, {Richard A.}",
year = "2008",
month = "9",
language = "English",
volume = "2008",
pages = "34--36",
journal = "Microbiologist",
issn = "1479-2699",

}

TY - GEN

T1 - Statnote 14: the correlation of two variables (Pearson's 'r')

AU - Hilton, Anthony

AU - Armstrong, Richard A.

PY - 2008/9

Y1 - 2008/9

KW - Pearson's correlation coefficient

KW - statistics

UR - http://issuu.com/societyforappliedmicrobiology/docs/sept08micro

M3 - Article

VL - 2008

SP - 34

EP - 36

JO - Microbiologist

JF - Microbiologist

SN - 1479-2699

ER -