Pearson's correlation coefficient

What is Pearson’s correlation coefficient?

Pearson’s correlation coefficient, also known as Pearson’s r, is a statistical measurement of the strength and direction of the linear relationship between two variables. Its value always lies between -1 and +1.

In simple words, the Pearson correlation coefficient indicates how closely a change in one variable is accompanied by a change in the other. It is built on the concept of covariance: it is the covariance of the two variables divided by the product of their standard deviations, which makes it a widely used method for measuring how two variables vary together.

Example: a child’s age increases with time. As time passes, the child’s age in years increases too.
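The link to covariance can be made concrete with a short sketch. The function name `pearson_from_covariance` and the sample numbers below are illustrative assumptions, not part of the original example; the code simply divides the covariance by the product of the two standard deviations.

```python
import math

def pearson_from_covariance(xs, ys):
    """Pearson's r as covariance divided by the product of standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and variances. The 1/n factors cancel in the ratio,
    # so the sample (n - 1) versions would give the same r.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov_xy / math.sqrt(var_x * var_y)

# Hypothetical data: elapsed years vs. a child's age in years.
elapsed_years = [0, 1, 2, 3, 4]
age_years = [3, 4, 5, 6, 7]
r = pearson_from_covariance(elapsed_years, age_years)
print(round(r, 4))
```

Because the age rises by exactly one year per elapsed year, the points fall on a straight line and the coefficient comes out at +1.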

What does Pearson’s correlation coefficient test do?

The Pearson correlation coefficient looks at the relationship between two variables and determines how closely they move together. Conceptually, it draws a line of best fit through the data points of the two variables, and the coefficient indicates how far the points lie from that line. The value can be worked out by hand or with a Pearson correlation coefficient calculator.

Generally, there are two types of linear relationship between two variables:

Positive linear relationship: when one variable goes up, the other goes up too. Example: as the number of days increases, the plant's height increases too.

Negative linear relationship: when one variable goes up, the other goes down. Example: when a car is travelling to a destination, as the distance travelled increases, the distance remaining to the destination decreases.
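The two examples can be checked numerically. In the sketch below, the plant heights and the 100 km trip length are made-up numbers chosen only for illustration, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Positive relationship: a plant grows as the days pass (hypothetical heights).
days = [1, 2, 3, 4, 5]
plant_height_cm = [2.0, 2.9, 4.1, 5.2, 5.8]

# Negative relationship: distance remaining on an assumed 100 km trip.
travelled_km = [0, 20, 40, 60, 80]
remaining_km = [100 - d for d in travelled_km]

r_plant = pearson_r(days, plant_height_cm)
r_car = pearson_r(travelled_km, remaining_km)
print(f"plant: r = {r_plant:.3f}")  # positive sign
print(f"car:   r = {r_car:.3f}")    # negative sign
```

The sign of r carries the direction of the relationship: positive for the plant, negative for the car.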

Pearson correlation coefficient formula

Being a statistical measurement, Pearson's r has a systematic method of calculation. In this section we will see what the Pearson correlation coefficient formula is and what it means:

r = [NΣxy - (Σx)(Σy)] / √([NΣx² - (Σx)²] [NΣy² - (Σy)²])

Where the variables mean:

  • N = the number of pairs of scores
  • Σxy = the sum of the products of paired scores
  • Σx = the sum of x scores
  • Σy = the sum of y scores
  • Σx² = the sum of squared x scores
  • Σy² = the sum of squared y scores
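The quantities listed above are all that is needed to compute r. The following is a minimal sketch of that calculation; the function name `pearson_from_sums` and its parameter names are assumptions made for illustration, and the check at the end uses made-up scores.

```python
import math

def pearson_from_sums(n, sum_xy, sum_x, sum_y, sum_x2, sum_y2):
    """Pearson's r from the six summary quantities N, Σxy, Σx, Σy, Σx², Σy²."""
    numerator = n * sum_xy - sum_x * sum_y
    # Each bracket is N times a sum of squares minus the squared plain sum.
    x_term = n * sum_x2 - sum_x ** 2
    y_term = n * sum_y2 - sum_y ** 2
    if x_term == 0 or y_term == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / math.sqrt(x_term * y_term)

# Hypothetical check with x = 1, 2, 3 and y = 2, 4, 6 (a perfect line):
# Σxy = 28, Σx = 6, Σy = 12, Σx² = 14, Σy² = 56
r = pearson_from_sums(n=3, sum_xy=28, sum_x=6, sum_y=12, sum_x2=14, sum_y2=56)
print(r)
```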

How to calculate the Pearson correlation coefficient

In this section, we will work through an example: the height of children increases with their age (setting aside some exceptions).

Step 1: Make a table with six columns. The first three hold your available data (child, age x and height y), and the other three are for the calculated values xy, x² and y².

Child   Age (yrs), x   Height (ft), y   xy    x²    y²
1       1              2
2       4              3
3       10             4
4       13             5
5       20             6

Step 2: Do the basic multiplication to complete the blank fields.

 

Child   Age (yrs), x   Height (ft), y   xy    x²    y²
1       1              2                2     1     4
2       4              3                12    16    9
3       10             4                40    100   16
4       13             5                65    169   25
5       20             6                120   400   36
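The multiplications in this step are easy to get wrong by hand, so it can help to generate the three calculated columns programmatically. A small sketch using the ages and heights from the table; the variable names are illustrative.

```python
# Ages (x) and heights (y) of the five children in the table.
ages = [1, 4, 10, 13, 20]
heights = [2, 3, 4, 5, 6]

# One row per child: (x, y, xy, x squared, y squared).
rows = [(x, y, x * y, x ** 2, y ** 2) for x, y in zip(ages, heights)]

print(f"{'Child':<7}{'x':<5}{'y':<5}{'xy':<6}{'x^2':<6}{'y^2':<6}")
for child, (x, y, xy, x2, y2) in enumerate(rows, start=1):
    print(f"{child:<7}{x:<5}{y:<5}{xy:<6}{x2:<6}{y2:<6}")
```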

Step 3: Add one more row which will give you the sum of each column

Child   Age (yrs), x   Height (ft), y   xy    x²    y²
1       1              2                2     1     4
2       4              3                12    16    9
3       10             4                40    100   16
4       13             5                65    169   25
5       20             6                120   400   36
Total   48             20               239   686   90

 

Step 4: Substitute the values in the Pearson correlation coefficient formula to derive your answer. 

  • N = 5
  • Σx = 48
  • Σy = 20
  • Σxy = 239
  • Σx² = 686
  • Σy² = 90

Hence, according to the formula, the substitutions will look like:

r = [5(239) - (48)(20)] / √([5(686) - (48)²] [5(90) - (20)²])

r = (1195 - 960) / √((3430 - 2304)(450 - 400))

r = 235 / √((1126)(50))

r = 235 / √56300

r = 235 / 237.28

r ≈ 0.99
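As a cross-check on the hand calculation, the column totals and the coefficient can be recomputed directly from the ages and heights in the table. This is a sketch with illustrative variable names; it also relies on the fact that a valid r can never be larger than 1 in absolute value.

```python
import math

ages = [1, 4, 10, 13, 20]   # x
heights = [2, 3, 4, 5, 6]   # y
n = len(ages)

# Column totals.
sum_x = sum(ages)
sum_y = sum(heights)
sum_xy = sum(x * y for x, y in zip(ages, heights))
sum_x2 = sum(x ** 2 for x in ages)
sum_y2 = sum(y ** 2 for y in heights)

# Numerator and the two bracketed terms under the square root.
numerator = n * sum_xy - sum_x * sum_y
x_term = n * sum_x2 - sum_x ** 2
y_term = n * sum_y2 - sum_y ** 2
r = numerator / math.sqrt(x_term * y_term)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 48 20 239 686 90
print(numerator, x_term, y_term)             # 235 1126 50
print(round(r, 2))                           # 0.99
assert -1.0 <= r <= 1.0  # sanity check: r must lie between -1 and +1
```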

Step 5: Interpret your Pearson correlation coefficient value to identify how strong or weak the relationship between your variables is. Published guidelines classify the strength of association by the size of the coefficient: the closer the value is to +1 or -1, the stronger the relationship, and the closer it is to 0, the weaker the relationship.

In our example, the Pearson correlation coefficient value is very close to +1, so the strength of the positive association between the two variables, age and height, is large.
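The interpretation step can be written as a small lookup. The cut-offs below (0.1, 0.3 and 0.5) are an assumption: they follow one commonly cited rule of thumb, not necessarily the guideline used in this article, and different fields use different thresholds. The function name `describe_strength` is also hypothetical.

```python
def describe_strength(r):
    """Label the size and direction of r using assumed rule-of-thumb cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a Pearson coefficient must lie between -1 and +1")
    size = abs(r)
    # Assumed thresholds; adjust them to the guideline your field uses.
    if size >= 0.5:
        strength = "large"
    elif size >= 0.3:
        strength = "medium"
    elif size >= 0.1:
        strength = "small"
    else:
        strength = "negligible"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction

print(describe_strength(0.99))   # ('large', 'positive')
print(describe_strength(-0.2))   # ('small', 'negative')
```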

The Pearson correlation coefficient value you get depends heavily on the sample you choose and how you measure the variables. A scatterplot of the data will show you how the variables are related even before you start your calculation. If the points lie close to the line of best fit, the relationship is strong; if they are scattered far from the line, the relationship is weak. If the line is almost parallel to the x-axis and the points are spread randomly across the graph, we can say that there is no correlation between the variables.

Examples of Pearson correlation coefficient

As discussed, the scatterplot shows how strong or weak the relationship is. Here is what each case looks like:

Large positive correlation

  • Correlation is almost +1
  • Points lie very close to the line
  • The slope is positive
  • If one variable increases, the other increases too
  • A change in one variable is accompanied by a nearly proportional change in the other

Medium positive correlation

  • Positive correlation
  • Above +0.8 but below +1
  • Strong linear relationship

Small negative correlation

  • Points are not as close to the line
  • Negative linear correlation of approx. -0.5
  • As one variable increases, the other tends to decrease

Weak/no correlation

  • Points lie far from the line
  • The line is almost parallel to the x-axis
  • Correlation is approx. 0
  • No linear relationship between the variables can be inferred
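The cases above can be reproduced with small data sets. The sketch below fits a least-squares line to three made-up sets of points and reports the slope alongside r, so the link between a near-flat line and a near-zero correlation is visible in the numbers. All data values and the helper name `slope_and_r` are assumptions for illustration.

```python
import math

def slope_and_r(xs, ys):
    """Return the least-squares slope and Pearson's r for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
# Hypothetical y values for each kind of scatterplot.
examples = {
    "strong positive": [2.1, 3.9, 6.2, 7.8, 10.1],
    "negative": [5, 3, 4, 2, 3],
    "none": [3, 1, 4, 1, 3],
}

results = {name: slope_and_r(xs, ys) for name, ys in examples.items()}
for name, (slope, r) in results.items():
    print(f"{name:<16} slope = {slope:+.2f}   r = {r:+.2f}")
```

In the last data set the fitted line is flat, which matches the description of a line almost parallel to the x-axis.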
