__Spearman’s Rank Correlation Coefficient__

Spearman’s Rank Correlation Coefficient, also referred to as Spearman’s rho, is a non-parametric alternative to the Pearson correlation. It is used when either the relationship between the data samples is non-linear* (though still monotonic), or the data contained within those samples is ordinal**. The statistic that this method produces is known as “rho”, hence the alternative name (“Spearman’s rho”).

As is the case with most non-parametric alternatives, this procedure works on the ranks of the observations rather than on their raw values.
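To make the rank system concrete, here is a minimal sketch (not part of the original walkthrough) showing that Spearman’s rho is the Pearson correlation computed on the ranks of the data. It borrows the two survey vectors from the example that follows; the variable names `rank_x`, `rank_y`, `rho_from_ranks`, and `rho_builtin` are my own and purely illustrative.

```r
# Survey vectors taken from the example in this article
x <- c(5, 1, 1, 1, 3, 2, 5, 3, 3, 2, 4, 4, 4, 2, 5, 4, 4, 4, 4, 2)
y <- c(4, 5, 4, 3, 1, 1, 5, 4, 5, 4, 3, 4, 3, 4, 5, 5, 3, 3, 5, 4)

# rank() gives tied values the average of the positions they occupy
rank_x <- rank(x)
rank_y <- rank(y)

# Pearson correlation computed on the ranks
rho_from_ranks <- cor(rank_x, rank_y, method = "pearson")

# Built-in Spearman correlation, for comparison
rho_builtin <- cor(x, y, method = "spearman")

# Check that the two approaches agree
all.equal(rho_from_ranks, rho_builtin)
```

Because only the ranks enter the calculation, any transformation that preserves the ordering of the values leaves rho unchanged.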

__Example:__ We are presented with the following data vectors from two survey prompts:

# Create data vector (scale 1-5) #

x <- c(5, 1, 1, 1, 3, 2, 5, 3, 3, 2, 4, 4, 4, 2, 5, 4, 4, 4, 4, 2)

# Create data vector (scale 1-5) #

y <- c(4, 5, 4, 3, 1, 1, 5, 4, 5, 4, 3, 4, 3, 4, 5, 5, 3, 3, 5, 4)

# Create Spearman’s Rank Correlation #

cor.test(x, y, method = "spearman")

This produces the output:

*Spearman's rank correlation rho*

data: x and y

S = 1072.1, p-value = 0.4126

alternative hypothesis: true rho is not equal to 0

sample estimates:

rho

0.1939455

From this output, we can first determine that the correlation is not statistically significant: the p-value of 0.4126 is far above the common alpha level of .05, so we fail to reject the null hypothesis that rho is equal to 0. Next, we will assess the rho value, which is 0.1939455. This value is measured on the same -1 to 1 scale as Pearson’s correlation. Since this value is relatively low, the sample suggests at most a weak positive correlation.
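The same reading of the output can be done programmatically. The sketch below is an illustration rather than part of the original analysis: it stores the `cor.test()` result and pulls out the estimate and p-value. The names `result`, `rho`, `p_value`, `alpha`, and `significant` are assumptions of mine, and `exact = FALSE` is added so that R uses the asymptotic p-value directly instead of warning about the tied values in these vectors.

```r
# Survey vectors from the example above
x <- c(5, 1, 1, 1, 3, 2, 5, 3, 3, 2, 4, 4, 4, 2, 5, 4, 4, 4, 4, 2)
y <- c(4, 5, 4, 3, 1, 1, 5, 4, 5, 4, 3, 4, 3, 4, 5, 5, 3, 3, 5, 4)

# Store the test result instead of only printing it
result <- cor.test(x, y, method = "spearman", exact = FALSE)

# Extract the pieces discussed in the text
rho <- unname(result$estimate)
p_value <- result$p.value

# Compare the p-value against the chosen alpha level
alpha <- 0.05
significant <- p_value < alpha

cat(sprintf("rho = %.4f, p-value = %.4f, significant at %.2f: %s\n",
            rho, p_value, alpha, significant))
```

Keeping the result in an object makes it easier to reuse the estimate and p-value in a report instead of copying them by hand.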

*- For an example of what non-linear data might resemble, please refer to the article “(R) Polynomial Regression”, published April 17, 2018.

**- For example, survey response data in which the respondent was asked to rate a particular item on a scale of 1-10.

__Kendall Rank Correlation Coefficient__

The Kendall Rank Correlation Coefficient, also referred to as Kendall’s tau, is another non-parametric alternative to the Pearson correlation. Like Spearman’s rho, Kendall’s tau is used when either the relationship between the data samples is non-linear (though still monotonic), or the data contained within those samples is ordinal. The statistic that this method produces is known as “tau”. As is the case with most non-parametric alternatives, this procedure relies on the ordering of the observations rather than on their raw values.
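As a rough illustration of how the ordering is used, the sketch below computes Kendall’s tau by hand: every pair of observations is classified as concordant (both variables move in the same direction), discordant (they move in opposite directions), or tied. It uses the tau-b form, which adjusts the denominator for ties, and checks the result against `cor()`. The helper names `pairs_idx`, `dx`, `dy`, `s`, and `tau_b` are my own, not part of any library.

```r
# Survey vectors used throughout this article
x <- c(5, 1, 1, 1, 3, 2, 5, 3, 3, 2, 4, 4, 4, 2, 5, 4, 4, 4, 4, 2)
y <- c(4, 5, 4, 3, 1, 1, 5, 4, 5, 4, 3, 4, 3, 4, 5, 5, 3, 3, 5, 4)

# All distinct pairs of observations (one pair per column)
pairs_idx <- combn(length(x), 2)

# Direction of change within each pair: -1, 0 (tie), or 1
dx <- sign(x[pairs_idx[1, ]] - x[pairs_idx[2, ]])
dy <- sign(y[pairs_idx[1, ]] - y[pairs_idx[2, ]])

# Concordant pairs contribute +1, discordant pairs -1, ties 0
s <- sum(dx * dy)

# tau-b: divide by the geometric mean of the untied pair counts
tau_b <- s / sqrt(sum(dx != 0) * sum(dy != 0))

# Check against the built-in calculation
all.equal(tau_b, cor(x, y, method = "kendall"))
```

This brute-force version visits every pair, so it is only meant for small vectors like these.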

__Example:__ We are presented with the following data vectors from two survey prompts:

# Create data vector (scale 1-5) #

x <- c(5, 1, 1, 1, 3, 2, 5, 3, 3, 2, 4, 4, 4, 2, 5, 4, 4, 4, 4, 2)

# Create data vector (scale 1-5) #

y <- c(4, 5, 4, 3, 1, 1, 5, 4, 5, 4, 3, 4, 3, 4, 5, 5, 3, 3, 5, 4)

# Create Kendall Rank Correlation #

cor.test(x, y, method = "kendall")

This produces the output:

*Kendall's rank correlation tau*

data: x and y

z = 0.84528, p-value = 0.398

alternative hypothesis: true tau is not equal to 0

sample estimates:

tau

0.1617271

From this output, we can first determine that the correlation is not statistically significant: the p-value of 0.398 is far above the common alpha level of .05, so we fail to reject the null hypothesis that tau is equal to 0. Next, we will assess the tau value, which is 0.1617271. This value is measured on the same -1 to 1 scale as Pearson’s correlation. Since this value is relatively low, the sample suggests at most a weak positive correlation.

__Conclusion:__ While both methods provide similar functionality, Spearman’s rank correlation is used far more frequently than the Kendall rank correlation. I typically run both methods, compare the results of each, and then report my findings in the subsequent research write-up.
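For readers who want to compare the two methods side by side, here is one possible sketch. It loops over both method names and collects the estimates and p-values in a single data frame. The names `method_names` and `results` are illustrative choices of mine, and `exact = FALSE` is again an added assumption to avoid the warning about ties.

```r
# Survey vectors used throughout this article
x <- c(5, 1, 1, 1, 3, 2, 5, 3, 3, 2, 4, 4, 4, 2, 5, 4, 4, 4, 4, 2)
y <- c(4, 5, 4, 3, 1, 1, 5, 4, 5, 4, 3, 4, 3, 4, 5, 5, 3, 3, 5, 4)

method_names <- c("spearman", "kendall")

# Run each test and keep one row of results per method
results <- do.call(rbind, lapply(method_names, function(m) {
  test <- cor.test(x, y, method = m, exact = FALSE)
  data.frame(method = m,
             estimate = unname(test$estimate),
             p_value = test$p.value)
}))

print(results)
```

On this data both methods lead to the same conclusion, which is the kind of agreement worth noting when reporting findings.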

I hope that you have found this article to be informative and interesting. Until next time, stay inquisitive, Data Heads!
