Reflections of a Data Scientist: (R) Mann-Whitney U Test (SPSS)

In the last article, we discussed non-parametric tests, specifically the Wilcoxon Signed-Rank Test. In this article, we will discuss another test of a similar nature, the Mann-Whitney U Test (also known as the Wilcoxon rank sum test, which is how R labels it). The Mann-Whitney U Test is a close relative of the Wilcoxon Signed-Rank Test, as it also works on ranks rather than raw values, and it is typically employed when the data do not satisfy the assumptions of a parametric test.

The Mann-Whitney U Test provides a non-parametric alternative to the two-sample Student’s t-test. While I would generally recommend the latter due to its robustness, the Mann-Whitney U Test appears from time to time in research papers. For that reason, and for a better understanding of the underlying methodology, it is worth at least a brief look.

Example:

A scientist creates a chemical which he believes changes the temperature of water. He applies this chemical to water and takes the following measurements:

70, 74, 76, 72, 75, 74, 71, 71

He then measures the temperature of samples to which the chemical was not applied:

74, 75, 73, 76, 74, 77, 78, 75

Can the scientist conclude, at the 95% confidence level (α = 0.05), that his chemical is in some way altering the temperature of the water?

For this, we will utilize the code:

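# Temperatures measured with the chemical applied
N1 <- c(70, 74, 76, 72, 75, 74, 71, 71)
# Temperatures measured without the chemical
N2 <- c(74, 75, 73, 76, 74, 77, 78, 75)

# Two-sided Mann-Whitney U (Wilcoxon rank sum) test on independent samples
wilcox.test(N2, N1, alternative = "two.sided", paired = FALSE, conf.level = 0.95)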

Which produces the output:

Wilcoxon rank sum test with continuity correction

data: N2 and N1
W = 50.5, p-value = 0.05575
alternative hypothesis: true location shift is not equal to 0
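
To make the reported statistic less of a black box, the W value can be reproduced by hand from the ranks. The following is a minimal sketch rather than R's internal implementation; the names combined, ranks, rank_sum_N2, and W are my own choices for illustration.

```r
# Measurements from the example
N1 <- c(70, 74, 76, 72, 75, 74, 71, 71)  # chemical applied
N2 <- c(74, 75, 73, 76, 74, 77, 78, 75)  # chemical not applied

# Rank all 16 observations together; tied values share their average rank
combined <- c(N2, N1)
ranks <- rank(combined)

# Sum the ranks that belong to N2 (the first 8 entries of 'combined')
rank_sum_N2 <- sum(ranks[seq_along(N2)])

# Subtract the smallest rank sum N2 could possibly have: n2 * (n2 + 1) / 2
n2 <- length(N2)
W <- rank_sum_N2 - n2 * (n2 + 1) / 2

print(W)
```

The result, 50.5, matches the W reported by wilcox.test() when N2 is passed as the first argument.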

From this output we can conclude:

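With a p-value of 0.05575 (0.05575 > 0.05), we fail to reject the null hypothesis at the 95% confidence level. In other words, the data do not provide sufficient evidence that the scientist's chemical is altering the temperature of the water. Note that this is not the same as showing that the chemical has no effect.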

The t-test equivalent of this analysis would resemble:

(If we were measuring mean values)

t.test(N2, N1, alternative = "two.sided", var.equal = TRUE, paired=FALSE, conf.level = 0.95)

Which produces the output:

Two Sample t-test

data: N2 and N1
t = 2.4558, df = 14, p-value = 0.02773
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3007929 4.4492071
sample estimates:
mean of x mean of y
75.250 72.875
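
For comparison with the rank-based approach, the t statistic in this output can also be rebuilt from the pooled variance. This is a sketch under the same equal-variance assumption that var.equal = TRUE makes; the variable names are mine and are not part of the t.test() output.

```r
N1 <- c(70, 74, 76, 72, 75, 74, 71, 71)  # chemical applied
N2 <- c(74, 75, 73, 76, 74, 77, 78, 75)  # chemical not applied

n1 <- length(N1)
n2 <- length(N2)
deg_free <- n1 + n2 - 2

# Pooled variance: the two sample variances weighted by their degrees of freedom
pooled_var <- ((n1 - 1) * var(N1) + (n2 - 1) * var(N2)) / deg_free

# Standard error of the difference in means
std_err <- sqrt(pooled_var * (1 / n1 + 1 / n2))

# t statistic and two-sided p-value
t_stat <- (mean(N2) - mean(N1)) / std_err
p_value <- 2 * pt(-abs(t_stat), df = deg_free)

print(c(t = t_stat, df = deg_free, p = p_value))
```

The values agree with the t = 2.4558, df = 14, and p-value = 0.02773 shown in the output.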

Comparing the output of both tests, you can see the difference in the p-values produced by the two methods: 0.05575 (Wilcoxon rank sum) vs. 0.02773 (t-test). At the 0.05 level, the t-test rejects the null hypothesis while the Mann-Whitney U Test does not.
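
If you would rather not read the two p-values off the printed output, they can be pulled out of the result objects and checked against the same threshold. This is a small sketch; alpha, p_values, and reject_null are names I have introduced, and suppressWarnings() is used only to silence R's notice that an exact p-value cannot be computed when ties are present.

```r
N1 <- c(70, 74, 76, 72, 75, 74, 71, 71)  # chemical applied
N2 <- c(74, 75, 73, 76, 74, 77, 78, 75)  # chemical not applied

# Run both tests with the same arguments used earlier in the article
wilcox_result <- suppressWarnings(
  wilcox.test(N2, N1, alternative = "two.sided", paired = FALSE)
)
t_result <- t.test(N2, N1, alternative = "two.sided", var.equal = TRUE, paired = FALSE)

# Collect the p-values and compare each one to the 0.05 threshold
alpha <- 0.05
p_values <- c(wilcoxon = wilcox_result$p.value, t_test = t_result$p.value)
reject_null <- p_values < alpha

print(round(p_values, 5))
print(reject_null)
```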

Below are the steps necessary to perform the above analysis within the SPSS platform.

Mann-Whitney U Test Example:

For this particular test, the data must be structured in a way that may seem unconventional. The measurements from both groups are combined into one single variable, and a second variable identifies the group to which each case belongs.

Below is our example data set:
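
For readers following along in R, the same stacked layout can be sketched as a data frame. The variable names N1N2 and Group are taken from the SPSS steps in this article; coding the treated samples as group 1 and the untreated samples as group 2 is my assumption, as is the data frame name example_data.

```r
N1 <- c(70, 74, 76, 72, 75, 74, 71, 71)  # chemical applied
N2 <- c(74, 75, 73, 76, 74, 77, 78, 75)  # chemical not applied

# One column of measurements, one column identifying the group.
# Assumption: group 1 = chemical applied, group 2 = chemical not applied.
example_data <- data.frame(
  N1N2  = c(N1, N2),
  Group = factor(rep(c(1, 2), times = c(length(N1), length(N2))))
)

print(example_data)

# The formula interface of wilcox.test() accepts this layout directly
long_result <- suppressWarnings(
  wilcox.test(N1N2 ~ Group, data = example_data, alternative = "two.sided")
)
print(long_result)
```

Because group 1 is now treated as the first sample, the statistic is reported from the other direction (W = 13.5 rather than 50.5; the two always sum to 8 × 8 = 64 here), while the two-sided p-value is unchanged.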

From the “Analyze” menu, select “Nonparametric Tests”, then select “Legacy Dialogs”, followed by “2 Independent Samples”.

This should open the dialog below:

Select “N1N2”, and utilize the top center arrow to designate these values as “Test Variable(s)”. Once this has been completed, utilize the bottom center arrow to designate “Group” as our “Grouping Variable”. Two groups exist, which we must specifically define. To achieve this, click “Define Groups”, then enter the value “1” into the input adjacent to “Group 1”. Next, enter the value “2” into the input adjacent to “Group 2”. Once this step has been completed, click “Continue”, and then click “OK”.

This will generate the output below:

The two values from the output that are relevant for our purposes are those labeled “Asymp. Sig.” and “Exact Sig.”. There is some debate amongst researchers as to which value should be used for reaching a statistical conclusion. Some recommend using “Exact Sig.” when the analysis contains only a few data points, and relying on “Asymp. Sig.” when working with larger data sets.

Remember, SPSS and R calculate output values differently for both the Mann-Whitney U Test and the Wilcoxon Signed-Rank Test. This difference arises from the methodology each uses to resolve rank order.
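
When comparing numbers across packages, one setting on the R side that is worth checking is the continuity correction, which wilcox.test() applies by default (correct = TRUE) whenever it falls back to the normal approximation. The sketch below simply shows how much that option moves the p-value for our data. Whether it accounts for any particular gap between R and SPSS is an assumption on my part and has not been verified against SPSS output.

```r
N1 <- c(70, 74, 76, 72, 75, 74, 71, 71)  # chemical applied
N2 <- c(74, 75, 73, 76, 74, 77, 78, 75)  # chemical not applied

# Default behaviour: normal approximation with continuity correction
default_fit <- suppressWarnings(
  wilcox.test(N2, N1, alternative = "two.sided", paired = FALSE)
)

# Same test with the continuity correction switched off
uncorrected_fit <- suppressWarnings(
  wilcox.test(N2, N1, alternative = "two.sided", paired = FALSE, correct = FALSE)
)

# The W statistic is identical; only the p-value changes
print(c(W_default = default_fit$statistic[["W"]],
        W_uncorrected = uncorrected_fit$statistic[["W"]]))
print(c(p_default = default_fit$p.value,
        p_uncorrected = uncorrected_fit$p.value))
```

Because the data contain ties, R cannot compute an exact p-value here and uses the normal approximation in both calls; suppressWarnings() only hides the message that says so.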


Tuesday, February 27, 2018
