Today’s post will discuss Post Hoc Analysis, specifically Tukey’s Honest Significant Difference test. This test is also known as The Tukey Method, Tukey’s HSD, or TukeyHSD() in R.

Post Hoc refers to the testing that is performed following an ANOVA test. This testing seeks to discover which specific groups within an ANOVA model differ significantly from one another. There are many different Post Hoc tests that can be utilized. For the purpose of this article, we will be specifically discussing Tukey’s HSD.

Before proceeding, I should mention the reason for using ANOVA as opposed to a series of T-Tests. ANOVA allows us to compare the means of several groups simultaneously, while holding the overall chance of a Type I error at the chosen alpha. If we had four experimental groups, comparing every pair would require 6 T-Tests:

1 vs. 2 | 1 vs. 3 | 1 vs. 4

2 vs. 3 | 2 vs. 4

3 vs. 4

Each T-Test, assuming an alpha of .05, has a 5% chance of a Type I error occurring. Across six tests, the chance that at least one Type I error occurs is roughly 26% (1 - .95^6, if the tests are treated as independent), and it can be no higher than 30% (.05 * 6). The T-Test analyzes for a statistical difference between the means of two groups, whereas the ANOVA analyzes for differences within the whole set of means.
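The arithmetic above can be checked in R. This is a minimal sketch that assumes the six tests are independent, which is only an approximation because the pairwise comparisons share groups; the variable names are my own.

```r
alpha <- 0.05
groups <- 4

# Number of distinct pairs among four groups.
comparisons <- choose(groups, 2)

# Chance of at least one Type I error, treating the tests as independent.
fwer <- 1 - (1 - alpha)^comparisons

# Simple upper bound on that chance (alpha times the number of tests).
upper_bound <- alpha * comparisons

print(comparisons)
print(round(fwer, 3))
print(upper_bound)
```

The gap between the two figures grows as the number of groups increases, which is why the bound is only a rough guide.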

If you recall from the previous article, we addressed two separate scenarios, one in which a cook was testing for the salt content of soup, and the other, in which the impact of study time was being assessed as it applied to students from two different schools.

We will run a Tukey’s HSD on the data collected from each study.

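__Scenario A: The Soup Scenario__

satisfaction <- c(4, 1, 8, 4, 5, 3, 5, 3, 2, 5)

salt <- c(rep("low",3), rep("med",4), rep("high",3))

# R 4.0 and later no longer convert strings to factors by default; TukeyHSD() needs salt to be a factor

salttest <- data.frame(satisfaction, salt, stringsAsFactors = TRUE)

results <- aov(satisfaction~salt, data=salttest)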

Now to run the Tukey HSD Post Hoc Inquiry:

**TukeyHSD(results)**

Which produces the output:

Tukey multiple comparisons of means

*95% family-wise confidence level*

*Fit: aov(formula = satisfaction ~ salt, data = salttest)*

*$salt*

*diff lwr upr p adj*

*low-high 1.00000000 -4.148005 6.148005 0.8387911*

*med-high 0.91666667 -3.898852 5.732185 0.8445186*

*med-low -0.08333333 -4.898852 4.732185 0.9985693*

Let us review each aspect of this output:

Diff – The difference between the means of the two groups being compared.

Lwr – The lower bound of the confidence interval for the difference.

Upr – The upper bound of the confidence interval for the difference.

P adj – The p-value for the comparison, adjusted for multiple comparisons. Again, at a 95% confidence level, we will be looking for values that are less than .05.

Each value within the “p adj” column corresponds to one pairwise comparison between two levels of the factor. In the case of the above output, assuming an alpha value of .05, there were no significant differences between any of the salt levels (p = 0.839; p = 0.845; p = 0.999).
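If you would rather check the “p adj” column programmatically than read it off the console, something like the following should work. It is a sketch, not the only way to do it: it rebuilds the soup data with salt stored as a factor, and the names p_adj and any_significant are my own.

```r
# Soup data from Scenario A; salt is stored as a factor so TukeyHSD() can use it.
satisfaction <- c(4, 1, 8, 4, 5, 3, 5, 3, 2, 5)
salt <- factor(c(rep("low", 3), rep("med", 4), rep("high", 3)))
salttest <- data.frame(satisfaction, salt)

results <- aov(satisfaction ~ salt, data = salttest)
tukey <- TukeyHSD(results)

# tukey$salt is a matrix with one row per pairwise comparison.
p_adj <- tukey$salt[, "p adj"]
print(round(p_adj, 3))

# TRUE if any comparison falls below the chosen alpha.
alpha <- 0.05
any_significant <- any(p_adj < alpha)
print(any_significant)
```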

__Scenario B: Schools, Study Time and Stress Scenario__

satisfaction <- c(7, 2, 10, 2, 2, 8, 5, 1, 3, 10, 9, 10, 3, 10, 8, 7, 5, 6, 4, 10, 3, 6, 4, 7, 1, 5, 5, 2, 2, 2)

studytime <- c(rep("One Hour",10), rep("Two Hours",10), rep("Three Hours",10))

school <- c(rep("SchoolA",5), rep("SchoolB",5), rep("SchoolA",5), rep("SchoolB",5), rep("SchoolA",5), rep("SchoolB",5))

# R 4.0 and later no longer convert strings to factors by default; TukeyHSD() needs factors

schooltest <- data.frame(satisfaction, studytime, school, stringsAsFactors = TRUE)

results <- aov(satisfaction ~ studytime * school, data=schooltest)

summary(results)

Now to run the Tukey HSD Post Hoc Inquiry:

**TukeyHSD(results)**

This produces the following output:

*$studytime*

*diff lwr upr p adj*

*Three Hours-One Hour -1.3 -4.5013364 1.901336 0.5753377*

*Two Hours-One Hour 2.2 -1.0013364 5.401336 0.2198626*

**Two Hours-Three Hours 3.5 0.2986636 6.701336 0.0302463**

This portion of the output describes the differences between the varying levels of study time as they pertain to stress.

The next portion of the output:

*$school*

*diff lwr upr p adj*

*SchoolB-SchoolA -0.6 -2.760257 1.560257 0.571817*

Describes the relationship between the two school types as it pertains to stress.

Finally, the last portion of the output:

*$`studytime:school`*

*diff lwr upr p adj*

*Three Hours:SchoolA-One Hour:SchoolA -0.4 -6.005413 5.2054132 0.9999178*

*Two Hours:SchoolA-One Hour:SchoolA 3.4 -2.205413 9.0054132 0.4401459*

*One Hour:SchoolB-One Hour:SchoolA 0.8 -4.805413 6.4054132 0.9976117*

*Three Hours:SchoolB-One Hour:SchoolA -1.4 -7.005413 4.2054132 0.9696463*

*Two Hours:SchoolB-One Hour:SchoolA 1.8 -3.805413 7.4054132 0.9157375*

*Two Hours:SchoolA-Three Hours:SchoolA 3.8 -1.805413 9.4054132 0.3223867*

*One Hour:SchoolB-Three Hours:SchoolA 1.2 -4.405413 6.8054132 0.9844928*

*Three Hours:SchoolB-Three Hours:SchoolA -1.0 -6.605413 4.6054132 0.9932117*

*Two Hours:SchoolB-Three Hours:SchoolA 2.2 -3.405413 7.8054132 0.8260605*

*One Hour:SchoolB-Two Hours:SchoolA -2.6 -8.205413 3.0054132 0.7067715*

*Three Hours:SchoolB-Two Hours:SchoolA -4.8 -10.405413 0.8054132 0.1240592*

*Two Hours:SchoolB-Two Hours:SchoolA -1.6 -7.205413 4.0054132 0.9470847*

*Three Hours:SchoolB-One Hour:SchoolB -2.2 -7.805413 3.4054132 0.8260605*

*Two Hours:SchoolB-One Hour:SchoolB 1.0 -4.605413 6.6054132 0.9932117*

*Two Hours:SchoolB-Three Hours:SchoolB 3.2 -2.405413 8.8054132 0.5052080*

Describes the relationships between the combination of hours studied and school types.

We can make the following interpretations from the above outputs:

There was a significant difference in stress level between students who study two hours and students who study three hours (p = 0.0302463).

There was not a significant difference in stress level between students who attend SchoolA, and students who attend SchoolB.

There were no significant differences in stress level between any of the combinations of school and study time.
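With eighteen comparisons across three tables, it can be easier to let R pick out the significant rows. The sketch below rebuilds the Scenario B data with the grouping variables stored as factors and keeps only the comparisons whose adjusted p-value is below alpha; the names alpha and significant are my own.

```r
satisfaction <- c(7, 2, 10, 2, 2, 8, 5, 1, 3, 10, 9, 10, 3, 10, 8,
                  7, 5, 6, 4, 10, 3, 6, 4, 7, 1, 5, 5, 2, 2, 2)
studytime <- factor(c(rep("One Hour", 10), rep("Two Hours", 10), rep("Three Hours", 10)))
# SchoolA and SchoolB alternate in blocks of five within each study time.
school <- factor(rep(rep(c("SchoolA", "SchoolB"), each = 5), times = 3))
schooltest <- data.frame(satisfaction, studytime, school)

results <- aov(satisfaction ~ studytime * school, data = schooltest)
tukey <- TukeyHSD(results)

# For each term in the model, keep only the rows with p adj below alpha.
alpha <- 0.05
significant <- lapply(tukey, function(tab) tab[tab[, "p adj"] < alpha, , drop = FALSE])
print(significant)
```

Only the study time table should retain a row, which agrees with the interpretations listed above.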

I have often been asked what differentiates an ANOVA Post Hoc Test (such as Tukey’s HSD) from a series of T-Tests following an ANOVA calculation. The reasons for performing a Post Hoc Test, Tukey's in our case, as opposed to T-Tests, are as follows:

1. Performing multiple T-Tests to check for significance is ultimately time consuming, and nullifies the initial convenience of running an ANOVA test. Additionally, doing so re-creates the compounding probability of error that we originally sought to avoid.

2. Tukey’s HSD evaluates each pairwise difference within the context of the ANOVA model: it uses the model's pooled error variance and adjusts the p-values for the whole family of comparisons. Re-testing with the T-Test may show which data sets INDEPENDENTLY differ from the other data sets, but it will not illustrate which data sets differ within the model.
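To make the contrast concrete, here is a rough sketch that compares unadjusted pairwise T-Tests with Tukey's adjusted p-values. Note the assumptions: it uses a one-way model on study time alone (not the two-way model from Scenario B), so its numbers will not match the output shown earlier, and the names raw_p and tukey_p are my own.

```r
satisfaction <- c(7, 2, 10, 2, 2, 8, 5, 1, 3, 10, 9, 10, 3, 10, 8,
                  7, 5, 6, 4, 10, 3, 6, 4, 7, 1, 5, 5, 2, 2, 2)
studytime <- factor(c(rep("One Hour", 10), rep("Two Hours", 10), rep("Three Hours", 10)))

# Pairwise T-Tests with a pooled standard deviation and no p-value adjustment.
pw <- pairwise.t.test(satisfaction, studytime, p.adjust.method = "none")
# The p-values sit in the lower triangle of a matrix; read them out column by column.
raw_p <- pw$p.value[lower.tri(pw$p.value, diag = TRUE)]

# Tukey's HSD on the matching one-way ANOVA.
tukey_p <- TukeyHSD(aov(satisfaction ~ studytime))$studytime[, "p adj"]

comparison <- data.frame(unadjusted = raw_p, tukey = tukey_p)
print(round(comparison, 4))
```

Each Tukey p-value should be at least as large as its unadjusted counterpart, which is the price paid for controlling the error rate across all three comparisons.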

Now, I will demonstrate how to perform both a One Way ANOVA Test and a Post Hoc Tukey’s HSD test within SPSS.

First, we will need to define the variables. This can be achieved within the “Variable View” portion of SPSS, which is selected by clicking the “Variable View” tab at the bottom of the SPSS window.

Once **“Variable View”** has been selected, we can begin by defining our variable types.

Here I have defined two variables, **“Satisfaction”** and **“Soup”**. Both were assigned the default variable parameters by the SPSS system.

Next, we need to define our value labels. To achieve this, I clicked on the cell which coincides with the variable **“Soup”**. This brings up a user interface which allows for the entry of value labels, and the value to which each label is assigned.

Once the labels have been entered, we can input the data values required for our ANOVA model into SPSS.

When this step has been completed, we must choose **“Analyze”** from the uppermost drop-down menu. Select the option **“Compare Means”**, and then the subsequent option, **“One-Way ANOVA”**.

This course of action should cause a menu to appear. For our **“Dependent List”** variable, we will choose **“Satisfaction”**. For our **“Factor”** variable, we will choose **“Soup”**. Once this has been completed, select the middle box on the right-hand side of the menu, which reads **“Post Hoc”**. This causes another menu to appear, which presents Post Hoc Analysis options. For our purposes, we will be checking the box next to **“Tukey”** prior to proceeding. The significance level should be left at .05 (or Alpha = .05). Click **“Continue”**, then click **“OK”**.

This presents a more detailed Tukey’s HSD output than what was originally available in R:

Mean Difference – The difference between the means of the two groups being compared. “diff” in R.

Std. Error – The standard error of the difference between the compared groups. Not reported by TukeyHSD() in R.

Lower Bound – The lower bound of the confidence interval for the difference. “lwr” in R.

Upper Bound – The upper bound of the confidence interval for the difference. “upr” in R.

Sig. – The p-values pertaining to the significance of the compared values. Again, at a 95% confidence level, we will be looking for values that are less than .05. “p adj” in R.
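Although TukeyHSD() does not print a standard error, one can be worked out by hand from the ANOVA model. The sketch below assumes the conventional formula for the standard error of a difference between two group means, sqrt(MSE * (1/n1 + 1/n2)); I have not confirmed that SPSS computes its “Std. Error” column in exactly this way, so treat the link to SPSS as an assumption. The helper std_error() is hypothetical and not part of R.

```r
satisfaction <- c(4, 1, 8, 4, 5, 3, 5, 3, 2, 5)
salt <- factor(c(rep("low", 3), rep("med", 4), rep("high", 3)))
results <- aov(satisfaction ~ salt)

# Residual mean square (MSE) from the ANOVA and the size of each group.
mse <- sum(residuals(results)^2) / df.residual(results)
n <- lengths(split(satisfaction, salt))

# Hypothetical helper: standard error of the difference between two group means.
std_error <- function(a, b) sqrt(mse * (1 / n[[a]] + 1 / n[[b]]))
se_low_high <- std_error("low", "high")
print(se_low_high)

# Cross-check against TukeyHSD(): the half-width of its interval equals the
# critical value of the studentized range, divided by sqrt(2), times this error.
tukey <- TukeyHSD(results)$salt
half_width <- (tukey["low-high", "upr"] - tukey["low-high", "lwr"]) / 2
q_crit <- qtukey(0.95, nmeans = 3, df = df.residual(results))
print(c(half_width, q_crit / sqrt(2) * se_low_high))
```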

That’s all for now, Data Heads. I’ll see you again soon with a brand new article, the subject matter of which is yet to be determined.