Reflections of a Data Scientist: APA Format

In today’s article, we will discuss the standard format used to report statistical findings. Previous examples on this website explained model outputs in a simplified manner to keep the material accessible. However, if the purpose of the research is to produce results for publication, then the APA format should be applied to the findings.

“APA” is an abbreviation for the American Psychological Association. Regardless of the type of research being conducted, the formatting standards that the APA maintains for statistical reporting should be followed when presenting data in a professional manner.

Details

All figures which contain decimal values should be rounded to the nearest hundredth (e.g., .105 becomes .11). P-values follow the same rule in most cases and are reported with two decimals; the exception occurs when greater specificity is required to illustrate the details of the findings, in which case a third decimal (e.g., p = .003) or an inequality (p < .001) is used.

Another rule to keep in mind pertains to leading zeroes. A leading zero before the decimal point is only required if the figure can exceed 1 (e.g., F = 0.21). If the value cannot exceed 1, as with a p-value or a correlation, then the leading zero is omitted (e.g., p = .82).
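The two rules above can be sketched as a pair of small R helpers. The names apa_num and apa_p are hypothetical (they are not part of base R or of any package used in this article), and the fallback to three decimals is one reasonable reading of the "greater specificity" exception rather than an official rule.

```r
# Hypothetical helpers (not from any package) that apply the rounding
# and leading-zero rules described above.

# Format a statistic to a fixed number of decimals. Set
# leading_zero = FALSE for values that cannot exceed 1.
apa_num <- function(x, digits = 2, leading_zero = TRUE) {
  out <- formatC(x, format = "f", digits = digits)
  if (!leading_zero) out <- sub("^(-?)0\\.", "\\1.", out)
  out
}

# Format a p-value: an inequality below .001, otherwise two decimals,
# falling back to three decimals when two would print ".00".
apa_p <- function(p, digits = 2) {
  if (p < .001) return("p < .001")
  out <- apa_num(p, digits, leading_zero = FALSE)
  if (out == ".00") out <- apa_num(p, 3, leading_zero = FALSE)
  paste0("p = ", out)
}

apa_num(18.857)                       # "18.86"
apa_num(0.816, leading_zero = FALSE)  # ".82"
apa_p(0.0002926)                      # "p < .001"
apa_p(0.002572)                       # "p = .003"
```

formatC() is used instead of round() so that trailing zeroes are kept (for example, 7.2 prints as "7.20").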

Below are examples which demonstrate the most common application of the APA format.

Chi-Square

Template:

A chi-square test of independence was performed to examine the relation between CATEGORY and OUTCOME. The relation between these variables was found to be significant at the p < .05 level, χ2 (DEGREES OF FREEDOM, N = SAMPLE SIZE) = X-Squared Value, p = p - value.

- OR -

A chi-square test of independence was performed to examine the relation between CATEGORY and OUTCOME. The relation between these variables was not found to be significant at the p < .05 level, χ2 (DEGREES OF FREEDOM, N = SAMPLE SIZE) = X-Squared Value, p = p - value.

Example:

While working as a statistician at a local university, you are tasked to evaluate, based on survey data, the level of job satisfaction that each member of the staff currently has for their occupational role (Assume a 95% Confidence Interval).

The data that you gather from the surveys is as follows:

General Faculty
130 Satisfied 20 Unsatisfied

Professors
30 Satisfied 20 Unsatisfied

Adjunct Professors
80 Satisfied 20 Unsatisfied

Custodians
20 Satisfied 10 Unsatisfied

# Code #

Model <- matrix(c(130, 30, 80, 20, 20, 20, 20, 10), nrow = 4, ncol=2)

N <- sum(130, 30, 80, 20, 20, 20, 20, 10)

chisq.test(Model)

N

# Console Output #

Pearson's Chi-squared test

data: Model
X-squared = 18.857, df = 3, p-value = 0.0002926

> N
[1] 330

APA Format:

A chi-square test of independence was performed to examine the relation between occupational role and job satisfaction. The relation between these variables was found to be significant at the p < .05 level, χ2 (3, N = 330) = 18.86, p < .001.
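As a sketch, the figures in the sentence above can be pulled directly from the object that chisq.test() returns instead of being copied by hand from the console. The variable names below are illustrative, and the statistic is written out as "chi-square" only to keep the string plain ASCII.

```r
# Survey counts: rows are roles, columns are Satisfied / Unsatisfied.
Model <- matrix(c(130, 30, 80, 20, 20, 20, 20, 10), nrow = 4, ncol = 2)

res <- chisq.test(Model)

chi_sq <- unname(res$statistic)              # X-squared value
df     <- as.integer(res$parameter)          # degrees of freedom
n      <- as.integer(sum(Model))             # sample size

# p-values below .001 are reported as an inequality.
if (res$p.value < .001) {
  p_txt <- "p < .001"
} else {
  p_txt <- paste0("p = ", sub("^0", "", sprintf("%.2f", res$p.value)))
}

apa <- sprintf("chi-square(%d, N = %d) = %.2f, %s", df, n, chi_sq, p_txt)
apa  # "chi-square(3, N = 330) = 18.86, p < .001"
```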

Tukey HSD

Template:

Post hoc comparisons using the Tukey HSD test indicated that the mean score for the CONDITION A (M = Mean1, SD = Standard Deviation1) was significantly different than CONDITION B (M = Mean2, SD = Standard Deviation2), p = p-value.

Analysis of Variance (ANOVA)

(One Way)

Template:

There was a significant effect of the CATEGORY on the OUTCOME for SCENARIO at the p < .05 level for the NUMBER OF CONDITIONS (F(Degrees of Freedom(1), Degrees of Freedom(2)) = F Value, p = p - value).

- OR -

There was not a significant effect of the CATEGORY on the OUTCOME for SCENARIO at the p < .05 level for the NUMBER OF CONDITIONS (F(Degrees of Freedom(1), Degrees of Freedom(2)) = F Value, p = p - value).

Example:

A chef wants to test whether patrons prefer a soup which he prepares based on salt content. He prepares a limited experiment in which he creates three types of soup: soup with a low amount of salt, soup with a medium amount of salt, and soup with a high amount of salt. He then serves these soups to his customers and asks them to rate their satisfaction on a scale from 1 to 8.

Low Salt Soup is rated: 4, 1, 8
Medium Salt Soup is rated: 4, 5, 3, 5
High Salt Soup is rated: 3, 2, 5

(Assume a 95% Confidence Interval)

# Code #

satisfaction <- c(4, 1, 8, 4, 5, 3, 5, 3, 2, 5)

salt <- c(rep("low",3), rep("med",4), rep("high",3))

salttest <- data.frame(satisfaction, salt)

results <- aov(satisfaction~salt, data=salttest)

summary(results)

# Console Output #

            Df Sum Sq Mean Sq F value Pr(>F)
salt         2   1.92   0.958   0.209  0.816
Residuals    7  32.08   4.583

APA Format:

There was not a significant effect of the level of salt content on patron satisfaction at the p < .05 level for the three conditions (F(2, 7) = 0.21, p = .82).
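The F value, both degrees of freedom, and the p-value in the sentence above all sit in the table returned by summary(). A minimal sketch of extracting them follows; the variable names are illustrative, and salt is wrapped in factor() here as a precaution.

```r
satisfaction <- c(4, 1, 8, 4, 5, 3, 5, 3, 2, 5)
salt <- factor(c(rep("low", 3), rep("med", 4), rep("high", 3)))

results <- aov(satisfaction ~ salt)

# summary() on an aov fit returns a list; its first element is the table.
tab <- summary(results)[[1]]

f_val <- tab[["F value"]][1]        # F for the salt term
df1   <- as.integer(tab[["Df"]][1]) # between-groups df
df2   <- as.integer(tab[["Df"]][2]) # residual df
p_val <- tab[["Pr(>F)"]][1]

apa <- sprintf("F(%d, %d) = %.2f, p = %s", df1, df2, f_val,
               sub("^0", "", sprintf("%.2f", p_val)))
apa  # "F(2, 7) = 0.21, p = .82"
```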

(Two Way)

Template:

Hypothesis 1:

There was a significant effect of the CATEGORY on the OUTCOME for SCENARIO at the p < .05 level for the NUMBER OF CONDITIONS (F(Degrees of Freedom(1), Degrees of Freedom(4)) = F Value, p = p - value).

- OR -

There was not a significant effect of the CATEGORY on the OUTCOME for SCENARIO at the p < .05 level for the NUMBER OF CONDITIONS (F(Degrees of Freedom(1), Degrees of Freedom(4)) = F Value, p = p - value).

Hypothesis 2:

There was a significant effect of the CATEGORY2 on the OUTCOME for SCENARIO at the p < .05 level for the NUMBER OF CONDITIONS (F(Degrees of Freedom(2), Degrees of Freedom(4)) = F Value, p = p - value).

- OR -

There was not a significant effect of the CATEGORY2 on the OUTCOME for SCENARIO at the p < .05 level for the NUMBER OF CONDITIONS (F(Degrees of Freedom(2), Degrees of Freedom(4)) = F Value, p = p - value).

Hypothesis 3:

There was a statistically significant interaction effect of the CATEGORY1 on the CATEGORY2 at the p < .05 level for the NUMBER OF CONDITIONS (F(Degrees of Freedom(3), Degrees of Freedom(4)) = F Value, p = p - value).

- OR -

There was not a statistically significant interaction effect of the CATEGORY1 on the CATEGORY2 at the p < .05 level for the NUMBER OF CONDITIONS (F(Degrees of Freedom(3), Degrees of Freedom(4)) = F Value, p = p - value).

Example:

Researchers want to test whether daily study time affects student life satisfaction within two schools. The researchers also believe that the school each group of students attends may have an impact on the outcome. Students from each school are assigned study material which totals 1 hour, 2 hours, or 3 hours per day. The satisfaction of each student is then measured on a scale from 1 to 10 after one month.

(Assume a 95% Confidence Interval)

School A:

1 Hour of Study Time: 7, 2, 10, 2, 2
2 Hours of Study Time: 9, 10, 3, 10, 8
3 Hours of Study Time: 3, 6, 4, 7, 1

School B:

1 Hour of Study Time: 8, 5, 1, 3, 10
2 Hours of Study Time: 7, 5, 6, 4, 10
3 Hours of Study Time: 5, 5, 2, 2, 2

satisfaction <- c(7, 2, 10, 2, 2, 8, 5, 1, 3, 10, 9, 10, 3, 10, 8, 7, 5, 6, 4, 10, 3, 6, 4, 7, 1, 5, 5, 2, 2, 2)

studytime <- c(rep("One Hour",10), rep("Two Hours",10), rep("Three Hours",10))

school <- c(rep("SchoolA",5), rep("SchoolB",5), rep("SchoolA",5), rep("SchoolB",5), rep("SchoolA",5), rep("SchoolB",5))

schooltest <- data.frame(satisfaction, studytime, school)

results <- aov(lm(satisfaction ~ studytime * school, data=schooltest))

summary(results)

Which produces the output:

                 Df Sum Sq Mean Sq F value Pr(>F)
studytime         2   62.6  31.300   3.809 0.0366 *
school            1    2.7   2.700   0.329 0.5718
studytime:school  2    7.8   3.900   0.475 0.6278
Residuals        24  197.2   8.217
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

APA Format:

There was a significant effect of study time on student satisfaction at the p < .05 level for the three conditions (F(2, 24) = 3.81, p = .04).

There was not a significant effect of the school attended on student satisfaction at the p < .05 level for the two conditions (F(1, 24) = 0.33, p = .57).

There was not a statistically significant interaction effect of the school variable on the study time variable at the p < .05 level (F(2, 24) = 0.47, p = .63).

TukeyHSD(results)

> TukeyHSD(results)
Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = lm(satisfaction ~ studytime * school, data = schooltest))

$studytime
diff lwr upr p adj
Three Hours-One Hour -1.3 -4.5013364 1.901336 0.5753377
Two Hours-One Hour 2.2 -1.0013364 5.401336 0.2198626
Two Hours-Three Hours 3.5 0.2986636 6.701336 0.0302463

$school
diff lwr upr p adj
SchoolB-SchoolA -0.6 -2.760257 1.560257 0.571817

$`studytime:school`

diff lwr upr p adj
Three Hours:SchoolA-One Hour:SchoolA -0.4 -6.005413 5.2054132 0.9999178
Two Hours:SchoolA-One Hour:SchoolA 3.4 -2.205413 9.0054132 0.4401459
One Hour:SchoolB-One Hour:SchoolA 0.8 -4.805413 6.4054132 0.9976117
Three Hours:SchoolB-One Hour:SchoolA -1.4 -7.005413 4.2054132 0.9696463
Two Hours:SchoolB-One Hour:SchoolA 1.8 -3.805413 7.4054132 0.9157375
Two Hours:SchoolA-Three Hours:SchoolA 3.8 -1.805413 9.4054132 0.3223867
One Hour:SchoolB-Three Hours:SchoolA 1.2 -4.405413 6.8054132 0.9844928
Three Hours:SchoolB-Three Hours:SchoolA -1.0 -6.605413 4.6054132 0.9932117
Two Hours:SchoolB-Three Hours:SchoolA 2.2 -3.405413 7.8054132 0.8260605
One Hour:SchoolB-Two Hours:SchoolA -2.6 -8.205413 3.0054132 0.7067715
Three Hours:SchoolB-Two Hours:SchoolA -4.8 -10.405413 0.8054132 0.1240592
Two Hours:SchoolB-Two Hours:SchoolA -1.6 -7.205413 4.0054132 0.9470847
Three Hours:SchoolB-One Hour:SchoolB -2.2 -7.805413 3.4054132 0.8260605
Two Hours:SchoolB-One Hour:SchoolB 1.0 -4.605413 6.6054132 0.9932117
Two Hours:SchoolB-Three Hours:SchoolB 3.2 -2.405413 8.8054132 0.5052080

twohours <- c(9, 10, 3, 10, 8, 7, 5, 6, 4, 10)
threehours <- c(3, 6, 4, 7, 1, 5, 5, 2, 2, 2)

mean(twohours)
sd(twohours)

mean(threehours)
sd(threehours)

> mean(twohours)
[1] 7.2
> sd(twohours)
[1] 2.616189
>
> mean(threehours)
[1] 3.7
> sd(threehours)
[1] 2.002776

APA Format:

Post hoc comparisons using the Tukey HSD test indicated that, at the p < .05 level, the mean satisfaction score of students who studied for Two Hours (M = 7.20, SD = 2.62) was significantly different from the mean score of students who studied for Three Hours (M = 3.70, SD = 2.00), p = .03.
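The group means, standard deviations, and adjusted p-value used in the sentence above can also be gathered in one pass rather than by building a separate vector per group. The following is a sketch under the assumption that both predictors are stored as factors; the object names are illustrative.

```r
satisfaction <- c(7, 2, 10, 2, 2, 8, 5, 1, 3, 10,
                  9, 10, 3, 10, 8, 7, 5, 6, 4, 10,
                  3, 6, 4, 7, 1, 5, 5, 2, 2, 2)
studytime <- factor(rep(c("One Hour", "Two Hours", "Three Hours"), each = 10))
school    <- factor(rep(rep(c("SchoolA", "SchoolB"), each = 5), times = 3))

# Descriptive statistics per study-time group (both schools pooled).
group_mean <- tapply(satisfaction, studytime, mean)
group_sd   <- tapply(satisfaction, studytime, sd)

# Adjusted p-value for a single pairwise comparison in the Tukey table.
fit   <- aov(satisfaction ~ studytime * school)
tukey <- TukeyHSD(fit, which = "studytime")
p_adj <- tukey$studytime["Two Hours-Three Hours", "p adj"]

round(group_mean, 2)
round(group_sd, 2)
round(p_adj, 2)
```

The row name passed to the Tukey table must match the label that TukeyHSD() prints, which depends on the alphabetical ordering of the factor levels.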

(Repeated Measures)

Template:

Example:

Researchers want to test the impact of reading existential philosophy on a group of 8 individuals. They measure the happiness of the participants three times, once prior to reading, once after reading the materials for one week, and once after reading the materials for two weeks. We will assume an alpha of .05.

Before Reading = 1, 8, 2, 4, 4, 10, 2, 9
After Reading = 4, 2, 5, 4, 3, 4, 2, 1
After Reading (wk. 2) = 5, 10, 1, 1, 4, 6, 1, 8

library(nlme) # lme() is provided by nlme; you will need to install and enable this package #

happiness <- c(1, 8, 2, 4, 4, 10, 2, 9, 4, 2, 5, 4, 3, 4, 2, 1, 5, 10, 1, 1, 4, 6, 1, 8 )

week <- c(rep("Before", 8), rep("Week1", 8), rep("Week2", 8))

id <- rep(1:8, times = 3) # Participants 1 through 8, repeated for each of the three measurements #

survey <- data.frame(id, happiness, week)

model <- lme(happiness ~ week, random=~1|id, data=survey)

anova(model)

This method saves some time by producing the output:

            numDF denDF  F-value p-value
(Intercept)     1    14 37.21053  <.0001
week            2    14  1.04624  0.3772

APA Format:

There was not a significant effect of reading existential philosophy on participant happiness at the p < .05 level for the three conditions (F(2, 14) = 1.05, p = .38).
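As a cross-check that needs no additional packages, the same design can be fit with base R's aov() and an Error() term for the participant. This is an alternative sketch, not the method used above; for a balanced design such as this one it is expected to reproduce the F value and degrees of freedom of the week term.

```r
happiness <- c(1, 8, 2, 4, 4, 10, 2, 9,
               4, 2, 5, 4, 3, 4, 2, 1,
               5, 10, 1, 1, 4, 6, 1, 8)
week <- factor(rep(c("Before", "Week1", "Week2"), each = 8))
id   <- factor(rep(1:8, times = 3))   # participant identifier

# Error(id) separates between-participant from within-participant variation.
fit <- aov(happiness ~ week + Error(id))

# The week effect is tested in the within-participant stratum.
within <- summary(fit)[["Error: Within"]][[1]]

f_val <- within[["F value"]][1]
df1   <- within[["Df"]][1]
df2   <- within[["Df"]][2]
p_val <- within[["Pr(>F)"]][1]

sprintf("F(%d, %d) = %.2f, p = %s", as.integer(df1), as.integer(df2), f_val,
        sub("^0", "", sprintf("%.2f", p_val)))
```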

Student’s T-Test

(One Sample T-Test)

Template:

(Right Tailed)

There was a significant increase in the GROUP A (M = Mean of GROUP A, SD = Standard Deviation of GROUP A), as compared to the historically assumed mean (M = Historic Mean Value); t(Degrees of Freedom) = t-value, p = p-value.

- OR -

There was not a significant increase in the GROUP A (M = Mean of GROUP A, SD = Standard Deviation of GROUP A), as compared to the historically assumed mean (M = Historic Mean Value); t(Degrees of Freedom) = t-value, p = p-value.

Example:

A factory employee believes that the cakes produced within his factory are being manufactured with excess amounts of corn syrup, thus altering the taste. Ten cakes were sampled from the most recent batch and tested for corn syrup composition. Typically, each cake should consist of 20% corn syrup. Utilizing a 95% confidence level, can we conclude that the new batch of cakes contains more than a 20% proportion of corn syrup?

The levels of the samples were:

.27, .31, .27, .34, .40, .29, .37, .14, .30, .20

N <- c(.27, .31, .27, .34, .40, .29, .37, .14, .30, .20)

t.test(N, alternative = "greater", mu = .2, conf.level = 0.95)

# "alternative =" specifies the type of test that R will perform. "greater" indicates a right tailed test. "less" indicates a left tailed test. "two.sided" indicates a two tailed test. #

One Sample t-test

data: N
t = 3.6713, df = 9, p-value = 0.002572
alternative hypothesis: true mean is greater than 0.2
95 percent confidence interval:
0.244562 Inf
sample estimates:
mean of x
0.289

mean(N)
sd(N)

> mean(N)
[1] 0.289
>
> sd(N)
[1] 0.07665942

APA Format:

A one sample t-test was conducted to compare the level of corn syrup in the current sample batch of cakes, to the assumed historical level of corn syrup contained within previously manufactured cakes.

There was a significant increase in the amount of corn syrup in the recent batch of cakes (M = .29, SD = .08), as compared to the historically assumed mean (M = .20); t(9) = 3.67, p = .003.
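The statistics in the sentence above can be assembled from the object returned by t.test(). The helper no_zero below is a hypothetical convenience function (not from any package) that drops the leading zero from values which cannot exceed 1.

```r
N <- c(.27, .31, .27, .34, .40, .29, .37, .14, .30, .20)

res <- t.test(N, alternative = "greater", mu = .2, conf.level = 0.95)

# Hypothetical helper: fixed decimals with the leading zero removed.
no_zero <- function(x, digits) {
  sub("^0", "", sprintf(paste0("%.", digits, "f"), x))
}

apa <- sprintf("(M = %s, SD = %s); t(%d) = %.2f, p = %s",
               no_zero(mean(N), 2), no_zero(sd(N), 2),
               as.integer(res$parameter),   # degrees of freedom
               res$statistic,               # t value
               no_zero(res$p.value, 3))     # three decimals for specificity
apa  # "(M = .29, SD = .08); t(9) = 3.67, p = .003"
```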

(Two Sample T-Test)

Template:

(Two Tailed)

There was a significant difference in the GROUP A (M = Mean of GROUP A, SD = Standard Deviation of GROUP A), as compared to the GROUP B (M = Mean of GROUP B, SD = Standard Deviation of GROUP B), t(Degrees of Freedom) = t-value, p = p-value.

-OR-

There was not a significant difference in the GROUP A (M = Mean of GROUP A, SD = Standard Deviation of GROUP A), as compared to the GROUP B (M = Mean of GROUP B, SD = Standard Deviation of GROUP B), t(Degrees of Freedom) = t-value, p = p-value.

Example:

A scientist creates a chemical which he believes changes the temperature of water. He applies this chemical to water and takes the following measurements:

70, 74, 76, 72, 75, 74, 71, 71

He then measures the temperature of samples to which the chemical was not applied:

74, 75, 73, 76, 74, 77, 78, 75

Can the scientist conclude, with a 95% confidence interval, that his chemical is in some way altering the temperature of the water?

N1 <- c(70, 74, 76, 72, 75, 74, 71, 71)

N2 <- c(74, 75, 73, 76, 74, 77, 78, 75)

t.test(N2, N1, alternative = "two.sided", var.equal = TRUE, conf.level = 0.95)

Two Sample t-test

data: N2 and N1
t = 2.4558, df = 14, p-value = 0.02773
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3007929 4.4492071
sample estimates:
mean of x mean of y
75.250 72.875

mean(N1)

sd(N1)

mean(N2)

sd(N2)

> mean(N1)
[1] 72.875
>
> sd(N1)
[1] 2.167124
>
> mean(N2)
[1] 75.25
>
> sd(N2)
[1] 1.669046

APA Format:

A two sample t-test was conducted to compare the temperature of water to which the chemical was applied with the temperature of water to which it was not applied.

There was a significant difference in the temperature of the water treated with the chemical (M = 72.88, SD = 2.17), as compared to the temperature of the untreated water (M = 75.25, SD = 1.67); t(14) = 2.46, p = .03.
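The degrees of freedom reported above (14) depend on the var.equal = TRUE argument. Without it, t.test() defaults to Welch's version of the test, which typically yields fractional degrees of freedom. The sketch below compares the two; the names treated and untreated are illustrative stand-ins for N1 and N2.

```r
treated   <- c(70, 74, 76, 72, 75, 74, 71, 71)  # chemical applied (N1)
untreated <- c(74, 75, 73, 76, 74, 77, 78, 75)  # no chemical (N2)

# Pooled-variance (Student) test, as used in the example.
pooled <- t.test(untreated, treated, var.equal = TRUE)

# Welch test, which is what t.test() runs by default.
welch <- t.test(untreated, treated)

unname(pooled$parameter)  # 14
unname(welch$parameter)   # fractional, below 14

# With equal group sizes the t value itself is the same in both tests.
unname(pooled$statistic)
unname(welch$statistic)
```

Whichever version is run, the degrees of freedom written inside t( ) should be the ones that test actually produced.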

(Paired T-Test)

Template:

(Right Tailed)

There was a significant increase in the GROUP A (M = Mean of GROUP A, SD = Standard Deviation of GROUP A), as compared to the GROUP B (M = Mean of GROUP B, SD = Standard Deviation of GROUP B), t(Degrees of Freedom) = t-value, p = p-value.

- OR -

There was not a significant increase in the GROUP A (M = Mean of GROUP A, SD = Standard Deviation of GROUP A), as compared to the GROUP B (M = Mean of GROUP B, SD = Standard Deviation of GROUP B), t(Degrees of Freedom) = t-value, p = p-value.

Example:

A watch manufacturer believes that by changing to a new battery supplier, the watches which ship with an initial battery will maintain a longer lifespan. To test this theory, twelve watches are tested for duration of lifespan with the original battery.

The same twelve watches are then re-tested for duration with the new battery.

Can the watch manufacturer conclude that the new battery increases the duration of lifespan for the manufactured watches? (We will assume an alpha value of .05).

For this, we will utilize the code:

N1 <- c(376, 293, 210, 264, 297, 380, 398, 303, 324, 368, 382, 309)
N2 <- c(337, 341, 316, 351, 371, 440, 312, 416, 445, 354, 444, 326)

t.test(N2, N1, alternative = "greater", paired=TRUE, conf.level = 0.95 )

Paired t-test

data: N2 and N1
t = 2.4581, df = 11, p-value = 0.01589
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
12.32551 Inf
sample estimates:
mean of the differences
45.75

mean(N1)
sd(N1)

mean(N2)
sd(N2)

> mean(N1)
[1] 325.3333
>
> sd(N1)
[1] 56.84642
>
> mean(N2)
[1] 371.0833
>
> sd(N2)
[1] 51.22758

APA Format:

A paired t-test was conducted to compare the lifespan duration of watches which contained the new battery to the lifespan of the same watches which contained the initial battery.

There was a significant increase in the lifespan duration of watches which contained the new battery (M = 371.08, SD = 51.23), as compared to the lifespan of watches which contained the initial battery (M = 325.33, SD = 56.85); t(11) = 2.46, p = .02.
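A paired t-test is a one sample t-test on the per-watch differences, which explains why the degrees of freedom are 11 (twelve pairs minus one) rather than 22. The following sketch recomputes the reported figures by hand; the variable names d, t_val, and p_val are illustrative.

```r
N1 <- c(376, 293, 210, 264, 297, 380, 398, 303, 324, 368, 382, 309)  # original battery
N2 <- c(337, 341, 316, 351, 371, 440, 312, 416, 445, 354, 444, 326)  # new battery

d <- N2 - N1                       # one difference per watch

# t statistic: mean difference divided by its standard error.
t_val <- mean(d) / (sd(d) / sqrt(length(d)))
df    <- length(d) - 1

# Right-tailed p-value, matching alternative = "greater".
p_val <- pt(t_val, df, lower.tail = FALSE)

mean(d)           # 45.75
round(t_val, 2)
round(p_val, 2)
```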

Regression Models

Example:

(Standard Regression Model)

x <- c(27, 34, 22, 30, 17, 32, 25, 34, 46, 37)
y <- c(70, 80, 73, 77, 60, 93, 85, 72, 90, 85)
z <- c(13, 22, 18, 30, 15, 17, 20, 11, 20, 25)

multiregress <- lm(y ~ x + z)

summary(multiregress)

Call:
lm(formula = y ~ x + z)

Residuals:
Min 1Q Median 3Q Max
-6.4016 -5.0054 -1.7536 0.8713 14.0886

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 47.1434 12.0381 3.916 0.00578 **
x 0.7808 0.3316 2.355 0.05073 .
z 0.3990 0.4804 0.831 0.43363
---
Residual standard error: 7.896 on 7 degrees of freedom
Multiple R-squared: 0.5249, Adjusted R-squared: 0.3891
F-statistic: 3.866 on 2 and 7 DF, p-value: 0.07394

APA Format:

A linear regression model was utilized to test if variables “x” and “z” significantly predicted outcomes within the observations of “y” included within the sample data set. The results indicated that neither “x” (B = 0.78, p = .051) nor “z” (B = 0.40, p = .43) was a significant predictor at the p < .05 level, and that the overall model was not significant (R2 = .52, F(2, 7) = 3.87, p = .07).
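The coefficients, R-squared, and overall model test can be read from the summary object rather than from the printed console text. A minimal sketch follows; the object names fit, s, coefs, and p_model are illustrative.

```r
x <- c(27, 34, 22, 30, 17, 32, 25, 34, 46, 37)
y <- c(70, 80, 73, 77, 60, 93, 85, 72, 90, 85)
z <- c(13, 22, 18, 30, 15, 17, 20, 11, 20, 25)

fit <- lm(y ~ x + z)
s   <- summary(fit)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|).
coefs <- coef(s)
round(coefs[c("x", "z"), c("Estimate", "Pr(>|t|)")], 3)

# Proportion of variance explained by the model.
r2 <- s$r.squared

# Overall F test; the p-value is not stored, so it is recomputed with pf().
f <- s$fstatistic
p_model <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

round(c(R2 = r2, F = f[["value"]], p = p_model), 2)
```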

(Non-Standard Regression Model)

Example:

# Model Creation #

Age <- c(55, 45, 33, 22, 34, 56, 78, 47, 38, 68, 49, 34, 28, 61, 26)

Obese <- c(1,0,0,0,1,1,0,1,1,0,1,1,0,1,0)

Smoking <- c(1,0,0,1,1,1,0,0,1,0,0,1,0,1,1)

Cancer <- c(1,0,0,1,0,1,0,0,1,1,0,1,1,1,0)

# Summary Creation and Output #

CancerModelLog <- glm(Cancer~ Age + Obese + Smoking, family=binomial)

summary(CancerModelLog)

# Output #

Call:

glm(formula = Cancer ~ Age + Obese + Smoking, family = binomial)

Deviance Residuals:
Min 1Q Median 3Q Max
-1.6096 -0.7471 0.5980 0.8260 1.8485

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.34431 2.25748 -1.038 0.2991
Age 0.02984 0.04055 0.736 0.4617
Obese -0.38924 1.39132 -0.280 0.7797
Smoking 2.54387 1.53564 1.657 0.0976 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 20.728 on 14 degrees of freedom
Residual deviance: 16.807 on 11 degrees of freedom
AIC: 24.807
Number of Fisher Scoring iterations: 4

# Generate Nagelkerke R Squared #

# Download and Enable Package: "BaylorEdPsych" #

PseudoR2(CancerModelLog)

# Console Output #

        McFadden     Adj.McFadden        Cox.Snell       Nagelkerke McKelvey.Zavoina           Effron            Count
       0.2328838       -0.2495624        0.2751639        0.3674311        0.3477522        0.3042371        0.8000000
       Adj.Count              AIC    Corrected.AIC
       0.5714286       23.9005542       27.9005542

APA Format:

A logistic regression model was utilized to test if a model containing the variables “Age”, “Smoking Status”, and “Obesity”, could predict Cancer outcomes as it pertains to the individuals included within the sample data set. The results indicated that the model does not possess a worthwhile predictive capacity (Nagelkerke R-Square = .37).

Sunday, August 4, 2019