All data in these exercises are simulated (fake) data.
<-, ->: assign values to objects.
==: check whether two values are equal.
!=: check whether two values are not equal.
&: and. Returns TRUE if both elements are TRUE.
|: or. Returns TRUE if one or both of the elements are TRUE.
!: not. Returns FALSE if the statement is TRUE.
: (colon): creates a series of numbers in a sequence. E.g. a <- c(1:10).
%in%: find out whether an element belongs to a vector. E.g. x %in% y.
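A few quick illustrations of these operators (a minimal sketch; a is a made-up vector):
a <- c(1:10)        # assign the sequence 1 to 10 to a
a == 5              # element-wise: TRUE only where a equals 5
a != 5              # TRUE where a does not equal 5
(3 > 1) & (2 > 5)   # FALSE: & needs both sides to be TRUE
(3 > 1) | (2 > 5)   # TRUE: | needs at least one side to be TRUE
!(3 > 1)            # FALSE: ! negates TRUE
3 %in% a            # TRUE: 3 belongs to the vector a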
There are two ways to write the if statement in R: the standard if statement, and the ifelse() function.
# If expression1 is TRUE, statement1 is executed. If expression1 is FALSE but expression2 is TRUE, statement2 is executed. Otherwise, statement3 is executed.
if (expression1) {
  statement1
} else if (expression2) {
  statement2
} else {
  statement3
}
# If expression is TRUE, statement1 is executed; otherwise statement2 is executed.
ifelse(expression, statement1, statement2)
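For example, here is a minimal sketch that grades a score both ways (score and pass_fail are made-up names for illustration):
score <- 72
# standard if statement
if (score >= 60) {
  pass_fail <- 'pass'
} else {
  pass_fail <- 'fail'
}
# equivalent one-liner with ifelse(); this version also works element-wise on vectors
pass_fail <- ifelse(score >= 60, 'pass', 'fail')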
In a while loop, test_expression is evaluated first, and statement is executed if the result is TRUE. Then the expression is evaluated again. The while loop runs repeatedly until test_expression is no longer TRUE.
while (test_expression) {
  statement
}
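For instance, a minimal sketch of a while loop that keeps doubling a number until it exceeds 100 (x is a made-up variable for illustration):
x <- 1
while (x <= 100) {
  x <- x * 2   # executed only while x is still <= 100
}
x   # 128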
In a for loop, value takes on each element in the vector sequence in turn, and statement is executed in each iteration.
for (value in sequence) {
  statement
}
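A minimal sketch of a for loop that sums the elements of a vector (v and total are made-up names; the result equals sum(v)):
v <- c(2, 4, 6, 8)
total <- 0
for (value in v) {
  total <- total + value   # value takes on 2, 4, 6, 8 in turn
}
total   # 20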
A teacher designed three sets of class materials (sets A, B, and C) for a language learning class. The teacher wants to know whether students’ learning outcomes differ depending on the set of materials used, and/or on whether students receive a 10-minute revision session after the class. 60 students participated in a round of trial classes: 10 students took part in the class with material set A and received revision after class; 10 students also took part in the class with material set A, but did not receive revision; and so on.
This study is a 3 x 2 design with Material set (A vs. B vs. C) and Revision (revision vs. no revision).
data1 <- read.csv('language-class.csv', header = TRUE)
There are two factors in this study: Material set, with three levels (A, B, C), and Revision, with two levels (revision vs. no revision). Both factors are between-subjects. Therefore the suitable test is a two-way between-subjects ANOVA.
library(ggplot2)
data1_groupmean <- aggregate(score ~ material+revision, data = data1, FUN = mean)
ggplot(data1_groupmean, aes(x = material, y = score, group = revision, color = revision)) +
  geom_point() +
  geom_line() +
  ylim(c(50, 100)) +
  theme_light()
library(psych)
describeBy(score ~ material*revision, data=data1)
##
## Descriptive statistics by group
## material: A
## revision: no_revision
## vars n mean sd median trimmed mad min max range skew kurtosis se
## score 1 10 62.3 11.14 61.5 62.62 12.6 44 78 34 -0.3 -1.24 3.52
## ------------------------------------------------------------
## material: B
## revision: no_revision
## vars n mean sd median trimmed mad min max range skew kurtosis se
## score 1 10 77.5 9.41 77.5 77.12 8.9 64 94 30 0.24 -1.2 2.97
## ------------------------------------------------------------
## material: C
## revision: no_revision
## vars n mean sd median trimmed mad min max range skew kurtosis se
## score 1 10 84.4 8.07 86.5 85 7.41 69 95 26 -0.56 -0.99 2.55
## ------------------------------------------------------------
## material: A
## revision: revision
## vars n mean sd median trimmed mad min max range skew kurtosis se
## score 1 10 78.5 9.79 80 78.75 11.86 63 92 29 -0.15 -1.53 3.1
## ------------------------------------------------------------
## material: B
## revision: revision
## vars n mean sd median trimmed mad min max range skew kurtosis se
## score 1 10 76.4 11.22 73.5 76.5 13.34 59 93 34 0.03 -1.43 3.55
## ------------------------------------------------------------
## material: C
## revision: revision
## vars n mean sd median trimmed mad min max range skew kurtosis se
## score 1 10 91 6.31 89 91 6.67 82 100 18 0.23 -1.6 1.99
# two-way between-subjects ANOVA; material*revision expands to material + revision + material:revision
data1_result <- aov(score ~ material*revision, data = data1)
summary(data1_result)
## Df Sum Sq Mean Sq F value Pr(>F)
## material 2 3052 1525.9 16.982 1.9e-06 ***
## revision 1 785 784.8 8.735 0.00462 **
## material:revision 2 751 375.6 4.180 0.02051 *
## Residuals 54 4852 89.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# assumption check: normality of the residuals
data1_residuals <- residuals(object = data1_result)
shapiro.test(data1_residuals)
##
## Shapiro-Wilk normality test
##
## data: data1_residuals
## W = 0.97604, p-value = 0.2846
# assumption check: homogeneity of variance across the six groups
library(car)
leveneTest(score ~ material*revision, data = data1)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 5 0.5797 0.7153
## 54
Different ways to do pair-wise comparisons: In the two-way between-subjects example in the handout, I simply did pairwise comparisons between each possible combination of groups. That is perfectly fine, like this:
# Two-way between-subject ANOVA post-hocs
# This is the method I showed you in the handout, which does all possible pair-wise comparisons
data1$group <- paste('set_', data1$material, '_', data1$revision, sep = '')
data1$group <- as.factor(data1$group)
pairwise.t.test(data1$score, data1$group, p.adjust.method = 'bonferroni')
##
## Pairwise comparisons using t tests with pooled SD
##
## data: data1$score and data1$group
##
## set_A_no_revision set_A_revision set_B_no_revision
## set_A_revision 0.0052 - -
## set_B_no_revision 0.0108 1.0000 -
## set_B_revision 0.0238 1.0000 1.0000
## set_C_no_revision 4.5e-05 1.0000 1.0000
## set_C_revision 1.5e-07 0.0706 0.0361
## set_B_revision set_C_no_revision
## set_A_revision - -
## set_B_no_revision - -
## set_B_revision - -
## set_C_no_revision 0.9677 -
## set_C_revision 0.0167 1.0000
##
## P value adjustment method: bonferroni
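(As an aside, for models fitted with aov(), base R also offers Tukey’s HSD as another way to get all pairwise comparisons; a minimal sketch, output not shown:)
# Tukey's HSD on the aov object; the material:revision term gives all pairwise comparisons between the six cells
TukeyHSD(data1_result)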
However, we can also approach this in a more careful way: you might wonder what the point is of comparisons such as set_A_revision vs. set_B_no_revision. These comparisons are not very informative for the research question of this study.
Can’t I do something that simply tells me whether, for each set of materials, there is an effect of revision or not (I only care about set_A_revision vs. set_A_no_revision, set_B_revision vs. set_B_no_revision, etc.)? You can, of course, and in fact in the Mixed-ANOVA example in the handout, I did something very similar. When you have a very clear idea of which comparisons will be useful for your research question, then all you need are those planned comparisons.
Let’s say we only want to know whether, for each set of materials, revision significantly changes students’ performance:
# Let's look at whether revision has a significant effect for each material set.
library(rstatix)
data1_pt <- data1 %>%
  group_by(material) %>%
  rstatix::pairwise_t_test(score ~ revision, p.adjust.method = "bonferroni")
data1_pt
## # A tibble: 3 × 10
## material .y. group1 group2 n1 n2 p p.sig…¹ p.adj p.adj…²
## * <chr> <chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
## 1 A score no_revision revisi… 10 10 0.00282 ** 0.00282 **
## 2 B score no_revision revisi… 10 10 0.815 ns 0.815 ns
## 3 C score no_revision revisi… 10 10 0.0566 ns 0.0566 ns
## # … with abbreviated variable names ¹p.signif, ²p.adj.signif
You might notice that the p-values are slightly different. This is because we are no longer doing the ‘useless’ comparisons, so the Bonferroni correction has fewer comparisons to account for (here there is only one comparison per material set, so p.adj is the same as p).
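To see the mechanics, here is a quick sketch with a made-up raw p-value (not taken from the data above): Bonferroni simply multiplies each p-value by the number of comparisons in the family.
# hypothetical raw p-value, for illustration only
p_raw <- 0.004
p.adjust(rep(p_raw, 15), method = 'bonferroni')[1]  # 15 comparisons (all pairs of 6 groups): 0.06
p.adjust(rep(p_raw, 2), method = 'bonferroni')[1]   # 2 comparisons: 0.008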
Alternatively (or in addition), if you care about whether the sets of materials differ when students get the revision session, and whether they differ when students don’t, you can do something like this:
# we can also look at whether the sets of materials differ, when students revised vs. when they didn't revise.
data1_pt2 <- data1 %>%
  group_by(revision) %>%
  rstatix::pairwise_t_test(score ~ material, p.adjust.method = "bonferroni")
data1_pt2
## # A tibble: 6 × 10
## revision .y. group1 group2 n1 n2 p p.signif p.adj p.adj…¹
## * <chr> <chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
## 1 no_revision score A B 10 10 0.0015 ** 4.5 e-3 **
## 2 no_revision score A C 10 10 0.0000211 **** 6.33e-5 ****
## 3 no_revision score B C 10 10 0.12 ns 3.61e-1 ns
## 4 revision score A B 10 10 0.619 ns 1 e+0 ns
## 5 revision score A C 10 10 0.00583 ** 1.75e-2 *
## 6 revision score B C 10 10 0.00165 ** 4.94e-3 **
## # … with abbreviated variable name ¹p.adj.signif
It’s up to you how you want to approach your pair-wise comparisons.
# effect sizes: eta squared (eta.sq) and partial eta squared (eta.sq.part)
library(lsr)
etaSquared(data1_result)
## eta.sq eta.sq.part
## material 0.32328529 0.3861152
## revision 0.08314044 0.1392329
## material:revision 0.07958275 0.1340738
Two-way between-subjects ANOVA revealed significant main effects of Material set (F(2, 54)=16.98, p<0.001, \(\eta_p^2\)=0.39) and Revision (F(1, 54)=8.74, p=0.005, \(\eta_p^2\)=0.14), as well as a significant interaction (F(2, 54)=4.18, p=0.02, \(\eta_p^2\)=0.13; partial eta squared from the eta.sq.part column above). Post-hoc pair-wise t-tests with Bonferroni correction revealed that the effect of Revision was only significant for Material set A (p=0.003), but not for the other two sets of materials. (Alternatively or in addition: Without revision, Material sets B and C were significantly more effective than Material set A (both p’s<0.01), while there was no significant difference between sets B and C. With revision, Material set A became as effective as Material set B; however, set C was still significantly more effective than both A and B (both p’s<0.05).)
Saying an affirmative sentence is true is easier than saying it’s false. For example, saying “A penguin is a bird” is true is easier (takes less time) than saying “A dolphin is a fish” is false. A researcher wants to know whether the same can be said about negative sentences.
10 participants took part in an experiment investigating negation processing. In each trial, the participant saw a visual display of a picture together with a sentence that was either congruent or incongruent with the picture. Half of the sentences were affirmative, while the other half were negative. Participants were asked to judge the truth value of each sentence, and their reaction time (in seconds) was measured. (Example adapted from Wang, S., Sun, C., Tian, Y., & Breheny, R. (2021). Verifying Negative Sentences. Journal of Psycholinguistic Research, 50(6), 1511-1534. Note that the data used in this example are simulated and conclusions may differ from the real experiment.)
This study has a 2 x 2 design with Polarity (affirmative vs. negative) and Truth value (true vs. false), resulting in four conditions. Each participant completed 15 trials in each condition.
data2 <- read.csv('negation.csv', header = TRUE)
There are two factors, Polarity and Truth value, each with two levels. Both factors are within-subjects. This should be a two-way within-subjects ANOVA.
# Hint: data_frame$column_name <- as.factor(data_frame$column_name)
data2$polarity <- as.factor(data2$polarity)
data2$truth_value <- as.factor(data2$truth_value)
# descriptives
library(psych)
describeBy(RT ~ polarity*truth_value, data = data2)
##
## Descriptive statistics by group
## polarity: Affirmative
## truth_value: 0
## vars n mean sd median trimmed mad min max range skew kurtosis se
## RT 1 150 6.63 1.54 6.61 6.63 1.41 2.97 10.31 7.34 0.05 -0.16 0.13
## ------------------------------------------------------------
## polarity: Negative
## truth_value: 0
## vars n mean sd median trimmed mad min max range skew kurtosis se
## RT 1 150 8.95 1.48 8.79 8.92 1.45 5.42 12.73 7.31 0.21 -0.33 0.12
## ------------------------------------------------------------
## polarity: Affirmative
## truth_value: 1
## vars n mean sd median trimmed mad min max range skew kurtosis se
## RT 1 150 4.42 1.59 4.47 4.42 1.67 0.01 8.35 8.34 -0.1 -0.15 0.13
## ------------------------------------------------------------
## polarity: Negative
## truth_value: 1
## vars n mean sd median trimmed mad min max range skew kurtosis se
## RT 1 150 8.72 1.64 8.77 8.69 1.7 5.12 12.97 7.86 0.18 -0.34 0.13
# visualisation
library(ggplot2)
data2_grandmean <- aggregate(RT ~ polarity+truth_value, data = data2, FUN = mean)
data2_grandmean$truth_value <- ifelse(data2_grandmean$truth_value==1, 'True','False')
ggplot(data2_grandmean, aes(x = truth_value, y = RT, group = polarity, color = polarity)) +
  geom_point() +
  geom_line() +
  theme_light()
6. Remember we need by-participant (or by-item) means rather than raw data. Convert the raw data to by-participant means. Hint:
data_mean <- aggregate(DV ~ factor1+factor2+factor3..., data=data_frame, FUN=mean)
data2_by_par <- aggregate(RT ~ participant+polarity+truth_value+condition, data = data2, FUN = mean)
# assumption check: normality (of data in each group)
# use shapiro test
tapply(data2_by_par$RT, data2_by_par$condition, shapiro.test)
## $Aff_false
##
## Shapiro-Wilk normality test
##
## data: X[[i]]
## W = 0.95892, p-value = 0.7735
##
##
## $Aff_true
##
## Shapiro-Wilk normality test
##
## data: X[[i]]
## W = 0.94552, p-value = 0.6159
##
##
## $Neg_false
##
## Shapiro-Wilk normality test
##
## data: X[[i]]
## W = 0.98402, p-value = 0.983
##
##
## $Neg_true
##
## Shapiro-Wilk normality test
##
## data: X[[i]]
## W = 0.93444, p-value = 0.493
# run the within-subjects ANOVA with package ez
library(ez)
data2_result <- ezANOVA(data = data2_by_par, dv = .(RT), wid = .(participant), within = .(polarity, truth_value), type = 3)
data2_result
## $ANOVA
## Effect DFn DFd F p p<.05 ges
## 2 polarity 1 9 425.22450 6.932215e-09 * 0.9403198
## 3 truth_value 1 9 94.86405 4.454277e-06 * 0.6817973
## 4 polarity:truth_value 1 9 44.25836 9.344969e-05 * 0.5856094
# post-hoc: pairwise comparisons (within-subjects)
# for the two-way within-subjects ANOVA, we can break the data down by one factor to compute post hocs
# here, we look at the effect of truth value at each level of polarity
library(rstatix)
truth_effect <- data2_by_par %>%
  group_by(polarity) %>%
  anova_test(dv = RT, wid = participant, within = truth_value) %>%
  get_anova_table() %>%
  adjust_pvalue(method = "bonferroni")
truth_effect
## # A tibble: 2 × 9
## polarity Effect DFn DFd F p `p<.05` ges p.adj
## <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 Affirmative truth_value 1 9 84.0 0.00000737 "*" 0.86 0.0000147
## 2 Negative truth_value 1 9 3.00 0.117 "" 0.081 0.234
# You can also explore the interaction in another way and look at the effect of polarity at each level of truth value
# It's up to you which is the appropriate way(s) of exploring your interaction (decide based on your experiment design).
# For me this makes less sense in this experiment design.
polarity_effect <- data2_by_par %>%
  group_by(truth_value) %>%
  anova_test(dv = RT, wid = participant, within = polarity) %>%
  get_anova_table() %>%
  adjust_pvalue(method = "bonferroni")
polarity_effect
## # A tibble: 2 × 9
## truth_value Effect DFn DFd F p `p<.05` ges p.adj
## <fct> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 0 polarity 1 9 129. 0.00000123 * 0.906 0.00000246
## 2 1 polarity 1 9 341. 0.0000000183 * 0.957 0.0000000366
Two-way within-subjects ANOVA revealed significant main effects of Polarity (F(1,9)=425.22, p<0.001, \(\eta_G^2\)=0.94) and Truth value (F(1,9)=94.86, p<0.001, \(\eta_G^2\)=0.68), as well as a significant interaction (F(1,9)=44.26, p<0.001, \(\eta_G^2\)=0.59; generalized eta squared from the ges column above). Post-hoc comparisons (Bonferroni-corrected) revealed that for affirmative sentences, RTs were significantly faster for true sentences than for false sentences (p<0.001); however, for negative sentences, there was no significant difference in RT between true and false sentences (p=0.23).
Suppose there is a language A, where sentences such as “Every girl in this class read her favorite book” can have two different interpretations: either (A) for each girl (e.g. \(x_1\), \(x_2\), \(x_3\),…) in the class, there is a book that she likes (e.g. \(x_1\) likes \(book_1\), \(x_2\) likes \(book_2\), \(x_3\) likes \(book_3\), etc.), and each girl read their respective favorite book; or (B) there is some girl/woman \(x\), such that each girl in this class read \(x\)’s favorite book.
A speaker of language A will consistently prefer one or the other of the two readings for this type of sentence. A linguist wants to know whether this kind of speaker variability depends on the region that the speaker is from (south vs. north). 100 native speakers (50 from the south, 50 from the north) read the example sentence and gave their preferred interpretation (A or B).
data3 <- read.csv('pronoun.csv', header = TRUE)
Here we have categorical data: the dependent variable is categorical and takes the value of either A or B. We can use the chi-squared test to see whether the proportion/frequency of A vs. B differs significantly between the groups/conditions.
# contingency table: counts of each interpretation by region
data3_freq <- table(data3$region, data3$interpretation)
data3_freq
##
## A B
## north 21 29
## south 17 33
library(ggplot2)
data3_plot <- as.data.frame(table(subset(data3, select = -c(participant))))
ggplot(data = data3_plot, aes(x = region, y = Freq, fill = interpretation)) +
  geom_bar(stat = "identity", width = 0.4)
6. Run the statistical test.
chisq.test(data3_freq)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: data3_freq
## X-squared = 0.382, df = 1, p-value = 0.5365
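If you want to double-check the chi-squared assumptions, the test object also stores the expected counts under independence (a minimal sketch; data3_chisq is a made-up name):
data3_chisq <- chisq.test(data3_freq)
data3_chisq$expected  # expected counts under independence; ideally all cells >= 5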
The chi-squared test revealed no significant difference between regions (south vs. north) in native speakers’ preferred interpretation (\(\chi^2\)(1)=0.38, p=0.54).