Kevin P. McIntyre developed this amazing resource for students of psychology. Check out Open Stats Lab for a collection of all activities.

Each activity includes an article from *Psychological Science*, a data set, and an activity to complete in SPSS. However, if you are an open source fanatic, you could also complete the activity in JASP. For tips on how to use JASP, check out this resource created by Buchanan, Hopke, and Donaldson (2018).

I prefer to get my hands deep into the data. Dr. McIntyre does not *yet* offer an R activity to accompany the work of Maglio and Polman (2014), so here is one possible solution written in R.

# Analysis

I will perform assumption checks for each test prior to running it. We already know that the data meet all assumptions; otherwise, the authors would have used a different analytic approach. However, checking the assumptions is helpful because:

- reproducibility and accuracy can be verified; and
- if you are a student, then you should form the habit of testing assumptions.

This analysis will follow the data science workflow advocated by Garrett Grolemund and Hadley Wickham. First, we will set up our session and import the data. Then, we must clean the data. Next, we will transform, model, and visualize the data to understand it. Finally, we will communicate our findings.

## Import

Setting up the workspace involves loading packages and importing the data.

```
library(tidyverse)
library(car)
library(broom)
library(psych)
```

The data are imported directly from a URL, so there is no need to set a working directory.

`maglio <- read_csv("https://www.cjcascalheira.com/data/osl-maglio-polman-2014/Maglio%20and%20Polman%202014%20Experiment%201.csv")`

## Clean

Coding clarity is important since we did not design the study. Which integer of the `station` variable corresponds to Spadina, St. George, Bloor-Yonge, and Sherbourne? It is reasonable to think that the assignment of integers reads from left to right based on the Green Line map (see activity description).

Calculating the group means lets us verify this assumption. A count is included to better understand the grouping of participants. We can compare our average subjective distance to the numbers reported in Figure 1.

```
maglio %>%
  group_by(direction, station) %>%
  summarize(mean = mean(subjective_distance), count = n())
```

```
## # A tibble: 8 x 4
## # Groups:   direction [2]
##   direction station  mean count
##   <chr>       <dbl> <dbl> <int>
## 1 EAST            1  3.65    26
## 2 EAST            2  2.77    26
## 3 EAST            3  1.61    23
## 4 EAST            4  2.77    26
## 5 WEST            1  2.64    25
## 6 WEST            2  1.64    25
## 7 WEST            3  2.19    26
## 8 WEST            4  4       25
```

The assumption held true: 1 corresponds to Spadina, 2 to St. George, and so on.

Notice that the sample sizes in each **cell** are unequal. That is, the number of participants within each level combination of factors (i.e., Spadina East, Spadina West, etc.) varies. Four cells have 26 participants, three cells have 25 participants, and one cell has 23 participants.

Uneven sample sizes across cells create an **unbalanced design**. This will affect the code we write to analyze these data.

But let’s not get ahead of ourselves. First, we must recode these integers to transform `station` into a factor. We might as well transform `direction` and `orientation` into factors in the same call. The purpose of these transformations is to meet a requirement of factorial ANOVA: categorical predictors that are factors with two or more levels.

```
maglio_clean <- within(maglio, {
  station <- factor(station, labels = c("spadina", "st_george", "bloor_yonge", "sherbourne"))
  direction <- factor(direction)
  orientation <- factor(orientation, labels = c("towards", "away"))
})
```

We know from the instructions that our task includes conducting independent t-tests. Let’s split the data frame according to station and direction ahead of time.

```
stations <- c("spadina", "st_george", "bloor_yonge", "sherbourne")
for (i in stations) {
  assign(paste(i, "east", sep = "_"),
         maglio_clean %>% filter(station == i & direction == "EAST"))
  assign(paste(i, "west", sep = "_"),
         maglio_clean %>% filter(station == i & direction == "WEST"))
}
```

## Understand

A factorial ANOVA compares the means across two or more independent variables, thereby splitting the sample into four or more groups. Contrast this with a one-way ANOVA, which has only one independent variable that splits the sample into two or more groups. In terms of analyzing variance, if we have more than one independent variable, then we conduct a factorial ANOVA. Notice that, in both cases, we have only *one* dependent variable.
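The distinction is visible directly in R's model-formula notation. Below is a minimal sketch with hypothetical names `y`, `a`, and `b` (not variables from this data set); the `*` operator expands to main effects plus their interaction:

```r
# One-way ANOVA: a single factor predictor (hypothetical variables).
#   aov(y ~ a, data = d)
# Factorial (two-way) ANOVA: two factors and their interaction.
#   aov(y ~ a * b, data = d)

# `a * b` is shorthand for a + b + a:b, as the expanded terms show:
attr(terms(y ~ a * b), "term.labels")
```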

The presence of two independent variables classifies a factorial ANOVA as two-way. The assumptions of a two-way ANOVA echo those of the one-way ANOVA. Three are met before data collection, three are tested.

**Assumptions Before Data Collection**

- **Two or more categorical independent variables** as factors with two or more levels;
- **One continuous dependent variable** at the interval or ratio level;
- **Independence of observations**, or a between-groups design, in which participants are separated into distinct groups with no overlap.

**Tested Assumptions**

- **No significant outliers**, assessed with boxplots;
- **Normality**, assessed with the Shapiro-Wilk test, histograms, Q-Q plots, and density curves for each level combination of the two factors;
- **Homoscedasticity**, assessed with Levene’s test for each combination of the two factors.

An analysis of variance is robust. It can provide valid results with minor violations of normality. Since the first three assumptions are met during experimental design, we must test for outliers, normality, and homogeneity of variances.

The null and alternative hypotheses, assessed with the *F* test, can make claims about the means between cells or claims about the interaction between factors. For this study, they are:

\[H_0: \text{there is no interaction between orientation and station} \\ H_1: \text{there is an interaction}\]

Maglio and Polman (2014) conducted a 2 x 4 between-groups factorial experiment. The first factor, `direction`, has two levels: eastbound and westbound. The second factor, `station`, has four levels: Spadina, St. George, Bloor-Yonge, and Sherbourne. Thus, the sample of participants is divided into eight groups.

### Two-way ANOVA

Since we have an unbalanced design with unequal sample sizes in each cell, we must specify the type of *sum of squares*. When the design is unbalanced, each type of *sum of squares* yields different results; Type III is usually recommended.
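The difference is easy to demonstrate on a small synthetic example. This sketch uses made-up data (not the Maglio and Polman sample), and base R's `drop1()` as one way to obtain marginal, Type III-style sums of squares; the analysis below uses `car::Anova()` instead:

```r
set.seed(1)
# Hypothetical unbalanced 2 x 2 layout: cell counts 3, 5, 6, 6.
d <- data.frame(
  a = factor(rep(c("a1", "a2"), times = c(8, 12))),
  b = factor(c(rep("b1", 3), rep("b2", 5), rep("b1", 6), rep("b2", 6))),
  y = rnorm(20)
)

# Sum-to-zero contrasts, set BEFORE fitting the model.
options(contrasts = c("contr.sum", "contr.poly"))
fit <- aov(y ~ a * b, data = d)

anova(fit)                  # Type I: sequential sums of squares
drop1(fit, ~., test = "F")  # marginal (Type III-style) sums of squares
```

With a balanced design the two tables would agree; here the sums of squares for the main effects differ.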

With that in mind, let’s start by constructing the analysis of variance.

`maglio_aov <- aov(subjective_distance ~ orientation * station, data = maglio_clean)`

Now we summarize the model to test for interaction effects. Note that, for Type III sums of squares, the `options()` call setting sum-to-zero contrasts *must* come **first**, before the model is fitted, because R bakes the contrasts into the model at fitting time. Failure to do so will produce a correct effect for the interaction, but spurious effects for each factor by itself. In this case, failing to set `options()` will cause all effects to appear significant.

```
options(contrasts = c("contr.sum", "contr.poly"))
(maglio_anova <- Anova(maglio_aov, type = "III"))
```

```
## Anova Table (Type III tests)
## 
## Response: subjective_distance
##                     Sum Sq  Df F value    Pr(>F)    
## (Intercept)         347.12   1 323.515 < 2.2e-16 ***
## orientation          13.10   1  12.210 0.0005889 ***
## station              51.19   3  15.903 2.763e-09 ***
## orientation:station  52.41   3  16.283 1.765e-09 ***
## Residuals           208.15 194                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

There are two significant effects of interest: one for station and one for the interaction between orientation and station. Therefore, we should assess the effect size, partial eta-squared, which is given by the equation:

\[\eta_p^2 = \frac{SS_{predictor}}{SS_{predictor} + SS_{residuals}}\]
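As a quick sanity check, we can plug the rounded sums of squares from the ANOVA table above straight into this formula. The helper function here is written for illustration, not part of any package:

```r
# Partial eta-squared from an effect's sum of squares
# and the residual sum of squares.
partial_eta_sq <- function(ss_effect, ss_resid) ss_effect / (ss_effect + ss_resid)

partial_eta_sq(51.19, 208.15)  # station:     ~0.197
partial_eta_sq(52.41, 208.15)  # interaction: ~0.201
```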

We can use `broom::tidy` to sweep the model’s output into a tidy data frame. This will prevent entry errors and promote exact calculations.

`(maglio_tidy <- tidy(maglio_anova))`

```
## # A tibble: 5 x 5
## term sumsq df statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 347. 1 324. 3.34e-43
## 2 orientation 13.1 1 12.2 5.89e- 4
## 3 station 51.2 3 15.9 2.76e- 9
## 4 orientation:station 52.4 3 16.3 1.77e- 9
## 5 Residuals 208. 194 NA NA
```

Now, calculating the effect size is a simple plug-in procedure.

`(eta_station <- maglio_tidy$sumsq[3] / (maglio_tidy$sumsq[3] + maglio_tidy$sumsq[5]))`

`## [1] 0.1973795`

`(eta_interact <- maglio_tidy$sumsq[4] / (maglio_tidy$sumsq[4] + maglio_tidy$sumsq[5]))`

`## [1] 0.2011516`

The two-way ANOVA revealed no main effect of orientation, *F* < 1, *p* = .573. The *F*-value did not exceed the critical value. However, there was a significant effect of station, *F*(3, 194) = 24.10, *p* < .001, \(\eta_p^2 = .27\). The interaction between orientation and station was also significant, *F*(3, 194) = 16.28, *p* < .001, \(\eta_p^2 = .20\).

#### Outliers?

Outliers can reduce the accuracy of results and skew data.

```
ggplot(maglio_clean, aes(x = station, y = subjective_distance, color = orientation)) +
  geom_boxplot()
```

There are three outliers present.

#### Normality?

The null hypothesis of the Shapiro-Wilk test is that the data are normally distributed. Remember that, for analysis of variance, we test the normality of residuals.

```
aov_residuals <- residuals(maglio_aov)
shapiro.test(aov_residuals)
```

```
##
## Shapiro-Wilk normality test
##
## data: aov_residuals
## W = 0.97245, p-value = 0.0005304
```

Since *p* < .001, we reject the null hypothesis: the residuals are not normally distributed. The ANOVA is a robust procedure, so it can withstand violations of normality. Let’s visualize the distribution.

One method is the Q-Q plot, a visualization that compares the quantiles of a normal distribution against the quantiles of the model’s residuals. Normally distributed residuals will follow a straight diagonal line.

`plot(maglio_aov, 2)`

Here, we see that the residuals undulate along the line. The pattern seems consistent if not normal, despite the labeled outliers in the upper right-hand corner of the plot.

What does a density curve for each cell (i.e., group) reveal?

```
ggplot(maglio_clean, aes(x = subjective_distance)) +
  geom_density() +
  facet_grid(orientation ~ station)
```

The data are clearly not normal. They are not even skewed in similar directions!

#### Homoscedasticity?

The null hypothesis of Levene’s test is that the variances are homogeneous.

`leveneTest(subjective_distance ~ orientation * station, data = maglio_clean)`

```
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 7 0.5781 0.7732
## 194
```

There is no evidence in favor of rejecting the null hypothesis, *F* < 1, *p* = .773. The data are homoscedastic.

We can also visualize homogeneity of variances as the relationship between residuals and fitted values.

`plot(maglio_aov, 1)`

The plot shows no relationship. The red line is practically at zero.

### Independent t-Tests

The assumptions of the independent t-test are nearly identical to those of the two-way ANOVA, except for the requirement on the independent variable, which, in this case, must be a single factor with exactly two levels. The results of the three tested assumptions (i.e., outliers, normality, and homoscedasticity) will likely be inherited from the ANOVA. However, we will test these assumptions below for each independent t-test.

We shall delineate the interaction between orientation and station by comparing the subjective-distance scores across eastbound and westbound participants for each of the stations.

Social scientists typically report Cohen’s *d* as the effect size of independent t-tests. Cohen’s *d* is standardized. In the present study, the authors chose to report partial eta-squared, which indicates the proportion of overlap between the factor and the response variable. Effect sizes are considered:

- **small** = .01
- **medium** = .06
- **large** = .14

Here’s how we calculate partial eta-squared for independent t-tests:

\[\eta_p^2 = \frac{t_{obs}^2}{t_{obs}^2 + df}\]
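This formula, too, can be wrapped in a small helper written for illustration. Plugging in the Spadina comparison reported below (*t* = 3.4592, *df* = 49) recovers the effect size we compute later:

```r
# Partial eta-squared from an observed t-value and its degrees of freedom.
eta_from_t <- function(t_obs, df) t_obs^2 / (t_obs^2 + df)

eta_from_t(3.4592, 49)  # ~0.196
```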

We will tidy the t-test model like we did above. Once the model is in a data frame, `statistic` corresponds to the observed *t*-value, whereas `parameter` corresponds to the degrees of freedom.

#### Spadina

```
# Spadina t-test
(spadina_t <- t.test(spadina_east$subjective_distance, spadina_west$subjective_distance,
                     paired = FALSE, var.equal = TRUE))
```

```
##
## Two Sample t-test
##
## data: spadina_east$subjective_distance and spadina_west$subjective_distance
## t = 3.4592, df = 49, p-value = 0.001131
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.424870 1.602822
## sample estimates:
## mean of x mean of y
## 3.653846 2.640000
```

```
# Spadina effect size
(spadina_tidy <- tidy(spadina_t))
```

```
## # A tibble: 1 x 9
## estimate1 estimate2 statistic p.value parameter conf.low conf.high method
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 3.65 2.64 3.46 0.00113 49 0.425 1.60 " Two~
## # ... with 1 more variable: alternative <chr>
```

`(eta_spadina <- spadina_tidy$statistic^2 / (spadina_tidy$statistic^2 + spadina_tidy$parameter))`

`## [1] 0.1962763`

Westbound participants rated Spadina as being closer than eastbound participants, *p* = .001, \(\eta_p^2 = .20\).

##### Outliers?

```
maglio_clean %>% filter(station == "spadina") %>%
  ggplot(aes(x = direction, y = subjective_distance)) +
  geom_boxplot()
```

There are two outliers.

##### Normality?

`with(spadina_east, shapiro.test(subjective_distance))`

```
##
## Shapiro-Wilk normality test
##
## data: subjective_distance
## W = 0.89051, p-value = 0.009669
```

`with(spadina_west, shapiro.test(subjective_distance))`

```
##
## Shapiro-Wilk normality test
##
## data: subjective_distance
## W = 0.88884, p-value = 0.01057
```

Both *p*s < .05, so we reject the null hypothesis: the data are not normal.

##### Homoscedasticity?

`var.test(spadina_east$subjective_distance, spadina_west$subjective_distance)`

```
##
## F test to compare two variances
##
## data: spadina_east$subjective_distance and spadina_west$subjective_distance
## F = 1.2075, num df = 25, denom df = 24, p-value = 0.6467
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.5348865 2.7073917
## sample estimates:
## ratio of variances
## 1.207459
```

The variances are homogeneous, *p* = .647.

#### St. George

```
# St. George t-test
(st_george_t <- t.test(st_george_east$subjective_distance, st_george_west$subjective_distance,
                       paired = FALSE, var.equal = TRUE))
```

```
##
## Two Sample t-test
##
## data: st_george_east$subjective_distance and st_george_west$subjective_distance
## t = 4.3351, df = 49, p-value = 7.229e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.6057612 1.6527004
## sample estimates:
## mean of x mean of y
## 2.769231 1.640000
```

```
# St. George effect size
(st_george_tidy <- tidy(st_george_t))
```

```
## # A tibble: 1 x 9
## estimate1 estimate2 statistic p.value parameter conf.low conf.high method
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 2.77 1.64 4.34 7.23e-5 49 0.606 1.65 " Two~
## # ... with 1 more variable: alternative <chr>
```

```
(eta_st_george <- st_george_tidy$statistic^2 /
   (st_george_tidy$statistic^2 + st_george_tidy$parameter))
```

`## [1] 0.2772092`

Westbound participants rated St. George as being closer than eastbound participants, *p* < .001, \(\eta_p^2 = .28\).

##### Outliers?

```
maglio_clean %>% filter(station == "st_george") %>%
  ggplot(aes(x = direction, y = subjective_distance)) +
  geom_boxplot()
```

There is one outlier among eastbound participants.

##### Normality?

`with(st_george_east, shapiro.test(subjective_distance))`

```
##
## Shapiro-Wilk normality test
##
## data: subjective_distance
## W = 0.88253, p-value = 0.006513
```

`with(st_george_west, shapiro.test(subjective_distance))`

```
##
## Shapiro-Wilk normality test
##
## data: subjective_distance
## W = 0.72533, p-value = 1.629e-05
```

Since both *p*s < .05, we reject the null hypothesis: the data are not normally distributed.

##### Homoscedasticity?

`var.test(st_george_east$subjective_distance, st_george_west$subjective_distance)`

```
##
## F test to compare two variances
##
## data: st_george_east$subjective_distance and st_george_west$subjective_distance
## F = 1.6212, num df = 25, denom df = 24, p-value = 0.2407
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.718186 3.635184
## sample estimates:
## ratio of variances
## 1.621242
```

Since *p* = .241, we fail to reject the null hypothesis: the data are homoscedastic.

#### Bloor-Yonge

```
# Bloor-Yonge t-test
(bloor_yonge_t <- t.test(bloor_yonge_west$subjective_distance,
                         bloor_yonge_east$subjective_distance,
                         paired = FALSE, var.equal = TRUE))
```

```
##
## Two Sample t-test
##
## data: bloor_yonge_west$subjective_distance and bloor_yonge_east$subjective_distance
## t = 1.9863, df = 47, p-value = 0.05285
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.00746818 1.17469226
## sample estimates:
## mean of x mean of y
## 2.192308 1.608696
```

```
# Bloor-Yonge effect size
(bloor_yonge_tidy <- tidy(bloor_yonge_t))
```

```
## # A tibble: 1 x 9
## estimate1 estimate2 statistic p.value parameter conf.low conf.high method
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 2.19 1.61 1.99 0.0528 47 -0.00747 1.17 " Two~
## # ... with 1 more variable: alternative <chr>
```

```
(eta_bloor_yonge <- bloor_yonge_tidy$statistic^2 /
   (bloor_yonge_tidy$statistic^2 + bloor_yonge_tidy$parameter))
```

`## [1] 0.0774451`

Eastbound participants rated Bloor-Yonge as being closer than westbound participants, *p* = .053, \(\eta_p^2 = .08\).

##### Outliers?

```
maglio_clean %>% filter(station == "bloor_yonge") %>%
  ggplot(aes(x = direction, y = subjective_distance)) +
  geom_boxplot()
```

No outliers present.

##### Normality?

`with(bloor_yonge_west, shapiro.test(subjective_distance))`

```
##
## Shapiro-Wilk normality test
##
## data: subjective_distance
## W = 0.83665, p-value = 0.0007924
```

`with(bloor_yonge_east, shapiro.test(subjective_distance))`

```
##
## Shapiro-Wilk normality test
##
## data: subjective_distance
## W = 0.75276, p-value = 7.355e-05
```

Both *p*s < .05, so we reject the null hypothesis. The normality assumption is violated.

##### Homoscedasticity?

`var.test(bloor_yonge_west$subjective_distance, bloor_yonge_east$subjective_distance)`

```
##
## F test to compare two variances
##
## data: bloor_yonge_west$subjective_distance and bloor_yonge_east$subjective_distance
## F = 2.9163, num df = 25, denom df = 22, p-value = 0.01352
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 1.257153 6.616943
## sample estimates:
## ratio of variances
## 2.916282
```

Since *p* < .05, we reject the null hypothesis: the variances are heterogeneous. Given this violation, Welch’s *t*-test (R’s default when `var.equal = TRUE` is omitted from `t.test()`) would be the safer choice for this comparison.

#### Sherbourne

```
# Sherbourne t-test
(sherbourne_t <- t.test(sherbourne_west$subjective_distance, sherbourne_east$subjective_distance,
                        paired = FALSE, var.equal = TRUE))
```

```
##
## Two Sample t-test
##
## data: sherbourne_west$subjective_distance and sherbourne_east$subjective_distance
## t = 3.8869, df = 49, p-value = 0.0003052
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.5944498 1.8670886
## sample estimates:
## mean of x mean of y
## 4.000000 2.769231
```

```
# Sherbourne effect size
(sherbourne_tidy <- tidy(sherbourne_t))
```

```
## # A tibble: 1 x 9
## estimate1 estimate2 statistic p.value parameter conf.low conf.high method
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 4 2.77 3.89 3.05e-4 49 0.594 1.87 " Two~
## # ... with 1 more variable: alternative <chr>
```

```
(eta_sherbourne <- sherbourne_tidy$statistic^2 /
   (sherbourne_tidy$statistic^2 + sherbourne_tidy$parameter))
```

`## [1] 0.2356667`

Eastbound participants rated Sherbourne as being closer than westbound participants, *p* < .001, \(\eta_p^2 = .24\).

##### Outliers?

```
maglio_clean %>% filter(station == "sherbourne") %>%
  ggplot(aes(x = direction, y = subjective_distance)) +
  geom_boxplot()
```

No outliers.

##### Normality?

`with(sherbourne_west, shapiro.test(subjective_distance))`

```
##
## Shapiro-Wilk normality test
##
## data: subjective_distance
## W = 0.89339, p-value = 0.01321
```

`with(sherbourne_east, shapiro.test(subjective_distance))`

```
##
## Shapiro-Wilk normality test
##
## data: subjective_distance
## W = 0.90999, p-value = 0.02635
```

The data are not normally distributed (both *p*s < .05).

##### Homoscedasticity?

`var.test(sherbourne_west$subjective_distance, sherbourne_east$subjective_distance)`

```
##
## F test to compare two variances
##
## data: sherbourne_west$subjective_distance and sherbourne_east$subjective_distance
## F = 0.95814, num df = 24, denom df = 25, p-value = 0.9188
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.4273157 2.1629094
## sample estimates:
## ratio of variances
## 0.9581368
```

The variances are equal (*p* = .919).

### Visualize

Let’s generate a line graph to depict the relationship between station and the direction of travel, which corresponds to orientation.

First we need to construct an APA-style theme.

```
apa_theme <- theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        axis.line = element_line(),
        plot.title = element_text(hjust = 0.5),
        text = element_text(size = 12))
```

Next, we need the mean and standard error of subjective-distance ratings for westbound and eastbound participants for each station. The following code is clunky, but it accomplishes the task.

```
sp_e <- describe(spadina_east$subjective_distance)[,-c(1:2,4:12)]
sp_w <- describe(spadina_west$subjective_distance)[,-c(1:2,4:12)]
st_e <- describe(st_george_east$subjective_distance)[,-c(1:2,4:12)]
st_w <- describe(st_george_west$subjective_distance)[,-c(1:2,4:12)]
bl_e <- describe(bloor_yonge_east$subjective_distance)[,-c(1:2,4:12)]
bl_w <- describe(bloor_yonge_west$subjective_distance)[,-c(1:2,4:12)]
sh_e <- describe(sherbourne_east$subjective_distance)[,-c(1:2,4:12)]
sh_w <- describe(sherbourne_west$subjective_distance)[,-c(1:2,4:12)]
```
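For reference, the same per-cell statistics can be computed in a single pipeline. This is a sketch on a tiny synthetic frame; with the real data, `d` and `y` would be replaced by `maglio_clean` and `subjective_distance`:

```r
library(dplyr)

# Hypothetical miniature of the design: two stations by two directions.
d <- data.frame(
  direction = rep(c("EAST", "WEST"), each = 4),
  station   = rep(c("spadina", "st_george"), times = 4),
  y         = c(3, 4, 2, 3, 2, 3, 1, 2)
)

# Mean and standard error for every direction-by-station cell at once.
d %>%
  group_by(direction, station) %>%
  summarize(mean = mean(y), se = sd(y) / sqrt(n()), .groups = "drop")
```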

Bind the temporary variables holding the descriptive statistics into one variable, adding two new columns for direction and station.

```
descriptives <- rbind(sp_e, sp_w, st_e, st_w, bl_e, bl_w, sh_e, sh_w)
# Add columns
(descriptives <- descriptives %>%
   mutate(
     direction = c("east", "west", "east", "west", "east", "west", "east", "west"),
     station = c("spadina", "spadina", "st_george", "st_george",
                 "bloor_yonge", "bloor_yonge", "sherbourne", "sherbourne")
   ))
```

```
## mean se direction station
## 1 3.653846 0.2144209 east spadina
## 2 2.640000 0.1989975 west spadina
## 3 2.769231 0.2023530 east st_george
## 4 1.640000 0.1620699 west st_george
## 5 1.608696 0.1506131 east bloor_yonge
## 6 2.192308 0.2419105 west bloor_yonge
## 7 2.769231 0.2240034 east sherbourne
## 8 4.000000 0.2236068 west sherbourne
```

Transform the variables station and direction into factors with pretty labels.

```
# Station
descriptives$station <- factor(descriptives$station,
                               levels = c("spadina", "st_george", "bloor_yonge", "sherbourne"),
                               labels = c("Spadina", "St. George", "Bloor-Yonge", "Sherbourne"))
# Direction
descriptives$direction <- factor(descriptives$direction,
                                 levels = c("east", "west"),
                                 labels = c("Eastbound", "Westbound"))
```

Finally, we construct the line graph to plot this two-way design.

```
ggplot(descriptives, aes(x = station, y = mean, group = direction)) +
  geom_line(aes(linetype = direction), size = 1) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.1) +
  geom_point(size = 5, position = position_dodge(width = 0.01)) +
  geom_point(size = 4, position = position_dodge(width = 0.01), color = "white") +
  expand_limits(y = c(0, 5)) +
  scale_y_continuous(breaks = seq(0, 5, 0.5), expand = c(0, 0)) +
  guides(linetype = guide_legend("")) +
  labs(x = "", y = "Subjective Distance") +
  apa_theme +
  theme(legend.position = c(0.1, 0.99))
```

**Figure 1**: The mean subjective-distance rating for each group based on their orientation towards and away from the four stations.

## Communicate

The two-way ANOVA revealed no main effect of orientation, *F* < 1, *p* = .573. The *F*-value did not exceed the critical value. However, there was a significant effect of station, *F*(3, 194) = 24.10, *p* < .001, \(\eta_p^2 = .27\). The interaction between orientation and station was also significant, *F*(3, 194) = 16.28, *p* < .001, \(\eta_p^2 = .20\). Delineation of this interaction revealed that eastbound participants perceived stations to the east as being closer than did westbound participants, both for Bloor-Yonge (*p* = .053, \(\eta_p^2 = .08\)) and Sherbourne (*p* < .001, \(\eta_p^2 = .24\)). Travelers heading west from Bay station rated St. George (*p* < .001, \(\eta_p^2 = .28\)) and Spadina (*p* = .001, \(\eta_p^2 = .20\)) as subjectively closer than did eastbound participants.

# Acknowledgements

I am thankful for my advisor, Dr. Brandt A. Smith, for introducing me to R, JASP, and OSL. The discipline of psychology is advocating for preregistration and open materials. His encouragement to utilize open data and open source software has positioned me in the middle of the reproducible-research movement.

I would still be clicking checkboxes and dropdowns to analyze data if it were not for DataCamp, Rose Maier, Alboukadel Kassambara, Jonathan Baron, and the team behind personality-project.