This page reports the analyses for the first experiment described in ‘Colour biases in learned foraging preferences in Trinidadian guppies.’ The R script that reproduces the analysis and this site is in the analysis-experiment-1.Rmd file, and the raw data used to conduct these analyses are available in the colour-learning-experiment-1-data.csv file. Note that the code blocks that produce the figures and tables are not shown on this page, as they are rather long; that code can be seen in analysis-experiment-1.Rmd. To get straight to the results, go to the Models section. To see how to reproduce these results, please visit the How to Reproduce the Results section of the README.
At the time experiment 1 was conducted, I (MWT) still performed much data preparation manually in Excel, so unlike in experiment 2, we do not format the data in R for experiment 1. Additionally, several variables were created in Excel. This Excel sheet was then exported as the .csv file colour-learning-experiment-1-data.csv.
All measures in
colour-learning-experiment-1-data.csv
except ate
(denoting whether a fish ate during a trial or not) are derived
from the first 16 columns of that data sheet. Please see the Variable
Creation
section of Experiment 2’s analysis page to see how the additional measures are
derived. Variables necessary for the analysis which were not created manually
are created below. Descriptions of the variables found in the data set are given
in the Experiment 1 Metadata section of metadata.md.
The preference metrics green.object.preference
and
rewarding.object.preference
are created by subtracting the time spent near the
blue object from the time spent near the green object and subtracting the time
spent near the untrained object from the time spent near the trained object
respectively. The proportional rewarding object preference
prop.rewarding.object.preference
is created by dividing the time spent near
the trained object by the time spent near both objects.
# Reading in Data
my_data <- read.csv("data/colour-learning-experiment-1-data.csv")
# Creating new variables
## Green object preference
my_data <- my_data %>%
  mutate(
    green.object.preference =
      case_when(
        rewarding.object.colour == "green" ~
          time.with.trained.object - time.with.untrained.object,
        rewarding.object.colour == "blue" ~
          time.with.untrained.object - time.with.trained.object
      )
  )
## Rewarding object preference
my_data <- my_data %>%
  mutate(
    rewarding.object.preference =
      time.with.trained.object - time.with.untrained.object
  )
## Rewarding object proportional preference
my_data <- my_data %>%
  mutate(
    prop.rewarding.object.preference =
      time.with.trained.object / (time.with.trained.object + time.with.untrained.object)
  )
We remove males from our analyses due to consistently low feeding motivation in this particular experimental design across all males. However, we note that the conclusions of the experiment do not change even if males are included: one can remove the lines of code below that exclude males, re-run the R script, and find that the statistical conclusions are maintained. Additionally, two fish died before testing in Experiment 1 and were not included in the analyses.
# Removing males
my_data <- my_data %>%
  filter(
    !id %in% c(
      "a3", "b3", "c3", "e3",
      "g3", "v3", "w3", "x3",
      "b1", "b2"
    )
  ) %>%
  droplevels()
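As a quick sanity check after filtering, dplyr's n_distinct() reports how many individual fish remain in the data (an illustrative check, not part of the original script):
# Count the individual fish remaining after removing males and dead fish
n_distinct(my_data$id)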
To conduct the analyses we planned, we create subsets of the full data set that
are restricted to the training trials (reinforced), the test trials
(unreinforced), and the initial test trial (unreinforced) using the filter()
function from dplyr. We change trial to a factor for the unreinforced test trial data subset, since there are two levels of trial being compared to each other in the analysis of this data set. Trial in the training data subset is kept as an integer to allow us to look at trends in the shift of rewarding object preference during training.
# Restrict data to training data
training_data <- my_data %>%
  filter(trial.type == "training")
# Restrict data to only the baseline and re-test data
test_data <- my_data %>%
  filter(trial.type == "test")
# Restrict data to only the baseline data
baseline_data <- my_data %>%
  filter(trial == 0)
# Change trial to factor
test_data$trial <- as.factor(test_data$trial)
baseline_data$trial <- as.factor(baseline_data$trial)
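A quick way to confirm the conversion worked is to inspect the factor levels; given the contrast labels used later ("0 blue", "21 blue", and so on), the test data subset should contain two levels (an illustrative check):
# Confirm trial now has the two expected levels in the test subset
levels(test_data$trial)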
This script will generate figures, but to store them we need to create the specific directories that have been hard-coded into the script. We create the figs/ directory and all the subdirectories within it using the next line of code; all figures created in this script are then accessible as individual files in the figs/ directory. If this script is run multiple times, dir.create() returns a warning saying that the directory already exists. This is not problematic, since the figures are always regenerated by running the script, so we set showWarnings to FALSE. Every figure seen on this page will be available as an individual file in the figs/ directory if one runs all the code chunks of analysis-experiment-1.Rmd.
dir.create(file.path("figs/exp-1/residual-plots"),
  recursive = TRUE,
  showWarnings = FALSE
)
We analysed the data from our experiment using linear, linear mixed effects, generalized linear mixed effects, and generalized linear models with the lm(), lmer(), glmmTMB(), and glm.nb() functions from the stats, lme4, glmmTMB, and MASS packages respectively. P-values and approximate degrees of freedom for lme4 models were obtained using the lmerTest package. Model residuals were checked against their distributional assumptions with the DHARMa package; you can click the ‘See Model Residuals’ button below the model formulas to see the residual diagnostic plots produced by DHARMa for that particular model.
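The code chunks below assume the required packages have already been attached, for example in a setup chunk not shown on this page. A minimal set of library() calls covering the functions used here would be:
# Packages assumed to be attached earlier (e.g., in a setup chunk)
library(dplyr)    # data manipulation and the %>% pipe
library(lme4)     # lmer()
library(lmerTest) # p-values and degrees of freedom for lmer() models
library(glmmTMB)  # glmmTMB()
library(MASS)     # glm.nb()
library(DHARMa)   # simulateResiduals() and residual diagnostics
library(emmeans)  # post-hoc comparisons
library(ggplot2)  # ggsave()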
This first model contains the data for all individual guppies. We looked at the green object preference of all guppies in an intercept-only model to see whether the green object preference at baseline was significantly different from zero. green.object.preference is the time spent near the green object minus the time spent near the blue object.
baseline_data_model <-
  lm(green.object.preference ~ 1,
    data = baseline_data
  )
simulationOutput <- simulateResiduals(fittedModel = baseline_data_model)
plot(simulationOutput)
# Saving plot to figs directory
ggsave(
  filename = "exp-1-model-1-residual-plot.png",
  plot = plot(simulationOutput),
  path = "figs/exp-1/residual-plots/",
  device = "png",
  dpi = 300
)
Factor | Estimate | Std. Error | T statistic | P value |
---|---|---|---|---|
Intercept | 4.976 | 3.624 | 1.373 | 0.193 |
During the initial test, there was no significant preference for the green object across all guppies (green object preference: 5 ± 4 seconds, p = 0.193).
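Because an intercept-only linear model fit to a single group is equivalent to a one-sample t-test, the same conclusion can be cross-checked directly (an illustrative check, not part of the original analysis):
# Equivalent one-sample t-test against zero; reproduces the intercept
# estimate (4.976) and p-value (0.193) from the model above
t.test(baseline_data$green.object.preference, mu = 0)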
To see whether fish were responsive during training, our second model asks whether the preference for the rewarding object changes throughout training and whether this change differs between the treatments.
rewarding.object.preference is the time (seconds) spent near the rewarding object minus the time spent near the unrewarding object.
rewarding.object.colour is the identity of the rewarding object (blue or green).
trial is the number of the training trial; in this model it is supplied as an integer.
id is the identity of the individual fish.
training_data_model <-
  lmer(rewarding.object.preference ~ rewarding.object.colour * trial + (1 | id),
    data = training_data
  )
# Residual diagnostics
simulationOutput <- simulateResiduals(
  fittedModel = training_data_model,
  n = 1000
)
plot(simulationOutput)
# Saving plot to figs directory
ggsave(
  filename = "exp-1-model-2-residual-plot.png",
  plot = plot(simulationOutput),
  path = "figs/exp-1/residual-plots/",
  device = "png",
  dpi = 300
)
There is a slight amount of structure in the residuals; however, visual inspection suggests this structure is minor, and there is no indication of a gross model misfit, so the model is still suitable.
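DHARMa also provides formal tests that can complement this visual inspection (an illustrative check, not part of the original analysis):
# Formal DHARMa tests to complement the visual residual inspection
testUniformity(simulationOutput) # tests overall uniformity of the scaled residuals
testDispersion(simulationOutput) # simulation-based test for over/underdispersion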
Factor | Estimate | Std. Error | T statistic | df | P value |
---|---|---|---|---|---|
Intercept | -10.343 | 21.362 | -0.484 | 44.178 | 0.631 |
Rewarding object colour | 61.436 | 28.128 | 2.184 | 43.451 | 0.034 |
Trial | 10.910 | 1.422 | 7.672 | 260.793 | < .001 |
Rewarding object colour X Trial | -1.630 | 1.870 | -0.872 | 260.775 | 0.384 |
There was a significant effect of trial: over the 20 training trials, guppies in the two treatments increased their relative preference for their respective rewarded objects by 11 ± 1 seconds per trial (Figure 2, p < .001). There was also a significant effect of rewarded-object colour (p = 0.034): during training, green-rewarded guppies expressed a stronger preference for their rewarded object (the green object) than blue-rewarded guppies did for the blue object, a difference of 61 ± 28 seconds. However, there was no significant interaction between rewarding object colour and trial (-2 ± 2 seconds, p = 0.384), i.e., the rate of increase in object preference over trials did not significantly differ between the treatments.
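To see what these coefficients imply on the scale of the data, the model-implied preferences at the first and last training trials can be extracted with emmeans() (an illustrative computation, not part of the original analysis):
# Model-implied rewarding object preference at the first and last
# training trials for each treatment
emmeans(training_data_model,
  specs = ~ rewarding.object.colour | trial,
  at = list(trial = c(1, 20))
)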
For the main effects of training and rewarding object colour on rewarding object preference, we fit a generalized linear mixed effects model with a Gaussian distribution in which the variances were explicitly modelled to account for variance heterogeneity, using the package glmmTMB. Our third model asks whether the preference for the rewarding object changed between the baseline and final test, and looks for an interaction with rewarded-object colour. To control for heterogeneous variance across trials, we additionally modelled the variance due to trial across both colour treatments.
test_data_model_glm <-
  glmmTMB(rewarding.object.preference ~
      trial * rewarding.object.colour + (1 | id) +
      # diag() fits a separate variance for each colour-by-trial combination,
      # accounting for the heterogeneous variance across tests and treatments
      diag(0 + rewarding.object.colour:trial | id),
    data = test_data,
    family = gaussian
  )
simulationOutput <- simulateResiduals(fittedModel = test_data_model_glm, n = 1000)
plot(simulationOutput)
# Saving plot to figs directory
ggsave(
  filename = "exp-1-model-3-residual-plot.png",
  plot = plot(simulationOutput),
  path = "figs/exp-1/residual-plots/",
  device = "png",
  dpi = 300
)
Factor | Estimate | Std. Error | T statistic | P value |
---|---|---|---|---|
Intercept | 1.135 | 3.494 | 0.325 | 0.745 |
Trial | 19.586 | 13.295 | 1.473 | 0.141 |
Rewarding object colour | 8.425 | 6.046 | 1.393 | 0.163 |
Rewarding object colour X Trial | 84.341 | 22.598 | 3.732 | < .001 |
When comparing the initial and final preference tests, both conducted without food rewards present, we found a significant interaction between test and rewarding object colour (p < .001). Guppies that had been green-rewarded showed a shift in rewarded object preference that was on average 84 ± 23 seconds greater than the shift shown by guppies trained to blue (Figure 3).
To determine whether the means of the final rewarding object preference differed between the two treatments, we conducted post-hoc comparisons with the package emmeans. We compared the following means:
test_data_model_emmeans <- emmeans(test_data_model_glm,
  specs = ~ trial:rewarding.object.colour
)
# Making vectors to represent means of interest from emmeans() output
blue0 <- c(1, 0, 0, 0)
blue21 <- c(0, 1, 0, 0)
green0 <- c(0, 0, 1, 0)
green21 <- c(0, 0, 0, 1)
# Set seed to prevent confidence intervals from changing when code is re-run
set.seed(123)
custom_contrasts <- contrast(test_data_model_emmeans,
  method = list(
    "21 blue - 0 blue" = blue21 - blue0,
    "21 green - 0 green " = green21 - green0,
    "21 green - 21 blue" = green21 - blue21,
    "0 green - 0 blue" = green0 - blue0
  ),
  adjust = "mvt"
) %>%
  summary(infer = TRUE)
Contrast | Estimate | Lower CL | Upper CL | df | P Value |
---|---|---|---|---|---|
21 blue - 0 blue | 19.586 | -16.091 | 55.262 | 18 | 0.413 |
21 green - 0 green | 103.926 | 54.892 | 152.961 | 18 | < .001 |
21 green - 21 blue | 92.766 | 34.336 | 151.195 | 18 | 0.002 |
0 green - 0 blue | 8.425 | -7.800 | 24.649 | 18 | 0.459 |
Post-hoc comparisons (Table 2) reveal that initially, before training, there was no significant difference between the treatments in the strength of preference for the rewarded object (8 ± 6 seconds, p = 0.459). The shift in rewarded object preference between the initial and final preference tests was significant for green-trained guppies but not for blue-trained guppies: green-trained guppies increased their preference for the green object by 104 ± 18 seconds from the initial to the final test (p < .001), whereas blue-trained guppies increased their preference for the blue object by 20 ± 13 seconds, an effect that was not statistically significant (p = 0.413). At the final test, green-rewarded guppies had a significantly stronger preference for the previously rewarded object than blue-rewarded guppies (difference in preference: 93 ± 22 seconds, p = 0.002).
Additionally, we compared the mean for each colour group in each test trial against zero to see if the preference for the rewarding object was significantly different from zero within a test trial.
set.seed(123)
emmeans(test_data_model_glm,
  specs = ~ rewarding.object.colour:trial,
  adjust = "mvt"
) %>%
  summary(infer = TRUE)
Rewarding object colour | Trial | Estimate | Std. Error | df | Lower CL | Upper CL | T ratio | P Value |
---|---|---|---|---|---|---|---|---|
blue | Initial Test | 1.135 | 3.494 | 18 | -8.465 | 10.734 | 0.325 | 0.995 |
green | Initial Test | 9.559 | 4.935 | 18 | -3.999 | 23.118 | 1.937 | 0.236 |
blue | Final Test | 20.720 | 12.828 | 18 | -14.526 | 55.967 | 1.615 | 0.393 |
green | Final Test | 113.486 | 17.595 | 18 | 65.143 | 161.829 | 6.450 | < .001 |
We find that only green-trained guppies during the final test trial display a rewarding object preference that is significantly different from chance (green object preference: 113 ± 18 seconds, p < .001).
A discrepancy in reinforcement between treatments may influence performance on a final preference test. To see whether there was a difference in feeding between the treatments, we counted the number of trials in which each individual fish ate throughout training and compared these feeding counts between treatments. To do this we fit a generalized linear model with a negative binomial distribution. The response variable feeding.count is the number of training trials in which a guppy ate.
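The data frame my_feeding_data is prepared outside the code shown on this page. A minimal sketch of how the feeding counts could be derived, assuming ate is coded 0/1 in the raw data, would be:
# Hypothetical sketch (the actual construction of my_feeding_data is not
# shown here): sum `ate` over the training trials for each fish
my_feeding_data <- training_data %>%
  group_by(id, rewarding.object.colour) %>%
  summarise(feeding.count = sum(ate), .groups = "drop")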
feeding_data_model <-
  glm.nb(feeding.count ~ rewarding.object.colour,
    data = my_feeding_data
  )
simulationOutput <- simulateResiduals(fittedModel = feeding_data_model)
plot(simulationOutput)
# Saving plot to figs directory
ggsave(
  filename = "exp-1-model-4-residual-plot.png",
  plot = plot(simulationOutput),
  path = "figs/exp-1/residual-plots/",
  device = "png",
  dpi = 300
)
Factor | Estimate | Std. Error | T statistic | P value |
---|---|---|---|---|
Intercept | 2.603 | 0.130 | 19.99 | < .001 |
Rewarding object colour | 0.027 | 0.172 | 0.16 | 0.873 |
We found no significant difference between green-rewarded and blue-rewarded fish in the number of trials in which individuals fed (Figure 4, p = 0.873). We also incorporated feeding count as a covariate in Model 3 and found the same pattern of results (see ESM Model 1).
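Because glm.nb() uses a log link, the coefficients back-transform to mean feeding counts, which puts the estimates on an interpretable scale (an illustrative computation; blue is assumed to be the reference level under R's default alphabetical factor coding):
# Back-transforming the log-link coefficients to mean feeding counts
exp(2.603)          # ~13.5 of the 20 training trials for blue-rewarded fish
exp(2.603 + 0.027)  # ~13.9 trials for green-rewarded fish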
In this section we describe models not included in the main text.
This model was run to address the potential role of feeding count in test performance. The concern was that the results might be explained by differential levels of reinforcement between the groups, or between individuals, during training. To ensure our results were robust to variation in feeding count, we included feeding count as a covariate and re-ran the analysis from Model 3, checking whether including the number of trials in which an individual fed changes the conclusions.
test_data_feeding_controlled_model <-
  glmmTMB(rewarding.object.preference ~
      trial * rewarding.object.colour + feeding.count + (1 | id) +
      diag(0 + rewarding.object.colour * trial | id),
    data = test_feeding_data,
    family = gaussian
  )
simulationOutput <- simulateResiduals(fittedModel = test_data_feeding_controlled_model)
plot(simulationOutput)
# Saving plot to figs directory
ggsave(
  filename = "exp-1-ESM-model-1-residual-plot.png",
  plot = plot(simulationOutput),
  path = "figs/exp-1/residual-plots/",
  device = "png",
  dpi = 300
)
Factor | Estimate | Std. Error | T statistic | P value |
---|---|---|---|---|
Intercept | 2.513 | 9.872 | 0.255 | 0.799 |
Trial | 19.583 | 13.428 | 1.458 | 0.145 |
Rewarding object colour | 8.463 | 6.017 | 1.407 | 0.16 |
Feeding count | -0.102 | 0.684 | -0.149 | 0.881 |
Rewarding object colour X Trial | 84.346 | 22.040 | 3.827 | < .001 |
The main results do not change if we control for feeding count. Table 4 shows the output for the feeding-count-controlled model. Below, the output table for the model without feeding count from Model 3 (Table 1) is reproduced. In both models the statistical conclusions are the same.
Factor | Estimate | Std. Error | T statistic | P value |
---|---|---|---|---|
Intercept | 1.135 | 3.494 | 0.325 | 0.745 |
Trial | 19.586 | 13.295 | 1.473 | 0.141 |
Rewarding object colour | 8.425 | 6.046 | 1.393 | 0.163 |
Rewarding object colour X Trial | 84.341 | 22.598 | 3.732 | < .001 |
A reviewer of our manuscript asked: “Why the authors did not (also) exploit a percentage preference, which is commonly used?”
To determine whether our results are robust to this choice of measure, we also ran an analysis with the percentage preference as ESM Model 2; the results from the main experiment remain unchanged. Since our data are continuous proportions, we used a generalized linear mixed effects model with a beta distribution.
prop_test_data_model_glm <-
  glmmTMB(prop.rewarding.object.preference ~
      trial * rewarding.object.colour + (1 | id),
    data = test_data,
    family = beta_family(link = "logit")
  )
simulationOutput <- simulateResiduals(
  fittedModel = prop_test_data_model_glm,
  n = 1000
)
plot(simulationOutput)
# Saving plot to figs directory
ggsave(
  filename = "exp-1-ESM-model-2-residual-plot.png",
  plot = plot(simulationOutput),
  path = "figs/exp-1/residual-plots/",
  device = "png",
  dpi = 300
)
Factor | Estimate | Std. Error | T statistic | P value |
---|---|---|---|---|
Intercept | 0.035 | 0.199 | 0.178 | 0.859 |
Trial | 0.235 | 0.282 | 0.834 | 0.404 |
Rewarding object colour | 0.310 | 0.264 | 1.172 | 0.241 |
Rewarding object colour X Trial | 0.901 | 0.394 | 2.286 | 0.022 |
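Since the beta model uses a logit link, the estimates in this table are on the log-odds scale; back-transforming with plogis() puts them on the proportion scale (an illustrative computation, not part of the original analysis):
# The intercept back-transforms to the blue-rewarded group's mean
# proportional preference at the initial test
plogis(0.035)  # ~0.509, matching the marginal mean reported below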
As in Model 3, we find an interaction effect between trial and rewarding object colour for the test trials (p = 0.022). We again conduct post-hoc tests as described in the Model 3 section.
prop_test_data_model_emmeans <- emmeans(prop_test_data_model_glm,
  specs = ~ trial:rewarding.object.colour,
  type = "response"
)
# Set seed to prevent confidence intervals from changing when code is re-run
set.seed(123)
prop_custom_contrasts <- contrast(prop_test_data_model_emmeans,
  method = list(
    "21 blue - 0 blue" = blue21 - blue0,
    "21 green - 0 green " = green21 - green0,
    "21 green - 21 blue" = green21 - blue21,
    "0 green - 0 blue" = green0 - blue0
  ),
  adjust = "mvt"
) %>%
  summary(infer = TRUE)
Contrast | Odds ratio | Lower CL | Upper CL | df | P Value |
---|---|---|---|---|---|
21 blue / 0 blue | 1.265 | 0.600 | 2.668 | 22 | 0.8 |
21 green / 0 green | 3.116 | 1.502 | 6.466 | 22 | 0.001 |
21 green / 21 blue | 3.358 | 1.546 | 7.291 | 22 | 0.001 |
0 green / 0 blue | 1.363 | 0.678 | 2.743 | 22 | 0.592 |
We find the same pattern of results as in Model 3. Green-trained guppies significantly increase their preference for the green object by 23 percentage points (p = 0.001), whereas blue-trained guppies show a non-significant increase in their preference for the blue object of 6 percentage points (p = 0.8). There is no significant difference in preference for the rewarding object between the treatments before training (p = 0.592).
Additionally, we again compared each test trial mean to chance (50% in this case).
set.seed(123)
emmeans(prop_test_data_model_glm,
  specs = ~ rewarding.object.colour:trial,
  adjust = "mvt",
  type = "response"
) %>%
  summary(infer = TRUE)
Rewarding object colour | Trial | Estimate | Std. Error | df | Lower CL | Upper CL | T ratio | P Value |
---|---|---|---|---|---|---|---|---|
blue | Initial Test | 0.509 | 0.050 | 22 | 0.377 | 0.639 | 0.178 | 1 |
green | Initial Test | 0.585 | 0.042 | 22 | 0.469 | 0.693 | 1.980 | 0.212 |
blue | Final Test | 0.567 | 0.049 | 22 | 0.433 | 0.692 | 1.352 | 0.553 |
green | Final Test | 0.815 | 0.032 | 22 | 0.711 | 0.887 | 6.901 | < .001 |
We find that only green-trained guppies during the final test trial display a rewarding object preference that is significantly different from chance (green object preference: 81.5 ± 3.2%, p < .001) which is the same result we find when using a difference measure.
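As a consistency check (not part of the original analysis), the odds ratios reported in the contrast table can be recovered from these marginal means; for example, for the green-trained guppies:
# Recover the "21 green / 0 green" odds ratio from the marginal means
# (0.585 at the initial test, 0.815 at the final test)
odds <- function(p) p / (1 - p)
odds(0.815) / odds(0.585)  # ~3.13, close to the reported odds ratio of 3.116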
The analyses on this page were done with R version 3.6.2 (2019-12-12) and with functions from the packages listed in Table 8. This page was written in R Markdown and rendered with knitr. In Table 8 we provide a list of packages used, their versions, and references. Note that this list does not include the dependencies of these packages; it includes only packages explicitly loaded with a library() call. Installing these packages will automatically install all their dependencies, but to see the full list of dependencies for all packages used, as well as their versions, please visit the metadata section of the README.
Package | Version | Reference |
---|---|---|
broom | 0.5.5 | David Robinson and Alex Hayes (2020). broom: Convert Statistical Analysis Objects into Tidy Tibbles. R package version 0.5.5. https://CRAN.R-project.org/package=broom |
broom.mixed | 0.2.6 | Ben Bolker and David Robinson (2020). broom.mixed: Tidying Methods for Mixed Models. R package version 0.2.6. https://CRAN.R-project.org/package=broom.mixed |
DHARMa | 0.3.3.0 | Florian Hartig (2020). DHARMa: Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models. R package version 0.3.3.0. http://florianhartig.github.io/DHARMa/ |
dplyr | 1.0.3 | Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2021). dplyr: A Grammar of Data Manipulation. R package version 1.0.3. https://CRAN.R-project.org/package=dplyr |
emmeans | 1.5.1 | Russell Lenth (2020). emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 1.5.1. https://CRAN.R-project.org/package=emmeans |
ggplot2 | 3.3.3 | H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016. |
ggpubr | 0.2.5 | Alboukadel Kassambara (2020). ggpubr: ‘ggplot2’ Based Publication Ready Plots. R package version 0.2.5. https://CRAN.R-project.org/package=ggpubr |
glmmTMB | 1.0.0 | Mollie E. Brooks, Kasper Kristensen, Koen J. van Benthem, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler and Benjamin M. Bolker (2017). glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. The R Journal, 9(2), 378-400. |
knitr | 1.30 | Yihui Xie (2020). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.30. |
lme4 | 1.1.21 | Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01. |
lmerTest | 3.1.1 | Kuznetsova A, Brockhoff PB, Christensen RHB (2017). “lmerTest Package: Tests in Linear Mixed Effects Models.” Journal of Statistical Software, 82(13), 1-26. doi: 10.18637/jss.v082.i13 (URL: https://doi.org/10.18637/jss.v082.i13). |
magrittr | 2.0.1 | Stefan Milton Bache and Hadley Wickham (2020). magrittr: A Forward-Pipe Operator for R. R package version 2.0.1. https://CRAN.R-project.org/package=magrittr |
MASS | 7.3-51.4 | Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0 |
Matrix | 1.2-18 | Douglas Bates and Martin Maechler (2019). Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.2-18. https://CRAN.R-project.org/package=Matrix |
R | 3.6.2 | R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. |
report | 0.2.0 | Makowski, D., Ben-Shachar, M.S., Patil, I. & Lüdecke, D. (2020). Automated reporting as a practical tool to improve reproducibility and methodological best practices adoption. CRAN. Available from https://github.com/easystats/report. |
tidyr | 1.0.2 | Hadley Wickham and Lionel Henry (2020). tidyr: Tidy Messy Data. R package version 1.0.2. https://CRAN.R-project.org/package=tidyr |