formatting updates 02/07/2024

cambiotraining · Jul 2, 2024 · b3ad731
1 parent 114f3f4
commit b3ad731

Showing 34 changed files with 116 additions and 59 deletions.
diff --git a/_freeze/materials/checking-assumptions/execute-results/html.json b/_freeze/materials/checking-assumptions/execute-results/html.json
diff --git a/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-11-1.png b/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-11-1.png
diff --git a/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-11-2.png b/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-11-2.png
diff --git a/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-4-1.png b/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-4-1.png
diff --git a/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-5-1.png b/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-5-1.png
diff --git a/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-6-1.png b/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-6-1.png
diff --git a/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-8-1.png b/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-8-1.png
diff --git a/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-8-2.png b/_freeze/materials/checking-assumptions/figure-html/unnamed-chunk-8-2.png
diff --git a/_freeze/materials/crossed-random-effects/execute-results/html.json b/_freeze/materials/crossed-random-effects/execute-results/html.json
diff --git a/_freeze/materials/crossed-random-effects/figure-html/unnamed-chunk-13-1.png b/_freeze/materials/crossed-random-effects/figure-html/unnamed-chunk-13-1.png
diff --git a/_freeze/materials/crossed-random-effects/figure-html/unnamed-chunk-13-2.png b/_freeze/materials/crossed-random-effects/figure-html/unnamed-chunk-13-2.png
diff --git a/_freeze/materials/crossed-random-effects/figure-html/unnamed-chunk-9-1.png b/_freeze/materials/crossed-random-effects/figure-html/unnamed-chunk-9-1.png
diff --git a/_freeze/materials/fitting-mixed-models/execute-results/html.json b/_freeze/materials/fitting-mixed-models/execute-results/html.json
diff --git a/_freeze/materials/generalised-mixed-models/execute-results/html.json b/_freeze/materials/generalised-mixed-models/execute-results/html.json
diff --git a/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-10-1.png b/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-10-1.png
diff --git a/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-13-1.png b/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-13-1.png
diff --git a/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-14-1.png b/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-14-1.png
diff --git a/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-5-1.png b/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-5-1.png
diff --git a/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-7-1.png b/_freeze/materials/generalised-mixed-models/figure-html/unnamed-chunk-7-1.png
diff --git a/_freeze/materials/nested-random-effects/execute-results/html.json b/_freeze/materials/nested-random-effects/execute-results/html.json
diff --git a/_freeze/materials/nested-random-effects/figure-html/unnamed-chunk-17-1.png b/_freeze/materials/nested-random-effects/figure-html/unnamed-chunk-17-1.png
diff --git a/_freeze/materials/nested-random-effects/figure-html/unnamed-chunk-17-2.png b/_freeze/materials/nested-random-effects/figure-html/unnamed-chunk-17-2.png
diff --git a/_freeze/materials/nested-random-effects/figure-html/unnamed-chunk-25-1.png b/_freeze/materials/nested-random-effects/figure-html/unnamed-chunk-25-1.png
diff --git a/_freeze/materials/nested-random-effects/figure-html/unnamed-chunk-25-2.png b/_freeze/materials/nested-random-effects/figure-html/unnamed-chunk-25-2.png
diff --git a/_freeze/materials/random-effects/execute-results/html.json b/_freeze/materials/random-effects/execute-results/html.json
diff --git a/_freeze/materials/significance-and-model-comparison/execute-results/html.json b/_freeze/materials/significance-and-model-comparison/execute-results/html.json
diff --git a/index.md b/index.md
@@ -1,6 +1,6 @@
 ---
 title: "Mixed effects models"
-author: "Vicki Hodgson"
+author: "Vicki Hodgson*, Hugo Tavares, Paul Fannon, Martin van Rongen"
 date: today
 number-sections: false
 ---
@@ -33,7 +33,7 @@ You should have a working knowledge of R/RStudio, and a grasp of core statistics
 Exercises in these materials are labelled according to their level of difficulty:
 
 | Level | Description |
-| ----: | :---------- |
+| :-: | :----------- |
 | {{< fa solid star >}} {{< fa regular star >}} {{< fa regular star >}} | Exercises in level 1 are simpler and designed to get you familiar with the concepts and syntax covered in the course. |
 | {{< fa solid star >}} {{< fa solid star >}} {{< fa regular star >}} | Exercises in level 2 combine different concepts and apply them to a given task. |
 | {{< fa solid star >}} {{< fa solid star >}} {{< fa solid star >}} | Exercises in level 3 require going beyond the concepts and syntax introduced to solve new problems. |
@@ -68,16 +68,18 @@ About the authors:
 
 ## References
 
-Bolker, B. (2023, 8 October). *GLMM FAQ*. https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html
+Baath, R. (2024, 28 January). *The source of the cake dataset*. <https://www.sumsar.net/blog/source-of-the-cake-dataset/>
 
-Choe, J. (2020). *The Correlation Parameter in the Random Effects of Mixed Effects Models.* https://rpubs.com/yjunechoe/correlationsLMEM 
+Bolker, B. (2023, 8 October). *GLMM FAQ*. <https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html>
+
+Choe, J. (2020). *The Correlation Parameter in the Random Effects of Mixed Effects Models.* <https://rpubs.com/yjunechoe/correlationsLMEM> 
 
 Cook, F. E. (1938). *Chocolate cake: I. Optimum baking temperature.* (Doctoral dissertation, Iowa State College).
 
 Faraway, J. J. (2016). *Extending the linear model with R: generalized linear, mixed effects and nonparametric regression models.* Chapman and Hall/CRC.
 
-Hadjuk, G. K. & Gallois E. (2022, 9 February). *Introduction to linear mixed models.* Our Coding Club. https://ourcodingclub.github.io/tutorials/mixed-models/
+Hadjuk, G. K. & Gallois E. (2022, 9 February). *Introduction to linear mixed models.* Our Coding Club. <https://ourcodingclub.github.io/tutorials/mixed-models/>
 
-Oehlert, G. W. (2010). *A first course in design and analysis of experiments.* https://conservancy.umn.edu/server/api/core/bitstreams/87e0734d-31ea-4596-8295-d87705271c07/content 
+Oehlert, G. W. (2010). *A first course in design and analysis of experiments.* <https://conservancy.umn.edu/server/api/core/bitstreams/87e0734d-31ea-4596-8295-d87705271c07/content> 
 
 Winter, B., & Grawunder, S. (2012). *The phonetic profile of Korean formal and informal speech registers.* Journal of Phonetics, 40(6), 808-815.
diff --git a/materials/checking-assumptions.qmd b/materials/checking-assumptions.qmd
@@ -138,11 +138,13 @@ If you find the green, blue and red default colours in `check_model` to be a lit
 
 ## Exercises
 
-### Exercise 1 - Dragons revisited (again)
+### Dragons revisited (again) {#sec-exr_dragons3}
+
+::: {.callout-exercise}
 
 {{< level 1 >}}
 
-Let's once again revisit the `dragons` dataset, and the minimal model that we chose in the previous section based on significance testing:
+Let's once again revisit the `dragons` dataset, and the minimal model that we chose in [Exercise -@sec-exr_dragons2] based on significance testing:
 
 ::: {.panel-tabset group="language"}
 ## R
@@ -157,7 +159,7 @@ lme_dragons_dropx <- lmer(intelligence ~ wingspan + scales +
 
 Fit diagnostic plots for this model using the code given above. What do they show?
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Worked answer
 
 ::: {.panel-tabset group="language"}
@@ -175,7 +177,11 @@ check_model(lme_dragons_dropx,
 
 Try comparing these diagnostic plots to the diagnostic plots for the full model, `intelligence ~ wingspan*scales + (1 + wingspan|mountain)`. Are the assumptions better met? Why/why not?
 
-### Exercise 2 - Arabidopsis
+:::
+
+### Arabidopsis {#sec-exr_arabidopsis}
+
+::: {.callout-exercise}
 
 {{< level 2 >}}
 
@@ -209,7 +215,7 @@ Fit the following mixed effects model:
 
 and check its assumptions. What can you conclude about the suitability of a linear mixed effects model for this dataset?
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Worked answer
 
 #### Fitting the model
@@ -263,6 +269,7 @@ To figure out why, and whether it's fixable, think about the types of variables
 
 Chat about these bonus questions with a neighbour, or a trainer. Understanding why these diagnostic plots look bad, and why we might need to take a closer look at the dataset before we fit things, will serve you really well when working with your own data.
 
+:::
 :::
 
 ## Summary

diff --git a/materials/crossed-random-effects.qmd b/materials/crossed-random-effects.qmd
@@ -139,7 +139,9 @@ If you check the output, you can see that we do indeed have 4 groups each for `r
 
 ## Exercises
 
-### Exercise 1 - Penicillin
+### Penicillin {#sec-exr_penicillin}
+
+::: {.callout-exercise}
 
 {{< level 2 >}}
 
@@ -169,7 +171,7 @@ For this exercise:
 3. Check the model assumptions
 4. Visualise the model
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Worked answer
 
 This is quite a simple dataset, in that there are only two variables besides the response. But, given the research question, we likely want to consider both of these two variables as random effects.
@@ -210,7 +212,11 @@ ggplot(augment(lme_penicillin), aes(x = plate, y = diameter, colour = sample)) +
 
 :::
 
-### Exercise 2 - Politeness
+:::
+
+### Politeness {#sec-exr_solutions}
+
+::: {.callout-exercise}
 
 {{< level 2 >}}
 
@@ -243,7 +249,7 @@ To answer this question:
 2. Try drawing out the structure of the dataset, and think about what levels the different variables are varying at
 3. You may want to assess the quality and significance of the model to help you draw your final conclusions
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Worked answer
 
 #### Consider the experimental design
@@ -337,6 +343,8 @@ In the final line of code for the plot, we've included the lines of best fit for
 
 :::
 
+:::
+
 ## Summary
 
 This section has addressed how to fit models with multiple clustering variables, in scenarios where those clustering variables are not nested with one another.

diff --git a/materials/fitting-mixed-models.qmd b/materials/fitting-mixed-models.qmd
@@ -429,7 +429,9 @@ This idea of taking into account the global average when calculating our set of
 
 ## Exercises
 
-### Exercise 1 - Irrigation
+### Irrigation {#sec-exr_irrigation}
+
+::: {.callout-exercise}
 
 {{< level 1 >}}
 
@@ -458,7 +460,7 @@ For this exercise:
 
 Does it look as if `irrigation` method or crop `variety` are likely to affect `yield`?
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Worked answer
 
 #### Visualise the data
@@ -536,7 +538,11 @@ ggplot(augment(lme_yield), aes(x = irrigation, y = yield, shape = variety)) +
 
 :::
 
-### Exercise 2 - Solutions
+:::
+
+### Solutions {#sec-exr_solutions}
+
+::: {.callout-exercise}
 
 {{< level 2 >}}
 
@@ -554,7 +560,11 @@ There is no worked answer provided for this exercise, in order to challenge you
 Note: if you encounter the `boundary (singular) fit: see help('isSingular')` error, this doesn't mean that you've used the `lme4` syntax incorrectly; as we'll discuss later in the course, it means that the model you've fitted is too complex to be supported by the size of the dataset.
 :::
 
-### Exercise 3 - Dragons
+:::
+
+### Dragons {#sec-exr_dragons}
+
+::: {.callout-exercise}
 
 {{< level 2 >}}
 
@@ -576,7 +586,7 @@ With more variables, there are more possible models that could be fitted. Think
 
 Try to work through this yourself, before expanding the answer below.
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Worked answer
 
 Here, we'll work through how to fit and visualise one possible mixed effects model that could be fitted to these data.
@@ -738,7 +748,9 @@ You might also notice in the model summary that the estimated variance for the r
 
 :::
 
-::: {.callout-tip appearance="minimal"}
+:::
+
+::: {.callout-exercise}
 #### Bonus questions
 
 {{< level 3 >}}
@@ -782,6 +794,8 @@ Where $y$ is `intelligence`, $x_1$ is `wingspan`, $x_2$ is `scales`, $j$ represe
 
 :::
 
+:::
+
 ## Summary
 
 This section of the course is designed to introduce the syntax required for fitting two-level mixed models in R, including both random intercepts and random slopes, and how we can visualise the resulting models.

diff --git a/materials/generalised-mixed-models.qmd b/materials/generalised-mixed-models.qmd
@@ -60,9 +60,9 @@ The assumptions of a GLMM are an amalgamation of the assumptions of a GLM and a
 - Correct link function; there is a linear relationship between the linearised model
 - Normally distributed random effects
 
-## Revisiting arabidopsis
+## Revisiting Arabidopsis
 
-To give an illustration of how we fit and assess generalised linear mixed effects models (GLMMs), we'll look at the internal dataset `Arabidopsis` from `lme4`.
+To give an illustration of how we fit and assess generalised linear mixed effects models (GLMMs), we'll look at the internal dataset `Arabidopsis`, which we investigated earlier in the course in [Exercise -@sec-exr_arabidopsis].
 
 ::: {.panel-tabset group="language"}
 ## R

diff --git a/materials/nested-random-effects.qmd b/materials/nested-random-effects.qmd
@@ -270,7 +270,9 @@ And, no matter which method you choose, always check the model output to see tha
 
 ## Exercises
 
-### Exercise 1 - Cake
+### Cake {#sec-exr_cake}
+
+::: {.callout-exercise}
 
 {{< level 2 >}}
 
@@ -302,7 +304,7 @@ For this exercise:
 3. Consider how you might recode the dataset to reflect implicit nesting
 4. Fit and test at least one appropriate model
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Worked answer
 
 #### Consider the experimental design
@@ -417,12 +419,12 @@ ggplot(augment(lme_cake), aes(x = temperature, y = angle, colour = recipe)) +
 ```
 :::
 
-
+:::
 
 :::
 
-::: {.callout-tip appearance="minimal"}
-#### Follow-up questions
+::: {.callout-exercise}
+#### Bonus questions
 
 {{< level 2 >}}
 
@@ -435,7 +437,9 @@ If you want to think a bit harder about this dataset, consider these additional
 
 For more information on the very best way to bake a chocolate cake (and a lovely demonstration at the end about the dangers of extrapolating from a linear model), [this blog post](https://www.sumsar.net/blog/source-of-the-cake-dataset/) is a nice source. It's written by a data scientist who was so curious about the quirky `cake` dataset that he contacted Iowa State University, who helped him unearth Cook's original thesis.
 
-### Exercise 2 - Parallel fibres
+### Parallel fibres {#sec-exr_parallel}
+
+::: {.callout-exercise}
 
 {{< level 2 >}}
 
@@ -466,7 +470,7 @@ For this exercise:
 2. Determine whether the dataset requires recoding or explicit nesting
 3. Fit and test at least one appropriate model
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Worked answer
 
 #### Visualise the design
@@ -593,8 +597,10 @@ Our diagnostic plots look pretty good for our simpler, intercepts-only model, bu
 
 :::
 
-::: {.callout-tip appearance="minimal"}
-#### Optional follow-up question: notation
+:::
+
+::: {.callout-exercise}
+#### Bonus question: notation
 
 {{< level 3 >}}
 
@@ -604,7 +610,7 @@ What would the equation of a three level model fitted to the `parallel` dataset
 
 Hint: you'll need more subscript letters than you did for a two-level model!
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Answer: three-level intercepts-only
 
 E.g., `length ~ depth + (1|slice:cat) + (1|cat)`
@@ -641,7 +647,7 @@ $$
 
 :::
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Answer: three-level intercepts & slopes
 
 E.g., `length ~ depth + (1|slice:cat) + (1 + depth|cat)`

diff --git a/materials/random-effects.qmd b/materials/random-effects.qmd
@@ -64,7 +64,9 @@ There'll be more about the maths of fitting random effects later in the course,
 
 ## Exercises
 
-### Exercise 1 - Primary schools
+### Primary schools {#sec-exr_primaryschools}
+
+::: {.callout-exercise}
 
 {{< level 1 >}}
 
@@ -82,7 +84,7 @@ The response variable in this example is the standardised academic test scores,
 
 Which of these predictors should be treated as fixed versus random effects? Are there any other "hidden" grouping variables that we should consider, based on the description of the experiment?
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Answer
 
 We care about the effects of `gender` and `SES score`. We might also be interested in testing for the interaction between them, like so: `academic test scores ~ SES + gender + SES:gender`.
@@ -101,7 +103,11 @@ The `classroom` variable would in fact be "nested" inside the `school` variable
 Our other possible hidden variable is `family`. If siblings have been included in the study, they will share an identical SES score, because this has been derived from the parent(s) rather than the students themselves. Siblings are, in this context, technical replicates! One way to deal with this is to simply remove siblings from the study; or, if there are enough sibling pairs to warrant it, we could also treat `family` as a random effect.
 :::
 
-### Exercise 2 - Ferns
+:::
+
+### Ferns {#sec-exr_ferns}
+
+::: {.callout-exercise}
 
 {{< level 1 >}}
 
@@ -115,7 +121,7 @@ What are our variables? What's the relationship we're interested in, and which o
 
 ![Predictor variables](images_mixed-effects/example2_1.png){fig-alt="Graphic with three variables listed: Tray, Intensity and Timepoint"}
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Answer
 
 There are four things here that vary: `tray`, `light intensity`, `timepoint` and `height`. 
@@ -138,7 +144,11 @@ In this case, then, `time` would probably be best treated as a fixed rather than
 However, if we were not measuring a response variable that changes over time (like growth), that might change. If, for instance, we were investigating the relationship between light intensity and chlorophyll production in adult plants, then measuring across different time points would be a case of technical replication instead, and `time` would be best treated as a random effect. **The research question is key in making this decision.**
 :::
 
-### Exercise 3 - Wolves
+:::
+
+### Wolves {#sec-exr_wolves}
+
+::: {.callout-exercise}
 
 {{< level 1 >}}
 
@@ -148,7 +158,7 @@ What's the relationship of interest? Is our total *n* really 60?
 
 ![Predictor variables](images_mixed-effects/example3_1.png){fig-alt="Graphic with three variables listed: Wolf population, National park and Year."}
 
-::: {.callout-note collapse="true"}
+::: {.callout-tip collapse="true"}
 #### Answer
 
 Though we have 60 observations, it would of course be a case of pseudoreplication if we failed to understand the clustering within these data.
@@ -165,6 +175,8 @@ We have measured across several national parks, and over a 10 year period, in or
 Of course, you might know more about ecology than me, and have a good reason to believe that the exact years *do* matter - that perhaps something fundamental in the relationship between `flood depth ~ wolf population` really does vary with year in a meaningful way. But given that our research question does not focus on change over time, both `year` and `national park` would be best treated as random effects given the information we currently have.
 :::
 
+:::
+
 ## Summary
 
 ::: {.callout-tip}