mod_stats.qmd: 10 additions & 10 deletions
@@ -32,7 +32,7 @@ Each project group should:
32
32
33
33
Ingrid Slette, Postdoctoral Researcher at the University of Minnesota and Coordinator of the [NutNet](https://nutnet.org/) and [DRAGNet](https://dragnetglobal.weebly.com/) coordinated, distributed experiments.
34
34
35
-
Ingid synthesizes data in order to understand the impacts of intersecting global changes. Her PhD research focused on how previous climate extremes alter the impacts of extreme drought on root traits and patterns of aboveground vs. belowground plant production and carbon cycling in grasslands. Name pronunciation: ing-rid sleh-tuh. Pronouns: she/her
35
+
Ingrid synthesizes data in order to understand the impacts of intersecting global changes. Her PhD research focused on how previous climate extremes alter the impacts of extreme drought on root traits and patterns of aboveground vs. belowground plant production and carbon cycling in grasslands. Name pronunciation: \[ing-rid sleh-tuh\]. Pronouns: she/her
36
36
37
37
:::
38
38
@@ -89,7 +89,7 @@ For the purposes of SSECR, **our discussion of frequentist inference will focus
89
89
90
90
### Multi-Model Inference
91
91
92
-
Hyoptheses here are a question of <u>which variables explain the *most* variation in the data</u>. Methods in this framing are unconcerned--or at least less concerned than in frequentist inference--with the probability associated with a particular variable. Intead, these methods focus on which of a set of user-defined candidate models explains most of the noise in the data *even when that best model does not necessarily explain much of that variation in absolute terms*.
92
+
Hypotheses here are a question of <u>which variables explain the *most* variation in the data</u>. Methods in this framing are unconcerned--or at least less concerned than in frequentist inference--with the probability associated with a particular variable. Instead, these methods focus on which of a set of user-defined candidate models explains most of the noise in the data *even when that best model does not necessarily explain much of that variation in absolute terms*.
93
93
94
94
If your hypothesis can be summarized as something along the lines of 'we hypothesize that models including X explain more of the variation in Y than those that do not' then multi-model inference may be a more appropriate methodology.
95
95
@@ -126,9 +126,9 @@ By including site as a random slope in this context, we can account for this eff
126
126
127
127
### Nested Random Effects
128
128
129
-
To further complicate matters, we can use nested random effects as well. These can be either random intercepts or random slopes though they are more commonly seen with random intercepts. A nested random effect accounts for the effect of one random variable that *is itself affected by another variable!* A classic example of this is when a study design uses two (or more) levels of spatial nestedness in their experimentall design.
129
+
To further complicate matters, we can use nested random effects as well. These can be either random intercepts or random slopes though they are more commonly seen with random intercepts. A nested random effect accounts for the effect of one random variable that *is itself affected by another variable!* A classic example of this is when a study design uses two (or more) levels of spatial nestedness in their experimental design.
130
130
131
-
For instance, let's imagine we were conducting a global study of marine plankton biodiversity. To gether these data we took several cruises (scientific not--exclusively--pleasure) at different places around the world and during each cruise we followed a set of transects. In each transect we did several plankton tows and quantified the diversity of each tow. We can reasonably assume the following:
131
+
For instance, let's imagine we were conducting a global study of marine plankton biodiversity. To gather these data we took several cruises (scientific not--exclusively--pleasure) at different places around the world and during each cruise we followed a set of transects. In each transect we did several plankton tows and quantified the diversity of each tow. We can reasonably assume the following:
132
132
133
133
1. Each cruise differs from each other cruise (due to any number of climatic/ecological factors)
134
134
- But cruises within the same part of the world are still likely to have similar planktonic communities
@@ -165,7 +165,7 @@ With a small group, decide whether you think the terms in the examples below sho
165
165
166
166
### Mixed-Effects Case Study
167
167
168
-
Let's imagine we are researching tarantula populations for several years in the Chihuahuan Desert. Our hypothesis is that the number of tarantulas will be greater in sites further from the nearest road. We select ten study sites of varying distances from the nearest road and intensively count our furry friends at three plots within each site for several months. We return to our sites--and their associated plots--and repeat this process each year for three years. In the second year we have help from a new member of our lab but in the third year we're back to working alone (they had their own project to handle by then). We enter our data and perform careful quality control to get it into a tidy format ready for analyis.
168
+
Let's imagine we are researching tarantula populations for several years in the Chihuahuan Desert. Our hypothesis is that the number of tarantulas will be greater in sites further from the nearest road. We select ten study sites of varying distances from the nearest road and intensively count our furry friends at three plots within each site for several months. We return to our sites--and their associated plots--and repeat this process each year for three years. In the second year we have help from a new member of our lab but in the third year we're back to working alone (they had their own project to handle by then). We enter our data and perform careful quality control to get it into a tidy format ready for analysis.
178
-
With our data in hand, we now want to run some statistical tests and--hopefully--get some endorphine-inducingly small *p*-values. If we choose to simply ignore our possible random effects, we could fit a linear regression.
178
+
With our data in hand, we now want to run some statistical tests and--hopefully--get some endorphine-inducing-ly small *p*-values. If we choose to simply ignore our possible random effects, we could fit a linear regression.
179
179
180
180
```{r mem-lm}
181
181
# Fit model
@@ -219,7 +219,7 @@ ggplot(tarantula_df, aes(y = tarantula_count, x = plot, fill = plot)) +
222
-
1. Violin plots are a nice alternative to boxplots because they allow visualizing data distributions directly rather than requiring an intutive grasp of the distribution metrics described by each bit of a boxplot
222
+
1. Violin plots are a nice alternative to boxplots because they allow visualizing data distributions directly rather than requiring an intuitive grasp of the distribution metrics described by each bit of a boxplot
223
223
2. This is allowing us to 'tilt' the X axis tick labels so they don't overlap with one another
224
224
225
225
This graph clearly supports our intuition that among-plot variation is dramatic! We *could* account for this by including plot as a fixed effect but we'll need to sacrifice a lot of degrees of freedom (can be thought of as "statistical power") for a variable that we don't actually care about. Instead, we could include plot as another random effect.
@@ -265,7 +265,7 @@ To begin, it can be helpful to write out all possible "candidate models". For in
265
265
266
266
We might also fit other candidate models for pairs of X, W, and Z but for the sake of simplicity in this hypothetical we'll skip those. Note that for this method to be appropriate <u>you need to fit the same type of model in all cases</u>!
267
267
268
-
Once we've fit all of our models and assigned them to objects, we can use the `AIC` function included in base R to compare the AIC score of each model. "AIC" stands for <u>A</u>kaike (*AH-kuh-ee-kay*) <u>I</u>nformation <u>C</u>riterion and is one of several related information criteria for summarizing a model's explanatory power. Models with more parameters are penalized to make it mathematically possible for a model with fewer explanatory variables to still do a better job capturing the variation in the data.
268
+
Once we've fit all of our models and assigned them to objects, we can use the `AIC` function included in base R to compare the AIC score of each model. "AIC" stands for <u>A</u>kaike (\[AH-kuh-ee-kay\]) <u>I</u>nformation <u>C</u>riterion and is one of several related information criteria for summarizing a model's explanatory power. Models with more parameters are penalized to make it mathematically possible for a model with fewer explanatory variables to still do a better job capturing the variation in the data.
269
269
270
270
The model with the *lowest* AIC best explains the data. Technically any difference in AIC indicates model improvement but many scientists use a rule of thumb of a difference of 2. So, if two models have AIC scores that differ by less than 2, you can safely say that they have comparable explanatory power. That is definitely a semi-arbitrary threshold but so is the 0.05 threshold for *p*-value "significance".
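As a minimal sketch of that comparison, we could fit a handful of candidate models and pass them all to `AIC` at once. Everything here is an assumption for illustration: the built-in `mtcars` data stands in for real data, with `mpg` playing the role of Y and `wt`, `hp`, and `qsec` playing X, W, and Z, and the `mod_*` object names are hypothetical.

```{r aic-sketch}
# Hypothetical candidate models (all the same model type, as required)
mod_x <- lm(mpg ~ wt, data = mtcars)    # Y ~ X
mod_w <- lm(mpg ~ hp, data = mtcars)    # Y ~ W
mod_z <- lm(mpg ~ qsec, data = mtcars)  # Y ~ Z
mod_null <- lm(mpg ~ 1, data = mtcars)  # intercept-only model

# Compare the AIC score of every candidate in one call
aic_table <- AIC(mod_x, mod_w, mod_z, mod_null)

# Sort so the lowest (best) score comes first
aic_table <- aic_table[order(aic_table$AIC), ]

# Difference between each model's AIC and the best model's AIC
aic_table$delta <- aic_table$AIC - min(aic_table$AIC)

aic_table
```

Under the rule of thumb above, any model whose `delta` is less than 2 would be treated as having explanatory power comparable to the top-ranked model.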
363
-
Note that Cohen's *d* is just one effect size available to you and others may be more appropriate in certain contexts. Just like any other metric, which effect size you choose is a mix of your scientific intution and appropriateness for the content of your data. For a deeper dive into the breadth of effect size considerations available to you, see [the relevant chapter](https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/effects.html#effects) of the *Doing Meta-Analysis in R* online book.
363
+
Note that Cohen's *d* is just one effect size available to you and others may be more appropriate in certain contexts. Just like any other metric, which effect size you choose is a mix of your scientific intuition and appropriateness for the content of your data. For a deeper dive into the breadth of effect size considerations available to you, see [the relevant chapter](https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/effects.html#effects) of the *Doing Meta-Analysis in R* online book.
364
364
365
365
After you've calculated all relevant effect sizes--using your chosen flavor of effect size--the "actual" meta-analysis is nearly finished. Simply create a graph of the effect sizes with error bars indicating confidence intervals (included by default in most effect size functions). Where the error bars overlap among studies, there is no significant difference between those effect sizes. Conversely, where the error bars *do not* overlap among studies the effect sizes do significantly differ, indicating that the studies' results differ for the data used to calculate the effect size.
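For reference, the standard two-group form of Cohen's *d* divides the difference in group means by the pooled standard deviation. The symbols here are our own notation rather than anything defined in this lesson, and in practice a package function would likely handle the calculation:

$$
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
$$

Here $\bar{x}_i$, $s_i$, and $n_i$ are the mean, standard deviation, and sample size of group $i$.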
366
366
367
367
## Additional Resources
368
368
369
369
### Papers & Documents
370
370
371
-
- Spake, R. *et al.*[Understanding 'It Depends' in Ecology: A Guide to Hypothesising, Visualising and Interpreting Statistical Interactions](https://onlinelibrary.wiley.com/doi/10.1111/brv.12939). **2023**. *Biological Reviews*
371
+
- Spake, R. *et al.*[Understanding 'It Depends' in Ecology: A Guide to Hypothesizing, Visualising and Interpreting Statistical Interactions](https://onlinelibrary.wiley.com/doi/10.1111/brv.12939). **2023**. *Biological Reviews*
372
372
- Spake, R. *et al.*[Improving Quantitative Synthesis to Achieve Generality in Ecology](https://www.nature.com/articles/s41559-022-01891-z). **2022**. *Nature Ecology and Evolution*
373
373
- Tredennick, A.T. *et al.*[A Practical Guide to Selecting Models for Exploration, Inference, and Prediction in Ecology](https://esajournals.onlinelibrary.wiley.com/doi/10.1002/ecy.3336). **2021**. *Ecology*
374
374
- Harrer, M. *et al.*[Doing Meta-Analysis with R: A Hands-On Guide](https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/). **2021**.