Commit 4ab6717

Update instructions and refactor tables
1 parent 3a186c4 commit 4ab6717

9 files changed: +168 -145 lines changed

R.Rmd

Lines changed: 5 additions & 8 deletions
@@ -1,4 +1,4 @@
-# Analysing the TreeSAPP output in R {#R}
+# Analysing the TreeSAPP output in R

 ```{r setup-R, echo = FALSE, warning = FALSE, message = FALSE}
 knitr::opts_chunk$set(eval = FALSE)
@@ -8,21 +8,18 @@ These instructions will help you to complete the script template `treesapp_analy

 ## Resources for analysis of TreeSAPP output in R

-- Both TreeSAPP output files `marker_contig_map.tsv` (one each for the metagenomic and metatranscriptomic analysis) you generated on the [Google Cloud Platform (GCP)](#shell).
-
+- Both TreeSAPP output files `marker_contig_map.tsv` (one each for the metagenomic and metatranscriptomic analysis) you generated with [TreeSAPP](#running-treesapp-on-a-server).
 - An R script template called `treesapp_analysis.R` on Canvas.
-
 - `Saanich_Data.csv` file containing geochemical measurements.

 ## Checklist to write and run R script

 - [ ] Place `treesapp_analysis.R` and both `marker_contig_map.tsv` files into a single folder and create a new RStudio project in that folder on your local computer.
+- [ ] Edit the `treesapp_analysis.R` script following [the instructions below](#writing-the-r-script).

-- [ ] Edit the `treesapp_analysis.R` script following [the instructions below](#R-writing).
-
-## Writing the R script {#R-writing}
+## Writing the R script

-In the `treesapp_analysis.R` script you will load and subset your TreeSAPP data to the variables and marker genes of interest. You will then combine the metagenomic and metatranscriptomic data into a single data frame and then break the taxonomic information into taxonomic ranks. You will also load the `Saanich_Data.csv` file for geochemical measurements to learn more about the conditions at your assigned depth. Further process your data sets to address [the four main research questions](#research-questions).
+In the `treesapp_analysis.R` script you will load and subset your TreeSAPP data to the variables and marker genes of interest. You will then combine the metagenomic and metatranscriptomic data into a single data frame and then break the taxonomic information into taxonomic ranks. You will also load the `Saanich_Data.csv` file for geochemical measurements to learn more about the conditions at your assigned depth. Further process your data sets to address [the four main research questions](#results).

 ## Load any required packages

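The R.Rmd text above outlines the analysis steps: load both `marker_contig_map.tsv` tables, combine them into one data frame, and break the taxonomy string into ranks. A minimal base-R sketch of those steps follows. It is not the course's `treesapp_analysis.R` template; the column names (`Marker`, `Taxonomy`, `Abundance`), the `"; "`-separated lineage format, the `Type` labels, the file path in the comment and all toy values are assumptions made for illustration.

```r
# Toy stand-ins for the metagenomic and metatranscriptomic tables.
# Column names and values are assumptions, not taken from real TreeSAPP output.
metag <- data.frame(
  Marker    = c("NirK", "NosZ"),
  Taxonomy  = c("Root; Bacteria; Proteobacteria", "Root; Bacteria; Bacteroidetes"),
  Abundance = c(12.5, 3.2),
  stringsAsFactors = FALSE
)
metat <- data.frame(
  Marker    = "NirK",
  Taxonomy  = "Root; Archaea",
  Abundance = 40.1,
  stringsAsFactors = FALSE
)
# With real data, each table would instead be read from its file, e.g.
# metag <- read.delim("metagenome/marker_contig_map.tsv")  # hypothetical path

# Label the source of each table, then stack them into one data frame.
metag$Type <- "DNA"
metat$Type <- "RNA"
combined <- rbind(metag, metat)

# Split the lineage string into one column per rank; short lineages get NA.
ranks <- c("Root", "Domain", "Phylum")
parts <- strsplit(combined$Taxonomy, ";\\s*")
for (i in seq_along(ranks)) {
  combined[[ranks[i]]] <- vapply(
    parts,
    function(p) if (length(p) >= i) p[i] else NA_character_,
    character(1)
  )
}

print(combined[, c("Marker", "Type", "Domain", "Phylum", "Abundance")])
```

Keeping a `Type` column in one stacked data frame makes later DNA-versus-RNA comparisons a matter of grouping rather than juggling two objects.
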
_bookdown.yml

Lines changed: 6 additions & 6 deletions
@@ -9,22 +9,22 @@ rmd_files:
 # Providing array as YAML sequence, so it's easier to comment out Rmd files that
 # are not needed.
 - index.Rmd
-- part_introduction.Rmd
+- part_introduction.Rmd # separator for TOC navigation
 - introduction.Rmd
-- part_tutorials.Rmd
+- part_tutorials.Rmd # separator for TOC navigation
 - tools.Rmd
 - cheat_sheet.Rmd
 - golden_data.Rmd
 - treesapp_create.Rmd
 - treesapp_assign_and_update.Rmd
 - treesapp_abundance.Rmd
-- part_capstone.Rmd
+- part_capstone.Rmd # separator for TOC navigation
 - instructions.Rmd
 - assessment.Rmd
 - data.Rmd
+- shell.Rmd
+- R.Rmd
 # - summary.Rmd
 # - gcp_setup.Rmd
-# - shell.Rmd
-# - R.Rmd
-- appendix.Rmd
+- appendix.Rmd # separator for TOC navigation
 - references.Rmd

assessment.Rmd

Lines changed: 2 additions & 19 deletions
Large diffs are not rendered by default.

child_Rmds/report_structure.Rmd

Lines changed: 43 additions & 0 deletions
Large diffs are not rendered by default.

child_Rmds/rubric.Rmd

Lines changed: 23 additions & 0 deletions
Large diffs are not rendered by default.

child_Rmds/timeline.Rmd

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+<!-- Table for capstone timeline, edit with RStudio visual editor -->
+
++--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| Date | Description |
++:=============+:====================================================================================================================================================================================+
+| Mar 29 | Introduction and begin running TreeSAPP |
+| | |
+| | Ideally, your analyses on the server should run overnight |
++--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| Mar 31 | **Start of capstone project** |
+| | |
+| | Group work |
++--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| Apr 7, 9, 12 | Group work |
++--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| Apr 14 | **Project presentation** |
+| | |
+| | Project synthesis and comparison |
+| | |
+| | Be prepared to discuss the main conclusions that you've reached for your assigned depth |
++--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| Apr 16 | Course recap and discussion |
++--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| April 14--24 | **Report writing** |
+| | |
+| | Groups are expected to meet remotely as needed over the Finals Period in order to complete the report. This report serves as a final for this course and should be treated as such. |
++--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| April 24 | **Final due date** |
+| | |
+| | For reports and completed portfolios |
++--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+
+: Capstone timeline

data.Rmd

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ We will work with real-world data collected as part of an ongoing oceanographic
 knitr::include_graphics("images/saanich_inlet_map.png")
 ```

-Figure \@ref(fig:saanich-inlet) shows a map of Saanich Inlet indicating conventional sample collection stations (S1-S9). The data used in this tutorial (sourced from S3, Cruise 72) include various geochemical measurements and the genomic and transcriptomic data of microbial samples at depths 10, 100, 120, 135, 150, 165, 200.
+Figure \@ref(fig:saanich-inlet) shows a map of Saanich Inlet indicating conventional sample collection stations (S1-S9). The data used in this tutorial (sourced from S3, Cruise 72) include various geochemical measurements and the genomic and transcriptomic data of microbial samples at depths 10, 100, 120, 135, 150, 165 and 200 m.

 For more details about these data, see @Hallam.2017, and for more detailed information on the environmental context and time series data, see @Torres-Beltrán.2017.

instructions.Rmd

Lines changed: 28 additions & 60 deletions
@@ -1,88 +1,56 @@
 # Project description

-In this project, you will work in teams to explore marine microbial communities and the nitrogen cycle, particularly **denitrification**. You will use metagenomic and metatranscriptomic data from Cruise 72 at 7 depths in Saanich Inlet, a seasonally anoxic fjord that serves as a model ecosystem for studying microbial community responses to changing levels of oxygen. Each group has been assigned a specific depth (given in Canvas group name) and will assess microbial communities in terms of taxonomic rank, abundance, and expression along the redoxcline in Saanich Inlet.
+In this project, you will work in teams to explore marine microbial communities and geochemical pathways, for example the nitrogen cycle in general or **denitrification** in particular. You will use metagenomic and metatranscriptomic data from Cruise 72 at 7 depths in [Saanich Inlet](#the-saanich-inlet-data-set), a seasonally anoxic fjord that serves as a model ecosystem for studying microbial community responses to changing levels of oxygen. Each group will explore a group of genes with TreeSAPP to assess microbial communities in terms of taxonomic rank, abundance, and expression along the redoxcline in Saanich Inlet.

-## Timeline
-
-The following provides an outline as well as some specific milestones within the project.
-
-1. Mar 27: Introduction and begin running TreeSAPP
-
-- Ideally, your GCP analyses should run over the weekend so that you have data to work with in class on Monday.
-
-2. Mar 30: Introduction to statistics; Group work
-
-3. Apr 1 and 3: Group work
-
-4. Apr 6: Project synthesis
-
-- Be prepared to discuss the main conclusions that you've reached for your assigned depth.
-
-5. (April 8: Course recap and discussion)
+## Guiding research questions

-6. April 9-24: Report writing.
+You will need to decide on the scope and focus of your project:

-- Groups are expected to meet remotely as needed over the Finals Period in order to complete the report. This report serves as a final for this course and should be treated as such.
+1. Select your genes for analysis based on one of the following options:

-7. April 24: Final reports due with final portfolios
+a. Choose a pathway of a geochemical cycle and its associated genes with reference packages already available for TreeSAPP (see table).

-## Reports
-
-Reports should be formatted as per the [Instructions to Authors](https://jb.asm.org/sites/default/files/additional-assets/JB-ITA.pdf) for the [Journal of Bacteriology](https://jb.asm.org/).
-
-Each group will complete **one** report with the following sections.
-
-### Abstract
+b. Choose a pathway of a geochemical cycle and its associated genes which do NOT have reference packages available for TreeSAPP. As part of your project, you will create those reference packages.

-*200--250 words*
+c. Look at all genes available as reference packages for TreeSAPP without a focus on any particular pathway.

-Note that an Importance section is not required.
+2. Perform a preliminary analysis of the Saanich Inlet data for all depths.

-### Introduction
+3. Update reference packages based on the preliminary analysis.

-*500--750 words*
+4. Look for patterns in the abundance of genes across the water column.

-- Overview of the nitrogen cycle including its global impacts and microbial foundations.
-- Introduce Saanich Inlet as a model ecosystem for studying microbial community responses to ocean deoxygenation *e.g.* seasonal cycles, relevant biogeochemistry, previous studies, etc.
+### Questions for project with a focus on geochemical pathways (repeated in report structure below)

-### Methods
+1. How does abundance of those genes differ across the pathway? Are trends similar for both RNA and DNA?

-*300--500 words*
+2. How does microbial diversity differ across the pathway? Are trends similar for both RNA and DNA?

-- Briefly describe the data (sampling, sequencing, processing, etc.)
+3. What specific taxa are responsible for the geochemical cycle under investigation? Are they the same for all steps? For DNA versus RNA?

-- Briefly describe your analysis methods including
+4. How does the abundance of those genes relate to nitrogen species in Saanich (use the geochemical data in `Saanich_Data.csv` from our previous data science sessions)?

-- TreeSAPP version and commands used
-- iTOL version
-- R version and packages used
-- Statistics (if applicable)
+### Questions for project with a focus on using all reference packages

-- Provide one single shell script and one single R script (i.e "treesapp_analysis.sh" and "treesapp_analysis.R") as individual files (i.e. not as part of your manuscript) containing the complete code to generate your results.
+Add questions.

-### Results {#research-questions}
+## Your submission

-*500--750 words*
+Your final submission will consist of 3 separate files: the report itself (`docx` or `pdf`), one shell script `treesapp_analysis.sh`, and one R script `treesapp_analysis.R` (both script files must be in plain text format). The report should not contain any code, but should contain versions of software tools used and a high-level description of your workflow (i.e., describe *what* was done and NOT *how*).

-Your analysis will focus on **denitrification genes at your assigned depth** (which is given in your Canvas group name) and the following questions:
-
-1. How does abundance of denitrification genes differ across the pathway? Are trends similar for both RNA and DNA?
-2. How does microbial diversity differ across the pathway? Are trends similar for both RNA and DNA?
-3. What specific taxa are responsible for denitrification? Are they the same for all steps? For DNA versus RNA?
-4. How does the abundance of denitrification genes relate to nitrogen species in Saanich (use the geochemical data in `Saanich_Data.csv` from our previous data science sessions)?
+## Timeline

-You must include ≥ 5 figures/panels with titles and full captions. These figures can be combined into multi-panel figures if desired.
+The following provides an outline as well as some specific milestones within the project.

-### Discussion
+```{r child = "child_Rmds/timeline.Rmd"}
+```

-*750--1000 words*
+## Reports

-- Frame your depth's results within a broader discussion of Saanich Inlet and the other depths (Apr 6 discussion)
-- Propose evolutionary, environmental, etc. reasoning for distributed metabolism as seen in the denitrification pathway
-- Future directions
+Reports should be formatted as per the [Instructions to Authors](https://jb.asm.org/sites/default/files/additional-assets/JB-ITA.pdf) for the [Journal of Bacteriology](https://jb.asm.org/).

-### References
+Each group will submit **one** report with the sections below.

-*10 or more* formatted in the ASM style such as for the [Journal of Bacteriology](https://jb.asm.org/content/organization-and-format). If you are using a reference manager, this style can be downloaded for [EndNote](https://endnote.com/style_download/american-society-for-microbiology-asm-journals-2/), [Mendeley](https://github.com/citation-style-language/styles/blob/master/american-society-for-microbiology.csl), or [Zotero](https://www.zotero.org/styles?q=microbiology).
+```{r child = "child_Rmds/report_structure.Rmd"}
+```

-Make sure to cite the data source papers!
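
Research questions 1 and 4 in the new instructions.Rmd text (abundance across the pathway for DNA versus RNA, and its relation to nitrogen species) both come down to summarising abundance and joining it to the geochemical table by depth. The base-R sketch below shows one way this could look. Every column name (`Depth_m`, `Marker`, `Type`, `Abundance`, `NO3_uM`) and every value is hypothetical; the real `Saanich_Data.csv` and TreeSAPP tables may be laid out differently.

```r
# Toy classification table, one row per classified sequence.
# All column names and values are hypothetical.
classified <- data.frame(
  Depth_m   = c(10, 10, 10, 100, 100, 200, 200),
  Marker    = c("NirK", "NirK", "NirK", "NirK", "NosZ", "NirK", "NosZ"),
  Type      = c("DNA", "DNA", "RNA", "DNA", "DNA", "DNA", "RNA"),
  Abundance = c(1.0, 0.5, 2.0, 4.0, 3.0, 8.0, 5.0),
  stringsAsFactors = FALSE
)

# Toy geochemistry table standing in for Saanich_Data.csv.
geochem <- data.frame(
  Depth_m = c(10, 100, 200),
  NO3_uM  = c(5.0, 20.0, 0.5)
)

# Total abundance per depth, marker gene and data type (DNA or RNA).
totals <- aggregate(Abundance ~ Depth_m + Marker + Type,
                    data = classified, FUN = sum)

# Attach the geochemical measurement for each depth.
joined <- merge(totals, geochem, by = "Depth_m")

print(joined)
```

From `joined`, abundance can be plotted or correlated against the geochemical column, separately for each marker and data type.
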
