GitHub - YingtongAamandaWu/MonkeyFlower_ampliconSeq_DNA_data_codes: This is a repository of amplicon sequencing data and bioinformatics pipeline code for the nectar microbiome of monkeyflowers.

This repository contains the data, R scripts, and Linux command-line code used for the bioinformatics analyses of amplicon sequencing data from the fungal and bacterial microbiomes in nectar samples of Sticky Monkeyflower (Diplacus aurantiacus).

The repository comprises four folders:

The 01_Data folder includes two CSV files and four Excel (.xlsx) files:

"sampling_sheet_regional_survey_2015_final_corrected.csv": metadata for the DNA samples, documenting the site ID, plant ID, and flower ID of each sample, as well as the corresponding densities of fungal and bacterial colony-forming units (CFUs) in each sample.

"2015_survey_siteinfo_location_envi.csv": documents the environmental data and coordinates for each flower.

"Wu_Metagenome.environmental.1.0_B_N_SUB13559541.xlsx": documents the biosample information for each flower, for sample names starting with B through N.

"Wu_Metagenome.environmental.1.0_O_S_SUB13567828.xlsx": documents the biosample information for each flower, for sample names starting with O through S.

"SRA_metadata_site_started_with_B_N.xlsx": documents the sequencing information for each fastq file, linked to the biosample information in "Wu_Metagenome.environmental.1.0_B_N_SUB13559541.xlsx".

"SRA_metadata_site_started_with_O_S.xlsx": documents the sequencing information for each fastq file, linked to the biosample information in "Wu_Metagenome.environmental.1.0_O_S_SUB13567828.xlsx".

02_Rscripts includes the R scripts used in the bioinformatics analyses:

"make_Map_20230413.Rmd": R code that makes the Figure 1 map, showing the distributions and locations of samples.

"Bioinformatics_ITS1_DADA2_CONSTAXtaxa_20230202.Rmd": R code that implements the DADA2 pipeline on fungal ITS1 sequences.

"Bioinformatics_16S_DADA2_SILVAtaxa_20230207.Rmd": R code that implements the DADA2 pipeline on bacterial 16S sequences.

"make_phyloseq_objects_&_run_CLAM_test_20220203.Rmd": R code that generates phyloseq objects for the ITS1 and 16S sequences, respectively; the code also uses the plating data (densities of bacterial and fungal colony-forming units, CFUs) to categorize nectar samples as (1) bacteria-dominated flowers, (2) fungi-dominated flowers, (3) co-dominated flowers, or (4) flowers with too few microbes to be classified into any of the other three groups.

"diversity_analyses_fungi_threshold=clam_20230413.Rmd": R code that analyzes the alpha and beta diversity of the fungal sequences (ITS1); the analyses include pairwise two-sample permutation tests for alpha diversity, permutational multivariate ANOVA for species composition, differential abundance analyses, etc.

"diversity_analyses_bacteria_threshold=clam_20230413.Rmd": R code that analyzes the alpha and beta diversity of the bacterial sequences (16S); the analyses parallel those in "diversity_analyses_fungi_threshold=clam_20230413.Rmd".

"ASVlevel_Co_occurence_network_NetCoMi_pearson_sparcc_r0.1_clam_20220429.rmd": R code that conducts co-occurrence network analyses of the fungal and bacterial sequences using the Pearson correlation and SparCC (Sparse Correlations for Compositional data) network methods.

"ASVlevel_co_occurence_network_NetCoMi_spieceasi_r0.1_clam_20220504.rmd": R code that conducts co-occurrence network analyses of the fungal and bacterial sequences using the SPIEC-EASI (Sparse InversE Covariance estimation for Ecological Association and Statistical Inference) network method.
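To illustrate the kind of pairwise two-sample permutation test on alpha diversity mentioned for the diversity scripts, here is a minimal sketch. It is written in Python rather than R and is not taken from the repository's scripts. The Shannon index as the alpha diversity metric, the function names, the number of permutations, and the toy values are all assumptions made for illustration.

```python
import math
import random


def shannon(counts):
    """Shannon diversity (natural log) from one sample's ASV read counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def permutation_test(group_a, group_b, n_perm=9999, seed=1):
    """Two-sided permutation test on the absolute difference in group means.

    Returns a Monte Carlo p-value; the +1 terms count the observed
    arrangement as one of the permutations.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance guards against floating-point rounding in the sums.
        if diff >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-sample alpha diversity values for two flower groups.
bacteria_dominated = [0.10, 0.20, 0.15, 0.12, 0.18]
fungi_dominated = [1.10, 1.20, 1.30, 1.25, 1.15]

p_value = permutation_test(bacteria_dominated, fungi_dominated)
print(f"Shannon index of an even two-ASV sample: {shannon([10, 10]):.3f}")
print(f"Permutation p-value: {p_value:.4f}")
```

With more than two groups, the same function would be applied to each pair of groups, and the resulting p-values would normally be adjusted for multiple comparisons.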

03_Output includes key output files from the bioinformatics pipeline:

"ITS1.unpooled.ASVs.fa": FASTA file that documents the representative sequence for each fungal ITS1 ASV (amplicon sequence variant).

"16S.unpooled.ASVs.fa": FASTA file that documents the representative sequence for each bacterial 16S ASV (amplicon sequence variant).

"Appendix1_ITS1_fungi_ASV.count.ordered.csv": CSV file that documents the total read count, relative abundance across all samples, and species taxonomy of each fungal ITS1 ASV.

"Appendix2_16S_bacteria_ASV.count.ordered.csv": CSV file that documents the total read count, relative abundance across all samples, and species taxonomy of each bacterial 16S ASV.
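As a rough illustration of how the output files above relate to each other, the following Python sketch reads ASV representative sequences from FASTA-formatted text and derives relative abundances from total read counts. The ASV names, sequences, and counts are invented for the example, and the actual header format of the FASTA files and the column names of the CSV files are not assumed here.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dict mapping sequence ID to sequence."""
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def relative_abundance(total_counts):
    """Convert per-ASV total read counts into fractions of all reads."""
    grand_total = sum(total_counts.values())
    if grand_total == 0:
        return {asv: 0.0 for asv in total_counts}
    return {asv: n / grand_total for asv, n in total_counts.items()}


# Hypothetical FASTA content; real files would be opened with open(path).
toy_fasta = io.StringIO(">ASV_1\nACGTAC\nGTTA\n>ASV_2\nTTGACA\n")
sequences = read_fasta(toy_fasta)

# Hypothetical total read counts per ASV.
counts = {"ASV_1": 75, "ASV_2": 25}
abundances = relative_abundance(counts)

for asv in sequences:
    print(asv, len(sequences[asv]), abundances[asv])
```

Sequences that span several lines are joined into one string, so the parser works for both wrapped and unwrapped FASTA files.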

04_Docs includes a file that documents other bioinformatics steps not conducted in R:

"Wu_Monkeyflower_bioinformatics_steps.docx": a docx file that records the bioinformatics steps from demultiplexing to species assignment, carried out in a Linux environment.
